Design AI Systems
Open-ended AI architecture prompts: RAG, agents, inference, evaluation, safety, and production tradeoffs. Each guide starts from a blank whiteboard—requirements, napkin math, components, and what you would cut under pressure. For classic product interviews (URL shortener, feeds, payments), see How would you Design?.
-
Guide
RAG for financial PDFs
You’ll defend hybrid search plus a structured numeric index, and explain why vector-only retrieval can cost millions when it misses a single figure.
-
Guide
Production RAG pipeline
You’ll design ingest through generation for 10M docs—PDF, Confluence, Slack, DB—and name every failure point and fix.
-
Guide
Real-time RAG sub-200ms
You’ll hit p95 latency under 200 ms on 50M chunks: sharding, HNSW tuning, cache layers, and where you trade recall for speed.
-
Guide
Regulated LLM for banking
You’ll enforce citations, block unverified regulatory numbers, isolate sessions, and log every answer for audit replay.
-
Guide
Live knowledge · Trading desk
You’ll stream filings and news into hot indexes, invalidate caches by ticker, and never answer from a stale retrieval set.
-
Guide
Air-gapped defense assistant
You’ll run local LLMs and on-prem indexes at 99.9% uptime—signed update bundles, zero internet, classified RAG end to end.