An enterprise product is a chain of trust decisions: who is the user, what may they see, how much can we spend, what do auditors need, and how do we recover when a model vendor blips? You carve that into layers with clear jobs so one team does not “accidentally” smuggle secrets, PII, or the wrong model into production.
Experience is the surface users touch: web, mobile, sometimes voice. It streams partial answers, shows citations, handles errors gracefully, and never stores provider API keys in the browser.
API / BFF (backend-for-frontend) is the first trusted server: authentication, tenant or company context, input validation, and rate limits. It shapes “what the user asked” into a structured request the rest of the stack understands.
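As a rough illustration of that shaping step, the sketch below turns verified token claims and a raw JSON body into a structured request. The `ChatRequest` shape, the claim names (`tenant_id`, `sub`), and the length limit are assumptions made for this example, not a prescribed contract.

```python
from dataclasses import dataclass

MAX_QUESTION_CHARS = 4000  # assumed limit; tune per product


@dataclass(frozen=True)
class ChatRequest:
    """Structured request handed to orchestration (hypothetical shape)."""
    tenant_id: str
    user_id: str
    question: str


def shape_request(claims: dict, body: dict) -> ChatRequest:
    """Build a ChatRequest from already-verified token claims and the raw body.

    Identity is taken only from `claims`; any tenant or user fields the
    client put in `body` are ignored.
    """
    tenant_id = claims.get("tenant_id")
    user_id = claims.get("sub")
    if not tenant_id or not user_id:
        raise PermissionError("missing tenant or user in verified claims")

    question = body.get("question")
    if not isinstance(question, str) or not question.strip():
        raise ValueError("question must be a non-empty string")
    if len(question) > MAX_QUESTION_CHARS:
        raise ValueError("question too long")

    return ChatRequest(tenant_id=tenant_id, user_id=user_id,
                       question=question.strip())
```

The design point is that tenant context comes from the authenticated session, so a client cannot switch tenants by editing the request body.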
Orchestration is the workflow brain: should we RAG? Which index? Call a calculator or ticket tool? Block risky flows until a human approves? Product policy lives here, in one reviewable place, not as megabyte-long string literals scattered across HTTP handlers.
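One way to keep that policy out of handlers is a small declarative table that the orchestrator turns into a plan. The sketch below assumes intents have already been classified upstream; the intent names, index name, and tool name are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Step(Enum):
    RETRIEVE = "retrieve"
    AWAIT_APPROVAL = "await_approval"
    CALL_TOOL = "call_tool"
    GENERATE = "generate"


@dataclass(frozen=True)
class IntentPolicy:
    use_rag: bool
    index: Optional[str] = None
    tool: Optional[str] = None
    needs_human_approval: bool = False


# Hypothetical policy table: data that can be reviewed and versioned,
# instead of branching logic spread across handlers.
POLICIES = {
    "hr_question": IntentPolicy(use_rag=True, index="hr-handbook"),
    "refund_request": IntentPolicy(use_rag=False, tool="issue_refund",
                                   needs_human_approval=True),
    "small_talk": IntentPolicy(use_rag=False),
}


def plan(intent: str) -> List[Step]:
    """Translate an intent's policy into an ordered list of workflow steps."""
    policy = POLICIES.get(intent)
    if policy is None:
        raise KeyError(f"no policy for intent {intent!r}")
    steps: List[Step] = []
    if policy.use_rag:
        steps.append(Step.RETRIEVE)
    if policy.tool:
        if policy.needs_human_approval:
            steps.append(Step.AWAIT_APPROVAL)  # gate the side effect first
        steps.append(Step.CALL_TOOL)
    steps.append(Step.GENERATE)
    return steps
```

For example, `plan("refund_request")` places the approval step before the tool call, so the risky action cannot run until a human signs off.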
The LLM gateway is the controlled front door to every model vendor: keys, retries, routing, quotas, redacted logging, and failover. Every internal service should call the gateway rather than talking directly to five different cloud APIs, each with its own retry implementation.
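The retry-and-failover part of the gateway can be sketched in a few lines. This is a minimal illustration under stated assumptions: each provider is a plain callable, `ProviderError` is a hypothetical exception that vendor adapters raise on transient failures, and the backoff values are placeholders.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Transient failure raised by a vendor adapter (hypothetical)."""


# (provider name, function taking a prompt and returning completion text)
Provider = Tuple[str, Callable[[str], str]]


def complete(prompt: str,
             providers: Sequence[Provider],
             attempts_per_provider: int = 2,
             base_delay_s: float = 0.2,
             sleep: Callable[[float], None] = time.sleep) -> Tuple[str, str]:
    """Try providers in priority order; retry each with exponential backoff.

    Returns (provider_name, completion_text) from the first success.
    """
    last_error: Optional[Exception] = None
    for name, call in providers:
        for attempt in range(attempts_per_provider):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                last_error = exc
                # Back off before retrying the same provider; move on to
                # the next provider immediately after the final attempt.
                if attempt < attempts_per_provider - 1:
                    sleep(base_delay_s * (2 ** attempt))
    raise RuntimeError("all providers failed") from last_error
```

Because callers only ever see `complete`, retry behaviour is defined once and can be changed without touching the services that depend on it.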
Knowledge & tools is where facts and actions live: vector search, relational data, CRM, and tool executors with per-user permissions. The model does not magically know your company; this layer grounds answers and constrains side effects.
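A permission check in front of tool execution might look like the following sketch. It assumes permissions are resolved from a role; the role names, tool names, and the in-memory registries are hypothetical stand-ins for whatever identity and tool systems are actually in place.

```python
from typing import Callable, Dict, Set

# Hypothetical tool registry; real tools would call CRM or ticketing APIs.
TOOLS: Dict[str, Callable[..., str]] = {
    "lookup_order": lambda order_id: f"order {order_id}: shipped",
    "issue_refund": lambda order_id: f"refund issued for {order_id}",
}

# Hypothetical mapping from role to the tools that role may invoke.
ROLE_TOOLS: Dict[str, Set[str]] = {
    "support_agent": {"lookup_order"},
    "support_lead": {"lookup_order", "issue_refund"},
}


def execute_tool(role: str, tool_name: str, **kwargs) -> str:
    """Run a tool only if the caller's role allows it.

    The check uses the authenticated caller's role, not anything the
    model wrote, so a prompt cannot grant itself extra permissions.
    """
    if tool_name not in ROLE_TOOLS.get(role, set()):
        raise PermissionError(f"role {role!r} may not call {tool_name!r}")
    if tool_name not in TOOLS:
        raise KeyError(f"unknown tool {tool_name!r}")
    return TOOLS[tool_name](**kwargs)
```

Unknown roles get an empty permission set, so the default is to deny.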
The data platform ingests documents, chunks, embeds, and re-indexes when sources change. Without it, RAG goes stale and answers silently diverge from reality.
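Detecting when sources change can be as simple as comparing content hashes against what was last indexed. The sketch below assumes documents are available as plain strings keyed by ID and uses fixed-size character chunks with overlap; the function names and chunk sizes are illustrative, and real pipelines often chunk by tokens or document structure instead.

```python
import hashlib
from typing import Dict, List, Tuple


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunk(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
        start += size - overlap
    return chunks


def diff_sources(sources: Dict[str, str],
                 indexed_hashes: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (ids to re-embed, ids to delete from the index).

    `sources` maps doc id -> current text; `indexed_hashes` maps doc id ->
    hash recorded when the document was last indexed.
    """
    changed = sorted(doc_id for doc_id, text in sources.items()
                     if indexed_hashes.get(doc_id) != content_hash(text))
    deleted = sorted(doc_id for doc_id in indexed_hashes
                     if doc_id not in sources)
    return changed, deleted
```

Only documents in the `changed` list need to be re-chunked and re-embedded, which keeps re-indexing cost proportional to what actually changed.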
Observability & governance covers traces, cost by tenant, audit logs where required, and versioning of prompts and models so you can answer: “Exactly which configuration produced this answer on Tuesday at 2pm?”
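To make that question answerable, each response can be logged with a fingerprint of the configuration that produced it. The sketch below is one possible shape, assuming the configuration is JSON-serializable; the field names (`prompt_version`, `model`, and so on) are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def config_fingerprint(config: dict) -> str:
    """Short, stable hash of a config; independent of key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def audit_record(tenant_id: str, request_id: str, config: dict,
                 now: Optional[datetime] = None) -> dict:
    """Build one audit log entry tying a request to its exact configuration."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "tenant_id": tenant_id,
        "request_id": request_id,
        "config": config,
        "config_fingerprint": config_fingerprint(config),
    }


# Hypothetical configuration for a single answer.
example = audit_record(
    tenant_id="acme",
    request_id="req-123",
    config={"prompt_version": "support-v7", "model": "vendor-model-a",
            "temperature": 0.2, "index": "hr-handbook"},
)
```

Grouping log entries by `config_fingerprint` then shows which answers shared a configuration and when a prompt or model change took effect.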
Figure 1 — Reference architecture (logical layers)
flowchart TB
    subgraph L1["1 Experience"]
        UI["Web / mobile / voice"]
    end
    subgraph L2["2 API / BFF"]
        API["Auth, validate, tenant"]
    end
    subgraph L3["3 Orchestration"]
        OR["RAG, agents, policies"]
    end
    subgraph L4["4 LLM gateway"]
        GW["Keys, route, meter, retry"]
    end
    subgraph L5["5 Knowledge and tools"]
        V[("Vector / search")]
        DB[("SQL / APIs")]
        TX["Tool execution"]
    end
    subgraph L6["6 Data platform"]
        ING["Ingest, chunk, embed"]
    end
    subgraph L7["7 Observability"]
        O["Traces, audit, versions"]
    end
    UI --> API --> OR --> GW
    OR --> V
    OR --> DB
    OR --> TX
    ING --> V
    GW --> O
    OR --> O
    TX --> O