LLM system safe for regulated industries (banking)

A top-tier bank wants an internal AI assistant for employees—not a consumer chatbot. Compliance is not a footer slide: the system must not invent regulatory numbers, must cite every claim, must not leak PII across sessions, and must produce an immutable audit trail for every answer.

Scenario

Design an LLM system that is safe for regulated industries.

A top-tier bank wants to deploy an internal AI assistant. It must never hallucinate regulatory figures, must cite sources for every claim, must never leak PII across user sessions, and must be auditable for every response it generates.

Design the full architecture including guardrails, audit logging, and citation enforcement.

What you should be able to do after reading:

Frame the problem as controlled generation on top of governed retrieval—not “ChatGPT behind VPN.”
Separate session isolation, input/output guardrails, structured regulatory facts, and post-generation verification.
Explain how citation enforcement and numeric verification block answers before they reach the user.
Describe an audit record complete enough for model risk and internal audit replay.

Step 0 — How to open the session

Who uses it? Branch staff, compliance analysts, engineers—each has different data scope and risk.
What counts as a “regulatory figure”? Capital ratios, reserve requirements, fee caps, disclosure deadlines: define the taxonomy up front.
What sources are authoritative? Only approved policy corpus + regulator bulletins, or also email/Slack?
What does “never hallucinate” mean operationally? Abstain, or only speak from verified structured fields?
Retention and subpoena: how long audit logs live, who can read them, redaction rules.

Step 1 — Clarifying questions

Area	Question	Why
Jurisdiction	US-only, EU GDPR, multi-entity?	Data residency and log handling
Actions	Read-only Q&A or initiate workflows?	Tool use multiplies audit and fraud risk
Customer data	Can the assistant see live account PII?	Often no in v1—policy docs only
Model hosting	On-prem, private VPC, or vendor API?	Contractual no-training, log prohibition
Human review	When must compliance approve an answer template?	High-risk intents get human-in-the-loop
SLA	Latency vs safety: can a response block for up to 30 s while verification runs?	Sets synchronous vs async answer path

Step 2 — The sixty-second answer

I would not let the LLM freestyle regulatory numbers. Authoritative limits and ratios live in a structured regulatory fact store (versioned, effective-dated). The assistant answers from RAG over approved documents plus lookup of structured facts for any number in the response.

Every user query runs through an orchestrator with session-scoped context only (no shared memory), input guardrails (PII detection, prompt injection, policy intent routing), then generation constrained to citation-tagged spans. A verification stage checks that each sentence has a citation, that cited chunks support the claim, that numeric tokens match the fact store, and that outbound text is scrubbed for PII. Only then does the API return the answer. Whether the answer is released or blocked, an append-only audit event captures prompts, retrieval ids, model versions, verifier results, and the final text hash.

Phrase that lands well: “In banking, ‘safe LLM’ means deny by default—the default action when verification fails is no answer, not a softer guess.”

Step 3 — Non-negotiable requirements

Requirement	Operational meaning	Architectural lever
No hallucinated regulatory figures	Rates, thresholds, dates must match approved source	Structured fact store + numeric verifier; abstain if no match
Cite every claim	No uncited sentences in user-visible answer	Structured output schema; citation linker; claim–evidence checker
No cross-session PII leak	User A’s customer data never influences User B	Stateless workers; per-session vault; no global chat memory
Full auditability	Replay any answer months later for exam	Immutable audit log + artifact store (chunks, hashes, versions)
Access control	Employee sees only what HR/role allows	Entitlements on retrieval; filter-first search
Data minimization	Logs useful for audit, not a second data lake of secrets	Tokenize PII in logs; store chunk ids not full customer rows

Step 4 — High-level architecture

flowchart TB
  subgraph client [Client]
    UI[Internal assistant UI]
  end
  subgraph edge [Edge and identity]
    GW[API gateway + mTLS]
    IAM[Bank SSO / entitlements]
    SESS[Session service - no shared memory]
  end
  subgraph guard_in [Input guardrails]
    INJ[Injection / jailbreak filter]
    PII_IN[PII detect - block or mask query]
    ROUTE[Intent router - risk tier]
  end
  subgraph knowledge [Governed knowledge]
    CORP[Approved document corpus]
    FACT[Regulatory fact store]
    RAG[ACL-aware hybrid retrieval]
  end
  subgraph gen [Controlled generation]
    ORCH[Orchestrator]
    LLM[Private LLM gateway]
    STRUCT[Structured answer + citations]
  end
  subgraph guard_out [Output guardrails]
    CITE[Citation completeness]
    NUM[Numeric / date verifier]
    PII_OUT[PII scrubber]
    POL[Policy / tone classifier]
  end
  subgraph audit [Audit plane]
    EVT[Append-only audit events]
    ART[Artifact store - chunk snapshots]
    SIEM[SIEM + compliance dashboards]
  end
  UI --> GW --> IAM --> SESS
  SESS --> INJ --> PII_IN --> ROUTE
  ROUTE --> ORCH
  CORP --> RAG
  FACT --> ORCH
  RAG --> ORCH
  ORCH --> LLM --> STRUCT
  STRUCT --> CITE --> NUM --> PII_OUT --> POL
  POL --> GW
  ORCH --> EVT
  RAG --> ART
  STRUCT --> ART
  EVT --> SIEM

Step 5 — Session isolation (PII must not cross users)

Design rules

No global conversation store keyed only by user id that accumulates all past queries—compliance hates “the model remembered my client.”
Session id per browser tab / shift; context window built from that session’s messages only, TTL 8–24 hours, encrypted at rest.
Workers are stateless; session state in a dedicated store with strict ACL (session owner only).
Retrieval is never conditioned on another user’s history. Optional “personal notes” feature is a separate encrypted bucket with explicit opt-in.
Prompt assembly in a trusted service—not client-side—so users cannot inject another session id.
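
The design rules above can be sketched as a small server-side session store. This is a minimal illustration, not a production design: the names SessionStore, SessionAccessError, and build_context are hypothetical, the store is an in-memory dict, and encryption at rest is left out. The owner id is assumed to come from a verified SSO token, never from the client payload.

```python
import time
from dataclasses import dataclass, field


class SessionAccessError(Exception):
    """Raised when a caller is not the session owner or the session has expired."""


@dataclass
class Session:
    session_id: str
    owner_id: str        # assumed to come from the verified SSO token
    expires_at: float    # absolute expiry time in seconds
    messages: list = field(default_factory=list)


class SessionStore:
    """Hypothetical in-memory store; a real one would be encrypted and ACL-protected."""

    def __init__(self, ttl_seconds=8 * 3600):
        self._ttl = ttl_seconds
        self._sessions = {}

    def create(self, session_id, owner_id, now=None):
        now = time.time() if now is None else now
        session = Session(session_id, owner_id, now + self._ttl)
        self._sessions[session_id] = session
        return session

    def _load(self, session_id, caller_id, now):
        session = self._sessions.get(session_id)
        # Same error for "unknown" and "not yours" so ids cannot be probed.
        if session is None or session.owner_id != caller_id:
            raise SessionAccessError("unknown session or caller is not the owner")
        if now >= session.expires_at:
            del self._sessions[session_id]
            raise SessionAccessError("session expired")
        return session

    def append(self, session_id, caller_id, message, now=None):
        now = time.time() if now is None else now
        self._load(session_id, caller_id, now).messages.append(message)

    def build_context(self, session_id, caller_id, now=None):
        """Return only this session's messages; nothing is shared across sessions."""
        now = time.time() if now is None else now
        return list(self._load(session_id, caller_id, now).messages)
```

Because build_context takes the caller id from the server side and checks ownership on every read, a guessed or pasted session id from another user yields an error rather than that user's history.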

What you log vs what you keep in context

Data	In LLM context	In audit log
Policy PDF chunks	Yes	Chunk ids + hash
Customer account number	Only if role allows; often blocked in v1	Tokenized reference
Prior user’s chat	Never	N/A

Step 6 — Governed knowledge layer

Approved document corpus

Only documents with approval_status=published and a populated effective_date enter the index.
Versioning: Basel memo v3 vs v4 both exist; retrieval filters by as_of date from query or user-selected “policy as of.”
Hybrid search (lexical + vector) with entitlements—same patterns as enterprise RAG.
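
A minimal sketch of the "policy as of" filter described above. The DocVersion type and the effective_from / effective_to pair are assumptions made for illustration (the text only names approval_status and effective_date); the validity window is treated as half-open, so a version stops applying on the day its successor takes effect.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DocVersion:
    doc_id: str
    version: int
    approval_status: str
    effective_from: Optional[date]        # None means no effective date was set
    effective_to: Optional[date] = None   # None means still in force


def eligible_versions(docs, as_of):
    """Keep published versions whose validity window contains the as_of date."""
    eligible = []
    for doc in docs:
        if doc.approval_status != "published" or doc.effective_from is None:
            continue  # unapproved or undated documents never enter retrieval
        started = doc.effective_from <= as_of
        not_ended = doc.effective_to is None or as_of < doc.effective_to
        if started and not_ended:
            eligible.append(doc)
    return eligible
```

Applying this filter before search, rather than after ranking, keeps superseded versions out of the candidate set entirely.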

Regulatory fact store (anti-hallucination for numbers)

Extract or manually curate structured records:

RegulatoryFact {
  fact_id, jurisdiction, regulator,
  metric: "LCR_minimum", value: 1.0, unit: "ratio",
  effective_from, effective_to,
  source_doc_id, source_page, approval_workflow_id
}

When the model mentions a ratio, threshold, or deadline, the numeric verifier resolves the span against this store (and cited chunk text). Mismatch → block response.
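
As a rough sketch of that verifier, the snippet below checks percentages in an answer against ratio facts in force on a given date. It is deliberately narrow: it handles only the "N%" form, uses a trimmed-down RegulatoryFact, and the helper names (facts_in_force, extract_ratios, numeric_pass) are hypothetical. A real verifier would also parse amounts and dates and compare against the cited chunk text.

```python
import re
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class RegulatoryFact:
    fact_id: str
    metric: str
    value: Decimal
    unit: str
    effective_from: date
    effective_to: Optional[date] = None  # None means still in force


def facts_in_force(facts, as_of):
    """Facts whose effective window contains as_of (end date exclusive)."""
    return [
        f for f in facts
        if f.effective_from <= as_of
        and (f.effective_to is None or as_of < f.effective_to)
    ]


_PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def extract_ratios(text):
    """Return every percentage in the text as a Decimal ratio (100% becomes 1)."""
    return [Decimal(match) / Decimal(100) for match in _PERCENT.findall(text)]


def numeric_pass(answer_text, facts, as_of):
    """True only if every percentage in the answer equals an in-force ratio fact."""
    allowed = {f.value for f in facts_in_force(facts, as_of) if f.unit == "ratio"}
    return all(value in allowed for value in extract_ratios(answer_text))
```

Decimal is used instead of float so that "100%" and a stored value of 1.0 compare exactly. A stricter variant would also require each number to match the specific metric the sentence is about, not just any approved value.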

Step 7 — Guardrails (input, generation, output)

Input guardrails

Check	Action
Prompt injection / exfiltration	Classifier + allowlist tools; strip “ignore previous instructions” patterns
PII in user query	Detect SSN/account patterns; refuse or mask before logging
High-risk intent	“Wire $1M”, “bypass AML”—route to block or human queue
Out-of-corpus questions	“What will the Fed do tomorrow?”—refuse; no open-web browse in v1
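
The PII row can be illustrated with a small masking helper. The patterns below are assumptions for the sketch only: a US-style SSN shape and a 10 to 12 digit run standing in for an account number. A bank would substitute its own formats and a proper DLP service.

```python
import re

# Hypothetical patterns; real account formats vary by institution.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ACCOUNT": re.compile(r"\b\d{10,12}\b"),
}


def mask_pii(text):
    """Replace matches with a label token and report which PII types were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[{label}]", text)
    return text, found
```

The orchestrator can then refuse the query when the found list is non-empty, or forward and log only the masked text.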

Generation constraints

Force JSON or XML answer schema: claims[] each with text, citation_ids[], confidence.
System prompt: “If no supporting chunk, output abstain with reason code.”
Temperature low; no creative mode for regulatory tier.
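
One way to enforce the answer schema is to parse and validate the model output before anything else touches it. The sketch below assumes the JSON variant; the abstain and reason_code field names, and the SchemaError type, are hypothetical. It checks types only, leaving citation completeness to the output guardrails.

```python
import json


class SchemaError(ValueError):
    """Raised when model output does not match the expected answer schema."""


def parse_answer(raw):
    """Parse model output into a validated dict, or raise SchemaError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise SchemaError("top level must be an object")

    if data.get("abstain"):
        reason = data.get("reason_code")
        if not isinstance(reason, str):
            raise SchemaError("abstain requires a reason_code string")
        return {"abstain": True, "reason_code": reason, "claims": []}

    claims = data.get("claims")
    if not isinstance(claims, list) or not claims:
        raise SchemaError("claims must be a non-empty list")
    for claim in claims:
        if not isinstance(claim, dict):
            raise SchemaError("each claim must be an object")
        text = claim.get("text")
        if not isinstance(text, str) or not text.strip():
            raise SchemaError("claim.text must be a non-empty string")
        ids = claim.get("citation_ids")
        if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
            raise SchemaError("claim.citation_ids must be a list of strings")
        conf = claim.get("confidence")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
            raise SchemaError("claim.confidence must be a number in [0, 1]")
    return {"abstain": False, "claims": claims}
```

Treating a SchemaError as a blocked answer keeps free-form text from slipping past the structured path.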

Output guardrails (hard gate before user sees text)

Citation completeness: every claim has ≥1 citation id present in retrieval set.
Claim–evidence alignment: NLI or cross-encoder scores claim vs cited chunk; below threshold → drop claim or whole answer.
Numeric verifier: regex + parser on amounts, %, dates; join to RegulatoryFact and cited tables.
PII scrubber: block response if unexpected PII patterns appear in output.
Policy classifier: no investment advice, no legal conclusions presented as fact.
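
The "hard gate" idea can be sketched as an ordered chain of checks where the first failure blocks the answer. The function name run_output_gate and the check signature are hypothetical, and the individual checks are passed in as plain callables so that any verifier (citation, numeric, PII, policy) can be plugged in.

```python
SAFE_MESSAGE = "Unable to verify answer"


def run_output_gate(text, checks):
    """Run (name, check) pairs in order.

    Returns (text_to_show, failed_check_name). failed_check_name is None
    when every check passed and the original text is released.
    """
    for name, check in checks:
        try:
            passed = check(text) is True
        except Exception:
            passed = False  # a crashing verifier counts as a failure (fail closed)
        if not passed:
            return SAFE_MESSAGE, name
    return text, None
```

A usage example with two toy checks:

```python
checks = [
    ("citation", lambda t: "[c1]" in t),
    ("pii", lambda t: "123-45-6789" not in t),
]
shown, failed = run_output_gate("LCR minimum is 100% [c1]", checks)
```

Here shown is the original text and failed is None; removing the "[c1]" marker would return the safe message and "citation".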

Step 8 — Citation enforcement (not optional formatting)

Citations are a release gate, not UI decoration.

Pipeline

Retrieval returns chunks with stable chunk_id, doc_version, page, deep link.
Model emits claims bound to citation_ids only from that set.
Citation linker validates ids exist and were in the prompt context.
Support checker verifies paraphrase against chunk text (entailment score).
Renderer shows footnotes; audit stores the same mapping.

Failure modes

Failure	User experience	Audit
Missing citation	“Unable to verify answer”—generic safe message	`VERIFY_FAIL:UNCITED_CLAIM`
Citation id forged	Blocked	`VERIFY_FAIL:INVALID_CITE`
Supported but weak	Show answer with “low confidence” banner, or route to human review queue	Score stored
Numeric mismatch	Blocked; offer link to source doc only	`VERIFY_FAIL:NUMERIC_DRIFT`
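
The first two rows of the table map directly onto a citation linker. The sketch below is a minimal version under stated assumptions: claims are dicts with a citation_ids list, prompt_chunk_ids is the set of chunk ids that were actually placed in the prompt, and check_citations is a hypothetical name. Entailment scoring is out of scope here.

```python
def check_citations(claims, prompt_chunk_ids):
    """Return None if every claim cites chunks from the prompt, else a failure code."""
    allowed = set(prompt_chunk_ids)
    for claim in claims:
        ids = claim.get("citation_ids") or []
        if not ids:
            return "VERIFY_FAIL:UNCITED_CLAIM"
        if any(cid not in allowed for cid in ids):
            # The id was never retrieved for this turn, so it is forged or stale.
            return "VERIFY_FAIL:INVALID_CITE"
    return None
```

Returning the code rather than a boolean lets the same value drive both the generic user message and the audit record.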

Step 9 — Audit logging (every response, exam-ready)

One audit event per assistant turn, append-only (WORM storage or ledger table with hash chain).

AuditEvent {
  event_id, timestamp_utc,
  user_id_hash, session_id, entitlements_snapshot,
  query_redacted, query_hash,
  risk_tier, intent_label,
  retrieval: { chunk_ids[], scores[], corpus_version },
  model: { provider, model_id, prompt_template_version },
  generation: { raw_structured_output_hash },
  verification: { cite_pass, numeric_pass, pii_pass, scores[] },
  response_redacted, response_hash,
  latency_ms_per_stage
}
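
The hash-chain option can be sketched in a few lines. This is an in-memory illustration only (the AuditLog class and its field names prev_hash and event_hash are hypothetical); it shows how each event commits to its predecessor so that any later edit is detectable, not how to build durable WORM storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first event


def _digest(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self):
        self._events = []

    def append(self, payload):
        """Append one event and return its hash. Payload must be JSON-serializable."""
        stored = json.loads(json.dumps(payload))  # detach from the caller's dict
        prev = self._events[-1]["event_hash"] if self._events else GENESIS_HASH
        self._events.append({
            "payload": stored,
            "prev_hash": prev,
            "event_hash": _digest(prev, stored),
        })
        return self._events[-1]["event_hash"]

    def verify(self):
        """Recompute the chain from the start; False if any event was altered."""
        prev = GENESIS_HASH
        for event in self._events:
            if event["prev_hash"] != prev:
                return False
            if event["event_hash"] != _digest(prev, event["payload"]):
                return False
            prev = event["event_hash"]
        return True
```

Sorting keys before hashing makes the digest independent of dict ordering, which matters when events are re-serialized during replay.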

Artifact store

Snapshot retrieved chunk text at answer time (or content hash + pointer)—so policy updates do not rewrite history.
Retain 7 years (typical banking archive policy—confirm with compliance).
Access: internal audit, model risk, security—role-based, dual control for bulk export.
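
The "content hash + pointer" variant amounts to content-addressed storage: the audit event keeps only the hash, and the artifact store keeps the text under that hash. A minimal in-memory sketch, with the ArtifactStore name and its methods being hypothetical:

```python
import hashlib


class ArtifactStore:
    """Stores chunk text under its SHA-256 so later policy edits cannot alter it."""

    def __init__(self):
        self._blobs = {}

    def put(self, chunk_text):
        key = hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()
        # Identical text maps to the same key, so repeats are stored once.
        self._blobs.setdefault(key, chunk_text)
        return key

    def get(self, key):
        text = self._blobs[key]
        # Re-hash on read to detect corruption or tampering in storage.
        if hashlib.sha256(text.encode("utf-8")).hexdigest() != key:
            raise ValueError("artifact does not match its content hash")
        return text
```

When the policy document is revised, the new chunk text hashes to a new key, and the old snapshot stays retrievable under the key recorded at answer time.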

What you deliberately do not log

Full customer PII in plaintext.
Vendor “training” telemetry—contract must prohibit retention on their side too.

Step 10 — Risk tiers and human escalation

Tier	Examples	Path
Low	“Where is the travel policy?”	Auto answer after verification
Medium	“Summarize this compliance memo”	Citations required; optional sample review
High	Capital treatment, AML scenarios, customer-specific advice	Human approval or hard abstain + link to source
Blocked	Fraud enablement, credential harvesting	Refuse; security alert
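
To make the routing concrete, here is a toy router that maps a query to a tier and its path. The keyword lists are purely illustrative stand-ins for a trained intent classifier (substring matching like this would misfire in practice), and the names Tier, route, and PATHS are hypothetical.

```python
from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    BLOCKED = "blocked"


PATHS = {
    Tier.LOW: "auto answer after verification",
    Tier.MEDIUM: "citations required; optional sample review",
    Tier.HIGH: "human approval or hard abstain + link to source",
    Tier.BLOCKED: "refuse; security alert",
}

# Checked from most to least severe so the strictest matching tier wins.
# Illustrative keywords only; a real router would use a classifier.
_RULES = [
    (Tier.BLOCKED, ("bypass aml", "password", "credentials")),
    (Tier.HIGH, ("capital treatment", "aml", "customer")),
    (Tier.MEDIUM, ("summarize", "memo")),
]


def route(query):
    text = query.lower()
    for tier, keywords in _RULES:
        if any(keyword in text for keyword in keywords):
            return tier
    return Tier.LOW
```

Ordering the rules by severity is the design point worth keeping: when a query matches several tiers, the most restrictive path applies.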

Step 11 — Failure points and mitigations

Failure	Impact	Mitigation
Stale policy in index	Wrong regulatory answer	Effective-dated corpus; nightly diff alerts; UI “as of” date
Verifier timeout	Pressure to ship unverified text	Fail closed; async “answer pending review”
Session fixation / IDOR	Cross-user data exposure	Bind session to SSO token; server-side session only
Prompt injection via doc body	“Ignore policies” in PDF	Sanitize ingest; treat docs as data not instructions
Over-logging	PII in SIEM	Redact + tokenize; separate security vs compliance views
Model upgrade	Regression in citation format	Golden-set eval gate; shadow traffic before promotion
Employee pastes customer PII	Leak to logs/vendor	Input block; train UI warnings; DLP on egress
“Almost right” number	Exam finding	Structured fact store; no numbers without verification pass
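
The "fail closed" mitigation for verifier timeouts can be sketched with the standard library's thread pool. The function name release_or_block is hypothetical, and the verifier is assumed to be a callable returning True or False. Anything other than an explicit True within the deadline, including a timeout or an exception, results in the safe message.

```python
from concurrent.futures import ThreadPoolExecutor

SAFE_MESSAGE = "Unable to verify answer"


def release_or_block(answer, verifier, timeout_s):
    """Return the answer only if verifier(answer) returns True within timeout_s."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(verifier, answer)
    try:
        passed = future.result(timeout=timeout_s) is True
    except Exception:
        # Covers the timeout error as well as any exception raised by the verifier.
        passed = False
    finally:
        # Do not wait for a slow verifier; the caller gets the safe message now.
        pool.shutdown(wait=False)
    return answer if passed else SAFE_MESSAGE
```

In a real service the timed-out turn would also be queued for the async "answer pending review" path rather than simply dropped.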

Step 12 — Model risk and compliance hooks

Model inventory: register the LLM, the embedding model, and each verifier, with owners and validation dates.
Offline eval: hundreds of gold Q&A from compliance; metrics: citation accuracy, numeric exact match, abstain rate on unanswerable.
Online monitoring: verifier fail rate, abstain rate, thumbs-down → weekly review with compliance.
Incident response: kill switch per model version; replay audit events for affected window.
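
The offline metrics can be computed from per-question grading results. The sketch below assumes one reading of the three metrics, which compliance would need to confirm: citation accuracy and numeric exact match are measured over answerable gold questions, and abstain rate over unanswerable ones. The row fields (answerable, citations_correct, numbers_match, abstained) are hypothetical.

```python
def eval_metrics(results):
    """Aggregate graded gold-set rows into the three offline metrics.

    Each row is a dict with boolean fields: answerable, citations_correct,
    numbers_match, abstained. A metric is None when it has no rows to score.
    """
    answerable = [r for r in results if r["answerable"]]
    unanswerable = [r for r in results if not r["answerable"]]

    def rate(rows, key):
        if not rows:
            return None
        return sum(1 for r in rows if r[key]) / len(rows)

    return {
        "citation_accuracy": rate(answerable, "citations_correct"),
        "numeric_exact_match": rate(answerable, "numbers_match"),
        "abstain_rate_unanswerable": rate(unanswerable, "abstained"),
    }
```

A promotion gate would compare these values against agreed thresholds before a new model version is allowed to serve traffic.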

Step 13 — MVP phasing (defensible scope)

Phase 1: Policy Q&A only, no live customer data, citations + audit + abstain.
Phase 2: Regulatory fact store + numeric verifier for ratios and deadlines.
Phase 3: Role-scoped customer context with field-level tokenization and extra human review tier.

Shipping Phase 3 before Phase 1 verification exists is how banks get exam findings.

Step 14 — How to walk through this in a design session

3 min — requirements table (four non-negotiables).
5 min — architecture diagram: identity → guardrails → knowledge → verify → audit.
8 min — citation + numeric verification path (fail closed).
7 min — session isolation and what never goes in shared memory.
7 min — audit event schema and retention.
5 min — failure matrix + MVP phases.
Close — “We optimize for provable correctness, not fluent prose.”

Step 15 — Goals → knobs

Goal	Knob
Safer numbers	Stricter fact store match; block all free-form numerics
Higher coverage	More corpus sources—each adds approval workflow cost
Lower latency	Pre-verify common queries; cache verified answers with TTL
Stronger audit	Chunk snapshots; longer retention; hash-chained events
Less PII risk	No customer context in v1; aggressive input/output DLP
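
The latency knob of caching verified answers needs care so that it does not undo the other guarantees. A minimal sketch, assuming the cache key combines the query hash, the corpus version, and an entitlement scope (all three names are illustrative), so that a policy update or a different role never hits a stale or out-of-scope entry:

```python
import time


class VerifiedAnswerCache:
    """Caches answers that already passed verification, with a time-to-live."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._entries = {}

    def put(self, query_hash, corpus_version, entitlement_scope, answer, now=None):
        now = time.time() if now is None else now
        key = (query_hash, corpus_version, entitlement_scope)
        self._entries[key] = (now + self._ttl, answer)

    def get(self, query_hash, corpus_version, entitlement_scope, now=None):
        """Return the cached answer, or None on a miss or after expiry."""
        now = time.time() if now is None else now
        key = (query_hash, corpus_version, entitlement_scope)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, answer = entry
        if now >= expires_at:
            del self._entries[key]
            return None
        return answer
```

A cache hit should still produce its own audit event, pointing at the original verification record, so that every turn remains replayable.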

The one line to remember

A bank-safe assistant is a verified reporting pipeline, not a chatbot: retrieve from approved sources, generate with mandatory citations, prove every number and claim before display, isolate every session, and log enough to replay the decision under audit—when verification fails, silence beats fluency.