sharpbyte.dev

LLM system safe for regulated industries (banking)

A top-tier bank wants an internal AI assistant for employees—not a consumer chatbot. Compliance is not a footer slide: the system must not invent regulatory numbers, must cite every claim, must not leak PII across sessions, and must produce an immutable audit trail for every answer.

Scenario

Design an LLM system that is safe for regulated industries.

A top-tier bank wants to deploy an internal AI assistant. It must never hallucinate regulatory figures, must cite sources for every claim, must never leak PII across user sessions, and must be auditable for every response it generates.

Design the full architecture including guardrails, audit logging, and citation enforcement.

What you should be able to do after reading:

Step 0 — How to open the session

  1. Who uses it? Branch staff, compliance analysts, engineers—each has different data scope and risk.
  2. What is “regulatory figure”? Capital ratios, reserve requirements, fee caps, disclosure deadlines—define the taxonomy.
  3. What sources are authoritative? Only approved policy corpus + regulator bulletins, or also email/Slack?
  4. What does “never hallucinate” mean operationally? Abstain, or only speak from verified structured fields?
  5. Retention and subpoena: how long audit logs live, who can read them, redaction rules.

Step 1 — Clarifying questions

| Area | Question | Why |
| --- | --- | --- |
| Jurisdiction | US-only, EU GDPR, multi-entity? | Data residency and log handling |
| Actions | Read-only Q&A or initiate workflows? | Tool use multiplies audit and fraud risk |
| Customer data | Can the assistant see live account PII? | Often no in v1; policy docs only |
| Model hosting | On-prem, private VPC, or vendor API? | Contractual no-training, log prohibition |
| Human review | When must compliance approve an answer template? | High-risk intents get human-in-the-loop |
| SLA | Latency vs safety: block 30s for verification? | Sets synchronous vs async answer path |

Step 2 — The sixty-second answer

I would not let the LLM freestyle regulatory numbers. Authoritative limits and ratios live in a structured regulatory fact store (versioned, effective-dated). The assistant answers from RAG over approved documents plus lookup of structured facts for any number in the response.

Every user query runs through an orchestrator with session-scoped context only (no shared memory), input guardrails (PII detection, prompt injection, policy intent routing), then generation constrained to citation-tagged spans. A verification stage checks: each sentence has a citation, cited chunks support the claim, numeric tokens match the fact store, and outbound text is scrubbed for PII. Only then does the API return—and an append-only audit event captures prompts, retrieval ids, model versions, verifier results, and final text hash.

Phrase that lands well: “In banking, ‘safe LLM’ means deny by default—the default action when verification fails is no answer, not a softer guess.”

Step 3 — Non-negotiable requirements

| Requirement | Operational meaning | Architectural lever |
| --- | --- | --- |
| No hallucinated regulatory figures | Rates, thresholds, dates must match approved source | Structured fact store + numeric verifier; abstain if no match |
| Cite every claim | No uncited sentences in user-visible answer | Structured output schema; citation linker; claim–evidence checker |
| No cross-session PII leak | User A’s customer data never influences User B | Stateless workers; per-session vault; no global chat memory |
| Full auditability | Replay any answer months later for exam | Immutable audit log + artifact store (chunks, hashes, versions) |
| Access control | Employee sees only what HR/role allows | Entitlements on retrieval; filter-first search |
| Data minimization | Logs useful for audit, not a second data lake of secrets | Tokenize PII in logs; store chunk ids not full customer rows |
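The "filter-first search" lever in the access-control row can be illustrated with a minimal sketch. It assumes each chunk carries a set of allowed roles; the `Chunk` type, the `retrieve` function, and the keyword-overlap score are all hypothetical stand-ins for a real ACL-aware hybrid retriever. The point is only the ordering: entitlements are applied before ranking, so a chunk the user may not see can never reach the prompt.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    allowed_roles: frozenset[str]  # assumed entitlement model: role names


def retrieve(query: str, chunks: list[Chunk], user_roles: set[str], top_k: int = 3) -> list[Chunk]:
    """Filter by entitlement first, then rank only what the user may see."""
    # Step 1: drop every chunk the caller is not entitled to read.
    visible = [c for c in chunks if c.allowed_roles & user_roles]

    # Step 2: toy relevance score (shared lowercase words); a real system
    # would use hybrid lexical + vector search here.
    terms = set(query.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in visible]
    scored = [(score, c) for score, c in scored if score > 0]

    # Highest score first; chunk_id breaks ties so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1].chunk_id))
    return [c for _, c in scored[:top_k]]
```

Ranking first and filtering afterwards would return the same visible chunks in this toy case, but it lets restricted text influence scores and caches, which is the leak this ordering avoids.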

Step 4 — High-level architecture

flowchart TB
  subgraph client [Client]
    UI[Internal assistant UI]
  end
  subgraph edge [Edge and identity]
    GW[API gateway + mTLS]
    IAM[Bank SSO / entitlements]
    SESS[Session service - no shared memory]
  end
  subgraph guard_in [Input guardrails]
    INJ[Injection / jailbreak filter]
    PII_IN[PII detect - block or mask query]
    ROUTE[Intent router - risk tier]
  end
  subgraph knowledge [Governed knowledge]
    CORP[Approved document corpus]
    FACT[Regulatory fact store]
    RAG[ACL-aware hybrid retrieval]
  end
  subgraph gen [Controlled generation]
    ORCH[Orchestrator]
    LLM[Private LLM gateway]
    STRUCT[Structured answer + citations]
  end
  subgraph guard_out [Output guardrails]
    CITE[Citation completeness]
    NUM[Numeric / date verifier]
    PII_OUT[PII scrubber]
    POL[Policy / tone classifier]
  end
  subgraph audit [Audit plane]
    EVT[Append-only audit events]
    ART[Artifact store - chunk snapshots]
    SIEM[SIEM + compliance dashboards]
  end
  UI --> GW --> IAM --> SESS
  SESS --> INJ --> PII_IN --> ROUTE
  ROUTE --> ORCH
  CORP --> RAG
  FACT --> ORCH
  RAG --> ORCH
  ORCH --> LLM --> STRUCT
  STRUCT --> CITE --> NUM --> PII_OUT --> POL
  POL --> GW
  ORCH --> EVT
  RAG --> ART
  STRUCT --> ART
  EVT --> SIEM
    

Step 5 — Session isolation (PII must not cross users)

Design rules

What you log vs what you keep in context

| Data | In LLM context | In audit log |
| --- | --- | --- |
| Policy PDF chunks | Yes | Chunk ids + hash |
| Customer account number | Only if role allows; often blocked in v1 | Tokenized reference |
| Prior user’s chat | Never | N/A |
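One way to picture "per-session vault, no global chat memory" is a store keyed by session id that also checks ownership on every read. The sketch below is an assumption-laden illustration, not a production design: `SessionVault` and `SessionContext` are hypothetical names, and a real deployment would back this with a server-side session service bound to the SSO token.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Conversation state owned by exactly one authenticated session."""
    session_id: str
    user_id: str
    turns: list[str] = field(default_factory=list)


class SessionVault:
    """Hypothetical per-session store: there is no API to read across sessions."""

    def __init__(self) -> None:
        self._by_session: dict[str, SessionContext] = {}

    def open(self, session_id: str, user_id: str) -> SessionContext:
        if session_id in self._by_session:
            raise ValueError("session id already in use")
        ctx = SessionContext(session_id, user_id)
        self._by_session[session_id] = ctx
        return ctx

    def get(self, session_id: str, user_id: str) -> SessionContext:
        ctx = self._by_session.get(session_id)
        # Same error for "unknown" and "not yours", so ids cannot be probed.
        if ctx is None or ctx.user_id != user_id:
            raise PermissionError("session not found for this user")
        return ctx

    def close(self, session_id: str, user_id: str) -> None:
        self.get(session_id, user_id)  # ownership check before deletion
        del self._by_session[session_id]
```

The orchestrator would build each prompt only from the `SessionContext` returned by `get`, so a prior user's chat has no path into another user's context.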

Step 6 — Governed knowledge layer

Approved document corpus

Regulatory fact store (anti-hallucination for numbers)

Extract or manually curate structured records:

RegulatoryFact {
  fact_id, jurisdiction, regulator,
  metric: "LCR_minimum", value: 1.0, unit: "ratio",
  effective_from, effective_to,
  source_doc_id, source_page, approval_workflow_id
}

When the model mentions a ratio, threshold, or deadline, the numeric verifier resolves the span against this store (and cited chunk text). Mismatch → block response.
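The lookup-then-compare step can be sketched as follows. This is a minimal illustration that keeps only a subset of the `RegulatoryFact` fields above; `fact_in_force` and `verify_numeric` are hypothetical helper names, and the tolerance is an assumption. Any record values used with it should be treated as placeholders, not as regulatory data.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RegulatoryFact:
    fact_id: str
    jurisdiction: str
    metric: str
    value: float
    unit: str
    effective_from: date
    effective_to: Optional[date]  # None means the fact is still in force
    source_doc_id: str


def fact_in_force(
    facts: list[RegulatoryFact], metric: str, jurisdiction: str, as_of: date
) -> Optional[RegulatoryFact]:
    """Return the single fact effective on `as_of`, or None if zero or several match."""
    matches = [
        f for f in facts
        if f.metric == metric
        and f.jurisdiction == jurisdiction
        and f.effective_from <= as_of
        and (f.effective_to is None or as_of <= f.effective_to)
    ]
    # Ambiguity is treated like absence: the caller should abstain.
    return matches[0] if len(matches) == 1 else None


def verify_numeric(claimed: float, fact: Optional[RegulatoryFact], tolerance: float = 1e-9) -> bool:
    """Fail closed: a missing fact never passes."""
    if fact is None:
        return False
    return abs(claimed - fact.value) <= tolerance
```

Returning `None` for overlapping effective periods is a deliberate choice in this sketch: a data-quality problem in the store then surfaces as an abstention rather than as an arbitrary pick.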

Step 7 — Guardrails (input, generation, output)

Input guardrails

| Check | Action |
| --- | --- |
| Prompt injection / exfiltration | Classifier + allowlist tools; strip “ignore previous instructions” patterns |
| PII in user query | Detect SSN/account patterns; refuse or mask before logging |
| High-risk intent | “Wire $1M”, “bypass AML”: route to block or human queue |
| Out-of-corpus questions | “What will the Fed do tomorrow?”: refuse; no open-web browse in v1 |
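A rough sketch of the first two rows, under stated assumptions: regular expressions stand in for the injection classifier, the SSN pattern covers only the dashed US format, and the 10 to 16 digit account pattern is a guess at what an account number looks like. `screen_query` is a hypothetical name. It returns a decision plus a masked copy of the query that is safe to log.

```python
from __future__ import annotations

import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ACCOUNT_RE = re.compile(r"\b\d{10,16}\b")  # assumed account-number shape
INJECTION_RE = re.compile(
    r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE
)


def screen_query(query: str) -> tuple[str, str]:
    """Return (decision, text_safe_to_log); decision is 'block', 'masked', or 'allow'."""
    # Mask PII first so nothing downstream, including logs, sees raw values.
    masked = SSN_RE.sub("[SSN]", query)
    masked = ACCOUNT_RE.sub("[ACCOUNT]", masked)

    if INJECTION_RE.search(query):
        return "block", masked
    if masked != query:
        return "masked", masked
    return "allow", masked
```

Pattern matching alone is easy to evade, so in practice this would sit in front of, not instead of, the classifier named in the table.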

Generation constraints

Output guardrails (hard gate before user sees text)

  1. Citation completeness: every claim has ≥1 citation id present in retrieval set.
  2. Claim–evidence alignment: NLI or cross-encoder scores claim vs cited chunk; below threshold → drop claim or whole answer.
  3. Numeric verifier: regex + parser on amounts, %, dates; join to RegulatoryFact and cited tables.
  4. PII scrubber: block response if unexpected PII patterns appear in output.
  5. Policy classifier: no investment advice, no legal conclusions presented as fact.
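The five checks above behave as a chain in which any failure, including a verifier crash or timeout, replaces the answer with a safe message. The sketch below shows that control flow only; `release_gate`, `CheckResult`, and the `VERIFY_FAIL:CHECK_ERROR` code are hypothetical, and each real check (citations, entailment, numerics, PII, policy) would be plugged in as a callable.

```python
from __future__ import annotations

from typing import Callable, NamedTuple


class CheckResult(NamedTuple):
    passed: bool
    code: str  # recorded in the audit event either way


Check = Callable[[str], CheckResult]

SAFE_MESSAGE = "Unable to verify answer"


def release_gate(answer: str, checks: list[Check]) -> tuple[str, list[str]]:
    """Run checks in order; return (text_to_show, codes_for_audit)."""
    codes: list[str] = []
    for check in checks:
        try:
            result = check(answer)
        except Exception:
            # A crashed or timed-out verifier counts as a failure (fail closed).
            result = CheckResult(False, "VERIFY_FAIL:CHECK_ERROR")
        codes.append(result.code)
        if not result.passed:
            return SAFE_MESSAGE, codes
    return answer, codes
```

Stopping at the first failure keeps latency down; running every check regardless would give richer audit data at the cost of extra time on answers that are blocked anyway.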

Step 8 — Citation enforcement (not optional formatting)

Citations are a release gate, not UI decoration.

Pipeline

  1. Retrieval returns chunks with stable chunk_id, doc_version, page, deep link.
  2. Model emits claims bound to citation_ids only from that set.
  3. Citation linker validates ids exist and were in the prompt context.
  4. Support checker verifies paraphrase against chunk text (entailment score).
  5. Renderer shows footnotes; audit stores the same mapping.
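Steps 2 and 3 of the pipeline reduce to a set-membership check, sketched below. The `Claim` type and `link_citations` function are hypothetical; the result codes mirror the audit codes used in the failure table that follows. Step 4 (entailment scoring) is not covered here because it needs a model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    citation_ids: tuple[str, ...]


def link_citations(claims: list[Claim], context_chunk_ids: set[str]) -> str:
    """Check that every claim cites at least one chunk, and only chunks that were in the prompt."""
    for claim in claims:
        if not claim.citation_ids:
            return "VERIFY_FAIL:UNCITED_CLAIM"
        # An id outside the retrieval set was invented by the model.
        if any(cid not in context_chunk_ids for cid in claim.citation_ids):
            return "VERIFY_FAIL:INVALID_CITE"
    return "PASS"
```

Passing `context_chunk_ids` from the orchestrator, rather than trusting ids echoed by the model, is what makes a forged citation detectable.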

Failure modes

| Failure | User experience | Audit |
| --- | --- | --- |
| Missing citation | “Unable to verify answer” (generic safe message) | VERIFY_FAIL:UNCITED_CLAIM |
| Citation id forged | Blocked | VERIFY_FAIL:INVALID_CITE |
| Supported but weak | Show answer with “low confidence” banner or human review queue | Score stored |
| Numeric mismatch | Blocked; offer link to source doc only | VERIFY_FAIL:NUMERIC_DRIFT |

Step 9 — Audit logging (every response, exam-ready)

One audit event per assistant turn, append-only (WORM storage or ledger table with hash chain).

AuditEvent {
  event_id, timestamp_utc,
  user_id_hash, session_id, entitlements_snapshot,
  query_redacted, query_hash,
  risk_tier, intent_label,
  retrieval: { chunk_ids[], scores[], corpus_version },
  model: { provider, model_id, prompt_template_version },
  generation: { raw_structured_output_hash },
  verification: { cite_pass, numeric_pass, pii_pass, scores[] },
  response_redacted, response_hash,
  latency_ms_per_stage
}
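The "ledger table with hash chain" option can be sketched in a few lines. This is an in-memory illustration under assumptions: `AuditLog` is a hypothetical class, SHA-256 over canonical JSON is one reasonable choice of digest, and real storage would be a WORM bucket or database table rather than a Python list. Each entry's hash covers the previous entry's hash, so editing or removing any earlier event breaks verification.

```python
from __future__ import annotations

import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first entry's predecessor


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only list of events, each chained to the one before it."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, payload: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry_hash = _digest(prev, payload)
        self._entries.append({"prev_hash": prev, "payload": payload, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False means some entry was altered or removed."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != _digest(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True
```

Here `payload` would be an `AuditEvent` serialized with its redacted fields and hashes only, which keeps the chain verifiable without putting raw PII in the log.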

Artifact store

What you deliberately do not log

Step 10 — Risk tiers and human escalation

| Tier | Examples | Path |
| --- | --- | --- |
| Low | “Where is the travel policy?” | Auto answer after verification |
| Medium | “Summarize this compliance memo” | Citations required; optional sample review |
| High | Capital treatment, AML scenarios, customer-specific advice | Human approval or hard abstain + link to source |
| Blocked | Fraud enablement, credential harvesting | Refuse; security alert |
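The tier table maps naturally onto a small routing function. In the sketch below, the intent labels and path names are hypothetical placeholders for whatever the intent router emits, and sending unrecognized intents to the High tier is an assumption made here in keeping with deny by default.

```python
from __future__ import annotations

from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    BLOCKED = "blocked"


# Path each tier takes, following the table above.
PATHS = {
    Tier.LOW: "auto_answer_after_verification",
    Tier.MEDIUM: "answer_with_citations_and_sample_review",
    Tier.HIGH: "human_approval_or_abstain",
    Tier.BLOCKED: "refuse_and_alert_security",
}

# Hypothetical intent labels; a real router would produce its own taxonomy.
INTENT_TIERS = {
    "policy_lookup": Tier.LOW,
    "memo_summary": Tier.MEDIUM,
    "capital_treatment": Tier.HIGH,
    "aml_scenario": Tier.HIGH,
    "customer_advice": Tier.HIGH,
    "fraud_enablement": Tier.BLOCKED,
    "credential_harvesting": Tier.BLOCKED,
}


def route(intent_label: str) -> tuple[Tier, str]:
    """Map an intent to (tier, path); unknown intents are treated as High."""
    tier = INTENT_TIERS.get(intent_label, Tier.HIGH)
    return tier, PATHS[tier]
```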

Step 11 — Failure points and mitigations

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Stale policy in index | Wrong regulatory answer | Effective-dated corpus; nightly diff alerts; UI “as of” date |
| Verifier timeout | Pressure to ship unverified text | Fail closed; async “answer pending review” |
| Session fixation / IDOR | Cross-user data exposure | Bind session to SSO token; server-side session only |
| Prompt injection via doc body | “Ignore policies” in PDF | Sanitize ingest; treat docs as data not instructions |
| Over-logging | PII in SIEM | Redact + tokenize; separate security vs compliance views |
| Model upgrade | Regression in citation format | Golden-set eval gate; shadow traffic before promotion |
| Employee pastes customer PII | Leak to logs/vendor | Input block; train UI warnings; DLP on egress |
| “Almost right” number | Exam finding | Structured fact store; no numbers without verification pass |

Step 12 — Model risk and compliance hooks

Step 13 — MVP phasing (defensible scope)

  1. Phase 1: Policy Q&A only, no live customer data, citations + audit + abstain.
  2. Phase 2: Regulatory fact store + numeric verifier for ratios and deadlines.
  3. Phase 3: Role-scoped customer context with field-level tokenization and extra human review tier.

Shipping Phase 3 before Phase 1 verification exists is how banks get exam findings.

Step 14 — How to walk through this in a design session

  1. 3 min — requirements table (four non-negotiables).
  2. 5 min — architecture diagram: identity → guardrails → knowledge → verify → audit.
  3. 8 min — citation + numeric verification path (fail closed).
  4. 7 min — session isolation and what never goes in shared memory.
  5. 7 min — audit event schema and retention.
  6. 5 min — failure matrix + MVP phases.
  7. Close — “We optimize for provable correctness, not fluent prose.”

Step 15 — Goals → knobs

| Goal | Knob |
| --- | --- |
| Safer numbers | Stricter fact store match; block all free-form numerics |
| Higher coverage | More corpus sources; each adds approval workflow cost |
| Lower latency | Pre-verify common queries; cache verified answers with TTL |
| Stronger audit | Chunk snapshots; longer retention; hash-chained events |
| Less PII risk | No customer context in v1; aggressive input/output DLP |
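The "cache verified answers with TTL" knob is only safe if a cached answer cannot outlive the corpus version it was verified against or cross entitlement boundaries. The sketch below makes both part of the cache key; `VerifiedAnswerCache` and the `entitlement_scope` parameter are hypothetical, and the injectable clock is there so expiry can be tested without sleeping.

```python
from __future__ import annotations

import time
from typing import Callable, Optional


class VerifiedAnswerCache:
    """TTL cache keyed by (query hash, corpus version, entitlement scope)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: dict[tuple[str, str, str], tuple[float, str]] = {}

    def put(self, query_hash: str, corpus_version: str, entitlement_scope: str, answer: str) -> None:
        # Only answers that already passed verification should be stored.
        key = (query_hash, corpus_version, entitlement_scope)
        self._items[key] = (self._clock() + self._ttl, answer)

    def get(self, query_hash: str, corpus_version: str, entitlement_scope: str) -> Optional[str]:
        key = (query_hash, corpus_version, entitlement_scope)
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, answer = item
        if self._clock() >= expires_at:
            del self._items[key]  # expired: force re-verification
            return None
        return answer
```

Because the corpus version is in the key, publishing a new corpus invalidates every cached answer implicitly, which addresses the stale-policy failure without a separate purge job.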

The one line to remember

A bank-safe assistant is a verified reporting pipeline, not a chatbot: retrieve from approved sources, generate with mandatory citations, prove every number and claim before display, isolate every session, and log enough to replay the decision under audit—when verification fails, silence beats fluency.