sharpbyte.dev
Interview ready · Design · Section 8

Security & compliance

Fifteen staff-depth scenarios on operating LLMs under abuse and regulation: trust-boundary hardening and tiered data deployment; structural injection controls with toolbox governance; hardened indirect-ingestion pipelines; ensemble PII layers with jurisdictional routing; exhaustive GDPR maps and erasure fan-out; vendor contract alignment down to API hostnames; HIPAA-style BAAs and clinical guardrails; multi-tenant AuthN/Z with IDOR testing; tamper-evident regulated audit logs; secret-free prompt hygiene; two-way moderation; graduated jailbreak response; directory-grounded RAG RBAC; vault-first API keys; and STRIDE/OWASP-driven risk registers tied to real backlog work.

Interview stance. Security for LLM products is trust-boundary work plus honest residual risk. Name the arrows on your architecture diagram; every arrow is a place data can leak, linger, or be abused for cost attacks. Compliance is contracts + technical controls + how you operate—not a vendor checkbox.

116. How would you secure an LLM application that handles sensitive enterprise data?

Trust boundaries. Draw a data-flow diagram first: browser → BFF → retrieval → tools → vendors. Every hop needs encryption (TLS 1.2+, KMS-managed keys at rest), tenant isolation (separate DB schemas or row-level security), and least-privilege IAM scoped to job roles—not ‘*’ service accounts.

Runtime hardening. Distroless/minimal images, signed builds, SBOM, continuous CVE scanning, and enforced mTLS inside the mesh so a compromised microservice cannot move laterally at will.

Data tiers. Tier-0 public marketing copy can hit shared SaaS inference; tier-1 engineering docs stay in VPC with self-hosted or private endpoints; tier-2 HR/legal may disable whole features (agents, browsing) regardless of user complaints.
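
The tiering above can be sketched as a routing table — a minimal illustration, where the tier numbers, endpoints, and disabled-feature names are all assumptions, not a real config:

```python
# Sketch of tiered deployment routing: map data sensitivity to an inference
# target. Endpoints and feature names here are illustrative placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceTarget:
    endpoint: str
    features_disabled: frozenset  # features forced off at this tier

ROUTING = {
    0: InferenceTarget("https://shared-saas.example/v1", frozenset()),
    1: InferenceTarget("https://private-vpc.internal/v1", frozenset({"browsing"})),
    2: InferenceTarget("https://onprem.internal/v1", frozenset({"browsing", "agents"})),
}

def route(tier: int) -> InferenceTarget:
    """Fail closed: an unknown tier gets the most restrictive target."""
    return ROUTING.get(tier, ROUTING[max(ROUTING)])
```

The fail-closed default matters more than the table itself: a misclassified workload should land in the strictest tier, not the cheapest.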

Key custody. Customer-managed keys (CMK) for regulated tenants; HSM-backed unwrap flows; break-glass procedures rehearsed quarterly.

Culture. Security is not ‘enable CloudTrail’—it is product tradeoffs: slower features vs breach likelihood. Say that aloud in panels.

117. How would you prevent prompt injection attacks in an LLM-powered product?

Structural separation. System/developer messages, tool definitions, retrieved evidence, and user text live in distinct slots the tokenizer enforces; never concatenate untrusted HTML into the system block.

Tool governance. Allowlisted tools with JSON Schema validation on arguments; high-blast tools (email, SQL) require HITL or capability tokens.
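
A minimal sketch of that governance, with hand-rolled type checks standing in for full JSON Schema validation; the tool names, argument specs, and `needs_human` flag are assumptions, not any vendor's tool-calling API:

```python
# Allowlisted tools with argument validation. Unknown tools, unexpected keys,
# and wrong argument types are all rejected before any call executes.
ALLOWED_TOOLS = {
    "search_docs": {"args": {"query": str, "top_k": int}, "needs_human": False},
    "send_email":  {"args": {"to": str, "body": str},     "needs_human": True},
}

def validate_call(name: str, args: dict) -> bool:
    spec = ALLOWED_TOOLS.get(name)
    if spec is None:
        return False  # tool not on the allowlist
    expected = spec["args"]
    if set(args) != set(expected):
        return False  # missing or extra arguments
    return all(isinstance(args[k], t) for k, t in expected.items())
```

High-blast tools like `send_email` carry the `needs_human` flag so the executor can route them through HITL regardless of what the model asks for.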

Downstream validation. Output checkers: banned actions, URL allowlists, red-team regression suites on golden injection prompts.

Detection. Heuristics + models for exfil patterns, spikes in tool breadth, or retrieval of honeypot docs that scream ‘if you read this, alert SOC’.

Honesty. Tell interviewers defenses are partial; residual risk is managed with monitoring and insurance-like incident response, not denial.

118. How would you implement indirect prompt injection detection in an agent that reads external URLs or files?

Fetch pipeline. Proxied fetches with SSRF protections, content-type sniffing, size caps, and malware scanning; strip scripts/active content; store only sanitized text/markdown in quarantine buckets.

Marking. Wrap ingested content as explicitly untrusted; forbid models from treating it as instructions—pair with deterministic tool policies that ignore prose asking for secrets.

Signals. Entropy spikes, imperative verbs, credential-shaped strings, or sudden requests for high-risk tools after loading a page trigger elevated friction (extra confirmation, slower path).

HITL. First-seen domains or attachment hashes escalate to human reviewers before autonomous actions.

Telemetry. Correlate injection attempts with attacker infrastructure to feed blocklists without overfitting polite webpages.

119. How would you design a PII detection and redaction layer before sending user data to an external LLM API?

Stacked detectors. Regex for formats, transformer NER for names/orgs, org dictionaries for project codenames and internal SKU patterns; ensemble voting lowers false negatives that become regulator letters.

Strategies. Block entirely, redact irreversibly, or tokenize with mapping stored only on customer-controlled VPC—pick per field sensitivity; log redaction coverage metrics per tenant.
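
A minimal sketch of the per-field split between block, redact, and tokenize; the regexes and the policy (SSNs blocked, emails tokenized) are illustrative assumptions, and production stacks layer NER models on top of patterns like these:

```python
# Per-field PII handling: SSN-shaped strings block the request outright,
# emails are replaced with reversible tokens whose mapping never leaves
# the customer boundary.
import hashlib
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

TOKEN_MAP = {}  # in production: a customer-controlled store, not process memory

def _tokenize(match: re.Match) -> str:
    token = "tok_" + hashlib.sha256(match.group().encode()).hexdigest()[:8]
    TOKEN_MAP[token] = match.group()
    return token

def scrub(text: str) -> str:
    if SSN_RE.search(text):
        raise ValueError("blocked: SSN may never cross the trust boundary")
    return EMAIL_RE.sub(_tokenize, text)
```

Logging how many substitutions each call makes gives the per-tenant redaction-coverage metric the paragraph above asks for.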

Ground truth. Continuously evaluate on sampled production traffic with human labels; languages and code-switching break naive models—budget accordingly.

Contracts. Some customers require ‘no US subprocessors’ regardless of redaction—route those workloads to compliant stacks, do not argue with Legal in prod.

Failure mode. When uncertain, fail closed for regulated modes and ask the user to remove fields explicitly.

120. How would you architect an LLM system that must comply with GDPR (data residency, right to erasure)?

Data map. Inventory every store: prompt logs, vector DB, object storage, analytics warehouse, support tickets, offline eval dumps. Erasure jobs must fan out to all—not only Postgres.

Regional stacks. EU control plane + inference + logging with strict egress; forbid cross-border replication unless SCCs + DPIA approve.

Lawful basis & consent. Separate processing for contract fulfillment vs product analytics; pseudonymous session ids where possible.

Erasure SLA. Idempotent delete pipeline with receipts; tombstone vectors; rehydrate checks to ensure no ghost citations reference erased subjects.
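
The fan-out can be sketched as a receipt-producing job — store names and delete callables here are placeholders, and each store's delete must itself be idempotent so the job can be retried safely:

```python
# Idempotent erasure fan-out with receipts: every store reports a result,
# failures are recorded rather than swallowed, and the job is re-runnable
# until the receipt reads complete.
import time

def erase_subject(subject_id: str, stores: dict) -> dict:
    receipt = {"subject": subject_id, "ts": time.time(), "results": {}}
    for name, delete_fn in stores.items():
        try:
            delete_fn(subject_id)
            receipt["results"][name] = "deleted"
        except Exception as exc:
            receipt["results"][name] = f"failed: {exc}"
    receipt["complete"] = all(v == "deleted" for v in receipt["results"].values())
    return receipt

# The fan-out must cover every store in the data map, not only Postgres.
stores = {
    "postgres": lambda sid: None,      # stand-in for the real row delete
    "vector_db": lambda sid: None,     # tombstone the embeddings
    "object_store": lambda sid: None,  # purge raw uploads
}
```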

Vendors. DPAs, subprocessors list, DPIA for high-risk use (automated decision-making), and transfer impact assessments when US APIs remain necessary.

121. How would you ensure that user conversation data sent to OpenAI/Anthropic APIs is not used for model training?

Contract → config. Enterprise / zero-retention agreements must match the exact API hostname and org id; a wrong base URL voids the paper contract.

Operational guardrails. Infrastructure-as-code checks that prod cannot accidentally point at consumer endpoints; CI fails if mismatch.
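
That CI check can be as small as this — a sketch where the contracted hostname is a placeholder for whatever your enterprise agreement actually names:

```python
# CI guard: fail the build if prod config points any service at a hostname
# outside the contracted enterprise endpoints.
from urllib.parse import urlparse

ENTERPRISE_HOSTS = {"api.enterprise-vendor.example"}  # assumed contracted host

def check_prod_config(config: dict) -> list:
    """Return violations; CI fails if the list is non-empty."""
    violations = []
    for service, base_url in config.items():
        host = urlparse(base_url).hostname
        if host not in ENTERPRISE_HOSTS:
            violations.append(f"{service}: {host} is not a contracted endpoint")
    return violations
```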

Assurance. Annual vendor questionnaires, SOC2/ISO reviews, and occasional third-party audits—not ‘trust me bro’ screenshots.

Telemetry hygiene. If support tools export chats to third-party ticketing, that is also vendor processing—extend controls.

Transparency. Customer-facing doc states model version, retention, regions; reduces procurement friction and legal exposure.

122. How would you design an LLM system for a HIPAA-compliant healthcare application?

BAA coverage. Every subprocessor touching PHI—vector DB, logging vendor, annotation workforce—must sign a BAA or be excluded; shadow IT tools are compliance landmines.

PHI minimization. Strip identifiers before model calls when clinical task allows; use on-prem or private endpoints for residual PHI; consider smaller specialist models fine-tuned on de-identified corpora.

Access controls. Role-based + purpose-of-use logging, MFA, break-glass with post hoc review; immutable audit proving who saw which record.

Human oversight. Clinical outputs are decision support, not autopilot—document handoff UX, liability positioning, and escalation to licensed professionals.

Roadmap realism. Block fancy agents/browsing until legal & clinical safety sign off; deliver incremental value within guardrails.

123. How would you implement authentication and authorization for a multi-tenant LLM API?

Tokens. OAuth2 client credentials for services, Authorization Code + PKCE for users; JWT carries tenant_id, roles, scopes, and session elevation flags parsed only by hardened gateway code.

Enforcement point. Gateway rejects anonymous or cross-tenant paths before retrieval—never ‘wide open’ vector search behind one API key per environment.
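
The gateway check reduces to a few lines once the JWT is verified — a sketch where the claim names (`tenant_id`, `scopes`) are assumptions and signature verification happens upstream:

```python
# Gateway-side enforcement on already-verified JWT claims: cross-tenant
# requests and missing scopes are rejected before any retrieval runs.
def authorize_retrieval(claims: dict, requested_tenant: str, scope: str) -> bool:
    if claims.get("tenant_id") != requested_tenant:
        return False  # horizontal privilege-escalation attempt
    return scope in claims.get("scopes", [])
```

The point is where this runs, not how clever it is: before the vector store, in hardened gateway code, never inside the prompt.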

Service mesh. mTLS with SPIFFE IDs between internal microservices; per-tenant rate limits and concurrency tokens to stop noisy neighbors.

Key rotation. JWT kid headers, short TTLs, refresh token binding, and automated revocation on breach.

Testing. Regression tests attempt horizontal privilege escalation (tenant A queries tenant B corpus) every release.

124. How would you design audit logging for all LLM interactions in a regulated industry (banking, insurance)?

Evidence model. Append-only log (WORM or hash-chained) capturing actor id, tenant, decision type, policy pack version, model id, retrieval corpus version, tool invocations (redacted args), and cryptographic hashes of prompts/responses when storing raw text is disallowed.
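
Hash chaining, the cheaper alternative to WORM storage, can be sketched as follows — a minimal illustration where the record fields are placeholders for the full evidence model above:

```python
# Hash-chained append-only log: each entry commits to its predecessor, so a
# silently deleted or edited row breaks verification for everything after it.
import hashlib
import json

def append_entry(chain: list, record: dict) -> None:
    prev = chain[-1]["hash"] if chain else "genesis"
    body = json.dumps(record, sort_keys=True)
    entry_hash = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"prev": prev, "record": record, "hash": entry_hash})

def verify_chain(chain: list) -> bool:
    prev = "genesis"
    for entry in chain:
        body = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Anchoring the latest hash in an external system (ticket, timestamping service) is what turns tamper-evident into tamper-provable.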

Replayability. Enough metadata to reconstruct why advice differed Tuesday vs Wednesday without necessarily keeping full plaintext forever.

Exports. Self-service regulator bundles with legal hold workflows; PII minimization in exports.

Segregation of duties. Engineers cannot silently delete audit rows; ops changes require dual control.

Latency awareness. Async logging to hot and cold tiers; never drop audit on firehose backpressure—backpressure user traffic instead.

Example. Bank records model versions, policy pack id, and approval chain id per advice response.

125. How would you prevent a user from extracting the system prompt or internal instructions through adversarial prompting?

Assume leakage. Creative users eventually screenshot something resembling instructions—design prompts without crown jewels: no API keys, no internal hostnames, no unreleased strategy.

Distribution. Rotate prompt versions frequently; per-tenant prompt salts complicate large-scale exfil playbooks.

Detection. Rate limit rapid-fire probing patterns; cluster accounts attempting ‘repeat your instructions verbatim’ games.

Response policy. Boring refusal beats witty acknowledgment that invites gamification.

Escalation. Tie repeated abuse to account reputation or step-up verification—not only silent LLM refusals.

126. How would you design a content moderation pipeline that screens both LLM inputs and outputs?

Dual directions. Input screens stop toxic jailbreak fuel and illegal content uploads; output screens catch disallowed completions before UI or downstream tools—even if model vendors claim safety layers.

Stack. Fast blocklists, lightweight classifiers, optional third-party APIs, geo-specific legal rules (EU hate speech vs US First Amendment nuances), and HITL triage for borderline creative workloads.

Latency tiers. Sync checks for obviously bad tokens; async human review for marketing bulk gen with SLA.

Feedback. False positives feed retraining; document appeal path for creators.

Transparency. Log moderation decisions with codes customers can query in disputes.

127. How would you handle a jailbreak attempt detected at runtime in a production LLM system?

Severity ladder. Benign curiosity → warn/soft refuse; repeated policy tests → throttle; automated credential hunting or mass scraping → hard block + SOC alert.
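
The ladder is just a policy function — a sketch where the signal names and the repeat threshold are illustrative knobs, not fixed values:

```python
# Graduated enforcement: map abuse signals and repeat counts to an action.
from enum import Enum

class Action(Enum):
    SOFT_REFUSE = 1   # warn, refuse politely
    THROTTLE = 2      # slow the account down
    HARD_BLOCK = 3    # block and page the SOC

def respond(signal: str, repeats: int) -> Action:
    if signal in ("credential_hunting", "mass_scraping"):
        return Action.HARD_BLOCK  # automated abuse skips the ladder
    if repeats >= 3:
        return Action.THROTTLE
    return Action.SOFT_REFUSE
```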

Session hygiene. Terminate tool sessions, rotate tokens, revoke OAuth grants if abuse tied to compromised bridge.

Forensics. Store compact feature vector of attack, not necessarily full harmful payload, respecting retention policies.

Product comms. Honest but non-gamified messaging; avoid Easter eggs that become TikTok challenges.

Post-incident. Tune detectors, add golden tests, consider per-account shadowbans for research abuse.

128. How would you design role-based access control (RBAC) for a RAG system so users only retrieve documents they are authorized to see?

Authoritative directory. Ingestion jobs resolve HRIS/IdP groups to stable principal ids; chunk rows carry allowed_groups, sensitivity labels, and document clearance—never infer permissions from folder names alone.

Query path. Gateway expands JWT groups + dynamic claims and passes mandatory metadata filters to the vector engine and any keyword/BM25 sidecars. Filters are server-side truth; LLM prompts cannot override them.
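
A sketch of that server-side filter construction; the filter shape is generic (every vector engine has its own syntax), and `chunk_visible` is a reference check mirroring what the engine itself must enforce:

```python
# Build the mandatory metadata filter from verified JWT groups. The prompt
# never sees or overrides this structure; it is attached server-side.
def build_filter(jwt_groups: list, tenant_id: str) -> dict:
    return {
        "tenant_id": {"eq": tenant_id},
        "allowed_groups": {"any_of": sorted(jwt_groups)},
    }

def chunk_visible(chunk_meta: dict, flt: dict) -> bool:
    """Reference semantics: same tenant AND at least one shared group."""
    if chunk_meta["tenant_id"] != flt["tenant_id"]["eq"]:
        return False
    return bool(set(chunk_meta["allowed_groups"])
                & set(flt["allowed_groups"]["any_of"]))
```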

Air gaps. For defense or finance customers, physically separate indexes per clearance; forbid mixed corpora in one namespace. Export controls may forbid embedding certain corpora next to others even if the ACL math works.

Failure mode. If group expansion is stale beyond its SLO, fail closed for regulated tenants rather than risk over-retrieval.

Testing. Continuous IDOR fuzzing: random user fixtures trying cross-tenant pulls; log attempted violations for SOC.

RBAC on retrieval

flowchart LR
  U[User JWT groups] --> G[Gateway]
  G --> RET[Retriever query + filter]
  RET --> V[(Vector index)]

129. How would you design a secrets management system for LLM API keys used across multiple services?

Vault-first. Central KMS/HashiCorp/Cloud SM; workloads use IAM-bound roles to fetch short-lived tokens; no plaintext keys in Terraform state or Slack threads.

Rotation. Automated 30–90 day cycles plus instant incident rotation with dual-authorization; track which service consumed which secret version for forensic replay.

Blast radius. Separate keys per env, per region, per vendor account; compromise of dev never mints prod traffic.

Developer ergonomics. Local dev uses clearly fake keys hitting mock servers; CI blocks accidental prod key patterns in diffs.
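
The diff scan in that CI block can be sketched with a couple of patterns — the prefixes below are illustrative shapes (the AWS access-key format is real; the `sk-` prefix is an assumed vendor convention), and real scanners like detect-secrets carry far larger pattern sets:

```python
# Pre-commit / CI scan for production key patterns in diff text.
import re

PROD_KEY_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # assumed vendor secret-key shape
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key id format
]

def find_leaks(diff_text: str) -> list:
    return [m.group() for pat in PROD_KEY_PATTERNS
            for m in pat.finditer(diff_text)]
```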

Break-glass. Time-limited emergency access with mandatory ticket + auto-expiring leases recorded in immutable audit.

130. How would you conduct a security threat model for an LLM-based application? (STRIDE or OWASP LLM Top 10)

Scoping workshop. Bring security, PM, legal, ML platform, and a retrieval engineer—skip vendors until you understand your own data flows.

Diagram-driven STRIDE. For each arrow (user upload, agent browsing, retrieval, tool egress) enumerate spoofing, tampering, repudiation, info disclosure, DoS/cost abuse, elevation via prompt/tool abuse.

OWASP LLM Top 10 mapping. Translate each category into concrete backlog items: insecure output handling becomes schema validators + UI XSS review; excessive agency becomes tool governance + HITL.

Prioritization. Score likelihood × impact × ease of mitigation; feed ranked items into sprint capacity with explicit ‘accepted risk’ entries for execs.
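
The scoring reduces to a sort — a sketch assuming 1–5 scales, with example threats that are illustrative, not a real register:

```python
# Rank the risk register by likelihood x impact x ease of mitigation:
# likely, damaging, and cheap to fix rises to the top of sprint capacity.
def score(likelihood: int, impact: int, ease_of_mitigation: int) -> int:
    return likelihood * impact * ease_of_mitigation

def rank_register(threats: list) -> list:
    return sorted(threats, key=lambda t: score(t["l"], t["i"], t["e"]),
                  reverse=True)

register = [
    {"name": "prompt injection via uploads", "l": 4, "i": 4, "e": 3},
    {"name": "cost-abuse DoS",               "l": 3, "i": 2, "e": 4},
    {"name": "vector index exfiltration",    "l": 2, "i": 5, "e": 2},
]
```

Items that score high on impact but low on ease become the explicit 'accepted risk' entries the paragraph above puts in front of execs.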

Living artifact. Re-run when models, prompts, or laws change; attach test cases so regressions surface in CI.

Example. Workshop with security + PM + legal yields a living risk register updated each major model or prompt change.

Recap — this section

Q · Takeaway
116 · Data-flow-driven controls; KMS + tenant isolation; tiered deployment modes; CMK option; sober tradeoff framing.
117 · Structural message slots + tool allowlists; output/action validators; SOC-grade anomaly cues; explicit residual risk.
118 · Hardened fetch + quarantine; untrusted framing; behavioral anomaly scoring; HITL for novel sources; intel loop.
119 · Ensemble PII detection; tiered block/redact/tokenize; labeled multilingual eval; jurisdictional routing; fail-closed uncertainty.
120 · Exhaustive RoPA-style map; regional isolation; lawful-basis hygiene; verifiable erase fan-out; vendor paper trail.
121 · Contract-aligned API endpoints; IaC enforcement; recurring assurance; full vendor surface mapping; customer-visible commitments.
122 · End-to-end BAA graph; PHI minimization + private inference; strong access audit; human-in-loop care workflows; phased feature rollout.
123 · Scoped JWT at edge; mesh mTLS; tenant rate limits; rotation + revocation discipline; IDOR regression tests.
124 · Tamper-evident append-only logs; policy/model lineage; regulated export tooling; SoD; audit never best-effort.
125 · Secret-free prompts; rotation; abuse-rate limits; boring refusals; account-level escalation.
126 · Ingress + egress moderation; composite stack w/ geo nuance; sync vs async paths; appeals + labeled feedback; explainable codes.
127 · Graduated enforcement; session/tool revocation; forensic features w/ retention caps; non-gamified comms; continuous detector improvement.
128 · Directory-sourced chunk ACLs; mandatory server filters; air-gapped indexes; fail-closed staleness; IDOR regression tests.
129 · Short-lived vault tokens; routine + emergency rotation; per-env keys; pre-commit secret scanning; audited break-glass.
130 · Cross-functional scoping; STRIDE on LLM data flows; OWASP-to-controls backlog; explicit risk acceptance; regression-linked register.
