§14 — Domain & vertical LLM
Fifteen staff-depth scenarios on shipping industry-shaped systems: domain pack architecture; legal research boundaries; clinical decision-support patterns; banking suitability stacks; secure coding assistants; ticketing-integrated support; compliant marketing generation; industrial IoT + TSDB truth; education safety and integrity; public-sector deployment constraints; true locale packs; vertical eval harnesses; white-label tenancy; partner marketplace governance; and strategic vertical-vs-platform tradeoffs.
Interview stance. Vertical LLM products are compliance + workflow + evidence sandwiches. The model is rarely the IP—packaged connectors, rubrics, and policy engines win enterprise procurement. Never hand-wave malpractice, suitability, or child-safety obligations.
Ship domain packs as versioned products: connectors, goldens, prompts, guardrails—not a slide deck.
Regulated verticals need human-in-loop thresholds backed by telemetry, not vibes.
Internationalization is locale law + culture, not Google Translate on system prompts.
Partners and white-label multiply your attack surface—conformance tests and support RACI are architecture.
On this page
Q206 — Domain pack product
Q207 — Legal vertical
Q208 — Healthcare vertical
Q209 — Financial vertical
Q210 — Developer vertical
Q211 — Support vertical
Q212 — Marketing vertical
Q213 — Industrial IoT vertical
Q214 — Education vertical
Q215 — Public sector
Q216 — Locale beyond translation
Q217 — Vertical eval harness
Q218 — White-label vertical
Q219 — SI connector market
Q220 — Vertical vs platform
206. How would you architect a shippable ‘domain pack’ (vertical starter kit) for LLM products—connectors, prompts, evals, guardrails?
Bundles. Opinionated ingestion for the vertical’s canonical systems (e.g., EHR, SAP, Jira patterns), golden eval sets, baseline prompts, policy packs, and incident playbooks—customers adopt faster than from a blank slate.
Versioning. Pack major.minor aligned to regulatory template changes; incompatible upgrades flagged in installer.
Isolation. Packs live in namespaces so one customer’s customizations don’t poison shared defaults.
GTM ops. Solution engineers validate pack against sample tenant in guided POC checklist.
Metrics. Track time-to-first-quality-answer per vertical—your differentiation metric.
Domain pack layers
flowchart TB
CON[Connectors] --> CORP[Corpus templates]
CORP --> POL[Policy pack]
POL --> EVAL[Vertical goldens]
EVAL --> UX[Guided UX flows]
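The versioning rule above (major bumps track regulatory-template changes, minor bumps are safe upgrades) can be sketched as an installer check. `PackVersion` and the action names are illustrative assumptions, not a real installer API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PackVersion:
    major: int  # bumped when regulatory templates change incompatibly
    minor: int  # backward-compatible pack improvements

def upgrade_action(installed: PackVersion, candidate: PackVersion) -> str:
    """Decide how the installer treats a candidate pack version."""
    if candidate.major > installed.major:
        return "flag_incompatible"  # requires explicit admin review
    if candidate.major == installed.major and candidate.minor > installed.minor:
        return "auto_upgrade"
    return "no_op"
```

A minor bump rolls out silently; a major bump surfaces in the installer so the customer’s compliance owner signs off before the new regulatory templates take effect.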
207. How would you design an LLM system for legal document review, drafting, or research under real malpractice and evidentiary constraints?
Non-advice stance. Product is research assistance—human attorney owns conclusions; immutable citations to sources with page-line pointers.
Privilege workflows. Segregate client matters; prevent cross-matter retrieval; audit every export.
Humans. Partner review on high-stakes outputs; configurable risk tiers by case type.
Data. Retention tied to matter lifecycle; redact PII aggressively before any cloud route that is not explicitly approved.
Eval. Benchmark on sealed practice-specific sets under NDA—public MMLU irrelevant.
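A minimal sketch of the cross-matter retrieval guard above, assuming every indexed chunk carries a `matter_id` tag; the chunk shape and audit-event names are hypothetical.

```python
def filter_by_matter(chunks, allowed_matter_id, audit_log):
    """Drop any retrieved chunk not tagged with the active matter.

    Cross-matter hits are logged rather than silently discarded, so
    privilege reviews can audit near-misses.
    """
    kept = []
    for chunk in chunks:
        if chunk["matter_id"] == allowed_matter_id:
            kept.append(chunk)
        else:
            audit_log.append({"event": "cross_matter_block",
                              "chunk_id": chunk["id"]})
    return kept
```

The filter runs post-retrieval as defense in depth; the primary control is still per-matter index isolation.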
208. How would you architect a healthcare copilot that assists clinicians without crossing into unlicensed medical practice?
BAA & PHI. Every subprocess signed; minimize PHI in prompts; prefer on-prem or private endpoints.
Evidence. Tie suggestions to chart excerpts the clinician can verify; never fabricate dosing.
UI. Prominent ‘verify in source record’ affordances; audit clicks as training signal.
Guardrails. Refusal + escalation paths for emergent symptoms per clinical safety committee.
Liability. Legal defines allowed copy; model card lists coverage limits by locale.
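PHI minimization before any prompt leaves the trust boundary can be sketched as a redaction pass. The patterns here are illustrative assumptions only; a real deployment would use a vetted clinical de-identification service, not ad-hoc regexes.

```python
import re

# Illustrative patterns only -- not a production de-identification tool.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,10}\b"),
}

def minimize_phi(text: str) -> str:
    """Replace recognizable PHI spans with labeled redaction tokens."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}_REDACTED]", text)
    return text
```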
209. How would you design LLM features for retail banking or wealth management under marketing and suitability regulations?
Rule engine sandwich. Deterministic compliance layer wraps probabilistic language—approved disclosures inserted verbatim.
Personalization boundaries. Suitability checks before ‘recommended’ language; log rationale metadata for auditors.
Data. Separate PII enrichment from generation; tokenize account numbers aggressively.
Testing. Red-team for guaranteed-return language and jurisdiction-specific bans.
Human. Licensed advisor review queue for personalized plans beyond thresholds.
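The rule-engine sandwich can be sketched as a deterministic wrapper around model output: ban listed claims outright, then append the approved disclosure verbatim. The disclosure text and phrase list are placeholders, not real compliance copy.

```python
APPROVED_DISCLOSURE = (
    "Investing involves risk, including possible loss of principal."
)  # placeholder; real copy comes from counsel-approved templates
BANNED_PHRASES = ["guaranteed return", "risk-free", "cannot lose"]

def compliance_sandwich(model_draft: str) -> str:
    """Deterministic layer around probabilistic copy: block banned
    claims, then insert the approved disclosure verbatim."""
    lowered = model_draft.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            raise ValueError(f"blocked: banned phrase {phrase!r}")
    return f"{model_draft}\n\n{APPROVED_DISCLOSURE}"
```

Rejections, not rewrites: blocked drafts go back to the model (or a human) rather than being silently patched, which keeps the audit trail honest.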
210. How would you secure an IDE- or CLI-integrated coding assistant for enterprise customers (IP, secrets, supply chain)?
Secrets. Detect and block keys before cloud send; local-only modes for air-gapped labs.
OSS license hygiene. Surface provenance on suggestions; policy engine blocks copyleft violations when configured.
Tenant controls. Allow/deny specific repos, branches, and file globs; watermark suggestions per org.
Telemetry. Enterprise opt-out of code collection; on-prem model option.
SBOM & updates. Pin extension versions; rapid CVE response channel.
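A pre-send secret gate can be sketched as below. These two detectors are toy assumptions; real scanners combine entropy checks with vendor-specific patterns and run client-side before anything reaches the cloud.

```python
import re

# Toy detectors for illustration -- not an exhaustive secret scanner.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access key id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

def safe_to_send(snippet: str) -> bool:
    """Return False if the snippet appears to contain a secret."""
    return not any(p.search(snippet) for p in SECRET_PATTERNS)
```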
211. How would you design a customer-support LLM stack deeply integrated with ticketing (Zendesk, ServiceNow, Salesforce)?
System of record. Tickets, macros, and KB must be ingested with ACL parity—wrong tenant retrieval destroys trust.
Actions. Tools for status changes require idempotency + confirmation UX for irreversible fields.
Omnichannel. Chat, email, voice summaries share state; session resume on handoff.
QA. Human reviewers sample closed tickets; measure CSAT delta and reopen rate.
Cost. Tier automation: L1 deflection vs human escalation with clear SLAs.
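The idempotency + confirmation pattern for action tools can be sketched as a wrapper: irreversible fields require explicit confirmation, and retries with the same key return the cached result instead of re-applying. The function shape is a hypothetical tool contract, not a real ticketing API.

```python
_applied = {}  # idempotency_key -> cached result

def update_ticket_status(ticket_id, status, idempotency_key, confirmed=False):
    """Tool-call wrapper: irreversible transitions need confirmation,
    and duplicate calls with the same key are no-ops."""
    if idempotency_key in _applied:
        return _applied[idempotency_key]          # safe retry
    if status == "closed" and not confirmed:
        return "needs_confirmation"               # surface confirm UX
    result = f"{ticket_id}:{status}"
    _applied[idempotency_key] = result
    return result
```

Note that the unconfirmed path is deliberately not cached, so the same key succeeds once the user confirms.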
212. How would you handle brand, trademark, and disclosure compliance when LLMs generate public-facing marketing copy?
Style guides as code. Lexical allow/deny lists, claim substantiation hooks linking to source studies.
Geos. Locale-specific legal copy blocks—alcohol, pharma, finance promos differ wildly.
HITL. Marketing counsel approves templates before scale campaigns ship.
Watermarks. Metadata marking AI-assisted assets for platform policies.
Incidents. Rapid takedown workflow if competitor trademarks slip through.
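“Style guides as code” can be sketched as a vetting pass over generated copy: flag denied terms and any claim lacking a substantiation link. The deny list and claim shape (`{'claim': ..., 'source': ...}`) are hypothetical.

```python
DENY_TERMS = {"best in the world", "clinically proven"}  # illustrative deny list

def vet_copy(copy_text, claims):
    """Return a list of violations: denied terms, plus any claim whose
    substantiation source is missing."""
    issues = [f"denied term: {t}" for t in DENY_TERMS
              if t in copy_text.lower()]
    issues += [f"unsubstantiated: {c['claim']}"
               for c in claims if not c.get("source")]
    return issues
```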
213. How would you combine LLM interfaces with telemetry from plants, vehicles, or IoT devices safely and in near-real time?
Data path. Time-series + event logs normalized for retrieval; numeric truth from TSDB, not paraphrased by embeddings alone.
Latency. Edge preprocessing aggregates noisy sensors before cloud reasoning when seconds matter.
Safety. High-assurance commands (shutdown valves) bypass LLM—deterministic PLC path.
Security. Device auth, mTLS, and anomaly detection for prompt injection delivered via spoofed sensor narratives.
Ops. Digital twin version pinned in each answer for auditability.
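The “numeric truth from TSDB” rule can be sketched as a template fill: readings come from the time-series store verbatim, and the LLM only narrates around them. `tsdb_lookup` is a stand-in for a real TSDB client, and the reading shape is assumed.

```python
def answer_sensor_query(tsdb_lookup, sensor_id, template):
    """Template numeric readings in verbatim from the TSDB; never let
    the model paraphrase (and possibly mangle) the numbers."""
    reading = tsdb_lookup(sensor_id)  # assumed shape: {"value": ..., "unit": ...}
    return template.format(value=reading["value"], unit=reading["unit"])
```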
214. How would you design AI tutoring products with child safety, academic integrity, and curriculum alignment?
Age gating. COPPA/GDPR-K-style flows; minimized child data retention; parental dashboards.
Integrity. Socratic modes vs direct answers configurable by institution; plagiarism-aware nudges.
Alignment. Curriculum-tagged content ensures retrieval matches standards, not random web pages.
Moderation. Aggressive output filters and escalation to human moderators for borderline cases.
Eval. Learning gains measured with Institutional Review Board oversight when involving minors.
215. How would you deploy LLM capabilities for government or defense customers with classification and procurement constraints?
Accreditation. Map to agency IL levels; often no commercial telemetry; bespoke ATO packages.
Deploy modes. Disconnected enclaves, dedicated regions, or on-vehicle edge per mission profile.
Vendors. Subcontractor lists and citizenship requirements dominate model sourcing decisions.
Features. Disable browsing agents by default; strict tool allowlists tied to mission role.
Lifecycle. Patch cadence follows maintenance windows, not agile Friday deploys.
216. How would you ship locale-specific LLM behavior beyond literal translation (jurisdiction, tone, formatting)?
Packs. Per-locale prompt anchors, units, date formats, legal boilerplate snippets, and retrieval corpora in native language.
Eval. Native-speaker goldens—not English content back-translated and treated as ground truth.
Models. Route to regional endpoints or tuned adapters if quality demands.
Fallback. Honest ‘best effort in en-US’ banner when locale unsupported.
Continuous. In-market user research quarterly; slang drifts quickly.
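Locale-pack resolution with the honest fallback banner above can be sketched as follows; the registry contents and pack fields are illustrative assumptions.

```python
LOCALE_PACKS = {  # hypothetical registry of shipped packs
    "en-US": {"date_fmt": "%m/%d/%Y", "banner": None},
    "de-DE": {"date_fmt": "%d.%m.%Y", "banner": None},
}

def resolve_locale(requested):
    """Return (effective_locale, pack, banner); unsupported locales fall
    back to en-US with an explicit best-effort notice."""
    pack = LOCALE_PACKS.get(requested)
    if pack is not None:
        return requested, pack, None
    fallback = dict(LOCALE_PACKS["en-US"])
    banner = f"Best-effort en-US response; {requested} not yet supported."
    return "en-US", fallback, banner
```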
217. How would you build evaluation harnesses specific to a vertical (legal, health, finance) that general benchmarks miss?
Expert goldens. Partner with domain SMEs; double-blind adjudication; versioned rubrics.
Negative tests. Disallowed advice, hallucination traps using near-miss citations.
Reg replay. Simulate policy changes—does system adapt without full retrain?
Automation. Deterministic checks (schema, citation presence) layered before subjective grading.
Confidentiality. Air-gapped eval sandboxes for customer-provided corpora.
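The deterministic gates that run before subjective grading can be sketched as cheap structural checks: valid JSON, required fields, every citation resolvable in the eval corpus. The output schema here is an assumption for illustration.

```python
import json

def deterministic_gates(raw_output, corpus_ids):
    """Cheap pass/fail gates run before any rubric grading."""
    try:
        obj = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["invalid_json"]
    failures = []
    if "answer" not in obj:
        failures.append("missing_answer")
    for cite in obj.get("citations", []):
        if cite not in corpus_ids:
            failures.append(f"dangling_citation:{cite}")
    return failures
```

Outputs failing these gates never reach the (expensive, slow) SME adjudication queue.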
218. How would you white-label a vertical LLM SaaS for partners (banks, ISVs) while preserving your roadmap velocity?
Layering. Core platform APIs + configurable brand skin + partner-owned policy packs they approve.
Tenancy. Strong isolation, per-partner keys, separate support escalations.
Roadmap contract. Define shared vs forked features; avoid N bespoke codepaths without revenue cover.
Upgrade train. Partner staging environments with contractual adoption windows.
SLA. Branded status pages with honest shared-incident comms templates.
219. How would you design a partner / systems-integrator ecosystem for enterprise LLM deployments (connectors, cert training, revenue share)?
SDK quality. Documented connector contracts, sandbox tenants, and conformance tests partners must pass.
Certification. Badging + annual recert on security baselines.
Rev share. Transparent usage metering to split inference margin without surprise invoices.
Support triage. Clear L1/L2 ownership matrix to prevent buck-passing during outages.
Trust. Partner code scans for data exfil patterns before marketplace listing.
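A connector conformance suite can be sketched as a checklist partners must pass before listing. The contract shape (`list_docs` / `fetch_acl`) is a hypothetical SDK interface, not a real one.

```python
def run_conformance(connector):
    """Minimal conformance checks for a partner connector: it must list
    documents and return a non-empty ACL for each one."""
    failures = []
    docs = connector.list_docs()
    if not docs:
        failures.append("empty_listing")
    for doc in docs:
        acl = connector.fetch_acl(doc["id"])
        if not acl.get("principals"):
            failures.append(f"missing_acl:{doc['id']}")
    return failures
```

A real suite would also exercise pagination, rate-limit behavior, and tenant isolation; the point is that listing is gated on green checks, not partner self-attestation.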
220. How would you decide when to build depth in one vertical versus staying horizontal across industries?
Signal. Repeated six-figure deals asking for the same connectors + compliance + eval trio indicate vertical-pack ROI.
Danger. Premature verticalization fragments engineering; maintain a strong horizontal core first.
Metrics. Win rate, implementation hours, and gross margin by segment tell truth marketing slides hide.
Organize. Platform team owns nucleus; vertical pods own packs and SME relationships.
Interview flair. Show you can articulate both strategic and staffing consequences—not only architecture diagrams.
Recap — this section
Q Takeaway
206 Opinionated vertical bundle; semver packs; namespaced overlays; SE-ready POC; vertical TTFQA metrics.
207 Decision-support positioning; matter-isolated RAG; HITL for high risk; privilege-safe retention; private goldens.
208 BAA graph + PHI minimization; grounded chart-linked suggestions; clinical guardrails; legal-reviewed UX copy.
209 Rules-first disclosures; suitability gates; pseudonymous pipelines; finance red-team; advisor HITL thresholds.
210 Secret scanning + local mode; license policy engine; repo-scoped context; enterprise telemetry contracts; pinned SBOM.
211 Ticket-system parity RAG; idempotent action tools; unified session state; QA on reopen rate; tiered automation.
212 Substantiation-linked claims; geo policy packs; counsel-approved templates; AI metadata; fast takedown playbooks.
213 TSDB factual layer + narrative RAG; edge aggregation; deterministic control paths; device-originated injection awareness.
214 Age-verified minimal retention; institution integrity modes; curriculum-filtered RAG; strong moderation; IRB discipline.
215 IL/ATO alignment; disconnected deploy options; cleared-personnel supply chain; constrained agents; ops cadence realism.
216 Locale packs w/ law+culture; native goldens; regional routing; honest capability gaps; ongoing in-market research.
217 SME-authored goldens + red teams; policy-shift drills; layered auto checks; secure eval enclaves.
218 Themable tenant layers; contractual core vs bespoke; staging trains; joint incident comms.
219 Connector conformance + sandbox; partner badges; usage-transparent rev share; support RACI; marketplace security reviews.
220 Deal-pattern signal; horizontal core before sprawl; segment KPI honesty; platform+vertical team topology.