Building a HIPAA-Ready Architecture for Clinical Decision Support

Umair Khan · 10 min read
HIPAA · BAA · LLM APIs · Compliance · Architecture

Editor's note (May 2026): This post was published April 21, 2026 and predates two decisions that are now settled. The AWS Business Associate Addendum took effect 2026-05-09, and Azure OpenAI (under the Microsoft Online Services BAA) was selected as the LLM path. The architectural principles below still hold, but where the post describes the cloud or LLM provider as "under evaluation," that evaluation is now closed. For UNMIRI's current subprocessors and BAA status, see /security and /security/subprocessors.

Last week a CTO at a regional diagnostic lab emailed me the shortlist his procurement team was evaluating. Six AI vendors pitching some version of "automated NGS interpretation."

For each one he asked the same question: Can you produce an executed BAA that covers the LLM call itself?

Four of them went vague. Two said yes but couldn't name the API tier or the counterparty. Zero produced the actual document before the follow-up call.

That's the state of HIPAA compliance in AI-for-healthcare in April 2026. It's why lab procurement teams are (rightly) skeptical, and it's why this post exists. Consider it a reference you can hand to your compliance officer when they ask what "HIPAA-ready LLM pipeline" should actually mean.

A note on framing. This post describes the planned architecture for UNMIRI's clinical decision support pipeline and the four conditions any HIPAA-ready LLM workflow needs to satisfy. UNMIRI is pre-pilot; specific BAAs and infrastructure controls are in active development. Current production status is tracked at /security and /security/subprocessors. The architectural principles below are what we believe and what we're building toward, not what is operating in production today.

What makes an LLM pipeline HIPAA-ready?

A HIPAA-ready LLM pipeline requires four simultaneous conditions: a BAA with the customer, BAAs with every downstream subprocessor including the LLM provider, contractually enforced zero-retention on LLM calls, and US-only data residency pinned at the configuration layer. Miss any one of them and you are, at best, HIPAA-adjacent.

HIPAA compliance for an LLM pipeline is not a single checkbox. Each of the following has to be true at the same time:

  1. A signed BAA between you (the vendor) and your customer (the covered entity or upstream business associate).
  2. A signed BAA between you and every downstream service that touches PHI, which in an LLM pipeline always includes the LLM provider itself, plus hosting, plus any database that stores request metadata.
  3. A contractually enforced zero-retention agreement with the LLM provider. "We don't train on it" is not sufficient. The retention clause governs whether PHI persists in provider logs, caches, or abuse-monitoring pipelines after your request completes.
  4. US-only data residency, verifiable at the configuration layer, not assumed from the provider's marketing copy.

Fail on any of the four and you're building on sand. The rest of this post is about how to get each one right in practice.
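
The four conditions work better as a deployment gate than as a slide. A minimal sketch, assuming a hypothetical `PipelineReadiness` record that your compliance process fills in (the class and field names are illustrative, not an UNMIRI schema):

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PipelineReadiness:
    """Hypothetical record of the four HIPAA-ready conditions."""
    customer_baa_executed: bool
    all_subprocessor_baas_executed: bool
    zero_retention_in_contract: bool
    us_residency_pinned_in_config: bool


def missing_conditions(readiness: PipelineReadiness) -> list[str]:
    """Return the names of the conditions that are not yet satisfied."""
    return [f.name for f in fields(readiness) if not getattr(readiness, f.name)]


def is_hipaa_ready(readiness: PipelineReadiness) -> bool:
    """All four must hold at once; three out of four is not ready."""
    return not missing_conditions(readiness)


example = PipelineReadiness(
    customer_baa_executed=True,
    all_subprocessor_baas_executed=True,
    zero_retention_in_contract=False,  # "we don't train on it" does not count
    us_residency_pinned_in_config=True,
)
```

Here `missing_conditions(example)` names the single gap, which is more useful in a release checklist than a bare pass/fail.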

Which LLM API tiers support HIPAA compliance?

Only enterprise-tier LLM APIs with signed Business Associate Agreements support HIPAA-ready workflows. As of April 2026, this means OpenAI Enterprise / Scale Tier, Anthropic Claude Enterprise, Azure OpenAI with limited-access approval, AWS Bedrock (model-specific), and Google Vertex AI Enterprise. Consumer ChatGPT, default Claude API access, and Gemini Basic do not qualify.

The uncomfortable truth: the consumer tiers of ChatGPT, Claude, and Gemini are not HIPAA-ready. Neither is the default OpenAI API without enterprise enrollment. Neither is Anthropic's default API without a signed BAA.

Here's the lay of the land as of April 2026. Verify current terms with your counterparty before signing. These policies move.

OpenAI Enterprise / Scale Tier. BAA available. Zero-retention on completions when requested via the API (store: false). Abuse-monitoring retention has a separate, shorter window and a separate legal basis. Data residency is US by contract.

Anthropic Claude Enterprise. BAA available. Zero-retention on API traffic. Models run in customer-isolated regions. Data residency US (and EU regions for GDPR-adjacent work, though that's outside scope for a US-only diagnostic lab).

Azure OpenAI Service. BAA via Microsoft. Lets you pin region. Zero-retention requires approval through the limited-access program; it doesn't flip on automatically with signup.

AWS Bedrock. BAA via AWS. Model-level data-use policies differ. Claude on Bedrock inherits Anthropic's zero-retention stance; other models vary. Read the specific model's data-usage terms separately, not just the Bedrock BAA.

Google Vertex AI / Gemini Enterprise. BAA via Google Workspace or Vertex Enterprise. Zero-retention requires the no-data-training option; the default is 55 days of retention for abuse monitoring unless explicitly disabled.

The practical takeaway: no single "enterprise checkbox" makes a consumer LLM HIPAA-ready. You need to do four things explicitly:

  1. Sign the BAA with the provider.
  2. Confirm the specific retention terms for your tier and endpoint.
  3. Pass the correct flags on every API call to suppress retention.
  4. Build failover to a second BAA-backed provider for outage handling.

The flag in step 3 is not optional when PHI is in the prompt. Code sketch:

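# OpenAI Enterprise (as of April 2026; verify current docs)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],  # placeholder for your PHI-minimized messages
    store=False,  # suppress completion retention
    # BAA-backed accounts have model training disabled by default
)

# Anthropic Enterprise equivalent
from anthropic import Anthropic

anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = anthropic_client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,  # required by the Messages API
    messages=[...],  # placeholder for your PHI-minimized messages
    # BAA-backed accounts are zero-retention by default on enterprise tier;
    # no per-call flag is required, but the BAA is what makes it enforceable.
)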

If your code doesn't set store=False on every PHI-bearing OpenAI call, you are technically retaining. Audit your codepath.
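
One way to make that audit mechanical is to route every PHI-bearing OpenAI call through a single wrapper that refuses to proceed unless retention is suppressed. This is a sketch under assumptions: `phi_safe_completion`, `RetentionFlagError`, and `fake_create` are hypothetical names, and `create_fn` stands in for whatever callable actually issues the request (for example `client.chat.completions.create`).

```python
from typing import Any, Callable


class RetentionFlagError(ValueError):
    """Raised when a PHI-bearing call would allow provider-side retention."""


def phi_safe_completion(create_fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Forward kwargs to create_fn only if store is explicitly False.

    A missing flag is treated the same as store=True: the call is rejected
    instead of silently relying on a provider default.
    """
    if kwargs.get("store") is not False:
        raise RetentionFlagError("PHI-bearing calls must pass store=False")
    return create_fn(**kwargs)


def fake_create(**kwargs: Any) -> dict:
    """Stand-in for the real SDK call so the sketch runs without a network."""
    return {"echo": kwargs}


result = phi_safe_completion(fake_create, model="gpt-4o", messages=[], store=False)
```

With one entry point, the codepath audit reduces to checking that nothing calls the SDK directly.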

What infrastructure BAAs does a HIPAA LLM pipeline need?

A HIPAA-ready LLM pipeline needs BAAs with every service that handles PHI: the hosting provider, the database and storage layer, the LLM provider, and any observability tool that ingests request content. UNMIRI is evaluating cloud and LLM providers across AWS and Microsoft Azure for the clinical path; final selection is contingent on customer requirements and BAA terms. Current subprocessor status is published at /security/subprocessors and updated as agreements are signed.

The LLM is not the only service in the pipeline. Anywhere PHI touches, you need a BAA. UNMIRI's planned stack, by category:

  • Marketing site hosting (Vercel): the public unmiri.com site runs on Vercel today. The marketing site does not handle PHI. Clinical product hosting will run on a HIPAA-eligible cloud provider with a signed BAA before any PHI workload is deployed.
  • Managed cloud infrastructure (AWS or Azure, under evaluation): the primary PHI path is built on a HIPAA-eligible cloud. The plan calls for managed PostgreSQL Multi-AZ for structured clinical data, variant annotations, and audit logs; encrypted object storage with customer-managed keys, access logging, and versioning for the persistent document store; a separate transient bucket for the document-extraction input that auto-deletes after processing; and a managed document-extraction service for PDF parsing. US-only data residency by design across the entire path. A BAA will be signed with the selected cloud provider before any PHI workload moves to production.
  • LLM inference (provider TBD): evaluating HIPAA-eligible LLM API tiers (Anthropic Claude API and Azure OpenAI Service). Used narrowly for extraction edge cases and long-tail variant fallback on de-identified data only. A BAA will be signed with the selected provider before any clinical workload moves to production. The integration target is a tier with no training on customer inputs or outputs.
  • Self-managed Neo4j cluster: planned to run in UNMIRI's own US-pinned VPC inside the selected cloud. No external BAA needed because UNMIRI controls the infrastructure end-to-end.

Each is a category in UNMIRI's planned BAA chain. As specific providers are selected and BAAs are signed and verified, the public subprocessor list is updated with vendor names and dates. No PHI workload moves to production until the full chain is executed.
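
The "no PHI until the full chain is executed" rule can be expressed as a check over a subprocessor registry. A minimal sketch, assuming a hypothetical `Subprocessor` record; the categories mirror the list above, and the missing dates are placeholders, not UNMIRI's actual contract status:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Subprocessor:
    """Hypothetical registry entry; field names are illustrative."""
    category: str
    handles_phi: bool
    self_managed: bool               # runs on infrastructure you control end-to-end
    baa_executed_on: Optional[date]  # None until the BAA is signed


def blocking_subprocessors(chain: list[Subprocessor]) -> list[str]:
    """Categories that touch PHI, are externally operated, and lack an executed BAA."""
    return [
        s.category
        for s in chain
        if s.handles_phi and not s.self_managed and s.baa_executed_on is None
    ]


planned_chain = [
    Subprocessor("marketing site hosting", False, False, None),
    Subprocessor("managed cloud infrastructure", True, False, None),
    Subprocessor("LLM inference", True, False, None),
    Subprocessor("self-managed Neo4j cluster", True, True, None),
]
```

A deploy step could refuse to promote a PHI workload while `blocking_subprocessors(planned_chain)` is non-empty.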

Architectural note: the LLM does not touch the clinical recommendation itself, regardless of which provider gets selected. Clinical-path reasoning happens in the knowledge graph, and the final 2-page cheat sheet is rendered by deterministic templates. No LLM in the output path. The narrow LLM scope (extraction edge cases and long-tail variant fallback only, on de-identified inputs) keeps the BAA surface manageable. UNMIRI is currently evaluating Anthropic Claude API and Azure OpenAI Service for that narrow extraction role; final selection will be driven by design-partner BAA terms and integration requirements.

What does a HIPAA-ready LLM data flow look like?

A HIPAA-ready clinical AI data flow has six internal stages: authentication, normalization, knowledge-graph traversal, narrow LLM extraction (PHI-minimized), deterministic template rendering, and audit-logged response. Every external service boundary is governed by an executed BAA. The diagram below shows UNMIRI's planned production pipeline.

Here's the planned pipeline. Every arrow crossing a service boundary will be BAA-governed before any PHI flows over it.

 ┌────────────────┐      ┌─────────────────────────────────┐
 │  Lab's LIMS    │      │          UNMIRI API             │
 │                │      │  ┌───────────────────────────┐  │
 │ (covered       │ POST │  │ 1. Auth + request log     │  │
 │  entity)       │─────▶│  │    (PHI-minimized log)    │  │
 │                │      │  └──────────────┬────────────┘  │
 │ BAA with       │      │                 ▼               │
 │ UNMIRI ─────┐  │      │  ┌───────────────────────────┐  │
 └────────────┼───┘      │  │ 2. Normalization          │  │
              │          │  │    (in-memory only)       │  │
              │          │  └──────────────┬────────────┘  │
              │          │                 ▼               │
              │          │  ┌───────────────────────────┐  │
              │          │  │ 3. Neo4j graph traversal  │  │
              │          │  │    (US-pinned VPC, ours)  │  │
              │          │  └──────────────┬────────────┘  │
              │          │                 ▼               │
              │          │  ┌───────────────────────────┐  │       ┌───────────────┐
              │          │  │ 4. Narrow LLM extraction  │──┼──────▶│ HIPAA LLM API │
              │          │  │    (PHI-minimized prompt) │  │       │ (provider TBD)│
              │          │  │    edge-case + long-tail  │  │       │ BAA · 0-train │
              │          │  └──────────────┬────────────┘  │       └───────────────┘
              │          │                 ▼               │
              │          │  ┌───────────────────────────┐  │
              │          │  │ 5. Deterministic template │  │
              │          │  │    renders final 2-pager  │  │
              │          │  └──────────────┬────────────┘  │
              │          │                 ▼               │
              │          │  ┌───────────────────────────┐  │       ┌───────────────┐
              │          │  │ 6. Response + audit write │──┼──────▶│ Managed PG +  │
              │          │  │    (PHI redacted from log)│  │       │ Object Store  │
              │          │  │                           │  │       │ BAA · US-only │
              │          │  └──────────────┬────────────┘  │       └───────────────┘
              │ response │                 ▼               │
              └──────────┴─────────────────────────────────┘

 (Planned production pipeline. The BAA labels describe the contractual
  state required before any PHI flows. Current contract status:
  /security/subprocessors)

Six stages inside the planned API. Three external counterparties on the diagram, each scoped for a BAA and zero-retention terms before any PHI moves. Current contract status is tracked at /security/subprocessors.

How should PHI be handled inside an LLM pipeline?

PHI inside an LLM pipeline should be processed in-memory only (never persisted), logged with identifiers redacted, decoupled from your internal patient IDs, and stripped from LLM prompts at the payload layer, so the LLM call contains clinical concepts (variants, drugs, tiers) but no patient identifiers.

The handling rules UNMIRI's design specifies for PHI are conceptually simple and operationally strict.

In-memory processing. Raw NGS reports, extracted variant profiles, and intermediate computations are designed to exist only in request-scoped process memory. No disk writes. No cache persistence. When a request completes, worker memory will be the last place that data lived on UNMIRI's infrastructure.

PHI-redacted audit logs. The audit log records every request: timestamp, authenticated principal, endpoint, status, latency, insight ID, hash of the input payload. It does not record the variant profile, the source patient ID, or any free-text content. The audit log tells you that a report was processed and by whom, not what was in it. This is deliberate. An audit log containing PHI is itself a PHI surface area with its own BAA requirements.
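
A minimal sketch of such a record, assuming hypothetical field names (`latency_ms`, `payload_sha256`) and made-up example values. The point is that the payload is reduced to a digest before anything is written:

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(*, principal: str, endpoint: str, status: int,
                       latency_ms: float, insight_id: str, payload: bytes) -> dict:
    """Build a log entry that proves a request happened without storing its content."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": principal,
        "endpoint": endpoint,
        "status": status,
        "latency_ms": latency_ms,
        "insight_id": insight_id,
        # Only a digest is kept; the payload itself is never logged.
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
    }


record = build_audit_record(
    principal="lab-api-key-01",   # hypothetical principal
    endpoint="/v1/interpret",     # hypothetical endpoint
    status=200,
    latency_ms=412.0,
    insight_id="ins_example",
    payload=b'{"patient_id": "MRN-0042", "variant": "EGFR L858R"}',
)
line = json.dumps(record, sort_keys=True)
```

If payloads are small or highly structured, a plain digest may be guessable by brute force. A keyed hash (HMAC) is one option to consider, though that goes beyond what the design above specifies.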

Provenance separate from content. The insight_id returned to the lab is an opaque reference. The lab's own systems hold the mapping between insight_id and their patient record. UNMIRI never needs that mapping, and the design does not store it.
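
For illustration, an opaque reference can be generated from random bytes instead of being derived from anything about the patient or the report. The `ins_` prefix and the function name are assumptions for this sketch:

```python
import secrets


def new_insight_id() -> str:
    """Random, opaque reference carrying no information about the patient or report."""
    return "ins_" + secrets.token_urlsafe(16)


# The lab keeps this mapping in its own systems; the vendor side never sees it.
lab_side_mapping = {new_insight_id(): "lab-internal-patient-ref"}
```

Because the ID is random and not a hash of a patient identifier, it cannot be reversed or correlated without the lab's mapping.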

Prompt-level PHI minimization. The prompt sent to the LLM is stripped to only what the narrow extraction step needs: variant nomenclature, drug, evidence tier, citations. Patient identifiers never reach the LLM call. This is belt-and-suspenders (the BAA would cover it anyway), but defense in depth matters when the consequence of a leak is a breach notification.

# Illustrative: prompt construction with PHI stripped before send
prompt_input = {
    "variant": graph_result["variant"],     # "EGFR L858R"
    "drug": graph_result["drug"],           # "Osimertinib"
    "evidence_tier": graph_result["tier"],  # "I-A" (AMP/ASCO/CAP)
    "citations": graph_result["citations"], # ["CIViC:EID3017", "FDA label", "FLAURA NEJM 2020"]
    # Not included: patient_id, name, DOB, MRN, report_date, source_file_id
}

If your prompt construction includes a patient identifier (even a hashed one), you're relying solely on your BAA with the LLM provider. Remove identifiers at the prompt layer and you have two independent controls.
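
One way to enforce this is an allowlist at the last step before the call, so a new field added upstream is dropped by default and is not leaked by default. A sketch, assuming the four field names from the example above; `build_prompt_input` and the sample values are hypothetical:

```python
ALLOWED_PROMPT_FIELDS = frozenset({"variant", "drug", "evidence_tier", "citations"})


def build_prompt_input(candidate: dict) -> dict:
    """Keep only allowlisted clinical fields; everything else is dropped."""
    return {k: v for k, v in candidate.items() if k in ALLOWED_PROMPT_FIELDS}


raw = {
    "variant": "EGFR L858R",
    "drug": "Osimertinib",
    "evidence_tier": "I-A",
    "citations": ["CIViC:EID3017"],
    "patient_id": "hypothetical-id",  # must never reach the LLM call
    "mrn_hash": "hypothetical-hash",  # hashed identifiers are dropped too
}
prompt_input = build_prompt_input(raw)
```

An allowlist only filters keys. It does not catch identifiers embedded inside an allowed free-text value, so the allowed fields should stay structured.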

What encryption, residency, and audit logging does HIPAA require for an LLM pipeline?

For this pipeline, the baseline is TLS 1.3 encryption in transit, AES-256 at rest, US-only data residency pinned at configuration, and immutable audit logging covering authentication and PHI-access events, retained for seven years (longer than the six-year documentation retention period in the HIPAA rules). The Security Rule itself does not name protocol versions or ciphers; these are the concrete technical safeguards this design uses to meet it.

The easier parts, listed for completeness because they still need to hold.

Encryption in transit. TLS 1.3 on every external edge. Internal service-mesh traffic also TLS; no cleartext between services.
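
On the client side, the TLS floor can be pinned in code so it does not depend on library defaults. A minimal sketch using Python's standard `ssl` module; the function name is illustrative:

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = tls13_client_context()
```

The context can then be passed to whatever HTTP client the service uses, assuming that client accepts a custom `SSLContext`.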

Encryption at rest. AES-256 everywhere PHI could conceivably land. In the planned production environment, that includes managed-database encrypted volumes, encrypted object storage with customer-managed keys, Neo4j cluster volumes, and any temporary processing queue. Key management is via the selected cloud provider's KMS with customer-managed keys. Rotation schedule documented and audited.

US-only data residency. Public marketing site: US region pinned via config. Managed cloud infrastructure (AWS or Azure under evaluation): US regions pinned across all PHI-handling resources. HIPAA-eligible LLM API tier: US inference zones. This is a configuration detail that gets verified on every new environment we spin up. We do not assume; we check.
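
"We check" can be as simple as comparing every resource's configured region against an explicit allowlist when an environment is created. A sketch under assumptions: the region names follow AWS and Azure naming, but the allowlist is illustrative and incomplete, and the resource names are hypothetical.

```python
# Assumption: an allowlist you maintain yourself; illustrative, not exhaustive.
US_REGIONS = frozenset({"us-east-1", "us-east-2", "us-west-2", "eastus", "eastus2", "westus2"})


def non_us_resources(resources: dict[str, str]) -> dict[str, str]:
    """Return resources whose configured region is not on the US allowlist."""
    return {name: region for name, region in resources.items() if region not in US_REGIONS}


environment = {  # hypothetical resource names
    "postgres-primary": "us-east-1",
    "document-store": "us-east-1",
    "llm-endpoint": "eastus2",
    "scratch-bucket": "eu-west-1",  # misconfigured on purpose
}
violations = non_us_resources(environment)
```

An empty `violations` dict is the condition for the environment to pass. Anything else names the resource to fix.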

Audit logging. Every PHI-relevant event (authentication, report ingestion, graph query, LLM call, response delivery) is logged with principal, timestamp, correlation ID, and result code. Retention is seven years, one year beyond the six-year documentation retention period HIPAA sets (45 CFR 164.316(b)(2)(i)). Logs are append-only and stored separately from the application database. Access to audit logs is itself logged.

How do you evaluate an AI vendor's HIPAA compliance?

Evaluate an AI vendor's HIPAA posture with ten specific questions: executed BAA on demand, named LLM provider and tier, zero-retention terms in writing, data-flow diagram, BAA chain including hosting and database, US residency verification, audit-log content and exclusions, outage failover also BAA-backed, breach-notification SLA, and SOC-2 status with evidence.

If you're a lab CTO evaluating AI vendors, or an engineer building one, this is the checklist I would hand to procurement. Copy it into your vendor questionnaire. Ten questions.

  1. Can you produce an executed BAA, not a template, on request?
  2. Can you name the specific LLM provider and tier you use, and produce that provider's BAA?
  3. Can you show the zero-retention term in writing, including the separate abuse-monitoring retention policy?
  4. Can you produce a data-flow diagram labeling every service that touches PHI, with BAA status per hop?
  5. Does your BAA chain cover hosting (Vercel / AWS / GCP) and database + storage (AWS RDS + S3 / Supabase / Cloud SQL)?
  6. Is data residency US-only, pinned at configuration, and verifiable in your audit logs?
  7. What does your request-level audit log contain, and what does it specifically not log?
  8. How do you handle LLM provider outages? Is the failover also BAA-backed?
  9. What is your incident-response SLA for a suspected PHI breach? Who notifies whom, in what window?
  10. What is on your SOC-2 roadmap? If you claim SOC-2 today, provide the Type II report; don't accept "we're compliant" as shorthand.

If a vendor hedges on more than one of these, walk. The ones who can answer all ten have already done the work. The ones who can't are hoping you won't ask.
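
Question 8 is easier to answer when the failover rule lives in code: a provider without an executed BAA is never eligible, even during an outage. A minimal sketch, assuming a hypothetical `Provider` record whose `call` field stands in for the real SDK request:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider entry; `call` stands in for the real SDK request."""
    name: str
    baa_executed: bool
    call: Callable[[dict], str]


def call_with_failover(providers: list[Provider], prompt_input: dict) -> tuple[str, str]:
    """Try BAA-backed providers in order; never fall back to one without a BAA."""
    eligible = [p for p in providers if p.baa_executed]
    if not eligible:
        raise RuntimeError("no BAA-backed provider configured")
    last_error: Optional[Exception] = None
    for provider in eligible:
        try:
            return provider.name, provider.call(prompt_input)
        except Exception as exc:  # outage, timeout, rate limit
            last_error = exc
    raise RuntimeError("all BAA-backed providers failed") from last_error


def primary_down(_: dict) -> str:
    raise TimeoutError("simulated outage")


def secondary_ok(_: dict) -> str:
    return "extracted"


providers = [
    Provider("primary", True, primary_down),
    Provider("no-baa-fallback", False, secondary_ok),  # skipped: no BAA
    Provider("secondary", True, secondary_ok),
]
used, output = call_with_failover(providers, {"variant": "EGFR L858R"})
```

Failing closed when every BAA-backed provider is down is a deliberate choice in this sketch: a degraded response is preferable to sending PHI to an uncovered endpoint.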

Does UNMIRI have SOC-2 Type II compliance?

No. UNMIRI's SOC-2 Type II audit is a future milestone, not a current one. The control set is being built and documented in parallel with the pilot work; target dates will be published once the audit relationship and evidence-collection cadence are in place. Procurement teams should know the difference between implemented controls and externally audited controls, and the difference between controls being built and controls running in production.

UNMIRI's SOC-2 Type II audit is a future milestone, target date TBD. We're not SOC-2 Type II compliant today, and we don't claim to be. We say that clearly on our security page and in this post because I would rather lose a deal over the timing than win one on a misrepresentation.

What's in active development: BAA-backed infrastructure, zero-retention LLM agreements, US-only residency, audit logging, encryption at rest and in transit, access controls. The Type II audit will validate those controls once they're operating in production. We are not claiming a Type II posture, and we are not claiming the controls are running in production until they are.

If your procurement team requires Type II before pilot, we are not the right fit for this round. If your team is willing to evaluate UNMIRI on its planned architecture and current BAA progress, with the understanding that no PHI moves to production until the chain is executed, we can scope a pilot together. Both answers are honest.

What does "HIPAA-ready" mean for an AI vendor?

"HIPAA-ready" means the architecture, contracts, and operational controls required to handle PHI are in place, documented, and executable. It is not a formal certification (HHS does not certify AI vendors). The term is deliberately narrower than "HIPAA-compliant" because compliance is a continuously-demonstrated property, not a point-in-time attestation.

The term I use is HIPAA-ready, and I'm deliberate about what that word is doing.

HIPAA-certified isn't a real status. HHS doesn't certify AI vendors. HIPAA-compliant is colloquially fine but typically means "we've implemented controls we believe meet the Security and Privacy Rules," which is itself a claim that needs substantiation. Ready means the architecture, contracts, and operational controls required to handle PHI are in place, documented, and executable. That's the bar. Today UNMIRI is in active development against that bar; current status is at /security.

If you're building one of these pipelines, use this post as a starting framework, verify every provider-specific term against the current documentation, and get your BAAs executed before the first byte of PHI crosses the wire. That order matters.

The HIPAA-ready architecture described here is what backs the NGS Interpretation API and the Genomics-aware CDS API for healthtech and EHR vendors that need a parsing-and-decision-support layer with this compliance posture out of the box.

Questions, corrections, or counterexamples: email me at hello@unmiri.com. If your healthtech product is evaluating UNMIRI specifically for an integration, the same address is the fastest path in.

Frequently asked questions

Is ChatGPT HIPAA-compliant?
No. The consumer ChatGPT tier does not come with a Business Associate Agreement and may retain prompts and completions for service improvement. Only the OpenAI Enterprise tier supports a signed BAA with zero-retention terms, and even then every API request must explicitly set store: false to suppress retention of PHI-bearing content.

What is a Business Associate Agreement (BAA) for LLM APIs?
A BAA is a HIPAA-required contract between a covered entity (or upstream business associate) and any downstream vendor that handles PHI. For LLM APIs, the BAA must cover zero retention of requests and responses, no use of data for model training, and documented security controls at the provider level.

Does zero-retention mean the LLM provider never sees the data?
No. The provider processes each request to generate a response. Zero-retention means the request and response are not persisted in provider storage after the response is returned, and are not used for model training. Transient in-memory processing during response generation is still required and is covered by the BAA.

Can a diagnostic lab start an AI pilot before SOC-2 Type II is complete?
Yes, if the lab's procurement policy allows risk-based vendor acceptance. Most labs accept an interim security package (executed BAAs, documented controls, zero-retention attestations, and a credible SOC-2 roadmap) under a pilot agreement. If procurement strictly requires Type II before engagement, wait for the audit to complete.

Umair Khan

Founder and CTO, UNMIRI

Building UNMIRI, a precision oncology infrastructure company with four product surfaces: cross-vendor NGS interpretation, genomics-aware decision support, oncology literature intelligence, and a free cross-vendor unification tool for clinicians. Writing here on architecture, clinical data, and HIPAA-ready AI.

Clinical advisors: UNMIRI is in active conversations with multiple board-certified pathologists about formal advisory roles. Public introductions land on the About page once each engagement is formalized and the advisor approves being named.

Want to see this architecture in your stack?

UNMIRI is in design-partner phase across the NGS Interpretation API, the Genomics-aware CDS API, the Literature Intelligence platform, and the free Pathologist Tool. We reply within one business day.