Why manual variant annotation doesn't scale
A senior bioinformatician spends 30–90 minutes per complex case cross-referencing OncoKB, ClinVar, and FDA labels to assign a clinical tier. That workload scales linearly with panel volume. For a lab processing 200 panels per month, it adds up to roughly one full-time bioinformatician spent almost entirely on curation rather than on higher-value work such as method development, QA, or edge-case review.
Most of that cross-referencing is deterministic: variant → knowledge-base lookup → evidence level → tier badge. That deterministic part is what UNMIRI automates.
The annotation pipeline, step by step
Every variant flows through six stages. The result is a tier assignment with full provenance — you can trace any classification back to the specific knowledge-base version and entry that produced it.
| Stage | What happens | Data source |
|---|---|---|
| 1. Normalization | HGVS canonicalization, transcript selection (MANE Select preferred) | Ensembl, RefSeq |
| 2. Knowledge-base match | Exact variant lookup across oncology KBs | OncoKB, ClinVar, COSMIC |
| 3. Evidence grading | Map KB evidence → AMP/ASCO/CAP tier (I–IV) | OncoKB Level 1–4, FDA approval status |
| 4. Germline check | If germline, apply ACMG/AMP criteria (pathogenic: PVS1, PS1–4, PM1–6, PP1–5; benign: BA1, BS1–4, BP1–7) | ACMG/AMP 2015 framework |
| 5. Novel-variant fallback | In-silico prediction + population frequency + domain analysis | SIFT, PolyPhen-2, REVEL, gnomAD |
| 6. Lab-specific override | Apply any institutional classifications from your lab's override set | Your lab's pinned interpretations |
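
To make the flow concrete, here is a minimal Python sketch of stages 2, 3, and 6 for a variant that has already been normalized. It is not UNMIRI's implementation: the `classify` function, the `Classification` record, the `LEVEL_TO_TIER` crosswalk, and the dictionary shapes are all assumptions made up for illustration, and the level-to-tier mapping is a placeholder rather than a validated rule set. Stages 4 and 5 are left out; an unmatched variant is simply marked Tier III pending the fallback.

```python
from dataclasses import dataclass, field

# Placeholder crosswalk from a KB evidence level to an AMP/ASCO/CAP tier.
# This is an illustrative assumption, not a validated or official mapping.
LEVEL_TO_TIER = {"LEVEL_1": "I", "LEVEL_2": "I", "LEVEL_3": "II", "LEVEL_4": "II"}


@dataclass
class Classification:
    hgvs: str
    tier: str
    provenance: list = field(default_factory=list)  # human-readable reasoning steps


def classify(hgvs, kb_entries, lab_overrides):
    """Run stages 2, 3, and 6 for one already-normalized variant."""
    steps = []
    entry = kb_entries.get(hgvs)  # stage 2: exact knowledge-base lookup
    if entry is None:
        # Stage 5 (novel-variant fallback) would run here; the sketch only flags it.
        tier = "III"
        steps.append("no exact KB match; novel-variant fallback required")
    else:
        tier = LEVEL_TO_TIER[entry["level"]]  # stage 3: evidence grading
        steps.append(f"{entry['kb']} {entry['version']}: {entry['level']} -> tier {tier}")
    if hgvs in lab_overrides:  # stage 6: the lab's pinned interpretation wins
        tier = lab_overrides[hgvs]
        steps.append(f"lab override -> tier {tier}")
    return Classification(hgvs, tier, steps)


# Example inputs; the KB version string is a made-up label.
kb = {"NM_004333.6:c.1799T>A": {"kb": "OncoKB", "version": "example-version", "level": "LEVEL_1"}}
result = classify("NM_004333.6:c.1799T>A", kb, lab_overrides={})
```

Because every step appends to `provenance`, the returned record carries its own reasoning trail, which is the property the rest of this page relies on.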
Why explainability matters for CAP/CLIA labs
CAP inspectors ask: “How did you arrive at this tier?” A black-box LLM answer — “the model said so” — is not an answer. UNMIRI returns the full reasoning chain for every variant: which knowledge-base entry matched, what evidence level applied, and whether any lab override was triggered. This is what makes the output auditable rather than merely plausible.
Output formats
Per variant, you receive: HGVS notation, transcript ID, clinical tier, evidence citations (with knowledge-base version), confidence flag, and any lab-specific override annotations. Output is available as structured JSON for LIMS ingestion, or rendered into the 2-page Actionable Insight alongside treatment recommendations.
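As a rough illustration of what a per-variant JSON record could look like, the Python sketch below builds one and checks that the fields listed above are all present before handing it to a LIMS. The key names, the `missing_fields` helper, and every value are assumptions; the actual schema is not specified here.

```python
import json

# Hypothetical key names: the fields come from the list above, the JSON keys do not.
REQUIRED = {
    "hgvs",
    "transcript_id",
    "clinical_tier",
    "evidence_citations",
    "confidence_flag",
    "lab_overrides",
}

# Illustrative values only; placeholders in angle brackets are deliberately unfilled.
record = {
    "hgvs": "NM_004333.6:c.1799T>A",
    "transcript_id": "NM_004333.6",
    "clinical_tier": "I",
    "evidence_citations": [
        {"knowledge_base": "OncoKB", "kb_version": "<pinned version>", "entry": "<entry id>"}
    ],
    "confidence_flag": "high",
    "lab_overrides": [],
}


def missing_fields(rec):
    """Return the required keys that are absent from a record, sorted by name."""
    return sorted(REQUIRED - set(rec))


# Serialize for ingestion only if the record is complete.
if not missing_fields(record):
    payload = json.dumps(record, indent=2)
```

A completeness check of this kind is cheap, and it keeps a record without citations from reaching the report unnoticed.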
Compliance controls (BAA, zero retention, US-only data residency) are documented on the security page. For the practical architecture, see Building a HIPAA-Ready Architecture for Clinical Decision Support.
How UNMIRI actually does this
UNMIRI classifies variants by traversing a knowledge graph that encodes OncoKB evidence levels, ClinVar clinical significance assertions, and ClinicalTrials.gov eligibility. The AMP/ASCO/CAP tier is produced by deterministic rules, not by an LLM. Every tier assignment traces back to a specific KB entry, and lab-specific overrides are first-class graph edges. More on the architecture.
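The idea of rule-based traversal with overrides stored as edges can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the production graph: the edge layout, node names such as `variant:BRAF_V600E` and `oncokb:entry-1`, the `RULES` table, and the `resolve_tier` function are all hypothetical, and ClinVar and ClinicalTrials.gov nodes are omitted for brevity.

```python
# Edges keyed by (source node, relation) -> list of target nodes.
# All node identifiers are illustrative placeholders.
GRAPH = {
    ("variant:BRAF_V600E", "has_evidence"): ["oncokb:entry-1"],
    ("oncokb:entry-1", "has_level"): ["LEVEL_1"],
}

# Placeholder deterministic rules (an assumption, not a validated crosswalk).
RULES = {"LEVEL_1": "I", "LEVEL_2": "I", "LEVEL_3": "II", "LEVEL_4": "II"}


def resolve_tier(graph, variant):
    """Walk evidence edges, apply rules, then apply any lab override edge.

    Returns (tier, path), where path lists every node visited so the
    result can be traced back to the entry that produced it.
    """
    tier, path = None, [variant]
    for entry in graph.get((variant, "has_evidence"), []):
        levels = [lvl for lvl in graph.get((entry, "has_level"), []) if lvl in RULES]
        if levels:
            tier = RULES[levels[0]]
            path += [entry, levels[0]]
            break
    # An override is just another edge; it is applied last, so it takes precedence.
    for target in graph.get((variant, "lab_override"), []):
        tier = target.split(":", 1)[1]  # "tier:II" -> "II"
        path.append(target)
        break
    return tier, path


tier, path = resolve_tier(GRAPH, "variant:BRAF_V600E")

# Adding an override means adding one edge; the rules themselves are untouched.
overridden = dict(GRAPH)
overridden[("variant:BRAF_V600E", "lab_override")] = ["tier:II"]
tier_with_override, path_with_override = resolve_tier(overridden, "variant:BRAF_V600E")
```

Modeling the override as an edge rather than a special case keeps it in the same traversal path as the KB evidence, so both appear in the trace.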