
Variant Annotation Automation for Diagnostic Labs

Automated AMP/ASCO/CAP variant tiering grounded in OncoKB, ClinVar, and openFDA drug labels — not a black-box LLM guess. Every tier assignment is explainable, auditable, and editable. Built for labs running solid-tumor NGS panels at volume.

AMP/ASCO/CAP · ACMG/AMP germline · OncoKB · ClinVar · COSMIC · Auditable graph

Why manual variant annotation doesn't scale

A senior bioinformatician spends 30–90 minutes per complex case cross-referencing OncoKB, ClinVar, and FDA labels to assign a clinical tier. That workload is linear with panel volume. For a lab processing 200 panels per month, that's a full-time bioinformatics headcount spent almost entirely on curation — not on the higher-value work of method development, QA, or edge-case review.
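The back-of-envelope math behind that headcount claim, assuming roughly 160 productive hours per bioinformatician per month (our assumption, not a figure from the text):

```python
# Back-of-envelope check on the curation workload. The 160 hours/month
# figure for one FTE is an assumption for illustration.

panels_per_month = 200
minutes_per_case = (30 + 90) / 2        # midpoint of the 30-90 minute range

curation_hours = panels_per_month * minutes_per_case / 60
fte_equivalent = curation_hours / 160

print(f"{curation_hours:.0f} hours/month = {fte_equivalent:.2f} FTE")
# 200 hours/month, i.e. about 1.25 FTE spent on curation alone
```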

Most of that cross-referencing is deterministic: variant → knowledge base lookup → evidence tier → tier badge. The deterministic part is what UNMIRI automates.

The annotation pipeline, step by step

Every variant flows through six stages. The result is a tier assignment with full provenance — you can trace any classification back to the specific knowledge-base version and entry that produced it.

| Stage | What happens | Data source |
|---|---|---|
| 1. Normalization | HGVS canonicalization, transcript selection (MANE Select preferred) | Ensembl, RefSeq |
| 2. Knowledge-base match | Exact variant lookup across oncology KBs | OncoKB, ClinVar, COSMIC |
| 3. Evidence grading | Map KB evidence → AMP/ASCO/CAP tier (I–IV) | OncoKB Level 1–4, FDA approval status |
| 4. Germline check | If germline, apply ACMG/AMP criteria (PVS1, PS1–4, PM1–6, PP1–5) | ACMG/AMP 2015 framework |
| 5. Novel-variant fallback | In-silico prediction + population frequency + domain analysis | SIFT, PolyPhen-2, REVEL, gnomAD |
| 6. Lab-specific override | Apply any institutional classifications from your lab's override set | Your lab's pinned interpretations |
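As a sketch, the evidence-grading stage is essentially a pure lookup. The mapping below is a simplified illustration of how OncoKB levels and ClinVar calls commonly align with AMP/ASCO/CAP tiers; it is not UNMIRI's exact rule set, and the function name is hypothetical. Real tiering also weighs tumor type, FDA approval status, and guideline inclusion.

```python
# Simplified, illustrative mapping from OncoKB therapeutic evidence
# levels to AMP/ASCO/CAP tiers. Not UNMIRI's production rule set.
ONCOKB_TO_AMP = {
    "LEVEL_1": "Tier I",    # FDA-recognized biomarker in this tumor type
    "LEVEL_2": "Tier I",    # standard-of-care biomarker
    "LEVEL_3A": "Tier II",  # clinical evidence in this tumor type
    "LEVEL_3B": "Tier II",  # clinical evidence in another tumor type
    "LEVEL_4": "Tier II",   # biological evidence only
}

def grade_evidence(oncokb_level=None, clinvar_significance=None):
    """Deterministic rule: KB evidence in, AMP/ASCO/CAP tier out."""
    if oncokb_level in ONCOKB_TO_AMP:
        return ONCOKB_TO_AMP[oncokb_level]
    if clinvar_significance in ("Benign", "Likely benign"):
        return "Tier IV"
    # No therapeutic evidence and not benign: treat as a VUS and hand
    # off to the novel-variant fallback (stage 5).
    return "Tier III"

grade_evidence(oncokb_level="LEVEL_1")         # -> "Tier I"
grade_evidence(clinvar_significance="Benign")  # -> "Tier IV"
```

Because every branch is a table lookup or an explicit rule, the same inputs always produce the same tier, which is what makes the step auditable.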

Why explainability matters for CAP/CLIA labs

CAP inspectors ask: “How did you arrive at this tier?” A black-box LLM answer — “the model said so” — is not an answer. UNMIRI returns the full reasoning chain for every variant: which knowledge-base entry matched, what evidence level applied, and whether any lab override was triggered. This is what makes the output auditable rather than merely plausible.

Architecture note. Classifications come from the knowledge graph. The LLM only formats the final report. This separation is what eliminates the hallucination failure mode. Full technical walkthrough: Why Vector RAG Fails for Oncology.

Output formats

Per variant, you receive: HGVS notation, transcript ID, clinical tier, evidence citations (with knowledge-base version), confidence flag, and any lab-specific override annotations. Output is available as structured JSON for LIMS ingestion, or rendered into the 2-page Actionable Insight alongside treatment recommendations.
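A per-variant record along those lines might look like the following. The field names and structure are illustrative, not UNMIRI's published schema; the variant shown is BRAF p.V600E.

```python
import json

# Illustrative per-variant output record; field names are hypothetical
# and chosen to mirror the fields listed above.
record = {
    "hgvs": "NM_004333.6:c.1799T>A",   # HGVS notation (BRAF p.V600E)
    "transcript_id": "NM_004333.6",    # MANE Select transcript
    "tier": "Tier I",                  # AMP/ASCO/CAP tier
    "evidence": [
        {"source": "OncoKB",  "entry": "LEVEL_1",    "kb_version": "..."},
        {"source": "ClinVar", "entry": "Pathogenic", "kb_version": "..."},
    ],
    "confidence_flag": "high",
    "lab_override": None,              # no institutional override applied
}

print(json.dumps(record, indent=2))    # structured JSON for LIMS ingestion
```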

Compliance controls — BAA, zero-retention, US-only residency — are documented on the security page. Practical architecture: Building a HIPAA-Ready Architecture for Clinical Decision Support.

How UNMIRI actually does this

UNMIRI classifies variants by traversing a knowledge graph that encodes OncoKB evidence levels, ClinVar clinical significance assertions, and ClinicalTrials.gov eligibility. The AMP/ASCO/CAP tier is produced by deterministic rules, not by an LLM. Every tier assignment traces back to a specific KB entry, and lab-specific overrides are first-class graph edges. More on the architecture.
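A minimal sketch of that idea, with the knowledge graph as an adjacency map and the lab override as just another edge resolved last. All names and the toy data are hypothetical, not UNMIRI's internal model:

```python
# Toy knowledge graph: variant -> list of (relation, value, provenance).
KNOWLEDGE_GRAPH = {
    "BRAF p.V600E": [
        ("oncokb_level", "LEVEL_1", "OncoKB entry, version pinned at ingest"),
        ("clinvar_significance", "Pathogenic", "ClinVar assertion"),
    ],
}

# Lab-specific overrides are first-class edges, checked after KB evidence.
LAB_OVERRIDE_EDGES = {
    # "BRAF p.V600E": ("Tier I", "institutional pinned interpretation"),
}

def resolve_tier(variant):
    """Walk the graph deterministically; return (tier, provenance trail)."""
    trail = []
    tier = "Tier III"                       # default: unknown significance
    for relation, value, source in KNOWLEDGE_GRAPH.get(variant, []):
        trail.append((relation, value, source))
        if relation == "oncokb_level" and value in ("LEVEL_1", "LEVEL_2"):
            tier = "Tier I"
    if variant in LAB_OVERRIDE_EDGES:       # override edge wins, and is logged
        tier, reason = LAB_OVERRIDE_EDGES[variant]
        trail.append(("lab_override", tier, reason))
    return tier, trail

tier, trail = resolve_tier("BRAF p.V600E")  # deterministic, fully traceable
```

The provenance trail is what answers the CAP inspector's question: every entry names the edge that contributed to the tier.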

Frequently asked questions

Which tiering guidelines does UNMIRI follow?

Our tiering follows the AMP/ASCO/CAP guideline framework (Tier I–IV), cross-referenced against OncoKB evidence levels (1–4) and ClinVar clinical significance classifications. Somatic variants use AMP/ASCO/CAP; germline variants use ACMG/AMP. Every variant receives both a clinical tier and a source attribution.

Free your bioinformatician from tier assignment.

Move 80% of curation work to the engine. Your bioinformatician reviews edge cases instead of every case.