Stop your matcher from conflating L858R with T790M.
A variant-aware trial-matching API for oncology matchers. Cross-vendor NGS report parsing plus typed-graph variant identity, so two mutations in the same gene that point to different drugs never get matched to the same trial. Per-API-call pricing. Sits behind your existing matcher.
Built on the result published in our bioRxiv preprint: cosine-similarity vector retrieval conflates 100% of clinically distinct variant pairs at τ ≥ 0.95.
The problem your matcher is shipping into production
A patient with EGFR L858R at first diagnosis goes on osimertinib first-line. The same patient with T790M after progression on a first-generation TKI also goes on osimertinib, but as second-line salvage. Different clinical setting, different trial arm, same gene. Confuse them and either the sponsor catches the error at screening or nobody does; either way, you spend recruitment dollars on misrouted referrals.
Biomedical embedding models (PubMedBERT, MedCPT) conflated all nine clinically distinct cancer variant pairs tested (100%) at cosine similarity τ ≥ 0.95.
In a 52-document corpus retrieval test, the wrong paired variant appeared in the top 5 for most queries under the biomedical encoders.
With UNMIRI's typed-graph pipeline and an ingestion-time normalization layer, conflation drops to zero by construction.
How it sits in your stack
You keep your matching engine, your ranking, your sponsor workflow, your eligibility parser if you have one. UNMIRI is the variant-grounding layer underneath: cross-vendor NGS in, structured variant tuple plus matched trials out.
POST /v4/trials/match
Variant tuple plus optional tumor type, line of therapy, and ECOG. Returns scored trial matches with rationale per match. Deterministic identity check; the gene + HGVS short or protein form is the canonical key.
POST /v4/trials/match
{
  "variants": [{"gene": "EGFR", "hgvs_short": "L858R"}],
  "tumor_type": "non-small cell lung cancer",
  "require_recruiting": true
}
POST /v4/trials/parse-eligibility
Free-text eligibility criteria in, structured filters out: age bounds, ECOG, required variants, tumor types, prior-therapy exclusions, with citation offsets. Two-pass: rule-based extractor and an optional LLM citation-judge.
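To make the rule-based first pass concrete, here is a minimal Python sketch of the idea: regular expressions pull an age bound and an ECOG ceiling out of criteria text and record the character offsets of each hit as the citation. The patterns, the function name, and the output field names (min_age, max_ecog, start, end) are hypothetical illustrations, not UNMIRI's actual extractor or response schema, and real criteria text is far more varied than two patterns can cover.

```python
import re

# Hypothetical patterns for illustration only.
AGE_MIN = re.compile(r"Age\s*(?:>=|≥)\s*(\d+)", re.IGNORECASE)
ECOG_MAX = re.compile(r"ECOG\s*(?:<=|≤)\s*(\d)", re.IGNORECASE)


def extract_filters(criteria_text):
    """Return structured filters, each with the offsets of the text it came from."""
    filters = {}
    for name, pattern in (("min_age", AGE_MIN), ("max_ecog", ECOG_MAX)):
        match = pattern.search(criteria_text)
        if match:
            filters[name] = {
                "value": int(match.group(1)),
                "start": match.start(),  # citation offset: first character of the hit
                "end": match.end(),      # citation offset: one past the last character
            }
    return filters


# Criteria text adapted from the sample request on this page.
criteria = (
    "Inclusion Criteria:\n"
    " - Age >= 18 years\n"
    " - ECOG <= 2\n"
    " - EGFR L858R mutation"
)
result = extract_filters(criteria)
```

Slicing the input with the stored offsets, for example criteria[result["min_age"]["start"]:result["min_age"]["end"]], recovers the exact span a reviewer or a citation judge would check against the extracted value.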
POST /v4/trials/parse-eligibility
{
  "nct_id": "NCT04555411",
  "criteria_text": "Inclusion Criteria:\n - Age >= 18 years\n - ECOG <= 2\n - EGFR L858R mutation..."
}
GET /v4/trials/{nct_id}
Full trial detail: title, phases, conditions, interventions, eligibility text, parsed filters, sponsor, last update date. Backed by a daily ClinicalTrials.gov v2 sync.
GET /v4/trials/search
Cohort builder substrate: keyword and status filters across the corpus. Pair with parse-eligibility for filterable structured fields.
Cross-vendor parsing handles Foundation Medicine, Tempus, Caris, Guardant, Natera, NeoGenomics, Strata, Personalis, and others. See /coverage for the honest matrix of what's production-ready today.
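For teams wiring this in from Python, the /v4/trials/match request shown earlier could be assembled roughly as follows. This is a sketch under assumptions: the base URL is a placeholder, the helper name is made up, and only the three body fields from the sample request are used, because the field names for line of therapy and ECOG are not given on this page.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not stated on this page.
BASE_URL = "https://api.example.invalid"


def build_match_request(gene, hgvs_short, tumor_type=None, require_recruiting=True):
    """Build (but do not send) a POST request for /v4/trials/match."""
    payload = {
        "variants": [{"gene": gene, "hgvs_short": hgvs_short}],
        "require_recruiting": require_recruiting,
    }
    if tumor_type is not None:
        payload["tumor_type"] = tumor_type
    return urllib.request.Request(
        BASE_URL + "/v4/trials/match",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_match_request(
    "EGFR", "L858R", tumor_type="non-small cell lung cancer"
)
```

Passing the request to urllib.request.urlopen would send it. Partner-tier calls will also need whatever credentials your account is issued, which this sketch leaves out because the authentication scheme is not described here.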
Who this is for
Oncology trial-matching products whose ranking value is real, but whose variant-identity substrate is leaky. If your matching pipeline ingests NGS reports as text and runs them through vector retrieval, this is the layer you don't want to build yourself.
Mendel.ai · Massive Bio · Trial Library · Antidote Match · Inato (oncology segment) · Deep Lens / Paradigm · ConcertAI
Patient-to-trial matchers whose pipelines include biomarker extraction from heterogeneous NGS reports. Your matcher stays yours; UNMIRI handles the variant substrate.
Tempus TIME · Caris · Foundation Medicine · Guardant
Integrated NGS labs match their own sequenced patients. They have no commercial incentive to parse competitor reports cleanly. UNMIRI is for everyone they don't sequence.
Pricing and access
Per-API-call pricing with volume tiers. No seat pricing. No data sharing with UNMIRI; the API runs on the AWS BAA path and the LLM path operates exclusively on public ClinicalTrials.gov criteria text (no PHI in any LLM prompt).
Developer tier
Free synthetic samples for evaluation. Three pre-loaded variant tuples (EGFR L858R + NSCLC, BRAF V600E + melanoma, KRAS G12C + NSCLC). Hit /v4/trials/match/sample with a sample_id.
Partner tier
Per-call pricing on the full /v4/trials/* surface. Production rate limits, monthly invoice, mutual BAA on PHI-touching paths. Tell us your monthly variant-match volume and we'll send a quote.
Frequently asked questions
How is this different from Tempus TIME, Mendel.ai, or ConcertAI?
Tempus matches Tempus-sequenced patients only. Mendel runs an LLM over chart text rather than a typed variant graph. ConcertAI focuses on RWE / sponsor recruitment. UNMIRI's API is the variant-identity substrate underneath any of those — cross-vendor NGS parsing plus a typed graph that refuses to conflate EGFR L858R with EGFR T790M. We sell to those products, not against them.
Which NGS vendors does the parser support?
Foundation Medicine (F1CDx, F1 Liquid, F1 Heme), Tempus (xT, xR), Caris (MI Profile), Guardant360, Natera (Signatera), NeoGenomics, Strata Oncology, Personalis, and OmniSeq. The /coverage page tracks the production-ready matrix; we expand support based on partner volume.
How exactly does the variant-identity guard work?
Every variant in the reference graph has a canonical key composed of gene + HGVS protein or short form, normalized at ingestion. A match returns trials only when that canonical key is present in the trial's parsed eligibility filters. The bioRxiv preprint (doi:10.64898/2026.05.05.723102) quantifies this against three biomedical embedding models — vector similarity conflated 100% of clinically distinct pairs at τ ≥ 0.95; the typed-graph approach drops conflation to zero by construction.
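The guard described above can be pictured with a short Python sketch. It is an illustration under assumptions, not UNMIRI's implementation: it handles only simple missense substitutions, the function names are hypothetical, and the trial identifiers are made-up placeholders. Three-letter and one-letter protein notations are normalized to the same (gene, short form) key, and a trial is returned only when that exact key is among its required variants.

```python
import re

# Standard three-letter to one-letter amino-acid codes.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
_LONG = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")
_SHORT = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def canonical_key(gene, hgvs):
    """Normalize e.g. 'p.Leu858Arg' or 'L858R' to ('EGFR', 'L858R')."""
    text = hgvs.strip()
    if text.startswith("p."):
        text = text[2:]
    text = text.strip("()")  # tolerate the predicted form p.(Leu858Arg)
    long_form = _LONG.match(text)
    if long_form:
        try:
            ref = AA3_TO_1[long_form.group(1)]
            alt = AA3_TO_1[long_form.group(3)]
        except KeyError as exc:
            raise ValueError(f"unknown amino acid in {hgvs!r}") from exc
        pos = long_form.group(2)
    else:
        short_form = _SHORT.match(text.upper())
        if not short_form:
            raise ValueError(f"unsupported variant notation: {hgvs!r}")
        ref, pos, alt = short_form.groups()
    return (gene.strip().upper(), f"{ref}{pos}{alt}")


def matching_trials(key, required_by_trial):
    """Return trial ids whose required-variant set contains the exact key."""
    return [trial for trial, keys in required_by_trial.items() if key in keys]


# Placeholder trial ids, not real registry entries.
trials = {
    "TRIAL-A": {("EGFR", "L858R")},
    "TRIAL-B": {("EGFR", "T790M")},
}
l858r_matches = matching_trials(canonical_key("EGFR", "p.Leu858Arg"), trials)
t790m_matches = matching_trials(canonical_key("egfr", "T790M"), trials)
```

Because the lookup is exact set membership rather than a similarity score, an L858R query has no route to a T790M-only trial, which is the sense in which conflation is ruled out by construction.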
What's the per-call latency target?
Cold p95 under 800ms for /v4/trials/match; warm p95 under 200ms. The eligibility-criteria parser is delay-tolerant (Batch API friendly) — typical end-to-end runtime is overnight for a full CT.gov delta sync.
Do you sign a BAA?
Yes, on the partner tier. The PHI control plane runs entirely in our AWS account under our active AWS BAA (us-east-1). Microsoft Online Services BAA covers narrow Azure OpenAI inference on public CT.gov text — no PHI ever enters the LLM path.
Can we self-host?
Not today. The core engine runs as a managed API. If self-hosting is mandatory for your procurement process, contact us; for the right partner deal we can scope an on-prem version against a private deployment template.
What does the partner tier include that developer doesn't?
Developer tier is free, sample-only, no authentication required. Partner tier adds: arbitrary variant tuples in /v4/trials/match, partner-tier rate limits (≥1k calls/min default), production support SLA, signed BAA, custom eligibility-criteria parsing for proprietary trial sets, and a named CSM.