Synthetic data only. Educational use only. Not a medical device. Not for unsupervised clinical decision-making.

EGFR L858R in metastatic NSCLC

The starting input

A synthetic sample report in the Foundation Medicine F1CDx layout: NSCLC, lung, FFPE biopsy, 324-gene panel. The sample is fully synthetic, carries a synthetic-data, demonstration-only watermark, and is built to match the F1CDx report structure. It is not a Foundation Medicine document and contains no real patient data. It lives in our repo at data/sample_reports/synthetic/f1cdx_format_nsclc_egfr-l858r.pdf. Try it live on the vendor sample showcase.

What the parser extracts

The Tier 1 deterministic FMI parser pulls:

  • One variant: EGFR exon 21 L858R, VAF 38%, transcript NM_005228.5, HGVS p.L858R / c.2573T>G
  • Three biomarkers: TMB 4 mut/Mb, microsatellite status MS-Stable, no germline pathogenic findings
  • Two FDA CDx flags: osimertinib, dacomitinib (NSCLC EGFR L858R indications)
  • Trial-section pointer: TRIALS TO CONSIDER block at page 6
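
As a rough illustration of what the extracted fields might look like once structured, here is a minimal sketch. The class names, field names, and the split_protein_change helper are assumptions made for this example, not the actual schema or parser; the values are the ones listed above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical record shapes; names are illustrative, not the real schema.
@dataclass(frozen=True)
class Variant:
    gene: str
    exon: int
    protein_change: str   # HGVS protein notation, e.g. "p.L858R"
    coding_change: str    # HGVS coding notation, e.g. "c.2573T>G"
    transcript: str
    vaf: float            # variant allele fraction as 0..1

@dataclass
class ParsedReport:
    variants: list
    biomarkers: dict
    cdx_flags: list
    trials_page: Optional[int] = None

# Matches simple one-letter missense notation only (e.g. p.L858R).
_PROTEIN_RE = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")

def split_protein_change(hgvs_p: str):
    """Split 'p.L858R' into (reference residue, position, alternate residue)."""
    m = _PROTEIN_RE.match(hgvs_p)
    if m is None:
        raise ValueError(f"unsupported protein notation: {hgvs_p!r}")
    ref, pos, alt = m.groups()
    return ref, int(pos), alt

report = ParsedReport(
    variants=[Variant("EGFR", 21, "p.L858R", "c.2573T>G", "NM_005228.5", 0.38)],
    biomarkers={
        "TMB": "4 mut/Mb",
        "MSI": "stable",
        "germline": "no pathogenic findings",
    },
    cdx_flags=["osimertinib", "dacomitinib"],
    trials_page=6,
)
```

The regex deliberately rejects anything beyond a single-residue substitution, so a deterministic parser built this way would fail loudly rather than guess.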

What the evidence join adds

With the variant locked, /v1/lookup joins the reference graph and returns:

  • CIViC: variant ID 33, multiple curated evidence items; top assertion: EGFR L858R is associated with sensitivity to EGFR inhibitors in NSCLC
  • ClinVar: pathogenic, multiple submitters with criteria provided
  • openFDA: osimertinib indication for first-line NSCLC with EGFR exon 21 L858R substitutions
  • ClinicalTrials.gov: 47 actively recruiting NSCLC trials matching EGFR L858R
  • AMP/ASCO/CAP tier: I-A
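
The join itself can be pictured as a keyed lookup across sources. The sketch below is an assumption-laden stand-in: the SOURCES tables, the lookup function, and the response shape are invented for illustration and do not describe the real /v1/lookup contract. The values are copied from the bullets above.

```python
# Hypothetical in-memory stand-ins for the reference sources.
SOURCES = {
    "civic": {("EGFR", "L858R"): {"variant_id": 33}},
    "clinvar": {("EGFR", "L858R"): {"significance": "pathogenic"}},
    "amp_tier": {("EGFR", "L858R"): {"tier": "I-A"}},
}

def lookup(gene: str, protein_change: str) -> dict:
    """Join every source on a normalized (gene, protein change) key."""
    key = (gene.upper(), protein_change.removeprefix("p."))
    evidence = {}
    missing = []
    for name, table in SOURCES.items():
        hit = table.get(key)
        if hit is None:
            missing.append(name)   # record the gap instead of guessing
        else:
            evidence[name] = hit
    return {
        "variant": f"{key[0]} {key[1]}",
        "evidence": evidence,
        "missing_sources": missing,
    }

result = lookup("egfr", "p.L858R")
```

Reporting missing sources explicitly keeps the join deterministic: a variant with no match returns an empty evidence map, never a fabricated one.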

Live at /lookup.

What the CDS surface produces

Engine 2 routes the parsed report into deterministic templates, not LLMs. The recommendations card surfaces the FDA-approved first-line option, the second- and third-line options, the relevant CPIC pharmacogenomics flags (none here, since EGFR is a somatic driver gene, not a pharmacogene such as a CYP enzyme), and the matching trial NCT IDs. The educational-use banner stays pinned to the page.
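
To show what "deterministic templates" can mean in practice, here is a minimal sketch using Python's standard string.Template. The card wording, the CARD constant, and the render_card function are assumptions for illustration only; the real card has more fields and its own layout.

```python
from string import Template

# Hypothetical card layout; the wording is illustrative.
CARD = Template(
    "First-line (FDA-approved): $first_line\n"
    "Pharmacogenomic flags: $pgx\n"
    "Matching trials: $trials"
)

def render_card(first_line: str, pgx_flags: list, trial_ids: list) -> str:
    """Fill the fixed template; the same inputs always give the same text."""
    return CARD.substitute(
        first_line=first_line,
        pgx=", ".join(pgx_flags) or "none",
        trials=", ".join(trial_ids) or "none",
    )

# Trial IDs are left empty here rather than inventing NCT numbers.
card = render_card("osimertinib", [], [])
```

Because substitute raises KeyError on a missing field, an incomplete parse cannot silently produce a half-filled card.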

Live at /recommendations.

Where the LLM is and isn't

On a clean F1CDx report like this one, the pipeline never touches the Tier 4 vision LLM, because Tier 1 deterministic parsing already satisfies the validation gate. The LLM judge runs only against high-uncertainty findings, and the response would clearly mark any finding it had touched. The final clinical surface is rendered from deterministic templates.
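
The escalation logic described here can be sketched as "try tiers in order, stop at the first result that passes the gate". Everything below is an assumption for illustration: the Tier tuple, the gate rule, and both parser stubs are invented, and only Tier 1 and Tier 4 are modeled because those are the ones named above.

```python
from typing import Callable, NamedTuple, Optional

class Tier(NamedTuple):
    number: int
    uses_llm: bool
    parse: Callable[[str], Optional[dict]]

def passes_gate(result: Optional[dict]) -> bool:
    """Hypothetical gate: every variant needs a gene and a protein change."""
    if not result:
        return False
    variants = result.get("variants", [])
    return bool(variants) and all(
        v.get("gene") and v.get("protein_change") for v in variants
    )

def parse_with_escalation(text: str, tiers: list) -> dict:
    """Run tiers in order and stop at the first one that passes the gate."""
    for tier in tiers:
        result = tier.parse(text)
        if passes_gate(result):
            # The LLM flag travels with the result so callers can surface it.
            return {"tier_used": tier.number, "llm_used": tier.uses_llm, "result": result}
    raise ValueError("no tier satisfied the validation gate")

def tier1_stub(text: str) -> Optional[dict]:
    # Stand-in for deterministic parsing: only recognizes this one pattern.
    if "EGFR" in text and "L858R" in text:
        return {"variants": [{"gene": "EGFR", "protein_change": "p.L858R"}]}
    return None

def tier4_stub(text: str) -> Optional[dict]:
    # Stand-in for the vision LLM; returns a placeholder finding.
    return {"variants": [{"gene": "UNKNOWN", "protein_change": "p.?"}]}

TIERS = [Tier(1, False, tier1_stub), Tier(4, True, tier4_stub)]
clean = parse_with_escalation("EGFR exon 21 L858R", TIERS)
```

On the clean input the loop returns at Tier 1, so the Tier 4 stub is never called.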

What this case does not show

Vendor edge cases. A Caris MI Profile report presents the same biology in a different layout, and a Tempus xT report formats CDx flags differently. The point of UNMIRI is that the canonical schema is identical across all three vendors. See the side-by-side compare view for that.
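
One way to picture a shared canonical schema is a set of per-vendor adapters that all emit the same record. The raw input shapes below are entirely invented for this sketch and do not reflect how any vendor actually structures its reports; only the idea of converging on one output shape comes from the text above.

```python
# Hypothetical adapters; every raw field name here is made up for illustration.
def from_f1cdx(raw: dict) -> dict:
    return {
        "gene": raw["gene"],
        "protein_change": raw["alteration"],
        "vaf": raw["vaf_pct"] / 100,
    }

def from_caris(raw: dict) -> dict:
    gene, change = raw["biomarker"].split(" ", 1)
    return {"gene": gene, "protein_change": change, "vaf": raw["frequency"]}

def from_tempus(raw: dict) -> dict:
    return {
        "gene": raw["symbol"],
        "protein_change": raw["hgvs_p"],
        "vaf": float(raw["vaf"].rstrip("%")) / 100,
    }

records = [
    from_f1cdx({"gene": "EGFR", "alteration": "p.L858R", "vaf_pct": 38}),
    from_caris({"biomarker": "EGFR p.L858R", "frequency": 0.38}),
    from_tempus({"symbol": "EGFR", "hgvs_p": "p.L858R", "vaf": "38%"}),
]
```

Downstream code then depends only on the canonical keys, so adding a vendor means adding one adapter rather than touching the evidence join or the templates.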

Related references