Beyond the PDF: Why We Need a “Compositional Grammar” for Coverage Rules
Meta description (≈155 chars): PDFs are unreadable to machines. A compositional grammar—atoms + logic—plus a HITL QA Sidecar yields citable, 99.9%-grade coverage rules you can compare, query, and automate.
Suggested URL slug: /coverage-rules-compositional-grammar
TL;DR
Part 1 showed why 90% accuracy isn’t good enough in prior auth—the remaining 10% error compounds across many criteria and becomes a safety, compliance, and financial problem. The antidote is GenAI with evidence: a Human‑in‑the‑Loop (HITL) QA Sidecar verifying each criterion against the source. To make that verification scalable, we encode policies as a compositional grammar: atoms (e.g., cnid_bmi_30_ge) combined with Boolean logic. The result is a citable, auditable, and automatable policy‑as‑code layer.
Why the Grammar Matters (and how it fixes the 90% Problem)
In PA, a single determination may hinge on 10–15 specific checks (diagnosis, labs, time windows, step therapies, exclusions, renewal thresholds). Even if a model hits 90% per criterion, overall correctness nosedives—0.9¹⁰ ≈ 35%, 0.9¹⁵ ≈ 21%. By contrast, if we drive per‑criterion accuracy toward 99.9%, a 10‑criterion decision is ~99.0% correct.
The compositional grammar is what lets us measure and achieve that: it breaks prose into verifiable units (“atoms”) that the HITL Sidecar can accept or correct with line‑level evidence. Those verified atoms populate a Gold‑Standard dataset that pushes per‑criterion accuracy from “good drafts” to 99%+ decisions.
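The compounding is easy to sanity-check in a few lines of Python:

# Per-criterion accuracy compounds multiplicatively across n criteria.
for n in (10, 15):
    print(n, round(0.9 ** n, 3), round(0.999 ** n, 4))
# 10 0.349 0.99
# 15 0.206 0.9851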
From Prose to Primitives: Atoms + Logic
Think of a coverage rule as two pieces:
- Atoms — Named, typed predicates that evaluate to true/false for a member, claim, or auth request.
- cnid_bmi_30_ge → BMI ≥ 30
- age_ge_18 → Age ≥ 18
- dx_icd10_e11 → ICD‑10 Type 2 diabetes (E11.*)
- lab_a1c_7_gt_within_90d → HbA1c > 7.0% within last 90 days
- rx_metformin_trial_90d_ge_within_365d → ≥90‑day metformin trial in last year
- doc_lifestyle_program_6mo_ge → Documented lifestyle program ≥ 6 months
Naming pattern (stable + readable):
{namespace}_{concept}_{value?}_{operator}{_time/qualifiers?}
Names are deterministic, units are standardized, and each atom is typed (numeric, code, date, boolean) with mappings to ICD‑10, CPT/HCPCS, LOINC, RxNorm where applicable.
- Logic — How atoms combine, using AND / OR / NOT with explicit grouping.
age_ge_18
AND
( cnid_bmi_30_ge OR (cnid_bmi_27_ge AND (dx_icd10_e11 OR dx_icd10_i10)) )
AND
doc_lifestyle_program_6mo_ge
{
  "all": [
    {"atom": "age_ge_18"},
    {
      "any": [
        {"atom": "cnid_bmi_30_ge"},
        {
          "all": [
            {"atom": "cnid_bmi_27_ge"},
            {"any": [
              {"atom": "dx_icd10_e11"},
              {"atom": "dx_icd10_i10"}
            ]}
          ]
        }
      ]
    },
    {"atom": "doc_lifestyle_program_6mo_ge"}
  ]
}
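This JSON shape is directly executable. A minimal evaluator sketch in Python, assuming atom results arrive as a dict of booleans with None standing in for unknown (the explicit unknown handling previewed in the Implementation Blueprint below):

from typing import Optional

def evaluate(node: dict, facts: dict[str, Optional[bool]]) -> Optional[bool]:
    """Evaluate an all/any/not/atom rule tree against atom results.

    facts maps atom id -> True/False, or None when evidence is missing.
    Three-valued logic: unknown propagates unless the verdict is already
    decided (False short-circuits "all", True short-circuits "any").
    """
    if "atom" in node:
        return facts.get(node["atom"])            # missing fact == unknown
    if "not" in node:
        inner = evaluate(node["not"], facts)
        return None if inner is None else not inner
    if "all" in node:
        results = [evaluate(child, facts) for child in node["all"]]
        if False in results:
            return False
        return None if None in results else True
    if "any" in node:
        results = [evaluate(child, facts) for child in node["any"]]
        if True in results:
            return True
        return None if None in results else False
    raise ValueError(f"unknown node type: {set(node)}")

rule = {"all": [{"atom": "age_ge_18"},
                {"any": [{"atom": "cnid_bmi_30_ge"},
                         {"all": [{"atom": "cnid_bmi_27_ge"},
                                  {"any": [{"atom": "dx_icd10_e11"},
                                           {"atom": "dx_icd10_i10"}]}]}]},
                {"atom": "doc_lifestyle_program_6mo_ge"}]}

facts = {"age_ge_18": True, "cnid_bmi_30_ge": False, "cnid_bmi_27_ge": True,
         "dx_icd10_e11": True, "doc_lifestyle_program_6mo_ge": None}
print(evaluate(rule, facts))  # None: lifestyle documentation still missing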
Provenance by Design: Make Every Atom Citable
To be trustworthy, each atom must be evidence‑backed:
{
  "id": "cnid_bmi_30_ge",
  "type": "numeric",
  "concept": "bmi",
  "operator": "ge",
  "value": 30,
  "unit": "kg/m2",
  "evidence": {
    "plan": "CarrierName Policy XYZ",
    "uid": "policy-xyz-2025-01",
    "pdf_url": "https://…/policy-xyz.pdf",
    "clause_id": "2.1.3"
  },
  "effective": {
    "start": "2025-01-01",
    "end": null
  },
  "version": "1.0.0"
}
Why this matters: the HITL QA Sidecar can show the model’s claim on the left and the exact cited clause on the right. Reviewers either accept or correct the atom; approvals are versioned and hash‑logged. That workflow turns GenAI drafts into a Gold‑Standard dataset with 99.9% clause accuracy (no one says 100%—think “Clorox standard”).
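One way to realize the hash-logging (the exact scheme here is an assumption, not a spec): fingerprint the approved atom together with the verbatim clause text, so any later edit to either invalidates the hash.

import hashlib, json

def statement_hash(atom: dict, clause_text: str) -> str:
    """Fingerprint an approved atom plus the exact clause it cites.

    Canonical JSON (sorted keys, fixed separators) keeps the hash stable
    across serializers; any edit to the atom definition or the clause
    text yields a new hash, so silent drift is detectable.
    """
    canonical = json.dumps(
        {"atom": atom, "clause_text": clause_text},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

atom = {"id": "cnid_bmi_30_ge", "operator": "ge", "value": 30, "unit": "kg/m2"}
# Clause text below is illustrative, not quoted from a real policy.
print(statement_hash(atom, "2.1.3 Members must have a BMI of 30 kg/m2 or greater."))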
How the HITL QA Sidecar Uses the Grammar
Side-by-side verification
- One side: structured model output (plan, drug, indication, atomized criteria).
- Other side: cited passages {plan, uid, clause_id, pdf_url} with the relevant text highlighted.

Pass/fail rubric
- Scope (plan/drug/indication), logic shape (AND/OR), thresholds (e.g., A1c ≥ 7), windows (≤ 90 days), ICD-10/CPT correctness, renewal criteria.

One-step corrections
- Swap in the correct clause or threshold. Corrections are logged to improve future runs.

Structured only
- If criteria don’t align, the item is sent back to earlier pipeline stages—no unstructured “maybe” answers escape.

Versioned snapshots & drift
- Track clause prevalence and CNID usage over time for each plan/drug/indication. When a policy changes, the diff appears at the atom level (a sketch follows this list).
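Atom-level drift detection can start as plain set arithmetic over the atom IDs referenced by two rule snapshots. A sketch, reusing the all/any JSON shape from above:

def atoms_in(node: dict) -> set[str]:
    """Collect every atom ID referenced anywhere in a rule tree."""
    if "atom" in node:
        return {node["atom"]}
    if "not" in node:
        return atoms_in(node["not"])
    children = node.get("all") or node.get("any") or []
    return set().union(*(atoms_in(c) for c in children)) if children else set()

def atom_diff(old_rule: dict, new_rule: dict) -> dict:
    old, new = atoms_in(old_rule), atoms_in(new_rule)
    return {"added": sorted(new - old), "removed": sorted(old - new)}

v2024 = {"all": [{"atom": "cnid_bmi_30_ge"}]}
v2025 = {"all": [{"atom": "cnid_bmi_27_ge"}, {"atom": "dx_icd10_e11"}]}
print(atom_diff(v2024, v2025))
# {'added': ['cnid_bmi_27_ge', 'dx_icd10_e11'], 'removed': ['cnid_bmi_30_ge']}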
Why Grammar + Sidecar Unlocks Compare, Query, Automate
- Compare at scale
  - “This year Payer A replaced cnid_bmi_30_ge with (cnid_bmi_27_ge AND dx_icd10_e11).”
  - “Only 3 of 14 plans require doc_lifestyle_program_6mo_ge for first-line therapy.”
- Query the Policy Vault (the “magic”) — sketched after this list
  - “Show all policies using A1c thresholds > 7% or requiring two oral agents within 120 days.”
  - “Find rules where negative conditions (NOT contraindications) drive denials.”
- Automate, but keep it explainable
  - Pre-Check bundles: Likely / Borderline / Unlikely CNIDs per plan-drug-indication pairing.
  - Doc Pack: auto-generated appeal packets and “missing evidence” flags.
  - Transparent outcomes: atom-level pass/fail with the exact policy clause link.
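Queries like these fall out of the typed registry: filter on concept, operator, value, and unit instead of grepping PDFs. A sketch over an in-memory registry (the rows and the plans field are illustrative):

atom_registry = [
    {"id": "cnid_lab_a1c_ge_7_90d", "concept": "a1c", "operator": "ge",
     "value": 7.0, "unit": "%", "plans": ["Plan U", "Plan K"]},
    {"id": "lab_a1c_7_gt_within_90d", "concept": "a1c", "operator": "gt",
     "value": 7.0, "unit": "%", "plans": ["Plan B"]},
    {"id": "cnid_bmi_30_ge", "concept": "bmi", "operator": "ge",
     "value": 30, "unit": "kg/m2", "plans": ["Plan U"]},
]

# "Show all policies using A1c thresholds > 7%" (>= shown here too):
hits = [a for a in atom_registry
        if a["concept"] == "a1c"
        and a["operator"] in ("gt", "ge")
        and a["value"] >= 7.0]
for a in hits:
    print(a["id"], "->", a["plans"])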
Implementation Blueprint
- Controlled vocabulary — Canonical concepts, operators, units, and code systems.
- Atom registry — IDs, definitions, examples, evidence fields, versions, effective dates.
- Authoring & linting — A policy editor that composes atoms; flags missing units, unknown codes, dangling references (a lint sketch follows this list).
- Evaluation engine — Strong typing, unit coercion, explicit handling of unknown.
- Diffs & alerts — Rule- and atom-level change detection with subscriber notifications.
- APIs — GET /atoms (or as I call them, Clinical Node IDs), GET /rules, POST /evaluate returning verdicts and explanations.
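The linting step can start as simple ID checks against the naming pattern. A sketch (the namespace and operator sets here are assumptions, not a published vocabulary):

import re

NAMESPACES = {"cnid", "age", "dx", "lab", "rx", "doc"}
OPERATORS = {"ge", "gt", "le", "lt", "eq", "ne"}

def lint_atom_id(atom_id: str) -> list[str]:
    """Return lint problems; an empty list means the ID looks well-formed."""
    if not re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*", atom_id):
        return ["must be lowercase snake_case (a-z, 0-9, underscores)"]
    problems = []
    parts = atom_id.split("_")
    if parts[0] not in NAMESPACES:
        problems.append(f"unknown namespace: {parts[0]}")
    # Code-valued atoms (e.g., dx_icd10_e11) carry no comparison operator;
    # numeric and date atoms should name one.
    if not OPERATORS & set(parts[1:]):
        problems.append("no operator found (acceptable only for code atoms)")
    return problems

print(lint_atom_id("cnid_bmi_30_ge"))           # []
print(lint_atom_id("lab_a1c_7_gt_within_90d"))  # []
print(lint_atom_id("dx_icd10_e11"))             # ['no operator found ...']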
Quality You Can Measure (and Promise)
Auditability KPIs
- Traceability: Every claim retains {plan, uid, clause_id, statement_hash, pdf_url}.
- Snapshot integrity: Answers tie to a specific policy version; clause and rule hashes match the plan × drug × indication at first pass.
- Drift tracking: Alerts when thresholds/time windows change, with impact analysis.
- Reviewer agreement: Inter-rater reliability on Sidecar approvals.
Worked GLP‑1 Example (bridge to Article 3)
Draft claim: “Plan U requires two oral agents within 90 days and A1c ≥ 7% for initial GLP-1 approval; renewal needs ≥ 1% A1c reduction.”

Sidecar check items
- Step therapy atoms present? cnid_step_any_two_orals_n2, window_days=90
- Lab threshold atom present? cnid_lab_a1c_ge_7_90d
- Renewal atom present? cnid_renewal_a1c_drop_ge_1
- Scope correct? (GLP-1, adult T2D)
- Citations present? Show exact clause lines; verify no contradicting text.

Decision: Approve or edit (e.g., step window is 120 days, not 90).

Outcome: Ships only when every atom is backed by a current snapshot citation.
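Expressed in the grammar, the corrected Plan U criteria become a rule the evaluator from earlier can run. A sketch only; the "params" key is an assumed extension for parameterized atoms (the draft wrote "cnid_step_any_two_orals_n2, window_days=90"):

plan_u_glp1 = {
    "initial": {"all": [
        # 120-day window reflects the reviewer's correction above.
        {"atom": "cnid_step_any_two_orals_n2", "params": {"window_days": 120}},
        {"atom": "cnid_lab_a1c_ge_7_90d"},        # A1c >= 7% within 90 days
    ]},
    "renewal": {"all": [
        {"atom": "cnid_renewal_a1c_drop_ge_1"},   # >= 1% A1c reduction
    ]},
}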
In Article 3, we’ll “mint” the first 25 CNIDs for GLP‑1
coverage—proof that the grammar is practical, testable, and reusable.
FAQ
Isn’t this just guidelines in JSON?
No—this is policy‑as‑code with provenance. Drug policy changes
continuously. We need typed atoms, normalized units/codes, explicit operators,
and versioned logic, each tied to the source clause.
How do you handle ambiguity and negatives?
We model them explicitly (e.g., NOT cnid_contraindication_any) and flag ambiguous language for reviewer action. The QA Sidecar enforces a pass/fail rubric applied by licensed clinicians.
Will this replace PDFs?
The PDF remains the legal narrative. The grammar is the operational substrate that makes policies comparable, queryable, and automatable.
Request a 10-Minute Walkthrough
Want to see the HITL QA Sidecar verify a GLP-1 policy end-to-end? We’ll review citations, atoms, diffs, and the evaluation output live.
✉️ Email to Request the Demo