Policy Vault: from PDFs and Portals to FHIR‑Ready Coverage Rules
Turning payer policy content into clean, computable coverage rules—with hashes, dates, and URLs for every artifact.
TL;DR
Policy Vault ingests payer policy content (PDF/HTML), cleans the text, and segments true prior authorization (PA) criteria. That data powers two surfaces:
- Coverage Explorer (front end): a coverage checker for pharmacists and hubs.
- Computable Rules (back end): a rules feed payers can wire into CRD/DTR/PAS.
Everything is provable by provenance—we keep hashes, dates, and URLs for every artifact.
Why pharmacists still struggle with PA—even in 2025
If you’ve ever tried to quickly answer “Is PA required and what documentation do I need?” you already know the problem:
- Drug coverage criteria live behind portal shells or inside PDFs.
- Language is inconsistent across payers and lines of business (commercial vs. Medicare Advantage, drug class vs. drug category, or specialty/non‑specialty).
- Criteria are split across sections (“Target Agent(s)” here; “Clinical Criteria for Approval” there), multi‑policy program setups, and unique back‑end benefit‑design rules.
- Updates land without notice—last week’s answer may be stale today.
The result: wasted time, denials, and back‑and‑forth that delays therapy. The web is full of policy content, but what healthcare workflows need is computable coverage rules with provenance, and fewer clicks.
What we’re building at Policy Vault (coverage.pharmacistwrite.com)
Policy Vault is a methodology and a toolchain that transforms payer policy content into clean, structured, FHIR‑ready data. We ship it two ways (and the product is evolving):
- Coverage Explorer directory (front end) – A fast web directory that reduces the clicks required to find coverage criteria for drug × plan × indication triples.
- Computable Rules (back end) – Schematized output you can map to FHIR PlanDefinition / Questionnaire / Library (CQL) so CRD/DTR/PAS implementations can consume it. Stored in JSON, CSV, and lightweight database files.
Both share the same spine: crawl → clean → segment → extract → normalize → validate → publish, with hashes and URLs preserved end‑to‑end.
The methodology (how we make policy content computable)
1) Acquisition with provenance
We snapshot payer pages and PDFs and store:
- URLs (page + print), normalized.
- Hashes (MD5/SHA‑256) for idempotency.
- First/last seen dates.
This lets us re‑link every extracted criterion to its exact source snapshot.
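As a sketch of what that record looks like in practice (Python; the field names and helper are illustrative, not our exact schema):

```python
import hashlib
from datetime import datetime, timezone

def make_provenance(url: str, content: bytes, prior: dict | None = None) -> dict:
    """Build a provenance record for one policy snapshot.

    Hashing makes re-crawls idempotent: identical content yields the same
    digest, so a new version is stored only when the bytes actually change.
    """
    now = datetime.now(timezone.utc).isoformat()
    return {
        "url": url,                                    # normalized page/print URL
        "md5": hashlib.md5(content).hexdigest(),
        "sha256": hashlib.sha256(content).hexdigest(),
        "first_seen": prior["first_seen"] if prior else now,
        "last_seen": now,
    }
```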
2) HTML and/or PDF to text—without losing the signal
Policies vary by host (e.g., BCBS Alabama or Medica). We:
- Strip portal chrome, navigation, and on‑page fluff.
- Detect embedded PDFs and extract true policy text (with OCR fallback, e.g., Tesseract, when needed).
- Normalize bullets, line breaks, and hyphenation so sections read cleanly.
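A minimal sketch of the fallback logic, assuming pdfminer.six for the text layer and pytesseract for OCR (these libraries and the min_chars threshold are our illustration, not a prescribed stack):

```python
import re

import pytesseract                               # needs the tesseract binary
from pdf2image import convert_from_path          # needs poppler
from pdfminer.high_level import extract_text     # pdfminer.six

def normalize(text: str) -> str:
    """Re-join hyphen-split words and trim stray whitespace."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # end-of-line hyphenation
    return re.sub(r"[ \t]+\n", "\n", text)

def pdf_to_text(path: str, min_chars: int = 200) -> str:
    """Prefer the embedded text layer; OCR only likely-scanned PDFs."""
    text = extract_text(path) or ""
    if len(text.strip()) >= min_chars:
        return normalize(text)
    # Too little text: probably a scanned image, so rasterize and OCR.
    pages = convert_from_path(path, dpi=300)
    return normalize("\n".join(pytesseract.image_to_string(p) for p in pages))
```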
3) Smart segmentation beats “chunking”
A common failure mode for extraction API calls is getting zero useful results because naive chunking feeds the model text stripped of its structure. We fix that (a code sketch follows this list) by:
- Identifying the PA window (e.g., from PRIOR AUTHORIZATION CLINICAL CRITERIA FOR APPROVAL to the start of quantity‑limit sections).
- Harvesting drug hints from “Target Agent(s)” tables and headings—even if those live outside the PA window.
- Maintaining alias awareness (preferred agents/products, step 2 products, non‑formulary, non‑preferred, etc.).
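Here is the idea in miniature: slice on heading patterns, not token counts. The regexes below are simplified stand‑ins for the real heading inventory:

```python
import re

# Simplified stand-ins for the real heading inventory.
PA_START = re.compile(r"PRIOR AUTHORIZATION.*CRITERIA FOR APPROVAL", re.I)
PA_END = re.compile(r"QUANTITY\s+LIMIT", re.I)   # start of quantity-limit sections

def pa_window(policy_text: str) -> str | None:
    """Return the span from the PA-criteria heading to the QL section."""
    start = PA_START.search(policy_text)
    if start is None:
        return None                 # no recognizable PA section; flag upstream
    end = PA_END.search(policy_text, start.end())
    return policy_text[start.start(): end.start() if end else len(policy_text)]
```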
4) Two‑pass extraction (strict → interpretive)
- Pass 1 (strict): extract criteria explicitly present in the PA window. Guardrails allow simpler models here.
- Pass 2 (interpretive): when structure is fragmented, infer linkage (e.g., apply global criteria to the agents listed in drug hints) while keeping direct quotes as evidence. Use stronger models—this is closer to human reasoning.
Both passes return structured JSON with initial_criteria, reauth_criteria, approval_duration, preferred_products, and evidence snippets tied to the source.
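A sketch of that output shape as Python types; the field names come from above, while the concrete types and the Evidence structure are our assumptions:

```python
from typing import TypedDict

class Evidence(TypedDict):
    quote: str         # verbatim policy text backing the criterion
    source_url: str    # snapshot URL
    sha256: str        # hash of the snapshot the quote came from

class ExtractedPolicy(TypedDict):
    initial_criteria: list[str]
    reauth_criteria: list[str]
    approval_duration: str         # e.g., "12 months"
    preferred_products: list[str]
    evidence: list[Evidence]
```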
5) Normalization to clinical codes (in research and testing now)
Map entities to standards required by downstream workflows:
- Drugs → RxNorm (and HCPCS/CPT for medical‑benefit drugs—coming).
- Diagnoses → ICD‑10 (if available).
- Labs/Measures → LOINC where feasible.
- Specialties → NUCC taxonomy.
Normalization is deterministic; if a code or unit can’t be resolved, we flag it for review instead of guessing. Human‑in‑the‑loop remains best practice.
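A minimal sketch of that flag‑instead‑of‑guess rule, with a hypothetical alias table standing in for a real RxNorm resolver (the codes shown are placeholders, not real RxCUIs):

```python
# Hypothetical alias table; a real resolver would query RxNorm itself.
RXNORM_ALIASES: dict[str, str] = {
    "adalimumab": "0000001",   # placeholder RxCUI
    "dupilumab": "0000002",    # placeholder RxCUI
}

def normalize_drug(name: str) -> dict:
    """Resolve deterministically; route unresolved names to human review."""
    rxcui = RXNORM_ALIASES.get(name.strip().lower())
    if rxcui is None:
        return {"text": name, "status": "needs_review"}   # never guess
    return {"text": name, "rxcui": rxcui, "status": "resolved"}
```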
6) QA harness with acceptance metrics
We maintain a gold set across payers and measure:
- Acceptance rate (“no edits needed”).
- Minor edit rate (spelling/format tweaks).
- Drift detection (policy changed vs. extracted criteria changed).
Every run emits a manifest of differences so you can review only the deltas.
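A sketch of the run‑level scoring, assuming a gold set keyed by policy ID (the data structures are invented for illustration):

```python
def score_run(gold: dict[str, list[str]], extracted: dict[str, list[str]]) -> dict:
    """Compare extracted criteria to the gold set; surface only the deltas."""
    exact = minor = 0
    deltas = []
    for policy_id, want in gold.items():
        got = extracted.get(policy_id, [])
        if got == want:
            exact += 1                                   # "no edits needed"
        elif [c.strip().lower() for c in got] == [c.strip().lower() for c in want]:
            minor += 1                                   # formatting-only drift
        else:
            deltas.append({"policy": policy_id, "want": want, "got": got})
    n = len(gold)
    return {
        "acceptance_rate": exact / n,
        "minor_edit_rate": minor / n,
        "deltas": deltas,                                # review only these
    }
```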
7) Compile to FHIR (optional but powerful)
When payers or partners need computable rules, we convert our JSON to:
- PlanDefinition (actions + conditions)
- Questionnaire (documentation prompts for DTR)
- Library (CQL) to encode logic
- Bundles suited for CRD/DTR/PAS workflows
You can deploy these artifacts behind your own FHIR server or consume them directly in a prior‑auth service.
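For flavor, a heavily trimmed sketch of the PlanDefinition mapping; real artifacts carry full action trees, CQL libraries, and provenance extensions, and the input dict here is hypothetical:

```python
def to_plan_definition(policy: dict) -> dict:
    """Map one extracted policy to a minimal FHIR PlanDefinition resource."""
    return {
        "resourceType": "PlanDefinition",
        "status": "draft",                     # payer sign-off flips to active
        "title": f"{policy['drug']} prior authorization",
        "action": [
            {
                "title": criterion,
                "condition": [{
                    "kind": "applicability",
                    # In production this references a compiled CQL Library.
                    "expression": {"language": "text/cql", "expression": "TODO"},
                }],
            }
            for criterion in policy["initial_criteria"]
        ],
    }
```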
What this unlocks for pharmacists, hubs, and payers
For pharmacists & hubs
- Minutes, not hours: a clear “PA checklist” before submission.
- Fewer back‑and‑forths: criteria are specific, with provenance for auditing.
- Confidence: every rule can be traced to the exact policy version and hash.
For Medicare Advantage teams
- Content readiness: many MA orgs are moving toward FHIR‑based prior‑auth APIs and documentation templates. Computable rules shorten the path from policy PDF to CRD/DTR/PAS scenarios.
- Change control: when a policy moves a step‑therapy requirement or alters a lab threshold, you’ll see a diff and a timestamp—maximal audit readiness without headaches.
For vendors
- Data to wire into CDS Hooks (CRD cards) and Questionnaires (DTR) without building your own policy‑ops team.
- Evaluation harness to prove accuracy (and catch regressions) in pre‑production.
What “good” looks like (quality criteria)
- Coverage: % of targeted policies with successful PA extraction on the first pass.
- Accuracy: SME acceptance rate on a per‑criterion basis.
- Downstream code completeness: % of criteria fully normalized to RxNorm/HCPCS/ICD‑10/LOINC.
- Time to freshness: hours between policy change and data refresh.
- Provenance fidelity: every triple (drug × indication × criteria) includes a URL, a hash, and an effective date.
We can report these as run‑level metrics so buyers and users can trust the output.
FAQ
Do you replace ePA or submission rails?
No. We focus on content intelligence—criteria clarity and computable rules. You can pair our output with your existing ePA or X12 flows.
How do you keep policies current?
We track first/last seen, hashes, and URLs. At scale, the intent is to run a daily diff pipeline that flags changes and re‑extracts only what’s new.
Can you handle medical‑benefit drugs (HCPCS) in addition to pharmacy benefit?
That's in active research. We're working out how to normalize to HCPCS/CPT and align criteria with medical‑benefit workflows, not just Rx.
What’s your stance on provenance and liability?
Every extracted criterion is traceable to a source URL + hash + timestamp. The tool is advisory; we encourage payer sign‑off for computable rules when used in production workflows.
Closing
The industry doesn’t need another portal search bar. It needs trustworthy, computable coverage rules with provenance that slot cleanly into pharmacist workflows and payer tech stacks. That’s what we’re doing with Policy Vault. If you want to see your top five high‑friction drugs rendered as clean criteria—with codes, evidence quotes, and a provable chain back to the source—let’s start there.
→ Try the Coverage Explorer at coverage.pharmacistwrite.com and reach out by email if you want a specialty‑focused pilot.
