How Fonteum reconciles provider records across sources.
CCN is the 100% anchor identifier across all 15 registered source families. NPI is partial (~10% via OIG LEIE intersection); full NPPES coverage lands in Sprint 2 (Q3 2026). TIN, DEA, and state Medicaid IDs are dated commitments on the roadmap below.
We publish identifier coverage honestly — including where it's missing — because acquirer data teams will diligence it anyway. Same brand discipline as /trust/data-provenance and the data-availability flags on /sanctions.
Current-state coverage matrix
Per-identifier coverage across the Fonteum corpus, with explicit Sprint roadmap commitments where coverage is partial or pending.
| Identifier | Definition | Coverage | Source families | Notes |
|---|---|---|---|---|
| CCN | CMS Certification Number — 6-character TEXT, leading zeros preserved | 100% | POS + Care Compare 8 + NH-depth 4 + LEIE-via-name-match | Anchor identifier across our corpus. Every CCN-keyed query joins all 15 source families. |
| NPI | National Provider Identifier — 10-digit TEXT | ~10% (LEIE intersection only) | OIG LEIE | Full NPI mapping requires NPPES ingestion (Sprint 2 priority Q3 2026). LEIE provides NPI for 8,551 of 83,001 exclusion records (~10%); the rest carry name + state. |
| TIN | Taxpayer ID — 9-digit TEXT, masked in most public sources | Not currently ingested | (Sprint 2: PECOS Ordering & Referring) | Available via CMS PECOS Ordering & Referring file (Sprint 2 priority Q3 2026). Public PECOS exposes TIN for ordering/referring providers; full TIN coverage requires the §108B PECOS ingestion to land. |
| DEA | Drug Enforcement Administration registration — 9-character TEXT | Not currently ingested | (Sprint 3: DEA Active Registrants) | DEA Active Registrants Q4 2026 target. Subject to DEA data-distribution licensing review; not all DEA distributions are redistributable. |
| Medicaid Provider ID | Per-state Medicaid program ID — format varies by state | Not currently ingested | (Sprint 4: state-specific) | State-by-state ingestion. CA, NY, TX prioritized for Sprint 4 (Q1 2027). Each state has its own Medicaid Management Information System (MMIS) with distinct data formats and access policies. |
| HCP-OneKey ID | IQVIA proprietary individual-physician identifier | Not applicable | — | Proprietary identifier requiring IQVIA license. Fonteum does not ingest licensed reference data; Fonteum is anchored on public-record sources only. |
Sample crosswalk — CCN 015009
One worked example showing what a single facility looks like when joined across the registered source families. Verifiable via the audit-pack export endpoint at /api/v1/audit-pack/export?ccn=015009 (requires API key — see /data-platform/api-access).
| Field | Value |
|---|---|
| CCN | 015009 · Anchor identifier |
| Facility name | BURNS NURSING HOME, INC. · As reported in CMS POS |
| State | AL |
| Facility type | skilled-nursing-facility |
| POS record | 1 (release 2026-05-07) |
| Care Compare NH | 1 (release 2026-05-07) |
| PBJ daily staffing | 4 reporting quarters |
| NH deficiencies | N citations (per snapshot) |
| NH penalties | M records (per snapshot) |
| SNF Owners (Phase-1) | 1 record · ownership_pct missing per Health Affairs 2024 baseline |
| SFF status | flagged 85/441 active/candidate corpus |
| NPI | not currently mapped (LEIE intersection N/A) |
| TIN | not ingested (Sprint 2 PECOS) |
| DEA | not ingested (Sprint 3) |
Per-field provenance ships as an 8-tuple on every audit-pack record: (source, source_url, dataset_id, snapshot_date, methodology_version, last_checked, confidence_score, data_availability).
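As a rough illustration of how a consumer might model that 8-tuple, here is a minimal Python sketch. The `Provenance` class and `provenance_from_dict` helper are hypothetical names, not part of any Fonteum client library, and the field types are assumptions; only the eight field names come from the text above.

```python
from typing import NamedTuple, Optional


class Provenance(NamedTuple):
    """Hypothetical container mirroring the eight provenance fields."""
    source: Optional[str]
    source_url: Optional[str]
    dataset_id: Optional[str]
    snapshot_date: Optional[str]       # assumed to be an ISO date string
    methodology_version: Optional[str]
    last_checked: Optional[str]        # assumed to be an ISO timestamp string
    confidence_score: Optional[float]  # assumed numeric
    data_availability: Optional[str]


def provenance_from_dict(raw: dict) -> Provenance:
    """Build a Provenance; keys absent from the payload become None."""
    return Provenance(**{name: raw.get(name) for name in Provenance._fields})


# Partial payload, shaped like the provenance objects in the sample export.
example = provenance_from_dict({
    "source": "CMS Special Focus Facility list",
    "snapshot_date": "2026-05-07",
    "data_availability": "available",
})
print(example.source, example.data_availability)
```

Extra keys such as `caveat` are ignored by this helper, so a consumer that needs them would have to carry them separately.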
JSON snippet: first two NDJSON lines of /api/v1/audit-pack/export?ccn=015009 (the second line, the record, is wrapped here for readability)
{"schema_type":"ownlisted-audit-pack-export","schema_version":"1.0","ccn":"015009","format":"ndjson","methodology_version":"v2026.05.0","emitted_at":"<iso-timestamp>"}
{"line_type":"audit_pack_record","ccn":"015009","facility_type":"skilled-nursing-facility","facility_name":"BURNS NURSING HOME, INC.","state":"AL","pos_qa_status":"ok",
"fields":{"sff_status":{"value":"<status>","provenance":{"source":"CMS Special Focus Facility list","snapshot_date":"2026-05-07","data_availability":"available"}},
"ownership_pct":{"value":null,"provenance":{"source":"CMS SNF All Owners","snapshot_date":"<iso>","data_availability":"missing-by-design","caveat":"Per Health Affairs March 2024: 82.40% of top-10-chain ownership_pct missing in CMS data"}}}}

Crosswalk methodology in brief
CCN-anchored joins. Every facility-keyed query in the Fonteum corpus joins on the 6-character CCN with leading zeros preserved as TEXT (not coerced to integer — alphanumeric CCNs would lose information on int-cast). The CCN is the 100% anchor across POS + Care Compare 8 + NH-depth 4.
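The TEXT-keyed join described above can be sketched in a few lines of Python. This is an illustration only: `normalize_ccn` and `join_on_ccn` are hypothetical helpers, and the pattern of six uppercase alphanumeric characters is an assumption about what counts as a well-formed CCN.

```python
import re

# Assumption: a CCN is exactly six characters, digits or uppercase letters.
CCN_PATTERN = re.compile(r"^[0-9A-Z]{6}$")


def normalize_ccn(raw) -> str:
    """Return a 6-character CCN string without ever casting to int."""
    text = str(raw).strip().upper()
    if not CCN_PATTERN.match(text):
        raise ValueError(f"not a 6-character CCN: {raw!r}")
    return text


def join_on_ccn(left: dict, right: dict) -> dict:
    """Inner join of two {ccn: record} mappings on the TEXT key."""
    return {ccn: {**left[ccn], **right[ccn]} for ccn in left.keys() & right.keys()}


pos = {normalize_ccn("015009"): {"facility_name": "BURNS NURSING HOME, INC."}}
care_compare = {normalize_ccn("015009"): {"state": "AL"}}
joined = join_on_ccn(pos, care_compare)

# The integer round trip drops the leading zero, so the key no longer matches.
lossy = str(int("015009"))
print(joined, lossy)
```

A value that has already lost its leading zero (for example the integer 15009) is rejected by `normalize_ccn` rather than silently padded, which keeps damaged keys out of the join.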
Name + state fallback. Sources without CCN (OIG LEIE individual-provider records, restricted-source name lists) join via fuzzy name + state pairing. The fuzzy match runs through the §sprint1-nh-depth-ingest-b PE/REIT entity registry pattern: substring match against a curated alias list with per-entity confidence scoring. Match results are flagged in the audit-pack with data_availability: "name-matched" so downstream consumers can opt out of fuzzy joins.
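A minimal sketch of that fallback, assuming a registry keyed by entity with a state, an alias list, and a confidence value. The registry contents, the entity name, the confidence number, and both function names are invented for illustration; the actual registry pattern is not shown in this document.

```python
from typing import Optional

# Hypothetical curated alias registry (all values are made up).
ALIAS_REGISTRY = {
    "entity-001": {
        "state": "AL",
        "aliases": ["example care holdings"],
        "confidence": 0.9,
    },
}


def match_name_state(name: str, state: str, registry: dict = ALIAS_REGISTRY) -> Optional[dict]:
    """Return the highest-confidence registry hit for (name, state), or None."""
    needle = name.casefold()
    best = None
    for entity_id, entry in registry.items():
        if entry["state"] != state.upper():
            continue
        # Substring match: any curated alias contained in the candidate name.
        if any(alias in needle for alias in entry["aliases"]):
            if best is None or entry["confidence"] > best["confidence_score"]:
                best = {
                    "entity_id": entity_id,
                    "confidence_score": entry["confidence"],
                    "data_availability": "name-matched",
                }
    return best


def drop_fuzzy(records: list) -> list:
    """Consumer-side opt-out: keep only records that were not name-matched."""
    return [r for r in records if r.get("data_availability") != "name-matched"]


hit = match_name_state("EXAMPLE CARE HOLDINGS LLC", "al")
print(hit)
```

Filtering on the `data_availability` flag, as `drop_fuzzy` does, is one way a downstream consumer could restrict itself to CCN-keyed joins.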
Edge cases. CCN reuse after facility closure (a CCN can be re-issued after termination) is handled by joining on (CCN, snapshot_date) tuples. Every audit-pack record is anchored to a snapshot date, so two sequential occupants of a CCN appear as distinct records, not a merged history. Facility name variants (legal name vs DBA, casing differences, "Inc." vs "Incorporated") are normalized via the §sprint1-export hydration service before comparison.
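Both edge cases can be illustrated with a short Python sketch. The CCN `000001`, the occupant names, and the snapshot dates below are invented, and `normalize_name` is a hypothetical stand-in for the hydration service, which is not documented here. It handles casing, punctuation, and suffix variants only; mapping a DBA to a legal name would need a lookup table.

```python
import re

# Assumed suffix equivalences; a real normalizer would likely cover more.
SUFFIXES = {"incorporated": "inc", "corporation": "corp", "limited": "ltd"}


def normalize_name(name: str) -> str:
    """Casefold, strip punctuation, and collapse common suffix variants."""
    tokens = re.sub(r"[^\w\s]", " ", name.casefold()).split()
    return " ".join(SUFFIXES.get(tok, tok) for tok in tokens)


# Two invented occupants of the same (hypothetical) CCN at different snapshots.
records = [
    {"ccn": "000001", "snapshot_date": "2020-01-01", "facility_name": "Old Occupant Incorporated"},
    {"ccn": "000001", "snapshot_date": "2026-05-07", "facility_name": "New Occupant, Inc."},
]

# Keying on (ccn, snapshot_date) keeps the occupants as distinct records.
by_key = {(r["ccn"], r["snapshot_date"]): r for r in records}

same = normalize_name("BURNS NURSING HOME, INC.") == normalize_name("Burns Nursing Home Incorporated")
print(len(by_key), same)
```

Keying on the CCN alone would let the later record overwrite the earlier one, which is the merged history the tuple key avoids.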
Full per-field provenance contract: /methodology. Per-source license + redistribution posture: /trust/data-provenance.
Roadmap
Dated commitments. Each entry is subject to data-source availability and licensing review; we update this page on Sprint completion (or earlier if a milestone slips).
How acquirers and integration partners consume this
- REST API today. GET /api/v1/audit-pack/export?ccn=015009 returns the full record with all available identifiers + per-field provenance. Authenticate via the API access flow at /data-platform/api-access.
- Delta Sharing (Sprint 2 Q3 2026). Parquet-native crosswalk table with same provenance contract.
- Snowflake Secure Data Share (Sprint 2 Q3 2026). Same data, exposed natively via Snowflake-native distribution.
- SFTP delivery. Available on request — see /integrations.
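For the REST path, a client might look roughly like the sketch below. The host name, the `Authorization: Bearer` header, and both function names are assumptions made for illustration; the actual authentication scheme is defined by the API access flow at /data-platform/api-access. Only the endpoint path and the NDJSON format come from this page.

```python
import json
import urllib.request
from typing import Iterator

BASE_URL = "https://example.invalid"  # hypothetical host, replace with the real one


def parse_ndjson(text: str) -> Iterator[dict]:
    """Yield one JSON object per non-empty line."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def fetch_audit_pack(ccn: str, api_key: str, base_url: str = BASE_URL) -> list:
    """Fetch and parse the export for one CCN (auth header is an assumption)."""
    url = f"{base_url}/api/v1/audit-pack/export?ccn={ccn}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(request) as response:
        return list(parse_ndjson(response.read().decode("utf-8")))


# Offline check against a trimmed payload shaped like the sample export above.
sample = (
    '{"schema_type":"ownlisted-audit-pack-export","ccn":"015009","format":"ndjson"}\n'
    '{"line_type":"audit_pack_record","ccn":"015009","fields":{"ownership_pct":'
    '{"value":null,"provenance":{"data_availability":"missing-by-design"}}}}\n'
)
header, record = parse_ndjson(sample)
print(header["format"], record["fields"]["ownership_pct"]["provenance"]["data_availability"])
```

Note that the CCN is passed as a string throughout, so the leading zero in `015009` survives into the query string.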
Related
- /sources: Source registry index
- /sources/cadence: Per-source refresh cadence
- /trust/data-provenance: Per-source license + redistribution posture
- /audit-pack: Compliance-grade export deliverable
- /methodology: Per-field provenance contract
- /integrations: Delta Sharing, Snowflake, S3 roadmap
Compliance posture
We don’t sell rankings and don’t accept payment to move a provider up the list. For final hire decisions, verify licensing, insurance, and references directly with the applicable licensing or credentialing body.
No bulk-licensing source family is currently ingested for this vertical. Hire-time checking still routes through the body named above.