1,322,867 nurse-staffing records · CMS PBJ
fonteum

FOR · ACADEMIC RESEARCHERS

Cite-able federal data, prepared.

Analysis-ready CMS and HHS-OIG datasets cross-joined on CCN and NPI. Methodology documented per version. Public-use, no IRB review required.

Request data access → · Read the methodology →

Federal upstream data: U.S. Government Works. Fonteum compilation: CC BY 4.0.

Federal datasets

CMS, HHS-OIG, and HHS sub-agencies. All public-domain upstream.

Total rows

Across all 13 datasets. Cross-joinable on CCN and NPI.

IRB requirements

Federal administrative data is public-use. No patient records. No de-identification required.

What's cite-able

13 federal datasets with upstream source URLs.

Every dataset row carries the federal source URL at the time of ingest. The table below lists the upstream origin for each dataset — the URL a peer reviewer can independently retrieve. Fonteum's compilation layer adds cross-source joins, field typing, and methodology versioning; it does not alter or clean the source values.

| Dataset | Federal source | Rows | Grain | Join key | License |
| --- | --- | --- | --- | --- | --- |
| CMS Provider of Services (POS) iQIES | data.cms.gov | 68,211 | Per certified facility | CCN | U.S. Government Works |
| CMS Care Compare — Home Health | data.cms.gov | 12,392 | Per CCN-keyed agency | CCN | U.S. Government Works |
| CMS Care Compare — Hospice | data.cms.gov | 6,943 | Per CCN-keyed facility | CCN | U.S. Government Works |
| CMS Care Compare — Nursing Home Penalties | data.cms.gov | 16,832 | Per enforcement action | CCN + Survey Event ID | U.S. Government Works |
| CMS NH Health Deficiencies | data.cms.gov | 418,148 | Per citation | CCN + Survey Event ID | U.S. Government Works |
| OIG LEIE Exclusions | oig.hhs.gov | 68,055 | Per excluded individual / entity | NPI (joined) | U.S. Government Works |
| CMS PECOS PPEF (Medical Enrichment) | data.cms.gov | Varies | Per enrolled provider | NPI | U.S. Government Works |
| CMS QPP MIPS Individual Scores | qpp.cms.gov | 477,137 | Per clinician, per performance year | NPI | U.S. Government Works |
| HCRIS Hospital Cost Reports | www.cms.gov | 6,102 | Per facility, per cost report period | CCN | U.S. Government Works |
| CMS Open Payments | openpaymentsdata.cms.gov | Varies | Per payment record | NPI | U.S. Government Works |
| Federally Qualified Health Centers (HHS annual utilization data) | data.hrsa.gov | ~9,000 | Per FQHC site | FQHC ID (NPI joinable) | U.S. Government Works |
| NSA IDR + MRF Compliance | www.cms.gov | Derived | Per entity | NPI / Tax ID | U.S. Government Works (upstream); CC BY 4.0 (Fonteum scoring layer) |
| CMS NSA Surprise Billing IDR Filings | www.cms.gov | Derived | Per filing, per initiating party | NPI / Tax ID | U.S. Government Works |
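
The shared join keys in the table above can be exercised with a small, stdlib-only sketch. The column names (`ccn`, `facility_name`, `survey_event_id`, `tag`) and the sample rows are hypothetical and are not Fonteum's actual schema. The one point it illustrates is that CCNs should be read as strings: they are six-character identifiers that can begin with a zero, and a numeric read would silently drop it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extracts; real exports will have different column names.
FACILITIES_CSV = """ccn,facility_name
015009,Example Nursing Home A
675123,Example Nursing Home B
"""

DEFICIENCIES_CSV = """ccn,survey_event_id,tag
015009,EVT1,F0689
015009,EVT2,F0880
999999,EVT3,F0600
"""


def read_rows(text):
    """Parse CSV text into dicts; the csv module keeps every value as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join_on_ccn(facilities, citations):
    """Attach the list of matching citation rows to each facility row."""
    by_ccn = defaultdict(list)
    for row in citations:
        by_ccn[row["ccn"]].append(row)
    return [{**fac, "citations": by_ccn.get(fac["ccn"], [])} for fac in facilities]


joined = left_join_on_ccn(read_rows(FACILITIES_CSV), read_rows(DEFICIENCIES_CSV))
for row in joined:
    print(row["ccn"], row["facility_name"], len(row["citations"]))
```

A left join keeps facilities with no citations, which matters when the absence of a citation is itself part of the analysis.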

Methods-section boilerplate

Drop-in paragraph for journal submission.

Replace the bracketed tokens with the specific dataset, federal source name, methodology version string (e.g. snf-owners/v1), and snapshot date. The methodology version is pinned at export time and retrievable from /methodology indefinitely.

Data for this analysis were obtained from Fonteum
(fonteum.com), a federally sourced healthcare data
infrastructure layer. The dataset used, [DATASET], is
derived from [FEDERAL SOURCE], methodology version
[VERSION], snapshot date [DATE]. The upstream federal
data are public-domain (U.S. Government Works); the
Fonteum compilation is available under CC BY 4.0.
Cross-source joins are performed on CMS Certification
Number (CCN) or National Provider Identifier (NPI)
as documented in the methodology version cited above.
No patient-level data are included. No IRB review
was required for this analysis.

The methodology version in your data export matches the version page at fonteum.com/methodology/[dataset]. That page is durable — the same URL is retrievable after publication so peer reviewers and journal editors can independently verify the methods.
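
Token replacement in the boilerplate can be scripted so that no bracketed placeholder survives into a submitted manuscript. The following is a minimal sketch, not a Fonteum tool: the `fill_tokens` helper is hypothetical, the template is a shortened excerpt of the paragraph above, and the values passed in are placeholders rather than real version strings or snapshot dates.

```python
import re

# Shortened excerpt of the methods paragraph, using the same bracketed tokens.
TEMPLATE = (
    "[DATASET], derived from [FEDERAL SOURCE], "
    "methodology version [VERSION], snapshot date [DATE]."
)

TOKEN = re.compile(r"\[([A-Z ]+)\]")


def fill_tokens(template, values):
    """Replace each [TOKEN] with its value; fail loudly if one is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]
    return TOKEN.sub(substitute, template)


paragraph = fill_tokens(TEMPLATE, {
    "DATASET": "Example dataset",      # placeholder
    "FEDERAL SOURCE": "data.cms.gov",
    "VERSION": "example/v1",           # placeholder, same shape as snf-owners/v1
    "DATE": "2026-01-01",              # placeholder
})
print(paragraph)
```

Raising on a missing token, rather than leaving it in place, means an incomplete methods paragraph fails in the script instead of in peer review.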


Reproducibility

Stata, R, and Python codebooks on request.

Statistical codebooks are available upon pilot access request. Each codebook includes variable descriptions, dtype contracts, and worked join examples that replicate the cross-source joins documented in the methodology.

Stata

Value labels, variable descriptions, and import scripts for all 13 datasets.

R

Tidyverse-compatible tibble import, column typing, and join vignettes.

Python

pandas / polars import scripts with dtype contracts and CCN ↔ NPI join examples.

Why this works as a reproducibility reference

The methodology page is the audit artifact.

Every Fonteum dataset ships a public methodology page at /methodology/[dataset]. The page renders: source family and Tier classification, ingest cadence, field schema with per-field confidence levels, join logic, known limitations, and version history with change rationale.

A peer reviewer who questions your methods gets a URL, not a vendor statement. The methodology version in the URL matches the version in your data export — and it does not change after you publish.

Browse all methodology pages →

Pre-publication data dictionary

Every field documented before you commit to a methodology.

Pilot access includes the full pre-publication data dictionary: field names, types, null rates, known edge cases, and the CMS/HHS source column they map to. The dictionary is delivered as a machine-readable JSON alongside the CSV export so your analysis scripts can validate dtypes at import time without manual inspection.

Field-level null rates

Per field, per dataset snapshot. Null rate changes flagged across versions.

Source column mapping

Every Fonteum field traces to the originating federal column name and file.

Known edge cases

CMS suppression sentinels ("*", "DS"), partial-year cost reports, facility closures mid-year.

Version diff

Field additions, removals, and type changes are documented between methodology versions.
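
The suppression sentinels and null rates described above interact: a suppressed value is not a number and should count as missing. A minimal sketch of one way to handle this follows; the helper names are hypothetical, and it assumes that `"*"` and `"DS"` are the only sentinels in the column being parsed.

```python
SUPPRESSION_SENTINELS = {"*", "DS"}  # the two sentinels named above


def parse_measure(raw):
    """Return a float, or None for blanks and suppression sentinels."""
    text = raw.strip()
    if text == "" or text in SUPPRESSION_SENTINELS:
        return None
    return float(text)


def null_rate(values):
    """Share of None entries in a column; 0.0 for an empty column."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


column = [parse_measure(v) for v in ["12.5", "*", "DS", "", "7"]]
print(column)
print(null_rate(column))
```

Mapping sentinels to `None` before any arithmetic avoids the common failure where a suppressed cell is coerced to zero and biases a mean downward.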

How to cite Fonteum

Four citation formats in every data export.

Every data export from the Fonteum API includes a four-format citation block in the response envelope: APA, Chicago, plain text, and BibTeX. The citation pins the methodology version and snapshot date so the reference is reproducible regardless of when a reader retrieves it.

// APA
Fonteum, Inc. (2026). [Dataset name], methodology
version [VERSION], snapshot [DATE].
Fonteum. https://fonteum.com/methodology/[dataset]

// BibTeX
@dataset{fonteum_[dataset]_[year],
  author    = {{Fonteum, Inc.}},
  title     = {[Dataset name]},
  year      = {[year]},
  version   = {[VERSION]},
  publisher = {Fonteum},
  url       = {https://fonteum.com/methodology/[dataset]},
  note      = {Snapshot date: [DATE]. CC BY 4.0.}
}

The citation block in your API response is generated from the pinned methodology version — it does not change when a new methodology version is released. Your published citation remains valid.
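
For exports handled outside the API, the APA and BibTeX templates shown above can be rendered locally. This sketch only reproduces those two templates; it does not model the API response envelope, whose structure is not documented here. The function names and all argument values are hypothetical placeholders.

```python
def apa(name, version, date, slug, year):
    """Render the APA template shown above."""
    return (
        f"Fonteum, Inc. ({year}). {name}, methodology version {version}, "
        f"snapshot {date}. Fonteum. https://fonteum.com/methodology/{slug}"
    )


def bibtex(name, version, date, slug, year):
    """Render the BibTeX template shown above, one field per line."""
    fields = [
        ("author", "{Fonteum, Inc.}"),  # inner braces keep the name as one unit
        ("title", name),
        ("year", str(year)),
        ("version", version),
        ("publisher", "Fonteum"),
        ("url", f"https://fonteum.com/methodology/{slug}"),
        ("note", f"Snapshot date: {date}. CC BY 4.0."),
    ]
    body = ",\n".join(f"  {key:<9} = {{{value}}}" for key, value in fields)
    return f"@dataset{{fonteum_{slug}_{year},\n{body}\n}}"


# Placeholder values, not a real dataset, version, or snapshot.
args = ("Example dataset", "example/v1", "2026-01-01", "example", 2026)
print(apa(*args))
print(bibtex(*args))
```

Both renderers take the version and snapshot date as explicit arguments, so the citation stays pinned to the export it describes.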

Data access

Request data access for your study.

Describe the study scope (datasets, analysis period, research question), and we send a scoped data access agreement within 2 business days. Academic research requests receive a reduced pilot rate. No procurement loop required for single-PI studies.

Request data access → · Read the methodology →

Compliance posture

Methodology · Corrections log · Editorial policy


Sourced from CMS and HHS-OIG. Fonteum, Inc., Delaware C-corp. © 2026.