Inpatient utilization · Reference

Per-CCN per-MS-DRG inpatient utilization, free + open. The dataset Definitive sells as Hospital Performance.

Fonteum ingests the CMS Medicare Inpatient Hospitals by Provider and Service file on the daily HEAD-probe pattern, with a full ingestion firing on each annual mid-June release. ~60M rows per data year. The 14-tuple provenance contract ships inline with every API response, so consumers can verify what they’re looking at without a second round-trip.

1. What this dataset is

Per-CCN per-MS-DRG Medicare inpatient discharge + payment + length-of-stay aggregates.

CMS publishes the “Medicare Inpatient Hospitals by Provider and Service” file annually, in mid-June. There is one row per (CCN, MS-DRG code, data year). Each row carries the count of Medicare fee-for-service discharges, the average covered charges, average total payments, average Medicare payments, and the average length of stay. Coverage extends back to data year 2018 in the current schema (the legacy 2013-2017 schema differs and is not ingested in Phase 2).
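
As a rough sketch of the row shape described above, the record below uses the field names from the API example later on this page. The class name is hypothetical, and marking the average fields as optional is an assumption; this page only states that the count fields allow NULL.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InpatientUtilizationRow:
    """One annual rollup for a (CCN, MS-DRG code, data year) combination."""

    ccn: str
    ms_drg_code: str
    data_year: int
    # None represents a count that CMS suppressed before publication.
    total_discharges: Optional[int]
    # Assumption: averages may also be absent when the count is suppressed.
    avg_covered_charges: Optional[float] = None
    avg_total_payments: Optional[float] = None
    avg_medicare_payments: Optional[float] = None
    avg_length_of_stay: Optional[float] = None

    @property
    def key(self) -> Tuple[str, str, int]:
        """The grain of the dataset: one row per (CCN, MS-DRG, data year)."""
        return (self.ccn, self.ms_drg_code, self.data_year)


example = InpatientUtilizationRow(
    ccn="010001",
    ms_drg_code="470",
    data_year=2022,
    total_discharges=342,
    avg_length_of_stay=2.5,
)
```

Keeping the MS-DRG code as a string preserves any leading zeros.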

For context: this is the dataset Definitive Healthcare sells as its flagship “Hospital Performance” module at $45,000-$95,000 per buyer per year (depending on facility count). Fonteum publishes it free and open, with full 14-tuple provenance + Dataset JSON-LD discoverability + a free .edu/.gov researcher tier.

2. What this dataset is NOT

No PHI. No claims. No patient records.

CMS pre-aggregates this file to facility-level rows before public release. The dataset contains:

  • NO patient identifiers — no names, no addresses, no dates of birth.
  • NO claim-level rows — only annual rollups per (CCN, MS-DRG).
  • NO discharge dates — just the data year.
  • NO cells with discharge counts under 11 — CMS pre-suppresses these per its privacy policy. Our schema preserves the suppression by allowing NULL in the count fields.
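
For consumers, the practical consequence of NULL-preserving suppression is that a missing count means "fewer than 11", not zero. The sketch below shows one way to total discharges under that reading. The helper name is hypothetical, and the rows are assumed to be dictionaries using the total_discharges field name from the API example on this page.

```python
from typing import Iterable, Optional


def summarize_discharges(rows: Iterable[dict]) -> dict:
    """Total the known discharge counts and track suppressed cells separately."""
    known_total = 0
    suppressed_cells = 0
    for row in rows:
        count: Optional[int] = row.get("total_discharges")
        if count is None:
            # Suppressed by CMS: the true value is unknown but under 11.
            suppressed_cells += 1
        else:
            known_total += count
    return {
        # Lower bound on the true total, since suppressed cells are excluded.
        "known_total": known_total,
        "suppressed_cells": suppressed_cells,
        # Each suppressed cell can hide at most 10 discharges.
        "upper_bound": known_total + 10 * suppressed_cells,
    }
```

Reporting a range rather than a single number avoids silently treating suppressed cells as zeros.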

We additionally drop the provider-name and provider-address columns CMS ships with the file: those facts already live in the CMS Provider of Services (POS) file (canonical name + address per CCN), and dual-storage would create drift. Joins back to POS happen at query time via the federated identity bridge.

3. Refresh schedule

Daily HEAD probe at 06:00 UTC. Full ingest on annual mid-June release.

The Inngest cron runs daily on the schedule 0 6 * * *. The HEAD probe is cheap and short-circuits via the UNIQUE(source_id, snapshot_date) constraint when nothing has changed. A full ingest fires only when CMS publishes a new data year, typically once a year in mid-June.
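
The short-circuit can be illustrated with a small sketch. It uses an in-memory SQLite database as a stand-in, since this page does not name the production database. The table name snapshots and the function name are hypothetical, and how the probe derives snapshot_date from the HEAD response is not described here, so it is passed in as a plain argument.

```python
import sqlite3


def record_snapshot(conn: sqlite3.Connection, source_id: str, snapshot_date: str) -> bool:
    """Return True if the snapshot is new (full ingest should run), else False."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO snapshots (source_id, snapshot_date) VALUES (?, ?)",
        (source_id, snapshot_date),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when the UNIQUE
    # constraint caused the insert to be ignored.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshots ("
    " source_id TEXT NOT NULL,"
    " snapshot_date TEXT NOT NULL,"
    " UNIQUE (source_id, snapshot_date))"
)
```

With this shape, the daily probe can call record_snapshot on every run and skip the expensive ingest whenever it returns False.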

4. How it joins to other sources

CCN is the bridge to POS, Care Compare, ownership chains.

Every utilization row carries a CMS Certification Number (CCN). The federated identity layer (/identity) joins the CMS Provider of Services (POS) file (canonical facility name + address + type), Care Compare quality ratings, NH ownership chains, and now utilization in a single query.

Format guard: every CCN is validated against the ^[A-Z0-9]{6}$ pattern at ingest time. Rows failing the check are dropped before they reach inpatient_utilization_summary.
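
A minimal sketch of the format guard, using the pattern quoted above. The function names are hypothetical, and rows are assumed to be dictionaries with a ccn key.

```python
import re
from typing import Iterable, List

# Pattern quoted in the format guard: exactly six uppercase letters or digits.
CCN_PATTERN = re.compile(r"^[A-Z0-9]{6}$")


def is_valid_ccn(value: object) -> bool:
    """Check a candidate CCN; non-string values are rejected outright."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return isinstance(value, str) and CCN_PATTERN.fullmatch(value) is not None


def drop_invalid_ccns(rows: Iterable[dict]) -> List[dict]:
    """Keep only the rows whose CCN passes the format check."""
    return [row for row in rows if is_valid_ccn(row.get("ccn"))]
```

The check is purely syntactic: it confirms the shape of the identifier, not that the CCN exists in the POS file.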

5. The API

GET /api/v1/utilization/inpatient/[ccn]

Returns the top-10 MS-DRGs by discharge count for the given CCN, with the full 14-tuple provenance contract attached inline. Auth flows through the standard withApi handler — bearer token, rate limit, tier resolution. The free .edu/.gov researcher tier gets the same envelope as the paid tiers.

{
  "data": {
    "ccn": "010001",
    "data_year": 2022,
    "top_ms_drgs": [
      {
        "ms_drg_code": "470",
        "ms_drg_description": "MAJOR JOINT REPLACEMENT OR REATTACHMENT...",
        "total_discharges": 342,
        "avg_covered_charges": 65000,
        "avg_total_payments": 13800,
        "avg_medicare_payments": 12100,
        "avg_length_of_stay": 2.5,
        "data_year": 2022
      }
    ],
    "provenance": {
      "_source": "CMS Medicare Inpatient Hospitals by Provider and Service",
      "_dataset_id": "cms-inpatient-utilization",
      "_snapshot": "2022-12-31",
      "_methodology": "v2026.05.0",
      "_license": "US-Government-Works",
      "_coverage_period_start": "2018-01-01",
      "_coverage_period_end": "ongoing"
    }
  },
  "meta": { "request_id": "req_...", "api_version": "v1", "...": "..." }
}
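
A hedged client sketch for the endpoint above, using only the Python standard library. The base URL is an assumption (the docs host from the citation section; the API may be served elsewhere), and the function names are hypothetical. Only the path, the bearer-token auth, and the envelope keys come from this page.

```python
import json
import urllib.request
from typing import List, Tuple

# Assumption: the API is served from the same host as the docs.
BASE_URL = "https://fonteum.com"


def build_request(ccn: str, token: str) -> urllib.request.Request:
    """Build the authenticated GET request for one facility."""
    url = f"{BASE_URL}/api/v1/utilization/inpatient/{ccn}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def parse_envelope(envelope: dict) -> Tuple[List[dict], dict]:
    """Split a response envelope into the MS-DRG rows and the provenance block."""
    data = envelope["data"]
    return data["top_ms_drgs"], data["provenance"]


def fetch_top_ms_drgs(ccn: str, token: str) -> Tuple[List[dict], dict]:
    """Call the endpoint and return (top MS-DRG rows, provenance)."""
    with urllib.request.urlopen(build_request(ccn, token), timeout=30) as resp:
        return parse_envelope(json.load(resp))
```

Returning the provenance block alongside the rows keeps the source, snapshot, and license attached to whatever the caller stores.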

6. License + redistribution

US-Government-Works. Anyone can redistribute.

CMS publishes this file as a federal-government work, public domain in the U.S. under 17 U.S.C. §105 and the OPEN Government Data Act. The SPDX-style identifier US-Government-Works is what Fonteum surfaces in the provenance contract’s _license field for every row derived from this dataset.

7. How to cite

APA-ish, with the upstream CMS source named.

Fonteum. (2026). CMS Medicare Inpatient
Hospitals by Provider and Service [data set]. https://fonteum.com/docs/utilization-inpatient.
Retrieved [date]. Original source: Centers for Medicare & Medicaid
Services. License: US-Government-Works.

Detailed researcher citation guidance lives at /cite; the researcher-api docs describe the citation TOS for the free tier.

8. Verify the snapshot

SHA-256 attestation + S3 cache mirror.

Every snapshot lands with a SHA-256 attestation written by writeAttestation (PR #135). When the source-cache mirror (PR #154) is provisioned, every snapshot also mirrors to S3 — verifiers can re-download the original CSV from the cache and recompute the hash to confirm byte-exact provenance. Use /verify to walk the chain for any snapshot.
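
The recompute step can be sketched as follows, assuming the verifier has already downloaded the original CSV and obtained the attested hash as a hex string. The function names are hypothetical; the attestation format written by writeAttestation is not described on this page.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large CSVs never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_attestation(path: str, attested_hex: str) -> bool:
    """Compare the recomputed hash with the attested one, ignoring case and whitespace."""
    expected = attested_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A match confirms the downloaded bytes are identical to the bytes that were hashed at ingest time; any mismatch means the file differs in at least one byte.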

Sourced from federal agencies. Fonteum, Inc., Delaware C-corp. © 2026.
