Hugging Face datasets.
Fonteum publishes attested federal healthcare snapshots to huggingface.co/datasets. Each dataset card links back to the source attestation chain so verifiers can recompute the SHA-256 hash and confirm the snapshot was not modified after publication.
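The hash check described above can be sketched as follows. This is a minimal illustration, assuming the attestation chain publishes a hex-encoded SHA-256 digest computed over the raw bytes of a snapshot file. The helper names `sha256_of_file` and `matches_attestation` are hypothetical and not part of any Fonteum or Hugging Face API.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large snapshots never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_attestation(path: Path, expected_hex: str) -> bool:
    """Compare the recomputed digest with a published hex digest.

    Assumption: the published value is a hex string; surrounding whitespace
    and letter case are ignored before comparing.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A verifier would download the snapshot file, copy the digest from the dataset card's attestation link, and call `matches_attestation(Path("snapshot.parquet"), published_digest)`, where the file name is a placeholder.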
Quickstart — Python
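from datasets import load_dataset

# Load the split named "latest" from the NPPES dataset.
ds = load_dataset("fonteum/cms-nppes", split="latest")
# Convert to a pandas DataFrame and print the column names.
df = ds.to_pandas()
print(df.columns.tolist())

The load_dataset call uses your Hugging Face token if one is configured. The datasets are public, so no token is required for read access.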
JavaScript / TypeScript
import { listDatasets } from "@huggingface/hub";
for await (const dataset of listDatasets({ search: { owner: "fonteum" } })) {
  console.log(dataset.id);
}

Use @huggingface/hub for metadata queries, streaming reads, and dataset card inspection from Node.js or edge runtimes.
Row provenance
Every row in every Fonteum dataset carries a 14-tuple provenance header. The named fields are source family, snapshot date, ingestion run ID, content hash, attestation ID, and methodology version; the remaining fields carry additional traceability data. The FHIR R4 resource layer, documented at /docs/fhir, uses the same 14-tuple provenance standard.
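A row-level check for the named provenance fields might look like the sketch below. The column names are hypothetical, chosen only to mirror the six fields named above. The actual dataset schema may use different names, and the rest of the 14-tuple is not covered here.

```python
# Hypothetical column names for the six provenance fields the text names.
# The real schema may differ; the other fields of the 14-tuple are omitted.
NAMED_PROVENANCE_FIELDS = (
    "source_family",
    "snapshot_date",
    "ingestion_run_id",
    "content_hash",
    "attestation_id",
    "methodology_version",
)


def missing_provenance_fields(row: dict) -> list[str]:
    """Return the named provenance fields that are absent or empty in a row."""
    return [
        name
        for name in NAMED_PROVENANCE_FIELDS
        if row.get(name) in (None, "")
    ]
```

Applied to a row taken from the loaded dataset (for example `ds[0]`), an empty result would indicate that all six named fields are populated, under the naming assumption above.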