Submitting Fonteum datasets to Google Dataset Search + DCAT-US 3.0
Every public surface listed below ships schema.org/Dataset JSON-LD. Crawlers eventually find these on their own, but explicit submission to Google Dataset Search and data.gov cuts time-to-index from weeks to days. This page is the operator's submission runbook.
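For orientation, here is a minimal sketch of the kind of schema.org Dataset JSON-LD block a surface might embed. The helper name `dataset_jsonld` and the placeholder name and description are assumptions for illustration; the records Fonteum actually ships are likely to carry more properties.

```python
import json


def dataset_jsonld(name: str, description: str, url: str) -> str:
    """Return a <script> tag carrying a minimal schema.org Dataset record."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Escape "</" so the JSON cannot terminate the surrounding <script> element.
    body = json.dumps(record, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Placeholder values, not the real metadata for this surface.
    print(dataset_jsonld(
        name="Fonteum coverage (placeholder name)",
        description="Placeholder description for illustration only.",
        url="https://fonteum.com/coverage",
    ))
```

Building the record as a dictionary and serializing it with `json.dumps` avoids hand-written JSON, which is a common source of structured-data errors.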
Dataset surfaces
Nine surfaces emit Dataset JSON-LD. The /data surface aggregates all of them in DCAT-US 3.0 format, so submitting that one URL registers the entire Fonteum graph.
- https://fonteum.com/data (DataCatalog)
DCAT-US 3.0 aggregate catalog. Submit this URL once to data.gov DCAT ingestion + Google Dataset Search; downstream agency catalogs (NIH ODSS, AHRQ, CMS data.gov) discover every Fonteum dataset through this surface.
- https://fonteum.com/coverage (Dataset)
Top-level distribution-network dataset. Describes the healthcare-vertical provider directories that consume the Fonteum provider graph.
- https://fonteum.com/research (DataCatalog, with per-study Dataset entries)
Catalog of every research study. Each per-study record on /research/[slug] is itself a Dataset record so journalists can cite individual studies via DOI.
- https://fonteum.com/freshness (Dataset + 6 DataFeed)
Live freshness status for the 6 canonical healthcare data sources (NPPES, PECOS, Care Compare, HRSA HPSA, BLS OEWS, BEA Regional). Each source emits a DataFeed with the latest snapshot date.
- https://fonteum.com/trust/integrity (DataCatalog, with per-attestation Dataset entries)
Public SHA-256 attestations for every snapshot Fonteum publishes. Re-fetch any source archive, re-hash, and compare to the published hash for byte-exact integrity checking.
- https://fonteum.com/identity (Dataset + WebAPI)
Federated provider identity layer (NPPES NPI ↔ PECOS-ID ↔ CCN ↔ LEIE ↔ HRSA). The Dataset record describes the cross-source link table; the WebAPI describes the per-link resolver.
- https://fonteum.com/search (Dataset + WebSite + SearchAction)
Public natural-language search across the Fonteum healthcare-provider data graph. The Dataset.potentialAction registers the SearchAction so Google Dataset Search lists Fonteum as a queryable dataset.
- https://fonteum.com/v/<slug> (Dataset, one per healthcare vertical)
Per-healthcare-vertical Dataset records (chiropractors, dermatologists, plastic-surgeons, med-spas, weight-loss, rehab-centers, hair-transplant, fertility-clinics, trt-clinics, ketamine-clinics).
- https://fonteum.com/care-compare/<module> (Dataset, one per Care Compare module)
Per-Care-Compare-module Dataset records (nursing-homes, home-health, hospice, dialysis, asc).
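The re-hash-and-compare check described for the /trust/integrity surface can be sketched as follows. This is a minimal illustration, assuming the source archive has already been downloaded to a local file and the published digest is available as a hex string; the function names are hypothetical and not part of any Fonteum tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_attestation(path: Path, published_hex: str) -> bool:
    """Compare a local archive's SHA-256 with a published hex digest."""
    # Normalize whitespace and case; hexdigest() always returns lowercase.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A call such as `matches_attestation(Path("archive.zip"), published_hash)` returns True only when the downloaded bytes are identical to the bytes that were hashed at publication time; the file name here is a placeholder.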
One-time submission steps
1. Google Dataset Search
Submit Dataset URLs via Google Search Console > URL Inspection > Request Indexing. Once a single Dataset URL is indexed, Google Dataset Search picks up the rest from the sitemap, so there is no need to submit every URL by hand. The fastest path is to submit /data, whose DataCatalog record links every individual Dataset.
Confirm via search.google.com/search-console > URL Inspection > paste https://fonteum.com/data > Request Indexing
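Because discovery of the remaining surfaces relies on the sitemap, it can help to confirm that every surface URL appears there. The sketch below enumerates the surfaces listed on this page and renders them as a sitemap in the sitemaps.org 0.9 format. The constant and function names are hypothetical, the slug lists are copied from this page, and per-study /research/[slug] URLs are omitted because their slugs are not listed here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
BASE = "https://fonteum.com"

# Paths and slugs as listed on this page.
STATIC = ["/data", "/coverage", "/research", "/freshness",
          "/trust/integrity", "/identity", "/search"]
VERTICALS = ["chiropractors", "dermatologists", "plastic-surgeons", "med-spas",
             "weight-loss", "rehab-centers", "hair-transplant",
             "fertility-clinics", "trt-clinics", "ketamine-clinics"]
CARE_COMPARE = ["nursing-homes", "home-health", "hospice", "dialysis", "asc"]


def surface_urls() -> list[str]:
    """Expand the surface list into absolute URLs."""
    urls = [BASE + path for path in STATIC]
    urls += [f"{BASE}/v/{slug}" for slug in VERTICALS]
    urls += [f"{BASE}/care-compare/{module}" for module in CARE_COMPARE]
    return urls


def build_sitemap(urls: list[str]) -> str:
    """Render URLs as a sitemap document with one <url><loc> entry each."""
    # Register the sitemap namespace as the default so tags are unprefixed.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap(surface_urls()))
```

Comparing the output of `surface_urls()` with the `<loc>` entries in the live sitemap is one way to spot a surface that crawlers would not reach.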
2. data.gov DCAT ingestion (federal-agency catalogs)
data.gov accepts DCAT-US 3.0 catalogs via its CKAN harvester. Submit /data to catalog.data.gov via the publisher form. Once registered, agency catalogs (NIH ODSS, AHRQ research portal, CMS data.gov listings) auto-harvest the catalog on their published cadence.
3. Verify with Google Rich Results Test
Before broad submission, verify each Dataset URL renders valid structured data:
https://search.google.com/test/rich-results?url=https://fonteum.com/coverage
Replace the URL parameter with each surface from the list above. The tool flags any required-field gaps that would block ingestion. The launch gate dataset-jsonld:every-surface-validates catches the same gaps in CI.
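The sketch below builds a Rich Results Test link for any surface URL and runs a rough local pre-check on a page's HTML. It assumes that `name` and `description` are the required Dataset properties, per Google's Dataset structured-data guidelines, and it does not unpack `@graph` wrappers. All names are hypothetical; this is not the implementation of the dataset-jsonld:every-surface-validates gate.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode

RICH_RESULTS_TEST = "https://search.google.com/test/rich-results"
# Assumption: properties treated as required for a Dataset record.
REQUIRED = ("name", "description")


def rich_results_url(page_url: str) -> str:
    """Build a Rich Results Test link with the page URL percent-encoded."""
    return f"{RICH_RESULTS_TEST}?{urlencode({'url': page_url})}"


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD objects from <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.records: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            parsed = json.loads("".join(self._buffer))
            items = parsed if isinstance(parsed, list) else [parsed]
            self.records.extend(i for i in items if isinstance(i, dict))


def is_dataset(record: dict) -> bool:
    """True when @type is "Dataset" or a list that includes it."""
    kind = record.get("@type")
    return kind == "Dataset" or (isinstance(kind, list) and "Dataset" in kind)


def missing_dataset_fields(html: str) -> list[str]:
    """Return required fields absent from the page's Dataset records."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    datasets = [r for r in parser.records if is_dataset(r)]
    if not datasets:
        return ["<no Dataset record found>"]
    return [field for d in datasets for field in REQUIRED if not d.get(field)]
```

Calling `rich_results_url("https://fonteum.com/coverage")` yields the same link as above with the `url` parameter percent-encoded, and `missing_dataset_fields(page_html)` returns an empty list when every Dataset record on the page carries both assumed fields.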