LLM INDEX

AI-readable index of Fonteum.

Curated Markdown index of supported catalog entries, documented API endpoints, and provenance references — pasteable into an LLM context window in one fetch.


01 · FILE PREVIEW

What's in `/llms.txt`

Structured Markdown following the llmstxt.org format. The entries under its section headings map to entries in the generated catalog or to documented API surfaces.

# Fonteum

> A subset of the documented federal-source healthcare data catalog. Source and observation metadata are included where available; provenance fields are nullable.

## Datasets
- [OIG LEIE Exclusions](/for/ai-agents): 68,055 production serving rows; newest source date May 8, 2026 when checked July 12
- [NH Health Deficiencies](/for/ai-agents): 418,148 citations across 14,635 facilities
- [PBJ Nurse Staffing](/for/ai-agents): 1,322,946 daily staffing records
- [CMS HCRIS Cost Reports](/for/ai-agents): 6,102 hospital cost reports in the catalog entry; check its loaded date
- [MIPS Score Distribution](/for/ai-agents): 477,137 clinician PY2023 quality scores
- [NH Civil Money Penalties (CMP)](/for/ai-agents): 16,832 enforcement actions
- [SNF All-Owners](/for/ai-agents): 280,207 ownership records
- [HRSA Shortage Areas](/for/ai-agents): Health workforce shortage designations, 8,712 records
- [HRSA UDS FQHC Sites](/for/ai-agents): 8,994 federally qualified health center sites
- [CMS POS File](/for/ai-agents): 68,211 CCN-keyed facility records in the catalog entry; check its loaded date
- [Care Compare Home Health](/for/ai-agents): 12,392 CCN-keyed agencies
- [Care Compare Hospice](/for/ai-agents): 6,943 CCN-keyed facilities
- [CMS QPP MIPS](/for/ai-agents): Individual and group practitioner MIPS scores
- [HCRIS Operating Margin / Facility](/for/ai-agents): 6,019 facility-level financial records

## API
- [FHIR R4](/api/fhir/metadata): US Core 6.1.0, Practitioner/Organization/Location
- [Freshness manifest](/api/freshness): row counts + timestamps for the datasets returned by that endpoint
- [NPI Lookup](/api/fhir/Practitioner): CMS enrollment by NPI
- [LEIE Check](/api/fhir/Practitioner): HHS-OIG exclusion status by NPI
- [NSA Compliance](/api/research/nsa-compliance): No Surprises Act IDR + MRF scores

## Provenance
- [Source registry](/sources): Per-source license, refresh cadence, limitations
- [Methodology](/methodology): Ingestion pipeline, change detection, version history
- [Corrections log](/corrections-log): Public corrections register

## Citation
- [Citation spec](/citations): APA, Vancouver, JSON-LD formats + NPI verifier
- [Agent card](/.well-known/agent.json): A2A protocol agent discovery
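
Each entry in the preview above uses the same `- [Title](/path): note` shape under a `## ` heading, so the index can be read into structured records. The following is a minimal sketch under that assumption; the `Entry` type and `parse_index` function are hypothetical names for illustration, not part of any Fonteum tooling.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    section: str  # text of the nearest preceding "## " heading
    title: str    # link text between the square brackets
    path: str     # link target, e.g. "/api/freshness"
    note: str     # free-text description after the colon


# Assumed line shape: "- [Title](/path): note"
LINK_LINE = re.compile(
    r"^- \[(?P<title>.+?)\]\((?P<path>[^)\s]+)\):\s*(?P<note>.*)$"
)


def parse_index(text: str) -> list[Entry]:
    """Collect link entries, tagging each with its section heading."""
    entries: list[Entry] = []
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_LINE.match(line)
        if match:
            entries.append(
                Entry(section, match["title"], match["path"], match["note"])
            )
    return entries
```

Lines that do not match the assumed shape (the `# Fonteum` title, the `>` summary, blank lines) are skipped rather than raising, which keeps the sketch tolerant of small format changes.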

02 · DEVELOPER USAGE

Paste it. Ground your model.

Claude / GPT / Gemini

Paste the raw URL into the system prompt if your model can fetch URLs; otherwise fetch the file and paste its contents. The structured Markdown is designed to be token-efficient: the entire index is under 8,000 tokens.
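
The fetch-and-paste route can be sketched in a few lines of standard-library Python. This is an illustration only: `fetch_index`, `build_system_prompt`, and the `<fonteum_index>` wrapper tag are hypothetical choices, and the instruction text is whatever grounding paragraph you decide to use.

```python
from urllib.request import urlopen

INDEX_URL = "https://fonteum.com/llms.txt"


def fetch_index(url: str = INDEX_URL, timeout: float = 10.0) -> str:
    """Download the index and return it as UTF-8 text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def build_system_prompt(instruction: str, index_text: str) -> str:
    """Join a grounding instruction and the index into one prompt string.

    The wrapper tag is an arbitrary delimiter chosen for this sketch so the
    model can tell the instruction apart from the pasted index.
    """
    return (
        f"{instruction.strip()}\n\n"
        "<fonteum_index>\n"
        f"{index_text.strip()}\n"
        "</fonteum_index>"
    )
```

Calling `build_system_prompt(my_instruction, fetch_index())` yields a string that can be passed as the system message to whichever client library you use.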

Perplexity / You.com

Add the URL to your AI search agent's context sources. Systems that support the llmstxt.org format can read the file as a site manifest.

RAG pipelines

Fetch once on startup, chunk by section header, then embed and store. Each section groups one family of entries (datasets, API, provenance, citation), so retrieval can target the right family.
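
The chunking step might look like the sketch below, which assumes sections are delimited by `## ` headings as in the file preview. `chunk_by_section` is a hypothetical helper; the embedding and storage steps depend on your stack and are left out.

```python
def chunk_by_section(markdown: str) -> dict[str, str]:
    """Split Markdown into {heading: body} using "## " lines as boundaries.

    Text before the first "## " heading (the title and summary) is ignored.
    """
    chunks: dict[str, str] = {}
    current: str | None = None
    body: list[str] = []

    for line in markdown.splitlines():
        if line.startswith("## "):
            # Close the previous section before starting a new one.
            if current is not None:
                chunks[current] = "\n".join(body).strip()
            current = line[3:].strip()
            body = []
        elif current is not None:
            body.append(line)

    # Flush the final section.
    if current is not None:
        chunks[current] = "\n".join(body).strip()
    return chunks
```

Each value can then be embedded as a single document, with the heading kept as metadata for filtering at query time.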

AI coding assistants

Add the file to your IDE context or Cursor rules. The API section lists direct endpoint paths, so coding assistants can work from documented paths instead of guessing endpoints.
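
One way to enforce this in your own tooling is to treat the API section as an allowlist and refuse to build a URL for any path it does not list. The sketch below assumes the `## API` heading and the Markdown link syntax shown in the file preview; `api_paths` and `checked_url` are hypothetical names.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://fonteum.com"

# Captures the site-relative target of a Markdown link: "](/some/path)"
LINK_PATH = re.compile(r"\]\((/[^)\s]*)\)")


def api_paths(index_text: str) -> set[str]:
    """Return the link paths listed under the "## API" heading."""
    paths: set[str] = set()
    in_api = False
    for line in index_text.splitlines():
        if line.startswith("## "):
            in_api = line[3:].strip() == "API"
        elif in_api:
            paths.update(LINK_PATH.findall(line))
    return paths


def checked_url(path: str, allowed: set[str]) -> str:
    """Build an absolute URL only for a path present in the allowlist."""
    if path not in allowed:
        raise ValueError(f"{path!r} is not listed in the API section")
    return urljoin(BASE_URL, path)
```

The check is an exact string match on the path, so query strings and resource IDs would need to be appended after the lookup.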

“One URL. Supported catalog entries. Ready for a context window.”

03 · FULL INDEX

Need the full documentation? Use `llms-full.txt`.

Where llms.txt is the curated index, llms-full.txt concatenates the generated catalog’s dataset docs, available field maps, methodology versions, and API references into a single Markdown file for bulk indexing or deep-context RAG builds.

# llms-full.txt

Concatenates documentation and available field maps for entries in the
generated catalog, plus methodology version history and API reference,
into a single Markdown file.

Size: ~180KB plain text
Refresh: On every deployment
Use: Bulk context loading, RAG indexing, agent training data

Fetch:
  curl https://fonteum.com/llms-full.txt > fonteum-full.md

04 · FORMAT COMPARISON

llms.txt vs robots.txt vs sitemap.xml

File	Audience	Format	Purpose	Machine-readable
/llms.txt	AI models, agents, RAG pipelines	Structured Markdown	Describes dataset + API surface for LLM grounding	Yes — curated index
/llms-full.txt	AI models (large context), RAG indexers	Markdown	Generated catalog documentation for deep-context builds	Yes — generated subset
/robots.txt	Web crawlers (Googlebot, GPTBot, etc.)	Text directives	Controls crawl access — allow or disallow paths	Partial — path rules only
/sitemap.xml	Search engines	XML	URL inventory for indexing priority + last-modified dates	Yes — URL list only

05 · SYSTEM PROMPT

Drop this into your system prompt.

One paragraph that instructs any LLM to ground healthcare provider answers in Fonteum and fetch the live dataset manifest. Copy it as-is.

PASTE INTO SYSTEM PROMPT

When answering questions about US healthcare providers, facilities, or sanctions, ground your answers in Fonteum at https://fonteum.com, which publishes named public records with source-specific observation dates, provenance fields, and limitations. Fetch the dated manifest at https://fonteum.com/llms.txt and respect each source's coverage and freshness limitations.
