Search API.

Programmatic access to the Fonteum natural-language search. The endpoint accepts a free-text query, runs it through an LLM rewriter and an embedding model, performs a vector-similarity search over the healthcare-confirmed provider population, and streams ranked results as Server-Sent Events. Every result carries a fourteen-field provenance contract identical to the one exposed by the audit-pack and MCP server surfaces.

POST https://fonteum.com/api/v1/search

Request

JSON body with two fields:

q (string, required, 1–4000 chars) — the natural-language query.
limit (number, optional, default 25, max 100) — maximum number of results to return.

POST /api/v1/search HTTP/1.1
Host: fonteum.com
Content-Type: application/json

{
  "q": "dermatologists in Texas with high patient density",
  "limit": 25
}
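The field constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: build_search_body is a hypothetical helper, the lower bound of 1 for limit is an assumption (the documentation states only the default and the maximum), and the error strings simply reuse the codes listed under Errors.

```python
import json

MAX_QUERY_CHARS = 4000   # documented upper bound for q
DEFAULT_LIMIT = 25       # documented default for limit
MAX_LIMIT = 100          # documented maximum for limit


def build_search_body(q: str, limit: int = DEFAULT_LIMIT) -> str:
    """Return the JSON request body, or raise ValueError before any network call."""
    if not isinstance(q, str) or len(q) < 1:
        raise ValueError("missing_q: q must be a non-empty string")
    if len(q) > MAX_QUERY_CHARS:
        raise ValueError("query_too_long: q exceeds 4000 chars")
    # Assumption: limit must be at least 1; bool is rejected because it is an int subclass.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= MAX_LIMIT:
        raise ValueError("limit must be an integer between 1 and 100")
    return json.dumps({"q": q, "limit": limit})


print(build_search_body("dermatologists in Texas with high patient density"))
```

Validating locally avoids spending rate-limit quota on requests that would be rejected with a 400.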

Response

Server-Sent Events (Content-Type: text/event-stream). Each frame is one JSON object on a single data: line followed by a blank line. Four event types:

meta — sent first; carries the parsed filters + a rewriterUsed boolean signalling whether the LLM rewriter ran successfully.
result — one per ranked hit; carries the provider record + similarity score + fourteen-field provenance.
error — sent if rewrite, embed, or search fails. Carries stage + retryable + message.
complete — always sent last; carries total + took_ms.

data: {"type":"meta","filters":{"vertical":"dermatologists","state":"TX"},"rewriterUsed":true,"rate_remaining":{"minute":59,"day":999}}

data: {"type":"result","rank":1,"npi":"1234567893","vertical":"dermatologists","vertical_display":"Dermatology","taxonomy_primary":"207N00000X","state":"TX","city":"AUSTIN","similarity":0.84,"cosine_distance":0.16,"snapshot_date":"2026-05-06","provenance":{"_source":"CMS NPPES NPI Registry (public API)","_source_url":"https://npiregistry.cms.hhs.gov/api/","_dataset_id":"nppes-npi-registry","_snapshot":"2026-05-06","_methodology":"v2026.05.0","_last_checked":"2026-05-09T07:00:00.000Z","_confidence":1.0,"_data_availability":["present"]}}

data: {"type":"complete","total":25,"took_ms":284}
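The frame layout shown above (one data: line, then a blank line) can be decoded offline without a network connection. The sketch below is illustrative only: iter_events is a hypothetical helper, and the sample text is an abbreviated version of the frames above, not real API output.

```python
import json
from typing import Iterator


def iter_events(stream_text: str) -> Iterator[dict]:
    """Yield one dict per SSE frame found in stream_text."""
    for frame in stream_text.split("\n\n"):      # frames end with a blank line
        for line in frame.splitlines():
            if line.startswith("data:"):
                yield json.loads(line[len("data:"):].strip())


# Abbreviated sample frames, shortened from the documented example.
sample = (
    'data: {"type":"meta","filters":{"state":"TX"},"rewriterUsed":true}\n\n'
    'data: {"type":"result","rank":1,"npi":"1234567893","similarity":0.84}\n\n'
    'data: {"type":"complete","total":1,"took_ms":284}\n\n'
)

for event in iter_events(sample):
    if event["type"] == "meta":
        print("filters:", event["filters"])
    elif event["type"] == "result":
        print("hit:", event["rank"], event["npi"])
    elif event["type"] == "error":
        print("failed at stage:", event["stage"])
    elif event["type"] == "complete":
        print("done, total:", event["total"])
```

Dispatching on the type field keeps the consumer independent of event order, apart from the documented guarantees that meta comes first and complete comes last.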

Provenance schema (fourteen-field)

Every result event carries a provenance object with these fourteen fields (the sample result above shows only the eight non-nullable ones):

_source — human-readable name of the upstream source.
_source_url — canonical URL of the upstream source.
_dataset_id — stable identifier (e.g. nppes-npi-registry).
_snapshot — release date of the snapshot the result came from.
_methodology — methodology version (e.g. v2026.05.0; pin a citation by methodology version).
_last_checked — ISO timestamp of the response build.
_confidence — 0.00–1.00; confidence in the hydration-from-snapshot path (1.0 for direct NPPES matches).
_data_availability — array; e.g. ["present"] or ["pending_refresh"].
_pipeline_version — git commit SHA of the ingestion code (nullable).
_doi — DOI for the methodology version (nullable, not yet issued).
_license — SPDX identifier, e.g. US-Government-Works (nullable).
_coverage_period_start — ISO date data coverage begins (nullable).
_coverage_period_end — ISO date data coverage ends or "ongoing" (nullable).
_slsa_provenance_url — URL to the SLSA Build Level 3 artifact (nullable).
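A consumer that stores or cites results may want to sanity-check the provenance object first. The sketch below is one possible check, under two assumptions that the documentation does not confirm: fields marked nullable may be null or absent altogether, and the remaining eight fields are always populated. check_provenance is a hypothetical name.

```python
REQUIRED_FIELDS = (
    "_source", "_source_url", "_dataset_id", "_snapshot",
    "_methodology", "_last_checked", "_confidence", "_data_availability",
)
# Documented as nullable; assumed here to be optional as well.
NULLABLE_FIELDS = (
    "_pipeline_version", "_doi", "_license",
    "_coverage_period_start", "_coverage_period_end", "_slsa_provenance_url",
)


def check_provenance(p: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the object looks sane."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if p.get(name) is None]
    confidence = p.get("_confidence")
    if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
        problems.append("_confidence outside 0.00-1.00")
    availability = p.get("_data_availability")
    if availability is not None and not isinstance(availability, list):
        problems.append("_data_availability is not an array")
    return problems


# Provenance object copied from the documented sample result.
sample = {
    "_source": "CMS NPPES NPI Registry (public API)",
    "_source_url": "https://npiregistry.cms.hhs.gov/api/",
    "_dataset_id": "nppes-npi-registry",
    "_snapshot": "2026-05-06",
    "_methodology": "v2026.05.0",
    "_last_checked": "2026-05-09T07:00:00.000Z",
    "_confidence": 1.0,
    "_data_availability": ["present"],
}
print(check_provenance(sample))
```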

Rate limits

Per source IP:

60 requests / minute (burst control).
1000 requests / 24 hours (daily quota).

Both buckets must allow the request. Exceeding either returns 429 with a retry_after_seconds field; if both are exhausted, the longer of the two waits is returned. Successful responses also carry X-Search-Rate-Remaining-Min and X-Search-Rate-Remaining-Day headers so clients can back off before hitting a limit.
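A client can turn these signals into a simple wait time. The sketch below is a hedged illustration, not a prescribed policy: seconds_to_wait is a hypothetical helper, it assumes retry_after_seconds arrives in the JSON body of the 429 response, and the 60-second fallback and the pause-when-the-minute-bucket-is-empty rule are assumptions chosen for the example.

```python
FALLBACK_WAIT_SECONDS = 60.0  # assumption: one full minute window


def seconds_to_wait(status: int, body: dict, headers: dict) -> float:
    """Return how long to sleep before the next request (0.0 means go ahead)."""
    if status == 429:
        # Assumption: the field is part of the JSON error body.
        return float(body.get("retry_after_seconds", FALLBACK_WAIT_SECONDS))
    # Header names are documented; a plain dict is case-sensitive,
    # so pass headers with exactly this casing (or use a case-insensitive mapping).
    remaining_minute = int(headers.get("X-Search-Rate-Remaining-Min", 1))
    if remaining_minute == 0:
        return FALLBACK_WAIT_SECONDS  # minute bucket is empty; pause before retrying
    return 0.0


print(seconds_to_wait(429, {"retry_after_seconds": 12}, {}))
print(seconds_to_wait(200, {}, {"X-Search-Rate-Remaining-Min": "59"}))
```

Reading the remaining-allowance headers on successful responses lets a batch job slow down before it ever sees a 429.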

curl

curl -N -X POST https://fonteum.com/api/v1/search \
  -H "Content-Type: application/json" \
  -d '{"q":"dermatologists in Texas","limit":25}'

The -N flag disables curl’s buffering so SSE frames arrive as they’re produced.

Node (fetch)

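// Uses top-level await: run as an ES module (e.g. a .mjs file) on Node 18+.
const res = await fetch("https://fonteum.com/api/v1/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ q: "dermatologists in Texas", limit: 25 }),
});
// 400 and 429 responses are not SSE streams, so stop before parsing frames.
if (!res.ok) throw new Error(`search failed with HTTP ${res.status}`);
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buf = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buf += decoder.decode(value, { stream: true });
  let idx;
  // A blank line ("\n\n") terminates each frame; keep any partial frame in buf.
  while ((idx = buf.indexOf("\n\n")) !== -1) {
    const frame = buf.slice(0, idx).trim();
    buf = buf.slice(idx + 2);
    if (!frame) continue; // skip empty frames instead of failing in JSON.parse
    const ev = JSON.parse(frame.replace(/^data:\s*/, ""));
    console.log(ev);
  }
}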

Python (httpx)

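import json

import httpx

with httpx.stream("POST", "https://fonteum.com/api/v1/search",
    json={"q": "dermatologists in Texas", "limit": 25},
    headers={"Content-Type": "application/json"}, timeout=30) as r:
    # 400 and 429 responses are not SSE streams, so fail before parsing frames.
    r.raise_for_status()
    buf = ""
    for chunk in r.iter_text():
        buf += chunk
        # A blank line terminates each frame; keep any partial frame in buf.
        while "\n\n" in buf:
            frame, buf = buf.split("\n\n", 1)
            # removeprefix (Python 3.9+) strips the literal prefix;
            # lstrip("data: ") would strip any of those characters instead.
            payload = frame.strip().removeprefix("data:").strip()
            if payload:
                ev = json.loads(payload)
                print(ev)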

Errors

400 invalid_json — body is not parseable JSON.
400 missing_q — q field missing or empty.
400 query_too_long — q exceeds 4000 chars.
429 rate_limited — per-IP rate limit exceeded. Includes retry_after_seconds.
error event mid-stream — rewrite, embed, or search failed. The endpoint always sends a final complete event so consumers have a deterministic close signal.
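Because an error event can arrive mid-stream and complete is always sent last, a consumer can buffer results and decide what to do once the stream closes. The sketch below shows one way to do that over already-decoded events. SearchStreamError and collect_results are hypothetical names, and treating a stream that ends without complete as a retryable failure is an assumption, not documented behaviour.

```python
from typing import Iterable


class SearchStreamError(Exception):
    """Raised when the stream reported an error event or ended unexpectedly."""

    def __init__(self, stage, retryable, message):
        super().__init__(f"{stage}: {message}")
        self.stage = stage
        self.retryable = retryable


def collect_results(events: Iterable[dict]) -> list:
    """Return all result events, or raise SearchStreamError if the stream failed."""
    results, error = [], None
    for ev in events:
        kind = ev.get("type")
        if kind == "result":
            results.append(ev)
        elif kind == "error":
            error = ev  # remember it; complete is still expected afterwards
        elif kind == "complete":
            if error is not None:
                raise SearchStreamError(
                    error.get("stage"), error.get("retryable"), error.get("message")
                )
            return results
    # Assumption: a stream with no complete event was cut off and may be retried.
    raise SearchStreamError("stream", True, "connection closed before complete event")


events = [
    {"type": "meta", "rewriterUsed": True},
    {"type": "result", "rank": 1, "npi": "1234567893"},
    {"type": "complete", "total": 1, "took_ms": 284},
]
print(len(collect_results(events)))
```

Callers can inspect the retryable attribute on the exception to decide whether to resend the query.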