Search API.
Programmatic access to the Fonteum natural-language search. The endpoint accepts a free-text query, runs it through an LLM rewriter and an embedding model, performs a vector similarity search over the healthcare-confirmed provider population, and streams ranked results as Server-Sent Events. Every result carries an 8-tuple provenance contract identical to the one used by the audit-pack and MCP server surfaces.
POST https://fonteum.com/api/v1/search
Request
JSON body with two fields:
- q (string, required, 1–4000 chars): the natural-language query.
- limit (number, optional, default 25, max 100): maximum number of results to return.
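The field constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Fonteum API: the helper name build_search_body is hypothetical, and the lower bound of 1 for limit is an assumption, since the source states only the default and the maximum. How the server treats an out-of-range limit is not documented here.

```python
import json

MAX_Q_CHARS = 4000   # from the field description: q is 1-4000 chars
DEFAULT_LIMIT = 25   # documented default
MAX_LIMIT = 100      # documented maximum


def build_search_body(q, limit=DEFAULT_LIMIT):
    """Return the JSON request body as a string, or raise ValueError."""
    if not isinstance(q, str) or not (1 <= len(q) <= MAX_Q_CHARS):
        raise ValueError("q must be a string of 1 to 4000 characters")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise ValueError("limit must be an integer")
    # Assumption: the smallest useful limit is 1 (not stated in the source).
    if not (1 <= limit <= MAX_LIMIT):
        raise ValueError("limit must be between 1 and 100")
    return json.dumps({"q": q, "limit": limit})


print(build_search_body("dermatologists in Texas"))
```

Rejecting bad input locally avoids spending a rate-limited request on a call that would come back as a 400.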
POST /api/v1/search HTTP/1.1
Host: fonteum.com
Content-Type: application/json

{
  "q": "dermatologists in Texas with high patient density",
  "limit": 25
}
Response
Server-Sent Events (Content-Type: text/event-stream). Each frame is one JSON object on a single data: line followed by a blank line. Event types:
- meta: sent first; carries the parsed filters and a rewriterUsed boolean signalling whether the LLM rewriter ran successfully.
- result: one per ranked hit; carries the provider record, similarity score, and 8-tuple provenance.
- error: sent if rewrite, embed, or search fails; carries stage, retryable, and message.
- complete: always sent last; carries total and took_ms.
data: {"type":"meta","filters":{"vertical":"dermatologists","state":"TX"},"rewriterUsed":true,"rate_remaining":{"minute":59,"day":999}}
data: {"type":"result","rank":1,"npi":"1234567893","vertical":"dermatologists","vertical_display":"Dermatology","taxonomy_primary":"207N00000X","state":"TX","city":"AUSTIN","similarity":0.84,"cosine_distance":0.16,"snapshot_date":"2026-05-06","provenance":{"_source":"CMS NPPES NPI Registry (public API)","_source_url":"https://npiregistry.cms.hhs.gov/api/","_dataset_id":"nppes-npi-registry","_snapshot":"2026-05-06","_methodology":"v2026.05.0","_last_checked":"2026-05-09T07:00:00.000Z","_confidence":1.0,"_data_availability":["present"]}}
data: {"type":"complete","total":25,"took_ms":284}
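The frame layout shown above (one data: line per frame, frames separated by a blank line) can be decoded with a few lines of Python. This is a sketch under that assumption only; parse_sse_frames is a hypothetical helper, and the sample string is a trimmed version of the frames above, not a real server response.

```python
import json


def parse_sse_frames(text):
    """Split an SSE body into frames and decode each data: line as JSON."""
    events = []
    for frame in text.split("\n\n"):
        for line in frame.splitlines():
            if line.startswith("data:"):
                # Drop the literal "data:" prefix, then any padding.
                events.append(json.loads(line[len("data:"):].strip()))
    return events


# Trimmed sample based on the frames above (fields shortened for brevity).
sample = (
    'data: {"type":"meta","filters":{"vertical":"dermatologists","state":"TX"},"rewriterUsed":true}\n\n'
    'data: {"type":"result","rank":1,"npi":"1234567893","similarity":0.84}\n\n'
    'data: {"type":"complete","total":1,"took_ms":284}\n\n'
)

events = parse_sse_frames(sample)
print([ev["type"] for ev in events])
```

Each decoded event is a plain dict, so a consumer can branch on ev["type"] to handle the event types listed above.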
Provenance schema (8-tuple)
Every result event carries a provenance object with these eight fields:
- _source: human-readable name of the upstream source.
- _source_url: canonical URL of the upstream source.
- _dataset_id: stable identifier (e.g. nppes-npi-registry).
- _snapshot: release date of the snapshot the result came from.
- _methodology: methodology version (e.g. v2026.05.0; pin a citation by methodology version).
- _last_checked: ISO timestamp of the response build.
- _confidence: 0.00–1.00; confidence in the hydration-from-snapshot path (1.0 for direct NPPES matches).
- _data_availability: array; e.g. ["present"] or ["pending_refresh"].
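A consumer that relies on the provenance contract may want to confirm that all eight fields arrived before storing a result. The sketch below is one way to do that; check_provenance is a hypothetical helper, and the checks go no further than what the field descriptions state (presence, the 0.00–1.00 range, and the array type). The example object is copied from the sample result event.

```python
PROVENANCE_FIELDS = (
    "_source", "_source_url", "_dataset_id", "_snapshot",
    "_methodology", "_last_checked", "_confidence", "_data_availability",
)


def check_provenance(prov):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = [f"missing {name}" for name in PROVENANCE_FIELDS if name not in prov]
    conf = prov.get("_confidence")
    if conf is not None:
        if not isinstance(conf, (int, float)) or not (0.0 <= conf <= 1.0):
            problems.append("_confidence outside 0.00-1.00")
    avail = prov.get("_data_availability")
    if avail is not None and not isinstance(avail, list):
        problems.append("_data_availability is not an array")
    return problems


example = {
    "_source": "CMS NPPES NPI Registry (public API)",
    "_source_url": "https://npiregistry.cms.hhs.gov/api/",
    "_dataset_id": "nppes-npi-registry",
    "_snapshot": "2026-05-06",
    "_methodology": "v2026.05.0",
    "_last_checked": "2026-05-09T07:00:00.000Z",
    "_confidence": 1.0,
    "_data_availability": ["present"],
}

print(check_provenance(example))  # []
```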
Rate limits
Per source IP:
- 60 requests / minute (burst control).
- 1000 requests / 24 hours (daily quota).
Both buckets must allow the request. Exceeding either returns 429 with a retry_after_seconds field, set to the longer of the two buckets' wait times. Successful responses also carry X-Search-Rate-Remaining-Min and X-Search-Rate-Remaining-Day headers for client-side back-off awareness.
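The two-bucket rule can be modelled locally so a client paces itself instead of running into 429 responses. The sketch below assumes a sliding-window algorithm, which is a guess: the source does not say how the server counts requests. DualWindowLimiter and try_acquire are hypothetical names. A request passes only if both windows have room, and a refusal reports the longer of the two waits, mirroring the behaviour described above.

```python
from collections import deque


class DualWindowLimiter:
    """Client-side model: a request passes only if both windows have room."""

    def __init__(self, per_minute=60, per_day=1000):
        # Each window: (span in seconds, capacity, timestamps of past requests).
        self.windows = [
            (60.0, per_minute, deque()),
            (86400.0, per_day, deque()),
        ]

    def try_acquire(self, now):
        """Return (allowed, wait_seconds) for a request made at time `now`."""
        waits = []
        for span, cap, stamps in self.windows:
            # Forget requests that have aged out of this window.
            while stamps and now - stamps[0] >= span:
                stamps.popleft()
            if len(stamps) >= cap:
                # Time until the oldest request leaves the window.
                waits.append(stamps[0] + span - now)
        if waits:
            return False, max(waits)  # the longer wait wins
        for _, _, stamps in self.windows:
            stamps.append(now)
        return True, 0.0


limiter = DualWindowLimiter()
print(limiter.try_acquire(0.0))
```

In real use, `now` would come from time.monotonic(); passing it in explicitly keeps the model easy to test. The server's retry_after_seconds value and the rate-remaining headers remain the authoritative signals.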
curl
curl -N -X POST https://fonteum.com/api/v1/search \
  -H "Content-Type: application/json" \
  -d '{"q":"dermatologists in Texas","limit":25}'
The -N flag disables curl's buffering so SSE frames arrive as they're produced.
Node (fetch)
const res = await fetch("https://fonteum.com/api/v1/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ q: "dermatologists in Texas", limit: 25 }),
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buf = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buf += decoder.decode(value, { stream: true });
  let idx;
  // Frames are separated by a blank line.
  while ((idx = buf.indexOf("\n\n")) !== -1) {
    const frame = buf.slice(0, idx).trim();
    buf = buf.slice(idx + 2);
    if (!frame) continue; // skip empty frames so JSON.parse does not throw
    const ev = JSON.parse(frame.replace(/^data:\s*/, ""));
    console.log(ev);
  }
}
Python (httpx)
import json
import httpx
with httpx.stream("POST", "https://fonteum.com/api/v1/search",
                  json={"q": "dermatologists in Texas", "limit": 25},
                  headers={"Content-Type": "application/json"}, timeout=30) as r:
    buf = ""
    for chunk in r.iter_text():
        buf += chunk
        # Frames are separated by a blank line.
        while "\n\n" in buf:
            frame, buf = buf.split("\n\n", 1)
            # removeprefix (Python 3.9+) drops the literal "data:" prefix;
            # lstrip("data: ") would strip any of those characters instead.
            payload = frame.strip().removeprefix("data:").strip()
            if payload:
                ev = json.loads(payload)
                print(ev)
Errors
- 400 invalid_json: body is not parseable JSON.
- 400 missing_q: q field missing or empty.
- 400 query_too_long: q exceeds 4000 chars.
- 429 rate_limited: per-IP rate limit exceeded. Includes retry_after_seconds.
- error event mid-stream: rewrite, embed, or search failed. The endpoint always sends a final complete event so consumers have a deterministic close signal.
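Because a mid-stream failure arrives as an event rather than an HTTP status, a consumer has to track it while reading. The sketch below shows one way to gather results, remember any error event, and use the complete event as the close signal. collect_results is a hypothetical helper that takes already-decoded event dicts; the stage value "embed" in the usage lines is an assumed literal, since the source names the stages but not their exact spelling.

```python
def collect_results(events):
    """Gather result events; report any error and whether the stream closed."""
    results, error, closed = [], None, False
    for ev in events:
        kind = ev.get("type")
        if kind == "result":
            results.append(ev)
        elif kind == "error":
            error = ev  # keep reading: a complete event should still follow
        elif kind == "complete":
            closed = True
            break
    return {
        "results": results,
        "error": error,
        "closed": closed,
        # Retry only when the server marked the failure as retryable.
        "retry": bool(error and error.get("retryable")),
    }


outcome = collect_results([
    {"type": "meta", "rewriterUsed": True},
    {"type": "result", "rank": 1},
    {"type": "error", "stage": "embed", "retryable": True, "message": "hypothetical failure"},
    {"type": "complete", "total": 1, "took_ms": 10},
])
print(outcome["closed"], outcome["retry"])  # True True
```

A stream that ends with closed still False was cut off before the complete event, which a client can treat as a transport failure rather than an empty result set.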