AI agent grounding — verify model assertions in tool-using chains
Stop agents from hallucinating release dates, parameter counts, and architectural facts. Add verify_claim as a tool in the agent's catalog; the model self-invokes when uncertain.
The problem
An AI agent calls tools in a loop. Search, fetch URL, run code, query database — each tool gives the model fresh context. The model decides what to do next based on that context.
The failure mode: the model emits factual assertions that aren't in the tool results. The agent fills gaps from its parametric memory, puts a release date off by six months or a parameter count off by 10×, and the rest of the agent loop builds on that false foundation. In multi-step agents, the errors compound.
The fix: add verify_claim to the agent's tool catalog. Instruct the agent to call it before asserting any AI/ML factual claim. The model self-invokes when uncertain; the verification result includes signed primary sources the agent can cite.
The pattern
# OpenAI tool catalog excerpt
tools = [
    # ... your existing tools ...
    {
        "type": "function",
        "function": {
            "name": "verify_claim",
            "description": (
                "Verify a natural-language factual claim about AI/ML "
                "(model releases, papers, dates, parameter counts). "
                "Returns a verified envelope with primary sources, or no match."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "claim": {"type": "string"},
                    "min_confidence": {"type": "number", "default": 0.85},
                },
                "required": ["claim"],
            },
        },
    },
]
import httpx

def execute_verify_claim(args):
    # POST the claim to the verification endpoint and return the JSON envelope.
    r = httpx.post(
        "https://sourcescore.org/api/v1/verify",
        json={
            "claim": args["claim"],
            "minConfidence": args.get("min_confidence", 0.85),
        },
        timeout=5.0,
    )
    r.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return r.json()

System prompt addition
Add to your agent's system prompt:
When you assert any factual claim about AI/ML topics (model releases, paper dates, parameter counts, organization founding), call verify_claim FIRST. If the response's best_match has confidence ≥ 0.85, cite the detail_url. If best_match is null OR confidence < 0.85, explicitly mark the assertion as "unverified" in your response. Never invent a citation.
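For context, here is a minimal sketch of the dispatch side of the loop, assuming the OpenAI Python SDK plus the tools list and execute_verify_claim defined above. The model name, SYSTEM_PROMPT placeholder, and sample question are illustrative; the full pattern lives in the OpenAI integration guide below.

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # your prompt plus the addition above
    {"role": "user", "content": "When was GPT-2 released?"},
]

while True:
    resp = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = resp.choices[0].message
    if not msg.tool_calls:
        break  # no tool calls left: msg.content is the final answer
    messages.append(msg)  # keep the assistant turn so tool results can attach to it
    for call in msg.tool_calls:
        if call.function.name == "verify_claim":
            result = execute_verify_claim(json.loads(call.function.arguments))
        else:
            result = {"error": "unknown tool"}  # dispatch your other tools here
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })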
Expected outcome
- Hallucination rate drops on AI/ML facts. In our reference test (50 questions about model releases, parameters, and papers), agents with verify_claim hallucinated on 8% of answers vs. 38% without. Variance is high; results depend on the agent and system prompt.
- Citation quality goes up. Every verified assertion gets a primary-source URL the user can click through and re-verify (a sketch of rendering these citations follows this list).
- Agent loop cost: ~80 ms per verify_claim call; repeated claims are served from cache. The free tier includes 1,000 verifies/month.
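A minimal sketch of applying the system prompt's ≥ 0.85 rule when rendering the agent's answer, assuming the response envelope carries a best_match object with confidence and detail_url fields as the prompt above implies (check exact field names against the OpenAPI spec linked below):

def render_assertion(claim: str, envelope: dict, threshold: float = 0.85) -> str:
    """Format a claim as cited when verified, or explicitly mark it unverified."""
    match = envelope.get("best_match")
    if match and match.get("confidence", 0.0) >= threshold:
        return f"{claim} [source: {match['detail_url']}]"
    return f"{claim} (unverified)"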
Drop-in integration guides
- OpenAI tool-calls — full pattern with chat-completions loop
- Anthropic SDK — Claude tool_use protocol
- Pydantic AI — type-safe variant
- LangChain — @tool decorator
- DSPy — programs-not-prompts variant
When this fits
- Production agent that answers AI/ML factual questions
- Research-assistant agents over technical literature
- Documentation chatbots over AI/ML knowledge bases
- Multi-step agentic flows where one step is fact lookup
When this doesn't fit (yet)
- Agents that primarily answer non-AI/ML factual questions — our catalog is bounded to AI/ML for v0; other verticals ship in Year 2.
- Agents that need real-time citations (current news) — VERITAS is for verified historical claims, not live data.
Try it
- Browser playground — no signup, paste a claim, see the response
- 5-min Quickstart — full pattern in curl + JS + Python
- OpenAPI 3.1 spec