Customer-support chatbot grounding — stop bots from hallucinating product facts
Production support bots hallucinate pricing, release dates, and integration details. Add a verification layer on top of RAG over your docs, so users see accurate citations or an honest 'I'm not sure'.
The problem
Customer support chatbots hallucinate things customers actually care about: pricing tiers, integration partners, rate limits, supported regions, refund policies, model context windows. A bot telling a paying customer "Sure, your Pro tier supports unlimited API calls" when the real answer is 50k/month creates a billing dispute + trust collapse + maybe a chargeback.
Standard RAG over your help docs catches most factual questions, but the tail of failure modes — wrong number on the right doc, fabricated integration, made-up rate limit — is what dings trust.
The pattern
Three-layer architecture:
- RAG over your docs. Index your help center, API docs, pricing pages. Standard retrieval.
- Verify atomic claims in the response. Extract assertions (prices, limits, dates, features) and check each against a verified-claim catalog. For your product's claims, you maintain the catalog. For AI/ML claims (if you're an AI-adjacent product), use SourceScore VERITAS.
- Decline unverifiable claims. If the bot would emit a claim and verification fails, the bot says "I'm not sure — let me get a human." The cost of declining is much lower than the cost of being wrong.
Implementation sketch
# Two-catalog setup: your own product facts + SourceScore VERITAS

# 1. Build your own verified-claim catalog of product facts
#    (pricing, rate limits, feature support, etc.).
#    Update it whenever pricing or features change.
#    A JSON file or a simple key-value DB is enough.
PRODUCT_FACTS = {
    "pro_tier_monthly_calls": "50,000 API calls per month",
    "scale_tier_monthly_calls": "500,000 API calls per month",
    "stripe_billing_supported": True,
    "free_tier_signup_required": False,
    # ...
}
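# Hedged sketch of the product-facts check used in step 3 below;
# verify_against_product_facts is not defined elsewhere in this post,
# so treat this as one possible implementation, not the canonical one.
# Assumption: a claim passes only if a catalog value appears verbatim
# in the claim text, and everything else fails closed. Real matching
# should normalize numbers ("50k" vs "50,000") and cover the boolean
# facts too.
def verify_against_product_facts(claim, facts: dict) -> bool:
    text = claim.text.lower()
    return any(
        isinstance(value, str) and value.lower() in text
        for value in facts.values()
    )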
# 2. For AI/ML factual claims (if the user asks about Llama 3.1,
#    Claude, etc.), check against SourceScore VERITAS.
import httpx

async def verify_aiml_claim(claim_text: str) -> dict | None:
    # Returns the best catalog match, or None if the claim cannot be
    # verified (no match, timeout, or API error). Callers treat None
    # as "unverified".
    try:
        async with httpx.AsyncClient(timeout=2.0) as client:
            r = await client.post(
                "https://sourcescore.org/api/v1/verify",
                json={"claim": claim_text, "minConfidence": 0.85},
            )
            r.raise_for_status()
    except httpx.HTTPError:
        return None
    return r.json().get("bestMatch")
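# Hedged sketch of the claim extractor used in step 3. In production
# this is usually a second, cheap LLM call that returns one atomic
# claim per line; the sentence split below is a stand-in. The Claim
# class and its matches_product_pattern heuristic are assumptions,
# not part of any library.
import re
from dataclasses import dataclass

@dataclass
class Claim:
    text: str

    def matches_product_pattern(self) -> bool:
        # Assumption: product claims mention prices, tiers, or API limits
        return bool(re.search(r"€|\$|tier|API calls", self.text, re.I))

def extract_atomic_claims(draft: str) -> list[Claim]:
    # Keep only sentences that assert something checkable
    # (a number, a price, a currency amount)
    sentences = re.split(r"(?<=[.!?])\s+", draft)
    return [Claim(s) for s in sentences if re.search(r"\d|€|\$", s)]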
# 3. In the bot's response pipeline (rag and llm are placeholders
#    for your retrieval and generation clients):
async def respond(user_question: str) -> str:
    # Standard RAG
    retrieved = await rag.retrieve(user_question, k=5)
    draft = await llm.generate(user_question, context=retrieved)

    # Extract atomic claims from the draft and verify each one
    claims = extract_atomic_claims(draft)
    verified, unverified = [], []
    for c in claims:
        if c.matches_product_pattern():
            ok = verify_against_product_facts(c, PRODUCT_FACTS)
        else:
            ok = await verify_aiml_claim(c.text) is not None
        (verified if ok else unverified).append(c)

    if unverified:
        # Don't ship a response that contains unverified claims
        return (
            "I'm not 100% certain about one or more facts in my "
            "answer. Let me transfer you to a human teammate."
        )
    return draft  # all claims verified

What this catches
- Wrong pricing. Bot says "€199/month" when the actual price is "€499/month" — product-facts catalog catches it.
- Hallucinated integrations. Bot says "Yes, we integrate with Zapier" when you don't — catalog catches it.
- Wrong AI/ML facts. Bot says "Llama 3.1 has 70B parameters" when the user asked about the 405B variant — VERITAS catches it.
- Stale info. Bot uses 2-year-old training data for current pricing — catalog (which you update on pricing changes) catches it.
The escape valve: route to human
The bot doesn't need to answer everything. Routing to a human for unverifiable claims is a feature, not a bug. Production support chatbots that resolve 60-80% of queries with high accuracy beat ones that resolve 95% with 10% incorrect answers — because the 10% generates more support load + trust damage than the 30% routed to humans would have.
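A back-of-envelope comparison makes the point. The per-incident costs and rates below are assumptions for illustration, not measured data:

# Illustrative numbers only: assumed per-incident costs, not benchmarks
QUERIES = 1_000
COST_WRONG_ANSWER = 25.0   # refund, escalation, trust damage ($)
COST_HUMAN_HANDOFF = 4.0   # a routine ticket ($)

# Bot A: resolves 95% of queries, but 10% of all queries get a wrong answer
cost_a = QUERIES * (0.10 * COST_WRONG_ANSWER + 0.05 * COST_HUMAN_HANDOFF)

# Bot B: resolves 70% correctly and routes the remaining 30% to humans
cost_b = QUERIES * (0.30 * COST_HUMAN_HANDOFF)

print(cost_a, cost_b)  # 2700.0 vs 1200.0: declining and routing wins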
Free-tier economics
- SourceScore VERITAS free tier: 1,000 verifies/month, no signup. Probably enough for low-volume product support, especially with a small cache of repeat verifications (sketch below).
- ~80ms per VERITAS call. Adds <100ms to bot response time.
- Your product-facts catalog: cost = engineering time to maintain (small).
- Paid tiers start at €19/month if your bot does >1k/mo AI/ML claim verifications.
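If you're brushing against the 1,000/month ceiling, a small in-memory cache over verify_aiml_claim keeps repeat questions from burning quota. A minimal sketch, assuming claims recur with only whitespace or case differences; persist it to Redis or similar if your bot restarts often:

# Cache wrapper around verify_aiml_claim (sketch): repeat claims hit
# the dict instead of the API, stretching the free tier
_verify_cache: dict[str, dict | None] = {}

async def verify_aiml_claim_cached(claim_text: str) -> dict | None:
    key = " ".join(claim_text.lower().split())  # normalize for cache hits
    if key not in _verify_cache:
        _verify_cache[key] = await verify_aiml_claim(claim_text)
    return _verify_cache[key]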
Related
- AI agent grounding — broader agent pattern
- RAG pipeline verification — closing the right-doc-wrong-number gap
- LLM grounding concept pillar
- Six grounding strategies blog post