SourceScore

Integration guide

LangChain + SourceScore VERITAS

Wire signed-claim retrieval into your LangChain pipeline. Verify model-generated assertions against a catalog of facts that ship with HMAC-SHA256 signatures and ≥2 primary sources each.

When to use this

Two patterns. Both are drop-in additions to an existing chain.

  1. Retrieve-then-cite: fetch the most relevant VERITAS claims for the user's query, render them as context, and instruct the model to cite the claim id with every fact it asserts.
  2. Generate-then-verify: let the model answer freely, then post-process the response by extracting atomic claims, sending each to /api/v1/verify, attaching a confidence and citation badge to each, and flagging unmatched assertions.

Install

# Python
pip install langchain langchain-openai requests

# JavaScript
npm install @langchain/core @langchain/openai

Pattern 1 — Retrieve-then-cite (Python)

The VERITAS catalog acts as a curated retriever. Search returns the top-N matching claims; we render them as context blocks the LLM is instructed to cite from.

import os, requests
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

VERITAS = "https://sourcescore.org/api/v1"

def veritas_retrieve(query: str, k: int = 5) -> str:
    """Fetch top-k VERITAS claims for a query, render as numbered context."""
    r = requests.get(f"{VERITAS}/search", params={"q": query, "limit": k}, timeout=8)
    r.raise_for_status()
    claims = r.json().get("matches", [])
    if not claims:
        return "(no VERITAS claims match this query)"
    lines = []
    for i, c in enumerate(claims, 1):
        lines.append(
            f"[{i}] {c['statement']} "
            f"(claim_id={c['id']}, confidence={c['confidence']:.2f}, "
            f"sources={c['sourceCount']})"
        )
    return "\n".join(lines)

prompt = ChatPromptTemplate.from_template("""You are a precise assistant. Answer the user's question
using ONLY the verified claims below. Cite every fact with [claim_id]. If the
claims do not cover the question, say so explicitly — do not improvise.

Verified claims:
{context}

Question: {question}

Answer (every fact must end with [claim_id]):""")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = (
    # extract the question string; RunnablePassthrough() would forward the whole input dict
    {"context": lambda x: veritas_retrieve(x["question"]),
     "question": lambda x: x["question"]}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke({"question": "When was the Transformer architecture introduced?"}))

The prompt now instructs the model to attach a claim_id to every assertion. Any statement without one is a hallucination signal: surface it as a UI warning or strip it from the output automatically.
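A minimal post-processing sketch for that check. The bracketed [claim_id] format and the one-fact-per-line split are assumptions based on the prompt above; adjust the regex to your catalog's actual id scheme:

```python
import re

# Hypothetical id format: letters, digits, hyphens, underscores inside brackets
# at the end of a line. Tune this to the real claim-id scheme.
CITED = re.compile(r"\[(?P<claim_id>[A-Za-z0-9_-]+)\]\s*$")

def split_cited(answer: str) -> tuple[list[str], list[str]]:
    """Split an answer into cited and uncited lines.

    A line counts as 'cited' when it ends with a [claim_id] marker,
    as the Pattern 1 prompt requests."""
    cited, uncited = [], []
    for line in filter(None, (s.strip() for s in answer.split("\n"))):
        (cited if CITED.search(line) else uncited).append(line)
    return cited, uncited

cited, uncited = split_cited(
    "The Transformer was introduced in 2017. [vx-0012]\nIt is very popular."
)
```

Uncited lines are the ones to warn on or drop before rendering.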

Pattern 2 — Generate-then-verify (JavaScript)

Useful when you want the model's free-form output but need a confidence layer. Each assertion gets a verification badge from VERITAS before the user sees the final response.

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const VERITAS = "https://sourcescore.org/api/v1";

async function verifyClaim(text) {
  const r = await fetch(`${VERITAS}/verify`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ claim: text, minConfidence: 0.85 }),
  });
  return r.json();
}

// Step 1 — model generates answer
const llm = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });
const prompt = ChatPromptTemplate.fromTemplate(`
Answer the user's question with one fact per line. Be concise.

Question: {question}
`);
const answer = await prompt.pipe(llm).invoke({
  question: "When did OpenAI release GPT-4?",
});

// Step 2 — verify each line
const lines = answer.content.split("\n").filter(Boolean);
const verified = [];
for (const line of lines) {
  const v = await verifyClaim(line);
  verified.push({
    statement: line,
    matched: !!v.bestMatch,
    confidence: v.bestMatch?.confidence ?? 0,
    veritasId: v.bestMatch?.id,
    sourceUrl: v.bestMatch ? `https://sourcescore.org/claims/${v.bestMatch.id}/` : null,
  });
}

// Step 3 — render with badges
for (const r of verified) {
  const badge = r.matched ? `✅ [${r.veritasId}]` : "⚠️ unverified";
  console.log(`${r.statement.trim()} ${badge}`);
}

UI suggestion: render verified lines in the normal answer style, and unverified lines with a yellow underline + tooltip linking to a "submit verification request" page. This operationalizes hallucination discovery as user feedback.

Pattern 3 — Verify the signature (defensive)

High-stakes deployments should re-verify the HMAC-SHA256 signature locally before trusting a claim envelope. HMAC is symmetric, so there is no public key: verification means re-computing the digest with the shared signing secret (provisioned on the Enterprise tier) and comparing it, in constant time, against the envelope's signature value.

import os, hmac, hashlib, json, requests

SECRET = os.environ["SOURCESCORE_SIGNING_SECRET"]  # given on Enterprise tier
VERITAS = "https://sourcescore.org/api/v1"

def verify_envelope(envelope: dict) -> bool:
    """Re-compute HMAC-SHA256 over canonical-JSON of the claim + signature
    metadata, compare constant-time to envelope.signature.value."""
    claim = envelope["claim"]
    sig   = envelope["signature"]
    # canonical: sorted keys, ASCII-safe, no extra whitespace
    payload = json.dumps(
        {**claim, "signedAt": sig["signedAt"], "signedBy": sig["signedBy"]},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()
    expected = hmac.new(SECRET.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig["value"])

env = requests.get(f"{VERITAS}/claims/<claim_id>.json", timeout=8).json()
assert verify_envelope(env), "VERITAS signature mismatch — do not trust"
print("ok — claim is genuine + unmodified")

Free and Indie tiers can cross-check against the public catalog JSON, which has the same envelope shape. The shared secret is only required when your own systems need to generate an additional signature (an Enterprise-tier feature).
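Without the secret, a Free- or Indie-tier cross-check can still compare the claim body of a cached envelope against the live catalog copy, using the same canonical-JSON rules as the signing payload. A minimal sketch (the envelope field names follow the example above; treat them as assumptions):

```python
import json

def canonical(obj: dict) -> str:
    """Canonical JSON, matching the signing rules: sorted keys, no extra whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def same_claim(local: dict, live: dict) -> bool:
    """True when two envelopes carry byte-identical claim bodies.

    Fetch `live` from the public catalog, e.g.
    requests.get(f"{VERITAS}/claims/<claim_id>.json", timeout=8).json()"""
    return canonical(local["claim"]) == canonical(live["claim"])

# Key order and signature fields don't affect the comparison; only the claim body does.
ok = same_claim(
    {"claim": {"id": "vx-0012", "statement": "x"}, "signature": {"value": "a"}},
    {"claim": {"statement": "x", "id": "vx-0012"}, "signature": {"value": "b"}},
)
```

This detects tampering with the claim body but, unlike the HMAC check, cannot prove the catalog copy itself is authentic.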

Choosing a pattern

Use case                    | Pattern                          | Latency
Educational Q&A bot         | Retrieve-then-cite               | ~150ms search + LLM time
Search auto-complete        | Retrieve only (skip LLM)         | <100ms
Internal research assistant | Generate-then-verify             | LLM + N × ~80ms verify
High-stakes citation badge  | Generate-then-verify + signature | LLM + N × ~80ms verify + ~1ms signature compute

Cost model

Each VERITAS request counts as one call against your monthly quota. A search response with N matches still counts as ONE call, regardless of N; a verify request likewise counts as ONE call.

  • Free: 1,000 calls/month — sufficient for prototyping
  • Indie €19/mo: 25,000 calls/month — solo apps + small teams
  • Startup €99/mo: 250,000 calls/month — Series-A class
  • Scale €499/mo: 2,500,000 calls/month — high-throughput

See pricing for full tier comparison.
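A back-of-the-envelope sizing sketch. The traffic numbers are illustrative assumptions; only the accounting rules (one call per search, one per verify) come from the quota description above:

```python
# Monthly quotas per tier, from the pricing list above.
TIERS = {"Free": 1_000, "Indie": 25_000, "Startup": 250_000, "Scale": 2_500_000}

def monthly_calls(answers_per_day: int, lines_per_answer: int,
                  searches_per_day: int = 0, days: int = 30) -> int:
    """Generate-then-verify costs one verify call per answer line;
    each search is one call regardless of how many matches it returns."""
    return days * (answers_per_day * lines_per_answer + searches_per_day)

# Hypothetical workload: 500 answers/day, ~6 verified lines each.
needed = monthly_calls(answers_per_day=500, lines_per_answer=6)
tier = next(name for name, quota in TIERS.items() if quota >= needed)
```

Here `needed` works out to 90,000 calls/month, which lands in the Startup tier.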

What VERITAS is not

We are deliberately not a generic fact-checker. The Day 1 catalog (91 claims today) covers AI/ML research — model releases, foundational papers, organizations, datasets. If your chain asks about "the capital of France" we will return no matches and your code should fall through to whatever retrieval you'd use anyway.

Catalog expansion is gated by our verification methodology (≥2 primary sources, verbatim excerpts, no performance-comparison claims). New verticals (cybersecurity, data engineering, scientific computing) ship in Y2.

Next steps