SourceScore

Verified claim · AI-ML · 95% confidence

RedPajama dataset released on: 2023-04-17.

Last verified 2026-05-16 · Methodology veritas-v0.1 · ea8b7be3a49101be

Structured fields

Subject
RedPajama dataset
Predicate
released_on
Object
2023-04-17
Confidence
95%
Tags
redpajama · dataset · pretraining · together · 2023 · open-source
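
The fields above form a subject/predicate/object triple. As a minimal sketch of how a consumer might hold them in Python, assuming a hypothetical `Claim` class (the class and its method names are illustrative and not part of any SourceScore API):

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """Hypothetical container for the structured fields; not a SourceScore type."""
    subject: str
    predicate: str
    obj: str
    confidence: float  # fraction between 0.0 and 1.0
    tags: tuple[str, ...] = ()

    def sentence(self) -> str:
        # Render the triple in the page's wording: "released_on" becomes "released on".
        return f"{self.subject} {self.predicate.replace('_', ' ')}: {self.obj}."


claim = Claim(
    subject="RedPajama dataset",
    predicate="released_on",
    obj="2023-04-17",
    confidence=0.95,
    tags=("redpajama", "dataset", "pretraining", "together", "2023", "open-source"),
)

# The object of this claim is an ISO 8601 date string, so it parses directly.
release_date = date.fromisoformat(claim.obj)
print(claim.sentence())
```

Keeping the object as a string and parsing it on demand lets the same container hold claims whose objects are not dates.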

Sources (2)

  1. [1] official blog · Together AI · 2023-04-17

    RedPajama: An Open Source Recipe to Reproduce LLaMA training dataset
    Today, we release RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens.
  2. [2] github release · Together · 2023-04-17

    togethercomputer/RedPajama-Data — GitHub

Cite this claim

Ready-to-paste citation (Markdown / plain text):

RedPajama dataset released on: 2023-04-17. — SourceScore Claim ea8b7be3a49101be (verified 2026-05-16). https://sourcescore.org/api/v1/claims/ea8b7be3a49101be.json

Embed this claim

Drop this iframe into any blog post, docs page, or knowledge base. The widget renders the signed claim, its primary source, and a link back to this canonical page. Licensed under CC-BY 4.0; attribution is included.

<iframe src="https://sourcescore.org/embed/claim/ea8b7be3a49101be/" width="100%" height="360" frameborder="0" loading="lazy" title="RedPajama dataset released on: 2023-04-17."></iframe>

Programmatic access

Fetch this claim with a signed envelope for verification:

curl https://sourcescore.org/api/v1/claims/ea8b7be3a49101be.json
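
The same request can be sketched in Python using only the standard library. The helper names `claim_url` and `fetch_claim` are hypothetical, and the response is assumed to be a JSON object. Its field layout and signature scheme are not described on this page, so the sketch does not attempt verification.

```python
import json
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://sourcescore.org/api/v1/claims"


def claim_url(claim_id: str) -> str:
    """Build the JSON endpoint URL for a claim ID (hypothetical helper)."""
    return f"{BASE_URL}/{claim_id}.json"


def fetch_claim(claim_id: str, timeout: float = 10.0) -> dict:
    """Fetch and parse the claim envelope; raises on HTTP or JSON errors."""
    with urllib.request.urlopen(claim_url(claim_id), timeout=timeout) as response:
        # Assumption: the body is a JSON object; its schema is not documented here.
        return json.load(response)
```

Calling `fetch_claim("ea8b7be3a49101be")` performs the network request; inspecting the top-level keys of the returned dictionary is a reasonable first step before relying on any particular field.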
