Verified claim · AI-ML · 100% confidence
BERT (Bidirectional Encoder Representations from Transformers) introduced in paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018).
Last verified 2026-05-16 · Methodology veritas-v0.1 · 4c1ee70007dc89c1
Structured fields
- Subject
- BERT (Bidirectional Encoder Representations from Transformers)
- Predicate
- introduced_in_paper
- Object
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018)
- Confidence
- 100%
- Tags
- bert · foundational · devlin · 2018 · google · nlp
Sources (2)
[1] preprint · arXiv (Devlin, Chang, Lee, Toutanova) · 2018-10-11
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
“We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.”
[2] peer-reviewed · Association for Computational Linguistics · 2019-06-02
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL 2019 proceedings)
Cite this claim
Ready-to-paste citation (Markdown / plain text):
BERT (Bidirectional Encoder Representations from Transformers) introduced in paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018). — SourceScore Claim 4c1ee70007dc89c1 (verified 2026-05-16). https://sourcescore.org/api/v1/claims/4c1ee70007dc89c1.json
Embed this claim
Drop this iframe into any blog post, docs page, or knowledge base. The widget renders the signed claim + primary source + click-through to this canonical page. CC-BY 4.0; attribution included.
<iframe src="https://sourcescore.org/embed/claim/4c1ee70007dc89c1/" width="100%" height="360" frameborder="0" loading="lazy" title="BERT (Bidirectional Encoder Representations from Transformers) introduced in paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018)."></iframe>
Related claims
Other verified claims sharing tags with this one, useful for LLM retrieval graphs and citation discovery; a small graph-building sketch follows the list.
Word2Vec introduced in paper: Efficient Estimation of Word Representations in Vector Space (Mikolov et al., 2013).
4978f76d228a3db1 · 100% confidence · shares 3 tags (foundational, google, nlp)
SentencePiece tokenizer introduced in paper: SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing (Kudo & Richardson, 2018).
0d47bb8eb637a2e4 · 100% confidence · shares 3 tags (google, foundational, 2018)
T5 (Text-to-Text Transfer Transformer) introduced in paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Raffel et al., 2019).
ef28341c3b308737 · 100% confidence · shares 2 tags (foundational, google)
Sparsely-Gated Mixture-of-Experts (MoE) introduced in paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Shazeer et al., 2017).
2d6d7f61f1db6493 · 100% confidence · shares 2 tags (foundational, google)
Switch Transformer introduced in paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (Fedus et al., 2021).
3d9c14b9379038c9 · 100% confidence · shares 2 tags (foundational, google)
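As a minimal sketch of how this shared-tag metadata could seed a retrieval graph, the Python snippet below links claim IDs whenever their tag sets overlap. The claim IDs and tags are copied verbatim from this page; the graph construction itself is an illustrative assumption, not something the SourceScore API returns.

# A tiny tag-overlap graph built from the claims listed on this page.
# Note: for the related claims, only the tags they share with this claim
# are shown above, so edges between related claims may undercount overlap.
from itertools import combinations

claims = {
    "4c1ee70007dc89c1": {"bert", "foundational", "devlin", "2018", "google", "nlp"},  # BERT (this claim)
    "4978f76d228a3db1": {"foundational", "google", "nlp"},    # Word2Vec
    "0d47bb8eb637a2e4": {"google", "foundational", "2018"},   # SentencePiece
    "ef28341c3b308737": {"foundational", "google"},           # T5
    "2d6d7f61f1db6493": {"foundational", "google"},           # Sparsely-Gated MoE
    "3d9c14b9379038c9": {"foundational", "google"},           # Switch Transformer
}

# Edge weight = number of shared tags; reproduces the "shares N tags" counts above.
edges = {}
for a, b in combinations(claims, 2):
    shared = claims[a] & claims[b]
    if shared:
        edges[(a, b)] = shared

for (a, b), shared in sorted(edges.items(), key=lambda kv: -len(kv[1])):
    print(f"{a} <-> {b}: shares {len(shared)} tag(s): {', '.join(sorted(shared))}")

Running this reproduces the "shares N tags" counts between this claim and each related claim; edges among the related claims themselves would need their full tag lists from the API.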
Programmatic access
Fetch this claim with a signed envelope for verification:
curl https://sourcescore.org/api/v1/claims/4c1ee70007dc89c1.json
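A minimal Python equivalent of the curl call above, assuming only that the endpoint returns JSON. The field names probed at the end (claim, confidence, signature) are guesses at the envelope schema, not documented API fields; the printed envelope shows the real keys.

# Fetch the signed claim envelope and inspect it. Only the URL is taken
# from this page; the field names below are assumptions.
import json
import urllib.request

URL = "https://sourcescore.org/api/v1/claims/4c1ee70007dc89c1.json"

with urllib.request.urlopen(URL, timeout=10) as resp:
    envelope = json.load(resp)

# Dump the whole envelope first, since the schema is not documented here.
print(json.dumps(envelope, indent=2))

# Hypothetical field names; adjust to whatever the real envelope contains.
for field in ("claim", "confidence", "signature"):
    value = envelope.get(field)
    if value is not None:
        print(f"{field}: {value}")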