BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson

TL;DR
BLOOM is a large, open-access multilingual language model with 176 billion parameters, designed to democratize access to advanced NLP technology and achieve competitive performance across diverse tasks.
Contribution
This work introduces BLOOM, a 176B-parameter multilingual language model developed through a collaborative effort, and emphasizes open access and multilingual capabilities.
Findings
BLOOM achieves competitive benchmark performance.
Multitask prompted finetuning improves results.
Models and code are publicly released.
Abstract
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we…
Code & Models
- 🤗 bigscience/bloom (model) · 7.4k dl · ♡ 4989
- 🤗 BelleGroup/BELLE-7B-gptq (model) · 7 dl · ♡ 26
- 🤗 ybelkada/papers (model)
- 🤗 BelleGroup/BELLE_BLOOM_GPTQ_4BIT (model) · 3 dl · ♡ 3
- 🤗 xverse/XVERSE-65B (model) · 43 dl · ♡ 38
- 🤗 xverse/XVERSE-65B-2 (model) · 23 dl · ♡ 10
- 🤗 norallm/norbloom-7b-scratch (model) · 489 dl · ♡ 2
- 🤗 tayyibsupercool/bloom-560m-lora (model)
- 🤗 Aleph-Alpha/Pharia-1-LLM-7B-control (model) · ♡ 69
- 🤗 keras/bloom_560m_multi (model) · 6 dl
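For a sense of scale when choosing between the full model and the smaller or quantized checkpoints listed above, the sketch below estimates how much storage the weights alone would need. It takes the 176B parameter count from the abstract at face value; the numeric formats and the helper name `weight_memory_gb` are illustrative assumptions, not details from the paper, and the estimate ignores activations, optimizer state, and any runtime overhead.

```python
# Rough weight-storage estimate for a 176B-parameter model.
# PARAMS is the nominal count from the abstract; the formats below are
# assumed storage choices for illustration, not taken from the paper.
PARAMS = 176_000_000_000

# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the storage for the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gb(PARAMS, nbytes):.0f} GB")
```

Under these assumptions, 16-bit weights alone come to roughly 352 GB, which is one reason smaller and quantized variants are common in practice.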
Taxonomy
Methods: Attention Is All You Need · BLOOM · Massively multilingual probing based on Universal Dependencies · Label Smoothing · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Linear Layer · Multi-Head Attention · Adam
