Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Project Apertus, Alejandro Hernández-Cano, Alexander Hägele, Allen Hao Huang, Angelika Romanou, Antoni-Joan Solergibert, Barna Pasztor, Bettina Messmer, Dhia Garbaya, Eduard Frank Ďurech, Ido Hakimi, Juan García Giraldo, Mete Ismayilzada, Negar Foroutan, Skander Moalla

TL;DR
Apertus introduces fully open, multilingual large language models that prioritize data compliance and transparency, achieving competitive performance while providing comprehensive open artifacts for community use.
Contribution
Apertus is the first open LLM suite that ensures data compliance, filters harmful content, and expands multilingual coverage, with full transparency of development artifacts.
Findings
- Approaches state-of-the-art results on multilingual benchmarks
- Outperforms other open models in multilingual tasks
- Provides transparent access to all development artifacts
Abstract
We present Apertus, a fully open suite of large language models (LLMs) designed to address two systemic shortcomings in today's open model ecosystem: data compliance and multilingual representation. Unlike many prior models that release weights without reproducible data pipelines or regard for content-owner rights, Apertus models are pretrained exclusively on openly available data, retroactively respecting `robots.txt` exclusions and filtering for non-permissive, toxic, and personally identifiable content. To mitigate risks of memorization, we adopt the Goldfish objective during pretraining, strongly suppressing verbatim recall of data while retaining downstream task performance. The Apertus models also expand multilingual coverage, training on 15T tokens from over 1800 languages, with ~40% of pretraining data allocated to non-English content. Released at 8B and 70B scales, Apertus…
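To make the `robots.txt` step concrete, here is a minimal sketch of how a crawled document could be checked against a site's current exclusion rules after the fact. It is not the Apertus pipeline: the user-agent name `ExampleBot`, the helper names `load_rules` and `keep_document`, and the choice to keep documents from hosts with no known `robots.txt` are all assumptions made for illustration. Only Python's standard `urllib.robotparser` is used.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot"  # hypothetical crawler name, not from the paper


def load_rules(robots_by_host):
    """Parse raw robots.txt text for each host into a RobotFileParser."""
    rules = {}
    for host, text in robots_by_host.items():
        parser = RobotFileParser()
        parser.parse(text.splitlines())  # parse from memory, no network access
        rules[host] = parser
    return rules


def keep_document(url, rules, user_agent=USER_AGENT):
    """Return True if the document at `url` may stay in the corpus."""
    host = urlsplit(url).netloc
    parser = rules.get(host)
    if parser is None:
        # Assumption: a host with no known robots.txt imposes no exclusion.
        return True
    return parser.can_fetch(user_agent, url)


# Illustrative data: one host that disallows a /private/ subtree.
rules = load_rules({"example.org": "User-agent: *\nDisallow: /private/\n"})
corpus_urls = [
    "https://example.org/public/page.html",
    "https://example.org/private/page.html",
    "https://other.example/page.html",
]
kept = [u for u in corpus_urls if keep_document(u, rules)]
```

In this sketch `kept` retains the public page and the page from the host without rules, and drops the page under `/private/`. A real pipeline would also need to decide which snapshot of each `robots.txt` to apply and which user agents to honor, and the sketch leaves both open.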
Code & Models
- 🤗 swiss-ai/Apertus-8B-Instruct-2509 · model · 164k dl · ♡ 444
- 🤗 swiss-ai/Apertus-70B-Instruct-2509 · model · 5.0k dl · ♡ 184
- 🤗 swiss-ai/Apertus-8B-2509 · model · 15k dl · ♡ 157
- 🤗 AWuhrmann/Apertus-70B-Instruct-2509-heretic-v1 · model · ♡ 1
- 🤗 swiss-ai/Apertus-70B-2509 · model · 4.4k dl · ♡ 145
- 🤗 unsloth/Apertus-8B-Instruct-2509 · model · 104 dl
- 🤗 redponike/Apertus-8B-Instruct-2509-GGUF · model · 58 dl · ♡ 1
- 🤗 redponike/Apertus-70B-Instruct-2509-GGUF · model · 133 dl · ♡ 1
- 🤗 unsloth/Apertus-8B-Instruct-2509-GGUF · model · 1.7k dl · ♡ 16
- 🤗 unsloth/Apertus-70B-Instruct-2509-GGUF · model · 1.2k dl · ♡ 14
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Language and cultural evolution
