Salamandra Technical Report
Aitor Gonzalez-Agirre, Marc P\`amies, Joan Llop, Irene Baucells,, Severino Da Dalt, Daniel Tamayo, Jos\'e Javier Saiz, Ferran Espu\~na, Jaume, Prats, Javier Aula-Blasco, Mario Mina, I\~nigo Pikabea, Adri\'an Rubio,, Alexander Shvets, Anna Sall\'es, I\~naki Lacunza, Jorge Palomar

TL;DR
Salamandra is an open-source suite of multilingual, decoder-only large language models trained on diverse European languages and code, with supplementary fine-tuned chat models and initial multimodal experiments, promoting transparency and open science.
Contribution
This work introduces Salamandra, a new open-source multilingual LLM suite with detailed training, evaluation, and open access, including models, scripts, and preliminary multimodal capabilities.
Findings
Salamandra models achieve competitive performance on multilingual benchmarks.
Extensive evaluations cover downstream tasks, bias, and safety.
Open-source release fosters research and commercial use.
Abstract
This work introduces Salamandra, a suite of open-source decoder-only large language models available in three different sizes: 2, 7, and 40 billion parameters. The models were trained from scratch on highly multilingual data that comprises text in 35 European languages and code. Our carefully curated corpus is made exclusively from open-access data compiled from a wide variety of sources. Along with the base models, supplementary checkpoints that were fine-tuned on public-domain instruction data are also released for chat applications. Additionally, we also share our preliminary experiments on multimodality, which serve as proof-of-concept to showcase potential applications for the Salamandra family. Our extensive evaluations on multilingual benchmarks reveal that Salamandra has strong capabilities, achieving competitive performance when compared to similarly sized open-source models.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
- 🤗BSC-LT/salamandra-7b-instructmodel· 81k dl· ♡ 7781k dl♡ 77
- 🤗BSC-LT/salamandra-7bmodel· 355 dl· ♡ 29355 dl♡ 29
- 🤗BSC-LT/salamandra-2bmodel· 1.3k dl· ♡ 251.3k dl♡ 25
- 🤗BSC-LT/salamandra-2b-instructmodel· 6.3k dl· ♡ 276.3k dl♡ 27
- 🤗BSC-LT/ALIA-40bmodel· 166 dl· ♡ 88166 dl♡ 88
- 🤗langtech-languagemodeling/IberianLLM-7B-Instructmodel· 23 dl· ♡ 623 dl♡ 6
- 🤗BSC-LT/salamandra-7b-visionmodel· 9 dl· ♡ 49 dl♡ 4
- 🤗BSC-LT/Salamandra-VL-7B-2512model· 57 dl· ♡ 357 dl♡ 3
- 🤗SINAI/ALIA-legal-administrative-7B-Basemodel
- 🤗BSC-LT/ALIA-40b-instruct-2601model· 7.7k dl· ♡ 57.7k dl♡ 5
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSouth Asian Studies and Conflicts
MethodsBalanced Selection
