Gemma: Open Models Based on Gemini Research and Technology
Gemma Team: Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev

TL;DR
Gemma introduces lightweight open models based on Gemini research, demonstrating strong performance on language understanding, reasoning, and safety benchmarks, with a focus on responsible release and model development transparency.
Contribution
The paper presents Gemma, a new family of open models with 2B and 7B parameters, built from Gemini research, and evaluates their performance and safety comprehensively.
Findings
Gemma models outperform similarly sized open models on 11 of 18 text-based tasks.
The models show strong language understanding and reasoning capabilities on academic benchmarks.
The paper reports comprehensive safety and responsibility evaluations of the models.
Abstract
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.
Code & Models
- 🤗 google/paligemma-3b-pt-224 (model) · 86k dl · ♡ 426
- 🤗 google/paligemma-3b-mix-448 (model) · 2.9k dl · ♡ 116
- 🤗 google/paligemma-3b-pt-224-jax (model) · 205 dl · ♡ 3
- 🤗 google/paligemma-3b-pt-448-jax (model) · 2 dl · ♡ 2
- 🤗 google/paligemma-3b-pt-896-jax (model) · ♡ 2
- 🤗 google/paligemma-3b-ft-aokvqa-mc-448-jax (model)
- 🤗 google/paligemma-3b-ft-textcaps-224-jax (model)
- 🤗 google/paligemma-3b-ft-widgetcap-448-jax (model) · ♡ 2
- 🤗 google/paligemma-3b-ft-vqav2-448-jax (model) · 1 dl · ♡ 2
- 🤗 google/paligemma-3b-ft-refcoco-seg-448-jax (model) · ♡ 1
Taxonomy
Topics: Multi-Agent Systems and Negotiation
