Gemma: Open Models Based on Gemini Research and Technology

Gemma Team: Thomas Mesnard; Cassidy Hardin; Robert Dadashi; Surya Bhupatiraju; Shreya Pathak; Laurent Sifre; Morgane Rivière; Mihir Sanjay Kale; Juliette Love; Pouya Tafti; Léonard Hussenot; Pier Giuseppe Sessa; Aakanksha Chowdhery; Adam Roberts; Aditya Barua; Alex Botev; Alex Castro-Ros; Ambrose Slone; Amélie Héliou; Andrea Tacchetti; Anna Bulanova; Antonia Paterson; Beth Tsai; Bobak Shahriari; Charline Le Lan; Christopher A. Choquette-Choo; Clément Crepy; Daniel Cer; Daphne Ippolito; David Reid; Elena Buchatskaya; Eric Ni; Eric Noland; Geng Yan; George Tucker; George-Christian Muraru; Grigory Rozhdestvenskiy; Henryk Michalewski; Ian Tenney; Ivan Grishchenko; Jacob Austin; James Keeling; Jane Labanowski; Jean-Baptiste Lespiau; Jeff Stanway; Jenny Brennan; Jeremy Chen; Johan Ferret; Justin Chiu; Justin Mao-Jones; Katherine Lee; Kathy Yu; Katie Millican; Lars Lowe Sjoesund; Lisa Lee; Lucas Dixon; Machel Reid; Maciej Mikuła; Mateo Wirth; Michael Sharman; Nikolai Chinaev; Nithum Thain; Olivier Bachem; Oscar Chang; Oscar Wahltinez; Paige Bailey; Paul Michel; Petko Yotov; Rahma Chaabouni; Ramona Comanescu; Reena Jana; Rohan Anil; Ross McIlroy; Ruibo Liu; Ryan Mullins; Samuel L Smith; Sebastian Borgeaud; Sertan Girgin; Sholto Douglas; Shree Pandya; Siamak Shakeri; Soham De; Ted Klimenko; Tom Hennigan; Vlad Feinberg; Wojciech Stokowiec; Yu-hui Chen; Zafarali Ahmed; Zhitao Gong; Tris Warkentin; Ludovic Peran; Minh Giang; Clément Farabet; Oriol Vinyals; Jeff Dean; Koray Kavukcuoglu; Demis Hassabis; Zoubin Ghahramani; Douglas Eck; Joelle Barral; Fernando Pereira; Eli Collins; Armand Joulin; Noah Fiedel; Evan Senter; Alek Andreev; Kathleen Kenealy

arXiv:2403.08295·cs.CL·April 17, 2024·224 cites

TL;DR

Gemma introduces lightweight open models based on Gemini research, demonstrating strong performance on language understanding, reasoning, and safety benchmarks, with a focus on responsible release and model development transparency.

Contribution

The paper presents Gemma, a new family of open models with 2B and 7B parameters, built from Gemini research, and evaluates their performance and safety comprehensively.

Findings

1. Gemma outperforms similarly sized open models on 11 of 18 text-based tasks.

2. The models perform strongly on academic benchmarks for language understanding and reasoning.

3. The paper reports comprehensive evaluations of the models' safety and responsibility aspects.

Abstract

This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.
