LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron; Thibaut Lavril; Gautier Izacard; Xavier Martinet; Marie-Anne Lachaux; Timothée Lacroix; Baptiste Rozière; Naman Goyal; Eric Hambro; Faisal Azhar; Aurelien Rodriguez; Armand Joulin; Edouard Grave; Guillaume Lample

arXiv:2302.13971 · cs.CL · February 28, 2023 · 3.9k cites

TL;DR

LLaMA introduces a series of open, efficient foundation language models trained on publicly available data, achieving competitive performance with larger proprietary models and promoting accessible AI research.

Contribution

The paper presents LLaMA, a new set of open foundation models trained solely on public data, outperforming or matching larger proprietary models on key benchmarks.

Findings

01. LLaMA-13B outperforms GPT-3 (175B) on most benchmarks.

02. LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B.

03. All models are publicly released for research use.

Abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Code & Models

Videos

LLaMA: Open and Efficient Foundation Language Models (Paper Explained) · YouTube

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Multi-Head Attention · Attention Is All You Need · LLaMA · Linear Layer · Cosine Annealing · Weight Decay · Attention Dropout