LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

TL;DR
LLaMA is a family of open, efficient foundation language models (7B to 65B parameters) trained exclusively on publicly available data. The models match or outperform much larger proprietary models and are released to the research community.
Contribution
The paper presents LLaMA, a new set of open foundation models trained solely on public data, outperforming or matching larger proprietary models on key benchmarks.
Findings
LLaMA-13B outperforms GPT-3 (175B) on most benchmarks.
LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B.
All models are publicly released for research use.
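To put the size gap behind these findings in concrete terms, the short sketch below computes how many times larger each comparison model is than its LLaMA counterpart. It uses only the rounded, nominal parameter counts quoted in the model names above; the dictionary and the helper `size_ratio` are illustrative names, not anything from the paper.

```python
# Nominal parameter counts in billions, taken from the model names in the
# findings above (rounded figures, used here only for illustration).
PARAMS_B = {
    "LLaMA-13B": 13,
    "LLaMA-65B": 65,
    "GPT-3": 175,
    "Chinchilla-70B": 70,
    "PaLM-540B": 540,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS_B[larger] / PARAMS_B[smaller]


# Pairs compared in the findings: (comparison model, LLaMA model).
comparisons = [
    ("GPT-3", "LLaMA-13B"),
    ("Chinchilla-70B", "LLaMA-65B"),
    ("PaLM-540B", "LLaMA-65B"),
]

for larger, smaller in comparisons:
    ratio = size_ratio(larger, smaller)
    print(f"{larger} has {ratio:.1f}x the parameters of {smaller}")
```

Parameter count is only one axis of comparison; the ratios say nothing about training data or compute, so they should be read as rough context for the benchmark claims rather than as a measure of efficiency.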
Abstract
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Peer Reviews
No public reviews on file for this paper yet.
Code & Models
- 🤗 allenai/OLMo-7B · model · 1.7k dl · ♡ 651
- 🤗 allenai/OLMo-1B · model · 3.7k dl · ♡ 108
- 🤗 sardukar/llama13b-4bit-v2 · model · 3 dl · ♡ 5
- 🤗 zirui3/alpaca-med-lora-7b · model
- 🤗 project-baize/baize-lora-7B · model · ♡ 37
- 🤗 project-baize/baize-lora-13B · model · ♡ 13
- 🤗 project-baize/baize-lora-30B · model · ♡ 34
- 🤗 project-baize/baize-healthcare-lora-7B · model · ♡ 18
- 🤗 lmsys/vicuna-13b-delta-v0 · model · 530 dl · ♡ 452
- 🤗 lmsys/vicuna-7b-delta-v0 · model · 231 dl · ♡ 165
Videos
LLaMA: Open and Efficient Foundation Language Models (Paper Explained) · YouTube
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · LLaMA · Linear Layer · Cosine Annealing · Weight Decay · Attention Dropout
