LIMA: Less Is More for Alignment
Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

TL;DR
LIMA shows that a large language model can achieve high performance and generalization with minimal instruction tuning, relying mainly on pretraining, and can even outperform some models trained with reinforcement learning.
Contribution
This paper demonstrates that extensive instruction tuning and reinforcement learning are not strictly necessary for high-quality language model performance, emphasizing the importance of pretraining.
Findings
LIMA performs well with only 1,000 curated prompts and responses.
LIMA tends to generalize to tasks that did not appear in its training data.
In a controlled human study, LIMA responses are judged equivalent to or strictly preferred over GPT-4 responses in 43% of cases.
Abstract
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a…
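The "standard supervised loss" mentioned in the abstract is next-token cross-entropy. The following is a minimal sketch of that loss in plain Python, not the authors' training code. The function name `sft_loss`, its arguments, and the choice to mask out prompt positions so that only response tokens contribute are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def sft_loss(logits, target_ids, loss_mask):
    """Mean next-token cross-entropy over positions where loss_mask is 1.

    logits:     list of per-position score lists (one score per vocabulary item)
    target_ids: the correct next-token id at each position
    loss_mask:  0/1 flags per position; masking prompt tokens is an assumption
    """
    total, count = 0.0, 0
    for scores, target, keep in zip(logits, target_ids, loss_mask):
        if not keep:
            continue  # skip positions excluded from the loss (e.g. the prompt)
        # Numerically stable log-sum-exp: subtract the max before exponentiating.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-likelihood of the target token under the softmax.
        total += log_z - scores[target]
        count += 1
    if count == 0:
        raise ValueError("loss_mask selects no positions")
    return total / count
```

With uniform scores over a three-token vocabulary, for example `sft_loss([[0.0, 0.0, 0.0]], [1], [1])`, the loss is ln 3 (about 1.0986), the value expected from a model that has learned nothing about the target.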
Code & Models
- 🤗 jkeisling/laura-openllama13b-600bt-ggml (model) · ♡ 2
- 🤗 nxnhjrjtbjfzhrovwl/limarp-llama2-ggml-f16 (model) · ♡ 4
- 🤗 Alpha-VLLM/LLaMA2-Accessory (model) · ♡ 38
- 🤗 pankajmathur/Lima_Unchained_70b (model) · 1.1k dl · ♡ 5
- 🤗 nxnhjrjtbjfzhrovwl/limarp-llongma2-8k-ggml-f16 (model)
- 🤗 nxnhjrjtbjfzhrovwl/limarp-llongma2-8k-gguf-f16 (model) · 4 dl · ♡ 2
- 🤗 DominikLindorfer/SQL-LLaMA (model)
- 🤗 Markr-AI/COKAL-DPO-13b-v2 (model) · 13 dl · ♡ 9
- 🤗 Markr-AI/Gukbap-Mistral-7B (model) · 16 dl · ♡ 3
- 🤗 Markr-AI/Gukbap-Qwen2-7B (model) · 10 dl · ♡ 2
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Absolute Position Encodings
