LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji

TL;DR
This paper introduces LaMini-LM, a set of small, diverse models distilled from large instruction-tuned LLMs, achieving competitive performance across multiple NLP benchmarks with significantly reduced size.
Contribution
We develop a large, diverse instruction dataset and fine-tune a herd of small models, demonstrating effective knowledge distillation from large LLMs into compact models.
Findings
LaMini-LM models perform comparably to larger baselines.
The instruction dataset covers a broad set of topics, and the authors' analysis confirms its diversity.
Small models achieve high performance on multiple NLP tasks.
Abstract
Large language models (LLMs) with instruction fine-tuning demonstrate superior generative capabilities. However, these models are resource-intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much smaller ones. To this end, we carefully develop a large set of 2.58M instructions based on both existing and newly-generated instructions. In addition to being sizable, we design our instructions to cover a broad set of topics to ensure diversity. Extensive analysis of our instruction dataset confirms its diversity, and we generate responses for these instructions using gpt-3.5-turbo. Leveraging these instructions, we fine-tune a diverse herd of models, collectively referred to as LaMini-LM, which includes models from both the encoder-decoder and decoder-only families, with varying sizes. We evaluate the performance of our models using automatic…
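The pipeline the abstract describes (collect instructions, have a teacher model write the responses, then fine-tune small students on the resulting pairs) can be sketched roughly as below. This is a minimal illustration, not the authors' code: `stub_teacher` stands in for a call to gpt-3.5-turbo, and the `Example` class, the helper names, and the `PROMPT` template are all hypothetical. The only point taken from the paper is that the same instruction-response pairs feed both encoder-decoder and decoder-only students, which need the data laid out differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    instruction: str
    response: str  # written by the teacher model, not by a human


def build_distillation_set(
    instructions: List[str], teacher: Callable[[str], str]
) -> List[Example]:
    """Pair every instruction with the teacher's response."""
    return [Example(text, teacher(text)) for text in instructions]


def to_seq2seq(ex: Example) -> Dict[str, str]:
    """Encoder-decoder students: instruction is the input, response the target."""
    return {"input": ex.instruction, "target": ex.response}


# Hypothetical prompt template; the source does not specify one.
PROMPT = "### Instruction:\n{instruction}\n\n### Response:\n"


def to_causal(ex: Example) -> Dict[str, object]:
    """Decoder-only students: one text sequence, prompt followed by response.

    prompt_chars records where the response starts, so a training loop
    could restrict the loss to the response part if desired.
    """
    prompt = PROMPT.format(instruction=ex.instruction)
    return {"text": prompt + ex.response, "prompt_chars": len(prompt)}


def stub_teacher(instruction: str) -> str:
    """Placeholder for a real teacher call (assumption: no network here)."""
    return f"[teacher answer to: {instruction}]"


pairs = build_distillation_set(["Name three primary colors."], stub_teacher)
seq2seq_rows = [to_seq2seq(ex) for ex in pairs]
causal_rows = [to_causal(ex) for ex in pairs]
```

In a real run, `stub_teacher` would be replaced by an API client, and the two row formats would be passed to whatever fine-tuning loop is used for each model family.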
Code & Models
- 🤗 MBZUAI/LaMini-Flan-T5-77M (model) · 9.3k downloads · ♡ 26
- 🤗 MBZUAI/LaMini-Flan-T5-248M (model) · 1.0k downloads · ♡ 81
- 🤗 MBZUAI/LaMini-T5-61M (model) · 361 downloads · ♡ 18
- 🤗 MBZUAI/LaMini-Cerebras-256M (model) · 155 downloads · ♡ 4
- 🤗 MBZUAI/LaMini-Cerebras-590M (model) · 39 downloads · ♡ 7
- 🤗 MBZUAI/LaMini-GPT-124M (model) · 3.8k downloads · ♡ 23
- 🤗 MBZUAI/LaMini-Cerebras-111M (model) · 78 downloads · ♡ 3
- 🤗 MBZUAI/LaMini-Neo-125M (model) · 175 downloads · ♡ 16
- 🤗 MBZUAI/LaMini-GPT-774M (model) · 854 downloads · ♡ 14
- 🤗 MBZUAI/LaMini-T5-223M (model) · 209 downloads · ♡ 3
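The checkpoints listed above span the two families named in the abstract: the T5 and Flan-T5 variants are encoder-decoder models, while the GPT, Neo, and Cerebras variants are decoder-only. A minimal sketch of loading one of them with the Hugging Face `transformers` library might look like the following. The helper names `pipeline_task` and `load_generator` are hypothetical, the name-based family check is a convenience assumption rather than anything the authors document, and the sketch assumes an installed `transformers` version that offers the `text2text-generation` and `text-generation` pipeline tasks.

```python
def pipeline_task(model_id: str) -> str:
    """Guess the pipeline task from the checkpoint name.

    Assumption: names containing "-T5-" (LaMini-T5-*, LaMini-Flan-T5-*)
    are encoder-decoder models; everything else is treated as decoder-only.
    """
    name = model_id.split("/")[-1]
    if "-T5-" in name:
        return "text2text-generation"
    return "text-generation"


def load_generator(model_id: str):
    """Build a generation pipeline for the given checkpoint.

    The import is done lazily so that pipeline_task stays usable without
    transformers installed. Calling this downloads the model weights.
    """
    from transformers import pipeline

    return pipeline(pipeline_task(model_id), model=model_id)


# Example usage (commented out because it needs network access):
# generator = load_generator("MBZUAI/LaMini-Flan-T5-77M")
# print(generator("Name three primary colors.", max_new_tokens=64))
```

Keeping the task lookup separate from the loading step means the family choice can be checked without downloading any weights.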
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
