What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen, Gaël Varoquaux

TL;DR
This survey examines the role of small models in an era dominated by large language models, emphasizing their potential for collaboration with LLMs, their resource efficiency, and their practical significance.
Contribution
It systematically analyzes the relationship between large and small models, highlighting the importance of small models in practical applications and resource-constrained environments.
Findings
Small models are crucial for resource-limited settings.
Collaboration between small and large models enhances performance.
Small models offer practical benefits despite the prominence of LLMs.
Abstract
Large Language Models (LLMs) have made significant progress in advancing artificial general intelligence (AGI), leading to the development of increasingly large models such as GPT-4 and LLaMA-405B. However, scaling up model sizes results in exponentially higher computational costs and energy consumption, making these models impractical for academic researchers and businesses with limited resources. At the same time, Small Models (SMs) are frequently used in practical settings, although their significance is currently underestimated. This raises important questions about the role of small models in the era of LLMs, a topic that has received limited attention in prior research. In this work, we systematically examine the relationship between LLMs and SMs from two key perspectives: Collaboration and Competition. We hope this survey provides valuable insights for practitioners, fostering a…
Taxonomy
Topics: Digital Rights Management and Security · Artificial Intelligence in Law · Multi-Agent Systems and Negotiation
Methods: Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Layer Normalization · Dropout · Attention Is All You Need · Position-Wise Feed-Forward Layer · Residual Connection · Linear Layer
