Densing Law of LLMs
Chaojun Xiao, Jie Cai, Weilin Zhao, Guoyang Zeng, Biyuan Lin, Jie Zhou, Zhi Zheng, Xu Han, Zhiyuan Liu, Maosong Sun

TL;DR
This paper introduces capacity density, a metric for evaluating LLMs across scales, and finds that it grows exponentially over time, a trend that points future development toward models that are both more effective and more efficient.
Contribution
It proposes a new metric, capacity density, and an empirical law showing its exponential growth, providing a unified framework for assessing LLM effectiveness and efficiency.
Findings
Capacity density doubles approximately every three months.
Capacity density of LLMs grows exponentially over time.
The law offers new insights for optimizing LLM development.
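To make the doubling claim concrete, here is a minimal arithmetic sketch. It assumes the doubling period is exactly three months (the finding above says "approximately"), and the function names are hypothetical, chosen only for illustration.

```python
def density_growth_factor(months: float, doubling_months: float = 3.0) -> float:
    """Multiplicative growth in capacity density after `months`,
    assuming (as a simplification) a fixed doubling period."""
    return 2.0 ** (months / doubling_months)


def equivalent_params(params_today: float, months: float,
                      doubling_months: float = 3.0) -> float:
    """Parameter count that would match today's performance after `months`,
    if density keeps doubling at the assumed rate."""
    return params_today / density_growth_factor(months, doubling_months)


if __name__ == "__main__":
    # One year at a three-month doubling period is four doublings.
    print(density_growth_factor(12))    # 16.0
    print(equivalent_params(8e9, 12))   # 500000000.0
```

Under this assumption, a year of progress corresponds to a 16-fold increase in density, so a model matching today's 8B-parameter performance would need roughly 0.5B parameters. The real trend is empirical and approximate, so these figures are illustrative only.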
Abstract
Large Language Models (LLMs) have emerged as a milestone in artificial intelligence, and their performance can improve as the model size increases. However, this scaling brings great challenges to training and inference efficiency, particularly for deploying LLMs in resource-constrained environments, and the scaling trend is becoming increasingly unsustainable. This paper introduces the concept of "capacity density" as a new metric to evaluate the quality of the LLMs across different scales and describes the trend of LLMs in terms of both effectiveness and efficiency. To calculate the capacity density of a given target LLM, we first introduce a set of reference models and develop a scaling law to predict the downstream performance of these reference models based on their parameter sizes. We then define the effective parameter size of the target LLM as the parameter…
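The procedure in the abstract can be sketched in code. The excerpt is cut off before it finishes the definition, so this sketch rests on two labeled assumptions: that the effective parameter size is the reference-model size whose predicted performance matches the target's, and that capacity density is the effective size divided by the actual size. The scaling curve `reference_score` is a toy, hypothetical function and not the paper's fitted law.

```python
import math


def reference_score(n_params: float) -> float:
    """Toy, hypothetical scaling curve for the reference models:
    predicted downstream score rises monotonically with parameter count."""
    return 1.0 - 2.0 * n_params ** -0.1


def effective_params(target_score: float,
                     lo: float = 1e6, hi: float = 1e13,
                     iters: int = 200) -> float:
    """Invert the reference curve by bisection in log space: find the
    reference-model size whose predicted score equals `target_score`."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)          # geometric midpoint
        if reference_score(mid) < target_score:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def capacity_density(target_score: float, actual_params: float) -> float:
    """Assumed definition: effective parameter size / actual parameter size."""
    return effective_params(target_score) / actual_params


if __name__ == "__main__":
    # A hypothetical 2B-parameter target that scores like a 4B reference model.
    score = reference_score(4e9)
    print(round(capacity_density(score, 2e9), 6))  # 2.0
```

Bisection is used here only because it works for any monotonic curve; with a closed-form scaling law the inversion could be done analytically instead.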
Taxonomy
Topics: Corporate Governance and Law · European and International Contract Law · Corporate Insolvency and Governance
Methods: Sparse Evolutionary Training · Balanced Selection
