Densing Law of LLMs

Chaojun Xiao; Jie Cai; Weilin Zhao; Guoyang Zeng; Biyuan Lin; Jie Zhou; Zhi Zheng; Xu Han; Zhiyuan Liu; Maosong Sun

arXiv:2412.04315·cs.AI·December 9, 2024·2 cites

TL;DR

This paper introduces capacity density, a metric for evaluating LLMs, and reports that it has grown exponentially over time, a trend that points future development toward models that are both more efficient and more effective.

Contribution

It proposes a new metric, capacity density, and an empirical law showing its exponential growth, providing a unified framework for assessing LLM effectiveness and efficiency.

Findings

01 Capacity density doubles approximately every three months.

02 Capacity density of LLMs grows exponentially over time.

03 The law offers new insights for optimizing LLM development.
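The first two findings describe the same exponential curve. As a rough illustration only, the sketch below turns the "doubles approximately every three months" figure into a growth formula, density(t) = density(0) * 2^(t / 3) with t in months. The function names, the default starting density of 1.0, and the exact three-month doubling period are assumptions made for this example, not values fitted in the paper.

```python
import math

# Assumed doubling period, taken from the "approximately every three months" finding.
DOUBLING_MONTHS = 3.0


def density_at(months: float, rho0: float = 1.0,
               doubling_months: float = DOUBLING_MONTHS) -> float:
    """Capacity density after `months`, assuming pure exponential growth.

    rho0 is a hypothetical starting density used only for illustration.
    """
    return rho0 * 2.0 ** (months / doubling_months)


def months_to_grow(factor: float,
                   doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed for density to grow by `factor` under the same assumption."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return doubling_months * math.log2(factor)


def monthly_growth_rate(doubling_months: float = DOUBLING_MONTHS) -> float:
    """Equivalent continuous growth rate per month (ln 2 / doubling period)."""
    return math.log(2.0) / doubling_months
```

Under these assumptions, `density_at(12)` gives a sixteenfold increase over one year, since twelve months contain four doubling periods.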

Abstract

Large Language Models (LLMs) have emerged as a milestone in artificial intelligence, and their performance can improve as the model size increases. However, this scaling brings great challenges to training and inference efficiency, particularly for deploying LLMs in resource-constrained environments, and the scaling trend is becoming increasingly unsustainable. This paper introduces the concept of "capacity density" as a new metric to evaluate the quality of LLMs across different scales and describes the trend of LLMs in terms of both effectiveness and efficiency. To calculate the capacity density of a given target LLM, we first introduce a set of reference models and develop a scaling law to predict the downstream performance of these reference models based on their parameter sizes. We then define the effective parameter size of the target LLM as the parameter…
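The procedure in the abstract can be sketched in a few lines. This is a minimal illustration under two stated assumptions: that the effective parameter size is the reference-model size whose predicted performance matches the target model's, and that capacity density is the ratio of effective to actual parameter size. The reference curve below is a hypothetical sigmoid in log parameter count, chosen only because it is monotonic and easy to invert; it is not the scaling law fitted in the paper, and all function names are made up for this example.

```python
import math


def reference_performance(n_params: float) -> float:
    """Hypothetical reference scaling curve (NOT the paper's fitted law).

    Maps a parameter count to a score in (0, 1) that increases with log10 size.
    """
    return 1.0 / (1.0 + math.exp(-(math.log10(n_params) - 9.0)))


def effective_params(target_score: float, lo: float = 1e6, hi: float = 1e13,
                     iters: int = 200) -> float:
    """Reference-model size whose predicted score equals `target_score`.

    Inverts the monotonic reference curve by bisection in log space.
    """
    if not reference_performance(lo) <= target_score <= reference_performance(hi):
        raise ValueError("target_score is outside the range of the reference curve")
    log_lo, log_hi = math.log(lo), math.log(hi)
    for _ in range(iters):
        mid = 0.5 * (log_lo + log_hi)
        if reference_performance(math.exp(mid)) < target_score:
            log_lo = mid
        else:
            log_hi = mid
    return math.exp(0.5 * (log_lo + log_hi))


def capacity_density(target_score: float, actual_params: float) -> float:
    """Assumed definition: effective parameter size / actual parameter size."""
    return effective_params(target_score) / actual_params
```

For example, a hypothetical 2B-parameter model that scores as well as the reference curve predicts for 4B parameters would have a density of about 2 under this sketch: `capacity_density(reference_performance(4e9), 2e9)`.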

Code & Models

Models

openbmb/DensingLaw-ScalingModels (Hugging Face model)

Taxonomy

Topics: Corporate Governance and Law · European and International Contract Law · Corporate Insolvency and Governance

Methods: Sparse Evolutionary Training · Balanced Selection