Rethinking LLM Advancement: Compute-Dependent and Independent Paths to Progress
Jack Sanderson, Teddy Foley, Spencer Guo, Anqi Qu, Henry Josephson

TL;DR
This paper introduces a framework to distinguish between compute-dependent and compute-independent innovations in LLM development, showing that algorithmic improvements can still advance capabilities despite hardware restrictions.
Contribution
The study proposes a novel framework for classifying LLM innovations and demonstrates its effectiveness through experimental validation with nanoGPT models.
Findings
Compute-independent innovations significantly improve performance across scales.
Compute-dependent innovations benefit mainly at larger scales, with mixed effects at smaller scales.
Hardware restrictions alone are insufficient to halt all AI capability progress.
Abstract
Regulatory efforts to govern large language model (LLM) development have predominantly focused on restricting access to high-performance computational resources. This study evaluates the efficacy of such measures by examining whether LLM capabilities can advance through algorithmic innovation in compute-constrained environments. We propose a novel framework distinguishing compute-dependent innovations--which yield disproportionate benefits at high compute--from compute-independent innovations, which improve efficiency across compute scales. The impact is quantified using Compute-Equivalent Gain (CEG). Experimental validation with nanoGPT models confirms that compute-independent advancements yield significant performance gains (e.g., with combined CEG up to ) across the tested scales. In contrast, compute-dependent advancements were detrimental to performance at smaller…
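The abstract excerpt above names Compute-Equivalent Gain (CEG) but does not give its formula. As a rough illustration only, one common reading is: how many times more compute would a baseline model need to match the performance of the improved model? The sketch below assumes that reading and also assumes baseline loss follows a pure power law in compute. The function names and numbers are hypothetical and are not taken from the paper.

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of log(loss) = a + b * log(compute); returns (a, b).

    Assumption: baseline loss is a pure power law in compute (no irreducible term).
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def compute_equivalent_gain(baseline_compute, baseline_loss, new_compute, new_loss):
    """Hypothetical CEG: baseline compute needed to reach new_loss, divided by new_compute."""
    a, b = fit_power_law(baseline_compute, baseline_loss)
    # Invert the fitted curve: log(C) = (log(L) - a) / b
    equivalent_compute = math.exp((math.log(new_loss) - a) / b)
    return equivalent_compute / new_compute

# Illustrative (made-up) baseline runs: loss = 10 * C^(-0.1)
baseline_compute = [1e15, 1e16, 1e17]
baseline_loss = [10 * c ** -0.1 for c in baseline_compute]

# A hypothetical improved model trained with 1e16 FLOPs that reaches the
# loss the baseline would only reach with 2e16 FLOPs.
improved_loss = 10 * (2e16) ** -0.1
ceg = compute_equivalent_gain(baseline_compute, baseline_loss, 1e16, improved_loss)
print(f"CEG = {ceg:.2f}x")
```

Under this reading, a CEG above 1 means the innovation behaves like extra compute and a CEG below 1 means it hurts at that scale, which is how the abstract describes compute-dependent innovations at small scale.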
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Materials Science · Natural Language Processing Techniques
Methods: Attention Dropout · Linear Layer · Multi-Head Attention · Dense Connections · Discriminative Fine-Tuning · Adam · Attention Is All You Need · Dropout · Weight Decay
