Neural Thermodynamic Laws for Large Language Model Training
Ziming Liu, Yizhou Liu, Jeff Gore, Max Tegmark

TL;DR
This paper introduces Neural Thermodynamic Laws (NTL), a novel framework that applies thermodynamic principles to understand and guide the training dynamics of large language models, bridging theory and practice.
Contribution
It establishes a theoretical connection between thermodynamic quantities and LLM training, and provides practical guidelines for learning rate schedule design.
Findings
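Thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) emerge naturally during LLM training under river-valley loss landscape assumptions.
Classical thermodynamic principles (e.g., the three laws of thermodynamics and the equipartition theorem) apply to neural loss landscapes under the same assumptions.
The thermodynamic perspective yields intuitive guidelines for designing learning rate schedules.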
Abstract
Beyond neural scaling laws, little is known about the laws underlying large language models (LLMs). We introduce Neural Thermodynamic Laws (NTL) -- a new framework that offers fresh insights into LLM training dynamics. On the theoretical side, we demonstrate that key thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) and classical thermodynamic principles (e.g., the three laws of thermodynamics and the equipartition theorem) naturally emerge under river-valley loss landscape assumptions. On the practical side, this scientific perspective yields intuitive guidelines for designing learning rate schedules.
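A toy illustration may help make the temperature analogy concrete. The sketch below is not taken from the paper. It assumes a single steep "valley" direction modeled as a 1D quadratic loss 0.5 * a * x^2, with Gaussian gradient noise of fixed standard deviation. The function names, parameters, and the noise model are all hypothetical choices made for illustration. Under these assumptions, noisy gradient descent settles into a stationary state whose average loss is lr * noise_std^2 / (2 * (2 - lr * a)). For small lr * a this is roughly proportional to the learning rate and nearly independent of the curvature, which is reminiscent of temperature and equipartition in this simplified setting.

```python
import random


def stationary_valley_loss(lr, curvature, noise_std=1.0,
                           steps=200_000, burn_in=20_000, seed=0):
    """Average loss 0.5*a*x^2 of noisy gradient descent on a 1D quadratic.

    Toy model (an assumption, not the paper's setup): the gradient is
    a*x plus Gaussian noise with standard deviation noise_std.
    """
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for t in range(steps):
        grad = curvature * x + noise_std * rng.gauss(0.0, 1.0)
        x -= lr * grad
        if t >= burn_in:  # skip the initial transient before averaging
            total += 0.5 * curvature * x * x
    return total / (steps - burn_in)


def predicted_loss(lr, curvature, noise_std=1.0):
    """Closed-form stationary loss for the same toy model."""
    return lr * noise_std ** 2 / (2.0 * (2.0 - lr * curvature))


if __name__ == "__main__":
    for lr in (0.01, 0.02):
        for a in (1.0, 10.0):
            sim = stationary_valley_loss(lr, a)
            print(f"lr={lr} a={a} simulated={sim:.5f} "
                  f"predicted={predicted_loss(lr, a):.5f}")
```

In this toy setting, doubling the learning rate roughly doubles the stationary loss in the valley direction, which gives some intuition for why decaying the learning rate late in training can lower the loss. How closely this matches the paper's formal definitions is not established by this sketch.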
Taxonomy
Topics: Topic Modeling
