Neural Thermodynamic Laws for Large Language Model Training

Ziming Liu; Yizhou Liu; Jeff Gore; Max Tegmark

arXiv:2505.10559·cs.LG·May 16, 2025

Open Access

TL;DR

This paper introduces Neural Thermodynamic Laws (NTL), a novel framework that applies thermodynamic principles to understand and guide the training dynamics of large language models, bridging theory and practice.

Contribution

It establishes a theoretical connection between thermodynamic quantities and LLM training, and provides practical guidelines for learning rate schedule design.

Findings

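01

Thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) emerge naturally in LLM training under river-valley loss landscape assumptions.

02

Classical thermodynamic principles (e.g., the three laws of thermodynamics and the equipartition theorem) carry over to this setting.

03

The thermodynamic perspective yields intuitive guidelines for designing learning rate schedules.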

Abstract

Beyond neural scaling laws, little is known about the laws underlying large language models (LLMs). We introduce Neural Thermodynamic Laws (NTL) -- a new framework that offers fresh insights into LLM training dynamics. On the theoretical side, we demonstrate that key thermodynamic quantities (e.g., temperature, entropy, heat capacity, thermal conduction) and classical thermodynamic principles (e.g., the three laws of thermodynamics and the equipartition theorem) naturally emerge under river-valley loss landscape assumptions. On the practical side, this scientific perspective yields intuitive guidelines for designing learning rate schedules.
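To make the temperature analogy concrete, here is a minimal toy sketch; it is not the paper's derivation or code. It assumes a one-dimensional quadratic "valley" direction L(x) = 0.5 * a * x^2 and additive Gaussian gradient noise of scale sigma. The function names, the noise model, and all parameter values are hypothetical choices made for illustration. Under these assumptions, the mean stationary loss of plain SGD has the closed form eta * sigma^2 / (2 * (2 - eta * a)), so for small eta it grows roughly linearly with the learning rate and is nearly independent of the curvature a. That is one simple sense in which the learning rate can act like a temperature.

```python
import random


def stationary_loss(eta, a, sigma):
    """Closed-form mean loss of noisy SGD on L(x) = 0.5 * a * x**2.

    Assumes the update x <- x - eta * (a * x + sigma * xi) with xi ~ N(0, 1),
    and 0 < eta * a < 2 so that the iteration is stable.
    """
    return eta * sigma**2 / (2.0 * (2.0 - eta * a))


def simulated_loss(eta, a, sigma, steps=400_000, burn_in=10_000, seed=0):
    """Estimate the same quantity by running the noisy SGD iteration."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for t in range(steps):
        # Noisy gradient: true gradient a * x plus Gaussian noise (an assumption).
        grad = a * x + sigma * rng.gauss(0.0, 1.0)
        x -= eta * grad
        if t >= burn_in:
            total += 0.5 * a * x * x
    return total / (steps - burn_in)


if __name__ == "__main__":
    # Hypothetical settings: two learning rates, two curvatures.
    for eta in (0.01, 0.02):
        for a in (1.0, 10.0):
            print(eta, a, stationary_loss(eta, a, 1.0), simulated_loss(eta, a, 1.0))
```

In this toy model, halving the learning rate roughly halves the noise-induced loss floor in the sharp direction. That is consistent with the general intuition behind learning rate decay, though the paper's actual guidelines should be taken from the paper itself.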

Taxonomy

Topics: Topic Modeling