Thermodynamics of Reinforcement Learning Curricula

Jacob Adamczyk; Juan Sebastian Rojas; Rahul V. Kulkarni

arXiv:2603.12324 · cs.LG · March 16, 2026

Open Access

TL;DR

This paper introduces a thermodynamic framework for reinforcement learning curricula. Reward parameters are treated as coordinates on a task manifold, and the optimal curricula, those that minimize excess thermodynamic work, are shown to be geodesics on that manifold.

Contribution

The paper formalizes curriculum learning in RL using results from non-equilibrium thermodynamics and proposes the MEW (Minimum Excess Work) algorithm, which derives a principled temperature annealing schedule for maximum-entropy RL.

Findings

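01. Optimal curricula, defined as those minimizing excess thermodynamic work, correspond to geodesics in task space.

02. The MEW algorithm provides a principled temperature annealing schedule for maximum-entropy RL.

03. The framework links non-equilibrium thermodynamics with RL curriculum design.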

Abstract

Connections between statistical mechanics and machine learning have repeatedly proven fruitful, providing insight into optimization, generalization, and representation learning. In this work, we follow this tradition by leveraging results from non-equilibrium thermodynamics to formalize curriculum learning in reinforcement learning (RL). In particular, we propose a geometric framework for RL by interpreting reward parameters as coordinates on a task manifold. We show that, by minimizing the excess thermodynamic work, optimal curricula correspond to geodesics in this task space. As an application of this framework, we provide an algorithm, "MEW" (Minimum Excess Work), to derive a principled schedule for temperature annealing in maximum-entropy RL.
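The idea that a minimum-excess-work schedule moves at constant speed along a geodesic can be illustrated with a one-parameter toy problem. The sketch below is not the paper's MEW algorithm. It assumes the standard linear-response picture from the thermodynamic-length literature, in which excess work over a schedule is approximated by the integral of g(λ)(dλ/dt)², and it uses a hypothetical toy metric g(β) = 1/β² over a single parameter β (the names `toy_metric`, `constant_speed_schedule`, and `excess_work` are illustrative only). Under those assumptions, the schedule that is equally spaced in thermodynamic length should cost less than a linear ramp between the same endpoints.

```python
import numpy as np

def constant_speed_schedule(metric, lam_start, lam_end, n_steps, n_grid=10_001):
    """Return n_steps + 1 parameter values equally spaced in thermodynamic length.

    `metric` is a hypothetical scalar metric g(lam) > 0 on a 1-D task space.
    """
    grid = np.linspace(lam_start, lam_end, n_grid)
    speed = np.sqrt(metric(grid))  # sqrt(g) along the path
    # Cumulative thermodynamic length via the trapezoid rule.
    seg = 0.5 * (speed[1:] + speed[:-1]) * np.abs(np.diff(grid))
    length = np.concatenate([[0.0], np.cumsum(seg)])
    # Invert length(lam) at equally spaced target lengths.
    targets = np.linspace(0.0, length[-1], n_steps + 1)
    return np.interp(targets, length, grid), float(length[-1])

def excess_work(metric, schedule, total_time=1.0):
    """Linear-response estimate: sum of g(lam) * (dlam/dt)^2 * dt over the steps."""
    dt = total_time / (len(schedule) - 1)
    mid = 0.5 * (schedule[1:] + schedule[:-1])
    vel = np.diff(schedule) / dt
    return float(np.sum(metric(mid) * vel**2 * dt))

# Toy metric (an assumption for illustration, not taken from the paper).
def toy_metric(beta):
    return 1.0 / beta**2

geo, total_length = constant_speed_schedule(toy_metric, 0.1, 10.0, n_steps=50)
lin = np.linspace(0.1, 10.0, 51)

work_geo = excess_work(toy_metric, geo)
work_lin = excess_work(toy_metric, lin)
print(f"thermodynamic length: {total_length:.3f}")
print(f"excess work, constant-speed schedule: {work_geo:.2f}")
print(f"excess work, linear schedule:         {work_lin:.2f}")
```

For this particular toy metric the constant-speed schedule turns out to be geometric in β, which is why the check below compares it against a geometric progression. With more than one reward parameter the path itself, not only its parameterization, would have to be optimized, which is where the geodesic computation becomes nontrivial.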

Taxonomy

Topics: Advanced Thermodynamics and Statistical Mechanics · Reinforcement Learning in Robotics · Statistical Mechanics and Entropy