Ergodicity in reinforcement learning

Dominik Baumann; Erfaun Noorani; Arsenii Mustafin; Xinyi Sheng; Bert Verbruggen; Arne Vanhoyweghen; Vincent Ginis; Thomas B. Schön

arXiv:2603.10895 · cs.LG · March 12, 2026

Open Access

TL;DR

This paper explores how non-ergodic reward processes affect reinforcement learning, emphasizing the importance of ergodicity for meaningful long-term optimization and reviewing solutions for non-ergodic scenarios.

Contribution

It clarifies the impact of non-ergodic rewards in reinforcement learning and connects ergodic theory to existing solutions for optimizing individual trajectories.

Findings

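1. When the reward process is non-ergodic, the expected value can be uninformative about the long-run average reward of a single trajectory.

2. Long-term optimization of the expected value is meaningful for an individual agent only when the reward process is ergodic.

3. Existing methods can optimize the long-term performance of individual trajectories under non-ergodic reward dynamics.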

Abstract

In reinforcement learning, we typically aim to optimize the expected value of the sum of rewards an agent collects over a trajectory. However, if the process generating these rewards is non-ergodic, the expected value, i.e., the average over infinitely many trajectories with a given policy, is uninformative for the average over a single, but infinitely long trajectory. Thus, if we care about how the individual agent performs during deployment, the expected value is not a good optimization objective. In this paper, we discuss the impact of non-ergodic reward processes on reinforcement learning agents through an instructive example, relate the notion of ergodic reward processes to more widely used notions of ergodic Markov chains, and present existing solutions that optimize long-term performance of individual trajectories under non-ergodic reward dynamics.
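
The gap between the two averages named in the abstract can be illustrated with a minimal sketch. The paper's own example is not reproduced on this page, so the code below assumes a standard multiplicative coin-toss gamble instead: at each step, wealth is multiplied by 1.5 or 0.6 with equal probability. The constants `UP` and `DOWN` and all function names are hypothetical, chosen only for illustration.

```python
import math
import random

# Assumed illustrative gamble (not taken from the paper): each step multiplies
# wealth by UP or DOWN with equal probability.
UP, DOWN = 1.5, 0.6


def expected_factor():
    """Per-step growth factor averaged over infinitely many trajectories."""
    return 0.5 * UP + 0.5 * DOWN


def time_average_growth_rate():
    """Per-step log growth rate along a single, infinitely long trajectory."""
    return 0.5 * math.log(UP) + 0.5 * math.log(DOWN)


def simulate_log_wealth(steps, rng):
    """Log-wealth of one trajectory after `steps` rounds, starting from wealth 1."""
    log_wealth = 0.0
    for _ in range(steps):
        log_wealth += math.log(UP if rng.random() < 0.5 else DOWN)
    return log_wealth


def fraction_losing(trajectories, steps, rng):
    """Fraction of simulated trajectories that end below their starting wealth."""
    losers = sum(simulate_log_wealth(steps, rng) < 0.0 for _ in range(trajectories))
    return losers / trajectories


if __name__ == "__main__":
    rng = random.Random(0)
    print("expected per-step factor:", expected_factor())
    print("time-average log growth rate:", time_average_growth_rate())
    print("fraction of trajectories losing:", fraction_losing(200, 1000, rng))
```

Under these assumed numbers, the expected per-step factor is above 1 while the time-average log growth rate is negative, so the expected value grows even though almost every individual trajectory shrinks in the long run. This is the sense in which the expected value can be a poor objective for a single deployed agent.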

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Game Theory and Applications