Faster Reinforcement Learning by Freezing Slow States

Yijia Wang; Daniel R. Jiang

arXiv:2301.00922 · cs.AI · October 28, 2025

TL;DR

This paper introduces a freezing approach for slow states in MDPs with fast-slow structures, reducing computational complexity while maintaining high-quality policies, supported by theoretical analysis and empirical benchmarks.

Contribution

The paper proposes a novel frozen-state approximation method for fast-slow MDPs, enabling efficient planning by decoupling slow and fast state dynamics.

Findings

01. Significantly reduces computation time in benchmark problems.

02. Maintains high policy quality comparable to full-state methods.

03. Omitting slow states without freezing leads to poorer performance.

Abstract

We study infinite horizon Markov decision processes (MDPs) with "fast-slow" structure, where some state variables evolve rapidly ("fast states") while others change more gradually ("slow states"). This structure commonly arises in practice when decisions must be made at high frequencies over long horizons, and where slowly changing information still plays a critical role in determining optimal actions. Examples include inventory control under slowly changing demand indicators or dynamic pricing with gradually shifting consumer behavior. Modeling the problem at the natural decision frequency leads to MDPs with discount factors close to one, making them computationally challenging. We propose a novel approximation strategy that "freezes" slow states during phases of lower-level planning and subsequently applies value iteration to an auxiliary upper-level MDP that evolves on a slower…
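The abstract's remark that discount factors close to one make these MDPs computationally challenging can be illustrated with a small experiment. The sketch below is not the paper's frozen-state method; it runs plain value iteration on a randomly generated toy MDP and counts how many sweeps are needed as the discount factor approaches one. The helper names `value_iteration` and `random_mdp`, the problem size, the reward range, and the tolerance are all assumptions chosen for illustration.

```python
import random


def value_iteration(P, R, gamma, tol=1e-6, max_iter=100_000):
    """Plain value iteration; returns (values, number_of_sweeps)."""
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states
    for sweep in range(1, max_iter + 1):
        # One Bellman backup per state: best one-step reward plus
        # discounted expected value of the next state.
        V_new = [
            max(
                R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        # Stop once the largest change across states falls below tol.
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new, sweep
        V = V_new
    return V, max_iter


def random_mdp(n_states, n_actions, seed=0):
    """Hypothetical toy MDP: random transition rows, rewards in [1, 2)."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            weights = [rng.random() + 1e-3 for _ in range(n_states)]
            total = sum(weights)
            P_s.append([w / total for w in weights])  # normalize to a distribution
            R_s.append(1.0 + rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R


P, R = random_mdp(n_states=5, n_actions=2)
sweeps = {g: value_iteration(P, R, g)[1] for g in (0.9, 0.99, 0.999)}
for g, k in sweeps.items():
    print(f"gamma={g}: {k} sweeps")
```

Because each sweep contracts the error by a factor of the discount, the sweep count grows roughly like 1 / (1 - gamma) at a fixed tolerance. This is the cost the abstract attributes to modeling at the natural decision frequency.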

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Age of Information Optimization