Theoretical Barriers in Bellman-Based Reinforcement Learning

Brieuc Pinon; Raphaël Jungers; Jean-Charles Delvenne

arXiv:2502.11968·cs.LG·February 18, 2025

TL;DR

This paper identifies fundamental limitations of Bellman-based reinforcement learning algorithms in high-dimensional spaces, showing they can neglect critical information and fail to generalize effectively across the state space.

Contribution

The paper formalizes a fundamental limitation of Bellman-based RL and constructs counterexample problems whose simple structure this approach fails to exploit.

Findings

01 Bellman-based methods can neglect critical information about the problem.

02 The constructed counterexamples have a simple structure that these methods fail to exploit.

03 The negative result extends to Hindsight Experience Replay when it is used to learn state-to-state reachability.

Abstract

Reinforcement Learning algorithms designed for high-dimensional spaces often enforce the Bellman equation on a sampled subset of states, relying on generalization to propagate knowledge across the state space. In this paper, we identify and formalize a fundamental limitation of this common approach. Specifically, we construct counterexample problems with a simple structure that this approach fails to exploit. Our findings reveal that such algorithms can neglect critical information about the problems, leading to inefficiencies. Furthermore, we extend this negative result to another approach from the literature: Hindsight Experience Replay learning state-to-state reachability.
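
To make "enforcing the Bellman equation on a sampled subset of states" concrete, here is a minimal Python sketch. It is not one of the paper's counterexamples: the 10-state chain, the discount factor, the fixed sample, and all function names are assumptions chosen for illustration. It only shows that a value function can satisfy the Bellman equation exactly on the sampled states while being wrong elsewhere, so anything learned about unsampled states has to come from generalization.

```python
GAMMA = 0.9
N_STATES = 10  # states 0..9; state 9 is terminal (toy assumption)


def step(state):
    """Deterministic chain: always move right; reward 1 on reaching the terminal state."""
    next_state = state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def bellman_target(value, state):
    """Right-hand side of the Bellman equation at `state`."""
    next_state, reward, done = step(state)
    return reward + (0.0 if done else GAMMA * value[next_state])


def sampled_bellman_loss(value, states):
    """Mean squared Bellman residual, measured only on `states`."""
    residuals = [value[s] - bellman_target(value, s) for s in states]
    return sum(r * r for r in residuals) / len(residuals)


def exact_values():
    """Solve the Bellman equation everywhere by backward induction."""
    value = [0.0] * N_STATES
    for s in range(N_STATES - 2, -1, -1):
        value[s] = bellman_target(value, s)
    return value


all_states = list(range(N_STATES - 1))  # every non-terminal state
sampled_states = [0, 1, 5, 6]           # hypothetical sampled subset

candidate = exact_values()
candidate[4] += 1.0  # wrong at a state that no sampled residual touches

loss_on_sample = sampled_bellman_loss(candidate, sampled_states)
loss_everywhere = sampled_bellman_loss(candidate, all_states)
print(loss_on_sample, loss_everywhere)
```

The loss on the sampled states is zero, while the loss over all non-terminal states is positive. In a real algorithm the function approximator, rather than a hand-made perturbation, determines the values at unsampled states; the sketch only shows that the sampled objective by itself does not constrain them.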

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications

Methods: Experience Replay