The complexity of non-stationary reinforcement learning
Christos Papadimitriou, Binghui Peng

TL;DR
This paper establishes a conditional complexity barrier for non-stationary reinforcement learning: after the probabilities or the reward of a single state-action pair change, keeping the value function up to date requires time almost as large as the number of states, unless the strong exponential time hypothesis (SETH) is false.
Contribution
It provides a worst-case complexity analysis for non-stationary reinforcement learning, highlighting fundamental computational challenges in adapting to changes.
Findings
Modifying the transition probabilities or the reward of a single state-action pair requires time almost as large as the number of states to keep the value function up to date.
Adding a new state-action pair is significantly easier than updating existing ones.
The hardness result is conditional: it holds unless the strong exponential time hypothesis (SETH) is false.
Abstract
The problem of continual learning in the domain of reinforcement learning, often called non-stationary reinforcement learning, has been identified as an important challenge to the application of reinforcement learning. We prove a worst-case complexity result, which we believe captures this challenge: Modifying the probabilities or the reward of a single state-action pair in a reinforcement learning problem requires an amount of time almost as large as the number of states in order to keep the value function up to date, unless the strong exponential time hypothesis (SETH) is false; SETH is a widely accepted strengthening of the P ≠ NP conjecture. Recall that the number of states in current applications of reinforcement learning is typically astronomical. In contrast, we show that just adding a new state-action pair is considerably easier to implement.
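To give some intuition for why a single local change can be expensive to absorb, here is a minimal sketch of the naive baseline: recompute the value function from scratch after editing one state-action pair. It is not the paper's construction and does not demonstrate the SETH-based lower bound. The chain-shaped MDP, the discount factor, and the helper names `value_iteration` and `chain_mdp` are all assumptions made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Plain value iteration.

    P[s][a] is a list of (next_state, probability) pairs,
    R[s][a] is the immediate reward for taking action a in state s.
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_iter):
        new_V = [
            max(
                R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        # Stop once successive iterates are (numerically) identical.
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V


def chain_mdp(n):
    """Hypothetical toy MDP: one action per state, which moves one step
    to the right; the last state loops back to itself. All rewards are 0."""
    P = [[[(min(s + 1, n - 1), 1.0)]] for s in range(n)]
    R = [[0.0] for _ in range(n)]
    return P, R


n = 6
P, R = chain_mdp(n)
V_before = value_iteration(P, R)

# Change the reward of a single state-action pair (the last state's only action).
R[n - 1][0] = 1.0
V_after = value_iteration(P, R)

# Which states saw their value change?
changed = [s for s in range(n) if abs(V_after[s] - V_before[s]) > 1e-6]
print(changed)
```

In this toy chain, one edit changes the value of every state, so an update procedure has to touch all of them. The paper's result is stronger and of a different kind: it is a conditional lower bound on the time needed to maintain the value function at all, not just an observation about how many entries change.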
Taxonomy
Topics: Machine Learning and Algorithms · Computability, Logic, AI Algorithms · Reinforcement Learning in Robotics
