Computational-Statistical Gaps in Reinforcement Learning
Daniel Kane, Sihan Liu, Shachar Lovett, Gaurav Mahajan

TL;DR
This paper establishes a fundamental computational lower bound for reinforcement learning with linear function approximation, revealing a significant gap between what is statistically possible and what is computationally feasible.
Contribution
It presents the first computational lower bound for RL with linear function approximation: unless NP=RP, no polynomial-time algorithm exists for this setting, even though sample-efficient algorithms do. This demonstrates a computational-statistical gap.
Findings
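No polynomial-time algorithm exists for RL with linear optimal value functions unless NP=RP.
The lower bound is proved by a reduction from Unique-SAT to RL with linear value functions.
Together with known sample-efficient algorithms, this establishes a computational-statistical gap in RL.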
Abstract
Reinforcement learning with function approximation has recently achieved tremendous results in applications with large state spaces. This empirical success has motivated a growing body of theoretical work proposing necessary and sufficient conditions under which efficient reinforcement learning is possible. From this line of work, a remarkably simple minimal sufficient condition has emerged for sample-efficient reinforcement learning: MDPs with optimal value functions V* and Q* linear in some known low-dimensional features. In this setting, recent works have designed sample-efficient algorithms which require a number of samples polynomial in the feature dimension and independent of the size of the state space. They however leave finding computationally efficient algorithms as future work, and this is considered a major open problem in the community. In this work, we make progress on…
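To make the linearity condition in the abstract concrete, the following is a minimal sketch, assuming the optimal action-value function can be written as an inner product between a weight vector and a known feature vector for each state-action pair. The names `theta`, `features`, `q_value`, and `greedy_action` and all numbers are hypothetical and chosen only for illustration; nothing here is taken from the paper. The sketch shows how cheap it is to act once the weights are known. It does not show how to find them, which is the step the paper's lower bound concerns.

```python
# Illustrative only: a linear action-value function over known features.
# All names and values below are hypothetical, not from the paper.

def q_value(theta, phi_sa):
    """Inner product <theta, phi(s, a)> for one state-action pair."""
    return sum(t * x for t, x in zip(theta, phi_sa))

def greedy_action(theta, features_by_action):
    """Pick the action whose feature vector gives the largest linear value."""
    return max(features_by_action,
               key=lambda a: q_value(theta, features_by_action[a]))

# A toy state with two actions and feature dimension d = 2.
theta = [1.0, -0.5]
features = {"left": [0.2, 0.4], "right": [0.6, 0.2]}

best = greedy_action(theta, features)
```

The cost of choosing an action here depends only on the feature dimension and the number of actions, not on the number of states, which is why the setting looks attractive for large state spaces.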
Taxonomy
Topics: Machine Learning and Algorithms · Gene Regulatory Network Analysis · Reinforcement Learning in Robotics
