Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity
Guhao Feng, Han Zhong

TL;DR
This paper explores the hierarchy of representation complexity among RL paradigms, revealing that model representation is simpler than policy and value functions, with formal complexity results distinguishing their computational tractability.
Contribution
It introduces a formal hierarchy of representation complexity in RL, demonstrating the computational intractability of representing optimal policies and values compared to models.
Findings
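For a broad class of MDPs, the model can be represented by constant-depth circuits of polynomial size, or by MLPs with a constant number of layers and polynomial hidden dimension.
For the same class of MDPs, the optimal policy and optimal value function cannot be represented by constant-layer MLPs of polynomial size.
Representation complexity forms a hierarchy across RL paradigms: model < policy < value.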
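The intuition behind these findings can be sketched informally in Python. The example below is not from the paper: it assumes a hypothetical six-state chain MDP (the names `N_STATES`, `step`, and `value_iteration` are illustrative) and only contrasts how the two objects are computed. The model is a local, one-shot function of a state and an action, while the optimal value function emerges only after repeated Bellman backups. It does not demonstrate the circuit-complexity results themselves.

```python
from typing import List, Tuple

# Hypothetical toy chain MDP, chosen for illustration only (not from the paper).
N_STATES = 6            # states 0..5 laid out on a line
ACTIONS = (-1, +1)      # move left or move right
GOAL = N_STATES - 1     # absorbing goal state
GAMMA = 0.9             # discount factor


def step(state: int, action: int) -> Tuple[int, float]:
    """Model: next state and reward, computed from local information only."""
    if state == GOAL:
        return state, 0.0  # the goal is absorbing and yields no further reward
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def value_iteration(tol: float = 1e-10) -> Tuple[List[float], int]:
    """Optimal values via synchronous Bellman backups; also returns the sweep count."""
    values = [0.0] * N_STATES
    sweeps = 0
    while True:
        sweeps += 1
        new_values = []
        for s in range(N_STATES):
            outcomes = [step(s, a) for a in ACTIONS]
            new_values.append(max(r + GAMMA * values[n] for n, r in outcomes))
        delta = max(abs(a - b) for a, b in zip(values, new_values))
        values = new_values
        if delta < tol:
            return values, sweeps


if __name__ == "__main__":
    print("one model query:", step(3, +1))
    optimal_values, n_sweeps = value_iteration()
    print("optimal values:", [round(v, 4) for v in optimal_values])
    print("Bellman sweeps needed:", n_sweeps)
```

In this sketch a single call to `step` answers any query about the model, whereas the value at state 0 depends on information that must propagate back from the goal across the whole chain. This is only an analogy for the gap described above; the paper's statements concern circuit and MLP expressiveness, not iteration counts.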
Abstract
Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, optimal policy, and optimal value function, respectively. This work investigates the potential hierarchy of representation complexity -- the complexity of functions to be represented -- among these RL paradigms. We first demonstrate that, for a broad class of Markov decision processes (MDPs), the model can be represented by constant-depth circuits with polynomial size or Multi-Layer Perceptrons (MLPs) with constant layers and polynomial hidden dimension. However, the representation of the optimal policy and optimal value proves to be NP-complete and unattainable by constant-layer MLPs with polynomial size. This demonstrates a significant representation complexity gap between model-based RL and model-free RL, which…
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
