Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

Guhao Feng; Han Zhong

arXiv:2312.17248 · cs.LG · December 10, 2024 · 1 citation

TL;DR

This paper studies a hierarchy of representation complexity across RL paradigms: for a broad class of MDPs, the model can be represented by constant-depth circuits of polynomial size, whereas representing the optimal policy and optimal value function is NP-complete.

Contribution

It establishes a formal hierarchy of representation complexity in RL, showing that the optimal policy and optimal value function can be far harder to represent than the model.

Findings

01

For a broad class of MDPs, the model can be represented by constant-depth circuits of polynomial size, or by MLPs with constant layers and polynomial hidden dimension.

02

For the same class of MDPs, representing the optimal policy and optimal value function is NP-complete and unattainable by constant-layer MLPs of polynomial size.

03

A hierarchy of representation complexity emerges: model < policy < value.
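
The gap between findings 01 and 02 can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not necessarily the construction used in the paper: a hypothetical MDP in which an agent assigns Boolean variables one at a time and is rewarded only if a fixed CNF formula ends up satisfied. The names `step`, `reward`, and `optimal_value` are invented for this illustration. The transition merely appends a bit and the reward is an AND of ORs over literals, so the model is shallow to compute, yet the optimal value at the initial state equals 1 exactly when the formula is satisfiable, which is the NP-complete SAT problem.

```python
# Toy "assign the variables" MDP (hypothetical, for illustration only).
# A formula is a tuple of clauses; a clause is a tuple of non-zero ints,
# where +i means variable i and -i means NOT variable i (1-indexed).

def step(assignment, action):
    """Transition: append the chosen bit (0 or 1) to the partial assignment."""
    return assignment + (action,)

def reward(formula, assignment, n_vars):
    """Terminal reward: 1 if every clause has a true literal, else 0.

    This is an AND over clauses of an OR over literals, i.e. a shallow check.
    """
    if len(assignment) < n_vars:
        return 0  # no reward before all variables are assigned
    return int(all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in formula
    ))

def optimal_value(formula, n_vars, assignment=()):
    """Optimal value by exhaustive search; exponential in n_vars in the worst case."""
    if len(assignment) == n_vars:
        return reward(formula, assignment, n_vars)
    return max(
        optimal_value(formula, n_vars, step(assignment, a)) for a in (0, 1)
    )

# (x1 OR x2) AND (NOT x1 OR x2) AND (x1 OR NOT x2): satisfied by x1 = x2 = 1.
satisfiable = ((1, 2), (-1, 2), (1, -2))
# (x1) AND (NOT x1): no assignment works.
unsatisfiable = ((1,), (-1,))

print(optimal_value(satisfiable, 2))    # prints 1
print(optimal_value(unsatisfiable, 1))  # prints 0
```

Each call to `step` or `reward` is cheap, while `optimal_value` has to search over assignments here; this asymmetry is the kind of model versus value gap the findings describe.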

Abstract

Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, optimal policy, and optimal value function, respectively. This work investigates the potential hierarchy of representation complexity -- the complexity of functions to be represented -- among these RL paradigms. We first demonstrate that, for a broad class of Markov decision processes (MDPs), the model can be represented by constant-depth circuits with polynomial size or Multi-Layer Perceptrons (MLPs) with constant layers and polynomial hidden dimension. However, the representation of the optimal policy and optimal value proves to be $NP$-complete and unattainable by constant-layer MLPs with polynomial size. This demonstrates a significant representation complexity gap between model-based RL and model-free RL, which…

Code & Models

Repositories

guhfeng/rl-representation-complexity · PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning