Spectral Decomposition Representation for Reinforcement Learning
Tongzheng Ren, Tianjun Zhang, Lisa Lee, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

TL;DR
This paper introduces SPEDER, a spectral decomposition method for reinforcement learning that improves state-action representations by addressing policy dependence and exploration issues, leading to better sample efficiency and performance.
Contribution
The paper proposes SPEDER, a novel spectral method that extracts state-action abstractions without policy dependence and balances exploration, with theoretical and empirical validation.
Findings
SPEDER achieves superior performance on benchmark tasks.
Theoretical analysis confirms sample efficiency in online and offline settings.
SPEDER addresses limitations of previous spectral methods (state-only aggregation and a policy-dependent transition kernel) and accounts for exploration.
Abstract
Representation learning often plays a critical role in reinforcement learning by managing the curse of dimensionality. A representative class of algorithms exploits a spectral decomposition of the stochastic transition dynamics to construct representations that enjoy strong theoretical properties in an idealized setting. However, current spectral methods suffer from limited applicability because they are constructed for state-only aggregation and derived from a policy-dependent transition kernel, without considering the issue of exploration. To address these issues, we propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy, while also balancing the exploration-versus-exploitation trade-off during learning. A theoretical analysis…
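To make the idea of a spectral decomposition of the transition dynamics concrete, here is a minimal tabular sketch in Python with NumPy. It is not the SPEDER algorithm and does not reproduce the paper's learning objective. It only illustrates, on a small synthetic MDP, how a low-rank transition matrix factorizes as P(s' | s, a) = phi(s, a)^T mu(s'), so that the expected next-state value is linear in the state-action features phi. The function names, the problem sizes, and the use of an exact SVD on a known transition matrix are all assumptions made for illustration.

```python
import numpy as np

def random_low_rank_mdp(n_states, n_actions, rank, seed=0):
    """Hypothetical helper: build a transition matrix of shape (S*A, S) with rank <= `rank`."""
    rng = np.random.default_rng(seed)
    # Each (s, a) pair mixes over `rank` latent factors; rows sum to 1.
    weights = rng.dirichlet(np.ones(rank), size=n_states * n_actions)
    # Each latent factor is a distribution over next states.
    factors = rng.dirichlet(np.ones(n_states), size=rank)
    return weights @ factors

def spectral_features(P, rank):
    """Factor P ~= phi @ mu with a truncated SVD (illustrative, not the paper's method)."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    scale = np.sqrt(s[:rank])
    phi = U[:, :rank] * scale           # state-action features, shape (S*A, rank)
    mu = scale[:, None] * Vt[:rank]     # next-state factors, shape (rank, S)
    return phi, mu

n_states, n_actions, rank = 6, 3, 2
P = random_low_rank_mdp(n_states, n_actions, rank)
phi, mu = spectral_features(P, rank)

# The factorization reproduces the dynamics up to floating-point error.
reconstruction_error = np.max(np.abs(phi @ mu - P))

# For any value function V, E[V(s') | s, a] = phi(s, a)^T (mu @ V):
# the expected next-state value is linear in phi.
V = np.random.default_rng(1).normal(size=n_states)
w = mu @ V
linear_error = np.max(np.abs(P @ V - phi @ w))

print(f"reconstruction error: {reconstruction_error:.2e}")
print(f"linearity error:      {linear_error:.2e}")
```

Because the factorization is computed from the transition matrix alone, the features here do not depend on any data collection policy. In practice the transition matrix is unknown and must be estimated from samples, which is where the exploration and sample-efficiency questions raised in the abstract come in.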
Taxonomy
Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks
