Representation-Driven Reinforcement Learning
Ofir Nabati, Guy Tennenholtz, Shie Mannor

TL;DR
This paper introduces a representation-driven framework for reinforcement learning that embeds policies in a linear feature space and uses contextual-bandit techniques to guide exploration and exploitation. The authors report significantly improved performance over traditional methods.
Contribution
The paper embeds policy networks into a linear feature space, reframing the exploration-exploitation problem as a representation-exploitation problem. The framework is applied to evolutionary and policy gradient-based methods.
Findings
Enhanced performance over traditional methods
Effective application to evolutionary and policy gradient approaches
Highlights importance of policy representation in RL
Abstract
We present a representation-driven framework for reinforcement learning. By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation. Particularly, embedding a policy network into a linear feature space allows us to reframe the exploration-exploitation problem as a representation-exploitation problem, where good policy representations enable optimal exploration. We demonstrate the effectiveness of this framework through its application to evolutionary and policy gradient-based approaches, leading to significantly improved performance compared to traditional methods. Our framework provides a new perspective on reinforcement learning, highlighting the importance of policy representation in determining optimal exploration-exploitation strategies.
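The idea of treating each policy as a point in a linear feature space and choosing among policies with bandit machinery can be illustrated with a small sketch. The code below is not the authors' algorithm: it assumes each candidate policy already has a fixed, hand-written feature vector (the paper learns the embedding with a network), it assumes the expected return is linear in those features, and it uses an optimistic, LinUCB-style selection rule as a stand-in for whichever bandit strategy the paper employs. The names `LinearPolicyBandit`, `optimize`, `reg`, and `alpha` are hypothetical.

```python
import math


class LinearPolicyBandit:
    """Ridge-regression value model over policy feature vectors.

    Illustrative only: LinUCB-style optimism is an assumption here,
    not necessarily the selection rule used in the paper.
    """

    def __init__(self, dim, reg=1.0, alpha=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration bonus weight (hypothetical hyperparameter)
        # Inverse of the regularized design matrix, initialized to (reg * I)^-1.
        self.a_inv = [[(1.0 / reg) if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of return * features

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.a_inv]

    def theta(self):
        """Current ridge estimate of the linear value weights."""
        return self._matvec(self.b)

    def score(self, x):
        """Estimated value of a policy representation plus an uncertainty bonus."""
        a_inv_x = self._matvec(x)
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, a_inv_x)), 0.0))
        return mean + self.alpha * width

    def select(self, candidates):
        """Index of the candidate representation with the highest optimistic score."""
        return max(range(len(candidates)), key=lambda i: self.score(candidates[i]))

    def update(self, x, ret):
        """Rank-one (Sherman-Morrison) update after observing return `ret` for `x`."""
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + ret * xi for bi, xi in zip(self.b, x)]


def optimize(bandit, candidates, evaluate, rounds):
    """Repeatedly pick a policy, observe its return, and refit the value model."""
    pulls = [0] * len(candidates)
    for _ in range(rounds):
        i = bandit.select(candidates)
        bandit.update(candidates[i], evaluate(i))
        pulls[i] += 1
    return pulls


# Toy usage: two policies with made-up features and noiseless returns.
candidates = [[1.0, 0.0], [0.0, 1.0]]
true_returns = [0.2, 1.0]
pulls = optimize(LinearPolicyBandit(dim=2), candidates,
                 lambda i: true_returns[i], rounds=20)
```

In this toy run the loop tries the first policy once, finds its return low, and then concentrates on the second. The uncertainty bonus depends only on how well the feature space has been covered, which is one way to read the abstract's claim that good representations enable good exploration.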
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Bandit Algorithms Research
