Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

TL;DR
This paper introduces a singular value decomposition (SVD) based method for deep reinforcement learning. It learns representations that preserve the domain's transition structure and capture relative state-visitation frequencies, and it scales to large and partially observable domains.
Contribution
The authors propose an SVD-based approach to representation learning in RL that captures transition dynamics and yields pseudo-count estimates. The algorithm never builds the transition matrix explicitly, supports deep networks and mini-batch training, and extends to partially observable environments.
Findings
Effective in multi-task partially observable environments
Learns useful representations for complex inputs like language and images
Improves exploration in hard RL tasks
Abstract
Representation learning and exploration are among the key challenges for any deep reinforcement learning agent. In this work, we provide a singular value decomposition based method that can be used to obtain representations that preserve the underlying transition structure in the domain. Perhaps interestingly, we show that these representations also capture the relative frequency of state visitations, thereby providing an estimate for pseudo-counts for free. To scale this decomposition method to large-scale domains, we provide an algorithm that never requires building the transition matrix, can make use of deep networks, and also permits mini-batch training. Further, we draw inspiration from predictive state representations and extend our decomposition method to partially observable environments. With experiments on multi-task settings with partially observable domains, we show that the…
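To make the idea in the abstract concrete, here is a minimal tabular sketch. It is not the paper's algorithm: the paper's method avoids building the transition matrix and uses deep networks, whereas this toy example builds an empirical transition-count matrix explicitly for a hypothetical 5-state random walk and decomposes it with NumPy. The function names, the chain environment, and the choice to decompose raw counts are all assumptions made for illustration. With a full-rank decomposition, the norm of each state's representation equals the norm of that state's row of counts, so more frequently visited states get larger norms, which gives some intuition for why a pseudo-count signal can fall out of such a decomposition.

```python
import numpy as np

def transition_counts(trajectory, n_states):
    """Count observed s -> s' transitions along one trajectory of state indices."""
    counts = np.zeros((n_states, n_states))
    for s, s_next in zip(trajectory[:-1], trajectory[1:]):
        counts[s, s_next] += 1.0
    return counts

def svd_representations(matrix, k):
    """Return k-dimensional row representations U_k * S_k from a truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

# Hypothetical environment: a random walk on a 5-state chain with walls at both ends.
rng = np.random.default_rng(0)
n_states = 5
state = 0
trajectory = [state]
for _ in range(2000):
    step = rng.choice([-1, 1])
    state = int(np.clip(state + step, 0, n_states - 1))
    trajectory.append(state)

counts = transition_counts(trajectory, n_states)
visits = counts.sum(axis=1)                      # how often each state was left
phi = svd_representations(counts, k=n_states)    # full-rank state representations
phi_norms = np.linalg.norm(phi, axis=1)
row_norms = np.linalg.norm(counts, axis=1)

for i in range(n_states):
    print(f"state {i}: visits={visits[i]:.0f}  ||phi||={phi_norms[i]:.1f}")
```

Choosing a smaller `k` gives a compressed representation at the cost of this exact norm identity; the identity holds only for the full-rank case shown here.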
Taxonomy
Topics: Reinforcement Learning in Robotics · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing
