Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

Yash Chandak; Shantanu Thakoor; Zhaohan Daniel Guo; Yunhao Tang; Remi Munos; Will Dabney; Diana L Borsa

arXiv:2305.00654·cs.LG·May 3, 2023·1 cites

TL;DR

This paper introduces a singular value decomposition-based method for deep reinforcement learning. The learned representations preserve the transition structure of the domain and reflect state visitation frequencies, and the method scales to large and partially observable domains.

Contribution

The authors propose an SVD-based approach to representation learning in RL that captures transition dynamics and yields pseudo-count estimates. It scales with deep networks and mini-batch training, and it extends to partially observable environments.

Findings

01. Effective in multi-task partially observable environments

02. Learns useful representations for complex inputs like language and images

03. Improves exploration in hard RL tasks

Abstract

Representation learning and exploration are among the key challenges for any deep reinforcement learning agent. In this work, we provide a singular value decomposition based method that can be used to obtain representations that preserve the underlying transition structure in the domain. Perhaps interestingly, we show that these representations also capture the relative frequency of state visitations, thereby providing an estimate for pseudo-counts for free. To scale this decomposition method to large-scale domains, we provide an algorithm that never requires building the transition matrix, can make use of deep networks, and also permits mini-batch training. Further, we draw inspiration from predictive state representations and extend our decomposition method to partially observable environments. With experiments on multi-task settings with partially observable domains, we show that the…
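
To make the core idea in the abstract concrete, here is a minimal tabular sketch in Python with NumPy. It is not the authors' algorithm: the abstract states that their method never builds the transition matrix, whereas this toy example builds one explicitly for a 5-state ring and factorizes it with a truncated SVD. The function names (`empirical_transition_matrix`, `svd_state_features`), the ring environment, and the choice to split the singular values evenly between the two factors are all assumptions made for illustration.

```python
import numpy as np


def empirical_transition_matrix(transitions, n_states):
    """Estimate P[s, s'] from observed (s, s') pairs; also return visit counts."""
    counts = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        counts[s, s_next] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that were never visited are left as all zeros.
    P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return P, counts.sum(axis=1)


def svd_state_features(P, k):
    """Rank-k factorization P ~= phi @ psi.T via truncated SVD.

    Illustrative assumption: the singular values are split evenly
    (square root on each side) between the two factors.
    """
    U, S, Vt = np.linalg.svd(P, full_matrices=False)
    scale = np.sqrt(S[:k])
    phi = U[:, :k] * scale      # one k-dimensional feature row per current state
    psi = Vt[:k, :].T * scale   # one k-dimensional feature row per next state
    return phi, psi


# Hypothetical toy data: a 5-state ring where each state steps left or right once.
n_states = 5
transitions = []
for s in range(n_states):
    transitions.append((s, (s + 1) % n_states))
    transitions.append((s, (s - 1) % n_states))

P, visits = empirical_transition_matrix(transitions, n_states)
phi, psi = svd_state_features(P, k=3)

# How well the 3-dimensional features reproduce the transition matrix.
reconstruction_error = np.linalg.norm(P - phi @ psi.T)
print("feature shape:", phi.shape)
print("rank-3 reconstruction error:", reconstruction_error)
```

With `k` equal to the number of states, `phi @ psi.T` reproduces `P` up to floating-point error; a smaller `k` gives a compressed representation that keeps the dominant transition structure. This sketch does not demonstrate the abstract's claim about visitation frequencies and pseudo-counts, which depends on details of the paper's construction that are not given here.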

Videos

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing