Latent Variable Representation for Reinforcement Learning

Tongzheng Ren; Chenjun Xiao; Tianjun Zhang; Na Li; Zhaoran Wang; Sujay Sanghavi; Dale Schuurmans; Bo Dai

arXiv:2212.08765 · cs.LG · March 8, 2023 · 1 citation

TL;DR

This paper introduces a latent variable model-based approach for reinforcement learning that enhances sample efficiency through a new planning algorithm with UCB exploration, supported by theoretical analysis and empirical validation.

Contribution

The paper presents a latent variable representation of state-action value functions for RL, which enables efficient planning and exploration with theoretical guarantees and improved empirical performance.

Findings

01. Proposed a kernel embedding-based UCB planning algorithm.
02. Established sample complexity bounds for the approach.
03. Demonstrated improved performance on benchmark tasks.

Abstract

Deep latent variable models have achieved significant empirical successes in model-based reinforcement learning (RL) due to their expressiveness in modeling complex transition dynamics. On the other hand, it remains unclear, both theoretically and empirically, how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of RL. In this paper, we provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle in the face of uncertainty for exploration. In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models. Theoretically, we establish the sample complexity of the proposed approach in the online…
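
To make the idea of UCB exploration on top of a learned representation concrete, the following is a minimal sketch of the elliptical bonus commonly used when value functions are linear in a feature map. It is not taken from the paper: the function names `ucb_bonus` and `optimistic_q`, the parameters `alpha` and `reg`, and the toy features are all assumptions made for illustration, and the paper's own bonus built from kernel embeddings of the latent variable model may differ.

```python
import numpy as np

def ucb_bonus(phi, history, alpha=1.0, reg=1.0):
    """Elliptical bonus alpha * sqrt(phi^T Lambda^{-1} phi).

    phi:     (d,) feature vector of the query state-action pair (hypothetical).
    history: (n, d) features of previously visited state-action pairs.
    """
    d = phi.shape[0]
    # Regularized empirical covariance: Lambda = reg * I + sum_i phi_i phi_i^T
    cov = reg * np.eye(d) + history.T @ history
    return alpha * float(np.sqrt(phi @ np.linalg.solve(cov, phi)))

def optimistic_q(phi, weights, history, alpha=1.0, reg=1.0):
    """Optimistic value estimate: linear Q plus the exploration bonus."""
    return float(phi @ weights) + ucb_bonus(phi, history, alpha, reg)

# Toy data (assumed): nine visits along the first feature direction, none along the second.
history = np.tile(np.array([1.0, 0.0]), (9, 1))
seen = ucb_bonus(np.array([1.0, 0.0]), history)    # well-explored direction
unseen = ucb_bonus(np.array([0.0, 1.0]), history)  # unexplored direction
print(seen < unseen)
```

In this toy setup the bonus is smaller along the direction that has been visited often, so an agent acting greedily with respect to `optimistic_q` is pushed toward state-action pairs whose features are poorly covered by past data.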

Videos

Latent Variable Representation for Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics