On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

Udari Madhushani; Biswadip Dey; Naomi Ehrich Leonard; Amit Chakraborty

arXiv:2011.05927·cs.LG·March 29, 2022

Open Access

TL;DR

This paper introduces Hamiltonian Q-Learning, a novel approach that leverages Hamiltonian Monte Carlo sampling and matrix completion to efficiently learn value functions in high-dimensional, stochastic reinforcement learning environments.

Contribution

It presents a new framework combining HMC sampling with matrix completion for effective Q-learning in complex high-dimensional stochastic settings.

Findings

01. Theoretically demonstrates the viability of HMC-generated data for Q-learning.

02. Empirically validates the approach on high-dimensional RL problems.

03. Shows improved data efficiency and scalability in complex environments.

Abstract

Value function based reinforcement learning (RL) algorithms, for example, $Q$-learning, learn optimal policies from datasets of actions, rewards, and state transitions. However, when the underlying state transition dynamics are stochastic and evolve on a high-dimensional space, generating independent and identically distributed (IID) data samples for creating these datasets poses a significant challenge due to the intractability of the associated normalizing integral. In these scenarios, Hamiltonian Monte Carlo (HMC) sampling offers a computationally tractable way to generate data for training RL algorithms. In this paper, we introduce a framework, called Hamiltonian $Q$-Learning, that demonstrates, both theoretically and empirically, that $Q$ values can be learned from a dataset generated by HMC samples of actions, rewards, and state transitions. Furthermore, to exploit the…
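
To make the role of HMC in the abstract concrete, the following is a minimal sketch of a generic HMC sampler, not the paper's algorithm. It illustrates why the normalizing integral is not needed: the sampler only evaluates the log-density up to an additive constant, together with its gradient. The function names (`leapfrog`, `hmc_sample`), the step size, the number of leapfrog steps, and the toy 2-D standard Gaussian target are all assumptions chosen for illustration.

```python
import math
import random


def leapfrog(x, p, grad_log_prob, step_size, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme."""
    x, p = list(x), list(p)
    # Initial half step for the momentum.
    g = grad_log_prob(x)
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, g)]
    for step in range(n_steps):
        # Full step for the position.
        x = [xi + step_size * pi for xi, pi in zip(x, p)]
        g = grad_log_prob(x)
        # Full momentum step, except a half step at the very end.
        scale = step_size if step < n_steps - 1 else 0.5 * step_size
        p = [pi + scale * gi for pi, gi in zip(p, g)]
    return x, p


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_steps=10, seed=0):
    """Draw samples from an unnormalized density exp(log_prob(x))."""
    rng = random.Random(seed)
    x = list(x0)
    samples, n_accept = [], 0
    for _ in range(n_samples):
        # Resample the auxiliary momentum from a standard Gaussian.
        p0 = [rng.gauss(0.0, 1.0) for _ in x]
        x_new, p_new = leapfrog(x, p0, grad_log_prob, step_size, n_steps)
        # Hamiltonian = potential (-log_prob) + kinetic energy.
        h_old = -log_prob(x) + 0.5 * sum(pi * pi for pi in p0)
        h_new = -log_prob(x_new) + 0.5 * sum(pi * pi for pi in p_new)
        # Metropolis correction for the discretization error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
            n_accept += 1
        samples.append(list(x))
    return samples, n_accept / n_samples


# Toy target (an assumption): unnormalized 2-D standard Gaussian.
def log_prob(x):
    return -0.5 * sum(xi * xi for xi in x)


def grad_log_prob(x):
    return [-xi for xi in x]


samples, accept_rate = hmc_sample(log_prob, grad_log_prob, [0.0, 0.0],
                                  n_samples=2000)
```

In an RL setting one would presumably replace the toy target with an unnormalized density over next states, but how the paper constructs that density is not stated in this excerpt.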

Taxonomy

Topics: Model Reduction and Neural Networks · Markov Chains and Monte Carlo Methods · Advancements in Semiconductor Devices and Circuit Design

Methods: Q-Learning