Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method

Duo Xu; Faramarz Fekri

arXiv:2103.12020 · cs.LG · January 4, 2022

Open Access

TL;DR

This paper introduces Hamiltonian Policy, integrating Hamiltonian Monte Carlo with actor-critic reinforcement learning to enhance policy approximation, exploration, and safety in continuous control tasks.

Contribution

It proposes a novel Hamiltonian Policy method that reduces the amortization gap, improves exploration, and enhances safety in actor-critic RL through HMC integration and a new leapfrog operator.

Findings

01. Improves policy approximation and exploration efficiency.

02. Reduces safety constraint violations in safe RL.

03. Achieves better data efficiency on continuous control benchmarks.

Abstract

Actor-critic RL is widely used in robotic control tasks. Viewed from the perspective of variational inference (VI), the policy network is trained to approximate the posterior over actions given the optimality criteria. In practice, however, actor-critic RL may yield suboptimal policy estimates because of the amortization gap and insufficient exploration. In this work, inspired by previous uses of Hamiltonian Monte Carlo (HMC) in VI, we propose to integrate the policy network of actor-critic RL with HMC, an approach we term Hamiltonian Policy. Specifically, we evolve actions sampled from the base policy according to HMC, which brings several benefits. First, HMC can improve the policy distribution so that it better approximates the posterior, and hence reduce the amortization gap. Second, HMC can also guide the exploration more to the…
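As a rough illustration of what "evolving actions from the base policy according to HMC" can look like, the sketch below applies the standard leapfrog integrator to an action, using the negative of a critic value as the potential energy. This is the textbook leapfrog scheme, not the new leapfrog operator the paper proposes. Every name here (`toy_q_gradient`, `leapfrog_refine`, the quadratic critic, the step size and step count) is a hypothetical choice made for the example and does not come from the paper.

```python
import numpy as np


def toy_q_gradient(action, target):
    """Gradient w.r.t. the action of a hypothetical critic
    Q(a) = -0.5 * ||a - target||^2 (an assumption for illustration)."""
    return -(np.asarray(action, dtype=float) - np.asarray(target, dtype=float))


def leapfrog_refine(action, grad_q, step_size=0.1, n_steps=10,
                    momentum=None, rng=None):
    """Evolve an action with standard leapfrog steps.

    The potential energy is taken to be U(a) = -Q(a), so the force
    -dU/da equals dQ/da, which is what `grad_q` returns.
    Returns the final action and the final momentum.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    a = np.array(action, dtype=float)
    # Draw a Gaussian momentum unless the caller supplies one.
    p = (rng.standard_normal(a.shape) if momentum is None
         else np.array(momentum, dtype=float))

    # Initial half step for the momentum.
    p = p + 0.5 * step_size * grad_q(a)
    for i in range(n_steps):
        # Full step for the action.
        a = a + step_size * p
        # Full momentum step, except after the last action update.
        if i < n_steps - 1:
            p = p + step_size * grad_q(a)
    # Final half step for the momentum.
    p = p + 0.5 * step_size * grad_q(a)
    return a, p
```

In this toy setup, `action` would be a sample from the base policy and `grad_q` would wrap the critic's gradient for the current state, for example `lambda a: toy_q_gradient(a, target)`. A full HMC sampler would also include an accept/reject step, which this sketch omits.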

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning

Methods: Variational Inference