Understanding Hindsight Goal Relabeling from a Divergence Minimization Perspective

Lunjun Zhang; Bradly C. Stadie

arXiv:2209.13046 · cs.LG · January 31, 2023

TL;DR

This paper offers a new perspective on hindsight goal relabeling in multi-goal reinforcement learning by framing it as divergence minimization within imitation learning, leading to insights and improved algorithms.

Contribution

It introduces a divergence minimization framework for understanding hindsight goal relabeling, connecting it to imitation learning and deriving new insights for goal-reaching algorithms.

Findings

01 Q-learning outperforms behavioral cloning under hindsight relabeling

02 Selective application of behavioral cloning improves performance

03 Reward design significantly impacts goal-reaching success

Abstract

Hindsight goal relabeling has become a foundational technique in multi-goal reinforcement learning (RL). The essential idea is that any trajectory can be seen as a sub-optimal demonstration for reaching its final state. Intuitively, learning from those arbitrary demonstrations can be seen as a form of imitation learning (IL). However, the connection between hindsight goal relabeling and imitation learning is not well understood. In this paper, we propose a novel framework to understand hindsight goal relabeling from a divergence minimization perspective. Recasting the goal reaching problem in the IL framework not only allows us to derive several existing methods from first principles, but also provides us with the tools from IL to improve goal reaching algorithms. Experimentally, we find that under hindsight relabeling, Q-learning outperforms behavioral cloning (BC). Yet, a vanilla…
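
The relabeling idea described in the abstract, treating a trajectory as a demonstration for reaching its own final state, can be illustrated with a minimal sketch. This is not the authors' implementation. The `Transition` type, the function name `relabel_with_final_state`, and the sparse 0/1 reward are all hypothetical choices made for illustration only; the paper itself notes that reward design matters.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[float, ...]


class Transition(NamedTuple):
    """Hypothetical container for one goal-conditioned step."""
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def relabel_with_final_state(trajectory: List[Transition]) -> List[Transition]:
    """Replace each transition's goal with the state the trajectory ended in.

    Assumption: the reward is a sparse indicator that is 1.0 when a step
    lands on the relabeled goal and 0.0 otherwise.
    """
    if not trajectory:
        return []
    achieved = trajectory[-1].next_state
    relabeled = []
    for step in trajectory:
        reward = 1.0 if step.next_state == achieved else 0.0
        relabeled.append(step._replace(goal=achieved, reward=reward))
    return relabeled


# Usage: a two-step trajectory that never reached its original goal (5, 5).
original_goal = (5.0, 5.0)
trajectory = [
    Transition((0.0, 0.0), 1, (1.0, 0.0), original_goal, 0.0),
    Transition((1.0, 0.0), 0, (1.0, 1.0), original_goal, 0.0),
]
relabeled = relabel_with_final_state(trajectory)
```

After relabeling, both transitions carry the goal `(1.0, 1.0)`, so the failed attempt becomes a successful (if sub-optimal) demonstration for that goal. The relabeled data could then feed either a behavioral cloning loss or a Q-learning update, which are the two approaches the abstract compares.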

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Neural and Behavioral Psychology Studies

Methods: Q-Learning · Experience Replay