Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition

Justin Fu; Avi Singh; Dibya Ghosh; Larry Yang; Sergey Levine

arXiv:1805.11686 · cs.LG · November 14, 2018 · 34 citations

Open Access

TL;DR

This paper introduces VICE, a framework that learns reward functions from samples of goal states rather than full demonstrations, so that reinforcement learning can be applied to continuous control tasks with high-dimensional observations.

Contribution

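The paper presents VICE, a variational inverse control method that generalizes inverse reinforcement learning to settings where full demonstrations are not available, such as when only samples of desired goal states are given. It frames control as maximizing the probability that one or more events occur at some point in the future, rather than maximizing cumulative reward.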

Findings

01 Effective in continuous control tasks with high-dimensional observations
02 Requires only goal state samples, not full demonstrations
03 Outperforms traditional inverse reinforcement learning methods

Abstract

The design of a reward function often poses a major practical challenge to real-world applications of reinforcement learning. Approaches such as inverse reinforcement learning attempt to overcome this challenge, but require expert demonstrations, which can be difficult or expensive to obtain in practice. We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available. Our method is grounded in an alternative perspective on control and reinforcement learning, where an agent's goal is to maximize the probability that one or more events will happen at some point in the future, rather than maximizing cumulative rewards. We demonstrate the effectiveness of our methods on continuous control tasks, with a focus on high-dimensional…
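The idea in the abstract of defining a reward from goal state samples can be illustrated with a rough sketch. It is not the paper's algorithm: it swaps in a plain logistic-regression classifier, and the names `fit_event_classifier` and `event_log_prob`, the 2-D toy states, and the hyperparameters are all assumptions made for illustration. The classifier is trained to tell example goal states apart from other states, and its log-probability that the event has occurred is then read off as a reward signal.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_event_classifier(goal_states, other_states, lr=0.5, steps=500):
    """Fit logistic regression by gradient descent (hypothetical helper).

    Goal examples are labeled 1 ("event happened"), other states 0.
    """
    x = np.vstack([goal_states, other_states])
    y = np.concatenate([np.ones(len(goal_states)), np.zeros(len(other_states))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        err = p - y  # per-example gradient of cross-entropy w.r.t. the logit
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def event_log_prob(states, w, b):
    """log p(event | state) under the classifier, used here as a reward."""
    logits = states @ w + b
    return -np.logaddexp(0.0, -logits)  # stable log(sigmoid(logits))

# Toy data (assumed): goal states cluster near (2, 2), other states near (0, 0).
rng = np.random.default_rng(0)
goal_states = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(200, 2))
other_states = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))

w, b = fit_event_classifier(goal_states, other_states)

reward_near = event_log_prob(np.array([[2.0, 2.0]]), w, b)[0]
reward_far = event_log_prob(np.array([[-1.0, -1.0]]), w, b)[0]
print(f"reward near goal: {reward_near:.3f}, far from goal: {reward_far:.3f}")
```

In this toy setting a state near the goal cluster receives a higher (less negative) reward than a state far from it. In a reinforcement learning loop, the "other" states would typically come from the current policy and the classifier would be refit as the policy changes; that loop is omitted here.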

Taxonomy

Topics: Advanced Control Systems Optimization · Fault Detection and Control Systems · Reinforcement Learning in Robotics