Constraint Estimation and Derivative-Free Recovery for Robot Learning from Demonstrations

Jonathan Lee; Michael Laskey; Roy Fox; Ken Goldberg

arXiv:1801.10321 · cs.RO · October 17, 2018

TL;DR

This paper introduces Derivative-Free Recovery (DFR), a two-phase method that enhances the safety of learned robotic manipulation policies by estimating support from demonstrations and switching to recovery policies to avoid unsafe states.

Contribution

The paper presents a novel support estimation approach and a switching policy framework that guarantees constraint satisfaction without explicit constraint modeling.

Findings

01. DFR reduces collisions by 83% in MuJoCo simulation.

02. DFR decreases collisions by 84% in real-world surgical and platform tasks.

03. The method provides theoretical guarantees for constraint adherence during execution.

Abstract

Learning from human demonstrations can facilitate automation but is risky because the execution of the learned policy might lead to collisions and other failures. Adding explicit constraints to avoid unsafe states is generally not possible when the state representations are complex. Furthermore, enforcing these constraints during execution of the learned policy can be challenging in environments where dynamics are difficult to model such as push mechanics in grasping. In this paper, we propose Derivative-Free Recovery (DFR), a two-phase method for generating robust policies from demonstrations in robotic manipulation tasks where the system comes to rest at each time step. In the first phase, we use support estimation of supervisor demonstrations and treat the support as implicit constraints on states. We also propose a time-varying modification for sequential tasks. In the second phase,…
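The two phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the kernel-density score stands in for whatever support estimator the paper uses, the coordinate-search recovery step is one simple derivative-free choice, and the names `SupportEstimate`, `make_recovery_policy`, `switching_policy`, `rollout`, and the trial function `step_fn` are all hypothetical. The sketch also makes no attempt to reproduce the paper's theoretical guarantees.

```python
import math


def kernel(x, y, bandwidth):
    """Gaussian kernel between two states given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


class SupportEstimate:
    """Phase 1 (assumed form): score states by closeness to demonstrations."""

    def __init__(self, demo_states, bandwidth=0.5, quantile=0.05):
        self.demo_states = [tuple(s) for s in demo_states]
        self.bandwidth = bandwidth
        # Threshold chosen so that most demonstration states count as inside.
        scores = sorted(self.score(s) for s in self.demo_states)
        self.threshold = scores[int(quantile * (len(scores) - 1))]

    def score(self, state):
        total = sum(kernel(state, d, self.bandwidth) for d in self.demo_states)
        return total / len(self.demo_states)

    def is_safe(self, state):
        return self.score(state) >= self.threshold


def make_recovery_policy(support, step_fn, step_size=0.1):
    """Phase 2 (assumed form): derivative-free coordinate search.

    step_fn(state, action) is a hypothetical stand-in for trying a small
    motion and observing where the system comes to rest.
    """
    def recovery(state):
        best_action, best_score = None, -math.inf
        for i in range(len(state)):
            for sign in (1.0, -1.0):
                action = [0.0] * len(state)
                action[i] = sign * step_size
                candidate = support.score(step_fn(state, action))
                if candidate > best_score:
                    best_action, best_score = action, candidate
        return best_action
    return recovery


def switching_policy(state, support, learned_policy, recovery_policy):
    """Use the learned policy inside the estimated support, else recover."""
    if support.is_safe(state):
        return learned_policy(state)
    return recovery_policy(state)


def rollout(start, policy, step_fn, n_steps):
    """Apply a policy for n_steps and return the visited states."""
    states = [tuple(start)]
    for _ in range(n_steps):
        states.append(step_fn(states[-1], policy(states[-1])))
    return states


# Toy setup: demonstrations lie on the segment from (0, 0) to (1, 0).
demo_states = [(i / 20, 0.0) for i in range(21)]
support = SupportEstimate(demo_states)


def step(state, action):
    # Toy dynamics: the state moves by exactly the commanded action.
    return tuple(x + u for x, u in zip(state, action))


def drifting_policy(state):
    # A deliberately bad "learned" policy that drifts away from the demos.
    return [0.0, 0.2]


recover = make_recovery_policy(support, step)
unsafe_run = rollout((0.5, 0.0), drifting_policy, step, 20)
safe_run = rollout(
    (0.5, 0.0),
    lambda s: switching_policy(s, support, drifting_policy, recover),
    step,
    20,
)
```

In this toy run the drifting policy alone leaves the demonstrated region and keeps going, while the switched rollout is pushed back toward the demonstrations whenever the support score falls below the threshold. The threshold quantile, bandwidth, and step size are arbitrary illustrative values.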

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Advanced Control Systems Optimization