Constraint Estimation and Derivative-Free Recovery for Robot Learning from Demonstrations
Jonathan Lee, Michael Laskey, Roy Fox, Ken Goldberg

TL;DR
This paper introduces Derivative-Free Recovery (DFR), a two-phase method that enhances the safety of learned robotic manipulation policies by estimating support from demonstrations and switching to recovery policies to avoid unsafe states.
Contribution
The paper presents a novel support estimation approach and a switching policy framework that guarantees constraint satisfaction without explicit constraint modeling.
Findings
DFR reduces collisions by 83% in MuJoCo simulation.
DFR decreases collisions by 84% in real-world surgical and platform tasks.
The method provides theoretical guarantees for constraint adherence during execution.
Abstract
Learning from human demonstrations can facilitate automation but is risky because the execution of the learned policy might lead to collisions and other failures. Adding explicit constraints to avoid unsafe states is generally not possible when the state representations are complex. Furthermore, enforcing these constraints during execution of the learned policy can be challenging in environments where dynamics are difficult to model, such as push mechanics in grasping. In this paper, we propose Derivative-Free Recovery (DFR), a two-phase method for generating robust policies from demonstrations in robotic manipulation tasks where the system comes to rest at each time step. In the first phase, we use support estimation of supervisor demonstrations and treat the support as implicit constraints on states. We also propose a time-varying modification for sequential tasks. In the second phase,…
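The two ideas named above, estimating the support of the demonstrated states and switching to a recovery behavior outside it, can be illustrated with a small sketch. This is not the authors' implementation. The text shown here does not name the support estimator or the recovery procedure, so the kernel density level set, the random-search recovery step, the assumption that an action is a small state displacement, and all names (`fit_support`, `switching_action`, `bandwidth`, `quantile`, `step`, `samples`) are hypothetical choices made only for illustration.

```python
import math
import random


def fit_support(demos, bandwidth=0.5, quantile=0.05):
    """Return g(x), which is >= 0 where x lies inside the estimated support.

    Assumption: support is approximated by a level set of a Gaussian
    kernel density estimate built from the demonstrated states.
    """
    def density(x):
        total = 0.0
        for d in demos:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, d))
            total += math.exp(-sq_dist / (2 * bandwidth ** 2))
        return total / len(demos)

    # Threshold chosen so that roughly `quantile` of the demos fall outside.
    scores = sorted(density(d) for d in demos)
    threshold = scores[int(quantile * len(scores))]
    return lambda x: density(x) - threshold


def switching_action(x, g, learned_policy, rng, step=0.2, samples=16):
    """Use the learned policy inside the support, otherwise try to recover.

    Assumption: an action is a displacement added to the state, which is
    only plausible when the system comes to rest at each time step.
    """
    if g(x) >= 0:
        return learned_policy(x)
    # Derivative-free recovery: evaluate g at sampled candidate moves and
    # keep the best one; no gradient of g or dynamics model is required.
    best = tuple(0.0 for _ in x)
    best_val = g(x)
    for _ in range(samples):
        delta = tuple(rng.uniform(-step, step) for _ in x)
        val = g(tuple(a + b for a, b in zip(x, delta)))
        if val > best_val:
            best, best_val = delta, val
    return best


# Hypothetical demonstrations: a 5x5 grid of 2-D states around the origin.
demos = [(i * 0.25 - 0.5, j * 0.25 - 0.5) for i in range(5) for j in range(5)]
g = fit_support(demos)
policy = lambda x: (0.1, 0.0)  # stand-in for a learned policy
rng = random.Random(0)

inside, outside = (0.0, 0.0), (1.5, 0.0)
action_inside = switching_action(inside, g, policy, rng)
action_outside = switching_action(outside, g, policy, rng)
recovered = tuple(a + b for a, b in zip(outside, action_outside))
```

In this sketch the recovery move never lowers `g`, because the zero displacement is the fallback when no sampled candidate improves on the current state. A real system would still need a bound on how far the learned policy can push the state in one step before any safety claim could be made.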
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Advanced Control Systems Optimization
