TL;DR
This paper introduces a novel framework for robot learning that effectively handles crash constraints and failures, using a Gaussian process model to improve data efficiency and constraint estimation in real robotic systems.
Contribution
The paper proposes GPCR, a new Gaussian process (GP) model for learning with crash constraints, which addresses data scarcity and failure handling in robot learning.
Findings
GPCR outperforms manual tuning in experiments.
The framework successfully estimates unknown constraint thresholds.
The approach is demonstrated in simulation and on a real quadruped robot.
Abstract
In the past decade, numerous machine learning algorithms have been shown to successfully learn optimal policies to control real robotic systems. However, it is common to encounter failing behaviors as the learning loop progresses. Specifically, in robot applications where failing is undesired but not catastrophic, many algorithms struggle with leveraging data obtained from failures. This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted. Both complicate the design of proper reward functions to penalize failures. In this paper, we propose a framework that addresses those issues. We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation. The no-data case is addressed by a novel GP model (GPCR) for the…
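To make the crash-constraint setting in the abstract concrete, here is a minimal sketch of the data a learner receives: a failed experiment yields only the discrete event "failure", while a successful one also yields continuous measurements. This is not the GPCR model. The toy system, the names `run_experiment`, `CrashLog`, and `TRUE_THRESHOLD`, and the simple lower-bound rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical toy threshold; the learner is assumed not to know it.
TRUE_THRESHOLD = 0.5


def run_experiment(x: float) -> Optional[Tuple[float, float]]:
    """Toy system (assumption): crashes when the constraint value exceeds the threshold.

    Returns (cost, constraint_value) on success and None on a crash,
    mirroring the case where no data is obtained upon constraint violation.
    """
    g = x * x  # toy constraint value
    if g > TRUE_THRESHOLD:
        return None  # crash: nothing is measured
    cost = (x - 1.0) ** 2  # toy cost, observed only on success
    return cost, g


@dataclass
class CrashLog:
    """Keeps successes (with measurements) and failures (inputs only) apart."""
    successes: List[Tuple[float, float, float]] = field(default_factory=list)  # (x, cost, g)
    failures: List[float] = field(default_factory=list)  # x only

    def record(self, x: float, outcome: Optional[Tuple[float, float]]) -> None:
        if outcome is None:
            self.failures.append(x)
        else:
            cost, g = outcome
            self.successes.append((x, cost, g))

    def threshold_lower_bound(self) -> Optional[float]:
        """Largest constraint value seen on a success.

        Every successful run satisfied the constraint, so the unknown
        threshold cannot be smaller than this value.
        """
        if not self.successes:
            return None
        return max(g for _, _, g in self.successes)


log = CrashLog()
for x in [0.2, 0.5, 0.7, 0.9, 1.0]:
    log.record(x, run_experiment(x))

print("successes:", len(log.successes), "failures:", len(log.failures))
print("threshold lower bound:", log.threshold_lower_bound())
```

In this sketch the failed inputs carry no cost or constraint value at all, which is why a plain regression model fitted only to the successes would ignore them. A model that also uses the failure events, as the abstract describes for GPCR, is one way to recover that information.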
