TL;DR
CubeDAgger is an interactive imitation learning method that makes an agent's policy more robust while preserving dynamic stability in dynamic tasks. It requires less expert intervention, and real-robot experiments demonstrate its effectiveness.
Contribution
CubeDAgger adds three improvements to EnsembleDAgger, enabling low-risk, stable learning in dynamic environments with minimal expert supervision.
Findings
Policies trained with CubeDAgger maintain dynamic stability during interaction.
The method achieves robust policies with only 30 minutes of human-robot interaction.
Simulation results confirm reduced stability violations compared to baseline methods.
Abstract
Interactive imitation learning makes an agent's control policy robust through stepwise supervision from an expert. Recent algorithms mostly employ expert-agent switching systems, which reduce the expert's burden by restricting when supervision is given. However, this approach is useful only for static tasks; in dynamic tasks, timing discrepancies cause abrupt changes in actions, and the robot loses its dynamic stability. This paper therefore proposes a novel method, named CubeDAgger, which improves robustness with fewer dynamic stability violations even in dynamic tasks. The proposed method builds on a baseline, EnsembleDAgger, with three improvements. The first adds a regularization to explicitly activate the threshold for deciding the supervision timing. The second transforms the expert-agent switching system into an optimal consensus system of multiple action candidates. Third,…
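The expert-agent switching that the abstract criticizes can be illustrated with a minimal sketch of an EnsembleDAgger-style gate. This is not CubeDAgger and not the authors' code: the function name `switching_gate`, the use of summed per-dimension ensemble variance as the "doubt" measure, and the threshold values are all assumptions made for illustration. The agent's action is executed only when the ensemble agrees with itself and with the expert; otherwise control jumps to the expert.

```python
import numpy as np


def switching_gate(ensemble_actions, expert_action,
                   doubt_threshold, discrepancy_threshold):
    """Hypothetical hard expert-agent switch (baseline-style, not CubeDAgger).

    ensemble_actions: array-like of shape (n_members, action_dim)
    expert_action:    array-like of shape (action_dim,)
    Returns (action_to_execute, expert_was_used).
    """
    actions = np.asarray(ensemble_actions, dtype=float)
    expert = np.asarray(expert_action, dtype=float)

    # Agent action: mean over ensemble members.
    agent_action = actions.mean(axis=0)

    # "Doubt": ensemble disagreement, here assumed to be the summed variance.
    doubt = float(actions.var(axis=0).sum())

    # Discrepancy: squared distance between agent and expert actions.
    discrepancy = float(np.sum((agent_action - expert) ** 2))

    agent_is_trusted = (doubt <= doubt_threshold
                        and discrepancy <= discrepancy_threshold)
    if agent_is_trusted:
        return agent_action, False
    # Hard switch: the executed action jumps to the expert's action.
    return expert, True


# Usage: a confident ensemble keeps control; a divided one hands over.
confident = [[0.10, 0.00], [0.10, 0.00]]
divided = [[1.00, 0.00], [-1.00, 0.00]]
expert = [0.10, 0.00]

_, used_expert_a = switching_gate(confident, expert, 0.01, 0.01)
_, used_expert_b = switching_gate(divided, expert, 0.01, 0.01)
```

Because the output flips between two separately computed actions, the executed command can change discontinuously at the switching instant. That is the kind of abrupt change the abstract attributes to switching systems in dynamic tasks, and which the consensus system of multiple action candidates is meant to replace.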
