TL;DR
This paper presents a curriculum learning approach for reinforcement learning that efficiently trains a quadrotor to stabilize from random initial conditions, reducing training time and computational resources.
Contribution
The authors introduce a three-stage curriculum learning method that improves sample efficiency and robustness in training quadrotor stabilization policies.
Findings
The curriculum approach outperforms conventional one-stage training.
It significantly reduces the number of training samples required and the time to convergence.
The trained policy demonstrates robustness in simulation and pose-tracking scenarios.
Abstract
This article introduces a novel sample-efficient curriculum learning (CL) approach for training an end-to-end reinforcement learning (RL) policy for robust stabilization of a quadrotor. The learning objective is to simultaneously stabilize position and yaw orientation from random initial conditions through direct control over motor RPMs (end-to-end), while adhering to pre-specified transient and steady-state specifications. This objective, relevant in aerial inspection applications, is challenging for conventional one-stage end-to-end RL, which requires substantial computational resources and lengthy training times. To address this challenge, this article draws on human-inspired curriculum learning and decomposes the learning objective into a three-stage curriculum that incrementally increases task complexity, while transferring knowledge from one stage to the next. In the…
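The general pattern the abstract describes (train in stages of increasing difficulty, and start each stage from the parameters learned in the previous one) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' method: a 1-D double integrator stands in for the quadrotor, a two-gain PD controller stands in for the neural policy, random-search hill climbing stands in for the RL algorithm, and the three stage difficulties (ranges of the random initial state) are hypothetical values, since the abstract excerpt does not define the actual stages.

```python
import random


def rollout_cost(gains, x0, v0, dt=0.02, steps=200):
    """Simulate a toy 1-D double integrator under PD control; return the cost."""
    kp, kd = gains
    x, v, cost = x0, v0, 0.0
    for _ in range(steps):
        u = -kp * x - kd * v              # PD feedback toward the origin
        u = max(-10.0, min(10.0, u))      # saturate the command (keeps states bounded)
        v += u * dt
        x += v * dt
        cost += (x * x + 0.1 * v * v) * dt
    return cost


def mean_cost(gains, starts):
    """Average rollout cost over a batch of (x0, v0) initial conditions."""
    return sum(rollout_cost(gains, x0, v0) for x0, v0 in starts) / len(starts)


def train_stage(gains, init_range, rng, iterations=60, step=0.5, batch=6):
    """Hill-climb the gains on initial conditions drawn from [-init_range, init_range]."""
    best = list(gains)
    for _ in range(iterations):
        starts = [(rng.uniform(-init_range, init_range),
                   rng.uniform(-init_range, init_range)) for _ in range(batch)]
        candidate = [max(0.0, g + rng.gauss(0.0, step)) for g in best]
        # Compare both parameter sets on the same batch of initial conditions.
        if mean_cost(candidate, starts) < mean_cost(best, starts):
            best = candidate
    return best


def run_curriculum(stages=(0.5, 2.0, 5.0), seed=0):
    """Run the stages in order; each stage warm-starts from the previous result."""
    rng = random.Random(seed)
    gains = [0.0, 0.0]                    # untrained "policy"
    history = []
    for init_range in stages:             # hypothetical difficulty schedule
        gains = train_stage(gains, init_range, rng)
        history.append((init_range, list(gains)))
    return gains, history


if __name__ == "__main__":
    final_gains, history = run_curriculum()
    for init_range, stage_gains in history:
        print(f"stage with init range {init_range}: gains = {stage_gains}")
```

The only curriculum-specific part is the loop in `run_curriculum`: the parameters returned by one stage are passed unchanged as the starting point of the next, harder stage. A one-stage baseline in this toy setting would correspond to calling `train_stage` once with the widest range, starting from the untrained gains.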
