A Plug-and-Play Fully On-the-Job Real-Time Reinforcement Learning Algorithm for a Direct-Drive Tandem-Wing Experiment Platforms Under Multiple Random Operating Conditions
Zhang Minghao, Song Bifeng, Yang Xiaojun, Wang Liang

TL;DR
This paper introduces CRL2E, a plug-and-play, real-time reinforcement learning algorithm designed for stable control of tandem-wing systems under diverse, unpredictable conditions, outperforming existing methods in accuracy and convergence speed.
Contribution
The paper presents a novel Physics-Inspired Rule-Based Policy Composer Strategy with a Perturbation Module for real-time control, significantly improving stability and efficiency over prior algorithms.
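As a rough illustration of the idea named above, the sketch below blends a rule-based action with a learned action and adds a random perturbation. It is not the authors' implementation: the PD rule, the linear blending weight, the Gaussian noise, and every name (`rule_based_action`, `compose_action`, `weight`, `noise_std`, `limit`) are assumptions made for this example, since the summary does not specify how the composer or the Perturbation Module work.

```python
import random


def rule_based_action(error, error_rate, kp=1.0, kd=0.1):
    """Hypothetical physics-inspired rule: a PD law on the tracking error."""
    return kp * error + kd * error_rate


def compose_action(rule_action, learned_action, weight,
                   noise_std=0.0, rng=None, limit=1.0):
    """Blend the two actions, add a perturbation, and clip to +/- limit.

    weight = 0 uses only the rule, weight = 1 uses only the learned policy.
    The blending scheme is an assumption, not taken from the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    rng = rng or random
    # Assumed perturbation: zero-mean Gaussian noise for exploration.
    perturbation = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    blended = (1.0 - weight) * rule_action + weight * learned_action + perturbation
    # Clip so the command stays inside the (assumed) actuator range.
    return max(-limit, min(limit, blended))
```

One plausible use, again as an assumption, is to start with `weight` near 0 so the rule keeps early training safe, then raise it as the learned policy improves.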
Findings
CRL2E achieves safe, stable training within 500 steps.
CRL2E improves tracking accuracy by up to 66 times over baseline algorithms.
CRL2E converges 36-58% faster than previous CRL methods.
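The figures above can be read as simple ratios. The sketch below shows one plausible way such numbers are computed, assuming "N times more accurate" means baseline tracking error divided by the new tracking error, and "P% faster" means the relative reduction in steps to convergence. Both definitions and both function names are assumptions; the summary does not state how the paper measures them, and the inputs in the usage note are illustrative values, not reported data.

```python
def accuracy_gain(baseline_error, new_error):
    """Assumed definition: how many times smaller the new tracking error is."""
    if new_error <= 0:
        raise ValueError("new_error must be positive")
    return baseline_error / new_error


def convergence_speedup_percent(baseline_steps, new_steps):
    """Assumed definition: percentage reduction in steps needed to converge."""
    if baseline_steps <= 0:
        raise ValueError("baseline_steps must be positive")
    return 100.0 * (baseline_steps - new_steps) / baseline_steps
```

Under these definitions, a hypothetical baseline error of 66.0 against a new error of 1.0 gives a gain of 66, and converging in 640 steps instead of a hypothetical 1000 gives a 36% speedup.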
Abstract
The nonlinear and unstable aerodynamic interference generated by the tandem wings of biomimetic tandem-wing systems poses substantial challenges for motion control, especially under multiple random operating conditions. To address these challenges, the Concerto Reinforcement Learning Extension (CRL2E) algorithm has been developed. This plug-and-play, fully on-the-job, real-time reinforcement learning algorithm incorporates a novel Physics-Inspired Rule-Based Policy Composer Strategy with a Perturbation Module alongside a lightweight network optimized for real-time control. To validate the performance and the rationality of the module design, experiments were conducted under six challenging operating conditions, comparing seven different algorithms. The results demonstrate that the CRL2E algorithm achieves safe and stable training within the first 500 steps, improving tracking accuracy by 14 to…
Taxonomy
Topics: Reinforcement Learning in Robotics · Aerospace and Aviation Technology · Advanced Control Systems Optimization
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
