Training Efficient Controllers via Analytic Policy Gradient
Nina Wiedemann, Valentin Wüest, Antonio Loquercio, Matthias Müller, Dario Floreano, Davide Scaramuzza

TL;DR
This paper introduces an Analytic Policy Gradient method that trains efficient, accurate controllers offline using differentiable simulators, outperforming RL and matching MPC in accuracy while being computationally cheaper.
Contribution
The paper presents a novel APG approach that leverages differentiable simulators for offline training of controllers, addressing training instability with curriculum learning.
Findings
APG achieves lower tracking error than RL.
APG matches the tracking performance of MPC while requiring significantly less computation.
Open-source code for APG is provided.
Abstract
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with…
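To make the idea of "gradient descent on the tracking error" through a differentiable simulator concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a toy 1-D point mass with explicit Euler integration, a two-parameter linear feedback policy instead of a neural network, and a hand-written reverse pass instead of an autodiff framework. The curriculum mentioned in the abstract is omitted. All names (`rollout`, `tracking_loss`, `analytic_gradient`, `train`, `DT`) are hypothetical.

```python
DT = 0.05  # integration step of the toy simulator (assumed value)


def rollout(theta, refs, dt=DT):
    """Simulate a 1-D point mass under a linear feedback policy.

    theta = (kp, kd) are the policy parameters; refs[t] is the reference
    position at step t. Returns the list of (position, velocity) states.
    """
    kp, kd = theta
    p, v = 0.0, 0.0
    states = [(p, v)]
    for t in range(len(refs) - 1):
        a = kp * (refs[t] - p) - kd * v   # policy: acceleration command
        p, v = p + dt * v, v + dt * a     # explicit Euler step
        states.append((p, v))
    return states


def tracking_loss(states, refs):
    """Sum of squared position errors over the horizon (initial state excluded)."""
    return sum((p - r) ** 2 for (p, _), r in zip(states[1:], refs[1:]))


def analytic_gradient(theta, refs, dt=DT):
    """Backpropagate the tracking loss through the simulator steps.

    Returns (loss, (dL/dkp, dL/dkd)).
    """
    kp, kd = theta
    states = rollout(theta, refs, dt)
    g_kp = g_kd = 0.0
    gp = gv = 0.0  # adjoints of the state at step t + 1
    for t in reversed(range(len(refs) - 1)):
        p_next, _ = states[t + 1]
        gp += 2.0 * (p_next - refs[t + 1])  # local loss term at step t + 1
        p, v = states[t]
        ga = gv * dt                        # adjoint of the action a_t
        g_kp += ga * (refs[t] - p)
        g_kd += ga * (-v)
        # Propagate adjoints from step t + 1 back to step t.
        gp, gv = gp - ga * kp, gp * dt + gv - ga * kd
    return tracking_loss(states, refs), (g_kp, g_kd)


def train(refs, theta=(1.0, 1.0), steps=300, lr=0.02):
    """Plain gradient descent on the policy parameters."""
    for _ in range(steps):
        _, (g_kp, g_kd) = analytic_gradient(theta, refs)
        theta = (theta[0] - lr * g_kp, theta[1] - lr * g_kd)
    return theta


if __name__ == "__main__":
    refs = [1.0] * 41  # step reference over a 2-second horizon
    before = tracking_loss(rollout((1.0, 1.0), refs), refs)
    theta = train(refs)
    after = tracking_loss(rollout(theta, refs), refs)
    print("loss before:", before, "loss after:", after, "theta:", theta)
```

Because the loss gradient flows through every simulator step, longer horizons can produce very large or very small gradients, which is one plausible source of the training instabilities the abstract refers to.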
Taxonomy
Topics: Advanced Control Systems Optimization · Adaptive Dynamic Programming Control · Mechanical Circulatory Support Devices
