Training Efficient Controllers via Analytic Policy Gradient

Nina Wiedemann; Valentin Wüest; Antonio Loquercio; Matthias Müller; Dario Floreano; Davide Scaramuzza

arXiv:2209.13052 · cs.RO · May 4, 2023

TL;DR

This paper introduces an Analytic Policy Gradient method that trains efficient, accurate controllers offline using differentiable simulators, outperforming RL and matching MPC in accuracy while being computationally cheaper.

Contribution

The paper presents a novel APG approach that leverages differentiable simulators for offline training of controllers, addressing training instability with curriculum learning.

Findings

01. APG outperforms RL in tracking error.

02. APG matches MPC performance with significantly less computation.

03. Open-source code for APG is provided.

Abstract

Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with…
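The core idea in the abstract, gradient descent on the tracking error through a differentiable simulator, can be illustrated with a toy example. The sketch below is not the paper's method or code: it assumes a hypothetical 1D system x_{t+1} = x_t + dt * u_t, a one-parameter policy u_t = theta * (ref - x_t), and a constant reference. All names (`rollout_loss_and_grad`, `train`, `theta`) are made up for illustration, and the derivative is propagated by hand instead of using an automatic differentiation library.

```python
def rollout_loss_and_grad(theta, x0=0.0, ref=1.0, dt=0.1, horizon=10):
    """Roll out the toy dynamics and differentiate the tracking loss.

    Assumed (hypothetical) setup, not taken from the paper:
      dynamics: x_{t+1} = x_t + dt * u_t
      policy:   u_t = theta * (ref - x_t)
      loss:     sum over the horizon of (x_{t+1} - ref)^2
    Returns the loss and its exact derivative with respect to theta.
    """
    x = x0
    dx_dtheta = 0.0  # sensitivity of the state to the policy parameter
    loss, grad = 0.0, 0.0
    for _ in range(horizon):
        err = ref - x
        u = theta * err
        # Chain rule through the policy: u depends on theta directly
        # and indirectly through the current state x.
        du_dtheta = err - theta * dx_dtheta
        # Step the simulator and propagate the sensitivity through it.
        x = x + dt * u
        dx_dtheta = dx_dtheta + dt * du_dtheta
        loss += (x - ref) ** 2
        grad += 2.0 * (x - ref) * dx_dtheta
    return loss, grad


def train(theta=0.0, lr=0.05, steps=200):
    """Plain gradient descent on the rollout tracking loss."""
    for _ in range(steps):
        _, grad = rollout_loss_and_grad(theta)
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    loss_before, _ = rollout_loss_and_grad(0.0)
    theta = train()
    loss_after, _ = rollout_loss_and_grad(theta)
    print(f"theta={theta:.3f}, loss {loss_before:.3f} -> {loss_after:.3f}")
```

In a realistic setting the policy would be a neural network and the derivative would come from backpropagation through the simulator, but the structure is the same: roll out, accumulate tracking error, differentiate through the dynamics, update the policy parameters.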

Code & Models

Repositories

lis-epfl/apg_trajectory_tracking
PyTorch · Official

Taxonomy

Topics: Advanced Control Systems Optimization · Adaptive Dynamic Programming Control · Mechanical Circulatory Support Devices