Differentiable Robust LQR Layers

Ngo Anh Vien; Gerhard Neumann

arXiv:2106.05535·cs.RO·June 11, 2021

TL;DR

This paper introduces a differentiable robust LQR layer that enhances reinforcement and imitation learning by explicitly modeling uncertainty and stochasticity, enabling robust policy optimization in uncertain environments.

Contribution

It presents a novel differentiable layer for robust LQR optimization by reformulating it as a convex semi-definite program, facilitating end-to-end learning under uncertainty.

Findings

01 Improves policy robustness in uncertain environments

02 Outperforms existing methods that lack explicit uncertainty modeling

03 Demonstrates effectiveness in imitation learning and dynamic programming tasks

Abstract

This paper proposes a differentiable robust LQR layer for reinforcement learning and imitation learning under model uncertainty and stochastic dynamics. The robust LQR layer can exploit the advantages of robust optimal control and model-free learning. It provides a new type of inductive bias for stochasticity and uncertainty modeling in control systems. In particular, we propose an efficient way to differentiate through a robust LQR optimization program by rewriting it as a convex program (i.e., a semi-definite program) of the worst-case cost. Based on recent work on using convex optimization inside neural network layers, we develop a fully differentiable layer for optimizing this worst-case cost, i.e., we compute the derivative of a performance measure w.r.t. the model's unknown parameters, model uncertainty and stochasticity parameters. We demonstrate the proposed method on imitation…
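
To make the idea of "differentiating through an LQR program" concrete, the following is a minimal sketch for the nominal (non-robust), scalar, infinite-horizon case. It is not the paper's semi-definite formulation and it does not model uncertainty. It assumes dynamics x' = a*x + b*u with stage cost q*x^2 + r*u^2, solves the scalar Riccati equation by fixed-point iteration, and then obtains the derivative of the cost-to-go coefficient p with respect to the model parameter a from the implicit function theorem. The function names `solve_riccati` and `dp_da` and all numeric values are hypothetical, chosen only for illustration.

```python
def solve_riccati(a, b, q, r, iters=10_000, tol=1e-13):
    """Fixed-point iteration for the scalar discrete-time Riccati equation.

    Illustrative assumption: x' = a*x + b*u, stage cost q*x^2 + r*u^2.
    The equation p = q + a^2*p - (a*b*p)^2 / (r + b^2*p) simplifies to
    p = q + a^2 * p * r / (r + b^2 * p), which is what we iterate.
    """
    p = q
    for _ in range(iters):
        p_next = q + a * a * p * r / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def dp_da(a, b, q, r):
    """Derivative of the Riccati solution p w.r.t. the model parameter a.

    Uses the implicit function theorem on
    F(p, a) = q + a^2 * p * r / (r + b^2 * p) - p = 0,
    so no differentiation through the solver iterations is needed.
    """
    p = solve_riccati(a, b, q, r)
    a_cl = a * r / (r + b * b * p)  # closed-loop coefficient a - b*k
    # dF/da = 2*p*a_cl and dF/dp = a_cl^2 - 1, hence dp/da = -dF/da / dF/dp
    return 2.0 * p * a_cl / (1.0 - a_cl * a_cl)


# Hypothetical parameter values, used only to exercise the sketch.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p_star = solve_riccati(a, b, q, r)
grad = dp_da(a, b, q, r)

# Central finite difference as an independent check of the analytic gradient.
eps = 1e-6
fd = (solve_riccati(a + eps, b, q, r) - solve_riccati(a - eps, b, q, r)) / (2 * eps)
```

The same pattern (solve the optimization, then differentiate its optimality conditions rather than the solver's iterations) is the general idea behind differentiable optimization layers; the abstract describes applying it to a convex program of the worst-case cost rather than to this nominal scalar problem.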

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization