Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes

Samuel Tesfazgi; Leonhard Sprandl; Armin Lederer; Sandra Hirche

arXiv:2405.08756 · eess.SY · May 15, 2024

Open Access

TL;DR

This paper introduces a stability-certified inverse reinforcement learning method that infers control Lyapunov functions from demonstrations, giving the learned policies stability guarantees while keeping the learning computationally efficient.

Contribution

It reformulates IRL as a control Lyapunov function learning problem, providing stability guarantees and computational efficiency through convex optimization and closed-form policies.
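To make the link between a control Lyapunov function (CLF) and a closed-form policy concrete, here is a minimal sketch. This summary does not say which closed-form expression the authors use; the sketch assumes Sontag's universal formula, a standard choice for control-affine systems dx/dt = f(x) + g(x)u. The dynamics `f` and `g`, the quadratic `V`, and all function names are hypothetical toy choices for illustration, not taken from the paper.

```python
import numpy as np

def sontag_policy(a, b, eps=1e-18):
    """Sontag's universal formula (assumed here, not confirmed by the source).

    a = grad V(x) . f(x)   (scalar drift term)
    b = grad V(x) . g(x)   (vector input term)
    """
    b = np.asarray(b, dtype=float)
    b2 = float(b @ b)
    if b2 < eps:
        # No control authority along grad V; a valid CLF has a < 0 here.
        return np.zeros_like(b)
    return -((a + np.sqrt(a**2 + b2**2)) / b2) * b

# Hypothetical toy system: unstable drift, fully actuated input.
def f(x):
    return x

def g(x):
    return np.eye(x.size)

# Hypothetical CLF candidate V(x) = 0.5 * ||x||^2 and its gradient.
def V(x):
    return 0.5 * float(x @ x)

def grad_V(x):
    return x

def simulate(x0, dt=0.01, steps=500):
    """Forward-Euler rollout of the closed loop, recording V along the way."""
    x = np.asarray(x0, dtype=float)
    values = [V(x)]
    for _ in range(steps):
        a = float(grad_V(x) @ f(x))
        b = grad_V(x) @ g(x)
        u = sontag_policy(a, b)
        x = x + dt * (f(x) + g(x) @ u)
        values.append(V(x))
    return x, values

x_final, values = simulate([2.0, -1.0])
print("final state norm:", np.linalg.norm(x_final))
print("V decreased monotonically:",
      all(v1 < v0 for v0, v1 in zip(values, values[1:])))
```

Along the closed loop, the formula yields dV/dt = -sqrt(a^2 + ||b||^4), which is negative away from the origin, so V decreases even though the open-loop drift is unstable. In the setting the paper describes, V would be learned from demonstrations rather than fixed by hand as it is here.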

Findings

1. Efficiently learns stable control policies from demonstrations.
2. Provides theoretical stability guarantees for the learned policies.
3. Validated on both simulated and real-world data.

Abstract

Learning from expert demonstrations to flexibly program an autonomous system with complex behaviors or to predict an agent's behavior is a powerful tool, especially in collaborative control settings. A common method to solve this problem is inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to the optimization of an intrinsic cost function that reflects its intent and informs its control actions. While the framework is expressive, it is also computationally demanding and generally lacks convergence guarantees. We therefore propose a novel, stability-certified IRL approach by reformulating the cost function inference problem as learning control Lyapunov functions (CLFs) from demonstration data. By additionally exploiting closed-form expressions for associated control policies, we are able to efficiently search the…

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Stochastic Dynamics and Bifurcation