A Differential Dynamic Programming Framework for Inverse Reinforcement Learning
Kun Cao, Xinhang Xu, Wanxin Jin, Karl H. Johansson, Lihua Xie

TL;DR
This paper introduces a DDP-based inverse reinforcement learning framework that efficiently recovers cost function parameters, dynamics, and constraints from demonstrations, with a novel closed-loop loss function outperforming traditional open-loop methods.
Contribution
It presents a new DDP-based IRL method that handles both equality and inequality constraints and introduces a closed-loop IRL framework capturing demonstration dynamics.
Findings
Validated through robot and quadrotor experiments
Proven to recover parameters under certain conditions
Closed-loop IRL outperforms open-loop loss functions
Abstract
A differential dynamic programming (DDP)-based framework for inverse reinforcement learning (IRL) is introduced to recover the parameters in the cost function, system dynamics, and constraints from demonstrations. Different from existing work, where DDP was used for the inner forward problem with inequality constraints, our proposed framework uses it for efficient computation of the gradient required in the outer inverse problem with equality and inequality constraints. The equivalence between the proposed method and existing methods based on Pontryagin's Maximum Principle (PMP) is established. More importantly, using this DDP-based IRL with an open-loop loss function, a closed-loop IRL framework is presented. In this framework, a loss function is proposed to capture the closed-loop nature of demonstrations. It is shown to be better than the commonly used open-loop loss function. We…
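The bi-level structure described in the abstract (an inner forward optimal-control problem, and an outer inverse problem that fits parameters to demonstrations) can be illustrated with a minimal sketch. This is not the authors' method. It assumes a scalar linear system with a quadratic cost, for which the DDP backward pass reduces to a Riccati recursion, and it swaps the paper's DDP-based gradient for a central finite difference. It uses only an open-loop trajectory-matching loss; the paper's closed-loop loss is not reproduced. All names (`lqr_rollout`, `open_loop_loss`, `recover_q`) and all numeric values are hypothetical.

```python
import numpy as np

def lqr_rollout(q, x0=1.0, a=1.0, b=0.5, r=1.0, T=20):
    """Inner forward problem (assumed toy setup): finite-horizon scalar LQR
    with stage cost q*x^2 + r*u^2 and dynamics x' = a*x + b*u."""
    p = q                      # terminal value-function weight
    gains = np.zeros(T)
    for t in reversed(range(T)):
        k = (a * b * p) / (r + b * b * p)    # feedback gain, u = -k * x
        gains[t] = k
        p = q + a * a * p - a * b * p * k    # Riccati recursion
    xs = np.zeros(T + 1)
    xs[0] = x0
    for t in range(T):
        xs[t + 1] = a * xs[t] - b * gains[t] * xs[t]
    return xs

def open_loop_loss(q, demo):
    """Outer loss: squared distance between the predicted and the
    demonstrated state trajectories."""
    return float(np.sum((lqr_rollout(q) - demo) ** 2))

def recover_q(demo, q_init=1.0, lr=2.0, steps=500, eps=1e-5):
    """Outer inverse problem: gradient descent on the cost weight q.
    The gradient is a finite-difference stand-in, NOT the DDP-based
    gradient the paper proposes."""
    q = q_init
    for _ in range(steps):
        grad = (open_loop_loss(q + eps, demo)
                - open_loop_loss(q - eps, demo)) / (2.0 * eps)
        q = max(q - lr * grad, 1e-3)         # keep the weight positive
    return q

if __name__ == "__main__":
    demo = lqr_rollout(2.0)                  # synthetic demonstration
    print("recovered q:", recover_q(demo))
```

Each outer iteration here re-solves the inner problem twice per parameter, a cost that grows with the number of parameters. Reducing that cost is the motivation the abstract gives for computing the outer gradient with DDP instead.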
Taxonomy
Topics: Adaptive Dynamic Programming Control
