Online Observer-Based Inverse Reinforcement Learning
Ryan Self, Kevin Coleman, He Bai, Rushikesh Kamalapurkar

TL;DR
This paper casts output-feedback inverse reinforcement learning (IRL) for linear systems with quadratic cost functions as a state estimation problem, develops two observer-based methods for solving it, and supports them with theoretical guarantees and simulation results.
Contribution
The paper develops two observer-based techniques for IRL, one of which is a novel observer that re-uses previous state estimates via history stacks. Convergence and robustness are proven under appropriate excitation conditions.
Findings
The developed observers and filters are demonstrated in simulation under both noisy and noise-free measurements
Convergence and robustness are guaranteed theoretically under appropriate excitation conditions
Simulations validate the effectiveness of the proposed methods
Abstract
In this paper, a novel approach to the output-feedback inverse reinforcement learning (IRL) problem is developed by casting the IRL problem, for linear systems with quadratic cost functions, as a state estimation problem. Two observer-based techniques for IRL are developed, including a novel observer method that re-uses previous state estimates via history stacks. Theoretical guarantees for convergence and robustness are established under appropriate excitation conditions. Simulations demonstrate the performance of the developed observers and filters under noisy and noise-free measurements.
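To make the problem class concrete, the following is a minimal sketch of the forward and inverse linear-quadratic problem for a scalar system. It is not the observer-based method of the paper. The dynamics x' = a*x + b*u, the cost weights q and r, the numeric values, and the function names lqr_gain_scalar and recover_q_scalar are all assumptions made for illustration. It also assumes the expert's gain and the control weight r are known exactly, whereas the paper works from output measurements.

```python
import math

def lqr_gain_scalar(a, b, q, r):
    """Optimal gain k (u = -k*x) for x' = a*x + b*u with cost integral of q*x^2 + r*u^2.

    Hypothetical helper for illustration; assumes q > 0, r > 0, b != 0.
    """
    # Positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def recover_q_scalar(a, b, k, r):
    """State weight q consistent with an observed gain k, for a given r.

    Hypothetical helper: inverts the scalar Riccati equation using p = r*k/b.
    """
    p = r * k / b
    return (b * b / r) * p * p - 2.0 * a * p

# Illustrative values only (not taken from the paper).
a, b = -1.0, 2.0
q_true, r_true = 3.0, 1.0

k_expert = lqr_gain_scalar(a, b, q_true, r_true)   # forward problem: cost -> behavior
q_hat = recover_q_scalar(a, b, k_expert, r_true)   # inverse problem: behavior -> cost

# Scaling q and r by the same positive constant leaves the gain unchanged,
# so the cost can be identified only up to a scale factor.
k_scaled = lqr_gain_scalar(a, b, 5.0 * q_true, 5.0 * r_true)
```

In this scalar case the inverse problem has a closed form once r is fixed. The invariance of the gain under a common scaling of q and r shows why quadratic costs are generally recoverable from behavior only up to a scale factor.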
