Online inverse reinforcement learning with limited data

Ryan Self; S M Nahid Mahmud; Katrine Hareland; Rushikesh Kamalapurkar

arXiv:2008.08972 · eess.SY · August 21, 2020

TL;DR

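This paper introduces an online inverse reinforcement learning method for systems with limited data and uncertain dynamics. State and control trajectories are recorded in real time while observing an agent, the agent's dynamics are estimated concurrently, and a data-driven update law estimates the agent's feedback controller so that it can be queried for additional data to drive reward function estimation.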

Contribution

It presents a novel real-time inverse reinforcement learning approach that compensates for uncertain agent dynamics through concurrent parameter estimation, and for data scarcity by querying an estimated feedback controller to generate additional data.

Findings

01 Effective real-time reward estimation with limited data

02 Concurrent parameter estimation improves robustness

03 Data-driven updates enhance reward function accuracy

Abstract

This paper addresses the problem of online inverse reinforcement learning for systems with limited data and uncertain dynamics. In the developed approach, the state and control trajectories are recorded online by observing an agent perform a task, and reward function estimation is performed in real-time using a novel inverse reinforcement learning approach. Parameter estimation is performed concurrently to help compensate for uncertainties in the agent's dynamics. Data insufficiency is resolved by developing a data-driven update law to estimate the optimal feedback controller. The estimated controller can then be queried to artificially create additional data to drive reward function estimation.
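
The last two steps of the abstract (estimating the feedback controller from observed data, then querying it to create additional data) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the controller is assumed linear in the state, the update is a standard recursive least-squares step rather than the paper's update law, and the names `rls_update`, `query_policy`, and `TRUE_GAIN` are hypothetical.

```python
# Illustrative sketch only. The linear policy, the gain values, and the
# recursive least-squares (RLS) update are assumptions, not the paper's method.

def rls_update(theta, P, x, u, forgetting=1.0):
    """One RLS step for the policy model u ≈ theta · x (scalar control)."""
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(n))
    gain = [Px[i] / denom for i in range(n)]
    # Prediction error of the current policy estimate on the new sample.
    error = u - sum(theta[i] * x[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # Covariance update; P stays symmetric, so x^T P equals (P x)^T.
    P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
         for i in range(n)]
    return theta, P


def query_policy(theta, x):
    """Evaluate the (estimated or true) linear feedback policy at state x."""
    return sum(t * xi for t, xi in zip(theta, x))


# Hypothetical expert feedback law: u = -1.0 * x1 - 0.5 * x2.
TRUE_GAIN = [-1.0, -0.5]

# A short observed trajectory: only four state samples are available.
observed_states = [[1.0, 0.0], [0.5, -0.4], [0.2, 0.3], [-0.3, 0.1]]

theta = [0.0, 0.0]                 # initial policy estimate
P = [[1e6, 0.0], [0.0, 1e6]]       # large initial covariance (weak prior)

for x in observed_states:
    u = query_policy(TRUE_GAIN, x)  # control recorded from the observed agent
    theta, P = rls_update(theta, P, x, u)

# Query the estimated controller at states the agent never visited to
# create additional (state, control) pairs.
unvisited_states = [[2.0, 1.0], [-1.0, 1.5]]
synthetic = [(x, query_policy(theta, x)) for x in unvisited_states]
```

In this sketch the synthetic pairs would be appended to the recorded data before reward estimation. Their usefulness depends on how well the estimated controller matches the agent's true policy away from the observed trajectory.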
