Online inverse reinforcement learning with limited data
Ryan Self, S M Nahid Mahmud, Katrine Hareland, Rushikesh Kamalapurkar

TL;DR
This paper introduces an online inverse reinforcement learning method that operates with limited data and uncertain dynamics, using real-time data collection, concurrent parameter estimation, and data-driven updates to improve reward function estimation.
Contribution
It presents a novel real-time inverse reinforcement learning approach that compensates for data scarcity and system uncertainties through concurrent parameter estimation and data augmentation.
Findings
Effective real-time reward estimation with limited data
Concurrent parameter estimation improves robustness
Data-driven updates enhance reward function accuracy
Abstract
This paper addresses the problem of online inverse reinforcement learning for systems with limited data and uncertain dynamics. In the developed approach, the state and control trajectories are recorded online by observing an agent perform a task, and reward function estimation is performed in real time using a novel inverse reinforcement learning approach. Parameter estimation is performed concurrently to help compensate for uncertainties in the agent's dynamics. Data insufficiency is resolved by developing a data-driven update law to estimate the optimal feedback controller. The estimated controller can then be queried to artificially create additional data to drive reward function estimation.
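The data-augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the update law, so a batch least-squares fit stands in for it, and the agent is assumed to use linear state feedback u = -Kx. The names `estimate_feedback_gain`, `synthesize_data`, `K_true`, and the demo data are all hypothetical.

```python
import numpy as np

def estimate_feedback_gain(states, controls, ridge=1e-8):
    """Fit K in u = -K x from recorded (x, u) pairs.

    Assumption: linear feedback and a batch least-squares fit, used here
    as a stand-in for the paper's online, data-driven update law.
    """
    X = np.asarray(states)      # shape (N, n)
    U = np.asarray(controls)    # shape (N, m)
    # Regularized normal equations: (X^T X + ridge*I) K^T = -X^T U
    A = X.T @ X + ridge * np.eye(X.shape[1])
    K_T = np.linalg.solve(A, -X.T @ U)   # shape (n, m)
    return K_T.T                         # shape (m, n)

def synthesize_data(K_hat, query_states):
    """Query the estimated controller at states the agent never visited."""
    Xq = np.asarray(query_states)
    return Xq, -(Xq @ K_hat.T)

# Hypothetical demo: five observed samples from an agent with gain K_true.
rng = np.random.default_rng(0)
K_true = np.array([[1.0, 2.0]])
X_obs = rng.normal(size=(5, 2))
U_obs = -(X_obs @ K_true.T)

K_hat = estimate_feedback_gain(X_obs, U_obs)
# Twenty additional state-control pairs created by querying the estimate.
X_new, U_new = synthesize_data(K_hat, rng.uniform(-1.0, 1.0, size=(20, 2)))
```

In this sketch the synthetic pairs (`X_new`, `U_new`) would be appended to the recorded trajectory before reward estimation, so that the reward estimator sees more data than the agent actually produced. The quality of that extra data depends entirely on how well the controller estimate matches the agent's true policy.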
