Sample Efficient Imitation Learning via Reward Function Trained in Advance
Lihua Zhang

TL;DR
This paper introduces MRFIL, a novel imitation learning approach that uses an ensemble dynamic model as a reward function trained on expert demonstrations, significantly improving sample efficiency in high-dimensional control tasks.
Contribution
The paper proposes MRFIL, a new inverse reinforcement learning scheme that improves sample efficiency by using an ensemble dynamic model, trained in advance on expert demonstrations, as the reward function, and it provides a convergence guarantee for the resulting objective.
Findings
Achieves competitive performance with fewer environment interactions.
Demonstrates convergence guarantee for the new objective.
Reduces sample complexity in high-dimensional tasks.
Abstract
Imitation learning (IL) is a framework that learns to imitate expert behavior from demonstrations. Recently, IL has shown promising results on high-dimensional and control tasks. However, IL typically suffers from sample inefficiency in terms of environment interaction, which severely limits its application to simulated domains. In industrial applications, the learner usually has a high interaction cost: the more it interacts with the environment, the more damage it causes to the environment and to itself. In this article, we make an effort to improve sample efficiency by introducing a novel scheme of inverse reinforcement learning. Our method, which we call Model Reward Function Based Imitation Learning (MRFIL), uses an ensemble dynamic model as a reward function, which is trained with expert demonstrations. The key idea is to provide the agent with an incentive to match the…
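The abstract is truncated here, so the exact form of the MRFIL reward is not shown. As a rough illustration only, the sketch below takes one plausible reading of "an ensemble dynamic model as a reward function": an ensemble of dynamics models is fit in advance on expert transitions, and the agent is rewarded when its own transitions are well predicted by that ensemble. The linear ridge models, the bootstrap resampling, the toy dynamics, and the names `fit_ensemble` and `model_reward` are all assumptions made for this example, not the paper's implementation.

```python
import numpy as np

def fit_ensemble(states, actions, next_states, n_models=5, ridge=1e-3, seed=0):
    """Fit bootstrapped linear dynamics models s' ~ [s, a, 1] @ W on expert data.

    Hypothetical helper: linear ridge regression stands in for whatever
    model class the paper actually uses.
    """
    rng = np.random.default_rng(seed)
    inputs = np.hstack([states, actions, np.ones((len(states), 1))])
    models = []
    for _ in range(n_models):
        # Each ensemble member sees a different bootstrap resample.
        idx = rng.integers(0, len(inputs), size=len(inputs))
        X, Y = inputs[idx], next_states[idx]
        W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
        models.append(W)
    return models

def model_reward(models, state, action, next_state):
    """Assumed reward: negative mean squared prediction error of the ensemble."""
    x = np.concatenate([state, action, [1.0]])
    errors = [np.sum((x @ W - next_state) ** 2) for W in models]
    return -float(np.mean(errors))

# Toy expert data (assumed dynamics s' = s + 0.1 * a, expert policy a = -s).
rng = np.random.default_rng(1)
expert_s = rng.normal(size=(200, 2))
expert_a = -expert_s
expert_next = expert_s + 0.1 * expert_a

models = fit_ensemble(expert_s, expert_a, expert_next)

s = np.array([1.0, -0.5])
# Transition that follows the expert policy.
r_expert = model_reward(models, s, -s, s + 0.1 * (-s))
# Transition that takes the opposite action.
r_off = model_reward(models, s, s, s + 0.1 * s)
```

Under these assumptions the reward is fixed before any environment interaction, so the agent can be optimized against it without retraining a discriminator on fresh rollouts. In the toy data, the expert-like transition receives a higher reward than the off-expert one because the ensemble has only seen state-action pairs from the expert policy.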
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
