Meta-Adversarial Inverse Reinforcement Learning for Decision-making Tasks
Pin Wang, Hanhan Li, Ching-Yao Chan

TL;DR
This paper introduces Meta-AIRL, a novel imitation learning framework that combines meta-learning and adversarial inverse reinforcement learning to enable quick adaptation to new tasks with limited demonstration data.
Contribution
It presents a new adaptable imitation learning model that effectively generalizes to unseen tasks using limited data by integrating meta-learning with adversarial inverse reinforcement learning.
Findings
Meta-AIRL achieves rapid adaptation to new tasks.
Given only limited demonstrations, the model performs comparably to experts.
Simulation results validate the effectiveness of the approach.
Abstract
Learning from demonstrations has made great progress over the past few years. However, it is generally data hungry and task specific. In other words, it requires a large amount of data to train a decent model on a particular task, and the model often fails to generalize to new tasks that have a different distribution. In practice, demonstrations from new tasks will be continuously observed and the data might be unlabeled or only partially labeled. Therefore, it is desirable for the trained model to adapt to new tasks that have limited data samples available. In this work, we build an adaptable imitation learning model based on the integration of Meta-learning and Adversarial Inverse Reinforcement Learning (Meta-AIRL). We exploit the adversarial learning and inverse reinforcement learning mechanisms to learn policies and reward functions simultaneously from available training tasks and…
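The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The discriminator form D = exp(f) / (exp(f) + pi) is the standard one from adversarial inverse reinforcement learning. The meta-learning part assumes a first-order, gradient-based (MAML-style) update on a toy quadratic loss, which the text above does not confirm. All function names and learning rates are hypothetical.

```python
import numpy as np


def airl_discriminator(f_value, log_pi):
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi).

    Algebraically this equals sigmoid(f - log pi).
    """
    z = np.asarray(f_value, dtype=float) - np.asarray(log_pi, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))


def airl_reward(f_value, log_pi):
    """Policy reward log D - log(1 - D), which simplifies to f - log pi."""
    d = airl_discriminator(f_value, log_pi)
    return np.log(d) - np.log(1.0 - d)


def toy_grad(theta, target):
    """Gradient of the toy task loss 0.5 * ||theta - target||^2.

    ASSUMPTION: a stand-in for a real per-task imitation loss.
    """
    return theta - target


def first_order_meta_step(theta, task_targets, inner_lr=0.1, outer_lr=0.5):
    """One first-order MAML-style meta-update (ASSUMED, not from the paper).

    Each task adapts theta with one inner gradient step; the meta-update
    averages the task gradients evaluated at the adapted parameters.
    """
    meta_grad = np.zeros_like(theta)
    for target in task_targets:
        adapted = theta - inner_lr * toy_grad(theta, target)  # inner adaptation
        meta_grad += toy_grad(adapted, target)  # first-order outer gradient
    return theta - outer_lr * meta_grad / len(task_targets)


if __name__ == "__main__":
    print(airl_discriminator(0.0, 0.0))
    print(airl_reward(2.0, -1.0))
    theta0 = np.zeros(2)
    targets = [np.array([1.0, 1.0]), np.array([3.0, 3.0])]
    print(first_order_meta_step(theta0, targets))
```

In this sketch, adapting to a new task would amount to running the inner step alone on that task's few demonstrations, starting from the meta-learned parameters.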
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning
