Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon

TL;DR
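This paper introduces a deep latent variable model that learns reward functions from heterogeneous demonstrations of related tasks and infers rewards for new, structurally similar tasks from minimal data, reducing the number of demonstrations that inverse reinforcement learning (IRL) needs per task.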
Contribution
It presents a novel latent variable model for IRL that handles diverse demonstrations and infers rewards for new tasks from a single example.
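To make the idea of a context-conditioned reward concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the function names (`infer_context`, `reward`), the mean-pooling encoder, the linear reward, and all dimensions are assumptions chosen for illustration, and the parameters are random rather than learned. It only shows the data flow the contribution describes: one demonstration is mapped to a latent context, and the reward is then evaluated conditioned on that context.

```python
import math
import random


def infer_context(demo, weights):
    """Hypothetical encoder: mean-pool per-step features, then a linear map + tanh."""
    dim = len(demo[0])
    mean_feat = [sum(step[i] for step in demo) / len(demo) for i in range(dim)]
    return [math.tanh(sum(w * x for w, x in zip(row, mean_feat))) for row in weights]


def reward(state, action, context, theta):
    """Hypothetical reward: linear in the concatenated (state, action, context)."""
    features = list(state) + list(action) + list(context)
    return sum(t * f for t, f in zip(theta, features))


rng = random.Random(0)
state_dim, action_dim, context_dim = 3, 2, 2  # assumed sizes, illustration only
step_dim = state_dim + action_dim

# Untrained, randomly initialised parameters (a real model would learn these).
encoder_weights = [
    [rng.uniform(-1, 1) for _ in range(step_dim)] for _ in range(context_dim)
]
theta = [rng.uniform(-1, 1) for _ in range(step_dim + context_dim)]

# A single demonstration of a new task: a list of concatenated (state, action) steps.
demo = [[rng.uniform(-1, 1) for _ in range(step_dim)] for _ in range(10)]

# Infer the latent context once, then reuse it for every reward query on this task.
context = infer_context(demo, encoder_weights)
first_state, first_action = demo[0][:state_dim], demo[0][state_dim:]
r = reward(first_state, first_action, context, theta)
print("context:", context)
print("reward at first step:", r)
```

In this sketch the context is computed once per demonstration and shared across all state-action pairs of that task, which is what lets a single example specify the reward for a new task.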
Findings
Outperforms existing IRL methods on continuous control tasks.
Effectively learns from heterogeneous demonstrations.
Successfully generalizes to new related tasks with minimal demonstrations.
Abstract
Providing a suitable reward function to reinforcement learning can be difficult in many real-world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations, several major challenges remain. First, existing IRL methods learn reward functions from scratch, requiring large numbers of demonstrations to correctly infer the reward for each task the agent may need to perform. Second, existing methods typically assume homogeneous demonstrations for a single behavior or task, while in practice, it might be easier to collect datasets of heterogeneous but related behaviors. To this end, we propose a deep latent variable model that is capable of learning rewards from demonstrations of distinct but related tasks in an unsupervised way. Critically, our model can infer rewards for new, structurally similar tasks from a…
Taxonomy
Topics: Reinforcement Learning in Robotics · Muscle activation and electromyography studies
