Imitating Latent Policies from Observation
Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell

TL;DR
This paper presents a new imitation learning method that infers latent policies solely from state observations and aligns them with real actions using minimal environment interactions, outperforming standard methods.
Contribution
The paper introduces an approach that infers latent policies from state observations alone, without expert actions, and then aligns the latent actions with real-world actions through a small number of environment interactions.
Findings
Outperforms standard approaches in classic control environments and a platform game.
Infers latent policies from state observations alone, with no expert actions.
Needs only a small number of environment interactions to align latent actions with real-world actions.
Abstract
In this paper, we describe a novel approach to imitation learning that infers latent policies directly from state observations. We introduce a method that characterizes the causal effects of latent actions on observations while simultaneously predicting their likelihood. We then outline an action alignment procedure that leverages a small amount of environment interactions to determine a mapping between the latent and real-world actions. We show that this corrected labeling can be used for imitating the observed behavior, even though no expert actions are given. We evaluate our approach within classic control environments and a platform game and demonstrate that it performs better than standard approaches. Code for this work is available at https://github.com/ashedwards/ILPO.
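The action alignment step described in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' implementation (see the linked repository for that): it assumes a latent forward model is already available, replaces any learned remapping with a plain co-occurrence count, and uses a hypothetical one-dimensional toy environment. All names here (`align_latent_actions`, `toy_predict_next`, `collect_transitions`, `LATENT_EFFECTS`, `REAL_EFFECTS`) are invented for illustration.

```python
import numpy as np


def align_latent_actions(predict_next, transitions, n_latent, n_real):
    """Map each latent action to the real action it most often co-occurs with.

    predict_next(state, z) is an assumed, already-trained latent forward model.
    transitions is a list of (state, real_action, next_state) tuples gathered
    from a small number of environment interactions.
    """
    counts = np.zeros((n_latent, n_real), dtype=int)
    for state, action, next_state in transitions:
        # Predicted next state under every latent action.
        preds = np.stack([predict_next(state, z) for z in range(n_latent)])
        # Latent action whose prediction best explains the observed transition.
        z_star = int(np.argmin(np.linalg.norm(preds - next_state, axis=1)))
        counts[z_star, action] += 1
    # For each latent action, pick the most frequently matched real action.
    return counts.argmax(axis=1), counts


# Hypothetical toy world: the state is a position on a line.
# Latent action 0 moves right and latent action 1 moves left, while the
# environment numbers its real actions the other way around.
LATENT_EFFECTS = np.array([[+1.0], [-1.0]])
REAL_EFFECTS = np.array([[-1.0], [+1.0]])


def toy_predict_next(state, z):
    """Stand-in for a learned latent forward model."""
    return state + LATENT_EFFECTS[z]


def collect_transitions(n_steps):
    """Take a few environment steps, cycling through the real actions."""
    state = np.zeros(1)
    transitions = []
    for step in range(n_steps):
        action = step % len(REAL_EFFECTS)
        next_state = state + REAL_EFFECTS[action]
        transitions.append((state, action, next_state))
        state = next_state
    return transitions


transitions = collect_transitions(20)
mapping, counts = align_latent_actions(toy_predict_next, transitions, 2, 2)
print(mapping)  # mapping[z] is the real action assigned to latent action z
```

In this toy setting the recovered mapping swaps the two indices, which is the relabeling a latent policy would need before it could act in the environment. A counting scheme like this only makes sense for small discrete action sets; it is a stand-in chosen for brevity, not a description of how the paper performs alignment.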
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications
