Adversarial Imitation Learning from Video using a State Observer
Haresh Karnan, Garrett Warnell, Faraz Torabi, Peter Stone

TL;DR
This paper introduces VGAIfO-SO, a novel imitation learning algorithm that uses a self-supervised state observer to efficiently learn from video demonstrations, reducing sample complexity in continuous control tasks.
Contribution
The paper proposes a new algorithm, VGAIfO-SO, which improves sample efficiency in video-based imitation learning by estimating low-dimensional states from high-dimensional videos.
Findings
VGAIfO-SO is more sample efficient than other imitation from observation (IfO) algorithms.
VGAIfO-SO can achieve performance close to algorithms with privileged state information.
The method is effective across several continuous control environments.
Abstract
The imitation learning research community has recently made significant progress towards the goal of enabling artificial agents to imitate behaviors from video demonstrations alone. However, current state-of-the-art approaches developed for this problem exhibit high sample complexity due, in part, to the high-dimensional nature of video observations. Towards addressing this issue, we introduce here a new algorithm called Visual Generative Adversarial Imitation from Observation using a State Observer (VGAIfO-SO). At its core, VGAIfO-SO seeks to address sample inefficiency using a novel, self-supervised state observer, which provides estimates of lower-dimensional proprioceptive state representations from high-dimensional images. We show experimentally in several continuous control environments that VGAIfO-SO is more sample efficient than other IfO algorithms at learning from video-only…
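The state-observer idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes the agent can pair its own rendered frames with its own proprioceptive readings (one way to read "self-supervised"), and it swaps in a linear ridge regression and a made-up linear "renderer" where a real system would use a learned network and a simulator. All names here (`fit_state_observer`, `observe_states`, `render_images`) are hypothetical.

```python
import numpy as np

def fit_state_observer(images, states, ridge=1e-3):
    """Fit a linear map from flattened images to low-dimensional states.

    Assumption: ridge regression stands in for whatever learned model
    a real state observer would use.
    """
    X = images.reshape(len(images), -1)
    X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ states)

def observe_states(W, images):
    """Estimate states for a batch of images with the fitted observer."""
    X = images.reshape(len(images), -1)
    X = np.hstack([X, np.ones((len(X), 1))])
    return X @ W

rng = np.random.default_rng(0)
state_dim, height, width = 3, 8, 8

# Hypothetical toy renderer: each image is a fixed linear function of the state.
render = rng.normal(size=(state_dim, height * width))

def render_images(states):
    return (states @ render).reshape(len(states), height, width)

# The agent's own experience supplies (image, state) pairs with no human labels.
agent_states = rng.normal(size=(500, state_dim))
agent_images = render_images(agent_states)
W = fit_state_observer(agent_images, agent_states)

# Demonstration video: only images are available to the learner.
demo_states = rng.normal(size=(50, state_dim))  # kept only to measure error
demo_images = render_images(demo_states)
estimated = observe_states(W, demo_images)
max_error = float(np.max(np.abs(estimated - demo_states)))
```

In this toy setting the observer recovers the demonstrator's states almost exactly because the renderer is linear and noise-free; real video would not be this forgiving. The point is only the shape of the idea: downstream learning can work with a 3-dimensional estimate instead of a 64-pixel frame.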
Taxonomy
Topics: Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
