Concurrent Training Improves the Performance of Behavioral Cloning from Observation

Zachary W. Robertson; Matthew R. Walter

arXiv:2008.01205 · cs.LG · August 5, 2020

TL;DR

This paper introduces BCO*, a modification of behavioral cloning from observation (BCO) that trains the inverse dynamics model and the policy concurrently. This reduces the number of initial interactions required while achieving competitive imitation learning performance.

Contribution

The paper provides a theoretical analysis of BCO, proposes BCO* with concurrent training, and demonstrates improved sample efficiency and performance over existing methods.

Findings

01. BCO* reduces the dependence on initial interactions.

02. Concurrent training improves BCO performance.

03. BCO* achieves results competitive with state-of-the-art methods.

Abstract

Learning from demonstration is widely used as an efficient way for robots to acquire new skills. However, it typically requires that demonstrations provide full access to the state and action sequences. In contrast, learning from observation offers a way to utilize unlabeled demonstrations (e.g., video) to perform imitation learning. One approach to this is behavioral cloning from observation (BCO). The original implementation of BCO proceeds by first learning an inverse dynamics model and then using that model to estimate action labels, thereby reducing the problem to behavioral cloning. However, existing approaches to BCO require a large number of initial interactions in the first step. Here, we provide a novel theoretical analysis of BCO, introduce a modification BCO*, and show that in the semi-supervised setting, BCO* can concurrently improve both its estimate for the inverse…
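
The two-stage procedure the abstract attributes to the original BCO (fit an inverse dynamics model from the agent's own interactions, use it to label state-only demonstrations, then run behavioral cloning) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the scalar dynamics `step`, the hidden expert `a = -s`, the linear model forms, and the helper `fit_scalar` are all hypothetical. The concurrent BCO* variant is not shown.

```python
import random


def fit_scalar(xs, ys):
    """Least-squares slope for y ~ c * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def step(s, a):
    """Toy dynamics, assumed for illustration: s' = s + 0.5 * a."""
    return s + 0.5 * a


rng = random.Random(0)

# Stage 1: the agent interacts with random actions and fits an inverse
# dynamics model of the assumed form a ~ c * (s' - s).
states = [rng.uniform(-1, 1) for _ in range(200)]
actions = [rng.uniform(-1, 1) for _ in range(200)]
deltas = [step(s, a) - s for s, a in zip(states, actions)]
c = fit_scalar(deltas, actions)

# Stage 2: demonstrations contain states only. The expert's actions
# (here a = -s) are never shown to the learner; the inverse dynamics
# model estimates them from consecutive states.
demo_s = [rng.uniform(-1, 1) for _ in range(200)]
demo_next = [step(s, -s) for s in demo_s]
inferred_actions = [c * (s2 - s) for s, s2 in zip(demo_s, demo_next)]

# Stage 3: behavioral cloning on the inferred labels, with an assumed
# linear policy a = k * s.
k = fit_scalar(demo_s, inferred_actions)

print(f"inverse dynamics coefficient c = {c:.4f}")
print(f"cloned policy gain k = {k:.4f}")
```

In this noiseless linear toy the inverse dynamics fit is essentially exact, so the cloned gain `k` matches the hidden expert's gain. In realistic settings the quality of the inferred labels depends on how well the stage 1 interactions cover the states the expert visits, which is the dependence on initial interactions that the abstract highlights.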

Taxonomy

Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning

Methods: Generative Adversarial Imitation Learning