Behavioral Cloning from Observation

Faraz Torabi; Garrett Warnell; Peter Stone

arXiv:1805.01954·cs.AI·May 15, 2018

TL;DR

This paper introduces Behavioral Cloning from Observation (BCO), a method that lets autonomous agents learn tasks by observing an expert without access to the expert's actions. It reports performance comparable to existing imitation learning methods while learning faster.

Contribution

The paper proposes a two-phase imitation learning approach that learns from observation only (no explicit action information) and aims to learn faster than existing imitation learning methods.

Findings

01. BCO achieves comparable performance to GAIL in simulation tasks.

02. BCO demonstrates faster learning after acquiring expert trajectories.

03. The method effectively learns from observation without explicit action data.

Abstract

Humans often learn how to perform tasks via imitation: they observe others perform a task, and then very quickly infer the appropriate actions to take based on their observations. While extending this paradigm to autonomous agents is a well-studied problem in general, there are two particular aspects that have largely been overlooked: (1) that the learning is done from observation only (i.e., without explicit action information), and (2) that the learning is typically done very quickly. In this work, we propose a two-phase, autonomous imitation learning technique called behavioral cloning from observation (BCO), that aims to provide improved performance with respect to both of these aspects. First, we allow the agent to acquire experience in a self-supervised fashion. This experience is used to develop a model which is then utilized to learn a particular task by observing an expert…
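The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not name the learned model; the full paper describes it as an inverse dynamics model, which predicts the action that took the agent from one state to the next. Everything else here is an assumption made for illustration and not the authors' implementation: the one-dimensional toy environment (`step`, `DT`), the linear least-squares models (`fit_linear`, `predict_linear`), and the expert controller (`EXPERT_GAIN`) are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DT = 0.1  # hypothetical step size of the toy environment


def step(state, action):
    """Toy dynamics (an assumption): the state moves by DT * action."""
    return state + DT * action


def fit_linear(features, targets):
    """Least-squares fit of targets ~ [features, 1] @ weights."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict_linear(weights, features):
    """Apply a model returned by fit_linear to new features."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ weights


# Phase 1: self-supervised experience. The agent acts randomly and
# records (state, action, next_state), then fits an inverse dynamics
# model that maps (state, next_state) to the action taken.
states = rng.uniform(-1.0, 1.0, size=500)
actions = rng.uniform(-1.0, 1.0, size=500)
next_states = step(states, actions)
inverse_model = fit_linear(np.column_stack([states, next_states]), actions)

# Phase 2: observe an expert. Only the expert's states are recorded;
# the expert's actions (here a hypothetical proportional controller
# with gain EXPERT_GAIN) are never shown to the learner.
EXPERT_GAIN = -2.0
expert_states = [1.0]
for _ in range(30):
    current = expert_states[-1]
    expert_states.append(step(current, EXPERT_GAIN * current))
expert_states = np.array(expert_states)

# Infer the missing expert actions with the inverse dynamics model.
transitions = np.column_stack([expert_states[:-1], expert_states[1:]])
inferred_actions = predict_linear(inverse_model, transitions)

# Behavioral cloning: supervised regression from state to inferred action.
policy = fit_linear(expert_states[:-1], inferred_actions)
print("learned policy gain:", policy[0], "bias:", policy[1])
```

In this toy setting the cloned policy should recover a gain close to the expert's, even though the learner never observed an expert action. Real tasks would replace the linear fits with learned function approximators and the toy dynamics with an actual simulator.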
