Counterfactual Behavior Cloning: Offline Imitation Learning from Imperfect Human Demonstrations

Shahabedin Sagheb; Dylan P. Losey

arXiv:2505.10760·cs.RO·May 19, 2025

TL;DR

Counterfactual Behavior Cloning (Counter-BC) improves offline imitation learning by inferring the intended policy behind imperfect human demonstrations, enabling robots to learn more accurately from noisy and suboptimal data.

Contribution

This work introduces Counter-BC, a novel method that extrapolates the intended behavior from noisy demonstrations, outperforming existing imitation learning techniques.

Findings

1. Counter-BC effectively extracts underlying policies from imperfect data.

2. Counter-BC outperforms state-of-the-art methods in noisy and real-world settings.

3. Counter-BC is theoretically proven to recover desired policies from diverse and imperfect demonstrations.

Abstract

Learning from humans is challenging because people are imperfect teachers. When everyday humans show the robot a new task they want it to perform, they inevitably make errors (e.g., inputting noisy actions) and provide suboptimal examples (e.g., overshooting the goal). Existing methods learn by mimicking the exact behaviors the human teacher provides -- but this approach is fundamentally limited because the demonstrations themselves are imperfect. In this work we advance offline imitation learning by enabling robots to extrapolate what the human teacher meant, instead of only considering what the human actually showed. We achieve this by hypothesizing that all of the human's demonstrations are trying to convey a single, consistent policy, while the noise and suboptimality within their behaviors obfuscate the data and introduce unintentional complexity. To recover the underlying…
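For reference, the "mimic the exact behaviors" baseline that the abstract contrasts against can be sketched as plain behavior cloning. The snippet below is not Counter-BC and is not taken from the paper or its repository. It is a minimal illustration that assumes a linear policy and synthetic data; every name in it (`behavior_cloning_fit`, `bc_loss`, `W_intended`) is hypothetical.

```python
import numpy as np

def behavior_cloning_fit(states, actions):
    """Plain behavior cloning with a linear policy a = s @ W (least squares)."""
    W, *_ = np.linalg.lstsq(states, actions, rcond=None)
    return W

def bc_loss(W, states, actions):
    """Mean squared error between the policy's actions and the demonstrated ones."""
    return float(np.mean((states @ W - actions) ** 2))

rng = np.random.default_rng(0)

# Hypothetical "intended" policy the teacher is trying to convey (assumption).
W_intended = np.array([[1.0, 0.0],
                       [0.0, -1.0]])

# Synthetic demonstrations: intended actions corrupted by Gaussian noise.
states = rng.normal(size=(200, 2))
clean_actions = states @ W_intended
noisy_actions = clean_actions + rng.normal(scale=0.5, size=clean_actions.shape)

# Behavior cloning fits whatever was actually shown, noise included.
W_bc = behavior_cloning_fit(states, noisy_actions)

loss_bc = bc_loss(W_bc, states, noisy_actions)
loss_intended = bc_loss(W_intended, states, noisy_actions)
print("BC loss on demonstrations:      ", loss_bc)
print("Intended-policy loss on the same:", loss_intended)
```

Because least squares minimizes the error to the demonstrated actions, the cloned policy never scores worse on the noisy data than the intended policy does. In other words, the objective rewards matching what was shown, not what was meant, which is the limitation the abstract points to.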

Code & Models

Repositories

vt-collab/counter-bc (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Social Robot Interaction and HRI