Causal Imitation Learning with Unobserved Confounders

Junzhe Zhang; Daniel Kumor; Elias Bareinboim

arXiv:2208.06267 · cs.LG · August 15, 2022 · 23 citations

TL;DR

This paper investigates imitation learning in scenarios with unobserved confounders, providing a causal framework to determine when imitation is feasible and proposing methods to learn policies under these conditions.

Contribution

It introduces a non-parametric causal criterion for feasibility of imitation learning with unobserved confounders and develops an efficient policy learning procedure.

Findings

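01 A complete (both necessary and sufficient) graphical criterion for determining whether imitation is feasible.

02 When the criterion does not hold, imitation may still be feasible by exploiting quantitative knowledge of the expert trajectories.

03 An efficient algorithm for learning the imitating policy from expert trajectories.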

Abstract

One of the common ways children learn is by mimicking adults. Imitation learning focuses on learning policies with suitable performance from demonstrations generated by an expert, with an unspecified performance measure, and unobserved reward signal. Popular methods for imitation learning start by either directly mimicking the behavior policy of an expert (behavior cloning) or by learning a reward function that prioritizes observed expert trajectories (inverse reinforcement learning). However, these methods rely on the assumption that covariates used by the expert to determine her/his actions are fully observed. In this paper, we relax this assumption and study imitation learning when sensory inputs of the learner and the expert differ. First, we provide a non-parametric, graphical criterion that is complete (both necessary and sufficient) for determining the feasibility of imitation…
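The assumption the abstract questions can be illustrated with a toy model. The sketch below is not taken from the paper: the binary covariate `U`, the reward function, and all names (`expert_policy`, `cloned_policy`, `expected_reward`) are hypothetical choices made for illustration. It assumes an expert who observes `U` and acts on it, and a learner who cannot observe `U` and therefore can only clone the expert's marginal action distribution.

```python
from itertools import product

# Hypothetical toy model (not from the paper): U is a fair binary covariate
# seen by the expert but hidden from the learner.
P_U = {0: 0.5, 1: 0.5}

def reward(x, u):
    """Latent reward: 1 when the action matches the hidden covariate."""
    return 1.0 if x == u else 0.0

def expert_policy(u):
    """Expert acts on U directly; returns P(X = x | U = u) as a dict."""
    return {u: 1.0, 1 - u: 0.0}

def expected_reward(policy):
    """Exact expected reward of a policy mapping u -> distribution over x."""
    return sum(P_U[u] * policy(u)[x] * reward(x, u)
               for u, x in product(P_U, (0, 1)))

# Behavior cloning without access to U can only fit the expert's marginal P(X).
marginal_x = {x: sum(P_U[u] * expert_policy(u)[x] for u in P_U) for x in (0, 1)}

def cloned_policy(u):
    """Learner cannot observe u, so it ignores it and samples from P(X)."""
    return marginal_x

expert_value = expected_reward(expert_policy)
cloned_value = expected_reward(cloned_policy)
print(expert_value, cloned_value)
```

In this toy setting the cloned policy reproduces the expert's action frequencies exactly, yet its expected reward is lower, because the actions are no longer correlated with the hidden covariate that determines the reward.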

Videos

Causal Imitation Learning With Unobserved Confounders · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Machine Learning and Algorithms