Confounded Causal Imitation Learning with Instrumental Variables
Yan Zeng, Shenglan Nie, Feng Xie, Libo Huang, Peng Wu, and Zhi Geng

TL;DR
This paper introduces a novel causal imitation learning framework that leverages instrumental variables to address unmeasured confounders, enabling more accurate policy learning from demonstrations affected by hidden variables.
Contribution
It proposes a two-stage framework for valid IV identification and policy optimization in confounded imitation learning, extending IV applicability to multiple timesteps.
Findings
Successfully identified valid IVs in experiments
Improved policy accuracy over baseline methods
Validated effectiveness through extensive experiments
Abstract
Imitation learning from demonstrations usually suffers from the confounding effects of unmeasured variables (i.e., unmeasured confounders) on the states and actions. Ignoring them yields a biased estimate of the policy. To bridge this confounding gap, we draw on the power of instrumental variables (IV) and propose a Confounded Causal Imitation Learning (C2L) model. This model accommodates confounders that influence actions across multiple timesteps, rather than being restricted to immediate temporal dependencies. We develop a two-stage imitation learning framework for valid IV identification and policy optimization. In particular, in the first stage, we construct a testing criterion based on a defined pseudo-variable, with which we identify a valid IV for the C2L model. Such a criterion entails the sufficient and necessary…
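To make the role of an instrumental variable concrete, here is a minimal sketch on synthetic data. It is not the C2L method from the paper: it assumes a scalar, linear expert policy and uses textbook two-stage least squares in place of the paper's IV identification and policy optimization stages. All names (`simulate`, `ols_slope`, `two_stage_slope`) and the data-generating process are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n, theta, seed=0):
    """Toy confounded demonstrations (assumed linear model, not from the paper)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                         # instrument: affects the action only through the state
    u = rng.normal(size=n)                         # unmeasured confounder of state and action
    s = z + u + 0.1 * rng.normal(size=n)           # observed state
    a = theta * s + u + 0.1 * rng.normal(size=n)   # expert action; theta is the true policy slope
    return z, s, a


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))


def two_stage_slope(z, s, a):
    """Two-stage least squares: regress s on z, then a on the fitted s."""
    first_stage = ols_slope(z, s)
    s_hat = s.mean() + first_stage * (z - z.mean())  # part of the state explained by the instrument
    return ols_slope(s_hat, a)


z, s, a = simulate(n=50_000, theta=2.0)
naive = ols_slope(s, a)          # behavior cloning by plain regression, biased by u
iv = two_stage_slope(z, s, a)    # IV estimate, consistent under the assumptions above
print(f"naive slope: {naive:.3f}, IV slope: {iv:.3f}, true slope: 2.000")
```

In this toy setup the plain regression absorbs the confounder's effect into the policy slope, while the IV estimate uses only the variation in the state that is driven by the instrument. The sketch takes the instrument's validity for granted; deciding whether a candidate variable is a valid IV is the problem the paper's first stage addresses.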
Taxonomy
Topics: Machine Learning and Algorithms · Robot Manipulation and Learning
