Causal Imitation Learning under Temporally Correlated Noise

Gokul Swamy; Sanjiban Choudhury; J. Andrew Bagnell; Zhiwei Steven Wu

arXiv:2202.01312·cs.LG·February 4, 2022·5 cites

TL;DR

This paper introduces two algorithms for imitation learning from expert demonstrations whose recorded actions are corrupted by temporally correlated noise. Both use instrumental variable regression (IVR) to recover the underlying policy without access to an interactive expert.

Contribution

It applies instrumental variable regression (IVR) techniques from econometrics to imitation learning, proposing DoubIL, a generative-modeling approach that can use simulator access, and ResiduIL, a game-theoretic approach that runs entirely offline.

Findings

01. Both algorithms compare favorably to behavioral cloning on simulated control tasks.

02. DoubIL and ResiduIL break up the spurious state-action correlations that temporally correlated noise introduces.

03. Neither algorithm requires access to an interactive expert.

Abstract

We develop algorithms for imitation learning from policy data that was corrupted by temporally correlated noise in expert actions. When noise affects multiple timesteps of recorded data, it can manifest as spurious correlations between states and actions that a learner might latch on to, leading to poor policy performance. To break up these spurious correlations, we apply modern variants of the instrumental variable regression (IVR) technique of econometrics, enabling us to recover the underlying policy without requiring access to an interactive expert. In particular, we present two techniques, one of a generative-modeling flavor (DoubIL) that can utilize access to a simulator, and one of a game-theoretic flavor (ResiduIL) that can be run entirely offline. We find both of our algorithms compare favorably to behavioral cloning on simulated control tasks.
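To make the spurious-correlation problem concrete, the following is a minimal sketch on a toy scalar system, not an implementation of DoubIL or ResiduIL. It assumes a linear expert policy `a_t = theta * s_t`, additive noise `u_t + u_{t-1}` on the recorded action, dynamics `s_{t+1} = s_t + a_t + w_t`, and the previous state `s_{t-1}` as the instrument. All of these names and modeling choices are illustrative assumptions. Classical two-stage least squares stands in for the modern IVR variants mentioned in the abstract.

```python
import numpy as np


def simulate(theta=-0.5, n_steps=50_000, seed=0):
    """Roll out a toy expert whose recorded actions carry temporally correlated noise.

    Assumed (hypothetical) model:
        a_t     = theta * s_t + u_t + u_{t-1}
        s_{t+1} = s_t + a_t + w_t
    The shock u_{t-1} moves s_t (through a_{t-1}) and also appears in a_t,
    so s_t is correlated with the noise in a_t.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n_steps + 1)       # u[t + 1] is the shock drawn at step t
    w = 0.1 * rng.normal(size=n_steps)     # small independent process noise
    s = np.zeros(n_steps + 1)
    a = np.zeros(n_steps)
    for t in range(n_steps):
        a[t] = theta * s[t] + u[t + 1] + u[t]
        s[t + 1] = s[t] + a[t] + w[t]
    return s[:-1], a


def ols_slope(x, y):
    """Least-squares slope of y on x (no intercept; all signals are zero-mean)."""
    return float(x @ y / (x @ x))


def two_stage_least_squares(z, x, y):
    """Classical 2SLS: project x onto the instrument z, then regress y on the projection."""
    x_hat = z * ols_slope(z, x)   # stage 1: the part of x explained by z
    return ols_slope(x_hat, y)    # stage 2: regress y on that part only


true_theta = -0.5
states, actions = simulate(theta=true_theta)

# Align (s_{t-1}, s_t, a_t) triples.
s_prev, s_t, a_t = states[:-1], states[1:], actions[1:]

theta_bc = ols_slope(s_t, a_t)                         # behavioral cloning = plain regression
theta_iv = two_stage_least_squares(s_prev, s_t, a_t)   # previous state as instrument

print(f"true theta: {true_theta:+.3f}")
print(f"behavioral cloning estimate: {theta_bc:+.3f}")
print(f"2SLS estimate: {theta_iv:+.3f}")
```

In this setup the plain regression absorbs the shared shock into its slope, while the instrumented estimate uses only the variation in `s_t` that is predictable from `s_{t-1}`, which is independent of the noise in `a_t` under the stated assumptions.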

Code & Models

Repositories

gkswamy98/causal_il (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics