Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis

Tian Xu; Ziniu Li; Yang Yu; Zhi-Quan Luo

arXiv:2208.01899·cs.LG·May 5, 2026

TL;DR

This paper provides a theoretical analysis of adversarial imitation learning, explaining its strong performance with limited expert data and long planning horizons through a novel stage-coupled analysis.

Contribution

It introduces a horizon-free imitation gap bound for TV-AIL, clarifying why AIL performs well with few trajectories and long horizons.

Findings

1. Imitation gap bound is at most 1 regardless of horizon.

2. Bound is meaningful in small and large sample regimes.

3. Analysis leverages multi-stage policy structure and dynamic programming.

Abstract

Imitation learning learns a policy from expert trajectories. While the expert data is believed to be crucial for imitation quality, it was found that one class of imitation learning approaches, adversarial imitation learning (AIL), can perform exceptionally well. With as little as one expert trajectory, AIL can match the expert performance even over a long horizon, on tasks such as locomotion control. There are two mysterious points in this phenomenon. First, why can AIL perform well with only a few expert trajectories? Second, why does AIL maintain good performance despite the length of the planning horizon? In this paper, we theoretically explore these two questions. For a total-variation-distance-based AIL (called TV-AIL), our analysis shows a horizon-free imitation gap $O(\min\{1, |S|/N\})$ on a class of instances abstracted from locomotion control tasks.…
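To make the "total-variation-distance-based" objective concrete, the following is a minimal sketch of the total variation distance between an empirical expert state-action distribution and a policy's state-action distribution at a single time step. It is not the authors' TV-AIL algorithm, which optimizes a policy against such a discrepancy. The helper names (`empirical_distribution`, `tv_distance`) and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def empirical_distribution(samples):
    """Estimate a discrete distribution from a list of hashable samples."""
    counts = Counter(samples)
    total = len(samples)
    return {x: c / total for x, c in counts.items()}


def tv_distance(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Hypothetical (state, action) pairs observed at one time step.
expert_pairs = [("s0", "a0"), ("s0", "a0"), ("s1", "a1"), ("s1", "a0")]
policy_pairs = [("s0", "a0"), ("s1", "a1"), ("s1", "a1"), ("s2", "a0")]

expert_dist = empirical_distribution(expert_pairs)
policy_dist = empirical_distribution(policy_pairs)

# Discrepancy that a TV-based imitation objective would try to reduce.
gap = tv_distance(expert_dist, policy_dist)
print(gap)
```

The distance always lies in [0, 1] and is 0 exactly when the two distributions coincide. A TV-based method would sum a discrepancy of this kind over time steps and search for a policy that makes it small.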
