Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation
Zhihan Liu, Yufeng Zhang, Zuyue Fu, Zhuoran Yang, and Zhaoran Wang

TL;DR
This paper introduces provably efficient algorithms for generative adversarial imitation learning (GAIL) in both online and offline settings with linear function approximation, providing theoretical guarantees on regret and optimality gap.
Contribution
The paper develops two algorithms for GAIL with linear function approximation: OGAP (optimistic, for the online setting) and PGAP (pessimistic, for the offline setting). It proves a regret bound for OGAP and an optimality-gap bound for PGAP.
Findings
OGAP achieves $\tilde{O}(H^2 d^{3/2}K^{1/2}+KH^{3/2}dN_1^{-1/2})$ regret.
PGAP's optimality gap attains the minimax lower bound in its utilization of the additional dataset for offline GAIL.
Under sufficient coverage, PGAP achieves $\tilde{O}(H^{2}dK^{-1/2} +H^2d^{3/2}N_2^{-1/2}+H^{3/2}dN_1^{-1/2})$ optimality gap.
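To get a feel for how the terms in the two bounds trade off, the following is a minimal sketch that evaluates each term numerically. It drops all constants and logarithmic factors hidden by $\tilde{O}$, so the numbers indicate scaling only, not actual regret. The function names and the sample values of $H$, $d$, $K$, $N_1$, $N_2$ are illustrative assumptions, not taken from the paper.

```python
import math


def ogap_regret_terms(H, d, K, N1):
    """Terms of the OGAP regret bound, constants and log factors dropped."""
    k_term = H**2 * d**1.5 * math.sqrt(K)      # H^2 d^{3/2} K^{1/2}
    n1_term = K * H**1.5 * d / math.sqrt(N1)   # K H^{3/2} d N_1^{-1/2}
    return k_term, n1_term


def pgap_gap_terms(H, d, K, N1, N2):
    """Terms of the PGAP optimality-gap bound, constants and log factors dropped."""
    k_term = H**2 * d / math.sqrt(K)           # H^2 d K^{-1/2}
    n2_term = H**2 * d**1.5 / math.sqrt(N2)    # H^2 d^{3/2} N_2^{-1/2}
    n1_term = H**1.5 * d / math.sqrt(N1)       # H^{3/2} d N_1^{-1/2}
    return k_term, n2_term, n1_term


if __name__ == "__main__":
    # Hypothetical problem sizes chosen only for illustration.
    H, d, N1, N2 = 10, 4, 100, 400
    for K in (100, 10_000, 1_000_000):
        k_term, n1_term = ogap_regret_terms(H, d, K, N1)
        # Dividing by K gives the per-episode average of each term.
        print(K, k_term / K, n1_term / K)
    print(pgap_gap_terms(H, d, 10_000, N1, N2))
```

Dividing the OGAP bound by $K$ shows that the first term shrinks as $K^{-1/2}$ while the second stays at $H^{3/2}dN_1^{-1/2}$ regardless of $K$: under this bound, more interaction does not compensate for a small expert demonstration.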
Abstract
In generative adversarial imitation learning (GAIL), the agent aims to learn a policy from an expert demonstration so that its performance cannot be discriminated from the expert policy on a certain predefined reward set. In this paper, we study GAIL in both online and offline settings with linear function approximation, where both the transition and reward function are linear in the feature maps. Besides the expert demonstration, in the online setting the agent can interact with the environment, while in the offline setting the agent only accesses an additional dataset collected by a prior. For online GAIL, we propose an optimistic generative adversarial policy optimization algorithm (OGAP) and prove that OGAP achieves $\tilde{O}(H^2 d^{3/2}K^{1/2}+KH^{3/2}dN_1^{-1/2})$ regret. Here $N_1$ represents the number of trajectories of the expert demonstration, is the…
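The indistinguishability criterion in the abstract can be read as a worst-case performance gap over the reward set. The block below is a sketch in assumed notation, not quoted from the paper: $\mathcal{R}$ denotes the predefined reward set, $J(\pi, r)$ the expected cumulative reward of policy $\pi$ under reward $r$, $\pi^{E}$ the expert policy, and $\pi^{k}$ the policy used in episode $k$.

```latex
% Assumed notation; one common way to formalize the GAIL objective and regret.
D_{\mathcal{R}}(\pi^{E}, \pi)
  = \max_{r \in \mathcal{R}} \bigl[ J(\pi^{E}, r) - J(\pi, r) \bigr],
\qquad
\mathrm{Regret}(K)
  = \max_{r \in \mathcal{R}} \sum_{k=1}^{K} \bigl[ J(\pi^{E}, r) - J(\pi^{k}, r) \bigr].
```

Under this reading, a small $D_{\mathcal{R}}(\pi^{E}, \pi)$ means no reward in $\mathcal{R}$ can separate the learned policy from the expert by much.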
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
Methods: Generative Adversarial Imitation Learning
