# Dyna-AIL: Adversarial Imitation Learning by Planning

**Authors:** Vaibhav Saxena, Srinivasan Sivanandan, Pulkit Mathur

arXiv: 1903.03234 · 2019-03-11

## TL;DR

Dyna-AIL is an adversarial imitation learning method that alternates between model-based planning and model-free learning from expert data, reducing the number of environment interactions needed to converge on discrete and continuous control tasks.

## Contribution

The paper proposes an end-to-end differentiable adversarial imitation learning algorithm in a Dyna-like framework that switches between model-based planning and model-free learning from expert data.

## Key findings

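- Converges to an optimal policy with fewer environment interactions than the state-of-the-art learning methods it is compared against.
- Results cover both discrete and continuous environments.
- The gain in sample efficiency comes from combining model-based planning with model-free learning.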

## Abstract

Adversarial methods for imitation learning have been shown to perform well on various control tasks. However, they require a large number of environment interactions for convergence. In this paper, we propose an end-to-end differentiable adversarial imitation learning algorithm in a Dyna-like framework for switching between model-based planning and model-free learning from expert data. Our results on both discrete and continuous environments show that our approach of using model-based planning along with model-free learning converges to an optimal policy with fewer environment interactions than state-of-the-art learning methods.
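
To make the Dyna-like switching described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the toy chain environment, the tabular policy and discriminator, the `DynaAILSketch` class and its method names, and the hyperparameters. It also swaps in a score-function (REINFORCE-style) policy update rather than the end-to-end differentiable update the paper proposes. What it shows is the control flow: real rollouts train the discriminator, the policy, and a learned dynamics model, while extra planning rollouts inside the learned model update the policy without consuming environment interactions.

```python
import math
import random

N_STATES, ACTIONS, HORIZON = 5, (0, 1), 8  # toy chain (assumed); action 1 moves right


def env_step(s, a):
    """True dynamics of the toy chain, standing in for the real environment."""
    return min(N_STATES - 1, s + 1) if a == 1 else max(0, s - 1)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class DynaAILSketch:
    """Hypothetical tabular illustration of a Dyna-style adversarial imitation loop."""

    def __init__(self, lr=0.5, seed=0):
        self.rng = random.Random(seed)
        self.lr = lr
        self.theta = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}  # policy logits
        self.w = dict(self.theta)  # discriminator logits: D(s, a) = P(pair is expert)
        self.model = {}            # learned dynamics: (s, a) -> next state
        self.env_steps = 0         # counts real environment interactions only

    def pi(self, s):
        """Softmax policy over actions, computed in a numerically stable way."""
        m = max(self.theta[(s, a)] for a in ACTIONS)
        z = [math.exp(self.theta[(s, a)] - m) for a in ACTIONS]
        total = sum(z)
        return [v / total for v in z]

    def rollout(self, use_model):
        """Sample a trajectory from the learned model or from the real environment."""
        s, traj = 0, []
        for _ in range(HORIZON):
            a = self.rng.choices(ACTIONS, weights=self.pi(s))[0]
            if use_model:
                if (s, a) not in self.model:
                    break  # the model has no data for this pair yet
                s2 = self.model[(s, a)]
            else:
                s2 = env_step(s, a)
                self.env_steps += 1
                self.model[(s, a)] = s2  # fit the (deterministic) dynamics model
            traj.append((s, a))
            s = s2
        return traj

    def update_discriminator(self, expert, policy_samples):
        """One logistic-regression step: expert pairs up, policy pairs down."""
        for s, a in expert:
            self.w[(s, a)] += self.lr * (1.0 - sigmoid(self.w[(s, a)]))
        for s, a in policy_samples:
            self.w[(s, a)] -= self.lr * sigmoid(self.w[(s, a)])

    def update_policy(self, traj):
        """Score-function update with reward log D(s, a) (assumed reward form)."""
        rewards = [math.log(sigmoid(self.w[(s, a)]) + 1e-8) for s, a in traj]
        for t, (s, a) in enumerate(traj):
            ret = sum(rewards[t:])  # return-to-go from step t
            probs = self.pi(s)
            for b in ACTIONS:
                grad = (1.0 if b == a else 0.0) - probs[b]
                self.theta[(s, b)] += self.lr * ret * grad / HORIZON

    def train(self, expert, iterations=30, planning_rollouts=5):
        for _ in range(iterations):
            real = self.rollout(use_model=False)   # model-free phase: real interaction
            self.update_discriminator(expert, real)
            self.update_policy(real)
            for _ in range(planning_rollouts):     # planning phase: no real interaction
                self.update_policy(self.rollout(use_model=True))


expert = [(s, 1) for s in range(N_STATES)]  # assumed expert: always move right
agent = DynaAILSketch(seed=0)
agent.train(expert, iterations=30, planning_rollouts=5)
print("environment interactions:", agent.env_steps)
```

In this sketch the policy receives six updates per iteration (one real, five imagined) while `env_steps` grows only by the eight steps of the single real rollout, so 30 iterations cost 30 × 8 = 240 environment interactions. That ratio is the sample-efficiency lever a Dyna-like scheme relies on; how the paper balances the two phases is not stated in this summary.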

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.03234/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1903.03234/full.md

## References

13 references — full list in the complete paper: https://tomesphere.com/paper/1903.03234/full.md

---
Source: https://tomesphere.com/paper/1903.03234