Genetic Imitation Learning by Reward Extrapolation

Boyuan Zheng; Jianlong Zhou; Fang Chen

arXiv:2301.07182 · cs.NE · January 19, 2023

Open Access

TL;DR

This paper introduces GenIL, a novel imitation learning method combining genetic algorithms with reward extrapolation to improve data efficiency and policy performance in limited data scenarios.

Contribution

The paper presents GenIL, a new approach that integrates genetic algorithms into imitation learning to enhance reward estimation and data efficiency.

Findings

01. GenIL outperforms previous extrapolation methods in extrapolation accuracy.

02. GenIL demonstrates robustness in limited data settings.

03. GenIL achieves superior policy performance in Atari and MuJoCo environments.

Abstract

Imitation learning demonstrates remarkable performance in various domains. However, it is also constrained by many prerequisites. The research community has worked intensively to alleviate these constraints, for example by adding a stochastic policy to avoid unseen states, eliminating the need for action labels, and learning from suboptimal demonstrations. Inspired by the natural reproduction process, we propose GenIL, a method that integrates the Genetic Algorithm with imitation learning. The Genetic Algorithm improves data efficiency by reproducing trajectories with various returns and helps the model estimate more accurate and compact reward function parameters. We tested GenIL in both the Atari and MuJoCo domains, and the results show that it outperforms previous extrapolation methods in extrapolation accuracy,…
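The two ideas named in the abstract, reproducing trajectories with genetic operators and fitting a reward function from ranked trajectories, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that trajectories are plain lists of scalar state features, that reproduction means one-point crossover plus Gaussian mutation, that the reward is linear with a single weight `w`, and that ranking uses a Bradley-Terry style pairwise loss. All function names are hypothetical, and the abstract does not say how GenIL ranks offspring, so the sketch only compares the two parent demonstrations.

```python
import math
import random


def crossover(parent_a, parent_b, rng):
    """One-point crossover (assumed operator): a prefix of parent_a
    joined to a suffix of parent_b. Both parents need length >= 2."""
    cut_a = rng.randint(1, len(parent_a) - 1)
    cut_b = rng.randint(1, len(parent_b) - 1)
    return parent_a[:cut_a] + parent_b[cut_b:]


def mutate(traj, rng, noise=0.1, rate=0.2):
    """Assumed mutation: add Gaussian noise to each step with probability `rate`."""
    return [s + rng.gauss(0.0, noise) if rng.random() < rate else s for s in traj]


def predicted_return(traj, w):
    """Toy linear reward r(s) = w * s, summed over the trajectory."""
    return sum(w * s for s in traj)


def ranking_loss(worse, better, w):
    """Bradley-Terry negative log-likelihood that `better` outranks `worse`.

    Equals log(1 + exp(-diff)), written in a numerically stable form.
    """
    diff = predicted_return(better, w) - predicted_return(worse, w)
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


if __name__ == "__main__":
    rng = random.Random(0)
    low = [0.1, 0.2, 0.1, 0.0]    # hypothetical low-return demonstration
    high = [0.9, 1.0, 0.8, 0.9]   # hypothetical high-return demonstration

    # Reproduce a new trajectory from the two demonstrations.
    child = mutate(crossover(low, high, rng), rng)
    print("offspring length:", len(child))

    # A weight that orders the parents correctly gets a lower loss.
    print("loss with w=+1:", ranking_loss(low, high, 1.0))
    print("loss with w=-1:", ranking_loss(low, high, -1.0))
```

In this toy setup, a reward weight that ranks the higher-return demonstration above the lower-return one yields a smaller loss, which is the signal a learner would follow when fitting reward parameters from ranked pairs.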

Taxonomy

Topics: Reinforcement Learning in Robotics