Reward function shape exploration in adversarial imitation learning: an empirical study
Yawei Wang, Xiu Li

TL;DR
This study empirically investigates how different reward function shapes affect the performance of adversarial imitation learning algorithms across various continuous control tasks, highlighting the effectiveness of positive logarithmic rewards.
Contribution
The paper systematically compares multiple reward function shapes in adversarial imitation learning, using extensive experiments to assess their suitability for different tasks.
Findings
Positive logarithmic reward functions perform well in continuous control tasks.
The so-called unbiased reward function is limited to specific kinds of tasks.
Several designed reward functions show excellent performance across environments.
Abstract
For adversarial imitation learning algorithms (AILs), no true rewards are obtained from the environment for learning the strategy. However, pseudo rewards based on the output of the discriminator are still required. Given the implicit reward bias problem in AILs, we design several representative reward function shapes and compare their performance through large-scale experiments. To ensure the reliability of our results, we conduct the experiments on a series of MuJoCo and Box2D continuous control tasks with four different AILs. We also compare the performance of the reward function shapes under varying numbers of expert trajectories. The empirical results reveal that the positive logarithmic reward function works well in typical continuous control tasks. In contrast, the so-called unbiased reward function is limited to specific kinds of tasks. Furthermore, several designed…
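As a rough illustration of what a "reward function shape" means here, the sketch below computes three pseudo rewards that are common in the adversarial imitation learning literature. This summary does not reproduce the paper's exact definitions, so the forms are assumptions: D is taken to be the discriminator's probability that a state-action pair came from the expert, the positive logarithmic shape is taken as -log(1 - D), the negative one as log D, and the log-odds form log D - log(1 - D) is the one often described as unbiased. The function names are hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Each function takes the discriminator logit, with D = sigmoid(logit).
# Assumption: D is the probability that (s, a) came from the expert.

def positive_log_reward(logit: float) -> float:
    """-log(1 - D): always >= 0."""
    return softplus(logit)


def negative_log_reward(logit: float) -> float:
    """log D: always <= 0."""
    return -softplus(-logit)


def log_odds_reward(logit: float) -> float:
    """log D - log(1 - D): can take either sign (equals the logit)."""
    return positive_log_reward(logit) + negative_log_reward(logit)


if __name__ == "__main__":
    for logit in (-2.0, 0.0, 2.0):
        print(
            f"logit={logit:+.1f}  "
            f"positive={positive_log_reward(logit):.3f}  "
            f"negative={negative_log_reward(logit):.3f}  "
            f"log_odds={log_odds_reward(logit):.3f}"
        )
```

The sign of each shape is what matters for reward bias: a reward that is always positive encourages the agent to keep an episode going, while one that is always negative encourages it to terminate early, independent of how expert-like the behavior is.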
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics
