ARC - Actor Residual Critic for Adversarial Imitation Learning
Ankur Deka, Changliu Liu, Katia Sycara

TL;DR
This paper introduces Actor Residual Critic (ARC) algorithms, which exploit the differentiability of the adversarial imitation learning reward to improve policy training in both discrete and continuous settings, and which outperform standard AIL on robotic tasks.
Contribution
The paper proposes ARC algorithms that utilize a residual critic to enhance adversarial imitation learning by exploiting reward differentiability, with proven convergence and improved empirical results.
Findings
ARC algorithms outperform standard AIL in continuous control tasks.
Policy iteration with a residual critic converges to an optimal policy in the finite-state case.
ARC algorithms are simple to implement and integrate into existing AIL frameworks.
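The finite-state finding can be illustrated with a minimal sketch, not the authors' implementation. It assumes a hypothetical three-state deterministic MDP (the `NEXT` and `REWARD` tables are invented for illustration) and defines the residual critic C(s, a) as the discounted return that follows the immediate reward, so that r(s, a) + C(s, a) plays the role of the usual Q value. Policy iteration on C is then compared against ordinary Q-value iteration.

```python
GAMMA = 0.9

# Hypothetical deterministic MDP, invented for illustration only.
# NEXT[s][a] is the successor state, REWARD[s][a] the immediate reward.
NEXT = [[0, 1], [0, 2], [2, 2]]
REWARD = [[0.0, 0.0], [0.1, 1.0], [0.0, 0.5]]
N_STATES, N_ACTIONS = 3, 2


def evaluate_residual_critic(policy, sweeps=500):
    """Iterate C(s,a) = gamma * (r(s',a') + C(s',a')) with a' = policy[s']."""
    C = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        C = [[GAMMA * (REWARD[NEXT[s][a]][policy[NEXT[s][a]]]
                       + C[NEXT[s][a]][policy[NEXT[s][a]]])
              for a in range(N_ACTIONS)] for s in range(N_STATES)]
    return C


def greedy(C):
    """Improve the policy by maximizing r(s,a) + C(s,a), the implied Q value."""
    return [max(range(N_ACTIONS), key=lambda a: REWARD[s][a] + C[s][a])
            for s in range(N_STATES)]


def residual_policy_iteration():
    """Alternate evaluation of C and greedy improvement until the policy is stable."""
    policy = [0] * N_STATES
    while True:
        C = evaluate_residual_critic(policy)
        new_policy = greedy(C)
        if new_policy == policy:
            return policy, C
        policy = new_policy


def q_value_iteration(sweeps=500):
    """Standard Q-value iteration, used here only as a reference solution."""
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        Q = [[REWARD[s][a] + GAMMA * max(Q[NEXT[s][a]])
              for a in range(N_ACTIONS)] for s in range(N_STATES)]
    return Q


if __name__ == "__main__":
    policy, C = residual_policy_iteration()
    print(policy)
```

On this toy MDP the loop settles on the policy `[1, 1, 1]`, and `REWARD[s][a] + C[s][a]` agrees with the reference `Q[s][a]` up to numerical tolerance. The example only checks consistency on one small problem; it is not a substitute for the paper's convergence proof.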
Abstract
Adversarial Imitation Learning (AIL) is a class of popular state-of-the-art Imitation Learning algorithms commonly used in robotics. In AIL, an artificial adversary's misclassification is used as a reward signal that is optimized by any standard Reinforcement Learning (RL) algorithm. Unlike most RL settings, the reward in AIL is differentiable, but current model-free RL algorithms do not make use of this property to train a policy. The reward in AIL is also shaped, since it comes from an adversary. We leverage the differentiability property of the shaped AIL reward function and formulate a class of Actor Residual Critic (ARC) RL algorithms. ARC algorithms draw a parallel to the standard Actor-Critic (AC) algorithms in RL literature and use a residual critic, the C function (instead of the standard Q function), to approximate only the discounted future return (excluding the immediate…
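The relationship between the residual critic and the standard critic can be written out as a short sketch. The notation here is an assumption, since the abstract does not fix it: $r(s, a)$ is the AIL reward, $\gamma$ the discount factor, $\pi$ the policy, $Q^{\pi}$ the standard critic, and $C^{\pi}$ the residual critic.

```latex
% Residual critic: discounted return excluding the immediate reward
C^{\pi}(s_t, a_t)
  = \mathbb{E}_{\pi}\!\left[\sum_{k=1}^{\infty} \gamma^{k}\, r(s_{t+k}, a_{t+k})\right]
  = Q^{\pi}(s_t, a_t) - r(s_t, a_t)

% Resulting one-step recursion
C^{\pi}(s_t, a_t)
  = \gamma\, \mathbb{E}_{s_{t+1},\, a_{t+1} \sim \pi}\!\left[\, r(s_{t+1}, a_{t+1}) + C^{\pi}(s_{t+1}, a_{t+1}) \,\right]
```

Under this reading, an actor that maximizes $r(s, a) + C^{\pi}(s, a)$ can differentiate the $r$ term directly through the adversary and needs a learned approximation only for $C^{\pi}$, which appears to be the property the abstract refers to.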
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
