Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions
Alejandro Escontrela, Xue Bin Peng, Wenhao Yu, Tingnan Zhang, Atil Iscen, Ken Goldberg, and Pieter Abbeel

TL;DR
This paper introduces adversarial motion priors as a way to replace complex reward functions in reinforcement learning, enabling naturalistic, transferable behaviors in simulated and real quadrupedal robots with minimal data.
Contribution
It demonstrates that style rewards learned from motion capture data can effectively substitute for complex hand-designed rewards, simplifying training and improving transferability to real robots.
Findings
Style rewards enable natural gait transitions.
Policies trained with style rewards transfer to real robots.
A few seconds of motion data suffice for effective style learning.
Abstract
Training a high-dimensional simulated agent with an under-specified reward function often leads the agent to learn physically infeasible strategies that are ineffective when deployed in the real world. To mitigate these unnatural behaviors, reinforcement learning practitioners often utilize complex reward functions that encourage physically plausible behaviors. However, a tedious labor-intensive tuning process is often required to create hand-designed rewards which might not easily generalize across platforms and tasks. We propose substituting complex reward functions with "style rewards" learned from a dataset of motion capture demonstrations. A learned style reward can be combined with an arbitrary task reward to train policies that perform tasks using naturalistic strategies. These natural strategies can also facilitate transfer to the real world. We build upon Adversarial Motion…
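The abstract's idea of combining a learned style reward with an arbitrary task reward can be illustrated with a minimal sketch. It assumes the least-squares discriminator reward from the original Adversarial Motion Priors formulation, max(0, 1 - 0.25 (d - 1)^2), which the truncated abstract above does not spell out. The function names and the equal weights `w_task` and `w_style` are hypothetical placeholders, not values taken from the paper.

```python
import numpy as np


def style_reward(disc_score):
    """Map a discriminator score d to a bounded style reward.

    Assumed form (least-squares AMP reward): max(0, 1 - 0.25 * (d - 1)^2).
    A score of 1 (looks like the reference motion) gives reward 1;
    scores at or beyond -1 or 3 give reward 0.
    """
    d = np.asarray(disc_score, dtype=float)
    return np.maximum(0.0, 1.0 - 0.25 * (d - 1.0) ** 2)


def combined_reward(task_reward, disc_score, w_task=0.5, w_style=0.5):
    """Weighted sum of a task reward and the learned style reward.

    w_task and w_style are illustrative weights (assumptions), to be tuned
    per task.
    """
    r_task = np.asarray(task_reward, dtype=float)
    return w_task * r_task + w_style * style_reward(disc_score)


if __name__ == "__main__":
    # Hypothetical per-step task rewards and discriminator scores.
    task = np.array([1.0, 0.2, 0.8])
    scores = np.array([1.0, -1.0, 0.0])
    print(combined_reward(task, scores))
```

In this sketch the task reward says what to do (for example, track a commanded velocity) while the style term says how to do it, so only the two weights need tuning rather than a long list of hand-designed shaping terms.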
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
