Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis
Ye Yuan, Kris Kitani

TL;DR
This paper introduces residual force control (RFC), a novel method that enhances humanoid motion synthesis by compensating for dynamics mismatch, enabling realistic, agile, and long-term human behavior imitation including complex motions like ballet.
Contribution
The paper proposes RFC, a new approach that adds residual forces to improve humanoid control policies, allowing for more realistic and diverse long-term motion synthesis from large-scale datasets.
Findings
RFC outperforms state-of-the-art methods in motion quality and convergence speed.
The approach enables humanoids to perform complex ballet moves like pirouettes and arabesques.
RFC is presented as the first humanoid control method to learn from large-scale human motion datasets and synthesize diverse long-term motions.
Abstract
Reinforcement learning has shown great promise for synthesizing realistic human behaviors by learning humanoid control policies from motion capture data. However, it is still very challenging to reproduce sophisticated human skills like ballet dance, or to stably imitate long-term human behaviors with complex transitions. The main difficulty lies in the dynamics mismatch between the humanoid model and real humans. That is, motions of real humans may not be physically possible for the humanoid model. To overcome the dynamics mismatch, we propose a novel approach, residual force control (RFC), that augments a humanoid control policy by adding external residual forces into the action space. During training, the RFC-based policy learns to apply residual forces to the humanoid to compensate for the dynamics mismatch and better imitate the reference motion. Experiments on a wide range of…
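The idea of adding residual forces to the action space can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-dimensional point-mass "humanoid", the actuator limit, and the squared-norm penalty on the residual force are all assumptions chosen for illustration. The sketch only shows how a flat policy output might be split into ordinary actuator commands plus an external residual force, and how that residual can make up for force the actuators cannot supply.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RFCAction:
    joint_torques: List[float]   # ordinary actuator commands
    residual_force: List[float]  # extra external force (the residual part)


def split_action(action: List[float], num_joints: int) -> RFCAction:
    """Split a flat policy output into actuator commands and a residual force."""
    if len(action) <= num_joints:
        raise ValueError("action has no residual-force components")
    return RFCAction(action[:num_joints], action[num_joints:])


def residual_penalty(residual_force: List[float], weight: float = 0.1) -> float:
    """Hypothetical regularizer: weighted squared norm of the residual force."""
    return weight * sum(f * f for f in residual_force)


def step_point_mass(
    pos: float,
    vel: float,
    actuator_force: float,
    residual_force: float,
    mass: float = 1.0,
    dt: float = 0.1,
    max_force: float = 1.0,
) -> Tuple[float, float]:
    """Toy 1-D dynamics: the actuator is clipped, the residual force is not."""
    clipped = max(-max_force, min(max_force, actuator_force))
    acc = (clipped + residual_force) / mass
    vel = vel + acc * dt          # semi-implicit Euler: update velocity first
    pos = pos + vel * dt          # then position with the new velocity
    return pos, vel


if __name__ == "__main__":
    act = split_action([0.1, 0.2, 0.3, 1.0, 2.0, 3.0], num_joints=3)
    print(act.joint_torques, act.residual_force)
    # The reference motion needs a force of 5, but the actuator saturates at 1.
    print(step_point_mass(0.0, 0.0, actuator_force=5.0, residual_force=0.0))
    # A residual force of 4 supplies the missing part.
    print(step_point_mass(0.0, 0.0, actuator_force=5.0, residual_force=4.0))
```

In this toy setting the clipped actuator stands in for the dynamics mismatch: the reference motion demands more force than the model can produce, and the residual term covers the gap. A penalty such as `residual_penalty` would discourage the policy from relying on the residual more than necessary; whether and how the paper regularizes residual forces is not stated in the excerpt above.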
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Video Analysis and Summarization
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
