Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Jianan Li, Xiao Chen, Tao Huang, Tien-Tsin Wong

TL;DR
This paper introduces Mimic2DM, a framework that learns to control 3D characters in physics simulations by generating and mimicking 2D motions extracted from videos, eliminating the need for 3D motion data.
Contribution
The paper trains a 2D motion tracking policy directly from 2D keypoints and pairs it with a transformer-based 2D motion generator, so that the simulated 3D character produces diverse, physically plausible motions.
Findings
Effective synthesis of diverse motions like dancing, soccer, and animal movements.
No reliance on explicit 3D motion data or reconstruction techniques.
Versatile framework applicable across multiple domains.
Abstract
Video data is more cost-effective than motion capture data for learning 3D character motion controllers, yet synthesizing realistic and diverse behaviors directly from videos remains challenging. Previous approaches typically rely on off-the-shelf motion reconstruction techniques to obtain 3D trajectories for physics-based imitation. These reconstruction methods struggle with generalizability, as they either require 3D training data (potentially scarce) or fail to produce physically plausible poses, hindering their application to challenging scenarios like human-object interaction (HOI) or non-human characters. We tackle this challenge by introducing Mimic2DM, a novel motion imitation framework that learns the control policy directly and solely from widely available 2D keypoint trajectories extracted from videos. By minimizing the reprojection error, we train a general single-view 2D…
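To make the term "reprojection error" in the abstract concrete, here is a minimal sketch of how such an error could be computed for a single frame. It is not the authors' implementation: the pinhole camera model, the camera-frame joint coordinates, the mean Euclidean pixel distance, and the names `project_pinhole`, `reprojection_error`, `focal`, and `center` are all assumptions made for illustration, since the excerpt does not specify the camera model or loss.

```python
import math
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_pinhole(joint: Point3D, focal: float, center: Point2D) -> Point2D:
    """Project a camera-frame 3D point to pixel coordinates (assumed pinhole model)."""
    x, y, z = joint
    if z <= 0.0:
        raise ValueError("joint must lie in front of the camera (z > 0)")
    return (focal * x / z + center[0], focal * y / z + center[1])


def reprojection_error(
    joints_3d: Sequence[Point3D],
    keypoints_2d: Sequence[Point2D],
    focal: float = 1000.0,
    center: Point2D = (0.0, 0.0),
) -> float:
    """Mean Euclidean pixel distance between projected joints and reference keypoints."""
    if len(joints_3d) != len(keypoints_2d):
        raise ValueError("joints_3d and keypoints_2d must have the same length")
    if not joints_3d:
        raise ValueError("at least one joint is required")
    total = 0.0
    for joint, (u_ref, v_ref) in zip(joints_3d, keypoints_2d):
        u, v = project_pinhole(joint, focal, center)
        total += math.hypot(u - u_ref, v - v_ref)
    return total / len(joints_3d)


# Hypothetical two-joint example: simulated joints vs. keypoints from a video frame.
simulated_joints = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
video_keypoints = [(50.0, 50.0), (103.0, 54.0)]
error = reprojection_error(simulated_joints, video_keypoints, focal=100.0, center=(50.0, 50.0))
print(error)
```

In an imitation setting, a quantity like this would typically be turned into a reward (lower error, higher reward) for the simulated character's pose at each step. How the paper does this is not stated in the excerpt above.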
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Robot Manipulation and Learning
