LaxMotion: Rethinking Supervision Granularity for 3D Human Motion Generation
Sheng Liu, Yuanzhi Liang, Sidan Du

TL;DR
LaxMotion introduces a novel supervision paradigm for 3D human motion generation that emphasizes structural consistency over exact coordinate fitting, enabling better generalization and diverse motion synthesis without direct 3D pose supervision.
Contribution
The paper proposes a new framework that learns 3D human motion through relaxed supervision using global trajectories and 2D cues, improving generalization and diversity.
Findings
Achieves performance comparable to or better than fully supervised methods.
Generates diverse, coherent, and semantically aligned 3D motions.
Reduces reliance on precise 3D annotations, enhancing scalability.
Abstract
Recent 3D human motion generation models demonstrate remarkable reconstruction accuracy yet struggle to generalize beyond training distributions. This limitation arises partly from the use of precise 3D supervision, which encourages models to fit fixed coordinate patterns instead of learning the essential 3D structure and motion semantic cues required for robust generalization. To overcome this limitation, we propose LaxMotion, a framework that synthesizes realistic 3D motions without direct 3D pose supervision. Instead of regressing toward exact coordinates, LaxMotion learns 3D motion as a consistent explanation of global trajectories and monocular 2D kinematic cues. We introduce a structured motion factorization together with a reformulated training paradigm under relaxed observability. This design is further supported by relaxed regularization objectives that enforce view consistent…
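As a rough illustration of what learning 3D motion as "a consistent explanation of global trajectories and monocular 2D kinematic cues" could mean, the sketch below scores a candidate 3D pose only by how well it reprojects onto observed 2D keypoints and how well its root matches an observed global position. It is a minimal sketch under stated assumptions, not the paper's objective: the pinhole camera, the choice of joint 0 as the root, the weight `w_traj`, and all function names are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def project(p: Vec3, focal: float = 1.0) -> Vec2:
    """Pinhole projection of a camera-space point (z > 0) onto the image plane."""
    x, y, z = p
    return (focal * x / z, focal * y / z)


def reprojection_loss(joints_3d: List[Vec3], keypoints_2d: List[Vec2]) -> float:
    """Mean squared distance between projected joints and observed 2D keypoints."""
    total = 0.0
    for joint, (u_obs, v_obs) in zip(joints_3d, keypoints_2d):
        u, v = project(joint)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total / len(keypoints_2d)


def trajectory_loss(root_pred: Vec3, root_obs: Vec3) -> float:
    """Squared distance between predicted and observed global root position."""
    return sum((a - b) ** 2 for a, b in zip(root_pred, root_obs))


def relaxed_loss(joints_3d: List[Vec3], keypoints_2d: List[Vec2],
                 root_obs: Vec3, w_traj: float = 1.0) -> float:
    """Hypothetical combined objective; joint 0 is assumed to be the root."""
    return (reprojection_loss(joints_3d, keypoints_2d)
            + w_traj * trajectory_loss(joints_3d[0], root_obs))


# One frame with three joints; the observations are derived from pose_a.
pose_a = [(0.0, 0.0, 4.0), (0.5, 1.0, 4.0), (-0.5, 1.0, 4.0)]
observed_2d = [project(p) for p in pose_a]
root_obs = pose_a[0]

# Slide the non-root joints along their camera rays:
# a different 3D pose with the same 2D projection and the same root.
pose_b = [pose_a[0], (0.625, 1.25, 5.0), (-0.375, 0.75, 3.0)]

# Shift the whole pose sideways: both the 2D and the trajectory terms disagree.
pose_c = [(x + 1.0, y, z) for (x, y, z) in pose_a]

loss_a = relaxed_loss(pose_a, observed_2d, root_obs)
loss_b = relaxed_loss(pose_b, observed_2d, root_obs)
loss_c = relaxed_loss(pose_c, observed_2d, root_obs)
```

In this toy setup `pose_a` and `pose_b` are equally good explanations of the observations even though they differ in 3D, while `pose_c` is penalized. That ambiguity is one reason a relaxed objective of this kind would need extra regularization on top of the 2D and trajectory terms.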
Taxonomy
Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Human Pose and Action Recognition
