Coordinating Multiple Conditions for Trajectory-Controlled Human Motion Generation
Deli Cai, Haoyang Ma, Changxing Ding

TL;DR
This paper introduces CMC, a two-stage, divide-and-conquer framework for generating realistic human motion conditioned on text and trajectories. It targets two problems in existing methods: conflicts between the text and trajectory conditions, and instability during trajectory control.
Contribution
CMC is a decoupled, two-stage diffusion-based framework that improves control accuracy and motion quality by effectively coordinating text and trajectory conditions.
Findings
CMC achieves state-of-the-art control accuracy on the HumanML3D and KIT datasets.
The Selective Inpainting Mechanism (SIM) enhances training stability and motion consistency.
Experiments show more realistic motion and better coordination between the text and trajectory conditions.
Abstract
Trajectory-controlled human motion generation aims to synthesize realistic human motions conditioned on both textual descriptions and spatial trajectories. However, existing methods suffer from two critical limitations: first, the conflict between text and trajectory conditions disrupts the denoising process, resulting in compromised motion quality or inaccurate trajectory following; second, the use of redundant motion representations introduces inconsistencies between motion components, leading to instability during trajectory control. To address these challenges, we propose CMC, a decoupled framework that coordinates text and trajectory conditions through a divide-and-conquer strategy. CMC comprises two cascaded stages: Trajectory Control and Motion Completion. In the first stage, a diffusion model generates a simplified…
