Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
Peng Jin, Yang Wu, Yanbo Fan, Zhongqian Sun, Yang Wei, Li Yuan

TL;DR
This paper introduces hierarchical semantic graphs for fine-grained, controllable human motion generation from text. The global-to-local graph structure captures details that a single sentence-level text representation tends to miss.
Contribution
The paper proposes a hierarchical semantic graph framework for fine-grained control of motion synthesis, decomposing each text description into three semantic levels: motions, actions, and specifics.
Findings
Outperforms existing methods on HumanML3D and KIT datasets.
Enables continuous motion refinement by adjusting graph edge weights.
Demonstrates effective fine-grained control over generated motions.
Abstract
Most text-driven human motion generation methods employ sequential modeling approaches, e.g., transformer, to extract sentence-level text representations automatically and implicitly for human motion synthesis. However, these compact text representations may overemphasize the action names at the expense of other important properties and lack fine-grained details to guide the synthesis of subtly distinct motion. In this paper, we propose hierarchical semantic graphs for fine-grained control over motion generation. Specifically, we disentangle motion descriptions into hierarchical semantic graphs including three levels of motions, actions, and specifics. Such global-to-local structures facilitate a comprehensive understanding of motion description and fine-grained control of motion generation. Correspondingly, to leverage the coarse-to-fine topology of hierarchical semantic graphs, we…
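The three-level decomposition described in the abstract can be illustrated with a small data structure. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `SemanticGraph`, the helper `build_example`, the example sentence, and the choice to link each node only to a node one level coarser are all hypothetical. The sketch shows only how a description might be split into motion, action, and specific nodes, with weighted edges that can be adjusted afterwards.

```python
from dataclasses import dataclass, field

# Coarse-to-fine levels named in the abstract.
LEVELS = ("motion", "action", "specific")


@dataclass
class SemanticGraph:
    """Hypothetical container for a three-level semantic graph."""
    nodes: dict = field(default_factory=dict)  # node text -> level name
    edges: dict = field(default_factory=dict)  # (parent, child) -> weight

    def add_node(self, text, level):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.nodes[text] = level

    def add_edge(self, parent, child, weight=1.0):
        # Assumption: a parent is exactly one level coarser than its child.
        gap = LEVELS.index(self.nodes[child]) - LEVELS.index(self.nodes[parent])
        if gap != 1:
            raise ValueError("edges must connect adjacent levels, coarse to fine")
        self.edges[(parent, child)] = weight

    def children(self, parent):
        # Map each child of `parent` to the weight of the connecting edge.
        return {c: w for (p, c), w in self.edges.items() if p == parent}


def build_example():
    """Build a graph for one made-up description."""
    sentence = "a person walks forward slowly"
    g = SemanticGraph()
    g.add_node(sentence, "motion")     # global level: the whole description
    g.add_node("walks", "action")      # middle level: the action
    g.add_node("forward", "specific")  # local level: attributes of the action
    g.add_node("slowly", "specific")
    g.add_edge(sentence, "walks")
    g.add_edge("walks", "forward")
    g.add_edge("walks", "slowly")
    return g


graph = build_example()
# Emphasize one specific by raising its edge weight (illustrative only).
graph.edges[("walks", "slowly")] = 2.0
```

In this sketch, raising the weight on the edge from "walks" to "slowly" stands in for the kind of edge-weight adjustment listed under Findings. How such weights actually influence the generated motion depends on the model and is not covered here.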
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Multimodal Machine Learning Applications
