StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng

TL;DR
StableMoFusion is a new diffusion-based framework for human motion generation that improves efficiency and robustness by analyzing and tailoring network components, while also addressing foot skating issues through contact-aware corrections.
Contribution
It provides a comprehensive analysis of diffusion model components and introduces tailored designs for efficient, high-quality motion generation with foot skating correction.
Findings
Outperforms current state-of-the-art methods in human motion generation
Reduces computational overhead for real-time applications
Effectively eliminates foot skating in generated motions
Abstract
Thanks to the powerful generative capacity of diffusion models, recent years have witnessed rapid progress in human motion generation. Existing diffusion-based methods employ disparate network architectures and training strategies, and the effect of each component's design is still unclear. In addition, the iterative denoising process incurs considerable computational overhead, which is prohibitive for real-time scenarios such as virtual characters and humanoid robots. For this reason, we first conduct a comprehensive investigation into network architectures, training strategies, and inference processes. Based on this analysis, we tailor each component for efficient, high-quality human motion generation. Despite its promising performance, the tailored model still suffers from foot skating, which is a ubiquitous issue in diffusion-based solutions. To eliminate footskate, we…
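To make the foot skating problem mentioned in the abstract concrete, here is a minimal sketch of one simple way to quantify it: the fraction of ground-contact frames in which a foot joint still slides horizontally. This is an illustration only, not the paper's correction method or its evaluation metric. The function name `foot_skate_ratio`, the y-up coordinate convention, the 20 fps default, and both thresholds are assumptions chosen for the example.

```python
import numpy as np

def foot_skate_ratio(foot_pos, fps=20.0, height_thresh=0.05, speed_thresh=0.2):
    """Fraction of ground-contact frame pairs in which the foot slides.

    foot_pos: (T, 3) positions of one foot joint in metres, y-up (assumed).
    height_thresh, speed_thresh: illustrative values, not taken from the paper.
    """
    foot_pos = np.asarray(foot_pos, dtype=float)
    # Horizontal (x, z) displacement between consecutive frames, in m/s.
    horiz_speed = np.linalg.norm(np.diff(foot_pos[:, [0, 2]], axis=0), axis=1) * fps
    # A frame pair is "in contact" if the foot is near the ground in both frames.
    near_ground = foot_pos[:, 1] < height_thresh
    contact = near_ground[:-1] & near_ground[1:]
    if not contact.any():
        return 0.0
    # Skating: the foot is in contact but still moving horizontally.
    skating = contact & (horiz_speed > speed_thresh)
    return float(skating.sum() / contact.sum())

# A planted foot never skates; a grounded foot sliding along x always does.
planted = np.zeros((10, 3))
sliding = np.zeros((10, 3))
sliding[:, 0] = np.arange(10) * 0.05
print(foot_skate_ratio(planted), foot_skate_ratio(sliding))
```

A height threshold is only a crude proxy for contact; a learned or annotated contact signal would replace `near_ground` in a more careful setup.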
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Advanced Vision and Imaging
Methods: Diffusion
