Efficient Text-driven Motion Generation via Latent Consistency Training
Mengxian Hu, Minghao Zhu, Xun Zhou, Qingqing Yan, Shu Li, Chengju Liu, Qijun Chen

TL;DR
This paper introduces MLCT, a novel framework for efficient text-driven human motion generation that precomputes diffusion trajectories during training, enabling fast inference with minimal computational overhead.
Contribution
The paper proposes a motion autoencoder with quantization, classifier-free guidance, and a clustering guidance module to improve efficiency and consistency in motion generation.
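To make two of the named components concrete, here is a minimal NumPy sketch under stated assumptions. It is not taken from the paper's code. The function names, the tanh-then-round quantizer, the number of levels, and the guidance formula are illustrative choices: the quantizer only shows one way to obtain a bounded, finite set of latent values, and the guidance blend is the common classifier-free form, which may differ from the paper's exact formulation.

```python
import numpy as np


def quantize_latent(z, num_levels=5):
    """Illustrative bounded quantizer (an assumption, not the paper's exact module).

    tanh squashes each latent value into (-1, 1); rounding then snaps it to one
    of `num_levels` evenly spaced values in [-1, 1].
    """
    z = np.tanh(np.asarray(z, dtype=float))
    half = (num_levels - 1) / 2.0
    return np.round(z * half) / half


def classifier_free_guidance(cond_out, uncond_out, guidance_scale):
    """Common classifier-free guidance blend of two model outputs.

    guidance_scale = 0 returns the unconditional output, 1 returns the
    conditional output, and values above 1 extrapolate toward the condition.
    """
    cond_out = np.asarray(cond_out, dtype=float)
    uncond_out = np.asarray(uncond_out, dtype=float)
    return uncond_out + guidance_scale * (cond_out - uncond_out)
```

With `num_levels=5`, every quantized value lies in {-1, -0.5, 0, 0.5, 1}, which is the sense in which the latent distribution is "concise and bounded" in this sketch.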
Findings
Outperforms traditional methods in efficiency and cost
Achieves stable training in non-pixel and latent spaces
Matches state-of-the-art performance with lower inference costs
Abstract
Text-driven human motion generation based on diffusion strategies establishes a reliable foundation for multimodal applications in human-computer interactions. However, existing advances face significant efficiency challenges due to the substantial computational overhead of iteratively solving for nonlinear reverse diffusion trajectories during the inference phase. To this end, we propose the motion latent consistency training framework (MLCT), which precomputes reverse diffusion trajectories from raw data in the training phase and enables few-step or single-step inference via self-consistency constraints in the inference phase. Specifically, a motion autoencoder with quantization constraints is first proposed for constructing concise and bounded solution distributions for motion diffusion processes. Subsequently, a classifier-free guidance format is constructed via an additional…
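The few-step or single-step inference mentioned in the abstract can be illustrated with the generic multistep sampling loop used by consistency models. The sketch below is an assumption-laden illustration, not the MLCT sampler: `toy_consistency_fn` is a hypothetical stand-in for a trained network, and the noise levels, `t_min`, and latent shape are arbitrary. The only property the stand-in shares with a real consistency function is the boundary condition that it acts as the identity at the smallest noise level.

```python
import numpy as np


def multistep_consistency_sample(consistency_fn, shape, timesteps, t_min=0.002, seed=0):
    """Generic consistency-model sampling.

    `timesteps` is a decreasing list of noise levels; the first entry is the
    maximum level. A single entry gives one-step generation; each extra entry
    adds one re-noise and denoise refinement step.
    """
    rng = np.random.default_rng(seed)
    t_max = timesteps[0]
    # One-step generation: map pure noise at t_max straight to a clean sample.
    x = consistency_fn(t_max * rng.standard_normal(shape), t_max)
    for t in timesteps[1:]:
        # Re-noise the current estimate to level t, then map it back again.
        noise = rng.standard_normal(shape)
        x_t = x + np.sqrt(max(t**2 - t_min**2, 0.0)) * noise
        x = consistency_fn(x_t, t)
    return x


def toy_consistency_fn(x_t, t, target=0.5, t_min=0.002):
    """Hypothetical stand-in for a trained model (illustration only).

    Pulls the input toward a fixed target, more strongly at higher noise
    levels, and is exactly the identity at t = t_min.
    """
    weight = t_min / t
    return weight * x_t + (1.0 - weight) * target


# Single-step and two-step sampling of a hypothetical 4 x 8 latent.
one_step = multistep_consistency_sample(toy_consistency_fn, (4, 8), [80.0])
two_step = multistep_consistency_sample(toy_consistency_fn, (4, 8), [80.0, 10.0])
```

The cost of inference in this scheme is one network call per entry in `timesteps`, which is why a short schedule is so much cheaper than iterating a full reverse diffusion trajectory.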
Taxonomy
Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Natural Language Processing Techniques
Methods: Diffusion
