MotionFlux: Efficient Text-Guided Motion Generation through Rectified Flow Matching and Preference Alignment
Zhiting Gao, Dan Song, Diqiong Jiang, Chao Xue, An-An Liu

TL;DR
This paper introduces MotionFLUX, a real-time, text-guided motion generation framework using rectified flow matching, and TAPO, a preference optimization method that improves semantic alignment, resulting in faster and more accurate motion synthesis.
Contribution
It presents MotionFLUX for efficient, real-time motion generation and TAPO for enhanced semantic alignment, advancing the state-of-the-art in text-driven motion synthesis.
Findings
Outperforms existing methods in semantic consistency and motion quality
Achieves real-time motion synthesis with reduced inference time
Demonstrates significant speedup over traditional diffusion models
Abstract
Motion generation is essential for animating virtual characters and embodied agents. While recent text-driven methods have made significant strides, they often struggle with achieving precise alignment between linguistic descriptions and motion semantics, as well as with the inefficiencies of slow, multi-step inference. To address these issues, we introduce TMR++ Aligned Preference Optimization (TAPO), an innovative framework that aligns subtle motion variations with textual modifiers and incorporates iterative adjustments to reinforce semantic grounding. To further enable real-time synthesis, we propose MotionFLUX, a high-speed generation framework based on deterministic rectified flow matching. Unlike traditional diffusion models, which require hundreds of denoising steps, MotionFLUX constructs optimal transport paths between noise distributions and motion spaces, facilitating…
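The abstract's contrast between multi-step diffusion and deterministic rectified flow matching can be illustrated with a small sketch. This is not the authors' implementation. It shows only the generic rectified-flow recipe: a straight-line interpolation between a noise sample and a data sample, a constant velocity target along that line, and Euler integration at sampling time. All names (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`, `make_toy_velocity`) and the toy array shape are assumptions made for illustration. The toy velocity field stands in for a trained, text-conditioned network.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Velocity of the straight path; it is constant in t."""
    return x1 - x0


def flow_matching_loss(predicted_velocity, x0, x1):
    """Mean squared error between a predicted velocity and the straight-path target."""
    return float(np.mean((predicted_velocity - target_velocity(x0, x1)) ** 2))


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t never reaches 1.0, so the toy field below stays finite
        x = x + dt * velocity_fn(x, t)
    return x


def make_toy_velocity(x1):
    """Hypothetical stand-in for a trained model: points straight at a fixed target x1."""
    def velocity_fn(x, t):
        return (x1 - x) / (1.0 - t)
    return velocity_fn


rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 3))   # toy "motion": 4 frames, 3 features (assumed shape)
motion = rng.standard_normal((4, 3))  # toy target sample

one_step = euler_sample(make_toy_velocity(motion), noise, num_steps=1)
four_steps = euler_sample(make_toy_velocity(motion), noise, num_steps=4)
```

Because the path in this toy setup is exactly straight, one Euler step and four Euler steps reach the same endpoint. That is the intuition behind few-step sampling with rectified flows. A learned velocity field is only approximately straight, so in practice the step count trades speed against accuracy.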
