TL;DR
This paper introduces Unified Motion Flow (UMF), a text-to-motion generation model that is number-free, meaning it is not tied to a fixed number of agents. It uses hierarchical flow-based generation to avoid the inefficiency and error accumulation of autoregressive approaches.
Contribution
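The paper proposes UMF, which combines Pyramid Motion Flow (P-Flow) and Semi-Noise Motion Flow (S-Flow) over a unified latent space. This enables efficient, unified training across heterogeneous motion datasets and text-driven multi-person motion generation without a fixed agent count.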
Findings
UMF outperforms existing methods in multi-person motion generation.
User studies confirm the quality and realism of generated motions.
UMF demonstrates strong generalization across diverse motion datasets.
Abstract
Generative models excel at motion synthesis for a fixed number of agents but struggle to generalize to a variable number of agents. Because they rely on limited, domain-specific data, existing methods employ autoregressive models to generate motion recursively, which suffer from inefficiency and error accumulation. We propose Unified Motion Flow (UMF), which consists of Pyramid Motion Flow (P-Flow) and Semi-Noise Motion Flow (S-Flow). UMF decomposes number-free motion generation into a single-pass motion prior generation stage and multi-pass reaction generation stages. Specifically, UMF utilizes a unified latent space to bridge the distribution gap between heterogeneous motion datasets, enabling effective unified training. For motion prior generation, P-Flow operates on hierarchical resolutions conditioned on different noise levels, thereby mitigating computational overhead. For reaction generation,…
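The two-stage decomposition described in the abstract can be pictured with a small control-flow sketch. This is not the authors' implementation: `generate_prior`, `generate_reaction`, and `generate_motion` are hypothetical names, both stand-in generators return random placeholders instead of running a flow model, and conditioning each reaction pass on all previously generated agents is an assumption made for illustration.

```python
import random

def generate_prior(text, num_frames, dim, rng):
    """Hypothetical stand-in for the single-pass motion prior stage."""
    # A real model would condition on `text`; this placeholder ignores it
    # and draws a random (num_frames x dim) motion for the first agent.
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]

def generate_reaction(text, context, rng):
    """Hypothetical stand-in for one reaction generation pass."""
    # `context` holds the motions of all agents generated so far (assumption).
    # The placeholder perturbs their per-frame mean instead of running a model.
    num_frames, dim = len(context[0]), len(context[0][0])
    reaction = []
    for t in range(num_frames):
        frame = []
        for d in range(dim):
            mean = sum(agent[t][d] for agent in context) / len(context)
            frame.append(mean + rng.gauss(0.0, 0.1))
        reaction.append(frame)
    return reaction

def generate_motion(text, num_agents, num_frames=60, dim=8, seed=0):
    """One prior pass, then one reaction pass per additional agent."""
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")
    rng = random.Random(seed)
    motions = [generate_prior(text, num_frames, dim, rng)]
    for _ in range(num_agents - 1):
        motions.append(generate_reaction(text, motions, rng))
    return motions  # indexed as motions[agent][frame][feature]

motions = generate_motion("three people dance in a circle", num_agents=3)
```

The point of the sketch is that the agent count is an argument of the sampling loop, not a property of the model's output shape, so the same code path serves one, two, or more agents.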
