Coordinate-Based Dual-Constrained Autoregressive Motion Generation
Kang Ding, Hongsong Wang, Jie Gui, and Liang Wang

TL;DR
This paper introduces CDAMD, an autoregressive framework for text-to-motion generation that takes motion coordinates as input and applies dual constraints to improve fidelity and semantic consistency. It also establishes new benchmarks.
Contribution
The paper proposes a coordinate-based autoregressive model with dual constraints. It targets two limitations of existing text-to-motion methods: error amplification during noise prediction in diffusion models, and mode collapse caused by motion discretization in autoregressive models.
Findings
Reports state-of-the-art performance on the newly established benchmarks.
Enhances motion fidelity using diffusion-inspired multi-layer perceptrons.
Guides generation with a dual-constrained causal mask.
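For readers unfamiliar with causal masking, the following is a minimal sketch of a plain causal attention mask, the standard building block that autoregressive generation relies on. It is not the paper's Dual-Constrained Causal Mask, whose two constraints are not specified in this summary; the function name `causal_mask` and its parameter are hypothetical and chosen only for illustration.

```python
def causal_mask(num_tokens: int) -> list[list[bool]]:
    """Build a generic causal attention mask (illustrative, not CDAMD's mask).

    mask[i][j] is True when position i may attend to position j,
    i.e. when j is at or before i in the sequence.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    # Lower-triangular pattern: each position sees itself and its past only.
    return [[j <= i for j in range(num_tokens)] for i in range(num_tokens)]


if __name__ == "__main__":
    # Print the mask for a 4-token sequence, one row per query position.
    for row in causal_mask(4):
        print("".join("1" if allowed else "0" for allowed in row))
```

Each row corresponds to one query position. A constrained variant would restrict or extend which entries are set, but how CDAMD does so is left to the paper itself.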
Abstract
Text-to-motion generation has attracted increasing attention in the research community recently, with potential applications in animation, virtual reality, robotics, and human-computer interaction. Diffusion and autoregressive models are two popular and parallel research directions for text-to-motion generation. However, diffusion models often suffer from error amplification during noise prediction, while autoregressive models exhibit mode collapse due to motion discretization. To address these limitations, we propose a flexible, high-fidelity, and semantically faithful text-to-motion framework, named Coordinate-based Dual-constrained Autoregressive Motion Generation (CDAMD). With motion coordinates as input, CDAMD follows the autoregressive paradigm and leverages diffusion-inspired multi-layer perceptrons to enhance the fidelity of predicted motions. Furthermore, a Dual-Constrained…
