Absolute Coordinates Make Motion Generation Easy

Zichong Meng; Zeyu Han; Xiaogang Peng; Yiming Xie; Huaizu Jiang

arXiv:2505.19377 · cs.CV · June 3, 2025

TL;DR

This paper shows that representing motion as absolute joint coordinates in global space, rather than the traditional local-relative representation, improves motion fidelity, text alignment, and scalability in text-to-motion generation, and supports downstream tasks more directly.

Contribution

The authors propose a simple motion representation based on absolute joint coordinates that outperforms local-relative representations in motion quality and integrates more easily with downstream tasks.

Findings

01 Higher motion fidelity with absolute coordinates

02 Improved text alignment and scalability

03 Supports downstream tasks without additional reengineering

Abstract

State-of-the-art text-to-motion generation models rely on the kinematic-aware, local-relative motion representation popularized by HumanML3D, which encodes motion relative to the pelvis and to the previous frame with built-in redundancy. While this design simplifies training for earlier generation models, it introduces critical limitations for diffusion models and hinders applicability to downstream tasks. In this work, we revisit the motion representation and propose a radically simplified and long-abandoned alternative for text-to-motion generation: absolute joint coordinates in global space. Through systematic analysis of design choices, we show that this formulation achieves significantly higher motion fidelity, improved text alignment, and strong scalability, even with a simple Transformer backbone and no auxiliary kinematic-aware losses. Moreover, our formulation naturally…
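To make the contrast between the two representations concrete, the following is a minimal sketch, not the authors' code and not the actual HumanML3D feature layout. It assumes motion is stored as a NumPy array of shape `(T, J, 3)` and that the pelvis is joint 0; the function names `to_local_relative` and `to_absolute` are hypothetical. The relative form here keeps only per-frame pelvis displacement and pelvis-relative joint offsets, which is far simpler than the real feature set.

```python
import numpy as np

def to_local_relative(abs_joints, root_idx=0):
    """Simplified relative encoding (illustrative assumption, not HumanML3D).

    abs_joints: (T, J, 3) joint positions in global space.
    Returns the first-frame root position, the per-frame root displacement
    (relative to the previous frame), and joint offsets relative to the root.
    """
    root = abs_joints[:, root_idx]                      # (T, 3)
    # Displacement from the previous frame; frame 0 gets zero displacement.
    root_vel = np.diff(root, axis=0, prepend=root[:1])  # (T, 3)
    local = abs_joints - root[:, None, :]               # (T, J, 3)
    return root[0], root_vel, local

def to_absolute(root0, root_vel, local):
    """Recover global joint positions by integrating root displacement."""
    root = root0 + np.cumsum(root_vel, axis=0)          # (T, 3)
    return local + root[:, None, :]
```

In this sketch, the global position of any joint at frame `t` is simply `abs_joints[t, j]` in the absolute form, whereas the relative form requires summing root displacements over all frames up to `t` before the position can be read.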

Taxonomy

Topics: Human Motion and Animation · Advanced Vision and Imaging · Robotic Mechanisms and Dynamics

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Dense Connections · Softmax · Diffusion · Position-Wise Feed-Forward Layer · Absolute Position Encodings