T2MBench: A Benchmark for Out-of-Distribution Text-to-Motion Generation
Bin Yang, Rong Ou, Weisheng Xu, Jiaqi Xiong, Xintao Li, Taowen Wang, Luyu Zhu, Xu Jiang, Jing Tan, Renjing Xu

TL;DR
This paper introduces T2MBench, a comprehensive benchmark for evaluating text-to-motion models on out-of-distribution inputs, revealing current models' limitations and guiding future improvements.
Contribution
It presents a new OOD benchmark with a large prompt dataset and a unified evaluation framework for assessing text-to-motion models under complex, out-of-distribution conditions.
Findings
Models excel in semantic alignment but struggle with fine-grained accuracy.
Most models have limited generalization in OOD scenarios.
The benchmark provides practical insights for improving text-to-motion models.
Abstract
Most existing evaluations of text-to-motion generation focus on in-distribution textual inputs and a limited set of evaluation criteria, which restricts their ability to systematically assess model generalization and motion generation capabilities under complex out-of-distribution (OOD) textual conditions. To address this limitation, we propose a benchmark specifically designed for OOD text-to-motion evaluation, which includes a comprehensive analysis of 14 representative baseline models and two datasets derived from the evaluation results. Specifically, we construct an OOD prompt dataset consisting of 1,025 textual descriptions. Based on this prompt dataset, we introduce a unified evaluation framework that integrates LLM-based Evaluation, Multi-factor Motion Evaluation, and Fine-grained Accuracy Evaluation. Our experimental results reveal that while different baseline models…
Taxonomy
Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
