You Think, You ACT: The New Task of Arbitrary Text to Motion Generation
Runqi Wang, Caoyuan Ma, Guopeng Li, Hanrui Xu, Yuke Li, Zheng Wang

TL;DR
This paper introduces a new task of generating human motions from arbitrary texts, extending beyond limited action labels, and proposes a dataset and framework to handle the inherent ambiguity and multiple valid outputs.
Contribution
It extends text-to-motion generation to arbitrary scene texts, creates a new dataset HUMANML3D++, and benchmarks multi-solution metrics for this challenging setting.
Findings
Generating motion from arbitrary texts is more challenging than the traditional setting, which relies on texts with explicit action labels.
The proposed framework effectively extracts action instructions from arbitrary texts.
Benchmarking reveals the need for multi-solution evaluation metrics.
Abstract
Text to Motion aims to generate human motions from texts. Existing settings rely on limited Action Texts that include action labels, which limits flexibility and practicability in scenarios difficult to describe directly. This paper extends limited Action Texts to arbitrary ones. Scene texts without explicit action labels can enhance the practicality of models in complex and diverse industries such as virtual human interaction, robot behavior generation, and film production, while also supporting the exploration of potential implicit behavior patterns. However, newly introduced Scene Texts may yield multiple reasonable output results, causing significant challenges in existing data, framework, and evaluation. To address this practical issue, we first create a new dataset HUMANML3D++ by extending texts of the largest existing dataset HUMANML3D. Secondly, we propose a simple yet effective…
Taxonomy
Topics: Semantic Web and Ontologies · Multi-Agent Systems and Negotiation · Topic Modeling
