OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data
Bin Cao, Sipeng Zheng, Hao Luo, Boyuan Li, Jing Liu, Zongqing Lu

TL;DR
OpenT2M introduces a large-scale, high-quality, open-source motion dataset and a simple yet effective motion model, significantly improving the generalization and zero-shot performance of text-to-motion generation.
Contribution
The paper presents OpenT2M, a comprehensive, rigorously validated motion dataset, and MonoFrill, a novel motion model with a new tokenizer, advancing T2M research.
Findings
OpenT2M improves the generalization of T2M models.
The 2D-PRQ tokenizer achieves superior reconstruction.
MonoFrill performs well in zero-shot scenarios.
Abstract
Text-to-motion (T2M) generation aims to create realistic human movements from text descriptions, with promising applications in animation and robotics. Despite recent progress, current T2M models perform poorly on unseen text descriptions due to the small scale and limited diversity of existing motion datasets. To address this problem, we introduce OpenT2M, a million-level, high-quality, and open-source motion dataset containing over 2800 hours of human motion. Each sequence undergoes rigorous quality control through physical feasibility validation and multi-granularity filtering, with detailed second-wise text annotations. We also develop an automated pipeline for creating long-horizon sequences, enabling complex motion generation. Building upon OpenT2M, we introduce MonoFrill, a pretrained motion model that achieves compelling T2M results without complicated designs or technique…
Taxonomy
Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Human Pose and Action Recognition
