OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data

Bin Cao; Sipeng Zheng; Hao Luo; Boyuan Li; Jing Liu; Zongqing Lu

arXiv:2603.18623·cs.CV·March 20, 2026

TL;DR

OpenT2M introduces a large-scale, high-quality, open-source motion dataset and a simple motion model, MonoFrill, that together improve how well text-to-motion generation generalizes to unseen text descriptions, including zero-shot settings.

Contribution

The paper presents OpenT2M, a comprehensive, rigorously validated motion dataset, and MonoFrill, a novel motion model with a new tokenizer, advancing T2M research.

Findings

01. OpenT2M improves generalization of T2M models.

02. 2D-PRQ tokenizer achieves superior reconstruction.

03. MonoFrill performs well in zero-shot scenarios.

Abstract

Text-to-motion (T2M) generation aims to create realistic human movements from text descriptions, with promising applications in animation and robotics. Despite recent progress, current T2M models perform poorly on unseen text descriptions due to the small scale and limited diversity of existing motion datasets. To address this problem, we introduce OpenT2M, a million-level, high-quality, and open-source motion dataset containing over 2800 hours of human motion. Each sequence undergoes rigorous quality control through physical feasibility validation and multi-granularity filtering, with detailed second-wise text annotations. We also develop an automated pipeline for creating long-horizon sequences, enabling complex motion generation. Building upon OpenT2M, we introduce MonoFrill, a pretrained motion model that achieves compelling T2M results without complicated designs or technique…

Taxonomy

Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Human Pose and Action Recognition