MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu, Shaoyang Hua, Zili Lin, Yifan Liu, Feipeng Ma, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng

TL;DR
MotionBank introduces a large-scale, diverse video motion dataset with rule-based annotations to improve the development and benchmarking of versatile motion models for various human motion tasks.
Contribution
The paper presents MotionBank, a benchmark that consolidates 13 video action datasets into 1.24 million motion sequences with rule-based text annotations, addressing the small scale and limited context of previous motion datasets.
Findings
MotionBank enhances human motion generation and understanding tasks.
Rule-based annotations improve motion-text alignment.
Large-scale dataset benefits general motion modeling.
Abstract
In this paper, we tackle the problem of how to build and benchmark a large motion model (LMM). The ultimate goal of LMM is to serve as a foundation model for versatile motion-related tasks, e.g., human motion generation, with interpretability and generalizability. Though advanced, recent LMM-related works are still limited by small-scale motion data and costly text descriptions. Besides, previous motion benchmarks primarily focus on pure body movements, neglecting the ubiquitous motions in context, i.e., humans interacting with humans, objects, and scenes. To address these limitations, we consolidate large-scale video action datasets as knowledge banks to build MotionBank, which comprises 13 video action datasets, 1.24M motion sequences, and 132.9M frames of natural and diverse human motions. Different from laboratory-captured motions, in-the-wild human-centric videos contain abundant…
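To put the scale figures quoted in the abstract in perspective, here is a minimal sketch that derives per-sequence and per-dataset averages from them. The three totals come from the abstract; the function and variable names are hypothetical, and the results are plain averages, not statistics reported by the authors.

```python
# Totals quoted in the abstract (rounded figures, so the averages are approximate).
NUM_DATASETS = 13
NUM_SEQUENCES = 1_240_000   # 1.24M motion sequences
NUM_FRAMES = 132_900_000    # 132.9M frames


def average(total: int, count: int) -> float:
    """Return the arithmetic mean of `total` spread over `count` items."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total / count


# Hypothetical derived quantities, for illustration only.
frames_per_sequence = average(NUM_FRAMES, NUM_SEQUENCES)
sequences_per_dataset = average(NUM_SEQUENCES, NUM_DATASETS)

if __name__ == "__main__":
    print(f"Average frames per sequence: {frames_per_sequence:.1f}")
    print(f"Average sequences per source dataset: {sequences_per_dataset:,.0f}")
```

Under these rounded totals, a sequence averages roughly 107 frames. The real distribution across the 13 source datasets is likely uneven, so these means are only an order-of-magnitude guide.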
Taxonomy
Topics: Video Analysis and Summarization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
