Dense Motion Captioning
Shiyao Xu, Benedetta Liberatori, Gül Varol, Paolo Rota

TL;DR
This paper introduces Dense Motion Captioning, a new task for temporally localizing and describing actions in 3D human motion sequences, supported by a large-scale, richly annotated dataset and a novel model that outperforms existing methods.
Contribution
It proposes the Dense Motion Captioning task, creates the Complex Motion Dataset (CompMo), and develops the DEMO model that effectively generates dense, temporally grounded motion captions.
Findings
DEMO outperforms existing methods on CompMo
CompMo contains 60,000 richly annotated motion sequences
The approach establishes a new baseline for 3D motion understanding and captioning
Abstract
Recent advances in 3D human motion and language integration have primarily focused on text-to-motion generation, leaving the task of motion understanding relatively unexplored. We introduce Dense Motion Captioning, a novel task that aims to temporally localize and caption actions within 3D human motion sequences. Current datasets fall short in providing detailed temporal annotations and predominantly consist of short sequences featuring few actions. To overcome these limitations, we present the Complex Motion Dataset (CompMo), the first large-scale dataset featuring richly annotated, complex motion sequences with precise temporal boundaries. Built through a carefully designed data generation pipeline, CompMo includes 60,000 motion sequences, each composed of two to ten actions that are accurately annotated with their temporal extents. We further present DEMO, a…
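To make the task output concrete, here is a minimal sketch of what a dense motion caption could look like as data: a list of captioned segments with start and end times. The `Segment` schema, the use of seconds as the time unit, and the `temporal_iou` helper are assumptions for illustration and are not taken from the paper. Temporal IoU is a common way to compare predicted and reference time spans in localization tasks, but the abstract does not state which metrics the authors use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One captioned action with its temporal extent in seconds (hypothetical schema)."""
    start: float
    end: float
    caption: str


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0


def is_valid_annotation(segments: List[Segment],
                        min_actions: int = 2,
                        max_actions: int = 10) -> bool:
    """Check the two-to-ten action count stated in the abstract and that
    every segment has positive duration."""
    if not (min_actions <= len(segments) <= max_actions):
        return False
    return all(s.end > s.start for s in segments)


# Made-up example values, not drawn from CompMo.
reference = [
    Segment(0.0, 4.0, "a person walks forward"),
    Segment(4.0, 6.0, "a person sits down"),
]
predicted = Segment(1.0, 5.0, "someone walks ahead")
```

Calling `temporal_iou(predicted, reference[0])` compares the predicted span against the first reference action, and `is_valid_annotation(reference)` checks that a sequence carries between two and ten actions.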
Taxonomy
Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Human Pose and Action Recognition
