MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization
Shimin Chen, Wei Li, Chen Chen, Jianyang Gu, Jiaming Chu, Xunqiang Tao, Yandong Guo

TL;DR
This paper introduces MM-SEAL, a large-scale video dataset for multi-person multi-grained spatio-temporal action localization, along with a new network, Faster-TAD, to improve localization performance.
Contribution
The paper presents a novel large-scale dataset with detailed annotations and a new network architecture for simultaneous proposal generation and labeling in action localization.
Findings
Atomic action features enhance complex activity localization.
Features pretrained on MM-SEAL improve results on other benchmarks.
Faster-TAD effectively generates temporal proposals and labels.
Abstract
In this paper, we introduce a novel large-scale video dataset dubbed MM-SEAL for multi-person multi-grained spatio-temporal action localization in human daily life. We are the first to propose a new benchmark for multi-person spatio-temporal complex activity localization, where complex semantics and long durations bring new challenges to localization tasks. We observe that a limited set of atomic actions can be combined into many complex activities. MM-SEAL provides both atomic action and complex activity annotations, producing 111.7k atomic actions spanning 172 action categories and 17.7k complex activities spanning 200 activity categories. We explore the relationship between atomic actions and complex activities, finding that atomic action features can improve the complex activity localization performance. Also, we propose a new network which generates temporal proposals and labels…
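To put the two annotation granularities in proportion, the short sketch below derives the average number of instances per category from the rounded totals quoted in the abstract. The `GrainStats` class and `summarize` helper are hypothetical names used only for this illustration and are not part of MM-SEAL or its tooling. The results are simple ratios of reported totals, so they say nothing about how instances are actually distributed across categories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrainStats:
    """Instance and category counts for one annotation granularity.

    Hypothetical container for illustration; not an MM-SEAL data format.
    """
    name: str
    instances: int   # rounded total as quoted in the abstract
    categories: int  # number of categories as quoted in the abstract

    def mean_instances_per_category(self) -> float:
        # Plain average: total instances divided by number of categories.
        return self.instances / self.categories


# Rounded figures taken from the abstract (111.7k / 172 and 17.7k / 200).
ATOMIC = GrainStats("atomic action", 111_700, 172)
COMPLEX = GrainStats("complex activity", 17_700, 200)


def summarize(stats: GrainStats) -> str:
    """Format the per-category average for one granularity."""
    mean = stats.mean_instances_per_category()
    return f"{stats.name}: {mean:.1f} instances per category on average"


if __name__ == "__main__":
    for stats in (ATOMIC, COMPLEX):
        print(summarize(stats))
```

Under these rounded totals, atomic actions average roughly 649 instances per category while complex activities average 88.5, so the complex-activity annotations are considerably sparser per category.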
Taxonomy
Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Stroke Rehabilitation and Recovery
