SegTAD: Precise Temporal Action Detection via Semantic Segmentation

Chen Zhao; Merey Ramazanova; Mengmeng Xu; Bernard Ghanem

arXiv:2203.01542 · cs.CV · March 4, 2022

TL;DR

SegTAD introduces a novel semantic segmentation approach to improve the precision of temporal action detection in videos, addressing label imprecision and scale variation issues.

Contribution

The paper proposes a new semantic segmentation framework for TAD that leverages fine-grained annotations and combines segmentation with proposal detection.

Findings

01. Enhanced detection precision over existing methods
02. Effective handling of scale variations in actions
03. Improved training efficiency with semantic supervision

Abstract

Temporal action detection (TAD) is an important yet challenging task in video analysis. Most existing works draw inspiration from image object detection and tend to reformulate TAD as a proposal generation and classification problem. However, this paradigm has two caveats. First, proposals do not come with annotated labels, which have to be empirically compiled, so the information in the annotations is not necessarily employed precisely during model training. Second, the temporal scale of actions varies widely, and neglecting this fact may lead to deficient representation in the video features. To address these issues and precisely model temporal action detection, we formulate the task from a novel perspective of semantic segmentation. Owing to the 1-dimensional property of TAD, we are able to convert the coarse-grained…
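
The abstract is cut off at this point, so the paper's exact conversion procedure is not shown here. As a rough illustration only of what treating TAD as 1-dimensional semantic segmentation can look like, the sketch below turns segment-level annotations (start time, end time, class) into one label per frame. The function name `segments_to_frame_labels`, the `BACKGROUND` id, and the rounding rule are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical label id for frames that belong to no annotated action.
BACKGROUND = 0


def segments_to_frame_labels(
    segments: List[Tuple[float, float, int]],
    num_frames: int,
    fps: float,
) -> List[int]:
    """Convert (start_sec, end_sec, class_id) segments to per-frame labels.

    Assumption for this sketch: a frame index t covers time t / fps, and a
    segment labels frames in the half-open range [start * fps, end * fps).
    Later segments overwrite earlier ones where they overlap.
    """
    labels = [BACKGROUND] * num_frames
    for start_sec, end_sec, class_id in segments:
        # Clamp the frame range to the video length.
        first = max(0, int(round(start_sec * fps)))
        last = min(num_frames, int(round(end_sec * fps)))
        for t in range(first, last):
            labels[t] = class_id
    return labels


# Example: a 2-second video sampled at 2 fps with one action of class 3
# annotated from 1.0 s to 2.0 s.
frame_labels = segments_to_frame_labels([(1.0, 2.0, 3)], num_frames=4, fps=2.0)
```

Because the video timeline is one-dimensional, every frame falls either inside or outside an annotated segment, so the per-frame labels follow directly from the segment boundaries without any extra annotation.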

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications