Semi-Supervised Temporal Action Detection with Proposal-Free Masking
Sauradip Nag, Xiatian Zhu, Yi-Zhe Song, Tao Xiang

TL;DR
This paper introduces SPOT, a proposal-free semi-supervised method for temporal action detection. It runs localization and classification in parallel, with interaction mechanisms between the two, to reduce error propagation and improve performance.
Contribution
The paper proposes a novel proposal-free architecture for semi-supervised temporal action detection that eliminates localization-classification error propagation and enhances prediction accuracy.
Findings
SPOT outperforms state-of-the-art methods on standard benchmarks.
The parallel localization and classification architecture reduces error propagation.
Interaction mechanisms improve prediction refinement.
Abstract
Existing temporal action detection (TAD) methods rely on a large amount of training data with segment-level annotations. Collecting and annotating such a training set is thus highly expensive and unscalable. Semi-supervised TAD (SS-TAD) alleviates this problem by leveraging unlabeled videos freely available at scale. However, SS-TAD is also a much more challenging problem than supervised TAD, and consequently much under-studied. Prior SS-TAD methods directly combine an existing proposal-based TAD method and an SSL method. Due to their sequential localization (e.g., proposal generation) and classification design, they are prone to proposal error propagation. To overcome this limitation, in this work we propose a novel Semi-supervised Temporal action detection model based on PropOsal-free Temporal mask (SPOT) with a parallel localization (mask generation) and classification architecture.…
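The parallel design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `parallel_heads`, the single linear layer per branch, the tensor shapes, and the choice of one temporal mask per snippet are all assumptions made for illustration. The point it shows is only that both branches read the same snippet features and neither consumes the other's output, so a localization mistake cannot be passed on to classification the way a bad proposal would be in a sequential pipeline.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def parallel_heads(features, w_cls, w_mask):
    """Hypothetical parallel classification and mask branches.

    features: (T, D) array, one feature vector per video snippet.
    w_cls:    (D, K + 1) weights; K action classes plus background (assumed).
    w_mask:   (D, T) weights; each snippet predicts a mask over all T snippets (assumed).
    """
    # Classification branch: per-snippet class probabilities.
    class_probs = softmax(features @ w_cls, axis=-1)
    # Localization branch: per-snippet temporal mask in [0, 1].
    # It is computed from the same features, not from class_probs.
    masks = sigmoid(features @ w_mask)
    return class_probs, masks


# Toy inputs with random weights; sizes are arbitrary.
rng = np.random.default_rng(0)
T, D, K = 8, 16, 3
features = rng.normal(size=(T, D))
w_cls = rng.normal(size=(D, K + 1))
w_mask = rng.normal(size=(D, T))

class_probs, masks = parallel_heads(features, w_cls, w_mask)
print(class_probs.shape, masks.shape)
```

The interaction mechanisms mentioned in the findings are not modeled here, since the truncated abstract does not describe them.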
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
