STAT: Towards Generalizable Temporal Action Localization
Yangcen Liu, Ziyi Liu, Yuanhao Zhai, Wen Li, David Doermann, Junsong Yuan

TL;DR
This paper introduces STAT, a self-supervised teacher-student framework that enhances the generalizability of temporal action localization methods, especially across different data distributions and action scales.
Contribution
The paper proposes STAT, a framework with a refinement module and an alignment module, to improve cross-distribution performance in weakly-supervised temporal action localization (WTAL).
Findings
Significant performance improvements on THUMOS14, ActivityNet1.2, and HACS datasets.
Approaching same-distribution performance in cross-distribution settings.
Effective scale adaptation through iterative refinement.
Abstract
Weakly-supervised temporal action localization (WTAL) aims to recognize and localize action instances with only video-level labels. Despite significant progress, existing methods suffer severe performance degradation when transferred to different distributions and thus may struggle to adapt to real-world scenarios. To address this problem, we propose the Generalizable Temporal Action Localization task (GTAL), which focuses on improving the generalizability of action localization methods. We observe that the performance decline can be primarily attributed to the lack of generalizability to different action scales. To address this, we propose STAT (Self-supervised Temporal Adaptive Teacher), which leverages a teacher-student structure for iterative refinement. STAT features a refinement module and an alignment module. The former iteratively refines the model's output by…
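To make the teacher-student idea concrete, here is a minimal sketch of two generic building blocks that such pipelines often use. It is not STAT's actual method: the abstract does not say how the teacher is updated or how segments are extracted, so the exponential-moving-average rule, the thresholding step, and the names `ema_update`, `scores_to_segments`, `momentum`, and `threshold` are all assumptions made for illustration.

```python
from typing import List, Tuple


def ema_update(teacher: List[float], student: List[float],
               momentum: float = 0.9) -> List[float]:
    """Move teacher weights toward the student weights.

    Assumption: an exponential moving average, as in mean-teacher style
    training. The abstract does not state STAT's update rule.
    """
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]


def scores_to_segments(scores: List[float],
                       threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive snippets scoring >= threshold into segments.

    Returns (start, end) snippet indices with end exclusive. This is a
    hypothetical way to turn teacher scores into pseudo-label segments.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new segment opens here
        elif score < threshold and start is not None:
            segments.append((start, i))  # the open segment closes
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # segment runs to the end
    return segments
```

In an iterative scheme of this kind, the teacher's per-snippet scores would be converted to segments, the student would be trained on them, and the teacher would then be refreshed from the student before the next round.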
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Analysis and Summarization
