Cross-Video Contextual Knowledge Exploration and Exploitation for Ambiguity Reduction in Weakly Supervised Temporal Action Localization
Songchun Zhang and Chunhui Zhao

TL;DR
This paper introduces a novel framework for weakly supervised temporal action localization that leverages cross-video contextual knowledge to improve action understanding and reduce ambiguity, outperforming existing methods.
Contribution
It proposes an end-to-end framework with RMGCL and GKSA modules to exploit dataset-level semantic structures from weak labels, enhancing localization accuracy.
Findings
Outperforms state-of-the-art methods on the THUMOS14, ActivityNet1.3, and FineAction datasets.
Effectively reduces ambiguity in classification and localization.
Can be integrated with other WSTAL methods.
Abstract
Weakly supervised temporal action localization (WSTAL) aims to localize actions in untrimmed videos using only video-level labels. Despite recent advances, existing approaches mainly follow a localization-by-classification pipeline and generally process each segment individually, thereby exploiting only limited contextual information. As a result, the model lacks a comprehensive understanding (e.g., of appearance and temporal structure) of various action patterns, leading to ambiguity in classification learning and temporal localization. Our work addresses this from a novel perspective, by exploring and exploiting the cross-video contextual knowledge within the dataset to recover the dataset-level semantic structure of action instances via weak labels only, thereby indirectly improving the holistic understanding of fine-grained action patterns and alleviating the aforementioned ambiguities.…
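As a rough illustration of the localization-by-classification pipeline the abstract refers to, the sketch below scores each segment independently, pools the scores into a video-level prediction (the only level at which labels exist), and thresholds the same per-segment scores to obtain temporal proposals. This is a generic baseline sketch, not the paper's method: the function names `video_score` and `localize`, the top-k mean pooling, the fixed threshold, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple


def video_score(cas: List[float], k: int) -> float:
    """Top-k mean pooling: aggregate per-segment scores for one class
    into a single video-level score (assumes len(cas) >= 1 and k >= 1)."""
    top = sorted(cas, reverse=True)[:k]
    return sum(top) / len(top)


def localize(cas: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive segments scoring above `threshold` into
    half-open [start, end) proposals, indexed by segment."""
    proposals: List[Tuple[int, int]] = []
    start = None
    for t, score in enumerate(cas):
        if score > threshold and start is None:
            start = t                      # a proposal opens here
        elif score <= threshold and start is not None:
            proposals.append((start, t))   # the proposal closes before t
            start = None
    if start is not None:                  # proposal runs to the last segment
        proposals.append((start, len(cas)))
    return proposals


# Hypothetical per-segment scores for one action class in one video.
cas = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.95, 0.9, 0.2]
print(video_score(cas, k=3))
print(localize(cas, threshold=0.5))  # [(2, 4), (6, 9)]
```

Because each segment is scored on its own, a borderline segment (such as the one scoring 0.3 above) is accepted or rejected without reference to how similar actions look in other videos, which is the kind of ambiguity the abstract describes.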
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Stroke Rehabilitation and Recovery
Methods: Contrastive Learning
