Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization
Pilhyeon Lee, Hyeran Byun

TL;DR
This paper introduces a point-supervised framework for temporal action localization, trained with only a single labeled frame per action instance. It generates dense pseudo-labels through a sequence search to improve the completeness of action predictions, and it outperforms existing weakly-supervised methods on four benchmarks.
Contribution
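The method selects pseudo background points to supplement the point-level action labels, uses both kinds of points as seeds to search for a sequence likely to contain complete action instances, and trains with two losses that contrast action instances with background ones in terms of action score and feature similarity.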
Findings
Significant performance improvements at high IoU thresholds.
Outperforms existing weakly-supervised methods on four benchmarks.
Achieves results comparable to fully-supervised methods at lower annotation cost.
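To make the "high IoU thresholds" finding concrete, the sketch below computes the standard temporal IoU between a predicted interval and a ground-truth interval. The function name and the example intervals are illustrative assumptions, not values from the paper. It shows why a fragmentary prediction can pass a loose threshold yet fail a strict one.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two intervals given as (start, end), e.g. in seconds."""
    # Overlap length; zero when the intervals are disjoint.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical intervals, chosen only for illustration.
ground_truth = (10.0, 20.0)
fragment = (12.0, 16.0)   # covers only part of the action
complete = (10.5, 19.5)   # covers nearly the whole action

for name, pred in (("fragment", fragment), ("complete", complete)):
    iou = temporal_iou(pred, ground_truth)
    hits = [thr for thr in (0.3, 0.5, 0.7) if iou >= thr]
    print(name, round(iou, 2), "counts as correct at thresholds", hits)
```

A prediction that covers only part of an instance is counted as correct at loose thresholds only, so gains concentrated at high thresholds are consistent with more complete predictions.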
Abstract
We tackle the problem of localizing temporal intervals of actions with only a single frame label for each action instance for training. Owing to label sparsity, existing work fails to learn action completeness, resulting in fragmentary action predictions. In this paper, we propose a novel framework, where dense pseudo-labels are generated to provide completeness guidance for the model. Concretely, we first select pseudo background points to supplement point-level action labels. Then, by taking the points as seeds, we search for the optimal sequence that is likely to contain complete action instances while agreeing with the seeds. To learn completeness from the obtained sequence, we introduce two novel losses that contrast action instances with background ones in terms of action score and feature similarity, respectively. Experimental results demonstrate that our completeness guidance…
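The idea of turning sparse seed points into dense pseudo-labels that agree with the seeds can be illustrated with a much simpler procedure than the one in the paper. The sketch below is not the authors' optimal sequence search. It is a threshold-based region-growing stand-in, and the function name, the `threshold` parameter, and the example scores are all assumptions made for illustration.

```python
def grow_from_seeds(scores, action_points, background_points, threshold=0.5):
    """Expand each action seed into an inclusive frame segment (start, end).

    Simplified stand-in (an assumption, not the paper's algorithm): a segment
    grows outward from its seed until it meets a frame scoring below
    `threshold`, a background seed, a previous segment, or the sequence boundary.
    """
    blocked = set(background_points)
    segments = []
    for p in sorted(action_points):
        if segments and p <= segments[-1][1]:
            # Seed already lies inside the previous segment; this sketch
            # merges the two instead of splitting them.
            continue
        # Never grow leftward into a segment that was already emitted.
        lo = segments[-1][1] + 1 if segments else 0
        start = end = p
        while start - 1 >= lo and (start - 1) not in blocked and scores[start - 1] >= threshold:
            start -= 1
        while end + 1 < len(scores) and (end + 1) not in blocked and scores[end + 1] >= threshold:
            end += 1
        segments.append((start, end))
    return segments


# Hypothetical frame-level action scores with two action seeds and one background seed.
scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.6, 0.3, 0.8, 0.9, 0.2]
print(grow_from_seeds(scores, action_points=[3, 8], background_points=[5]))
# prints [(2, 4), (7, 8)]
```

In this example the background seed at frame 5 stops the first segment even though that frame's score is above the threshold, which is the sense in which the resulting dense labels agree with the seeds.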
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Context-Aware Activity Recognition Systems
