Boundary-sensitive Pre-training for Temporal Localization in Videos
Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang

TL;DR
This paper introduces a boundary-sensitive pre-training method for temporal video localization that synthesizes boundary annotations from existing datasets, significantly improving transferability and achieving state-of-the-art results.
Contribution
The paper proposes a novel boundary-sensitive pretext task that synthesizes temporal boundaries for pre-training, enhancing transferability to localization tasks without needing manual boundary annotations.
Findings
BSP outperforms existing pre-training methods.
BSP achieves state-of-the-art results on multiple datasets.
Synthesizing boundaries improves temporal localization accuracy.
Abstract
Many video analysis tasks require temporal localization and thus the detection of content changes. However, most existing models developed for these tasks are pre-trained on general video action classification tasks. This is because large-scale annotation of temporal boundaries in untrimmed videos is expensive, so no suitable datasets exist for temporal boundary-sensitive pre-training. In this paper, for the first time, we investigate model pre-training for temporal localization by introducing a novel boundary-sensitive pretext (BSP) task. Instead of relying on costly manual annotations of temporal boundaries, we propose to synthesize temporal boundaries in existing video action classification datasets. With the synthesized boundaries, BSP can be conducted simply by classifying the boundary types. This enables the learning of video representations that are much more transferable to…
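The idea of synthesizing boundaries from a classification dataset can be illustrated with a small sketch. It is not the authors' implementation: the three boundary types, the function name `synthesize_boundary`, and the representation of a video as a list of frame identifiers are all assumptions made for illustration, since the abstract does not specify which boundary types BSP uses or how clips are spliced.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical boundary-type labels; the abstract does not enumerate the
# actual types, so these three are assumptions for illustration only.
NO_BOUNDARY = 0      # frames taken contiguously from a single video
DIFFERENT_CLASS = 1  # splice of two videos with different action labels
SAME_CLASS = 2       # splice of two distinct videos sharing an action label


def synthesize_boundary(
    video_a: List[str],
    label_a: str,
    video_b: List[str],
    label_b: str,
    length: int,
    rng: random.Random,
) -> Tuple[List[str], Optional[int], int]:
    """Build one pre-training sample of `length` frames.

    Returns (frames, boundary_index, boundary_type). `boundary_index` is the
    position of the first frame taken from `video_b`, or None when the sample
    contains no synthetic boundary.
    """
    if length < 2:
        raise ValueError("length must be at least 2 to place a boundary")
    if min(len(video_a), len(video_b)) < length:
        raise ValueError("both videos must contain at least `length` frames")

    # Passing the same video twice yields a continuous, boundary-free sample.
    if video_a is video_b:
        return video_a[:length], None, NO_BOUNDARY

    # Pick a cut strictly inside the sample so both sides are non-empty.
    cut = rng.randint(1, length - 1)
    frames = video_a[:cut] + video_b[cut:length]
    boundary_type = SAME_CLASS if label_a == label_b else DIFFERENT_CLASS
    return frames, cut, boundary_type


if __name__ == "__main__":
    rng = random.Random(0)
    run = [f"run_{i}" for i in range(16)]
    swim = [f"swim_{i}" for i in range(16)]
    frames, cut, btype = synthesize_boundary(run, "run", swim, "swim", 8, rng)
    print(frames, cut, btype)
```

In a sketch like this, the returned `boundary_type` would serve as the classification target during pre-training, and the boundary index comes for free because the splice point is chosen by the code rather than annotated by hand.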
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
