Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity
Hanwen Liang, Niamul Quader, Zhixiang Chi, Lizhe Chen, Peng Dai, Juwei Lu, Yang Wang

TL;DR
This paper introduces a novel self-supervised learning method called CPNet that leverages video continuity to improve video representation learning, outperforming previous methods on various downstream tasks.
Contribution
It proposes three new continuity-based pretext tasks (continuity justification, discontinuity localization, and missing section approximation) and demonstrates their effectiveness in enhancing video representations beyond existing approaches.
Findings
CPNet outperforms prior methods on action recognition, video retrieval, and action localization.
Combining the continuity tasks with pretext tasks built on other video properties (e.g. speed, temporal order) further improves performance.
Learned representations capture local and long-range motion and context.
Abstract
Recent self-supervised video representation learning methods have found significant success by exploring essential properties of videos, e.g. speed, temporal order, etc. This work exploits an essential yet under-explored property of videos, the video continuity, to obtain supervision signals for self-supervised representation learning. Specifically, we formulate three novel continuity-related pretext tasks, i.e. continuity justification, discontinuity localization, and missing section approximation, that jointly supervise a shared backbone for video representation learning. This self-supervision approach, termed as Continuity Perception Network (CPNet), solves the three tasks altogether and encourages the backbone network to learn local and long-ranged motion and context representations. It outperforms prior arts on multiple downstream tasks, such as action recognition, video retrieval,…
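As a rough illustration of how one video can yield supervision for all three pretext tasks, the sketch below builds a clip by cutting a section out of a longer frame range. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sample_clip`, the parameters `clip_len` and `gap_len`, and the label layout are all hypothetical, and the abstract does not specify how CPNet actually samples clips.

```python
import random


def sample_clip(num_frames, clip_len=16, gap_len=8, discontinuous=True, rng=None):
    """Pick frame indices for one training clip and derive pretext-task targets.

    Hypothetical helper: names and default values are illustrative assumptions.
    """
    rng = rng or random.Random()

    if not discontinuous:
        if num_frames < clip_len:
            raise ValueError("video is shorter than the clip length")
        start = rng.randrange(num_frames - clip_len + 1)
        return {
            "frames": list(range(start, start + clip_len)),
            "is_continuous": 1,   # target for continuity justification
            "break_at": None,     # no discontinuity to localize
            "missing": [],        # nothing to approximate
        }

    span = clip_len + gap_len     # frames covered before the section is removed
    if num_frames < span:
        raise ValueError("video is shorter than clip_len + gap_len")
    start = rng.randrange(num_frames - span + 1)

    # Number of frames kept before the cut; 1..clip_len-1 so both sides are non-empty.
    break_at = rng.randrange(1, clip_len)
    cut_start = start + break_at

    head = list(range(start, cut_start))
    missing = list(range(cut_start, cut_start + gap_len))
    tail = list(range(cut_start + gap_len, start + span))

    return {
        "frames": head + tail,    # clip_len frames with one temporal jump
        "is_continuous": 0,       # target for continuity justification
        "break_at": break_at,     # target for discontinuity localization
        "missing": missing,       # frames whose features would be approximated
    }


sample = sample_clip(num_frames=100, rng=random.Random(0))
print(sample["break_at"], sample["frames"])
```

In this sketch a single sampled clip carries a binary continuity label, the position of the cut, and the indices of the removed section, which is one plausible way for three task heads to share a backbone and a data loader.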
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
