Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang, Jianbo Jiao, and Yun-Hui Liu

TL;DR
This paper introduces a self-supervised method for learning video representations by predicting video pace, motivated by human sensitivity to motion speed. Contrastive learning is added to strengthen the learned features, and the authors report state-of-the-art results.
Contribution
It proposes video pace prediction as a new self-supervised task and combines it with contrastive learning, improving video representation quality across multiple architectures and benchmarks.
Findings
Achieves state-of-the-art performance in action recognition.
Effective across different network architectures.
Improves video retrieval accuracy.
Abstract
This paper addresses the problem of self-supervised video representation learning from a new perspective: video pace prediction. It stems from the observation that the human visual system is sensitive to video pace, e.g., slow motion, a widely used technique in film making. Specifically, given a video played at its natural pace, we randomly sample training clips at different paces and ask a neural network to identify the pace of each video clip. The assumption here is that the network can only succeed in such a pace reasoning task when it understands the underlying video content and learns representative spatio-temporal features. In addition, we further introduce contrastive learning to push the model towards discriminating different paces by maximizing the agreement on similar video content. To validate the effectiveness of the proposed method, we conduct extensive experiments on action…
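The clip-sampling step described in the abstract can be sketched roughly as below. This is an illustration only, not the authors' implementation: the function name `sample_pace_clip`, the candidate paces `(1, 2, 4, 8)`, and the 16-frame clip length are assumptions, and a pace is modelled simply as a frame-skip interval, so slow motion is not covered.

```python
import random


def sample_pace_clip(num_frames, clip_len=16, paces=(1, 2, 4, 8), rng=random):
    """Pick a random pace and return (frame_indices, pace_label).

    Hypothetical helper: a pace p means "take every p-th frame", and the
    index of p in `paces` is the class label the network would predict.
    """
    # Keep only the paces for which a full clip fits inside the video.
    valid = [i for i, p in enumerate(paces)
             if (clip_len - 1) * p + 1 <= num_frames]
    if not valid:
        raise ValueError("video too short for any candidate pace")

    label = rng.choice(valid)
    pace = paces[label]

    # Number of source frames the clip spans at this pace.
    span = (clip_len - 1) * pace + 1
    start = rng.randint(0, num_frames - span)  # both ends inclusive

    indices = [start + k * pace for k in range(clip_len)]
    return indices, label
```

In a training loop, the frames at `indices` would be stacked into a clip and fed to the network, with `label` as the target of an ordinary classification loss. The contrastive term mentioned in the abstract is not shown here.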
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Advanced Vision and Imaging
Methods: Contrastive Learning
