Self-supervised Video Representation Learning by Pace Prediction

Jiangliu Wang, Jianbo Jiao, and Yun-Hui Liu

arXiv:2008.05861 · cs.CV · September 7, 2020 · 61 cites

TL;DR

This paper introduces a self-supervised method for learning video representations by predicting video pace, motivated by human sensitivity to motion speed, and adds contrastive learning to strengthen the learned features. It reports state-of-the-art results in action recognition and improved video retrieval accuracy.

Contribution

It proposes a new self-supervised task of video pace prediction combined with contrastive learning, improving video representation quality across multiple architectures and benchmarks.

Findings

01

Achieves state-of-the-art performance in action recognition.

02

Effective across different network architectures.

03

Improves video retrieval accuracy.

Abstract

This paper addresses the problem of self-supervised video representation learning from a new perspective: video pace prediction. It stems from the observation that the human visual system is sensitive to video pace, e.g., slow motion, a widely used technique in filmmaking. Specifically, given a video played at its natural pace, we randomly sample training clips at different paces and ask a neural network to identify the pace of each clip. The assumption here is that the network can only succeed in such a pace reasoning task when it understands the underlying video content and learns representative spatio-temporal features. In addition, we introduce contrastive learning to push the model towards discriminating different paces by maximizing the agreement on similar video content. To validate the effectiveness of the proposed method, we conduct extensive experiments on action…
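
The clip sampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_pace_clip`, the clip length of 16 frames, and the pace set (1, 2, 4, 8) are hypothetical choices made for illustration, and the sketch models only natural and faster paces through frame skipping (slow motion is not covered). A pace of p is taken to mean that every p-th frame of the source video is kept, and the index of the chosen pace serves as the classification label.

```python
import numpy as np

def sample_pace_clip(num_frames, clip_len=16, paces=(1, 2, 4, 8), rng=None):
    """Return (frame_indices, pace_label) for one training clip.

    Hypothetical helper: the name, the default clip length, and the pace
    set are assumptions for illustration, not values taken from the paper.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Pick a pace at random; its position in `paces` is the class label
    # the network would be asked to predict.
    label = int(rng.integers(len(paces)))
    pace = paces[label]

    # Number of source frames the clip spans at this pace.
    span = (clip_len - 1) * pace + 1

    if span <= num_frames:
        # Random start such that the whole clip fits inside the video.
        start = int(rng.integers(num_frames - span + 1))
        indices = start + pace * np.arange(clip_len)
    else:
        # Assumption: if the video is too short, wrap around to its start.
        indices = (pace * np.arange(clip_len)) % num_frames

    return indices, label


# Example: draw one clip from a 300-frame video.
indices, label = sample_pace_clip(300, rng=np.random.default_rng(0))
print("pace label:", label, "first frames:", indices[:4])
```

The returned indices would be used to gather frames from a decoded video, and the label would feed a standard cross-entropy classification loss. The contrastive term mentioned in the abstract is not covered by this sketch.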

Code & Models

Repositories

laura-wang/video-pace
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Advanced Vision and Imaging

Methods: Contrastive Learning