Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning

Xinyu Yang; Majid Mirmehdi; Tilo Burghardt

arXiv:2010.07217 · cs.CV · October 26, 2021 · 5 cites

TL;DR

This paper introduces Cycle Encoding Prediction (CEP), a self-supervised learning method that encodes high-level spatio-temporal structures in videos by predicting temporal cycles, improving action recognition performance.

Contribution

The paper proposes a novel self-supervised learning approach, CEP, that captures temporal cycle structures in videos for better feature representation in action recognition.

Findings

1. CEP significantly improves action recognition accuracy on the UCF101 and HMDB51 datasets.

2. Encoding high-level temporal cycles effectively enhances video feature learning.

3. Ablation studies confirm the importance of both the cycle closure and the contrastive loss components.

Abstract

In this paper we show that learning video feature spaces in which temporal cycles are maximally predictable benefits action classification. In particular, we propose a novel learning approach termed Cycle Encoding Prediction (CEP) that is able to effectively represent high-level spatio-temporal structure of unlabelled video content. CEP builds a latent space wherein the concept of closed forward-backward as well as backward-forward temporal loops is approximately preserved. As a self-supervision signal, CEP leverages the bi-directional temporal coherence of the video stream and applies loss functions that encourage both temporal cycle closure as well as contrastive feature separation. Architecturally, the underpinning network structure utilises a single feature encoder for all video snippets, adding two predictive modules that learn temporal forward and backward transitions. We apply…
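The two training signals named in the abstract, cycle closure and contrastive separation, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the loss formulas, so the squared-error closure term, the InfoNCE-style contrastive term, the temperature value, and every function name below are assumptions chosen for illustration. The learned forward and backward predictive modules are stood in for by fixed 2-D rotations that happen to be exact inverses.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def cycle_closure_loss(z, forward, backward):
    """Assumed closure term: squared error between z and both round trips."""
    fb = backward(forward(z))  # forward-backward loop
    bf = forward(backward(z))  # backward-forward loop
    err_fb = sum((p - q) ** 2 for p, q in zip(fb, z))
    err_bf = sum((p - q) ** 2 for p, q in zip(bf, z))
    return err_fb + err_bf


def info_nce(query, positive, negatives, temperature=0.1):
    """Assumed contrastive term: InfoNCE over cosine similarities."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Hypothetical stand-ins for the learned predictive modules:
# a +90 degree rotation (forward in time) and its inverse (backward in time).
FORWARD = [[0.0, -1.0], [1.0, 0.0]]
BACKWARD = [[0.0, 1.0], [-1.0, 0.0]]


def forward(z):
    return matvec(FORWARD, z)


def backward(z):
    return matvec(BACKWARD, z)


if __name__ == "__main__":
    z_now = [1.0, 2.0]    # encoding of the current snippet (made up)
    z_next = [-2.0, 1.0]  # encoding of the true next snippet (made up)
    distractors = [[2.0, -1.0], [1.0, 2.0]]  # encodings of other snippets

    print("closure loss:", cycle_closure_loss(z_now, forward, backward))
    print("contrastive loss:", info_nce(forward(z_now), z_next, distractors))
```

In this sketch the closure term is zero whenever the two predictors invert each other, while the contrastive term shrinks as the predicted encoding moves closer to the true next snippet than to the distractors. In an actual training setup both predictors and the shared encoder would be learned networks rather than fixed matrices.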

Code & Models

Repositories

youshyee/CEP
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Advanced Vision and Imaging