Self-supervised Video Representation Learning with Cascade Positive Retrieval

Cheng-En Wu; Farley Lai; Yu Hen Hu; Asim Kadav

arXiv:2201.07989·cs.CV·April 22, 2022

TL;DR

This paper introduces Cascade Positive Retrieval (CPR), a self-supervised learning method that mines positive video examples over a cascade of stages, using multiple views of the query, and improves both video retrieval and action recognition performance.

Contribution

The paper proposes CPR, a new multi-stage positive example mining approach for self-supervised video representation learning, enhancing retrieval accuracy and downstream task performance.

Findings

1. CPR reaches a median class mining recall of 83.3%, outperforming previous methods.

2. CPR raises the state-of-the-art R@1 in video retrieval to 56.7%.

3. CPR improves action recognition accuracy on the UCF101 and HMDB51 datasets.
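
For readers unfamiliar with the R@1 figure quoted above, the following is a minimal sketch of how recall at k is commonly computed in retrieval evaluations. It assumes the usual convention that a query counts as a hit when any of its k nearest gallery items shares its class label, with cosine similarity as the ranking score. The function names and toy data are hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, query_labels, gallery, gallery_labels, k=1):
    """Fraction of queries whose top-k gallery neighbours contain the query's class.

    Assumed convention (not confirmed by the source): one matching label
    among the k nearest gallery items is enough for a hit.
    """
    hits = 0
    for q, q_label in zip(queries, query_labels):
        # Rank gallery indices from most to least similar to the query.
        ranked = sorted(range(len(gallery)),
                        key=lambda i: cosine(q, gallery[i]),
                        reverse=True)
        if any(gallery_labels[i] == q_label for i in ranked[:k]):
            hits += 1
    return hits / len(queries)


# Toy data: three gallery clips and three query clips in a 2-D embedding space.
gallery = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
gallery_labels = ["a", "b", "b"]
queries = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
query_labels = ["a", "b", "a"]

r1 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=1)
r3 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=3)
print(f"R@1 = {r1:.3f}, R@3 = {r3:.3f}")
```

In this toy setup the third query's nearest neighbour belongs to a different class, so it only becomes a hit once k is large enough to include a same-class gallery item.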

Abstract

Self-supervised video representation learning has been shown to effectively improve downstream tasks such as video retrieval and action recognition. In this paper, we present the Cascade Positive Retrieval (CPR) that successively mines positive examples w.r.t. the query for contrastive learning in a cascade of stages. Specifically, CPR exploits multiple views of a query example in different modalities, where an alternative view may help find another positive example dissimilar in the query view. We explore the effects of possible CPR configurations in ablations including the number of mining stages, the top similar example selection ratio in each stage, and progressive training with an incremental number of the final Top-k selection. The overall mining quality is measured to reflect the recall across training set classes. CPR reaches a median class mining recall of 83.3%, outperforming…
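
The cascade described in the abstract can be illustrated with a small sketch. It assumes that each stage ranks the surviving candidates by cosine similarity to the query in one view, keeps a fixed fraction of them (the per-stage selection ratio), and that the final stage returns the Top-k survivors as mined positives. The function `cascade_mine`, its parameters, and the toy embeddings are hypothetical; the official implementation may differ in its similarity measure, stage order, and selection rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cascade_mine(query_views, bank_views, ratio=0.5, top_k=1):
    """Mine positives for one query through a cascade of stages.

    query_views: one query embedding per stage (one per view).
    bank_views:  per stage, a list of candidate embeddings in the same view;
                 index i refers to the same candidate in every view.
    ratio:       assumed fraction of candidates kept after each stage.
    top_k:       number of mined positives returned after the last stage.
    """
    candidates = list(range(len(bank_views[0])))
    for q, bank in zip(query_views, bank_views):
        # Rank the surviving candidates by similarity in this stage's view.
        ranked = sorted(candidates,
                        key=lambda i: cosine(q, bank[i]),
                        reverse=True)
        # Keep the top fraction, but never fewer than top_k candidates.
        keep = max(top_k, math.ceil(len(ranked) * ratio))
        candidates = ranked[:keep]
    return candidates[:top_k]


# Toy example with four candidates and two views (stage 1, then stage 2).
query_views = [[1.0, 0.0], [1.0, 0.0]]
bank_views = [
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]],  # view used in stage 1
    [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],  # view used in stage 2
]
positives = cascade_mine(query_views, bank_views, ratio=0.5, top_k=1)
print(positives)
```

Here candidate 0 is the closest match in the first view but is rejected in the second, so the cascade settles on candidate 1, which is similar to the query in both views. Raising `top_k` over the course of training would correspond to the progressive Top-k schedule mentioned in the abstract, under the same assumptions.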

Code & Models

Repositories

necla-ml/cpr (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Cancer-related molecular mechanisms research

Methods: Contrastive Learning