Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency

Haiping Wu; Xiaolong Wang

arXiv:2105.06463·cs.CV·May 14, 2021

Open Access

TL;DR

This paper introduces a contrastive learning approach that uses cross-video cycle-consistency to learn image representations without cross-video labels, improving performance on downstream tasks such as object tracking, classification, and action recognition.

Contribution

It proposes a new contrastive learning method that exploits cross-video relations through cycle-consistency, enhancing high-level semantic representation learning.
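
To make the idea of a cross-video cycle concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the soft nearest-neighbor formulation, the split into `forward_candidates` and `backward_negatives`, and the temperature value are all assumptions chosen for illustration. The sketch steps from a query frame to a soft neighbor among frames of other videos, then cycles back and measures how much probability mass lands on frames of the starting video.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(x - m) for x in values))


def soft_nearest_neighbor(query, candidates, temperature):
    """Softmax-weighted average of candidate embeddings (hypothetical helper)."""
    scores = [dot(query, c) / temperature for c in candidates]
    log_z = log_sum_exp(scores)
    weights = [math.exp(s - log_z) for s in scores]
    mixed = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]
    return normalize(mixed)


def cycle_back_loss(query, start_video_frames, forward_candidates,
                    backward_negatives, temperature=0.1):
    """Illustrative cycle-consistency loss (an assumption, not the paper's exact loss).

    Forward step: soft nearest neighbor of `query` among frames of other videos.
    Backward step: from that neighbor, softmax over a pool holding frames of the
    starting video plus negatives; the loss is the negative log of the
    probability mass that returns to the starting video.
    """
    neighbor = soft_nearest_neighbor(query, forward_candidates, temperature)
    pool = start_video_frames + backward_negatives
    scores = [dot(neighbor, c) / temperature for c in pool]
    start_scores = scores[:len(start_video_frames)]
    return log_sum_exp(scores) - log_sum_exp(start_scores)
```

Under these assumptions, the loss is near zero when the cycle returns to the starting video and grows when the cycle ends on frames of unrelated videos. No cross-video labels are needed: the only supervision is which video each frame came from.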

Findings

01. Significant improvements on object tracking, classification, and action recognition tasks.

02. Effective use of cross-video relations without human-annotated labels.

03. Outperforms state-of-the-art contrastive learning methods.

Abstract

Recent works have advanced the performance of self-supervised representation learning by a large margin. At the core of these methods is intra-image invariance learning: two different transformations of one image instance are treated as a positive sample pair, and various tasks are designed to learn invariant representations by comparing the pair. Analogously, for video data, representations of frames from the same video are trained to be closer than those of frames from other videos, i.e., intra-video invariance. However, cross-video relations have barely been explored for visual representation learning. Unlike intra-video invariance, ground-truth labels for cross-video relations are usually unavailable without human labor. In this paper, we propose a novel contrastive learning method that explores cross-video relations by using cycle-consistency for general image representation learning.…
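
The intra-image and intra-video invariance described in the abstract is commonly trained with an InfoNCE-style contrastive loss. The sketch below, in plain Python, shows that general form only; the abstract does not state which loss the paper uses, so the function name `info_nce`, the cosine similarity, and the temperature value are assumptions for illustration. The anchor and positive would be two frames from the same video (or two augmentations of one image), and the negatives would be frames from other videos.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: -log softmax probability of the positive.

    The positive's similarity sits at index 0 of the logits; the remaining
    entries are similarities to the negatives.
    """
    logits = [cosine(anchor, positive)]
    logits += [cosine(anchor, n) for n in negatives]
    scaled = [value / temperature for value in logits]
    # Stable log of the softmax denominator.
    m = max(scaled)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_denominator - scaled[0]


# Usage: frames from the same video should score a lower loss than a
# mismatched pairing.
anchor = [1.0, 0.1]
same_video = [0.9, 0.2]
other_videos = [[0.0, 1.0], [-1.0, 0.0]]
matched_loss = info_nce(anchor, same_video, other_videos)
mismatched_loss = info_nce(anchor, other_videos[0], [same_video, other_videos[1]])
```

Minimizing this loss pulls the anchor toward its positive and pushes it away from the negatives, which is the sense in which frames from the same video are "trained to be closer" than frames from other videos.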

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition

Methods: Contrastive Learning