Nearest-Neighbor Inter-Intra Contrastive Learning from Unlabeled Videos

David Fan; Deyu Yang; Xinyu Li; Vimal Bhat; Rohith MV

arXiv:2303.07317 · cs.CV · March 14, 2023 · 1 citation

Open Access

TL;DR

This paper introduces Inter-Intra Video Contrastive Learning (IIVCL), a method that leverages nearest-neighbor videos as positives to enhance diversity and semantic scope in self-supervised video representation learning.

Contribution

The paper proposes a novel contrastive learning approach that incorporates nearest-neighbor videos as positives, extending beyond local clips and class boundaries for improved video representations.

Findings

01. Improved performance on various video tasks.

02. Enhanced positive key diversity.

03. More relaxed similarity notion extending beyond class boundaries.

Abstract

Contrastive learning has recently narrowed the gap between self-supervised and supervised methods in the image and video domains. State-of-the-art video contrastive learning methods such as CVRL and $\rho$-MoCo spatiotemporally augment two clips from the same video as positives. By only sampling positive clips locally from a single video, these methods neglect other semantically related videos that can also be useful. To address this limitation, we leverage nearest-neighbor videos from the global space as additional positive pairs, thus improving positive key diversity and introducing a more relaxed notion of similarity that extends beyond video and even class boundaries. Our method, Inter-Intra Video Contrastive Learning (IIVCL), improves performance on a range of video tasks.
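The idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a MoCo-style queue of embeddings from other videos, an InfoNCE loss, and a simple weighted sum of an intra-video term (two clips of the same video) and an inter-video term (the nearest neighbor of the key clip, retrieved from the queue, used as an extra positive). The function names, the `nn_weight` parameter, the temperature value, and the choice to mask the retrieved neighbor out of its own negatives are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Unit-normalize rows so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def info_nce(query, positive, negatives, temperature=0.1, exclude=None):
    # query, positive: (B, D); negatives: (K, D); all assumed L2-normalized.
    pos = np.sum(query * positive, axis=1, keepdims=True)   # (B, 1)
    neg = query @ negatives.T                                # (B, K)
    if exclude is not None:
        # Drop one negative per row (here: the entry reused as a positive).
        neg[np.arange(len(query)), exclude] = -np.inf
    logits = np.concatenate([pos, neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())

def nearest_neighbor(keys, queue):
    # For each key, pick the most similar queue embedding (cosine similarity).
    idx = np.argmax(keys @ queue.T, axis=1)
    return queue[idx], idx

def inter_intra_loss(q, k, queue, temperature=0.1, nn_weight=1.0):
    # q, k: embeddings of two augmented clips from the same videos, (B, D).
    # queue: embeddings of clips from other videos, (K, D).
    q, k, queue = l2_normalize(q), l2_normalize(k), l2_normalize(queue)
    # Intra-video term: the second clip of the same video is the positive.
    intra = info_nce(q, k, queue, temperature)
    # Inter-video term: the key's nearest neighbor in the queue is the positive.
    nn_k, nn_idx = nearest_neighbor(k, queue)
    inter = info_nce(q, nn_k, queue, temperature, exclude=nn_idx)
    return intra + nn_weight * inter
```

Calling `inter_intra_loss(q, k, queue)` with a batch of clip embeddings and a queue returns a single scalar. Setting `nn_weight=0.0` reduces the sketch to a purely intra-video contrastive loss of the kind the abstract attributes to prior methods.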

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research

Methods: Dense Connections · Temporally Consistent Spatial Augmentation · Contrastive Learning · 3D Convolution · Contrastive Video Representation Learning