Transitive Invariance for Self-supervised Visual Representation Learning
Xiaolong Wang, Kaiming He, Abhinav Gupta

TL;DR
This paper introduces a novel self-supervised learning approach that leverages a graph of objects and their invariances to learn more robust visual representations, improving performance on recognition tasks without relying on labeled data.
Contribution
It proposes organizing data with multiple invariances into a graph and applying transitivity to enhance self-supervised visual representation learning.
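To make the transitivity idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the graph is given as two edge lists: inter-instance edges link different objects that should share features, and intra-instance edges link two views of the same object. The function name `transitive_pairs` and the node labels are hypothetical and chosen only for illustration.

```python
from itertools import product


def transitive_pairs(inter_edges, intra_edges):
    """Derive extra positive pairs by chaining the two edge types.

    inter_edges: iterable of (a, b) pairs linking different instances.
    intra_edges: iterable of (a, a2) pairs linking views of one instance.
    Returns a set of frozensets, one per newly derived pair.
    """
    # Map every node to itself plus its directly linked views.
    views = {}
    for u, v in intra_edges:
        views.setdefault(u, {u}).add(v)
        views.setdefault(v, {v}).add(u)

    derived = set()
    for a, b in inter_edges:
        # Any view of a can be paired with any view of b.
        for x, y in product(views.get(a, {a}), views.get(b, {b})):
            if (x, y) != (a, b):  # skip the edge we already had
                derived.add(frozenset((x, y)))
    return derived


# One inter-instance edge A-B and one extra view per object.
pairs = transitive_pairs([("A", "B")], [("A", "A2"), ("B", "B2")])
```

In this toy case the single edge A-B yields three derived pairs (A with B2, A2 with B, A2 with B2), which is the kind of additional training signal that reasoning over the graph is meant to provide.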
Findings
Achieves 63.2% mAP on PASCAL VOC 2007 object detection with Fast R-CNN
Reaches 23.5% mAP on the COCO dataset, close to the performance of supervised pre-training
Outperforms an ImageNet pre-trained network on surface normal estimation
Abstract
Learning visual representations with self-supervised learning has become popular in computer vision. The idea is to design auxiliary tasks where labels are free to obtain. Most of these tasks end up providing data to learn specific kinds of invariance useful for recognition. In this paper, we propose to exploit different self-supervised approaches to learn representations invariant to (i) inter-instance variations (two objects in the same class should have similar features) and (ii) intra-instance variations (viewpoint, pose, deformations, illumination, etc). Instead of combining two approaches with multi-task learning, we argue to organize and reason the data with multiple variations. Specifically, we propose to generate a graph with millions of objects mined from hundreds of thousands of videos. The objects are connected by two types of edges which correspond to two types of…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
Methods: Region Proposal Network · Softmax · Convolution · RoIPool · Faster R-CNN
