Enhancing 2D Representation Learning with a 3D Prior
Mehmet Aygün, Prithviraj Dhar, Zhicheng Yan, Oisin Mac Aodha, Rakesh Ranjan

TL;DR
This paper introduces a method to improve 2D self-supervised visual representations by incorporating a 3D structural prior, resulting in more robust models that better mimic human shape perception.
Contribution
It presents a novel approach that explicitly enforces a 3D prior during training to enhance the robustness of self-supervised 2D representations.
Findings
3D-aware representations outperform baseline models
Enhanced robustness across multiple datasets
Explicit 3D prior improves shape-based visual processing
Abstract
Learning robust and effective representations of visual data is a fundamental task in computer vision. Traditionally, this is achieved by training models with labeled data, which can be expensive to obtain. Self-supervised learning attempts to circumvent the requirement for labeled data by learning representations from raw unlabeled visual data alone. However, unlike humans, who obtain rich 3D information from their binocular vision and through motion, the majority of current self-supervised methods are tasked with learning from monocular 2D image collections. This is noteworthy as it has been demonstrated that shape-centric visual processing is more robust compared to texture-biased automated methods. Inspired by this, we propose a new approach for strengthening existing self-supervised methods by explicitly enforcing a strong 3D structural prior directly into the model during training.…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition
Methods: Attentive Walk-Aggregating Graph Neural Network
