Zero in on Shape: A Generic 2D-3D Instance Similarity Metric learned from Synthetic Data
Maciej Janik, Niklas Gard, Anna Hilsmann, Peter Eisert

TL;DR
This paper introduces a neural network that compares 2D images and 3D models based on shape similarity, trained solely on synthetic data, and capable of zero-shot shape retrieval.
Contribution
It proposes a view-based shape descriptor and siamese network architecture for zero-shot 2D-3D shape comparison using synthetic data, addressing dataset scarcity and domain gap issues.
Findings
Synthetic data variety improves retrieval accuracy.
Zero-shot performance can match that of the instance-aware mode when narrowing the search down to the top 10% of objects.
Training on synthetic data alone yields insights into bridging the domain gap to real photographs.
Abstract
We present a network architecture which compares RGB images and untextured 3D models by the similarity of the represented shape. Our system is optimised for zero-shot retrieval, meaning it can recognise shapes never shown in training. We use a view-based shape descriptor and a siamese network to learn object geometry from pairs of 3D models and 2D images. Due to scarcity of datasets with exact photograph-mesh correspondences, we train our network with only synthetic data. Our experiments investigate the effect of different qualities and quantities of training data on retrieval accuracy and present insights from bridging the domain gap. We show that increasing the variety of synthetic data improves retrieval accuracy and that our system's performance in zero-shot mode can match that of the instance-aware mode, as far as narrowing down the search to the top 10% of objects.
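The retrieval criterion mentioned in the abstract (whether the correct 3D model lands in the top 10% of ranked candidates) can be illustrated with a minimal sketch. It assumes the image and each 3D model have already been mapped to fixed-length embedding vectors; the cosine similarity measure, the function names, and the toy vectors below are all hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_models(query_embedding, model_embeddings):
    """Return model ids sorted from most to least similar to the query."""
    scores = {
        model_id: cosine_similarity(query_embedding, emb)
        for model_id, emb in model_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

def in_top_fraction(ranking, true_id, fraction=0.10):
    """Check whether the true model is within the top `fraction` of the ranking."""
    # Always keep at least one candidate, even for very small galleries.
    cutoff = max(1, math.ceil(len(ranking) * fraction))
    return true_id in ranking[:cutoff]

# Toy gallery of hypothetical 3D-model embeddings.
gallery = {
    "chair": [0.9, 0.1, 0.0],
    "table": [0.1, 0.9, 0.1],
    "lamp": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a photograph showing a chair.
query = [0.8, 0.2, 0.1]

ranking = rank_models(query, gallery)
print(ranking)
print(in_top_fraction(ranking, "chair"))
```

In a zero-shot setting the gallery would contain models never seen in training; the ranking and cutoff logic stays the same, only the embeddings come from unseen shapes.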
Taxonomy
Methods: Siamese Network
