Lookalike3D: Seeing Double in 3D
Chandan Yeshwanth, Angela Dai

TL;DR
Lookalike3D introduces a new task and model for detecting and classifying repeated and similar objects in indoor scenes using multiview images, enhancing 3D understanding and reconstruction.
Contribution
The paper proposes Lookalike3D, a multiview transformer model, and introduces the 3DTwins dataset for lookalike object detection in indoor scenes.
Findings
104% IoU improvement over baselines on the 3DTwins dataset
Effective distinction of identical, similar, and different object pairs
Enhanced downstream 3D reconstruction and segmentation
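To make the IoU figure concrete, here is a minimal sketch of how a per-class IoU over labeled object pairs could be computed, and how a relative improvement is derived from two scores. The summary does not spell out the evaluation protocol, so the set-based IoU definition, the reading of "104%" as a relative gain, and the helper names `pair_iou` and `relative_improvement` are all assumptions. The toy labels and scores are made up for illustration and are not results from the paper.

```python
def pair_iou(pred, gt, label):
    """IoU between the predicted and ground-truth sets of pairs with `label`.

    Assumed definition: |pred ∩ gt| / |pred ∪ gt| over object pairs.
    """
    p = {pair for pair, lab in pred.items() if lab == label}
    g = {pair for pair, lab in gt.items() if lab == label}
    union = p | g
    # If neither side contains the label, treat the class as perfectly matched.
    return len(p & g) / len(union) if union else 1.0


def relative_improvement(new_score, base_score):
    """Relative gain of new_score over base_score, in percent."""
    return 100.0 * (new_score - base_score) / base_score


# Toy ground truth and predictions (illustrative only).
gt = {("a", "b"): "identical", ("a", "c"): "similar", ("b", "c"): "similar"}
pred = {("a", "b"): "identical", ("a", "c"): "similar", ("b", "c"): "different"}

iou_identical = pair_iou(pred, gt, "identical")
iou_similar = pair_iou(pred, gt, "similar")

# Under the relative reading, a 104% gain means slightly more than doubling
# the baseline score, e.g. a hypothetical 0.25 -> 0.51.
gain = relative_improvement(0.51, 0.25)
```

Under this reading, a 104% improvement corresponds to a score a little over twice the baseline rather than an IoU above 1.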
Abstract
3D object understanding and generation methods produce impressive results, yet they often overlook a pervasive source of information in real-world scenes: repeated objects. We introduce the task of lookalike object detection in indoor scenes, which leverages repeated and complementary cues from identical and near-identical object pairs. Given an input scene, the task is to classify pairs of objects as identical, similar or different using multiview images as input. To address this, we present Lookalike3D, a multiview image transformer that effectively distinguishes such object pairs by harnessing strong semantic priors from large image foundation models. To support this task, we collected the 3DTwins dataset, containing 76k manually annotated identical, similar and different pairs of objects based on ScanNet++, and show an improvement of 104% IoU over baselines. We demonstrate how our…
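The task format described in the abstract (label every pair of objects in a scene as identical, similar, or different) can be sketched as follows. This is not the Lookalike3D model: it is a hypothetical threshold baseline over per-object feature vectors, assuming such vectors have already been extracted from the multiview images. The function names, the thresholds `t_identical` and `t_similar`, and the toy features are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_pairs(features, t_identical=0.95, t_similar=0.80):
    """Label every unordered object pair in a scene.

    `features` maps an object id to its feature vector. The thresholds
    are hypothetical and would need tuning on annotated data.
    """
    labels = {}
    for i, j in combinations(sorted(features), 2):
        score = cosine(features[i], features[j])
        if score >= t_identical:
            labels[(i, j)] = "identical"
        elif score >= t_similar:
            labels[(i, j)] = "similar"
        else:
            labels[(i, j)] = "different"
    return labels


# Toy scene with made-up 3-dimensional features.
scene = {
    "chair_a": [1.0, 0.0, 0.0],
    "chair_b": [1.0, 0.0, 0.0],
    "stool": [0.9, 0.4, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}
pair_labels = classify_pairs(scene)
```

A learned model such as the one the paper proposes would replace the fixed thresholds with a classifier over multiview inputs; the sketch only shows the input and output shape of the task.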
Taxonomy
Topics: Advanced Neural Network Applications · 3D Shape Modeling and Analysis · Multimodal Machine Learning Applications
