Scan2CAD: Learning CAD Model Alignment in RGB-D Scans
Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X. Chang, Matthias Nießner

TL;DR
Scan2CAD introduces a data-driven approach that aligns CAD models to noisy RGB-D scans by learning keypoint correspondences with a novel 3D CNN, significantly improving alignment accuracy.
Contribution
The paper presents a new dataset, a novel 3D CNN architecture for joint embedding, and a variational energy minimization method for CAD alignment in RGB-D scans.
Findings
Outperforms previous methods by 21.39% on the Scan2CAD benchmark.
Creates a large annotated dataset with 97,607 keypoint pairs.
Demonstrates effective alignment of CAD models to real-world scans.
Abstract
We present Scan2CAD, a novel data-driven method that learns to align clean 3D CAD models from a shape database to the noisy and incomplete geometry of a commodity RGB-D scan. For a 3D reconstruction of an indoor scene, our method takes as input a set of CAD models, and predicts a 9DoF pose that aligns each model to the underlying scan geometry. To tackle this problem, we create a new scan-to-CAD alignment dataset based on 1506 ScanNet scans with 97607 annotated keypoint pairs between 14225 CAD models from ShapeNet and their counterpart objects in the scans. Our method selects a set of representative keypoints in a 3D scan for which we find correspondences to the CAD geometry. To this end, we design a novel 3D CNN architecture that learns a joint embedding between real and synthetic objects, and from this predicts a correspondence heatmap. Based on these correspondence heatmaps, we…
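To make the "9DoF pose" in the abstract concrete: it can be read as 3 translation, 3 rotation, and 3 per-axis scale parameters. The sketch below applies such a pose to a single CAD-space point. It is only an illustration, not the authors' implementation. The quaternion parameterization, the scale-then-rotate-then-translate order, and the function names `quaternion_to_rotation` and `apply_9dof` are all assumptions made for this example.

```python
import math

def quaternion_to_rotation(q_wxyz):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (nested lists)."""
    w, x, y, z = q_wxyz
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize to a unit quaternion
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def apply_9dof(point, translation, q_wxyz, scale):
    """Map a CAD-space point into scan space: p' = R * (s * p) + t.

    Assumed order (hypothetical, for illustration): per-axis scale first,
    then rotation, then translation.
    """
    scaled = [p * s for p, s in zip(point, scale)]        # 3 scale DoF
    R = quaternion_to_rotation(q_wxyz)                    # 3 rotation DoF
    rotated = [sum(R[i][j] * scaled[j] for j in range(3)) for i in range(3)]
    return [r + t for r, t in zip(rotated, translation)]  # 3 translation DoF

# Example: stretch x by 2, rotate 90 degrees about the z-axis, shift by +1 in x.
half = math.pi / 4
pose_q = (math.cos(half), 0.0, 0.0, math.sin(half))
aligned = apply_9dof([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], pose_q, [2.0, 1.0, 1.0])
```

Allowing a separate scale per axis is what distinguishes a 9DoF pose from a 6DoF rigid pose or a 7DoF similarity transform: a database model can be stretched to match a scanned object whose proportions differ from the CAD model's.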
Taxonomy
Topics: 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization
