ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
Can Gümeli, Angela Dai, Matthias Nießner

TL;DR
ROCA is an end-to-end method that retrieves 3D CAD models from a shape database and aligns them to a single RGB image, yielding a compact CAD-based 3D representation of the observed scene. Its accuracy gains come from dense 2D-3D correspondences and a differentiable alignment optimization.
Contribution
ROCA introduces a differentiable alignment optimization built on dense 2D-3D object correspondences and Procrustes alignment, and unifies CAD retrieval and alignment from a single RGB image in one end-to-end approach.
Findings
Significantly improves CAD alignment accuracy on real-world imagery from ScanNet.
Leverages dense 2D-3D correspondences for robust alignment.
Achieves state-of-the-art performance on ScanNet, raising retrieval-aware CAD alignment accuracy from 9.5% to 17.6%.
Abstract
We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image. This enables 3D perception of an observed scene from a 2D RGB observation, characterized as a lightweight, compact, clean CAD representation. Core to our approach is our differentiable alignment optimization based on dense 2D-3D object correspondences and Procrustes alignment. ROCA can thus provide a robust CAD alignment while simultaneously informing CAD retrieval by leveraging the 2D-3D correspondences to learn geometrically similar CAD models. Experiments on challenging, real-world imagery from ScanNet show that ROCA significantly improves on state of the art, from 9.5% to 17.6% in retrieval-aware CAD alignment accuracy.
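The abstract names Procrustes alignment over dense correspondences as the core of the alignment step. As a rough illustration only, the sketch below solves a generic weighted rigid Procrustes problem (the Kabsch solution via SVD) in NumPy. It is not the authors' implementation: the function name `weighted_procrustes`, the per-point weights, and the assumption that the 2D-3D correspondences have already been lifted to 3D-3D point pairs are all introduced here for illustration, and object scale is ignored.

```python
import numpy as np

def weighted_procrustes(src, dst, weights=None):
    """Hypothetical helper: find rotation R and translation t that
    minimize sum_i w_i * ||R @ src[i] + t - dst[i]||^2 (Kabsch / SVD)."""
    src = np.asarray(src, dtype=float)  # (N, 3) points in the CAD model frame
    dst = np.asarray(dst, dtype=float)  # (N, 3) matching points in the camera frame
    if weights is None:
        weights = np.ones(len(src))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                     # normalize weights to sum to 1

    # Weighted centroids and centered point sets
    src_mean = w @ src
    dst_mean = w @ dst
    src_c = src - src_mean
    dst_c = dst - dst_mean

    # 3x3 weighted cross-covariance and its SVD
    H = (src_c * w[:, None]).T @ dst_c
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t

# Usage: recover a known pose from noise-free synthetic correspondences
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -1.0, 2.0])
dst = src @ R_true.T + t_true
R, t = weighted_procrustes(src, dst)
```

Every step here is a closed-form matrix operation, so the same computation written in an automatic-differentiation framework could pass gradients back to whatever produces the correspondences and weights. That is one plausible reading of "differentiable alignment optimization", not a description of ROCA's actual pipeline.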
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization · Optical measurement and interference techniques
Methods: Triplet Loss · 3D Convolution · RoIAlign · Softmax · Convolution · Region Proposal Network · Mask R-CNN · Procrustes
