Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views
Francisco Massa, Bryan Russell, Mathieu Aubry

TL;DR
This paper introduces an end-to-end CNN for 2D-3D exemplar detection that adapts the features of natural images to align with those of CAD rendered views, bringing the accuracy and speed benefits of deep detection pipelines to this task.
Contribution
The paper presents a CNN approach that learns to adapt the features of natural images so that they align with those of CAD rendered views. The adaptation is learned from composites of rendered views of textured object models on natural images.
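As a rough illustration of how adapted features could be matched against rendered exemplars, here is a minimal sketch in plain Python. It is not the authors' implementation. The affine-plus-ReLU form of the adaptation, the cosine similarity score, and the names `adapt`, `cosine`, and `best_exemplar` are all assumptions made for this example. In practice the features would come from a CNN and the weights would be learned.

```python
import math


def adapt(feature, weights, bias):
    """Hypothetical adaptation: an affine map followed by a ReLU.

    `weights` is a list of rows and `bias` a list of floats; both stand in
    for parameters that would be learned (an assumption of this sketch).
    """
    return [
        max(0.0, sum(w * x for w, x in zip(row, feature)) + b)
        for row, b in zip(weights, bias)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def best_exemplar(real_feature, exemplar_features, weights, bias):
    """Return (index, score) of the rendered exemplar closest to the adapted feature."""
    adapted = adapt(real_feature, weights, bias)
    scores = [cosine(adapted, e) for e in exemplar_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

The natural-image feature is adapted once and then compared against every rendered-view feature, so the exemplar features can be precomputed.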
Findings
Evaluated instance detection on the IKEA dataset.
Outperformed Aubry et al. on "chair" category detection on a subset of the Pascal VOC dataset.
Demonstrated effective feature adaptation between real and rendered views.
Abstract
This paper presents an end-to-end convolutional neural network (CNN) for 2D-3D exemplar detection. We demonstrate that the ability to adapt the features of natural images to better align with those of CAD rendered views is critical to the success of our technique. We show that the adaptation can be learned by compositing rendered views of textured object models on natural images. Our approach can be naturally incorporated into a CNN detection pipeline and extends the accuracy and speed benefits from recent advances in deep learning to 2D-3D exemplar detection. We applied our method to two tasks: instance detection, where we evaluated on the IKEA dataset, and object category detection, where we out-perform Aubry et al. for "chair" detection on a subset of the Pascal VOC dataset.
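The compositing step mentioned in the abstract can be pictured with a small sketch. It assumes a rendered view comes with a per-pixel alpha mask and is pasted over a same-sized natural-image crop. The image representation (nested lists of RGB tuples), the function names, and the choice of the clean render as the training target are assumptions made for illustration, not details taken from the paper.

```python
import random


def composite(render_rgb, render_alpha, background_rgb):
    """Alpha-composite a rendered view over a same-sized background crop.

    Images are nested lists indexed [row][col] of (r, g, b) floats in [0, 1];
    `render_alpha` holds floats in [0, 1], where 1 marks an object pixel.
    """
    out = []
    for fg_row, alpha_row, bg_row in zip(render_rgb, render_alpha, background_rgb):
        out_row = []
        for fg, a, bg in zip(fg_row, alpha_row, bg_row):
            # Standard "over" blend, applied per channel.
            out_row.append(tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg)))
        out.append(out_row)
    return out


def make_training_pair(render_rgb, render_alpha, backgrounds, rng=random):
    """Build one (composite, clean render) pair with a randomly chosen background.

    Pairing the composite with the clean render is an assumption of this sketch:
    the idea is that features of the composite, once adapted, should match the
    features of the clean rendered view.
    """
    background = rng.choice(backgrounds)
    return composite(render_rgb, render_alpha, background), render_rgb
```

Because the same render can be pasted over many different backgrounds, a modest set of CAD views yields a large number of such pairs.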
