Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views
Hao Su, Charles R. Qi, Yangyan Li, Leonidas Guibas

TL;DR
This paper introduces a framework that combines rendered 3D model views with CNNs to improve object viewpoint estimation in images. It addresses the scarcity of viewpoint-annotated training data and the lack of powerful features, and it outperforms state-of-the-art methods on the PASCAL 3D+ benchmark.
Contribution
The paper presents a scalable, overfit-resistant image synthesis pipeline built on 3D models, together with a novel CNN tailored to viewpoint estimation.
Findings
Significantly outperforms state-of-the-art methods on the PASCAL 3D+ benchmark
Rendered images are effective training data for viewpoint-estimation CNNs
The proposed synthesis pipeline is designed to resist overfitting
Abstract
Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: the scarcity of training data with viewpoint annotations, and a lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework to address both issues by combining render-based image synthesis and CNNs. We believe that 3D models have the potential to generate a large number of images of high variation, which can be well exploited by deep CNNs with a high learning capacity. Towards this goal, we propose a scalable and overfit-resistant image synthesis pipeline, together with a novel CNN specifically tailored for the viewpoint estimation task. Experimentally, we show that the viewpoint estimation from our pipeline can significantly outperform state-of-the-art methods on the PASCAL 3D+ benchmark.
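The abstract does not spell out how viewpoints are sampled or scored, so the following is only an illustrative sketch, not the authors' implementation. It shows one plausible way to draw camera poses for rendering, turn an azimuth into a class label, and measure angular error with wrap-around. The function names, the angle ranges, and the bin count are all assumptions made for this example.

```python
import random


def sample_viewpoint(rng):
    """Draw one hypothetical camera pose, in degrees.

    The ranges below are assumed for illustration; the abstract
    does not state which distribution the pipeline samples from.
    """
    azimuth = rng.uniform(0.0, 360.0)      # rotation around the object
    elevation = rng.uniform(-10.0, 40.0)   # assumed range
    tilt = rng.uniform(-15.0, 15.0)        # assumed in-plane rotation range
    return azimuth, elevation, tilt


def azimuth_to_bin(azimuth_deg, num_bins=360):
    """Map an azimuth angle to a class index in [0, num_bins)."""
    width = 360.0 / num_bins
    return int((azimuth_deg % 360.0) // width) % num_bins


def circular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


if __name__ == "__main__":
    rng = random.Random(0)
    az, el, tilt = sample_viewpoint(rng)
    label = azimuth_to_bin(az, num_bins=8)
    print(f"azimuth={az:.1f} elevation={el:.1f} tilt={tilt:.1f} bin={label}")
    print("error between 350 and 10 degrees:", circular_distance(350.0, 10.0))
```

The wrap-around distance matters because azimuths of 350 and 10 degrees are close viewpoints even though their raw difference is large; a classifier or evaluation that ignores this would penalize near-correct predictions.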
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
