Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images

Haozhe Xie; Hongxun Yao; Shengping Zhang; Shangchen Zhou; Wenxiu Sun

arXiv:2006.12250·cs.CV·October 6, 2020

TL;DR

Pix2Vox++ is a multi-scale, context-aware framework for 3D object reconstruction from single or multiple images. It avoids two weaknesses of RNN-based fusion, namely order-dependent results and forgetting of early inputs, and reports higher accuracy and efficiency than prior methods.

Contribution

The paper proposes a novel encoder-decoder architecture with a multi-scale fusion module and refiner, improving 3D reconstruction consistency and quality over existing RNN-based methods.

Findings

01  Outperforms state-of-the-art methods on ShapeNet, Pix3D, and Things3D datasets.

02  Achieves higher accuracy and efficiency in 3D reconstruction.

03  Effectively handles single-view and multi-view inputs.

Abstract

Recovering the 3D shape of an object from single or multiple images with deep neural networks has been attracting increasing attention in the past few years. Mainstream works (e.g. 3D-R2N2) use recurrent neural networks (RNNs) to sequentially fuse feature maps of input images. However, RNN-based approaches are unable to produce consistent reconstruction results when given the same input images with different orders. Moreover, RNNs may forget important features from early input images due to long-term memory loss. To address these issues, we propose a novel framework for single-view and multi-view 3D object reconstruction, named Pix2Vox++. By using a well-designed encoder-decoder, it generates a coarse 3D volume from each input image. A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D…
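The fusion idea in the abstract, selecting the best reconstruction for each part across all coarse volumes, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the per-voxel softmax weighting, the function names `softmax` and `fuse_volumes`, the flattened four-voxel "volumes", and the hand-written scores (which stand in for whatever a learned scoring module would output). It is not the authors' implementation. The point it shows is that a weighted sum over views, unlike sequential RNN fusion, gives the same result for any input order.

```python
import math
from itertools import permutations


def softmax(values):
    """Numerically stable softmax over a short list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_volumes(volumes, scores):
    """Fuse per-view coarse volumes voxel by voxel (hypothetical helper).

    volumes[i][k]: occupancy probability of voxel k predicted from view i.
    scores[i][k]:  assumed quality score for that prediction.
    """
    n_voxels = len(volumes[0])
    fused = []
    for k in range(n_voxels):
        # Normalize the scores of all views at this voxel into weights.
        weights = softmax([s[k] for s in scores])
        # Weighted sum of the per-view predictions for this voxel.
        fused.append(sum(w * v[k] for w, v in zip(weights, volumes)))
    return fused


# Three views, four voxels each (a flattened toy "volume").
volumes = [
    [0.9, 0.1, 0.5, 0.2],
    [0.8, 0.3, 0.4, 0.9],
    [0.7, 0.2, 0.6, 0.1],
]
# Placeholder scores; a real system would predict these from the inputs.
scores = [
    [2.0, 0.0, 1.0, -1.0],
    [0.5, 1.0, 1.0, 3.0],
    [0.0, 0.5, 1.0, -1.0],
]

reference = fuse_volumes(volumes, scores)

# Reordering the views leaves the fused volume unchanged (up to rounding).
for order in permutations(range(len(volumes))):
    shuffled = fuse_volumes(
        [volumes[i] for i in order],
        [scores[i] for i in order],
    )
    assert all(
        math.isclose(a, b, abs_tol=1e-12) for a, b in zip(reference, shuffled)
    )
```

In this toy setup, a voxel where all views score equally is simply averaged, while a voxel where one view scores much higher is dominated by that view's prediction. With a single input view the softmax weight is 1, so the fused volume equals the coarse volume.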
