FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction
Zhenpei Yang, Zhile Ren, Miguel Angel Bautista, Zaiwei Zhang, Qi Shan, Qixing Huang

TL;DR
FvOR is a learning-based method that jointly refines 3D shape and camera pose from a few images with noisy input poses, achieving fast and accurate object reconstruction.
Contribution
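It introduces a fast and robust multi-view reconstruction algorithm that jointly refines 3D geometry and camera pose using learnable neural network modules, outperforming the benchmarked methods in accuracy on ShapeNet and running far faster than the optimization-based IDR.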
Findings
Achieves state-of-the-art accuracy on the ShapeNet benchmark.
Runs two orders of magnitude faster than IDR.
Effectively handles noisy camera pose inputs.
Abstract
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision. State-of-the-art approaches typically assume accurate camera poses as input, which could be difficult to obtain in realistic settings. In this paper, we present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses. The core of our approach is a fast and robust multi-view reconstruction algorithm to jointly refine 3D geometry and camera pose estimation using learnable neural network modules. We provide a thorough benchmark of state-of-the-art approaches for this problem on ShapeNet. Our approach achieves best-in-class results. It is also two orders of magnitude faster than the recent optimization-based approach IDR. Our code is released at https://github.com/zhenpeiyang/FvOR/.
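The idea of jointly refining geometry and pose can be illustrated with a toy alternating scheme. The sketch below is not the FvOR algorithm: it assumes known point correspondences, translation-only poses, and plain averaging in place of learned modules and image inputs. The function name `refine_shape_and_pose` and all parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def refine_shape_and_pose(obs, t_init, n_iters=100):
    """Toy alternating refinement (an assumption-laden sketch, not FvOR).

    obs:    (V, N, 3) array, the same N points observed in V views.
    t_init: (V, 3) array of noisy per-view translations.
    View 0 is trusted and held fixed, which removes the global-shift ambiguity.
    """
    t = t_init.copy()
    for _ in range(n_iters):
        # Shape step: undo each view's current pose, then average across views.
        shape = (obs - t[:, None, :]).mean(axis=0)
        # Pose step: the translation that best aligns the shape with each view.
        t_new = (obs - shape[None, :, :]).mean(axis=1)
        t[1:] = t_new[1:]
    # Recompute the shape once more with the final poses.
    shape = (obs - t[:, None, :]).mean(axis=0)
    return shape, t

# Synthetic data: 4 views of 200 points with small observation noise.
rng = np.random.default_rng(0)
true_shape = rng.normal(size=(200, 3))
true_t = rng.normal(size=(4, 3))
obs = true_shape[None] + true_t[:, None] + 0.01 * rng.normal(size=(4, 200, 3))

# Noisy initial poses, except the trusted reference view.
t_init = true_t + 0.2 * rng.normal(size=(4, 3))
t_init[0] = true_t[0]

shape, t = refine_shape_and_pose(obs, t_init)
err_before = np.abs(t_init - true_t).max()
err_after = np.abs(t - true_t).max()
print(f"max pose error before: {err_before:.4f}, after: {err_after:.4f}")
```

Each step is a closed-form least-squares update, so the pose error shrinks as the two steps alternate. Handling rotations, images instead of matched points, and learned update rules is where a method such as FvOR goes well beyond this sketch.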
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robotics and Sensor-Based Localization
