O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model

Yubin Hu; Sheng Ye; Wang Zhao; Matthieu Lin; Yuze He; Yu-Hui Wen; Ying He; Yong-Jin Liu

arXiv:2308.09591·cs.CV·March 20, 2024

TL;DR

This paper introduces O$^2$-Recon, a framework that combines a pre-trained 2D diffusion in-painting model with a cascaded network architecture to reconstruct occluded objects in 3D from RGB-D videos, reporting state-of-the-art results.

Contribution

The paper presents a new method combining 2D diffusion in-painting, minimal human-in-the-loop mask generation, and a cascaded network for predicting signed distance fields, enhancing 3D object reconstruction.

Findings

1. Achieves state-of-the-art accuracy in object reconstruction.
2. Reconstructs occluded and hidden object parts.
3. Uses a semantic consistency loss to improve reconstruction from unseen views.

Abstract

Occlusion is a common issue in 3D reconstruction from RGB-D videos, often blocking the complete reconstruction of objects and presenting an ongoing problem. In this paper, we propose a novel framework, empowered by a 2D diffusion-based in-painting model, to reconstruct complete surfaces for the hidden parts of objects. Specifically, we utilize a pre-trained diffusion model to fill in the hidden areas of 2D images. Then we use these in-painted images to optimize a neural implicit surface representation for each instance for 3D reconstruction. Since creating the in-painting masks needed for this process is tricky, we adopt a human-in-the-loop strategy that involves very little human engagement to generate high-quality masks. Moreover, some parts of objects can be totally hidden because the videos are usually shot from limited perspectives. To ensure recovering these invisible areas, we…

Code & Models

Repositories

thu-lyj-lab/o2-recon (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques

Methods: Diffusion