DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao; Xiang Zhang; Haiyang Xu; Zeyuan Chen; Jianwen Xie; Yuan Gao; Zhuowen Tu

arXiv:2507.22825·cs.CV·July 31, 2025

TL;DR

DepR is a depth-guided single-view scene reconstruction framework. It generates each object with instance-level diffusion and uses depth during both training and inference to compose the objects into a 3D scene from a single image.

Contribution

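DepR introduces depth-guided conditioning to encode shape priors into diffusion models, together with a compositional approach that reconstructs objects individually and then arranges them into a coherent 3D layout. At inference, depth also guides DDIM sampling and layout optimization, which improves alignment between the reconstruction and the input image.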
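To make the compositional idea concrete, here is a minimal sketch of how a depth map could anchor an individually generated object in the scene. It is not the authors' layout optimization: the pinhole back-projection is standard, but the closed-form scale-and-translation fit (no rotation), the function names `backproject_depth` and `fit_scale_translation`, and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are assumptions chosen for illustration.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy, mask=None):
    """Lift a depth map of shape (H, W) to camera-space 3D points (pinhole model).

    `mask` is an optional boolean instance mask; only pixels with positive
    depth (and inside the mask, if given) are returned.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # u: column index, v: row index
    valid = depth > 0
    if mask is not None:
        valid &= mask
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def fit_scale_translation(obj_pts, target_pts):
    """Hypothetical placement step: isotropic scale s and translation t such that
    s * obj_pts + t matches the centroid and RMS spread of target_pts.
    Rotation is deliberately ignored in this sketch.
    """
    oc = obj_pts.mean(axis=0)
    tc = target_pts.mean(axis=0)
    o_spread = np.sqrt(((obj_pts - oc) ** 2).sum(axis=1).mean())
    t_spread = np.sqrt(((target_pts - tc) ** 2).sum(axis=1).mean())
    s = t_spread / o_spread
    t = tc - s * oc
    return s, t
```

In use, `target_pts` would come from `backproject_depth` restricted to one instance mask, and `obj_pts` from points sampled on the generated object in its canonical frame. Because the depth points cover only the visible surface, a fit of this kind is at best a rough initialization.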

Findings

01

Achieves state-of-the-art performance on synthetic datasets

02

Demonstrates strong generalization to real-world data

03

Effectively leverages depth information throughout the process

Abstract

We propose DepR, a depth-guided single-view scene reconstruction framework that integrates instance-level diffusion within a compositional paradigm. Instead of reconstructing the entire scene holistically, DepR generates individual objects and subsequently composes them into a coherent 3D layout. Unlike previous methods that use depth solely for object layout estimation during inference and therefore fail to fully exploit its rich geometric information, DepR leverages depth throughout both training and inference. Specifically, we introduce depth-guided conditioning to effectively encode shape priors into diffusion models. During inference, depth further guides DDIM sampling and layout optimization, enhancing alignment between the reconstruction and the input image. Despite being trained on limited synthetic data, DepR achieves state-of-the-art performance and demonstrates strong…
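The abstract states that depth guides DDIM sampling but gives no formula here, so the following is only a generic sketch of one common way to guide a deterministic DDIM step: the predicted noise is shifted along the gradient of a loss, as in classifier-style guidance. The function name `guided_ddim_step`, the `grad` and `scale` parameters, and the idea that `grad` comes from a depth-consistency loss are assumptions, not the paper's formulation.

```python
import numpy as np

def guided_ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, grad=None, scale=1.0):
    """One deterministic DDIM update (eta = 0) with optional loss-gradient guidance.

    x_t            : current noisy sample
    eps_pred       : noise predicted by the denoiser at this step
    alpha_bar_t    : cumulative noise-schedule product at the current step
    alpha_bar_prev : cumulative product at the next (less noisy) step
    grad           : gradient of a loss w.r.t. x_t (assumed here to be a
                     depth-consistency loss); None disables guidance
    """
    if grad is not None:
        # Shift the noise estimate so the update moves against the loss gradient.
        eps_pred = eps_pred + scale * np.sqrt(1.0 - alpha_bar_t) * grad
    # Estimate the clean sample implied by the (possibly shifted) noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise the estimate to the next, less noisy level.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred
```

With `grad=None` this reduces to the plain DDIM update. With a gradient supplied, each step is nudged toward samples with lower loss, and `scale` trades off fidelity to the denoiser against agreement with the guidance signal.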

Code & Models

Models

zx1239856/DepR
model

Datasets

zx1239856/DepR-3D-FRONT
dataset · 166 downloads