TL;DR
DeOcc-1-to-3 introduces a self-supervised, occlusion-aware multi-view synthesis framework that enables accurate 3D object reconstruction from a single occluded image, outperforming prior methods that assume fully visible inputs.
Contribution
The paper presents the first end-to-end, self-supervised approach for occlusion-aware multi-view generation from a single image, eliminating the need for inpainting or manual annotations.
Findings
Achieves structurally consistent novel views from occluded images
Enables reliable 3D reconstruction without prior inpainting
Introduces a new benchmark for occlusion-aware reconstruction
Abstract
Reconstructing 3D objects from a single image remains challenging, especially under real-world occlusions. While recent diffusion-based view synthesis models can generate consistent novel views from a single RGB image, they typically assume fully visible inputs and fail when parts of the object are occluded, resulting in degraded 3D reconstruction quality. We propose DeOcc-1-to-3, an end-to-end framework for occlusion-aware multi-view generation that synthesizes six structurally consistent novel views directly from a single occluded image, enabling reliable 3D reconstruction without prior inpainting or manual annotations. Our self-supervised training pipeline leverages occluded-unoccluded image pairs and pseudo-ground-truth views to teach the model structure-aware completion and view consistency. Without modifying the original architecture, we fully fine-tune the view synthesis model to…
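The abstract mentions training on occluded-unoccluded image pairs but does not say how those pairs are built. The sketch below shows one simple way such a pair could be synthesized, by pasting a rectangular occluder onto a clean image and keeping the mask. The function name `make_occluded_pair`, the rectangular occluder, and the grayscale list-of-rows image format are all assumptions made for illustration, not the authors' pipeline.

```python
import random


def make_occluded_pair(image, occ_h, occ_w, fill=0.0, rng=None):
    """Build a hypothetical (occluded, clean, mask) training triple.

    `image` is a grayscale image stored as a list of rows of floats.
    A rectangle of size occ_h x occ_w is placed at a random position
    and filled with `fill`. This is an illustrative assumption, not
    the occlusion model used in the paper.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if not (0 < occ_h <= height and 0 < occ_w <= width):
        raise ValueError("occluder must fit inside the image")

    # Pick the top-left corner so the rectangle stays inside the image.
    top = rng.randint(0, height - occ_h)
    left = rng.randint(0, width - occ_w)

    clean = [row[:] for row in image]      # unoccluded target
    occluded = [row[:] for row in image]   # model input
    mask = [[0] * width for _ in range(height)]  # 1 where occluded

    for y in range(top, top + occ_h):
        for x in range(left, left + occ_w):
            occluded[y][x] = fill
            mask[y][x] = 1
    return occluded, clean, mask


if __name__ == "__main__":
    img = [[1.0] * 8 for _ in range(6)]
    occluded, clean, mask = make_occluded_pair(img, 2, 3, rng=random.Random(0))
    for row in mask:
        print("".join(str(v) for v in row))
```

In a setup like this, the occluded image would serve as the model input and the clean image (or views derived from it) as the supervision signal. The mask is kept only so that the occluded region can be inspected or weighted separately.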
Taxonomy
Methods: Inpainting
