3D-Consistent Image Inpainting with Diffusion Models
Leonid Antsfeld, Boris Chidlovskii

TL;DR
This paper introduces a diffusion-based inpainting method that achieves 3D consistency and semantic coherence by feeding a second view of the same scene into the denoising process, without explicit 3D supervision. It reports outperforming state-of-the-art methods on one synthetic and three real-world datasets.
Contribution
It proposes a generative diffusion model trained on pairs of images from the same scene, using the second view as guidance to achieve 3D-consistent inpainting without explicit 3D supervision.
Findings
Achieves semantically coherent, 3D-consistent inpainting results.
Outperforms state-of-the-art methods on one synthetic and three real-world datasets.
Effective without explicit 3D supervision.
Abstract
We address the problem of 3D inconsistency in image inpainting based on diffusion models. We propose a generative model that uses image pairs belonging to the same scene. To achieve 3D-consistent and semantically coherent inpainting, we modify the generative diffusion model by incorporating an alternative point of view of the scene into the denoising process. This creates an inductive bias that allows the model to recover 3D priors while training to denoise in 2D, without explicit 3D supervision. Training unconditional diffusion models with additional images as in-context guidance makes it possible to harmonize the masked and non-masked regions while repainting and ensures 3D consistency. We evaluate our method on one synthetic and three real-world datasets and show that it generates semantically coherent and 3D-consistent inpaintings and outperforms state-of-the-art methods.
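To make the idea of harmonizing masked and non-masked regions during denoising more concrete, here is a minimal sketch of a generic masked-blending reverse diffusion loop in which the noise predictor also receives a second view of the scene. It is an illustration under stated assumptions, not the authors' implementation: the function names, the linear noise schedule, the way the second view is passed in, and the toy noise predictor `toy_eps_model` are all hypothetical stand-ins for a trained network.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumption) and cumulative alpha products."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return betas, np.cumprod(1.0 - betas)

def toy_eps_model(x_t, ref_view, t):
    """Hypothetical stand-in for a trained noise predictor.

    A real model would be a network conditioned on the second view;
    here the deviation from the reference view is treated as "noise".
    """
    return x_t - ref_view

def inpaint(target, mask, ref_view, eps_model, num_steps=50, seed=0):
    """Masked-blending reverse loop; mask == 1 marks known pixels."""
    rng = np.random.default_rng(seed)
    betas, acp = make_schedule(num_steps)
    x = rng.standard_normal(target.shape)  # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        # Unknown region: one DDPM reverse step guided by the second view.
        eps = eps_model(x, ref_view, t)
        mean = (x - betas[t] / np.sqrt(1.0 - acp[t]) * eps) / np.sqrt(1.0 - betas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x_unknown = mean + np.sqrt(betas[t]) * noise
        # Known region: the target image noised to the matching level.
        if t > 0:
            x_known = (np.sqrt(acp[t - 1]) * target
                       + np.sqrt(1.0 - acp[t - 1]) * rng.standard_normal(x.shape))
        else:
            x_known = target
        # Blend so both regions share the same noise level at every step.
        x = mask * x_known + (1.0 - mask) * x_unknown
    return x

# Usage: an 8x8 image whose central 4x4 block is masked out.
target = np.full((8, 8), 0.5)
ref_view = np.full((8, 8), 0.5)   # second view of the same scene (toy data)
mask = np.ones((8, 8))
mask[2:6, 2:6] = 0.0              # 0 = region to inpaint
result = inpaint(target, mask, ref_view, toy_eps_model)
```

Because the known pixels are re-injected at every step and copied unchanged at the final step, the output matches the input exactly outside the mask, while the masked region is generated by the guided reverse process.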
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Numerical Analysis Techniques
Methods: Diffusion · Inpainting
