3DFill: Reference-guided Image Inpainting by Self-supervised 3D Image Alignment
Liang Zhao, Xinyuan Zhao, Hailong Ma, Xinyu Zhang, Long Zeng

TL;DR
3DFill introduces a self-supervised, reference-guided image inpainting method that aligns images using 3D projection and 2D transformation, effectively handling large holes and complex scenes with improved accuracy and speed.
Contribution
The paper presents a novel 3D projection plus 2D transformation alignment approach for reference-guided inpainting, outperforming traditional 2D-only methods.
Findings
Achieves state-of-the-art inpainting performance across wide view shifts.
Faster inference speed compared to existing models.
Effectively handles large holes and complex scenes.
Abstract
Most existing image inpainting algorithms operate on a single view and struggle with large holes or holes that contain complicated scenes. Some reference-guided algorithms fill the hole by referring to an image from another viewpoint and use 2D image alignment. Because of the camera imaging process, a simple 2D transformation struggles to achieve a satisfactory result. In this paper, we propose 3DFill, a simple and efficient method for reference-guided image inpainting. Given a target image with arbitrary hole regions and a reference image from another viewpoint, 3DFill first aligns the two images with a two-stage method, 3D projection + 2D transformation, which gives better results than 2D image alignment alone. The 3D projection is an overall alignment between the images, and the 2D transformation is a local alignment focused on the hole region. The entire image alignment process is self-supervised.…
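The two-stage alignment described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the pinhole camera model, the assumption that a per-pixel depth map and a relative pose (R, t) are already available, and the use of a single homography H for the 2D stage are all assumptions made for illustration. In the paper these quantities are learned in a self-supervised way, which this sketch does not cover.

```python
import numpy as np

def reproject_pixels(depth, K, R, t):
    """Stage 1 (assumed form): 3D projection.
    Back-project every target pixel with its depth, move it into the
    reference camera frame with (R, t), and project it with intrinsics K.
    Returns an (h, w, 2) array of (x, y) coordinates in the reference image."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T.astype(np.float64)
    rays = np.linalg.inv(K) @ pix              # 3 x N viewing rays
    pts = rays * depth.reshape(1, -1)          # 3D points in the target frame
    pts_ref = R @ pts + t.reshape(3, 1)        # 3D points in the reference frame
    proj = K @ pts_ref
    uv = proj[:2] / proj[2:3]                  # perspective divide
    return uv.T.reshape(h, w, 2)

def apply_homography(coords, H):
    """Stage 2 (assumed form): 2D transformation.
    Refine sampling coordinates with a 3x3 homography H."""
    h, w, _ = coords.shape
    homog = np.concatenate([coords.reshape(-1, 2), np.ones((h * w, 1))], axis=1).T
    out = H @ homog
    return (out[:2] / out[2:3]).T.reshape(h, w, 2)

def sample_nearest(image, coords):
    """Inverse-warp the reference image with nearest-neighbour lookup.
    Returns the warped image and a mask of pixels that landed inside it."""
    h, w = image.shape[:2]
    x = np.rint(coords[..., 0]).astype(int)
    y = np.rint(coords[..., 1]).astype(int)
    valid = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    out = np.zeros(coords.shape[:2] + image.shape[2:], dtype=image.dtype)
    out[valid] = image[y[valid], x[valid]]
    return out, valid

# Hypothetical usage: warp a reference image into the target view.
K = np.array([[100.0, 0.0, 4.0], [0.0, 100.0, 3.0], [0.0, 0.0, 1.0]])
depth = np.full((6, 8), 2.0)                   # constant depth, illustration only
reference = np.arange(6 * 8).reshape(6, 8)
coords = reproject_pixels(depth, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
coords = apply_homography(coords, np.eye(3))   # identity: no local refinement
aligned, valid = sample_nearest(reference, coords)
```

Pixels where `valid` is false have no counterpart in the reference image; in a reference-guided pipeline those would be left to the inpainting network rather than copied from the reference.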
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Generative Adversarial Networks and Image Synthesis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Inpainting
