HI-GAN: Hierarchical Inpainting GAN with Auxiliary Inputs for Combined RGB and Depth Inpainting

Ankan Dash; Jingyi Gu; Guiling Wang

arXiv:2402.10334 · cs.CV · February 19, 2024 · 2 cites

Open Access

TL;DR

HI-GAN introduces a hierarchical GAN framework utilizing auxiliary edge and label inputs to improve RGB-D inpainting, especially for mixed reality applications, achieving superior results over existing methods.

Contribution

This work is the first to incorporate label images into an end-to-end hierarchical inpainting GAN for RGB-D data, enhancing inpainting quality with auxiliary inputs.

Findings

01. Outperforms existing RGB-D inpainting methods in quality and consistency.

02. Effectively utilizes auxiliary edge and label images for improved inpainting.

03. Operates in an end-to-end training framework with hierarchical optimization.

Abstract

Inpainting involves filling in missing pixels or areas in an image, a crucial technique employed in Mixed Reality environments for various applications, particularly in Diminished Reality (DR) where content is removed from a user's visual environment. Existing methods rely on digital replacement techniques which necessitate multiple cameras and incur high costs. AR devices and smartphones use ToF depth sensors to capture scene depth maps aligned with RGB images. Despite speed and affordability, ToF cameras create imperfect depth maps with missing pixels. To address the above challenges, we propose Hierarchical Inpainting GAN (HI-GAN), a novel approach comprising three GANs in a hierarchical fashion for RGBD inpainting. EdgeGAN and LabelGAN inpaint masked edge and segmentation label images respectively, while CombinedRGBD-GAN combines their latent representation outputs and performs RGB…
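
The hierarchical data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`apply_mask`, `composite`, `hierarchical_inpaint`), the stub stages, and the use of flat single-channel lists in place of image tensors are all assumptions made for illustration. The compositing step (keep known pixels, take predictions only inside holes) is a common inpainting convention and is assumed here rather than taken from the paper.

```python
from typing import Callable, List

Vector = List[float]
Mask = List[int]  # 1 marks a missing pixel (hole), 0 marks a known pixel


def apply_mask(pixels: Vector, mask: Mask) -> Vector:
    """Zero out the pixels that must be inpainted."""
    return [0.0 if m else p for p, m in zip(pixels, mask)]


def composite(known: Vector, predicted: Vector, mask: Mask) -> Vector:
    """Keep known pixels; use predictions only inside the holes."""
    return [q if m else p for p, q, m in zip(known, predicted, mask)]


def hierarchical_inpaint(
    rgbd: Vector,
    edges: Vector,
    labels: Vector,
    mask: Mask,
    edge_stage: Callable[[Vector, Mask], Vector],
    label_stage: Callable[[Vector, Mask], Vector],
    combined_stage: Callable[[Vector, Mask, Vector], Vector],
) -> Vector:
    # Two auxiliary stages (stand-ins for EdgeGAN and LabelGAN) each
    # produce a latent vector from their masked input.
    edge_latent = edge_stage(apply_mask(edges, mask), mask)
    label_latent = label_stage(apply_mask(labels, mask), mask)
    # The final stage (stand-in for CombinedRGBD-GAN) receives both
    # latents together with the masked RGB-D input.
    masked_rgbd = apply_mask(rgbd, mask)
    predicted = combined_stage(masked_rgbd, mask, edge_latent + label_latent)
    return composite(masked_rgbd, predicted, mask)


# Hypothetical stub stages; real stages would be trained generators.
def identity_stage(masked: Vector, mask: Mask) -> Vector:
    return masked


def mean_fill_stage(masked: Vector, mask: Mask, latent: Vector) -> Vector:
    # Ignores the latent and fills every pixel with the mean of known pixels.
    known = [p for p, m in zip(masked, mask) if not m]
    fill = sum(known) / len(known)
    return [fill] * len(masked)


result = hierarchical_inpaint(
    rgbd=[1.0, 2.0, 3.0, 4.0],
    edges=[0.0, 1.0, 1.0, 0.0],
    labels=[2.0, 2.0, 3.0, 3.0],
    mask=[0, 1, 1, 0],
    edge_stage=identity_stage,
    label_stage=identity_stage,
    combined_stage=mean_fill_stage,
)
print(result)
```

The stub stages only show where each component sits in the pipeline; swapping in learned generators would not change the call structure.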

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Industrial Vision Systems and Defect Detection

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Inpainting