HI-GAN: Hierarchical Inpainting GAN with Auxiliary Inputs for Combined RGB and Depth Inpainting
Ankan Dash, Jingyi Gu, Guiling Wang

TL;DR
HI-GAN is a hierarchical GAN framework that uses auxiliary edge and label inputs to improve RGB-D inpainting, particularly for mixed reality applications, and reports better results than existing methods.
Contribution
This work is the first to incorporate label images into an end-to-end hierarchical inpainting GAN for RGB-D data, enhancing inpainting quality with auxiliary inputs.
Findings
Outperforms existing RGB-D inpainting methods in quality and consistency.
Effectively utilizes auxiliary edge and label images for improved inpainting.
Operates in an end-to-end training framework with hierarchical optimization.
Abstract
Inpainting involves filling in missing pixels or areas in an image, a crucial technique employed in Mixed Reality environments for various applications, particularly in Diminished Reality (DR) where content is removed from a user's visual environment. Existing methods rely on digital replacement techniques which necessitate multiple cameras and incur high costs. AR devices and smartphones use ToF depth sensors to capture scene depth maps aligned with RGB images. Despite speed and affordability, ToF cameras create imperfect depth maps with missing pixels. To address the above challenges, we propose Hierarchical Inpainting GAN (HI-GAN), a novel approach comprising three GANs in a hierarchical fashion for RGBD inpainting. EdgeGAN and LabelGAN inpaint masked edge and segmentation label images respectively, while CombinedRGBD-GAN combines their latent representation outputs and performs RGB…
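The data flow described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names (`apply_mask`, `composite`, `hierarchical_inpaint`), the mask convention (1 marks a missing pixel), the concatenation used to fuse the two latents, and the final compositing step are all assumptions, and the three "networks" are toy stand-in callables rather than GANs.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def apply_mask(values: Sequence[float], mask: Sequence[int]) -> Vector:
    """Zero out entries where mask == 1 (assumed convention: 1 = missing pixel)."""
    return [0.0 if m else float(v) for v, m in zip(values, mask)]


def composite(original: Sequence[float], generated: Sequence[float],
              mask: Sequence[int]) -> Vector:
    """Keep known pixels; take generated values only inside the hole (assumed step)."""
    return [float(g) if m else float(o)
            for o, g, m in zip(original, generated, mask)]


def hierarchical_inpaint(
    rgbd: Sequence[float],
    edge: Sequence[float],
    label: Sequence[float],
    mask: Sequence[int],
    edge_encoder: Callable[[Vector], Vector],      # stand-in for EdgeGAN
    label_encoder: Callable[[Vector], Vector],     # stand-in for LabelGAN
    combined_generator: Callable[[Vector, Vector], Vector],  # stand-in for CombinedRGBD-GAN
) -> Vector:
    """Two auxiliary branches produce latents; the third stage consumes both."""
    z_edge = edge_encoder(apply_mask(edge, mask))
    z_label = label_encoder(apply_mask(label, mask))
    fused = z_edge + z_label  # list concatenation as a hypothetical fusion
    generated = combined_generator(apply_mask(rgbd, mask), fused)
    return composite(rgbd, generated, mask)


# Toy stand-ins so the sketch runs end to end (not real networks).
def toy_edge_encoder(x: Vector) -> Vector:
    return [sum(x) / len(x)]


def toy_label_encoder(x: Vector) -> Vector:
    return [max(x)]


def toy_generator(masked: Vector, z: Vector) -> Vector:
    fill = sum(z) / len(z)
    return [fill] * len(masked)


if __name__ == "__main__":
    # A flattened single-channel "image" of four pixels; the middle two are missing.
    out = hierarchical_inpaint(
        rgbd=[0.2, 0.4, 0.6, 0.8],
        edge=[0, 1, 1, 0],
        label=[1, 1, 2, 2],
        mask=[0, 1, 1, 0],
        edge_encoder=toy_edge_encoder,
        label_encoder=toy_label_encoder,
        combined_generator=toy_generator,
    )
    print(out)
```

In a real system each stand-in would be a trained generator operating on image tensors, and the fusion of latents would follow whatever the paper specifies; the sketch only shows the ordering of the three stages and that known pixels pass through unchanged.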
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Industrial Vision Systems and Defect Detection
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Inpainting
