Instance-Aware Image Completion
Jinoh Cho, Minguk Kang, Vibhav Vineet, Jaesik Park

TL;DR
This paper introduces ImComplete, a transformer-based image completion model that hallucinates missing instances to produce contextually consistent, photo-realistic images, outperforming existing methods.
Contribution
ImComplete combines a transformer architecture with semantic segmentation completion to generate context-aware, realistic image completions.
Findings
Outperforms existing methods in visual quality metrics (LPIPS, FID)
Achieves higher contextual preservation scores (CLIPscore, object detection accuracy)
Demonstrates effectiveness on COCO-panoptic and Visual Genome datasets
Abstract
Image completion is a task that aims to fill in the missing region of a masked image with plausible contents. However, existing image completion methods tend to fill in the missing region with the surrounding texture instead of hallucinating a visual instance that is suitable in accordance with the context of the scene. In this work, we propose a novel image completion model, dubbed ImComplete, that hallucinates the missing instance that harmonizes well with - and thus preserves - the original context. ImComplete first adopts a transformer architecture that considers the visible instances and the location of the missing region. Then, ImComplete completes the semantic segmentation masks within the missing region, providing pixel-level semantic and structural guidance. Finally, the image synthesis blocks generate photo-realistic content. We perform a comprehensive evaluation of the…
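The three stages named in the abstract can be pictured as a simple data flow. The sketch below is a toy illustration only and is not the authors' implementation: every function name, the `CO_OCCURRENCE` table, and the colour palette are hypothetical stand-ins. In particular, the transformer of stage 1 is replaced by a hand-written co-occurrence lookup, and the image synthesis blocks of stage 3 by a label-to-colour mapping.

```python
# Toy stand-ins for the three stages described in the abstract.
# All names and values here are hypothetical; none come from the paper.

# Hypothetical plausibility scores of a candidate instance class
# given each visible class (stand-in for the transformer in stage 1).
CO_OCCURRENCE = {
    "car": {"road": 3, "sky": 1},
    "boat": {"water": 3, "sky": 1},
}


def predict_missing_class(seg, mask):
    """Stage 1 stand-in: choose an instance class from the visible context."""
    visible = {
        seg[r][c]
        for r in range(len(seg))
        for c in range(len(seg[0]))
        if not mask[r][c]
    }

    def score(candidate):
        return sum(CO_OCCURRENCE[candidate].get(v, 0) for v in visible)

    return max(CO_OCCURRENCE, key=score)


def complete_segmentation(seg, mask, cls):
    """Stage 2 stand-in: write the predicted class into the masked region."""
    return [
        [cls if mask[r][c] else seg[r][c] for c in range(len(seg[0]))]
        for r in range(len(seg))
    ]


def synthesize_image(seg, palette):
    """Stage 3 stand-in: turn the completed label map into pixel colours."""
    return [[palette[label] for label in row] for row in seg]


def complete(seg, mask, palette):
    """Run the three stand-in stages in order."""
    cls = predict_missing_class(seg, mask)
    full_seg = complete_segmentation(seg, mask, cls)
    return full_seg, synthesize_image(full_seg, palette)
```

The point of the sketch is the ordering: the missing instance is chosen from context first, the segmentation map is completed second, and pixels are generated last, so the final image is guided by semantics rather than by copying surrounding texture.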
Taxonomy
Topics: Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques
