A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

Junhao Zhuang; Yanhong Zeng; Wenran Liu; Chun Yuan; Kai Chen

arXiv:2312.03594 · cs.CV · July 24, 2024 · 2 cites

Open Access · 1 Repo · 6 Models

TL;DR

PowerPaint is a versatile image inpainting model that uses learnable task prompts and fine-tuning strategies to excel in multiple inpainting tasks, including background filling, object synthesis, and shape-guided inpainting, achieving state-of-the-art results.

Contribution

The paper introduces learnable task prompts and tailored fine-tuning strategies, enabling a single model to perform diverse inpainting tasks with high quality.
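The idea of a learnable task prompt can be illustrated with a minimal sketch: one trainable vector per task is prepended to the text conditioning, and only the vector for the active task is updated during fine-tuning. This is an illustration under stated assumptions, not the authors' implementation; the task names, embedding size, and helper functions (`init_task_prompts`, `build_condition`, `sgd_step`) are all hypothetical.

```python
import random

# Hypothetical task names and embedding size; not taken from the paper.
TASKS = ("object_synthesis", "context_fill", "shape_guided")
EMBED_DIM = 4


def init_task_prompts(tasks, dim, seed=0):
    """Create one trainable vector per task, randomly initialised."""
    rng = random.Random(seed)
    return {task: [rng.gauss(0.0, 0.02) for _ in range(dim)] for task in tasks}


def build_condition(task_prompts, task, text_tokens):
    """Prepend the selected task's vector to the text token embeddings."""
    if task not in task_prompts:
        raise KeyError(f"unknown task: {task}")
    return [task_prompts[task]] + list(text_tokens)


def sgd_step(task_prompts, task, grad, lr=0.1):
    """Update only the selected task's vector; other tasks are untouched."""
    task_prompts[task] = [w - lr * g for w, g in zip(task_prompts[task], grad)]


# Usage: condition on "context_fill" with three placeholder text embeddings.
prompts = init_task_prompts(TASKS, EMBED_DIM)
text = [[0.0] * EMBED_DIM for _ in range(3)]
cond = build_condition(prompts, "context_fill", text)
```

Switching tasks at inference time then amounts to selecting a different key, which is the sense in which a single model can serve several inpainting intents.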

Findings

01. Achieves state-of-the-art performance in multiple inpainting tasks.
02. Effectively uses task prompts for object removal and shape-guided inpainting.
03. Demonstrates versatility and controllability in image inpainting applications.

Abstract

Advancing image inpainting is challenging as it requires filling user-specified regions for various intents, such as background filling and object synthesis. Existing approaches focus on either context-aware filling or object synthesis using text descriptions. However, achieving both tasks simultaneously is challenging due to differing training strategies. To overcome this challenge, we introduce PowerPaint, the first high-quality and versatile inpainting model that excels in multiple inpainting tasks. First, we introduce learnable task prompts along with tailored fine-tuning strategies to guide the model's focus on different inpainting targets explicitly. This enables PowerPaint to accomplish various inpainting tasks by utilizing different task prompts, resulting in state-of-the-art performance. Second, we demonstrate the versatility of the task prompt in PowerPaint by showcasing its…

Code & Models

Repositories

open-mmlab/mmagic/tree/main/projects/powerpaint (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Image Retrieval and Classification Techniques

Methods: Inpainting · Focus