Visual Instruction Inversion: Image Editing via Visual Prompting

Thao Nguyen; Yuheng Li; Utkarsh Ojha; Yong Jae Lee

arXiv:2307.14331·cs.CV·July 27, 2023·2 cites

TL;DR

This paper introduces Visual Instruction Inversion, a method that inverts a visual prompt (a "before" and "after" example pair) into a text-based editing instruction, so that a pretrained text-to-image diffusion model can apply the same edit to new images.

Contribution

It presents an image-editing approach that inverts visual prompts into editing instructions, enabling edits to be learned from as little as one example pair.

Findings

01 With just one example pair, achieves results competitive with state-of-the-art text-conditioned image editing frameworks.

02 Outperforms some existing text-conditioned editing methods.

03 Uses pretrained text-to-image diffusion models to invert visual prompts into editing instructions.

Abstract

Text-conditioned image editing has emerged as a powerful tool for editing images. However, in many situations, language can be ambiguous and ineffective in describing specific image edits. When faced with such challenges, visual prompts can be a more informative and intuitive way to convey ideas. We present a method for image editing via visual prompting. Given pairs of examples that represent the "before" and "after" images of an edit, our goal is to learn a text-based editing direction that can be used to perform the same edit on new images. We leverage the rich, pretrained editing capabilities of text-to-image diffusion models by inverting visual prompts into editing instructions. Our results show that with just one example pair, we can achieve competitive results compared to state-of-the-art text-conditioned image editing frameworks.
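
The idea of inverting a before/after pair into an instruction can be pictured with a deliberately tiny toy. The sketch below is not the paper's implementation: it assumes a frozen linear "editor" in place of the pretrained diffusion model, treats images as short lists of floats, and uses hypothetical names (W, edit, invert_instruction, demo) that do not come from the paper or its repository. It only illustrates the general pattern of keeping the editor fixed, optimizing the instruction vector alone until the edited "before" image matches the "after" image, and then reusing that vector on a new image.

```python
# Toy sketch only: a frozen linear "editor" stands in for the pretrained
# diffusion model. All names are hypothetical, not taken from the paper's code.

# Frozen editor weights (4 "pixels", 2-dimensional instruction); never updated.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]


def edit(image, instruction):
    """Apply the frozen editor: image + W @ instruction."""
    return [x + sum(w * c for w, c in zip(row, instruction))
            for x, row in zip(image, W)]


def invert_instruction(before, after, steps=100, lr=0.1):
    """Gradient descent on the instruction only, minimizing
    the squared error between edit(before, c) and after."""
    c = [0.0] * len(W[0])
    for _ in range(steps):
        residual = [p - a for p, a in zip(edit(before, c), after)]
        # d/dc of sum(residual^2) is 2 * W^T @ residual
        grad = [2.0 * sum(W[i][k] * residual[i] for i in range(len(W)))
                for k in range(len(c))]
        c = [ck - lr * gk for ck, gk in zip(c, grad)]
    return c


def demo():
    """Learn an instruction from one example pair, then reuse it."""
    before = [0.2, 0.4, 0.6, 0.8]
    after = edit(before, [0.5, -0.25])  # pair made with a hidden instruction
    learned = invert_instruction(before, after)
    new_image = [0.9, 0.1, 0.3, 0.5]
    return learned, edit(new_image, learned)
```

Calling demo() returns the recovered instruction vector together with the new image edited by it. In a real setting the editor would be a pretrained diffusion model and the objective would be whatever loss the method defines, so this toy should be read only as an illustration of optimizing the instruction while the model stays frozen.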

Code & Models

Repositories

thaoshibe/visii (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques

Methods: Diffusion