FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Yabo Zhang; Xinpeng Zhou; Yihan Zeng; Hang Xu; Hui Li; Wangmeng Zuo

arXiv:2501.08225·cs.CV·January 15, 2025

TL;DR

FramePainter leverages video diffusion priors for efficient, coherent, and seamless interactive image editing, reducing training costs and enhancing generalization to unseen scenarios.

Contribution

The paper reformulates interactive image editing as an image-to-video generation task. It introduces a lightweight sparse control encoder to inject editing signals, and matching attention to improve temporal consistency and handle large motions.
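To make the image-to-video framing concrete, the sketch below packs a source image and a set of drag edits into the conditioning an image-to-video model could consume. It is an illustration only, not the authors' code: the names `DragEdit`, `rasterize_drags`, and `build_conditioning` are hypothetical, the two-frame clip length (source frame plus edited frame) is an assumption, and encoding each drag as a displacement stored at its start pixel is just one plausible way to represent a sparse editing signal.

```python
from dataclasses import dataclass


@dataclass
class DragEdit:
    """One drag interaction in (row, col) pixel coordinates (hypothetical type)."""
    start: tuple
    end: tuple


def rasterize_drags(drags, height, width):
    """Build a height x width grid of (d_row, d_col) displacements.

    Every cell is (0, 0) except the start pixel of each drag, which is
    why the map is sparse. This encoding is an assumption.
    """
    grid = [[(0, 0) for _ in range(width)] for _ in range(height)]
    for drag in drags:
        row, col = drag.start
        if not (0 <= row < height and 0 <= col < width):
            raise ValueError(f"drag start {drag.start} is outside the image")
        grid[row][col] = (drag.end[0] - row, drag.end[1] - col)
    return grid


def build_conditioning(source_image, drags):
    """Frame an edit as a short video: frame 0 is the source image and
    frame 1 is the edited image the model is asked to generate."""
    height, width = len(source_image), len(source_image[0])
    return {
        "first_frame": source_image,  # conditioning image
        "sparse_control": rasterize_drags(drags, height, width),
        "num_frames": 2,  # assumed: source frame + edited frame
    }


if __name__ == "__main__":
    image = [[0] * 4 for _ in range(3)]  # toy 3x4 single-channel image
    cond = build_conditioning(image, [DragEdit(start=(1, 1), end=(2, 3))])
    print(cond["sparse_control"][1][1])
```

In a real system the sparse map would be passed through the control encoder and the generated second frame would be taken as the edited image; both steps are omitted here because the source does not describe them in detail.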

Findings

01 Outperforms previous methods with less training data

02 Achieves highly seamless and coherent image edits

03 Demonstrates strong generalization to unseen scenarios

Abstract

Interactive image editing allows users to modify images through visual interaction operations such as drawing, clicking, and dragging. Existing methods construct such supervision signals from videos, as they capture how objects change with various physical interactions. However, these models are usually built upon text-to-image diffusion models, and therefore require (i) massive training samples and (ii) an additional reference encoder to learn real-world dynamics and visual consistency. In this paper, we reformulate this task as an image-to-video generation problem, so that it inherits powerful video diffusion priors to reduce training costs and ensure temporal consistency. Specifically, we introduce FramePainter as an efficient instantiation of this formulation. Initialized with Stable Video Diffusion, it only uses a lightweight sparse control encoder to inject editing signals. Considering the…

Code & Models

Repositories

ybybzhang/framepainter
PyTorch · Official

Models

🤗 Yabo/FramePainter
model · 19 dl · ♡ 5

Taxonomy

Topics: Video Analysis and Summarization · Image Retrieval and Classification Techniques

Methods: Softmax · Attention Is All You Need · Diffusion