Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing

Tongtong Su; Chengyu Wang; Jun Huang; Dongming Lu

arXiv:2505.23134 · cs.CV · May 30, 2025

TL;DR

This paper introduces Zero-to-Hero, a reference-based video appearance editing method. An anchor frame is edited to serve as a reference image, and its appearance is then propagated consistently across the other frames, guided by correspondence within the original frames. The reported gain over the baseline is 2.6 dB PSNR.

Contribution

The paper proposes a zero-shot initialization approach for reference-based video editing that improves accuracy and temporal consistency, built on a correspondence-guided attention mechanism that remains robust when objects exhibit large motions.

Findings

01 Outperforms baseline with 2.6 dB PSNR improvement

02 Uses correspondence-guided attention for robustness against large motions

03 Provides a deterministic evaluation framework with Blender-generated videos
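
To put the 2.6 dB figure in context, the sketch below computes PSNR from its standard definition, 10 * log10(MAX^2 / MSE), and converts a dB gain into the equivalent reduction in mean squared error. The function names (`mse`, `psnr`, `mse_ratio_for_gain`) and the flat-list pixel representation are illustrative assumptions, not taken from the paper or its repository.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE); infinite for identical inputs."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # A 2.6 dB PSNR gain corresponds to an MSE roughly 1.8x smaller.
    print(round(mse_ratio_for_gain(2.6), 2))
```

Because PSNR is logarithmic in MSE, a 2.6 dB gain means the squared error against the ground truth drops by a factor of about 1.8, provided both methods are scored against the same reference and with the same peak value.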

Abstract

Appearance editing according to user needs is a pivotal task in video editing. Existing text-guided methods often lead to ambiguities regarding user intentions and restrict fine-grained control over editing specific aspects of objects. To overcome these limitations, this paper introduces a novel approach named Zero-to-Hero, which focuses on reference-based video editing that disentangles the editing process into two distinct problems. It achieves this by first editing an anchor frame to satisfy user requirements as a reference image and then consistently propagating its appearance across other frames. We leverage correspondence within the original frames to guide the attention mechanism, which is more robust than previously proposed optical flow or temporal modules in memory-friendly video generative models, especially when dealing with objects exhibiting large motions. It offers a…
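
The general idea of propagating an edited anchor frame through correspondence-guided attention can be illustrated with a toy sketch. This is not the authors' implementation: it assumes per-location feature vectors are already available for the original anchor frame and an original target frame, uses cosine similarity as the correspondence score, and the names `propagate_appearance` and `temperature` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propagate_appearance(src_anchor_feats, src_target_feats,
                         edited_anchor_values, temperature=0.05):
    """Blend EDITED anchor values into each target location.

    Attention weights come from similarity between ORIGINAL-frame features,
    so the edit itself never influences which locations are matched.
    """
    keys = [normalize(f) for f in src_anchor_feats]
    dim = len(edited_anchor_values[0])
    output = []
    for feat in src_target_feats:
        query = normalize(feat)
        weights = softmax([dot(query, k) / temperature for k in keys])
        output.append([
            sum(w * v[d] for w, v in zip(weights, edited_anchor_values))
            for d in range(dim)
        ])
    return output


if __name__ == "__main__":
    # Two locations whose content swaps places between anchor and target frame.
    anchor_feats = [[1.0, 0.0], [0.0, 1.0]]
    target_feats = [[0.0, 1.0], [1.0, 0.0]]
    edited_colors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    for row in propagate_appearance(anchor_feats, target_feats, edited_colors):
        print([round(c, 3) for c in row])
```

Matching is done by feature similarity rather than by spatial proximity, so a location that moves far between frames can still attend to its counterpart in the anchor, which is the intuition behind the abstract's robustness claim for large motions.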

Code & Models

Repositories

tonniia/zero2hero (Official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Image Enhancement Techniques

Methods: Attention Is All You Need · Softmax · RoIAlign · RoIPool · Sparse Evolutionary Training