UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing

Jianhong Bai; Tianyu He; Yuchi Wang; Junliang Guo; Haoji Hu; Zuozhu Liu; Jiang Bian

arXiv:2402.13185 · cs.CV · April 9, 2024 · 2 cites

Open Access

TL;DR

UniEdit is a tuning-free, unified framework for video editing that handles both motion and appearance modifications using a pre-trained text-to-video generator.

Contribution

The paper introduces UniEdit, a framework that supports both video motion and appearance editing without tuning, using a pre-trained text-to-video generator and auxiliary motion-reference and reconstruction branches for feature injection.

Findings

01

Outperforms state-of-the-art video editing methods

02

Supports diverse motion and appearance editing scenarios

03

Demonstrates effective preservation of source content

Abstract

Recent advances in text-guided video editing have showcased promising results in appearance editing (e.g., stylization). However, video motion editing in the temporal dimension (e.g., from eating to waving), which distinguishes video editing from image editing, is underexplored. In this work, we present UniEdit, a tuning-free framework that supports both video motion and appearance editing by harnessing the power of a pre-trained text-to-video generator within an inversion-then-generation framework. To realize motion editing while preserving source video content, based on the insights that temporal and spatial self-attention layers encode inter-frame and intra-frame dependency respectively, we introduce auxiliary motion-reference and reconstruction branches to produce text-guided motion and source features respectively. The obtained features are then injected into the main editing path…
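The abstract is cut off before it says which features are injected into which layers, so the following is only a minimal sketch of attention feature injection in general, not the paper's implementation. It assumes that "injection" means replacing some of the main path's query, key, or value features in a self-attention layer with the corresponding features from an auxiliary branch. All names here (`attention`, `self_attention_with_injection`, the `inject` parameter) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over lists of token vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        # Similarity of this query to every key, turned into weights.
        weights = softmax(
            [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        )
        # Weighted sum of the value vectors.
        out.append(
            [sum(w * vj[d] for w, vj in zip(weights, v))
             for d in range(len(v[0]))]
        )
    return out


def self_attention_with_injection(main, aux=None, inject=()):
    """Self-attention where selected features come from an auxiliary branch.

    main, aux: dicts with keys "q", "k", "v" (lists of token vectors).
    inject:    names of the features to take from `aux` instead of `main`.
    This is an assumed form of feature injection, not the paper's exact rule.
    """
    feats = {
        name: aux[name] if (aux is not None and name in inject) else main[name]
        for name in ("q", "k", "v")
    }
    return attention(feats["q"], feats["k"], feats["v"])
```

Under this assumption, a spatial self-attention layer might call `self_attention_with_injection(main, reconstruction, inject=("v",))` to carry over source content, and a temporal layer might use `inject=("q", "k")` with motion-reference features to carry over the attention pattern. Which tensors the paper actually swaps is not stated in the text above.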

Taxonomy

Topics: Face recognition and analysis · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition