Training-Free Semantic Video Composition via Pre-trained Diffusion Model

Jiaqi Guo; Sitong Su; Junchen Zhu; Lianli Gao; Jingkuan Song

arXiv:2401.09195 · cs.CV · January 18, 2024 · 1 citation

TL;DR

This paper introduces a training-free method for semantic video composition using a pre-trained diffusion model, effectively handling deep semantic disparities and ensuring visual harmony and coherence across frames.

Contribution

The paper proposes a training-free pipeline built on a pre-trained diffusion model. It combines Balanced Partial Inversion, used in the inversion process, with Inter-Frame Augmented attention, used in the generation process.

Findings

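01 Handles composite videos with broader semantic disparities, such as domain gaps, than approaches that only adjust foreground color and lighting

02 Maintains inter-frame coherence and visual harmony in the composite

03 Outperforms existing methods in semantic video composition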

Abstract

The video composition task aims to integrate specified foregrounds and backgrounds from different videos into a harmonious composite. Current approaches, predominantly trained on videos with adjusted foreground color and lighting, struggle to address deep semantic disparities beyond superficial adjustments, such as domain gaps. Therefore, we propose a training-free pipeline employing a pre-trained diffusion model imbued with semantic prior knowledge, which can process composite videos with broader semantic disparities. Specifically, we process the video frames in a cascading manner and handle each frame in two processes with the diffusion model. In the inversion process, we propose Balanced Partial Inversion to obtain generation initial points that balance reversibility and modifiability. Then, in the generation process, we further propose Inter-Frame Augmented attention to augment…
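The inversion and generation processes mentioned in the abstract can be illustrated with a generic, deterministic DDIM-style sketch. This is not the paper's Balanced Partial Inversion, whose details are not given here; it only shows the general idea of inverting a frame part of the way toward noise and then generating back from that intermediate point. The names `alpha_bar`, `toy_eps`, `ddim_move`, `partial_invert`, `generate`, and `signal_fraction`, the cosine schedule, and the constant noise predictor are all assumptions made for illustration. A real pre-trained noise predictor depends on its input, so inversion is only approximate and the reconstruction error tends to grow with inversion depth; the toy predictor below ignores its input, which makes the round trip exact.

```python
import math

def alpha_bar(t, num_steps):
    """Toy cosine schedule (assumption): 1.0 at t=0, shrinking as t grows."""
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2

def toy_eps(x, t):
    """Hypothetical stand-in for a pre-trained noise predictor.
    It ignores x and t, which is NOT true of a real diffusion model."""
    return [0.3 for _ in x]

def ddim_move(x, t_from, t_to, num_steps):
    """One deterministic DDIM update from t_from to t_to (either direction)."""
    a_from = alpha_bar(t_from, num_steps)
    a_to = alpha_bar(t_to, num_steps)
    eps = toy_eps(x, t_from)
    # Predict the clean sample, then re-noise it to the target level.
    x0_pred = [(v - math.sqrt(1 - a_from) * e) / math.sqrt(a_from)
               for v, e in zip(x, eps)]
    return [math.sqrt(a_to) * p + math.sqrt(1 - a_to) * e
            for p, e in zip(x0_pred, eps)]

def partial_invert(x0, stop_step, num_steps):
    """Invert only up to stop_step instead of all the way to pure noise."""
    x = list(x0)
    for t in range(stop_step):
        x = ddim_move(x, t, t + 1, num_steps)
    return x

def generate(x, start_step, num_steps):
    """Denoise from start_step back down to t=0."""
    for t in range(start_step, 0, -1):
        x = ddim_move(x, t, t - 1, num_steps)
    return x

def signal_fraction(t, num_steps):
    """Weight of the original content remaining at step t."""
    return math.sqrt(alpha_bar(t, num_steps))

if __name__ == "__main__":
    frame = [0.5, -1.0, 2.0]  # stand-in for one frame's latent values
    latent = partial_invert(frame, stop_step=25, num_steps=50)
    restored = generate(latent, start_step=25, num_steps=50)
    print("round-trip error:", max(abs(a - b) for a, b in zip(frame, restored)))
    print("signal kept at step 10:", signal_fraction(10, 50))
    print("signal kept at step 40:", signal_fraction(40, 50))
```

In this sketch the stopping step is the knob: a shallow stop keeps more of the original frame in the latent, which favors faithful reconstruction, while a deeper stop leaves less of it, which gives the generation process more room to change the content. How the paper actually chooses or balances this point is not stated in the text above.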

Taxonomy

Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques

Methods: Diffusion