Training-Free Semantic Video Composition via Pre-trained Diffusion Model
Jiaqi Guo, Sitong Su, Junchen Zhu, Lianli Gao, Jingkuan Song

TL;DR
This paper introduces a training-free method for semantic video composition using a pre-trained diffusion model, effectively handling deep semantic disparities and ensuring visual harmony and coherence across frames.
Contribution
It proposes a training-free pipeline that combines Balanced Partial Inversion and Inter-Frame Augmented attention to perform semantic video composition with a pre-trained diffusion model.
Findings
Handles composite videos with broader semantic disparities, such as domain gaps
Ensures inter-frame coherence and visual harmony
Outperforms existing methods in semantic video composition
Abstract
The video composition task aims to integrate specified foregrounds and backgrounds from different videos into a harmonious composite. Current approaches, predominantly trained on videos with adjusted foreground color and lighting, struggle to address deep semantic disparities beyond superficial adjustments, such as domain gaps. Therefore, we propose a training-free pipeline employing a pre-trained diffusion model imbued with semantic prior knowledge, which can process composite videos with broader semantic disparities. Specifically, we process the video frames in a cascading manner and handle each frame in two processes with the diffusion model. In the inversion process, we propose Balanced Partial Inversion to obtain generation initial points that balance reversibility and modifiability. Then, in the generation process, we further propose Inter-Frame Augmented attention to augment…
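The abstract is cut off before it says what Inter-Frame Augmented attention augments, so the following is only an illustrative sketch and not the authors' formulation. It assumes a pattern common in training-free video diffusion work: queries from the current frame attend to keys and values from both the current frame and a reference frame (for example, the previously processed frame in the cascade). The helper names `attention` and `inter_frame_attention`, and the toy token vectors, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return out


def inter_frame_attention(q_cur, k_cur, v_cur, k_ref, v_ref):
    """Hypothetical helper (an assumption, not taken from the paper):
    current-frame queries attend to the concatenation of current-frame
    and reference-frame keys/values."""
    return attention(q_cur, k_cur + k_ref, v_cur + v_ref)


# Toy tokens: two 2-D tokens per frame.
cur = [[1.0, 0.0], [0.0, 1.0]]
ref = [[0.5, 0.5], [1.0, 1.0]]

plain = attention(cur, cur, cur)
augmented = inter_frame_attention(cur, cur, cur, ref, ref)
```

In this sketch the output keeps one vector per current-frame token, while the reference-frame tokens only contribute extra keys and values. This is one way information from an earlier frame could influence the current one without any retraining.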
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques
Methods: Diffusion
