UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer

Delong Liu; Zhaohui Hou; Mingjie Zhan; Shihao Han; Zhicheng Zhao; Fei Su

arXiv:2412.09389·cs.CV·December 13, 2024

Open Access · 1 Repo

TL;DR

UFO is a plug-in that improves the consistency and quality of diffusion-based video generation through adaptive adapters. It leaves the original model unaltered, transfers across models, and supports stylized training.

Contribution

We introduce UFO, a modular, efficient, and transferable plug-in that enhances diffusion-based video generation quality and consistency without retraining original models.

Findings

01

UFO significantly improves video consistency and quality.

02

UFO demonstrates superior performance on public benchmarks.

03

UFO supports stylized and transfer learning across models.

Abstract

Recently, diffusion-based video generation models have achieved significant success. However, existing models often suffer from issues like weak consistency and declining image quality over time. To overcome these challenges, inspired by aesthetic principles, we propose a non-invasive plug-in called Uniform Frame Organizer (UFO), which is compatible with any diffusion-based video generation model. The UFO comprises a series of adaptive adapters with adjustable intensities, which can significantly enhance the consistency between the foreground and background of videos and improve image quality without altering the original model parameters when integrated. The training for UFO is simple, efficient, requires minimal resources, and supports stylized training. Its modular design allows for the combination of multiple UFOs, enabling the customization of personalized video generation models.…
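
The abstract describes adapters with adjustable intensity that sit beside a frozen model and can be combined. The sketch below illustrates that general pattern only. The abstract does not say how a UFO adapter is built, so the low-rank residual form and the names `Adapter`, `AdaptedLayer`, and `intensity` are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class Adapter:
    """Hypothetical low-rank residual branch: delta(x) = intensity * up @ (down @ x)."""

    def __init__(self, down: Matrix, up: Matrix, intensity: float = 1.0):
        self.down = down            # shape: rank x d_in
        self.up = up                # shape: d_out x rank
        self.intensity = intensity  # adjustable strength; 0.0 disables the adapter

    def delta(self, x: Sequence[float]) -> List[float]:
        hidden = matvec(self.down, x)
        return [self.intensity * y for y in matvec(self.up, hidden)]


class AdaptedLayer:
    """Frozen base layer plus any number of adapters.

    The base weights are only read, never written, which is one way to keep
    the original model parameters unaltered (an assumption about the design).
    """

    def __init__(self, base_weight: Matrix, adapters: Sequence[Adapter] = ()):
        self.base_weight = base_weight
        self.adapters = list(adapters)

    def forward(self, x: Sequence[float]) -> List[float]:
        out = matvec(self.base_weight, x)
        for adapter in self.adapters:
            # Each adapter contributes an additive correction to the base output.
            out = [o + d for o, d in zip(out, adapter.delta(x))]
        return out


# Usage: one frozen layer, two adapters with independent intensities.
base = [[1.0, 0.0], [0.0, 1.0]]
consistency = Adapter(down=[[1.0, 1.0]], up=[[1.0], [2.0]], intensity=0.5)
style = Adapter(down=[[1.0, 0.0]], up=[[0.0], [1.0]], intensity=1.0)
layer = AdaptedLayer(base, [consistency, style])
result = layer.forward([1.0, 2.0])
```

In this sketch, setting an adapter's `intensity` to 0.0 recovers the base layer's output exactly, and stacking adapters simply sums their corrections. Whether UFO combines adapters additively is not stated in the abstract.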

Code & Models

Repositories

delong-liu-bupt/ufo
Official

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis