Video Colorization with Pre-trained Text-to-Image Diffusion Models

Hanyuan Liu; Minshan Xie; Jinbo Xing; Chengze Li; Tien-Tsin Wong

arXiv:2306.01732 · cs.CV · June 5, 2023 · 2 cites

Open Access

TL;DR

ColorDiffuser leverages pre-trained text-to-image diffusion models with novel techniques to achieve state-of-the-art video colorization, ensuring high color fidelity and temporal consistency.

Contribution

The paper introduces ColorDiffuser, a novel adaptation of pre-trained diffusion models for video colorization with new attention and sampling strategies.

Findings

01. Achieves state-of-the-art performance on benchmark datasets.
02. Improves temporal coherence and color vividness.
03. Outperforms existing methods in color fidelity and visual quality.

Abstract

Video colorization is a challenging task that involves inferring plausible and temporally consistent colors for grayscale frames. In this paper, we present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization. With the proposed adapter-based approach, we repurpose the pre-trained text-to-image model to accept input grayscale video frames, with an optional text description, for video colorization. To enhance the temporal coherence and maintain the vividness of colorization across frames, we propose two novel techniques: Color Propagation Attention and the Alternated Sampling Strategy. Color Propagation Attention enables the model to refine its colorization decision based on a reference latent frame, while the Alternated Sampling Strategy captures spatiotemporal dependencies by using the next and previous adjacent latent frames…
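The idea of drawing on the previous and next adjacent latent frames can be illustrated with a small scheduling sketch. This is not the authors' implementation: the abstract is truncated here, so the alternation rule (previous frame on even sampling steps, next frame on odd steps), the clamping at clip boundaries, and the names `reference_index` and `reference_schedule` are all assumptions made for illustration.

```python
from typing import List


def reference_index(frame: int, step: int, num_frames: int) -> int:
    """Pick which frame serves as the reference for `frame` at sampling step `step`.

    Assumption (not from the paper): even steps use the previous frame,
    odd steps use the next frame. Indices are clamped to the clip, so a
    boundary frame may end up referencing itself.
    """
    offset = -1 if step % 2 == 0 else 1
    return min(max(frame + offset, 0), num_frames - 1)


def reference_schedule(num_frames: int, num_steps: int) -> List[List[int]]:
    """Return, for every sampling step, the reference frame index of each frame."""
    return [
        [reference_index(frame, step, num_frames) for frame in range(num_frames)]
        for step in range(num_steps)
    ]


if __name__ == "__main__":
    # Print the hypothetical schedule for a 4-frame clip over 2 sampling steps.
    for step, refs in enumerate(reference_schedule(num_frames=4, num_steps=2)):
        print(f"step {step}: {refs}")
```

Under these assumptions, information from both temporal directions reaches every frame over successive steps, which is one plausible reading of how alternating references could capture spatiotemporal dependencies.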

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion · Latent Diffusion Model · Colorization