Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model

Chen Rao; Guangyuan Li; Zehua Lan; Jiakai Sun; Junsheng Luan; Wei Xing; Lei Zhao; Huaizhong Lin; Jianfeng Dong; Dalong Zhang

arXiv:2408.13459·cs.CV·August 27, 2024

TL;DR

This paper introduces VD-Diff, a novel video deblurring framework combining diffusion models with a wavelet-aware transformer to better recover high-frequency details and outperform state-of-the-art methods.

Contribution

The paper proposes integrating diffusion models into a wavelet-aware transformer for efficient high-frequency detail recovery in video deblurring.

Findings

01  VD-Diff outperforms SOTA methods on multiple datasets.

02  The diffusion model generates high-frequency priors in a compact latent space.

03  The wavelet-aware transformer effectively preserves low-frequency information.
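
The low-frequency versus high-frequency split that these findings rely on can be illustrated with a one-level 2D Haar wavelet transform. The sketch below is not taken from the paper or its repository: the choice of the Haar wavelet, the function names `haar_dwt2` and `band_energy`, and the toy 4x4 frames are all assumptions made for illustration. It shows only that a sharp striped frame carries energy in a detail sub-band, while a fully blurred frame of the same mean brightness keeps all of its energy in the low-frequency sub-band.

```python
def haar_dwt2(image):
    """One-level orthonormal 2D Haar transform (illustrative, hypothetical helper).

    `image` is a 2D list with even height and width. Returns four half-size
    sub-bands: (low, detail_h, detail_v, detail_d).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    low, detail_h, detail_v, detail_d = [], [], [], []
    for r in range(0, rows, 2):
        low_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + d + e) / 2.0)  # local average (low frequency)
            h_row.append((a - b + d - e) / 2.0)    # change along a row
            v_row.append((a + b - d - e) / 2.0)    # change along a column
            d_row.append((a - b - d + e) / 2.0)    # diagonal change
        low.append(low_row)
        detail_h.append(h_row)
        detail_v.append(v_row)
        detail_d.append(d_row)
    return low, detail_h, detail_v, detail_d


def band_energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)


if __name__ == "__main__":
    # Toy frames (assumed data): vertical stripes, and the same frame
    # blurred all the way down to its mean value.
    sharp = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]
    blurred = [[0.5] * 4 for _ in range(4)]
    for name, frame in (("sharp", sharp), ("blurred", blurred)):
        low, dh, dv, dd = haar_dwt2(frame)
        high = band_energy(dh) + band_energy(dv) + band_energy(dd)
        print(name, "low energy:", band_energy(low), "high energy:", high)
```

Because the transform is orthonormal, the sub-band energies sum to the energy of the input frame, so blur shows up purely as a loss of energy from the detail sub-bands. That lost detail is the kind of information the page describes the diffusion prior as supplying.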

Abstract

Current video deblurring methods have limitations in recovering high-frequency information since the regression losses are conservative with high-frequency details. Since Diffusion Models (DMs) have strong capabilities in generating high-frequency details, we consider introducing DMs into the video deblurring task. However, we found that directly applying DMs to the video deblurring task has the following problems: (1) DMs require many iteration steps to generate videos from Gaussian noise, which consumes many computational resources. (2) DMs are easily misled by the blurry artifacts in the video, resulting in irrational content and distortion of the deblurred video. To address the above issues, we propose a novel video deblurring framework VD-Diff that integrates the diffusion model into the Wavelet-Aware Dynamic Transformer (WADT). Specifically, we perform the diffusion model in a…

Code & Models

Repositories

chen-rao/vd-diff
PyTorch · Official

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Digital Media Forensic Detection

Methods: Attention Is All You Need · Linear Layer · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings