VDTR: Video Deblurring with Transformer

Mingdeng Cao; Yanbo Fan; Yong Zhang; Jue Wang; Yujiu Yang

arXiv:2204.08023·cs.CV·April 19, 2022

Open Access 1 Repo

TL;DR

VDTR introduces a Transformer-based approach for video deblurring, leveraging long-range relation modeling to improve spatial and temporal restoration, outperforming CNN-based methods on multiple benchmarks.

Contribution

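Presents what the authors describe as the first attempt to adapt the Transformer architecture to video deblurring. A hierarchical model with window-based attention is used to address non-uniform blurs, misalignment across frames, and the high computational cost of high-resolution spatial modeling.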

Findings

01. Achieves competitive results on DVD, GOPRO, REDS, and BSD benchmarks.

02. Outperforms CNN-based methods in video deblurring tasks.

03. Demonstrates the effectiveness of Transformer for spatial and temporal modeling in videos.

Abstract

Video deblurring is still an unsolved problem due to the challenging spatio-temporal modeling process. Existing convolutional neural network-based methods show a limited capacity for effective spatial and temporal modeling in video deblurring. This paper presents VDTR, an effective Transformer-based model that makes the first attempt to adapt Transformer for video deblurring. VDTR exploits the superior long-range and relation modeling capabilities of Transformer for both spatial and temporal modeling. However, it is challenging to design an appropriate Transformer-based model for video deblurring because of the complicated non-uniform blurs, misalignment across multiple frames, and the high computational cost of high-resolution spatial modeling. To address these problems, VDTR advocates performing attention within non-overlapping windows and exploiting the hierarchical structure for…
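The idea of restricting attention to non-overlapping windows can be illustrated with a small NumPy sketch. This is a generic illustration, not VDTR's implementation: the function names (`window_partition`, `window_reverse`, `window_self_attention`) are hypothetical, the attention is single-head, and the query/key/value projections are taken to be the identity purely to keep the example short.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows.

    Returns an array of shape (num_windows, ws * ws, C).
    """
    H, W, C = x.shape
    assert H % ws == 0 and W % ws == 0, "H and W must be multiples of ws"
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    x = x.transpose(0, 2, 1, 3, 4)          # (H/ws, W/ws, ws, ws, C)
    return x.reshape(-1, ws * ws, C)

def window_reverse(windows, ws, H, W):
    """Inverse of window_partition: reassemble windows into (H, W, C)."""
    C = windows.shape[-1]
    x = windows.reshape(H // ws, W // ws, ws, ws, C)
    x = x.transpose(0, 2, 1, 3, 4)          # (H/ws, ws, W/ws, ws, C)
    return x.reshape(H, W, C)

def window_self_attention(x, ws):
    """Single-head self-attention computed independently inside each window.

    Assumption for brevity: Q, K and V are the input tokens themselves
    (identity projections); a real model would learn these projections.
    """
    H, W, C = x.shape
    win = window_partition(x, ws)                        # (N, T, C)
    scores = win @ win.transpose(0, 2, 1) / np.sqrt(C)   # (N, T, T)
    scores = scores - scores.max(axis=-1, keepdims=True) # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    out = weights @ win                                  # (N, T, C)
    return window_reverse(out, ws, H, W)

# Example: an 8x8 feature map with 4 channels and 4x4 windows.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 4))
attended = window_self_attention(features, ws=4)
```

With this restriction, the number of attention scores grows as H·W·ws² rather than (H·W)² for global attention over all pixels, which is why windowing helps at high spatial resolution.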

Code & Models

Repositories

ljzycmd/vdtr (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dropout · Label Smoothing · Adam · Residual Connection · Absolute Position Encodings · Byte Pair Encoding