VDTR: Video Deblurring with Transformer
Mingdeng Cao, Yanbo Fan, Yong Zhang, Jue Wang, Yujiu Yang

TL;DR
VDTR introduces a Transformer-based approach for video deblurring, leveraging long-range relation modeling to improve spatial and temporal restoration, outperforming CNN-based methods on multiple benchmarks.
Contribution
First adaptation of Transformer architecture for video deblurring, addressing non-uniform blurs, misalignment, and high computational costs with a hierarchical, window-based attention model.
Findings
Achieves competitive results on DVD, GOPRO, REDS, and BSD benchmarks.
Outperforms CNN-based methods in video deblurring tasks.
Demonstrates the effectiveness of Transformer for spatial and temporal modeling in videos.
Abstract
Video deblurring is still an unsolved problem due to the challenging spatio-temporal modeling process. Existing convolutional neural network-based methods show a limited capacity for effective spatial and temporal modeling for video deblurring. This paper presents VDTR, an effective Transformer-based model that makes the first attempt to adapt Transformer for video deblurring. VDTR exploits the superior long-range and relation modeling capabilities of Transformer for both spatial and temporal modeling. However, it is challenging to design an appropriate Transformer-based model for video deblurring due to the complicated non-uniform blurs, misalignment across multiple frames and the high computational costs for high-resolution spatial modeling. To address these problems, VDTR advocates performing attention within non-overlapping windows and exploiting the hierarchical structure for…
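The abstract's idea of performing attention within non-overlapping windows can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single attention head, no learned projections (queries, keys, and values are the raw features), a feature map whose height and width are multiples of the window size, and it omits the hierarchical structure entirely. All function names are hypothetical and chosen for illustration.

```python
import math


def window_partition(height, width, window):
    """Group pixel coordinates into non-overlapping window x window tiles."""
    if height % window or width % window:
        raise ValueError("this sketch assumes H and W are multiples of window")
    return [
        [(r, c) for r in range(top, top + window) for c in range(left, left + window)]
        for top in range(0, height, window)
        for left in range(0, width, window)
    ]


def self_attention(tokens):
    """Single-head scaled dot-product attention; Q = K = V = tokens (assumption)."""
    scale = math.sqrt(len(tokens[0]))
    outputs = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * value[d] for w, value in zip(weights, tokens))
            for d in range(len(query))
        ])
    return outputs


def window_attention(feature_map, window):
    """Apply self-attention independently inside each window of an H x W x C map."""
    height, width = len(feature_map), len(feature_map[0])
    result = [[None] * width for _ in range(height)]
    for tile in window_partition(height, width, window):
        tokens = [feature_map[r][c] for r, c in tile]
        for (r, c), vector in zip(tile, self_attention(tokens)):
            result[r][c] = vector
    return result


def attention_pairs(height, width, window=None):
    """Count query-key pairs: global if window is None, else summed per window."""
    if window is None:
        return (height * width) ** 2
    return sum(len(tile) ** 2 for tile in window_partition(height, width, window))
```

On a 4 x 4 map with 2 x 2 windows, `attention_pairs(4, 4)` counts 256 query-key pairs for global attention, while `attention_pairs(4, 4, 2)` counts 64. The saving grows with resolution because the global count is quadratic in the number of pixels, whereas the windowed count is linear in it for a fixed window size.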
Taxonomy
Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Dropout · Label Smoothing · Adam · Residual Connection · Absolute Position Encodings · Byte Pair Encoding
