Recurrent Video Restoration Transformer with Guided Deformable Attention
Jingyun Liang, Yuchen Fan, Xiaoyu Xiang, Rakesh Ranjan, Eddy Ilg, Simon Green, Jiezhang Cao, Kai Zhang, Radu Timofte, Luc Van Gool

TL;DR
This paper introduces RVRT, a recurrent video restoration transformer that balances model size, effectiveness, and efficiency by processing local neighboring frames in parallel within a globally recurrent framework and using guided deformable attention for clip-to-clip alignment.
Contribution
The paper proposes a novel recurrent transformer architecture for video restoration that combines parallel local processing with global recurrence and clip-to-clip alignment.
Findings
Achieves state-of-the-art performance on video super-resolution, deblurring, and denoising tasks.
Balances model size, memory usage, and runtime effectively.
Demonstrates superior results on benchmark datasets.
Abstract
Video restoration aims at restoring multiple high-quality frames from multiple low-quality frames. Existing video restoration methods generally fall into two extreme cases, i.e., they either restore all frames in parallel or restore the video frame by frame in a recurrent way, which would result in different merits and drawbacks. Typically, the former has the advantage of temporal information fusion. However, it suffers from large model size and intensive memory consumption; the latter has a relatively small model size as it shares parameters across frames; however, it lacks long-range dependency modeling ability and parallelizability. In this paper, we attempt to integrate the advantages of the two cases by proposing a recurrent video restoration transformer, namely RVRT. RVRT processes local neighboring frames in parallel within a globally recurrent framework which can achieve a good…
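The clip-wise recurrence described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: frames are plain lists of floats, `toy_update` is a hypothetical stand-in for the learned transformer block, and the names `split_into_clips` and `recurrent_clip_restore` are invented for this example. The sketch runs in one temporal direction only and omits alignment and attention entirely.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for one frame's feature vector


def split_into_clips(frames: Sequence[Frame], clip_len: int) -> List[List[Frame]]:
    """Group consecutive frames into clips of at most clip_len frames."""
    if clip_len < 1:
        raise ValueError("clip_len must be >= 1")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def toy_update(clip: List[Frame], prev_state: Frame) -> List[Frame]:
    """Hypothetical placeholder for a learned block.

    Every frame in the clip is updated jointly: it sees the clip mean
    (local, parallel fusion) and the state carried over from the
    previous clip (global recurrence).
    """
    dim = len(clip[0])
    clip_mean = [sum(f[d] for f in clip) / len(clip) for d in range(dim)]
    return [
        [0.5 * f[d] + 0.25 * clip_mean[d] + 0.25 * prev_state[d] for d in range(dim)]
        for f in clip
    ]


def recurrent_clip_restore(
    frames: Sequence[Frame],
    clip_len: int,
    update: Callable[[List[Frame], Frame], List[Frame]] = toy_update,
) -> List[Frame]:
    """Process clips one after another, carrying a state between them."""
    if not frames:
        return []
    dim = len(frames[0])
    state: Frame = [0.0] * dim  # initial hidden state (assumed zeros)
    restored: List[Frame] = []
    for clip in split_into_clips(frames, clip_len):
        out = update(clip, state)  # frames within a clip are handled together
        restored.extend(out)
        # The state passed to the next clip is the mean of this clip's outputs.
        state = [sum(f[d] for f in out) / len(out) for d in range(dim)]
    return restored


if __name__ == "__main__":
    video = [[1.0], [3.0], [5.0]]
    print(recurrent_clip_restore(video, clip_len=2))
```

In this sketch, setting `clip_len=1` reduces to frame-by-frame recurrence, while setting `clip_len=len(frames)` processes the whole video in a single parallel step; intermediate values sit between the two extremes the abstract contrasts.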
Taxonomy
Topics: Advanced Image Processing Techniques · Image Processing Techniques and Applications · Image and Signal Denoising Methods
Methods: Contrastive Language-Image Pre-training
