Rethinking Alignment in Video Super-Resolution Transformers

Shuwei Shi; Jinjin Gu; Liangbin Xie; Xintao Wang; Yujiu Yang; Chao Dong

arXiv:2207.08494 · cs.CV · October 11, 2022 · 35 citations

TL;DR

This paper questions whether alignment modules are necessary in video super-resolution (VSR) Transformers. It shows that these models can use unaligned frames directly, and that a new patch alignment method achieves state-of-the-art results.

Contribution

The paper shows that replacing traditional alignment modules with patch alignment can improve both the performance and the efficiency of VSR Transformers.

Findings

01 VSR Transformers can utilize unaligned multi-frame information effectively.

02 Existing alignment methods may sometimes harm VSR Transformer performance.

03 Patch alignment achieves state-of-the-art results on benchmarks.

Abstract

The alignment of adjacent frames is considered an essential operation in video super-resolution (VSR). Advanced VSR models, including the latest VSR Transformers, are generally equipped with well-designed alignment modules. However, the progress of the self-attention mechanism may violate this common sense. In this paper, we rethink the role of alignment in VSR Transformers and make several counter-intuitive observations. Our experiments show that: (i) VSR Transformers can directly utilize multi-frame information from unaligned videos, and (ii) existing alignment methods are sometimes harmful to VSR Transformers. These observations indicate that we can further improve the performance of VSR Transformers simply by removing the alignment module and adopting a larger attention window. Nevertheless, such designs will dramatically increase the computational burden, and cannot deal with large…

Code & Models

Repositories

xpixelgroup/rethinkvsralignment
PyTorch · Official

Videos

Rethinking Alignment in Video Super-Resolution Transformers · SlidesLive

Taxonomy

Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging