Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede, Ma

TL;DR
This paper introduces a novel arbitrary-scale video super-resolution method that leverages multi-scale structural and textural priors to improve quality, generalization, and speed, outperforming existing methods.
Contribution
The paper proposes ST-AVSR, a new AVSR framework that integrates structural and textural priors from a pre-trained VGG network to enhance super-resolution performance.
Findings
Significant improvement in super-resolution quality.
Enhanced generalization ability across scales.
Faster inference speed than state-of-the-art methods.
Abstract
Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity. In this paper, we first describe a strong baseline for AVSR by putting together three variants of elementary building blocks: 1) a flow-guided recurrent unit that aggregates spatiotemporal information from previous frames, 2) a flow-refined cross-attention unit that selects spatiotemporal information from future frames, and 3) a hyper-upsampling unit that generates scaleaware and content-independent upsampling kernels. We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network. This prior has proven effective in discriminating structure and texture…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image Processing Techniques · Image and Video Quality Assessment
MethodsMax Pooling · Dropout · Dense Connections · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Softmax
