Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors

Wei Shang; Dongwei Ren; Wanying Zhang; Yuming Fang; Wangmeng Zuo; Kede Ma

arXiv:2407.09919·cs.CV·July 16, 2024

TL;DR

This paper introduces a novel arbitrary-scale video super-resolution method that leverages multi-scale structural and textural priors to improve quality, generalization, and speed, outperforming existing methods.

Contribution

The paper proposes ST-AVSR, a new AVSR framework that integrates structural and textural priors from a pre-trained VGG network to enhance super-resolution performance.

Findings

01. Significant improvement in super-resolution quality.
02. Enhanced generalization ability across scales.
03. Faster inference speed than state-of-the-art methods.

Abstract

Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity. In this paper, we first describe a strong baseline for AVSR by putting together three variants of elementary building blocks: 1) a flow-guided recurrent unit that aggregates spatiotemporal information from previous frames, 2) a flow-refined cross-attention unit that selects spatiotemporal information from future frames, and 3) a hyper-upsampling unit that generates scale-aware and content-independent upsampling kernels. We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network. This prior has proven effective in discriminating structure and texture…
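
To make the phrase "scale-aware and content-independent upsampling kernels" concrete, here is a minimal 1-D sketch in plain Python. It is not the authors' hyper-upsampling unit: the paper's kernels come from a learned network, whereas this sketch swaps in a fixed triangle (linear) kernel as a stand-in. The function names `source_coords`, `kernel`, and `upsample_row` are hypothetical. What it is meant to show is only that the weights depend on the scale factor and the sub-pixel position, never on the pixel values.

```python
import math


def source_coords(out_size, scale):
    """Map each output pixel centre to a (possibly fractional) input coordinate."""
    return [(i + 0.5) / scale - 0.5 for i in range(out_size)]


def kernel(x, scale):
    """Return (taps, weights) for source position x.

    Assumption: a triangle kernel stands in for the learned kernel generator.
    The inputs are only the position and the scale, so the result is
    content-independent by construction.
    """
    width = max(1.0, 1.0 / scale)  # widen the support when downscaling
    lo = math.floor(x - width) + 1
    hi = math.floor(x + width)
    taps = list(range(lo, hi + 1))
    weights = [max(0.0, 1.0 - abs(x - t) / width) for t in taps]
    total = sum(weights)
    return taps, [w / total for w in weights]


def upsample_row(row, scale):
    """Resample a 1-D row of pixel values by an arbitrary (non-integer) scale."""
    out_size = round(len(row) * scale)
    last = len(row) - 1
    out = []
    for x in source_coords(out_size, scale):
        taps, weights = kernel(x, scale)
        # Clamp tap indices at the borders (replicate padding).
        out.append(sum(w * row[min(max(t, 0), last)] for t, w in zip(taps, weights)))
    return out


if __name__ == "__main__":
    print(upsample_row([0.0, 1.0, 2.0, 3.0], 2.5))
```

Because `kernel` never sees `row`, the weights for a given scale and output size could be computed once and reused across frames, which is one plausible reading of why content-independent kernels help inference speed.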

Code & Models

Repositories

shangwei5/st-avsr (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Video Quality Assessment

Methods: Max Pooling · Dropout · Dense Connections · Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Softmax