SATVSR: Scenario Adaptive Transformer for Cross Scenarios Video Super-Resolution
Yongjie Chen, Tieru Wu

TL;DR
SATVSR introduces a novel transformer-based approach for video super-resolution that adaptively selects scene-relevant information using optical flow and cross-scale aggregation, improving performance across diverse scenarios.
Contribution
The paper proposes an adaptive scenario-aware transformer model that effectively distinguishes relevant information in videos with scene changes, enhancing robustness and accuracy.
Findings
Significant performance improvements on single-scene videos.
Enhanced robustness on cross-scene datasets.
Effective handling of scale variations with a new cross-scale aggregation module.
Abstract
Video Super-Resolution (VSR) aims to recover sequences of high-resolution (HR) frames from low-resolution (LR) frames. Previous methods mainly utilize temporally adjacent frames to assist the reconstruction of target frames. However, in real-world videos with fast scene switching, adjacent frames contain a lot of irrelevant information, and these VSR methods cannot adaptively distinguish and select the useful information. In contrast, building on a transformer structure suitable for temporal tasks, we devise a novel adaptive scenario video super-resolution method. Specifically, we use optical flow to label the patches in each video frame and calculate attention only among patches with the same label. We then select the most relevant label among them to supplement the spatial-temporal information of the target frame. This design can directly make the supplementary information come from the same…
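The label-restricted attention described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `label_masked_attention`, the plain scaled dot-product scoring, and the zero-vector fallback for a patch with no same-label partner are all assumptions made for illustration. The labels are taken as given here; how they are derived from optical flow is not covered.

```python
import math


def label_masked_attention(queries, keys, values, q_labels, k_labels):
    """Attend from each query patch only to key patches sharing its label.

    Hypothetical helper: patches are plain lists of floats, labels are ints.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q, q_label in zip(queries, q_labels):
        # Indices of key patches that carry the same label as this query.
        same = [j for j, k_label in enumerate(k_labels) if k_label == q_label]
        if not same:
            # Assumption: a patch with no same-label partner receives nothing.
            outputs.append([0.0] * len(values[0]))
            continue
        # Scaled dot-product scores, computed over same-label patches only.
        scores = [scale * sum(a * b for a, b in zip(q, keys[j])) for j in same]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, same):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs


# Toy example: two target-frame patches, three neighbouring-frame patches.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 20.0], [5.0, 5.0]]
result = label_masked_attention(
    queries, keys, values, q_labels=[0, 1], k_labels=[0, 1, 1]
)
```

In the toy example, the first query can only see the single key labelled 0, so the values of patches from a different label never leak into its output. A practical model would more likely apply the same restriction as a boolean mask on a batched attention matrix rather than with Python loops.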
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging · Image Processing Techniques and Applications
