You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution
Mengshun Hu, Kui Jiang, Zhixiang Nie, Zheng Wang

TL;DR
This paper introduces a bidirectional interaction network for spatial-temporal video super-resolution that reduces redundant alignments, improves information flow, and enhances efficiency, outperforming existing methods.
Contribution
The paper proposes a recurrent network for spatial-temporal video super-resolution (ST-VSR) that needs only one alignment and fusion step, using bidirectional inference and a Hybrid Fusion Module.
Findings
Outperforms state-of-the-art methods in accuracy.
Reduces computational cost by approximately 22%.
Effectively exploits bidirectional motion and spatial information.
Abstract
Spatial-Temporal Video Super-Resolution (ST-VSR) technology generates high-quality videos with higher resolution and higher frame rates. Existing advanced methods accomplish ST-VSR tasks through the association of Spatial and Temporal video super-resolution (S-VSR and T-VSR). These methods require two alignments and fusions in S-VSR and T-VSR, which is obviously redundant and fails to sufficiently explore the information flow of consecutive spatial LR frames. Although bidirectional learning (future-to-past and past-to-future) was introduced to cover all input frames, the direct fusion of final predictions fails to sufficiently exploit intrinsic correlations of bidirectional motion learning and spatial information from all frames. We propose an effective yet efficient recurrent network with bidirectional interaction for ST-VSR, where only one alignment and fusion is needed. Specifically,…
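To make the task definition in the abstract concrete, the following is a minimal sketch of how the output size of an ST-VSR system relates to its input. It is only an illustration of the task, not the authors' network. The function name `stvsr_output_shape` and the default factors (4x spatial, 2x temporal) are assumptions chosen for the example, and the frame-count formula assumes new frames are interpolated only between existing input frames.

```python
def stvsr_output_shape(num_frames, height, width,
                       spatial_scale=4, temporal_factor=2):
    """Return (frames, height, width) of the output clip.

    Hypothetical helper: the scale factors are illustrative defaults,
    not values taken from the paper.
    """
    if num_frames < 2:
        raise ValueError("need at least two input frames to interpolate between")
    # Each of the (num_frames - 1) gaps between neighbouring input frames
    # gains (temporal_factor - 1) new frames; the input frames are kept.
    out_frames = (num_frames - 1) * temporal_factor + 1
    # Spatial super-resolution scales both dimensions by the same factor.
    return out_frames, height * spatial_scale, width * spatial_scale


if __name__ == "__main__":
    print(stvsr_output_shape(4, 64, 64))
```

Under these assumptions, four low-resolution 64x64 frames would map to seven 256x256 frames: the spatial factor multiplies each dimension, while the temporal factor fills the gaps between input frames.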
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging
