TL;DR
This paper introduces TCNet, an end-to-end network for video super-resolution (VSR) that enforces temporal consistency through self-alignment, correlative matching, and attention mechanisms, and reports better results than state-of-the-art methods on benchmark datasets.
Contribution
The paper proposes a temporal consistency learning framework that combines a spatio-temporal stability module, self-attention, and a hybrid recurrent architecture to improve VSR performance.
Findings
TCNet outperforms state-of-the-art methods on benchmark datasets.
The proposed modules enhance temporal stability and structural preservation.
Experiments validate the effectiveness of the multi-stage fusion approach.
Abstract
Video super-resolution (VSR) aims to reconstruct high-resolution (HR) frames from a low-resolution (LR) reference frame and multiple neighboring frames. The key operation is to exploit the relatively misaligned neighboring frames when reconstructing the current frame, while preserving the consistency of the results. Existing methods generally explore information propagation and frame alignment to improve VSR performance. However, few studies focus on temporal consistency between frames. In this paper, we propose an end-to-end Temporal Consistency learning Network (TCNet) for VSR to enhance the consistency of the reconstructed videos. A spatio-temporal stability module is designed to learn self-alignment from inter-frame information. In particular, correlative matching is employed to exploit the spatial dependency from each frame to maintain structural stability.…
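The idea of correlative matching between misaligned frames can be illustrated with a minimal sketch. This is not TCNet's learned module, whose details are not given in the abstract above. It is a classical stand-in that compares a patch in the reference frame with nearby patches in a neighboring frame using cosine similarity and keeps the best-scoring displacement. The function names (`patch`, `cosine`, `best_match`) and the parameters `r` and `search` are hypothetical and chosen only for illustration.

```python
import math

def patch(frame, y, x, r):
    """Flatten the (2r+1)x(2r+1) patch centered at (y, x); None if it leaves the frame."""
    h, w = len(frame), len(frame[0])
    if y - r < 0 or x - r < 0 or y + r >= h or x + r >= w:
        return None
    return [frame[j][i]
            for j in range(y - r, y + r + 1)
            for i in range(x - r, x + r + 1)]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(ref, nbr, y, x, r=1, search=2):
    """Return (dy, dx, score) of the neighbor patch most correlated with the reference patch.

    Assumption: frames are 2D lists of numbers with identical shapes.
    """
    target = patch(ref, y, x, r)
    if target is None:
        raise ValueError("reference patch falls outside the frame")
    best = (0, 0, -1.0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = patch(nbr, y + dy, x + dx, r)
            if cand is None:
                continue  # skip candidates that cross the frame border
            score = cosine(target, cand)
            if score > best[2]:
                best = (dy, dx, score)
    return best
```

Calling `best_match(ref, nbr, y, x)` on a neighboring frame whose content is a shifted copy of the reference should recover that shift as `(dy, dx)`. A learned VSR model would typically compute such correlations on feature maps rather than raw pixels and use them as a soft weighting instead of a hard argmax.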
