TL;DR
This paper introduces a joint video frame interpolation and super-resolution framework that enhances both spatial and temporal resolution of videos, using a multi-scale temporal loss for improved temporal consistency.
Contribution
It presents a novel joint VFI-SR model with a multi-scale temporal loss, enabling effective up-scaling of videos from 2K 30 fps to 4K 60 fps with enhanced temporal regularization.
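The combined up-conversion amounts to a factor of 2 in each spatial dimension and a factor of 2 in frame rate. The sketch below works through that arithmetic. It assumes "2K" means 1920×1080 and "4K" means 3840×2160, which the summary does not state, and the function name `upconverted_spec` is hypothetical.

```python
def upconverted_spec(width, height, fps, num_frames,
                     spatial_scale=2, temporal_scale=2):
    """Return (width, height, fps, num_frames) after joint up-conversion.

    Assumption: interpolation inserts (temporal_scale - 1) new frames into
    every gap between consecutive input frames, and keeps the originals.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    out_frames = (num_frames - 1) * temporal_scale + 1
    return (width * spatial_scale,
            height * spatial_scale,
            fps * temporal_scale,
            out_frames)


# Assumed 2K input: 1920x1080 at 30 fps, 31 frames (one second plus the end frame).
spec = upconverted_spec(1920, 1080, 30, num_frames=31)
print(spec)  # (3840, 2160, 60, 61)
```

Each output frame carries four times the pixels of an input frame, and the clip has roughly twice as many frames, so the output holds about eight times the raw pixel data of the input.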
Findings
The proposed method outperforms existing approaches in quality and temporal consistency.
The multi-scale temporal loss improves motion smoothness in up-scaled videos.
Extensive experiments validate the effectiveness of the joint framework.
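To give a rough sense of what a temporal loss evaluated at more than one temporal scale can look like, here is a minimal sketch. It is not the paper's formulation, which this summary does not spell out. It assumes one generic choice: penalise the mismatch between predicted and ground-truth frame-to-frame differences at several frame strides. All names (`multi_scale_temporal_loss`, `strides`, `weights`) are hypothetical, and frames are flat lists of floats to keep the example dependency-free.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized flat frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def frame_diff(a, b):
    """Element-wise difference a - b between two flat frames."""
    return [x - y for x, y in zip(a, b)]


def multi_scale_temporal_loss(pred, target, strides=(1, 2), weights=None):
    """Weighted sum of temporal-difference L1 losses over several strides.

    Assumed illustration only: at stride s, the change pred[t+s] - pred[t]
    is compared with target[t+s] - target[t] and averaged over all valid t.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    weights = weights or [1.0] * len(strides)
    total = 0.0
    for stride, w in zip(strides, weights):
        steps = range(len(pred) - stride)
        if not steps:  # sequence too short for this stride
            continue
        scale_loss = sum(
            l1(frame_diff(pred[t + stride], pred[t]),
               frame_diff(target[t + stride], target[t]))
            for t in steps
        ) / len(steps)
        total += w * scale_loss
    return total


# Toy sequence: brightness rises linearly over three 2-pixel frames.
target = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
# A prediction that repeats the first frame instead of interpolating (judder).
juddery = [[0.0, 0.0], [0.0, 0.0], [2.0, 2.0]]

print(multi_scale_temporal_loss(target, target))   # 0.0
print(multi_scale_temporal_loss(juddery, target))  # 1.0
```

In this toy case the stride-2 term is zero because the end frames match, while the stride-1 term flags the repeated middle frame. Combining strides lets such a loss respond to both short-range and longer-range motion errors.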
Abstract
Super-resolution (SR) has been widely used to convert low-resolution legacy videos to high-resolution (HR) ones, to suit the increasing resolution of displays (e.g., UHD TVs). However, it becomes easier for humans to notice motion artifacts (e.g., motion judder) in HR videos rendered on larger display devices. Thus, broadcasting standards support higher frame rates for UHD (Ultra High Definition) videos (4K@60 fps, 8K@120 fps), meaning that applying SR alone is insufficient to produce genuinely high-quality videos. Hence, to up-convert legacy videos for realistic applications, not only SR but also video frame interpolation (VFI) is needed. In this paper, we first propose a joint VFI-SR framework for up-scaling the spatio-temporal resolution of videos from 2K 30 fps to 4K 60 fps. For this, we propose a novel training scheme with a multi-scale temporal loss that imposes…
