VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution

Liangbin Xie, Xintao Wang, Honglun Zhang, Chao Dong, Ying Shan

arXiv:2205.03409 · eess.IV · May 10, 2022 · 1 cites

Open Access

TL;DR

This paper introduces VFHQ, a high-quality video face dataset, and shows that video face super-resolution (VFSR) models trained on it produce results with sharper edges and finer textures than models trained on the lower-quality VoxCeleb1 dataset. It also shows that temporal information improves video consistency.

Contribution

The paper presents a new high-quality dataset for video face super-resolution and provides a benchmarking study of state-of-the-art algorithms using this dataset.

Findings

01

Models trained on VFHQ produce sharper edges and finer textures.

02

Temporal information significantly improves video consistency and visual quality.

03

Benchmarking shows that models trained on VFHQ outperform models trained on existing datasets.

Abstract

Most existing video face super-resolution (VFSR) methods are trained and evaluated on VoxCeleb1, which was designed specifically for speaker identification and whose frames are of low quality. As a consequence, VFSR models trained on this dataset cannot output visually pleasing results. In this paper, we develop an automatic and scalable pipeline to collect a high-quality video face dataset (VFHQ), which contains over $16,000$ high-fidelity clips of diverse interview scenarios. To verify the necessity of VFHQ, we further conduct experiments and demonstrate that VFSR models trained on our VFHQ dataset can generate results with sharper edges and finer textures than those trained on VoxCeleb1. In addition, we show that temporal information plays a pivotal role in eliminating video consistency issues as well as further improving visual performance. Based on VFHQ,…

Taxonomy

Topics: Advanced Image Processing Techniques · Speech and Audio Processing · Face Recognition and Analysis