LLV-FSR: Exploiting Large Language-Vision Prior for Face Super-resolution

Chenyang Wang; Wenjie An; Kui Jiang; Xianming Liu; Junjun Jiang

arXiv:2411.09293·cs.CV·November 15, 2024

TL;DR

This paper introduces LLV-FSR, a face super-resolution framework that leverages large vision-language models and pluralistic priors like captions and depth maps to enhance reconstruction quality and perceptual realism.

Contribution

The paper integrates vision-language priors into face super-resolution so that higher-order semantic and non-visual information can be used to improve results.

Findings

01 Surpasses SOTA by 0.43 dB PSNR on the MMCelebA-HQ dataset

02 Significantly improves both reconstruction and perceptual quality

03 Effectively incorporates pluralistic priors like captions and depth maps
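
For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of the standard PSNR definition. It is not taken from the paper's code: the function names `psnr` and `mse_ratio_for_gain`, the flat-list image representation, and the default peak value of 255 are all assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel values (an illustrative
    simplification); max_value is the assumed peak pixel intensity.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE changes for a given PSNR gain at a fixed peak value."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    print(psnr([255, 0], [250, 5]))
    print(mse_ratio_for_gain(0.43))
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.43 dB at a fixed peak value corresponds to an MSE ratio of about 0.906, that is, roughly 9% lower mean squared error. This is a property of the metric itself, not an additional result reported by the paper.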

Abstract

Existing face super-resolution (FSR) methods have made significant advancements, but they primarily super-resolve faces using limited visual information, the original pixel-wise space in particular, and commonly overlook pluralistic clues such as higher-order depth and semantics, as well as non-visual inputs (text captions and descriptions). Consequently, these methods struggle to produce a unified and meaningful representation from the input face. We suppose that introducing the language-vision pluralistic representation into an unexplored potential embedding space could enhance FSR by encoding and exploiting the complementarity across language-vision priors. This motivates us to propose a new framework called LLV-FSR, which marries the power of a large vision-language model and higher-order visual priors with the challenging task of FSR. Specifically, besides directly absorbing knowledge from…

Taxonomy

Topics: Advanced Image Processing Techniques · Face recognition and analysis