FaSDiff: Balancing Perception and Semantics in Face Compression via Stable Diffusion Priors

Yimin Zhou; Yichong Xia; Bin Chen; Mingyao Hong; Jiawei Li; Zhi Wang; Yaowei Wang

arXiv:2505.05870·cs.CV·November 12, 2025

TL;DR

FaSDiff is a novel face image compression method that balances visual quality and semantic preservation by integrating a high-frequency-sensitive compressor with a diffusion prior, outperforming existing techniques.

Contribution

The paper introduces a hybrid low-frequency enhancement module and a Stable Diffusion-based framework tailored for face compression, improving both perceptual fidelity and semantic consistency.

Findings

01. Outperforms state-of-the-art methods in perceptual quality metrics.

02. Enhances downstream machine vision task performance.

03. Effectively balances human visual fidelity and machine understanding.

Abstract

With the increasing deployment of facial image data across a wide range of applications, efficient compression tailored to facial semantics has become critical for both storage and transmission. While recent learning-based face image compression methods have achieved promising results, they often suffer from degraded reconstruction quality at low bit rates. Directly applying diffusion-based generative priors to this task leads to suboptimal performance in downstream machine vision tasks, primarily due to poor preservation of high-frequency details. In this work, we propose FaSDiff (Facial Image Compression with a Stable Diffusion Prior), a novel diffusion-driven compression framework designed to enhance both visual fidelity and semantic consistency. FaSDiff incorporates a high-frequency-sensitive compressor to capture fine-grained details and generate robust…

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment

Methods: Diffusion