FaSDiff: Balancing Perception and Semantics in Face Compression via Stable Diffusion Priors
Yimin Zhou, Yichong Xia, Bin Chen, Mingyao Hong, Jiawei Li, Zhi Wang, Yaowei Wang

TL;DR
FaSDiff is a face image compression method that balances visual quality and semantic preservation by pairing a high-frequency-sensitive compressor with a Stable Diffusion prior; the authors report that it outperforms existing methods on perceptual quality and downstream machine vision tasks.
Contribution
The paper introduces a hybrid low-frequency enhancement module and a Stable Diffusion-based framework tailored for face compression, improving both perceptual fidelity and semantic consistency.
Findings
Outperforms state-of-the-art methods in perceptual quality metrics.
Enhances downstream machine vision task performance.
Effectively balances human visual fidelity and machine understanding.
Abstract
With the increasing deployment of facial image data across a wide range of applications, efficient compression tailored to facial semantics has become critical for both storage and transmission. While recent learning-based face image compression methods have achieved promising results, they often suffer from degraded reconstruction quality at low bit rates. Directly applying diffusion-based generative priors to this task leads to suboptimal performance in downstream machine vision tasks, primarily due to poor preservation of high-frequency details. In this work, we propose FaSDiff (Facial Image Compression with a Stable Diffusion Prior), a novel diffusion-driven compression framework designed to enhance both visual fidelity and semantic consistency. FaSDiff incorporates a high-frequency-sensitive compressor to capture fine-grained details and generate robust…
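As a rough illustration of the high-frequency versus low-frequency distinction the abstract relies on, the sketch below splits a grayscale image into a low-frequency part and a high-frequency residual using a circular low-pass mask in the Fourier domain. This is a generic signal-processing example, not the FaSDiff compressor or its enhancement module; the function name `split_frequency_bands` and the `cutoff` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_frequency_bands(image, cutoff=0.1):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Illustrative only: `cutoff` is a hypothetical parameter giving the
    radius of the low-pass disc in cycles per pixel (at most 0.5 per axis).
    """
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape

    # Frequency coordinates for each FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]

    # Keep only the bins whose radial frequency is inside the disc.
    low_pass = np.sqrt(fx ** 2 + fy ** 2) <= cutoff

    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * low_pass).real

    # The residual holds edges and fine texture (the high-frequency detail).
    high = image - low
    return low, high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face_like = rng.random((64, 64))  # stand-in for a real face crop
    low, high = split_frequency_bands(face_like, cutoff=0.1)
    print("low-band energy:", float(np.sum(low ** 2)))
    print("high-band energy:", float(np.sum(high ** 2)))
```

By construction the two bands sum back to the input, so discarding or coarsely quantizing the residual is one simple way to see how much fine detail a low-bit-rate reconstruction would lose.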
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment
Methods: Diffusion
