Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset

Jakub Wasala; Bartlomiej Wrzalski; Kornelia Noculak; Yuliia Tarasenko; Oliwer Krupa; Jan Kocon; Grzegorz Chodak

arXiv:2505.02255·cs.CV·May 12, 2025

TL;DR

This paper introduces a cost-effective way to improve images generated by distilled diffusion models: a refinement network trained on a fully synthetic paired dataset brings their output close to baseline quality at up to 82% lower computational cost.

Contribution

The study proposes a novel training approach, based on a synthetic paired dataset, for enhancing the output of distilled diffusion models, reducing computational cost while maintaining high image quality.

Findings

01. Achieved up to an 82% reduction in computational cost.

02. Generated photorealistic portraits comparable to those of the baseline model.

03. Demonstrated the effectiveness of synthetic paired datasets for model refinement.
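
To make the cost finding concrete, here is a minimal sketch of how a relative cost reduction could be computed for a two-stage pipeline. It assumes the pipeline cost is simply the distilled generator's cost plus the refinement network's cost; that additive model, the function name `relative_cost_reduction`, and all numbers are illustrative assumptions. The cost units are made up and chosen only so that the arithmetic lands on 82%; they are not the paper's measurements.

```python
def relative_cost_reduction(baseline_cost: float,
                            distilled_cost: float,
                            refiner_cost: float) -> float:
    """Fraction of the baseline cost saved by running the distilled
    generator followed by a refinement network (assumed additive)."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    pipeline_cost = distilled_cost + refiner_cost
    return 1.0 - pipeline_cost / baseline_cost


# Hypothetical cost units (NOT taken from the paper), for illustration only.
reduction = relative_cost_reduction(
    baseline_cost=100.0,   # e.g. one image from the baseline model
    distilled_cost=15.0,   # same image from the distilled model
    refiner_cost=3.0,      # one pass of the refinement network
)
print(f"Relative cost reduction: {reduction:.0%}")
```

Under this model the saving shrinks as the refiner gets more expensive, so the refinement network has to stay small for the approach to pay off.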

Abstract

This study presents a novel approach to enhance the cost-to-quality ratio of image generation with diffusion models. We hypothesize that differences between distilled (e.g. FLUX.1-schnell) and baseline (e.g. FLUX.1-dev) models are consistent and, therefore, learnable within a specialized domain, like portrait generation. We generate a synthetic paired dataset and train a fast image-to-image translation head. Using two sets of low- and high-quality synthetic images, our model is trained to refine the output of a distilled generator (e.g., FLUX.1-schnell) to a level comparable to a baseline model like FLUX.1-dev, which is more computationally intensive. Our results show that the pipeline, which combines a distilled version of a large generative model with our enhancement layer, delivers similar photorealistic portraits to the baseline version with up to an 82% decrease in computational…
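
The pipeline described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes pairs are formed by running both generators on the same prompt and seed, and it uses stand-in functions (`fake_distilled`, `fake_baseline`, `identity_refiner`) in place of real models such as FLUX.1-schnell and FLUX.1-dev. All names and the list-of-floats "image" type are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]                      # stand-in for a pixel array
Generator = Callable[[str, int], Image]  # (prompt, seed) -> image
Refiner = Callable[[Image], Image]       # image-to-image translation head


@dataclass
class Pair:
    prompt: str
    seed: int
    low_quality: Image    # distilled-model output (refiner input)
    high_quality: Image   # baseline-model output (refiner target)


def build_paired_dataset(prompts: List[str], seeds: List[int],
                         distilled: Generator,
                         baseline: Generator) -> List[Pair]:
    """Run both generators on every (prompt, seed) and keep the pairs."""
    return [Pair(p, s, distilled(p, s), baseline(p, s))
            for p in prompts for s in seeds]


def enhanced_pipeline(prompt: str, seed: int,
                      distilled: Generator, refiner: Refiner) -> Image:
    """Inference path: cheap generator first, then the refinement head."""
    return refiner(distilled(prompt, seed))


# --- Toy stand-ins so the sketch runs without any model weights ---
def fake_distilled(prompt: str, seed: int) -> Image:
    rng = random.Random(f"{prompt}-{seed}")
    return [rng.random() for _ in range(4)]


def fake_baseline(prompt: str, seed: int) -> Image:
    return [min(1.0, v + 0.1) for v in fake_distilled(prompt, seed)]


def identity_refiner(image: Image) -> Image:
    return list(image)  # an untrained refiner changes nothing


dataset = build_paired_dataset(["portrait a", "portrait b"], [0, 1],
                               fake_distilled, fake_baseline)
refined = enhanced_pipeline("portrait a", 0, fake_distilled, identity_refiner)
print(len(dataset), "pairs;", len(refined), "values per image")
```

In a real setup the refiner would be trained on `low_quality` to `high_quality` pairs, and only the distilled generator plus the refiner would run at inference time.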

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Image Enhancement Techniques

Methods: Diffusion