TL;DR
SkinGenBench systematically evaluates how different generative models and preprocessing pipelines affect synthetic dermoscopic image quality and melanoma diagnosis performance, and finds that architecture choice matters more than preprocessing complexity.
Contribution
This work introduces SkinGenBench, a benchmark comparing StyleGAN2-ADA and diffusion models for synthetic dermoscopic data augmentation in melanoma detection.
Findings
StyleGAN2-ADA produces images with lower FID and KID scores than the diffusion models, indicating a distribution closer to the real data.
Synthetic augmentation improves melanoma detection F1-score by 8-15%.
Advanced artifact removal offers limited benefits for diagnostic performance.
Abstract
This work introduces SkinGenBench, a systematic biomedical imaging benchmark that investigates how preprocessing complexity interacts with generative model choice for synthetic dermoscopic image augmentation and downstream melanoma diagnosis. Using a curated dataset of dermoscopic images from HAM10000 and MILK10K across five lesion classes, we evaluate two representative generative paradigms, StyleGAN2-ADA and Denoising Diffusion Probabilistic Models (DDPMs), under basic geometric augmentation and advanced artifact removal pipelines. Synthetic melanoma images are assessed using established perceptual and distributional metrics (FID, KID, IS), feature space analysis, and their impact on diagnostic performance across five downstream classifiers. Experimental results demonstrate that generative architecture choice has a stronger influence on both image fidelity and diagnostic…
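For readers unfamiliar with the distributional metrics named in the abstract, the following is a minimal sketch of how FID and KID can be computed once feature vectors have been extracted for real and synthetic images. It is not the paper's evaluation code. The function names `fid_from_features` and `kid_from_features` are hypothetical, the random arrays are stand-ins for the Inception features these metrics are normally computed on, and KID is shown as a single unbiased MMD estimate with the usual cubic polynomial kernel rather than an average over subsets.

```python
import numpy as np


def fid_from_features(real: np.ndarray, fake: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets (rows = images)."""
    mu_r, mu_f = real.mean(axis=0), fake.mean(axis=0)
    cov_r = np.cov(real, rowvar=False)
    cov_f = np.cov(fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # positive semi-definite covariances (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)


def kid_from_features(real: np.ndarray, fake: np.ndarray) -> float:
    """Unbiased MMD^2 estimate with the kernel k(x, y) = (x.y / d + 1)^3."""
    d = real.shape[1]
    m, n = len(real), len(fake)
    k_rr = (real @ real.T / d + 1.0) ** 3
    k_ff = (fake @ fake.T / d + 1.0) ** 3
    k_rf = (real @ fake.T / d + 1.0) ** 3
    # Within-set terms exclude the diagonal so the estimate is unbiased.
    term_rr = (k_rr.sum() - np.trace(k_rr)) / (m * (m - 1))
    term_ff = (k_ff.sum() - np.trace(k_ff)) / (n * (n - 1))
    return float(term_rr + term_ff - 2.0 * k_rf.mean())


# Stand-in features (assumption): 500 "images" with 8-dimensional features.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
close = rng.normal(size=(500, 8))          # same distribution as `real`
far = rng.normal(loc=1.0, size=(500, 8))   # mean-shifted distribution

fid_close = fid_from_features(real, close)
fid_far = fid_from_features(real, far)
kid_close = kid_from_features(real, close)
kid_far = kid_from_features(real, far)
```

For both metrics lower is better: a generator whose feature distribution matches the real one (`close`) scores lower than one whose distribution is shifted (`far`), which is the sense in which lower FID and KID indicate synthetic images closer to real data.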
