Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks
Eugenio Lomurno, Matteo Matteucci

TL;DR
This paper introduces a Knowledge Recycling pipeline with Generative Knowledge Distillation to enhance synthetic data quality for training classifiers, achieving high performance and privacy preservation against membership inference attacks.
Contribution
The paper proposes the Knowledge Recycling (KR) pipeline and the Generative Knowledge Distillation (GKD) technique, which improve the utility and privacy of synthetic data used to train models, especially in medical imaging.
Findings
Models trained on synthetic data outperform models trained on real data in some cases.
The pipeline significantly reduces the performance gap between models trained on real data and models trained on synthetic data.
Models exhibit near-complete immunity to membership inference attacks.
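To make the last finding concrete: a common way to quantify exposure to membership inference is to check whether an attacker can tell training members from non-members using a per-sample score such as the model's confidence. The sketch below is illustrative only. This summary does not state which attack or metric the paper uses, and the function name `attack_auc` and the choice of AUC are assumptions. An AUC near 0.5 means the attacker does no better than chance.

```python
import numpy as np

def attack_auc(member_scores, nonmember_scores):
    """AUC of a score-threshold membership inference attack.

    Hypothetical helper: the scores could be, for example, the classifier's
    maximum softmax confidence on each sample. The value returned is the
    probability that a random member scores higher than a random non-member,
    with ties counted as one half.
    """
    m = np.asarray(member_scores, dtype=float)[:, None]     # shape (M, 1)
    n = np.asarray(nonmember_scores, dtype=float)[None, :]  # shape (1, N)
    # Compare every member score against every non-member score.
    return float(((m > n) + 0.5 * (m == n)).mean())
```

Under this reading, "near-complete immunity" would correspond to an attack AUC close to 0.5 when the scored model was trained on synthetic data and the members are the real images used to train the generator.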
Abstract
Generative artificial intelligence has transformed the generation of synthetic data, providing innovative solutions to challenges like data scarcity and privacy, which are particularly critical in fields such as medicine. However, the effective use of this synthetic data to train high-performance models remains a significant challenge. This paper addresses this issue by introducing Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers. At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information provided to classifiers through a synthetic dataset regeneration and soft labelling mechanism. The KR pipeline has been tested on a variety of datasets, with a focus on six highly heterogeneous medical image…
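The abstract describes GKD as combining synthetic dataset regeneration with soft labelling. A minimal sketch of one plausible reading follows: each training step draws a fresh synthetic batch from a generator and labels it with a teacher's softened class probabilities instead of hard labels. The names `generator_fn`, `teacher_logits_fn`, `regenerate_batch`, and the `temperature` parameter are hypothetical and do not come from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with an optional temperature (assumed, not from the paper)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def regenerate_batch(generator_fn, teacher_logits_fn, batch_size, rng, temperature=1.0):
    """Draw a fresh synthetic batch and attach soft labels from a teacher.

    generator_fn(batch_size, rng) -> array of synthetic samples (hypothetical)
    teacher_logits_fn(samples)    -> array of class logits (hypothetical)
    """
    samples = generator_fn(batch_size, rng)
    soft_labels = softmax(teacher_logits_fn(samples), temperature)
    return samples, soft_labels

def soft_cross_entropy(student_logits, soft_targets):
    """Cross-entropy between the student's predictions and soft targets."""
    log_probs = np.log(softmax(student_logits) + 1e-12)
    return float(-(soft_targets * log_probs).sum(axis=1).mean())
```

In a training loop, `regenerate_batch` would be called once per step or epoch so that the student classifier never sees the same synthetic images twice, and `soft_cross_entropy` would serve as its loss. Whether the paper regenerates at this granularity is not stated in this summary.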
Taxonomy
Topics: Machine Learning in Healthcare · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
Methods: Focus · Knowledge Distillation
