Towards Biologically Plausible and Private Gene Expression Data Generation
Dingfan Chen, Marie Oestreich, Tejumade Afonja, Raouf Kerkouche, Matthias Becker, Mario Fritz

TL;DR
This paper systematically evaluates differentially private (DP) generative models on real-world gene expression data. It finds that although the models retain reasonable utility on standard metrics, they fail to capture biological plausibility, which highlights the need for improved methods.
Contribution
It provides a comprehensive analysis of five representative DP gene expression generation methods, assessing their downstream utility, statistical properties, and biological plausibility, an evaluation that prior research lacks.
Findings
Most methods achieve reasonable utility on standard metrics.
None of the methods accurately capture biological characteristics.
Evaluations limited to standard metrics may overestimate how effective current DP models are on real-world biological data.
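To make the gap between standard metrics and biological characteristics concrete, here is a minimal sketch of one possible check: comparing gene-gene correlation (co-expression) structure between real and synthetic samples. This is an illustration under stated assumptions, not the evaluation protocol used in the paper. The function name `coexpression_gap`, the toy data, and the choice of Pearson correlation are all hypothetical.

```python
import numpy as np


def coexpression_gap(real: np.ndarray, synth: np.ndarray) -> float:
    """Mean absolute difference between gene-gene Pearson correlations.

    Both inputs are (samples x genes) matrices with the same gene order.
    Hypothetical helper for illustration; not the paper's metric.
    """
    corr_real = np.corrcoef(real, rowvar=False)
    corr_synth = np.corrcoef(synth, rowvar=False)
    # Compare each gene pair once (upper triangle, diagonal excluded).
    iu = np.triu_indices(real.shape[1], k=1)
    return float(np.mean(np.abs(corr_real[iu] - corr_synth[iu])))


rng = np.random.default_rng(0)

# Toy "real" data: 200 samples, 5 genes driven by one shared latent factor,
# so the genes are strongly co-expressed.
latent = rng.normal(size=(200, 1))
real = latent + 0.3 * rng.normal(size=(200, 5))

# Toy "synthetic" data: samples each gene independently from a normal
# distribution with that gene's real mean and standard deviation,
# so the co-expression structure is lost.
synth = rng.normal(loc=real.mean(axis=0), scale=real.std(axis=0), size=(200, 5))

gap_self = coexpression_gap(real, real)
gap_synth = coexpression_gap(real, synth)
print(f"gap(real, real)  = {gap_self:.3f}")
print(f"gap(real, synth) = {gap_synth:.3f}")
```

In this toy setup the synthetic data would look acceptable on per-gene summary statistics while showing a large co-expression gap, which is the kind of discrepancy the findings above describe.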
Abstract
Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how DP generative models perform in their natural application scenarios, specifically focusing on real-world gene expression data. We conduct a comprehensive analysis of five representative DP generation methods, examining them from various angles, such as downstream utility, statistical properties, and biological plausibility. Our extensive evaluation illuminates the unique characteristics of each DP generation method, offering critical insights into the strengths and weaknesses of each approach,…
Taxonomy
Topics: Advanced biosensing and bioanalysis techniques
