Differential Privacy in Machine Learning: A Survey from Symbolic AI to LLMs
Francisco Aguilera-Martínez, Fernando Berzal

TL;DR
This survey comprehensively reviews the evolution and application of differential privacy in machine learning, highlighting foundational concepts, integration methods, and practical evaluation techniques to promote secure AI development.
Contribution
It provides an extensive overview of differential privacy's theoretical foundations and its practical integration into machine learning models, covering recent advances from symbolic AI to large language models.
Findings
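Differential privacy limits information leakage by ensuring that including or excluding any single data point does not significantly alter an algorithm's output.
Existing proposals integrate DP into the training of ML models in several ways.
DP-based ML techniques can be evaluated in practice, and the survey describes how.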
Abstract
Machine learning models should not reveal particular information that is not otherwise accessible. Differential privacy provides a formal framework to mitigate privacy risks by ensuring that the inclusion or exclusion of any single data point does not significantly alter the output of an algorithm, thus limiting the exposure of private information. This survey reviews the foundational definitions of differential privacy and traces their evolution through key theoretical and applied contributions. It then provides an in-depth examination of how DP has been integrated into machine learning models, analyzing existing proposals and methods to preserve privacy when training ML models. Finally, it describes how DP-based ML techniques can be evaluated in practice. By offering a comprehensive overview of differential privacy in machine learning, this work aims to contribute to the ongoing…
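To make the abstract's informal definition concrete, the sketch below applies the classic Laplace mechanism to a counting query. It is an illustration under stated assumptions, not a method taken from the survey: the helper names (`laplace_noise`, `private_count`), the toy `ages` data, and the choice of epsilon = 1.0 are all hypothetical. The idea is that a count changes by at most 1 when one record is added or removed, so adding Laplace noise with scale 1/epsilon makes the output distributions on the two neighboring datasets hard to tell apart.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 (one record changes the true count
    by at most 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: two neighboring datasets that differ in one record.
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 38]
neighbor = ages[:-1]  # same data with the last record removed


def over_30(age: int) -> bool:
    return age > 30


noisy_full = private_count(ages, over_30, epsilon=1.0, rng=rng)
noisy_neighbor = private_count(neighbor, over_30, epsilon=1.0, rng=rng)
print(noisy_full, noisy_neighbor)
```

The true counts here are 4 and 3, but each released value is perturbed, so a single output does not reveal with certainty whether the removed record was present. Smaller values of epsilon add more noise and give stronger protection at the cost of accuracy.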
Taxonomy
Topics: Privacy-Preserving Technologies in Data
