An Algebraic Topological Approach to Privacy: Numerical and Categorical Data
Alberto Speranzon, Shaunak D. Bopardikar

TL;DR
This paper introduces an algebraic topological framework for achieving k-anonymity in databases, using simplicial complexes and persistent homology to analyze and optimize data anonymization.
Contribution
It presents a novel application of algebraic topology to data privacy, providing new algorithms and insights for anonymizing both metric and categorical data.
Findings
Topological methods can characterize k-anonymity conditions.
Weighted barcode diagrams help balance data privacy and utility.
Simulations demonstrate the effectiveness of the approach.
Abstract
In this paper, we cast the classic problem of achieving k-anonymity for a given database as a problem in algebraic topology. Using techniques from this field of mathematics, we propose a framework for k-anonymity that brings new insights and algorithms for anonymizing a database. We begin by addressing the simpler case in which the data lies in a metric space. This case is instrumental in introducing the main ideas and notation. Specifically, by mapping a database to Euclidean space and considering the distance between datapoints, we introduce a simplicial representation of the data and show how concepts from algebraic topology, such as the nerve complex and persistent homology, can be applied to efficiently obtain the entire spectrum of k-anonymity of the database for various values of k and levels of generalization. For this representation, we provide an analytic characterization of…
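The metric-space idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which relies on the nerve complex and persistent homology; it is a brute-force proxy under an assumption made here for illustration: the generalization level is a radius `eps`, and a database counts as k-anonymous at that level when every closed ball of radius `eps` centered at a datapoint contains at least k datapoints. The function names and the toy data are hypothetical.

```python
import math


def neighbor_counts(points, eps):
    """For each point, count the points (itself included) within distance eps."""
    return [
        sum(1 for q in points if math.dist(p, q) <= eps)
        for p in points
    ]


def anonymity_level(points, eps):
    """Largest k such that every eps-ball around a datapoint holds >= k points.

    Assumption: this ball-counting criterion stands in for k-anonymity at
    generalization level eps; the paper's exact definition may differ.
    """
    return min(neighbor_counts(points, eps))


def anonymity_spectrum(points, eps_values):
    """Map each generalization level eps to the anonymity level it achieves."""
    return {eps: anonymity_level(points, eps) for eps in eps_values}


# Toy database already mapped to the Euclidean plane (hypothetical data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
spectrum = anonymity_spectrum(points, [0.5, 1.0, 1.5, 8.0])
print(spectrum)
```

Sweeping `eps` in this way gives a crude picture of the trade-off the abstract describes: a larger radius (coarser generalization) yields a higher k, at the cost of utility. The brute-force count is quadratic in the number of datapoints, so it serves only as an illustration on small inputs.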
Taxonomy
Topics: Topological and Geometric Data Analysis · Privacy-Preserving Technologies in Data · Complexity and Algorithms in Graphs
