Uniqueness ratio as a predictor of a privacy leakage
Danah A. AlSalem AlKhashti

TL;DR
This paper proposes the uniqueness ratio of join attributes as an interpretable, pre-join indicator that can warn data engineers and database administrators of potential privacy leakage before databases are joined.
Contribution
It introduces the uniqueness ratio as a simple, effective pre-join metric for estimating re-identification risk, filling a gap in privacy risk assessment tools.
Findings
High pre-join uniqueness of the join attributes correlates strongly with increased post-join re-identification risk.
Uniqueness ratio provides an explainable signal for privacy risk assessment.
Experimental results validate the relationship between attribute uniqueness and identity exposure.
Abstract
Identity leakage can emerge when independent databases are joined, even when each dataset is anonymized individually. While previous work focuses on post-join detection or complex privacy models, little attention has been given to simple, interpretable pre-join indicators that can warn data engineers and database administrators before integration occurs. This study investigates the uniqueness ratio of candidate join attributes as an early predictor of re-identification risk. Using synthetic multi-table datasets, we compute the uniqueness ratio of attribute combinations within each database and examine how these ratios correlate with identity exposure after the join. Experimental results show a strong relationship between high pre-join uniqueness and increased post-join leakage, measured by the proportion of records that become uniquely identifiable or fall into very small groups. Our…
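The measurement described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the summary does not give the exact formula, so the sketch assumes the uniqueness ratio is the share of rows whose join-attribute combination occurs exactly once, and it measures post-join exposure with the same function applied to the joined table. The helper names (`group_sizes`, `small_group_share`, `inner_join`) and the toy tables are hypothetical and chosen only for illustration.

```python
from collections import Counter


def group_sizes(rows, attrs):
    """Count how many rows share each combination of values for attrs."""
    return Counter(tuple(row[a] for a in attrs) for row in rows)


def small_group_share(rows, attrs, k=2):
    """Fraction of rows whose attrs combination is shared by fewer than k rows.

    With k=2 this is the share of rows that are unique on attrs, which is
    the (assumed) uniqueness ratio used in this sketch.
    """
    if not rows:
        return 0.0
    sizes = group_sizes(rows, attrs)
    exposed = sum(n for n in sizes.values() if n < k)
    return exposed / len(rows)


def inner_join(left, right, keys):
    """Naive hash-based inner join of two lists of dicts on keys."""
    index = {}
    for r in right:
        index.setdefault(tuple(r[k] for k in keys), []).append(r)
    return [
        {**l, **r}
        for l in left
        for r in index.get(tuple(l[k] for k in keys), [])
    ]


# Made-up toy tables; every value here is illustrative only.
patients = [
    {"zip": "10001", "birth_year": 1980, "diagnosis": "A"},
    {"zip": "10001", "birth_year": 1980, "diagnosis": "B"},
    {"zip": "10002", "birth_year": 1975, "diagnosis": "C"},
    {"zip": "10003", "birth_year": 1990, "diagnosis": "A"},
]
voters = [
    {"zip": "10001", "birth_year": 1980, "name": "P1"},
    {"zip": "10002", "birth_year": 1975, "name": "P2"},
    {"zip": "10003", "birth_year": 1990, "name": "P3"},
    {"zip": "10004", "birth_year": 1990, "name": "P4"},
]
join_keys = ["zip", "birth_year"]

# Pre-join indicator, computed on each table separately.
pre_left = small_group_share(patients, join_keys)
pre_right = small_group_share(voters, join_keys)

# Post-join exposure, computed on the joined table.
joined = inner_join(patients, voters, join_keys)
post_unique = small_group_share(joined, join_keys, k=2)
post_small = small_group_share(joined, join_keys, k=3)

print(f"pre-join uniqueness: left={pre_left:.2f}, right={pre_right:.2f}")
print(f"post-join: unique={post_unique:.2f}, groups smaller than 3={post_small:.2f}")
```

The threshold `k` controls what counts as a "very small group"; the value 3 above is an arbitrary choice for the toy data. A different definition of the ratio, such as distinct combinations divided by row count, would slot into the same structure.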
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Privacy, Security, and Data Protection
