# A Concentration of Measure Approach to Database De-anonymization

**Authors:** Farhad Shirani, Siddharth Garg, Elza Erkip

arXiv: 1901.07655 · 2019-05-06

## TL;DR

This paper applies concentration of measure techniques to analyze the problem of matching correlated high-dimensional databases, deriving conditions for successful de-anonymization and demonstrating their tightness.

## Contribution

The paper uses concentration of measure theorems, such as typicality and laws of large numbers, to build a database matching scheme and to establish necessary and sufficient conditions for reliable de-anonymization.

## Key findings

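- Derived conditions for reliable database matching and showed that they are tight.
- Evaluated these conditions for independent and identically distributed database entries and for Markovian database models.
- Proved a converse result that characterizes a set of distributions on the database entries for which reliable matching is not possible.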

## Abstract

In this paper, matching of correlated high-dimensional databases is investigated. A stochastic database model is considered where the correlation among the database entries is governed by an arbitrary joint distribution. Concentration of measure theorems such as typicality and laws of large numbers are used to develop a database matching scheme and derive necessary conditions for successful matching. Furthermore, it is shown that these conditions are tight through a converse result which characterizes a set of distributions on the database entries for which reliable matching is not possible. The necessary and sufficient conditions for reliable matching are evaluated in the cases when the database entries are independent and identically distributed as well as under Markovian database models.
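To make the idea of typicality-based matching concrete, the following is a minimal sketch and not the paper's actual scheme. It assumes a toy setting that does not come from the paper: binary, uniformly distributed i.i.d. attributes, a second database that is a row-shuffled copy with each bit flipped independently with probability `flip_prob`, and a matcher that knows the joint distribution. Each record is paired with the record whose empirical joint type is closest (in L1 distance) to that known distribution. All function names and parameters are hypothetical.

```python
import random
from collections import Counter


def make_databases(n_users, n_attrs, flip_prob, rng):
    """Toy model (assumed): binary database plus a noisy, row-shuffled copy."""
    db1 = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(n_users)]
    perm = list(range(n_users))
    rng.shuffle(perm)  # perm[i] = row of db2 holding user i's noisy record
    db2 = [None] * n_users
    for i, row in enumerate(db1):
        # Flip each bit independently with probability flip_prob.
        db2[perm[i]] = [x ^ (rng.random() < flip_prob) for x in row]
    return db1, db2, perm


def joint_pmf(flip_prob):
    """Joint distribution of a matched attribute pair (x, y) in the toy model."""
    return {
        (x, y): 0.5 * (1 - flip_prob if x == y else flip_prob)
        for x in (0, 1)
        for y in (0, 1)
    }


def type_distance(row_x, row_y, pmf):
    """L1 distance between the empirical joint type of two rows and pmf."""
    counts = Counter(zip(row_x, row_y))
    n_attrs = len(row_x)
    return sum(abs(counts[pair] / n_attrs - p) for pair, p in pmf.items())


def match(db1, db2, pmf):
    """For each row of db1, pick the db2 row whose joint type best fits pmf."""
    return [
        min(range(len(db2)), key=lambda j: type_distance(row, db2[j], pmf))
        for row in db1
    ]
```

Calling `make_databases(20, 400, 0.1, random.Random(0))` and passing the result to `match` with `joint_pmf(0.1)` returns, for each row of the first database, the index of the row it was paired with in the second. The intuition is the law-of-large-numbers one from the abstract: with many attributes, the empirical type of a correctly matched pair concentrates around the joint distribution, while a mismatched pair looks like two independent rows.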

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.07655/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1901.07655/full.md

---
Source: https://tomesphere.com/paper/1901.07655