Database Matching Under Noisy Synchronization Errors
Serhat Bakirtas, Elza Erkip

TL;DR
This paper develops a unified framework for database matching that accounts for obfuscation and synchronization errors, providing theoretical conditions for successful re-identification under noisy, time-indexed data.
Contribution
It introduces replica detection and seeded deletion detection algorithms and derives conditions for matching databases under synchronization errors and obfuscation, clarifying the privacy risks of time-indexed data.
Findings
Derived necessary and sufficient conditions for successful database matching.
Proposed replica detection and seeded deletion detection algorithms.
Analyzed the impact of adversarial deletion, seedless matching, and zero-rate regimes.
Abstract
The re-identification or de-anonymization of users from anonymized data through matching with publicly available correlated user data has raised privacy concerns, leading to the complementary measure of obfuscation in addition to anonymization. Recent research provides a fundamental understanding of the conditions under which privacy attacks, in the form of database matching, are successful in the presence of obfuscation. Motivated by synchronization errors stemming from the sampling of time-indexed databases, this paper presents a unified framework considering both obfuscation and synchronization errors and investigates the matching of databases under noisy entry repetitions. By investigating different structures for the repetition pattern, replica detection and seeded deletion detection algorithms are devised and sufficient and necessary conditions for successful matching are derived.…
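To make the phrase "noisy entry repetitions" concrete, here is a minimal Python sketch of one possible reading of the setting. It assumes a binary database whose columns are time samples, a column-wise repetition pattern (a count of 0 deletes the column, a count above 1 replicates it), and independent bit-flip noise on every emitted entry. The function name `noisy_repeat`, the binary alphabet, and the bit-flip noise are illustrative assumptions, not the paper's definitions or algorithms.

```python
import random

def noisy_repeat(db, pattern, flip_prob, rng):
    """Apply a column repetition pattern, then independent bit-flip noise.

    db        : list of rows, each a list of 0/1 entries (assumed alphabet)
    pattern   : pattern[j] = number of copies of column j (0 means deleted)
    flip_prob : probability that each emitted entry is flipped (assumed noise)
    rng       : random.Random instance, passed in for reproducibility
    """
    out = []
    for row in db:
        new_row = []
        for value, copies in zip(row, pattern):
            for _ in range(copies):
                # Each replica is corrupted independently of the others.
                noisy = value ^ 1 if rng.random() < flip_prob else value
                new_row.append(noisy)
        out.append(new_row)
    return out

db = [[0, 1, 1],
      [1, 0, 0]]
pattern = [2, 0, 1]  # replicate column 0, delete column 1, keep column 2

# With no noise the output depends only on the repetition pattern.
clean = noisy_repeat(db, pattern, 0.0, random.Random(0))

# With noise, every row still has sum(pattern) entries, but values may differ.
noisy = noisy_repeat(db, pattern, 0.2, random.Random(0))
```

In this toy model, a matching attack would have to cope with both the unknown repetition pattern and the entry noise, which is the combination the abstract describes.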
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Internet Traffic Analysis and Secure E-voting
