Utility-Privacy Tradeoff in Databases: An Information-theoretic Approach
Lalitha Sankar, S. Raj Rajagopalan, and H. Vincent Poor

TL;DR
This paper introduces an information-theoretic framework to quantify and optimize the tradeoff between data utility and privacy in electronic databases, providing models, bounds, and encoding schemes.
Contribution
The paper develops an analytical model that gives tight bounds on the utility-privacy tradeoff, covering stochastic data models for categorical and numerical data, sanitization (encoding) schemes, and prior knowledge at the user and/or data source.
Findings
Derived utility-privacy tradeoff regions.
Proposed optimal encoding schemes.
Modeled prior knowledge impacts.
Abstract
Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an analytical framework that can quantify the safety of personally identifiable information (privacy) while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds of how much utility is possible for a given level of privacy and vice versa. Specific contributions include: i) stochastic data models for both categorical and numerical data; ii) utility-privacy tradeoff regions and the encoding (sanitization) schemes achieving them for both classes and their practical relevance; and iii) modeling of prior knowledge at the user and/or data source and optimal encoding schemes for both…
