Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs
Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

TL;DR
This paper introduces a formal approach to detecting and repairing proxy use of protected information in data-driven systems. It combines a definition of proxy use, a program analysis that detects it, and a repair transformation that removes it without significant loss of model accuracy.
Contribution
It presents a novel formal definition of proxy use, a program analysis method for detection, and a repair algorithm to eliminate inappropriate proxy use in models.
Findings
Effective detection of proxy use in social datasets
Ability to remove proxy use without significant accuracy loss
Detection surpasses existing techniques in challenging cases
Abstract
This paper presents an approach to formalizing and enforcing a class of use privacy properties in data-driven systems. In contrast to prior work, we focus on use restrictions on proxies (i.e., strong predictors) of protected information types. Our definition relates proxy use to intermediate computations that occur in a program, and identifies two essential properties that characterize this behavior: 1) its result is strongly associated with the protected information type in question, and 2) it is likely to causally affect the final output of the program. For a specific instantiation of this definition, we present a program analysis technique that detects instances of proxy use in a model, and provides a witness that identifies which parts of the corresponding program exhibit the behavior. Recognizing that not all instances of proxy use of a protected information type are inappropriate, we…
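The two properties named in the abstract (association with the protected type, and causal influence on the output) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy records, the function names, the use of normalized mutual information as the association measure, the swap-based intervention as the influence measure, and the thresholds `eps` and `delta`. It is not the paper's implementation.

```python
import math
from collections import Counter
from itertools import product

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def normalized_mutual_info(xs, ys):
    """Mutual information of xs and ys divided by the larger marginal entropy."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))
    return (hx + hy - hxy) / max(hx, hy)

# Hypothetical toy records: (zip_code, income, protected_attribute).
rows = [
    ("A", 30, 1), ("A", 80, 1), ("A", 60, 1), ("B", 40, 0),
    ("B", 90, 0), ("B", 70, 0), ("A", 75, 0), ("B", 55, 1),
]

def intermediate(row):
    """Intermediate computation inside the toy model: a zip-code test."""
    return row[0] == "A"

def output_given(inter, row):
    """Final model output, with the intermediate value passed in explicitly
    so that it can be replaced by an intervention."""
    return inter and row[1] > 50

def influence(rows):
    """Fraction of record pairs (x, x2) for which replacing the intermediate
    value of x with that of x2 changes the model output on x."""
    changed = sum(
        output_given(intermediate(x), x) != output_given(intermediate(x2), x)
        for x, x2 in product(rows, repeat=2)
    )
    return changed / len(rows) ** 2

def is_proxy_use(rows, eps, delta):
    """Flag the intermediate computation when both properties hold:
    association >= eps and influence >= delta (thresholds are assumptions)."""
    assoc = normalized_mutual_info(
        [intermediate(r) for r in rows], [r[2] for r in rows]
    )
    infl = influence(rows)
    return assoc >= eps and infl >= delta, assoc, infl

if __name__ == "__main__":
    flagged, assoc, infl = is_proxy_use(rows, eps=0.1, delta=0.3)
    print(f"association={assoc:.3f} influence={infl:.3f} flagged={flagged}")
```

Passing the intermediate value as an explicit argument is what lets the sketch intervene on it independently of the rest of the record. A computation that is associated with the protected attribute but never changes the output would not be flagged, and neither would an influential computation that is unrelated to the protected attribute.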
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI · Blockchain Technology Applications and Security
