I Prefer not to Say: Protecting User Consent in Models with Optional Personal Data
Tobias Leemann, Martin Pawelczyk, Christian Thomas Eberle, Gjergji Kasneci

TL;DR
This paper introduces Protected User Consent (PUC), a novel framework ensuring machine learning models respect users' choice to share optional personal data, balancing privacy with model performance.
Contribution
It formalizes privacy protection for models using only explicitly consented data, proposes PUC as a loss-optimal solution, and offers a data augmentation method with convergence guarantees.
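The summary does not spell out the augmentation procedure. As a rough illustration only, and assuming (this is not confirmed by the text above) that augmentation adds a consent-withheld copy of each record whose optional value was disclosed, a minimal sketch could look like the following. The `Record` type and the `augment_with_withheld_copies` helper are hypothetical names introduced here, not the authors' code.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical training record; field names are illustrative only."""
    base: Tuple[float, ...]      # features every user provides
    optional: Optional[float]    # None means the user did not consent to share
    label: int


def augment_with_withheld_copies(records: List[Record]) -> List[Record]:
    """Keep every record and add a copy with the optional value hidden
    for each record where it was disclosed (assumed augmentation scheme)."""
    augmented: List[Record] = []
    for rec in records:
        augmented.append(rec)
        if rec.optional is not None:
            # Same base features and label, optional value withheld.
            augmented.append(replace(rec, optional=None))
    return augmented


records = [
    Record(base=(0.2, 1.0), optional=3.5, label=1),   # user shared the value
    Record(base=(0.7, 0.0), optional=None, label=0),  # user withheld the value
]
augmented = augment_with_withheld_copies(records)
```

Under this assumed scheme, records with a missing optional value are no longer drawn only from users who opted out, which is one way a model could avoid treating the decision not to share as a signal.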
Findings
PUC effectively protects user privacy without sacrificing model accuracy.
Models can leverage additional data while respecting explicit user consent.
The proposed approach is validated on real datasets and various models.
Abstract
We examine machine learning models in a setup where individuals have the choice to share optional personal information with a decision-making system, as seen in modern insurance pricing models. Some users consent to their data being used whereas others object and keep their data undisclosed. In this work, we show that the decision not to share data can be considered as information in itself that should be protected to respect users' privacy. This observation raises the overlooked problem of how to ensure that users who protect their personal data do not suffer any disadvantages as a result. To address this problem, we formalize protection requirements for models which only use the information for which active user consent was obtained. This excludes implicit information contained in the decision to share data or not. We offer the first solution to this problem by proposing the notion of…
Taxonomy
Topics: Privacy, Security, and Data Protection · Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI
Methods: Logistic Regression
