The Price of Privacy in Untrusted Recommendation Engines
Siddhartha Banerjee, Nidhi Hegde, Laurent Massoulié

TL;DR
This paper investigates the impact of local differential privacy on the accuracy of recommender systems, revealing a fundamental trade-off between privacy and the amount of user data needed for effective item-clustering.
Contribution
It introduces new bounds on the sample complexity of private item-clustering, develops a novel algorithm, MaxSense, and contrasts the information-rich and information-scarce regimes.
Findings
Spectral clustering achieves optimal sample complexity in the information-rich regime.
The MaxSense algorithm is optimal in the information-scarce regime.
New techniques are introduced for bounding mutual information under channel mismatch.
Abstract
Recent increase in online privacy concerns prompts the following question: can a recommender system be accurate if users do not entrust it with their private data? To answer this, we study the problem of learning item-clusters under local differential privacy, a powerful, formal notion of data privacy. We develop bounds on the sample-complexity of learning item-clusters from privatized user inputs. Significantly, our results identify a sample-complexity separation between learning in an information-rich and an information-scarce regime, thereby highlighting the interaction between privacy and the amount of information (ratings) available to each user. In the information-rich regime, where each user rates at least a constant fraction of items, a spectral clustering approach is shown to achieve a sample-complexity lower bound derived from a simple information-theoretic argument based on…
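As a rough illustration of what "privatized user inputs" can look like, the sketch below applies randomized response, a standard mechanism that satisfies ε-local differential privacy for a single bit, to binary ratings. It is not the paper's mechanism and not the MaxSense algorithm; the function names `randomized_response` and `debiased_mean` and the binary-rating model are assumptions made purely for illustration.

```python
import math
import random


def randomized_response(bit, epsilon, rng=random):
    """Privatize one binary rating (hypothetical helper).

    The true bit is reported with probability e^eps / (1 + e^eps) and
    flipped otherwise, which satisfies eps-local differential privacy.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def debiased_mean(reports, epsilon):
    """Unbiased estimate of the fraction of true 1s from privatized reports.

    If q is the true fraction, E[report] = q * (2p - 1) + (1 - p),
    so we invert that affine map. Requires epsilon > 0.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_ratings = [1] * 7000 + [0] * 3000  # synthetic data, 70% positive
    reports = [randomized_response(b, 1.0, rng) for b in true_ratings]
    print(debiased_mean(reports, 1.0))  # a noisy estimate of 0.7
```

The estimator's variance grows as ε shrinks, because the factor 2p − 1 in the denominator approaches zero. This gives a small-scale view of why stronger privacy requires more user data for the same accuracy.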
Taxonomy
Methods: Spectral Clustering
