Differential Data Analysis for Recommender Systems
Richard Chow, Hongxia Jin, Bart Knijnenburg, Gokay Saldamli

TL;DR
This paper introduces differential data analysis to identify important data in recommender systems, aiming to improve privacy and reduce storage costs while maintaining accuracy.
Contribution
It proposes a novel differential data analysis technique to assess data importance, demonstrating its effectiveness on location and rating datasets for privacy and efficiency benefits.
Findings
Significant data reduction achievable without loss of accuracy
User ratings and location attributes influence data importance
Enhanced privacy levels with maintained recommendation quality
Abstract
We present techniques to characterize which data is important to a recommender system and which is not. Important data is data that contributes most to the accuracy of the recommendation algorithm, while less important data contributes less to the accuracy or even decreases it. Characterizing the importance of data has two potential direct benefits: (1) increased privacy and (2) reduced data management costs, including storage. For privacy, we enable increased recommendation accuracy for comparable privacy levels using existing data obfuscation techniques. For storage, our results indicate that we can achieve large reductions in recommendation data and yet maintain recommendation accuracy. Our main technique is called differential data analysis. The name is inspired by other sorts of differential analysis, such as differential power analysis and differential cryptanalysis, where…
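The abstract above is cut off before it describes the technique, so the following is only a minimal sketch of the general idea that data is "important" when it contributes to accuracy. It is not the authors' method. It assumes a toy item-mean recommender and a leave-one-out comparison; the names `item_mean_predictor`, `rmse`, and `importance_scores`, and the sample ratings, are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def item_mean_predictor(ratings, default=3.0):
    """Toy recommender (an assumption): predict each item's mean training rating."""
    totals = defaultdict(lambda: [0.0, 0])
    for _user, item, value in ratings:
        totals[item][0] += value
        totals[item][1] += 1
    means = {item: s / n for item, (s, n) in totals.items()}
    # Items with no training ratings fall back to the default value.
    return lambda item: means.get(item, default)

def rmse(train, test):
    """Root-mean-square error on held-out ratings after fitting on train."""
    predict = item_mean_predictor(train)
    return sqrt(sum((predict(i) - v) ** 2 for _u, i, v in test) / len(test))

def importance_scores(train, test):
    """Leave-one-out score per training rating: error increase when it is removed.

    Positive score: removing the rating hurts accuracy (important data).
    Negative score: removing the rating helps accuracy (data that decreases it).
    """
    base = rmse(train, test)
    return [rmse(train[:k] + train[k + 1:], test) - base
            for k in range(len(train))]

# Hypothetical (user, item, rating) triples for illustration only.
train = [("u1", "a", 5.0), ("u2", "a", 1.0), ("u3", "b", 4.0)]
test = [("u4", "a", 5.0), ("u5", "b", 4.0)]
scores = importance_scores(train, test)
```

In this toy data, the first rating scores positive (removing it raises the error), while the second scores negative (removing it lowers the error), mirroring the abstract's distinction between data that contributes to accuracy and data that decreases it.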
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Privacy, Security, and Data Protection
