Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy
Thông T. Nguyên, Xiaokui Xiao, Yin Yang, Siu Cheung Hui, Hyejin Shin, Junbum Shin

TL;DR
Harmony is a practical system that enables organizations to collect and analyze multi-dimensional user data with local differential privacy, balancing privacy guarantees with utility for statistical and machine learning tasks.
Contribution
We introduce Harmony, a system that supports practical, accurate data collection and analysis under local differential privacy for complex data types and machine learning models.
Findings
Harmony achieves high accuracy in statistical estimates.
Harmony effectively supports machine learning tasks with privacy guarantees.
Experiments confirm Harmony's practicality and efficiency.
Abstract
Organizations with a large user base, such as Samsung and Google, can potentially benefit from collecting and mining users' data. However, doing so raises privacy concerns, and risks accidental privacy breaches with serious consequences. Local differential privacy (LDP) techniques address this problem by only collecting randomized answers from each user, with guarantees of plausible deniability; meanwhile, the aggregator can still build accurate models and predictors by analyzing large amounts of such randomized data. So far, existing LDP solutions either have severely restricted functionality, or focus mainly on theoretical aspects such as asymptotical bounds rather than practical usability and performance. Motivated by this, we propose Harmony, a practical, accurate and efficient system for collecting and analyzing data from smart device users, while satisfying LDP. Harmony applies to…
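To make the idea of "randomized answers" with plausible deniability concrete, here is a minimal sketch of classic binary randomized response, a textbook LDP mechanism. It is not Harmony's own mechanism, which the abstract does not specify; the function names, the privacy budget `epsilon = 1.0`, and the synthetic population are all illustrative assumptions. Each user reports their true bit with probability e^ε / (e^ε + 1) and the flipped bit otherwise, and the aggregator corrects the observed frequency for the known flip rate.

```python
import math
import random


def randomize_bit(bit, epsilon, rng):
    """User side: keep the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports, epsilon):
    """Aggregator side: unbiased estimate of the true fraction of 1s.

    If f is the true fraction and p the probability of a truthful report,
    the expected observed fraction is (1 - p) + f * (2p - 1); solve for f.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical population: 30% of 100,000 users hold the sensitive bit 1.
rng = random.Random(0)
true_bits = [1] * 30000 + [0] * 70000
epsilon = 1.0  # illustrative privacy budget, not a value from the paper

reports = [randomize_bit(b, epsilon, rng) for b in true_bits]
estimate = estimate_fraction(reports, epsilon)
print(f"estimated fraction of 1s: {estimate:.3f}")
```

No single report reveals a user's true bit with certainty, yet the estimate should land close to the true fraction because the noise averages out over many users. Smaller values of `epsilon` give stronger deniability at the cost of a noisier estimate.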
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Mobile Crowdsensing and Crowdsourcing
