Differentially Private $K$-means Clustering Applied to Meter Data Analysis and Synthesis
Nikhil Ravi, Anna Scaglione, Sachin Kadam, Reinhard Gentz, Sean Peisert, Brent Lunghino, Emmanuel Levijarvi, and Aram Shumavon

TL;DR
This paper develops a differentially private $K$-means clustering method for smart meter data. The method enables privacy-preserving analysis and synthetic data generation, and it addresses the shortcomings of current policies such as the 15/15 rule.
Contribution
It introduces a novel differentially private $K$-means algorithm tailored for load data, facilitating privacy-preserving customer segmentation and synthetic data creation.
Findings
Successfully clusters load data while preserving privacy.
Generates synthetic load data consistent with original data.
Maintains utility in real-world power load analysis.
Abstract
The proliferation of smart meters has resulted in a large amount of data being generated. It is increasingly apparent that methods are required for allowing a variety of stakeholders to leverage the data in a manner that preserves the privacy of the consumers. The sector is scrambling to define policies, such as the so-called "15/15 rule", to respond to the need. However, the current policies fail to adequately guarantee privacy. In this paper, we address the problem of allowing third parties to apply $K$-means clustering, obtaining customer labels and centroids for a set of load time series by applying the framework of differential privacy. We leverage the method to design an algorithm that generates differentially private synthetic load data consistent with the labeled data. We test our algorithm's utility by answering summary statistics such as average daily load profiles for a…
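To make the idea of differentially private $K$-means concrete, the following is a minimal sketch of a generic Laplace-noised variant of Lloyd's iterations. It is not the algorithm proposed in the paper, whose details are not given in this summary. The function name `dp_kmeans`, the even split of the privacy budget across iterations and between counts and sums, and the assumption that every load profile has been normalized to lie in $[0, 1]^d$ are all illustrative choices.

```python
import numpy as np


def dp_kmeans(X, k, epsilon, n_iters=5, seed=0):
    """Hypothetical sketch: Lloyd's iterations with Laplace-noised statistics.

    Assumes each row of X is a load profile scaled into [0, 1]^d and that
    neighboring datasets differ by adding or removing one row.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    X = np.clip(X, 0.0, 1.0)  # enforce the assumed bound on every record

    # Assumed budget split: equal share per iteration, then half of each
    # share for the cluster counts and half for the cluster sums.
    eps_part = epsilon / (2 * n_iters)

    # Data-independent initialization, so it consumes no privacy budget.
    centroids = rng.uniform(0.0, 1.0, size=(k, d))

    for _ in range(n_iters):
        # Assign every record to its nearest current centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)

        for j in range(k):
            members = X[labels == j]
            # One record changes a single cluster count by at most 1.
            noisy_count = len(members) + rng.laplace(scale=1.0 / eps_part)
            # One record changes a single cluster sum by at most d in L1 norm.
            noisy_sum = members.sum(axis=0) + rng.laplace(
                scale=d / eps_part, size=d
            )
            # Keep the previous centroid when the noisy count is too small.
            if noisy_count >= 1.0:
                centroids[j] = np.clip(noisy_sum / noisy_count, 0.0, 1.0)

    return centroids
```

A call such as `dp_kmeans(profiles, k=3, epsilon=1.0)`, where `profiles` is a hypothetical array of normalized daily load profiles, returns noisy centroids. Only the centroids are released in this sketch; per-customer labels would need their own privacy analysis before being shared.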
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Mobile Crowdsensing and Crowdsourcing
