Protecting Time Series Data with Minimal Forecast Loss
Matthew J. Schneider, Jinwook Lee

TL;DR
This paper introduces novel methods for anonymizing time series data that effectively protect privacy while preserving forecast accuracy, addressing legislative compliance and practical utility.
Contribution
It develops $k$-nTS Swapping and $k$-mTS Shuffling techniques with an optimization framework to minimize forecast loss during data anonymization.
Findings
Maintains forecast accuracy comparable to original data
Effectively prevents privacy breaches in time series data
Applicable to large datasets with minimal pattern distortion
Abstract
Forecasting could be negatively impacted due to anonymization requirements in data protection legislation. To measure the potential severity of this problem, we derive theoretical bounds for the loss to forecasts from additive exponential smoothing models using protected data. Following the guidelines of anonymization from the General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA), we develop the $k$-nearest Time Series ($k$-nTS) Swapping and $k$-means Time Series ($k$-mTS) Shuffling methods to create protected time series data that minimizes the loss to forecasts while preventing a data intruder from detecting privacy issues. For efficient and effective decision making, we formally model an integer programming problem for a perfect matching for simultaneous data swapping in each cluster. We call it a two-party data privacy framework since our optimization…
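The perfect-matching idea in the abstract can be illustrated with a toy sketch. This is not the authors' $k$-nTS or $k$-mTS algorithm: it assumes a single cluster of four short series, swaps only the final observation within each matched pair, finds the pairing by exhaustive search rather than integer programming, and uses simple exponential smoothing with an assumed parameter `ALPHA`. All names (`ses_forecast`, `min_cost_pairing`, `cluster`) are hypothetical.

```python
ALPHA = 0.3  # assumed smoothing parameter, chosen only for illustration


def ses_forecast(series, alpha=ALPHA):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def min_cost_pairing(values):
    """Exhaustively find the pairing of indices (a perfect matching)
    with the smallest total absolute difference. Only practical for
    small, even-sized clusters."""
    def solve(remaining):
        if not remaining:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best_cost, best_pairs = float("inf"), None
        for j, partner in enumerate(rest):
            sub_cost, sub_pairs = solve(rest[:j] + rest[j + 1:])
            cost = abs(values[first] - values[partner]) + sub_cost
            if cost < best_cost:
                best_cost, best_pairs = cost, [(first, partner)] + sub_pairs
        return best_cost, best_pairs
    return solve(list(range(len(values))))


# Hypothetical cluster of four series (made-up numbers).
cluster = [
    [10, 12, 11, 13],
    [20, 21, 19, 22],
    [10, 11, 12, 14],
    [21, 20, 22, 23],
]

# Match series on their last observation, then swap within each pair.
last_values = [s[-1] for s in cluster]
cost, pairs = min_cost_pairing(last_values)

protected = [list(s) for s in cluster]
for i, j in pairs:
    protected[i][-1], protected[j][-1] = cluster[j][-1], cluster[i][-1]

# Compare forecasts from original and protected data.
for original, prot in zip(cluster, protected):
    print(ses_forecast(original), ses_forecast(prot))
```

In this simplified setting, changing only the last observation by an amount $d$ shifts the simple exponential smoothing forecast by exactly $\alpha d$, which is why pairing series with similar values keeps the forecast change small.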
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Traffic Prediction and Management Techniques
