Protecting Time Series Data with Minimal Forecast Loss

Matthew J. Schneider; Jinwook Lee

arXiv:2106.16085 · cs.CR · March 8, 2023

TL;DR

This paper introduces two methods for anonymizing time series data, $k$-nTS Swapping and $k$-mTS Shuffling, that protect privacy while minimizing the loss in forecast accuracy, following the anonymization guidelines of the GDPR and CCPA.

Contribution

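The paper develops $k$-nearest Time Series ($k$-nTS) Swapping and $k$-means Time Series ($k$-mTS) Shuffling, together with an integer programming formulation of a perfect matching for data swapping within each cluster, to minimize forecast loss during anonymization.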
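To make the notion of forecast loss concrete, the following is a minimal sketch, not the authors' algorithm. It assumes simple exponential smoothing as the forecasting model and a toy nearest-neighbor value swap as the protection step. The function names `ses_forecast`, `swap_nearest`, and `forecast_loss`, the Euclidean distance, and the parameters `alpha`, `k`, and `seed` are all hypothetical choices made for illustration.

```python
import random


def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def swap_nearest(data, k=2, seed=0):
    """Toy protection step (an assumption, not the paper's k-nTS):
    at each time t, replace a series' value with the value at t
    from one of its k nearest series under squared Euclidean distance."""
    rng = random.Random(seed)
    n = len(data)
    protected = []
    for i in range(n):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(data[i], data[j])), j)
            for j in range(n)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        protected.append(
            [data[rng.choice(neighbors)][t] for t in range(len(data[i]))]
        )
    return protected


def forecast_loss(original, protected, alpha=0.3):
    """Absolute difference between forecasts from original and protected data."""
    return [
        abs(ses_forecast(o, alpha) - ses_forecast(p, alpha))
        for o, p in zip(original, protected)
    ]
```

Calling `forecast_loss(data, swap_nearest(data))` on a list of equal-length series gives one loss value per series. Series that sit close to their neighbors lose little forecast accuracy under this kind of swap, which is the intuition behind swapping among similar series.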

Findings

1. Maintains forecast accuracy comparable to original data
2. Effectively prevents privacy breaches in time series data
3. Applicable to large datasets with minimal pattern distortion

Abstract

Forecasting could be negatively impacted due to anonymization requirements in data protection legislation. To measure the potential severity of this problem, we derive theoretical bounds for the loss to forecasts from additive exponential smoothing models using protected data. Following the guidelines of anonymization from the General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA), we develop the $k$-nearest Time Series ($k$-nTS) Swapping and $k$-means Time Series ($k$-mTS) Shuffling methods to create protected time series data that minimizes the loss to forecasts while preventing a data intruder from detecting privacy issues. For efficient and effective decision making, we formally model an integer programming problem for a perfect matching for simultaneous data swapping in each cluster. We call it a two-party data privacy framework since our optimization…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Traffic Prediction and Management Techniques