Embedding-Space Data Augmentation to Prevent Membership Inference Attacks in Clinical Time Series Forecasting
Marius Fracarolli, Michael Staniek, Stefan Riezler

TL;DR
This paper investigates how embedding-space data augmentation techniques, especially ZOO-PCA, can effectively defend clinical time series forecasting models against membership inference attacks while maintaining high predictive accuracy.
Contribution
It introduces and evaluates multiple synthetic data augmentation strategies, demonstrating that ZOO-PCA significantly reduces attack success rates without compromising model performance.
Findings
ZOO-PCA reduces the attacker's TPR/FPR ratio in membership inference attacks
Synthetic data augmentation maintains forecasting accuracy
ZOO-PCA outperforms other augmentation methods
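To make the TPR/FPR metric in the findings concrete, here is a minimal sketch of how the ratio could be computed for a simple loss-threshold attack, where a sample is guessed to be a training member if its loss falls below a threshold. The function name, loss values, and threshold are hypothetical illustrations, not figures or code from the paper.

```python
def tpr_fpr(member_losses, nonmember_losses, threshold):
    """Loss-threshold attack: guess 'member' when the loss is below threshold.

    Assumed setup for illustration only; the paper's exact attack may differ.
    """
    true_pos = sum(1 for loss in member_losses if loss < threshold)
    false_pos = sum(1 for loss in nonmember_losses if loss < threshold)
    tpr = true_pos / len(member_losses)
    fpr = false_pos / len(nonmember_losses)
    return tpr, fpr


# Hypothetical per-sample forecasting losses (not data from the paper).
member_losses = [0.05, 0.08, 0.10, 0.30]      # samples seen in training
nonmember_losses = [0.09, 0.25, 0.40, 0.50]   # held-out samples

tpr, fpr = tpr_fpr(member_losses, nonmember_losses, threshold=0.12)
ratio = tpr / fpr if fpr > 0 else float("inf")
print(f"TPR={tpr:.2f} FPR={fpr:.2f} TPR/FPR={ratio:.2f}")
```

A ratio near 1 means the attacker's "member" guesses are no better than chance at that threshold, which is the direction a successful defense pushes the metric.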
Abstract
Balancing strong privacy guarantees with high predictive performance is critical for time series forecasting (TSF) tasks involving Electronic Health Records (EHR). In this study, we explore how data augmentation can mitigate Membership Inference Attacks (MIA) on TSF models. We show that retraining with synthetic data can substantially reduce the effectiveness of loss-based MIAs by reducing the attacker's true-positive to false-positive ratio. The key challenge is generating synthetic samples that closely resemble the original training data to confuse the attacker, while also introducing enough novelty to enhance the model's ability to generalize to unseen data. We examine multiple augmentation strategies - Zeroth-Order Optimization (ZOO), a variant of ZOO constrained by Principal Component Analysis (ZOO-PCA), and MixUp - to strengthen model resilience without sacrificing accuracy. Our…
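Of the augmentation strategies named in the abstract, MixUp is the simplest to illustrate. The sketch below shows standard MixUp (a convex combination of two samples with a Beta-distributed weight), applied here to embedding vectors and forecast targets. Operating on plain Python lists, the `alpha` value, and all names are assumptions for illustration. The paper's actual embedding-space variant is not specified in this summary.

```python
import random


def mixup(emb_a, emb_b, target_a, target_b, alpha=0.4, rng=random):
    """Standard MixUp: interpolate two samples with lam ~ Beta(alpha, alpha).

    Assumption: embeddings and targets are equal-length lists of floats.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_emb = [lam * a + (1 - lam) * b for a, b in zip(emb_a, emb_b)]
    mixed_target = [lam * a + (1 - lam) * b for a, b in zip(target_a, target_b)]
    return mixed_emb, mixed_target, lam


# Hypothetical embeddings and forecast targets for two training samples.
rng = random.Random(0)  # seeded for reproducibility
emb_1, emb_2 = [0.2, 1.0, -0.5], [0.6, 0.0, 0.5]
y_1, y_2 = [70.0, 72.0], [90.0, 88.0]

new_emb, new_y, lam = mixup(emb_1, emb_2, y_1, y_2, alpha=0.4, rng=rng)
print(lam, new_emb, new_y)
```

Each synthetic sample lies on the line segment between two real training samples, so it stays close to the training distribution while not being an exact copy of any record.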
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
