Evaluating Privacy-Utility Tradeoffs in Synthetic Smart Grid Data
Andre Catarino, Rui Melo, Rui Abreu, Luis Cruz

TL;DR
This paper compares four synthetic data generation methods for smart grid data on utility, fidelity, and privacy, and finds that diffusion models achieve the highest classification utility while CTGAN offers the strongest resistance to reconstruction attacks.
Contribution
It provides a comprehensive comparison of four synthetic data methods for smart grid applications, highlighting the importance of architectural design in balancing utility and privacy.
Findings
Diffusion models achieve up to 88.2% macro-F1 in utility.
CTGAN offers the strongest resistance to reconstruction attacks among the four methods.
Architectural choices significantly impact utility and privacy tradeoffs.
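For readers unfamiliar with the metric quoted above, macro-F1 is the unweighted mean of the per-class F1 scores, so a minority class counts as much as a majority one. The following is a minimal sketch of that definition in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper, not from the paper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical labels (1 = household benefits from dToU).
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
score = macro_f1(y_true, y_pred)
```

Here class 0 has F1 = 0.8 and class 1 has F1 = 2/3, so the macro average sits between them regardless of how many samples each class has.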
Abstract
The widespread adoption of dynamic Time-of-Use (dToU) electricity tariffs requires accurately identifying households that would benefit from such pricing structures. However, the use of real consumption data poses serious privacy concerns, motivating the adoption of synthetic alternatives. In this study, we conduct a comparative evaluation of four synthetic data generation methods, Wasserstein-GP Generative Adversarial Networks (WGAN), Conditional Tabular GAN (CTGAN), Diffusion Models, and Gaussian noise augmentation, under different synthetic regimes. We assess classification utility, distribution fidelity, and privacy leakage. Our results show that architectural design plays a key role: diffusion models achieve the highest utility (macro-F1 up to 88.2%), while CTGAN provides the strongest resistance to reconstruction attacks. These findings highlight the potential of structured…
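Of the four methods named in the abstract, Gaussian noise augmentation is the simplest to illustrate: each synthetic record is a real record with independent zero-mean Gaussian noise added to every feature. The sketch below is an assumption about the general technique, not the paper's implementation; the function name `gaussian_augment`, the noise scale `sigma`, and the toy consumption values are all hypothetical.

```python
import random


def gaussian_augment(rows, sigma=0.05, seed=0):
    """Return a noisy copy of `rows` (hypothetical helper, not from the paper).

    rows  : list of feature vectors, e.g. normalised half-hourly consumption
    sigma : standard deviation of the additive noise (assumed value)
    seed  : seed for reproducibility
    """
    rng = random.Random(seed)
    # Add independent N(0, sigma^2) noise to every feature of every record.
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in rows]


# Toy example: two households, three normalised readings each (made-up values).
real = [[0.20, 0.35, 0.50], [0.10, 0.15, 0.90]]
synthetic = gaussian_augment(real, sigma=0.05, seed=42)
```

A larger `sigma` moves synthetic records further from the real ones, which tends to reduce both fidelity and the risk of reconstructing an original record; that is the kind of tradeoff the evaluation measures.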
Taxonomy
Topics: Smart Grid Security and Resilience · Internet Traffic Analysis and Secure E-voting · Privacy-Preserving Technologies in Data
