Data anonymization in the presence of outliers via invariant coordinate selection
Katariina Perkonoja, Joni Virta

TL;DR
This paper introduces ICSA, a robust data anonymization method based on invariant coordinate selection (ICS). In the presence of outliers, ICSA provides stronger privacy protection than spectral anonymization (SA) while keeping data utility comparable or better.
Contribution
The paper proposes ICSA, a novel robust latent space anonymization technique based on invariant coordinate selection. It addresses the vulnerability to outliers of PCA-based methods such as spectral anonymization.
Findings
ICSA achieves stronger privacy protection than spectral anonymization.
ICSA maintains comparable or improved data utility in contaminated datasets.
Empirical results show ICSA's superior privacy-utility performance on clinical data.
Abstract
Protecting confidential data while preserving utility is particularly challenging when data sets contain outlying observations. Existing latent space anonymization methods, such as spectral anonymization (SA), rely on principal component analysis (PCA) and may therefore be vulnerable to contamination. We investigate anonymization in the presence of outliers and propose ICSA, a robust alternative to SA based on invariant coordinate selection (ICS). By replacing the PCA transformation with ICS, the robustness of the anonymization procedure can be regulated through the choice of scatter matrices. Alongside the methodological development, we derive a theoretical result showing that SA fails under sufficiently influential outliers. To assess the practical implications of this result, we compare the privacy-utility trade-off of ICSA and SA through simulation experiments under varying…
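To make the idea of swapping the PCA transformation for ICS more concrete, the following is a minimal NumPy sketch of an ICS transformation. It assumes the classical scatter pair of the sample covariance matrix and the fourth-moment scatter matrix (often called COV4); the abstract only says that robustness is regulated through the choice of scatter matrices, so this pair, the function names `cov4` and `ics`, and the toy data are illustrative assumptions rather than the authors' configuration. The sketch covers only the coordinate transformation, not the anonymization step of ICSA.

```python
import numpy as np


def cov4(X):
    """Fourth-moment scatter matrix (assumed second scatter; not confirmed by the source)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Squared Mahalanobis distance of each observation from the sample mean.
    r2 = np.einsum("ij,jk,ik->i", Xc, S1_inv, Xc)
    # Outer products weighted by squared distance, scaled by 1 / (n * (p + 2)).
    return (Xc * r2[:, None]).T @ Xc / (n * (p + 2))


def ics(X):
    """Invariant coordinates from the scatter pair (covariance, COV4)."""
    S1 = np.cov(X, rowvar=False)
    S2 = cov4(X)
    # Symmetric inverse square root of the first scatter matrix.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Eigendecomposition of the second scatter after whitening with the first.
    rho, U = np.linalg.eigh(S1_inv_sqrt @ S2 @ S1_inv_sqrt)
    order = np.argsort(rho)[::-1]  # largest generalized eigenvalue first
    B = S1_inv_sqrt @ U[:, order]  # columns are the ICS directions
    Z = (X - X.mean(axis=0)) @ B   # invariant coordinates
    return Z, B, rho[order]


# Toy data (hypothetical): Gaussian bulk with a few shifted observations.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:5] += 8.0

Z, B, rho = ics(X)
```

With this construction the invariant coordinates have identity sample covariance, and the generalized eigenvalues `rho` do not change when the data undergo an invertible linear transformation, which is the affine invariance that gives ICS its name. A more robust scatter pair could be substituted for the two matrices used here.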
