NeuroMixGDP: A Neural Collapse-Inspired Random Mixup for Private Data Release

Donghao Li; Yang Cao; Yuan Yao

arXiv:2202.06467·cs.LG·December 6, 2023

TL;DR

NeuroMixGDP introduces a Neural Collapse-inspired mixup technique that uses Gaussian Differential Privacy to improve the utility of privately released data, especially for datasets with many classes, and reports better utility than DPSGD on CIFAR100 and MiniImagenet.

Contribution

The paper proposes a mixup scheme that exploits the Equiangular Tight Frame (ETF) structure of Neural Collapse features and combines it with hierarchical sampling and Gaussian Differential Privacy (GDP) to improve the utility of privacy-preserving data release.
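To make the ETF structure concrete, here is a minimal NumPy sketch of the standard simplex ETF construction. The function name `simplex_etf` and the choice of 10 classes are illustrative assumptions, not taken from the paper or its repository. The sketch only checks the defining geometry: unit-norm vertices with equal pairwise inner products of -1/(K-1).

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a (K, K) matrix whose rows are unit-norm simplex ETF vertices.

    Hypothetical helper for illustration; not the paper's code.
    """
    k = num_classes
    # Center the standard basis, then rescale so every row has unit norm.
    centered = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * centered


etf = simplex_etf(10)
gram = etf @ etf.T  # pairwise inner products between class vertices
off_diagonal = gram[~np.eye(10, dtype=bool)]
```

Under Neural Collapse, the class means of last-layer features are expected to approach such a configuration (up to rotation and scaling), which is the structure the mixup scheme is said to exploit.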

Findings

01. Significantly improves utility over DPSGD on CIFAR100 and MiniImagenet.

02. Effectively protects against attacks while maintaining data utility.

03. Addresses label collapse with hierarchical stratified sampling.

Abstract

Privacy-preserving data release algorithms have gained increasing attention for their ability to protect user privacy while enabling downstream machine learning tasks. However, the utility of current popular algorithms is not always satisfactory. Mixup of raw data provides a new way of data augmentation, which can help improve utility. However, its performance drastically deteriorates when differential privacy (DP) noise is added. To address this issue, this paper draws inspiration from the recently observed Neural Collapse (NC) phenomenon, which states that the last layer features of a neural network concentrate on the vertices of a simplex as Equiangular Tight Frame (ETF). We propose a scheme to mixup the Neural Collapse features to exploit the ETF simplex structure and release noisy mixed features to enhance the utility of the released data. By using Gaussian Differential Privacy…
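The idea of releasing noisy mixed features can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function `noisy_mixup`, its parameters, the clipping step, and the noise scales (`sigma * clip_norm / m` for features, `sigma / m` for labels) are all hypothetical choices. No privacy accounting is performed, so `sigma` is not calibrated to any GDP budget, and hierarchical sampling is not included.

```python
import numpy as np


def noisy_mixup(features, labels, num_classes, m, num_mixed, clip_norm, sigma, rng):
    """Average m random samples per output and add Gaussian noise (illustrative only)."""
    n, d = features.shape
    # Clip each feature vector to L2 norm at most clip_norm to bound sensitivity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    clipped = features * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    one_hot = np.eye(num_classes)[labels]

    mixed_x = np.empty((num_mixed, d))
    mixed_y = np.empty((num_mixed, num_classes))
    for t in range(num_mixed):
        # Pick m distinct samples uniformly at random and average them.
        idx = rng.choice(n, size=m, replace=False)
        # Assumed noise scales; a real release would calibrate them to a privacy budget.
        mixed_x[t] = clipped[idx].mean(axis=0) + rng.normal(0.0, sigma * clip_norm / m, size=d)
        mixed_y[t] = one_hot[idx].mean(axis=0) + rng.normal(0.0, sigma / m, size=num_classes)
    return mixed_x, mixed_y


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))   # stand-in for last-layer features
labels = rng.integers(0, 5, size=200)   # stand-in class labels
mixed_x, mixed_y = noisy_mixup(
    features, labels, num_classes=5, m=32, num_mixed=50,
    clip_norm=1.0, sigma=1.0, rng=rng,
)
```

Averaging over more samples (larger `m`) shrinks each individual's contribution, so less noise is needed per released point, but the mixed labels also become less informative. That trade-off is where the feature geometry and the sampling strategy matter.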

Code & Models

Repositories

lidonghao1996/neuromixgdp · PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques

Methods: Linear Regression · Mixup