Enhancing Learning with Label Differential Privacy by Vector Approximation

Puning Zhao; Rongfei Fan; Huiwen Wu; Qingming Li; Jiafei Wu; Zhe Liu

arXiv:2405.15150 · cs.LG · May 27, 2024

TL;DR

This paper introduces a vector approximation method for label differential privacy that preserves more label information than scalar flipping, so performance degrades only slightly as the number of classes increases.

Contribution

The paper proposes a novel vector approximation approach for label DP that is easy to implement and less affected by the number of classes, improving privacy-preserving learning.

Findings

01. Performance decays only slightly as the number of classes increases

02. The method outperforms scalar flipping approaches

03. Validated on synthetic and real datasets

Abstract

Label differential privacy (DP) is a framework that protects the privacy of labels in training datasets, while the feature vectors are public. Existing approaches protect label privacy by flipping labels randomly and then training a model whose output approximates the privatized label. However, as the number of classes $K$ increases, stronger randomization is needed, so the performance of these methods degrades significantly. In this paper, we propose a vector approximation approach, which is easy to implement and introduces little additional computational overhead. Instead of flipping each label into a single scalar, our method converts each label into a random vector with $K$ components, whose expectations reflect class conditional probabilities. Intuitively, vector approximation retains more information than scalar labels. A brief theoretical analysis shows that the…
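To make the contrast in the abstract concrete, here is a minimal Python sketch. The abstract does not specify the exact mechanism, so the details below are assumptions: the scalar baseline is standard K-ary randomized response, and the vector variant one-hot encodes the label and randomizes each of the $K$ bits independently with parameter $\epsilon/2$ (changing a label changes two bits, so the total privacy cost is $\epsilon$). The function names `randomized_response`, `privatize_to_vector`, and `debias` are hypothetical and not taken from the paper.

```python
import numpy as np


def randomized_response(labels, num_classes, epsilon, rng):
    """Scalar baseline (assumed): K-ary randomized response.

    Keeps the true label with probability e^eps / (e^eps + K - 1),
    otherwise replaces it with a uniformly random different class.
    """
    labels = np.asarray(labels)
    keep_prob = np.exp(epsilon) / (np.exp(epsilon) + num_classes - 1)
    keep = rng.random(labels.shape) < keep_prob
    # Adding an offset in 1..K-1 (mod K) picks a uniformly random other class.
    offsets = rng.integers(1, num_classes, size=labels.shape)
    flipped = (labels + offsets) % num_classes
    return np.where(keep, labels, flipped)


def privatize_to_vector(labels, num_classes, epsilon, rng):
    """Vector variant (assumed): per-bit randomization of the one-hot label.

    Component j is 1 with probability p if j is the true label and
    1 - p otherwise, where p = e^(eps/2) / (1 + e^(eps/2)).
    """
    labels = np.asarray(labels)
    one_hot = np.eye(num_classes, dtype=np.int64)[labels]
    p = np.exp(epsilon / 2) / (1 + np.exp(epsilon / 2))
    prob_one = np.where(one_hot == 1, p, 1 - p)
    return (rng.random(one_hot.shape) < prob_one).astype(np.int64)


def debias(vectors, epsilon):
    """Invert the affine map E[Z_j] = (1 - p) + (2p - 1) * P(Y = j)."""
    p = np.exp(epsilon / 2) / (1 + np.exp(epsilon / 2))
    return (np.asarray(vectors, dtype=float) - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, eps = 10, 1.0
    y = rng.integers(0, K, size=5)
    print("true labels:    ", y)
    print("scalar flipping:", randomized_response(y, K, eps, rng))
    print("random vectors:\n", privatize_to_vector(y, K, eps, rng))
```

Under these assumptions, the expectation of each vector component is an affine function of the class conditional probability, so averaging debiased vectors recovers those probabilities. The per-bit noise level depends only on $\epsilon$, whereas the keep probability of the scalar baseline shrinks as $K$ grows.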

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning and Data Classification · Imbalanced Data Classification Techniques