On the privacy-utility trade-off in differentially private hierarchical text classification

Dominik Wunderlich; Daniel Bernau; Francesco Aldà; Javier Parra-Arnau; Thorsten Strufe

arXiv:2103.02895·cs.CR·December 10, 2021·1 cites

TL;DR

This paper examines how neural network architectures for hierarchical text classification trade privacy against utility when trained with differential privacy. It reports that even large privacy parameters, which give only weak formal guarantees, already mitigate membership inference attacks with minimal utility loss.

Contribution

It empirically compares neural network architectures under differential privacy, identifying models that offer optimal privacy-utility trade-offs for different dataset sizes and text lengths.

Findings

01. Large privacy parameters mitigate membership inference attacks effectively.
02. Transformer models perform well on large, long-text datasets.
03. CNNs are preferable for smaller datasets with shorter texts.

Abstract

Hierarchical text classification consists in classifying text documents into a hierarchy of classes and sub-classes. Although artificial neural networks have proved useful to perform this task, unfortunately they can leak training data information to adversaries due to training data memorization. Using differential privacy during model training can mitigate leakage attacks against trained models, enabling the models to be shared safely at the cost of reduced model accuracy. This work investigates the privacy-utility trade-off in hierarchical text classification with differential privacy guarantees, and identifies neural network architectures that offer superior trade-offs. To this end, we use a white-box membership inference attack to empirically assess the information leakage of three widely used neural network architectures. We show that large differential privacy parameters already…
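
To make "using differential privacy during model training" concrete, here is a minimal sketch of one differentially private gradient step. It assumes the common DP-SGD recipe (clip each example's gradient, then add Gaussian noise); the abstract does not state which mechanism the authors use. The function `dp_sgd_step` and its parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for illustration, not taken from the paper or its repository.

```python
import math
import random


def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update (hypothetical helper, not the paper's code)."""
    clipped_sum = [0.0] * len(weights)
    for grad in per_example_grads:
        # Clip each example's gradient so its L2 norm is at most clip_norm.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, g in enumerate(grad):
            clipped_sum[i] += g * scale

    # Add Gaussian noise whose standard deviation is tied to the clipping norm,
    # then average over the batch.
    sigma = noise_multiplier * clip_norm
    batch_size = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in clipped_sum]

    # Plain gradient descent on the noisy, clipped average gradient.
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


rng = random.Random(0)
grads = [[3.0, 4.0, 0.0], [0.1, 0.0, 0.0]]  # toy per-example gradients

# With noise_multiplier=0.0 only the clipping is visible.
clipped_only = dp_sgd_step([0.0, 0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                           noise_multiplier=0.0, rng=rng)

# A non-zero noise multiplier perturbs the update.
noisy = dp_sgd_step([0.0, 0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                    noise_multiplier=1.0, rng=rng)
```

In this recipe a smaller noise multiplier corresponds to a larger privacy parameter, that is, a weaker formal guarantee and less distortion of the update. Converting the noise multiplier into a concrete privacy parameter requires a privacy accountant, which this sketch omits.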

Code & Models

Repositories

SAP-samples/security-research-dp-hierarchical-text
tf · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning