Label Inference Attacks from Log-loss Scores
Abhinav Aggarwal, Shiva Prasad Kasiviswanathan, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier

TL;DR
This paper shows that a dataset's labels can be accurately inferred from log-loss scores alone. The attacks rely on number theory and combinatorics, require no model training, and still succeed when the scores are noisy or reported with limited precision.
Contribution
It introduces novel label inference attacks from log-loss scores that work under noise and limited precision, without requiring model training.
Findings
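With arbitrary-precision arithmetic, labels can be inferred from the log-loss score of a single carefully constructed prediction vector, for any finite number of label classes.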
Attacks succeed even with noisy and limited-precision scores.
Algorithms do not require model training or access to data beyond log-loss scores.
Abstract
Log-loss (also known as cross-entropy loss) metric is ubiquitously used across machine learning applications to assess the performance of classification algorithms. In this paper, we investigate the problem of inferring the labels of a dataset from single (or multiple) log-loss score(s), without any other access to the dataset. Surprisingly, we show that for any finite number of label classes, it is possible to accurately infer the labels of the dataset from the reported log-loss score of a single carefully constructed prediction vector if we allow arbitrary precision arithmetic. Additionally, we present label inference algorithms (attacks) that succeed even under addition of noise to the log-loss scores and under limited precision arithmetic. All our algorithms rely on ideas from number theory and combinatorics and require no model training. We run experimental simulations on some real…
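To make the single-query idea concrete, here is a minimal sketch for the binary-label case. It assumes a prime-based encoding chosen for illustration: each prediction is set to q/(1+q) for a distinct prime q, so that the reported score determines the product of the primes at positions labeled 1. This is one construction consistent with the abstract's mention of number theory, not necessarily the authors' exact algorithm. The function names (`first_primes`, `craft_predictions`, `log_loss`, `infer_labels`) are hypothetical, and ordinary floating point is used, which only works for a handful of labels.

```python
import math


def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def craft_predictions(n):
    """Build the attack vector: prediction i is q_i / (1 + q_i) for the i-th prime q_i."""
    primes = first_primes(n)
    return primes, [q / (1 + q) for q in primes]


def log_loss(labels, preds):
    """Binary log-loss, as a scoring service would report it."""
    n = len(labels)
    total = sum(math.log(p) if y == 1 else math.log(1 - p)
                for y, p in zip(labels, preds))
    return -total / n


def infer_labels(score, primes):
    """Recover binary labels from one log-loss score.

    exp(-n * score) equals prod(q_i for y_i = 1) / prod(1 + q_i), so
    multiplying by prod(1 + q_i) leaves an integer whose prime factors
    mark the positions labeled 1.
    """
    n = len(primes)
    log_denominator = sum(math.log(1 + q) for q in primes)
    product = round(math.exp(-n * score + log_denominator))
    return [1 if product % q == 0 else 0 for q in primes]


# The attacker never sees `secret`, only the score returned for `preds`.
secret = [1, 0, 1, 1, 0, 0, 1, 0]
primes, preds = craft_predictions(len(secret))
score = log_loss(secret, preds)
recovered = infer_labels(score, primes)
```

Because the product of the chosen primes grows quickly, this float-based sketch stops being reliable after a few labels. That is in line with the abstract's distinction between the arbitrary-precision setting and the separate attacks designed for limited precision and noisy scores.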
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
