Little is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning
Amr Abourayya, Jens Kleesiek, Kanishka Rao, Erman Ayday, Bharat Rao, Geoff Webb, Michael Kamp

TL;DR
This paper introduces FedCT, a federated semi-supervised learning method that enhances privacy by sharing only hard labels, enabling the use of diverse local models and improving performance in federated fine-tuning of large language models.
Contribution
The paper proposes a novel federated co-training approach that shares only hard labels, improving privacy and allowing non-gradient-based models in federated learning.
Findings
FedCT empirically improves privacy without compromising model quality.
It enables training of models like decision trees and random forests in federated settings.
FedCT is effective in federated fine-tuning of large language models.
Abstract
In many critical applications, sensitive data is inherently distributed and cannot be centralized due to privacy concerns. A wide range of federated learning approaches has been proposed to train models locally at each client without sharing their sensitive data, typically by exchanging model parameters, probabilistic predictions (soft labels) on a public dataset, or a combination of both. However, these methods still disclose private information and restrict local models to those that can be trained using gradient-based methods. We propose a federated co-training (FedCT) approach that improves privacy by sharing only definitive (hard) labels on a public unlabeled dataset. Clients use a consensus of these shared labels as pseudo-labels for local training. This federated co-training approach empirically enhances privacy without compromising model quality. In addition, it allows the…
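The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes the consensus is a plain majority vote over the clients' hard labels, which the abstract does not specify; the function name `consensus_labels`, the tie-breaking rule, and the toy votes are hypothetical and not taken from the paper.

```python
import numpy as np

def consensus_labels(client_predictions, num_classes):
    """Combine hard labels from several clients by majority vote (assumed rule).

    client_predictions: integer array of shape (num_clients, num_public_examples),
    where entry [i, j] is client i's hard label for public example j.
    """
    preds = np.asarray(client_predictions)
    # counts[c, j] = number of clients that voted class c for public example j
    counts = np.stack([(preds == c).sum(axis=0) for c in range(num_classes)])
    # argmax breaks ties in favour of the lowest class index (an assumption)
    return counts.argmax(axis=0)

# Toy example: 3 clients label 4 public examples with classes {0, 1, 2}.
votes = np.array([
    [0, 1, 2, 1],
    [0, 1, 1, 1],
    [1, 1, 2, 0],
])
pseudo = consensus_labels(votes, num_classes=3)
print(pseudo)  # [0 1 2 1]
```

Each client would then add the public examples, labeled with `pseudo`, to its local training data. Because only integer labels are exchanged, the local model can be any classifier that produces hard predictions, including models that are not trained by gradient descent.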
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Privacy, Security, and Data Protection
