Heterogeneous Federated Learning Using Knowledge Codistillation
Jared Lichtarge, Ehsan Amid, Shankar Kumar, Tien-Ju Yang, Rohan Anil, Rajiv Mathews

TL;DR
This paper introduces a heterogeneous federated learning approach using bidirectional knowledge distillation, enabling different model architectures across clients and improving performance on classification and language tasks.
Contribution
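The paper proposes a bidirectional knowledge distillation method that allows clients of differing capacity to train different model architectures in federated learning. This makes use of modeling capacity that would otherwise go unused on higher-capacity clients, and it enables domain transfer.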
Findings
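Both variants of the method improve upon Federated Averaging on image classification and language modeling tasks.
The technique remains useful when only out-of-domain or limited in-domain distillation data is available, and the bidirectional distillation allows domain transfer.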
Supports heterogeneous model architectures across clients.
Abstract
Federated Averaging, and many federated learning algorithm variants which build upon it, have a limitation: all clients must share the same model architecture. This results in unused modeling capacity on many clients, which limits model performance. To address this issue, we propose a method that involves training a small model on the entire pool and a larger model on a subset of clients with higher capacity. The models exchange information bidirectionally via knowledge distillation, utilizing an unlabeled dataset on a server without sharing parameters. We present two variants of our method, which improve upon federated averaging on image classification and language modeling tasks. We show this technique can be useful even if only out-of-domain or limited in-domain distillation data is available. Additionally, the bi-directional knowledge distillation allows for domain transfer between…
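The bidirectional exchange described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so this example assumes the common knowledge distillation formulation: a temperature-softened KL divergence between the two models' predictions on a batch of unlabeled server data. The function names (`softmax`, `kl_divergence`, `distillation_loss`, `codistillation_losses`) and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean KL(teacher || student) over a batch of unlabeled examples."""
    losses = [
        kl_divergence(softmax(t, temperature), softmax(s, temperature))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(losses) / len(losses)

def codistillation_losses(small_logits, large_logits, temperature=2.0):
    """Assumed bidirectional setup: each model serves as the other's teacher.

    Only predictions on the shared unlabeled batch are exchanged;
    no model parameters cross between the two models.
    """
    return {
        "small_from_large": distillation_loss(large_logits, small_logits, temperature),
        "large_from_small": distillation_loss(small_logits, large_logits, temperature),
    }
```

In this sketch, each model would minimize its own entry of the returned dictionary in addition to its usual federated training objective. Because only logits are compared, the two models may have entirely different architectures as long as they share an output space.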
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Knowledge Distillation
