Federated Training of Dual Encoding Models on Small Non-IID Client Datasets
Raviteja Vemulapalli, Warren Richard Morningstar, Philip Andrew Mansfield, Hubert Eichner, Karan Singhal, Arash Afkanpour, Bradley Green

TL;DR
This paper introduces DCCO, a novel federated training method for dual encoding models on small, non-IID client datasets, improving performance over existing federated approaches by leveraging encoding statistics.
Contribution
The paper proposes DCCO, a new federated training algorithm that uses encoding statistics for dual encoding models on decentralized, non-IID data, addressing limitations of existing methods.
Findings
DCCO outperforms federated variants of existing approaches by a large margin.
Simulating large-batch loss computation on clients improves training stability.
Encoding statistics aggregation enables effective federated dual encoding model training.
Abstract
Dual encoding models that encode a pair of inputs are widely used for representation learning. Many approaches train dual encoding models by maximizing agreement between pairs of encodings on centralized training data. However, in many scenarios, datasets are inherently decentralized across many clients (user devices or organizations) due to privacy concerns, motivating federated learning. In this work, we focus on federated training of dual encoding models on decentralized data composed of many small, non-IID (not independent and identically distributed) client datasets. We show that existing approaches that work well in centralized settings perform poorly when naively adapted to this setting using federated averaging. We observe that we can simulate large-batch loss computation on individual clients for loss functions that are based on encoding statistics. Based on this insight, we…
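The observation about encoding statistics can be illustrated with a minimal sketch. It assumes the loss depends on the encodings only through first and second moments, here a cross-correlation matrix with a Barlow-Twins-style penalty chosen purely for illustration. All function names (`local_stats`, `aggregate`, `cross_correlation`, `cco_style_loss`) and the toy data are hypothetical. This is not the paper's DCCO algorithm, whose details are cut off above. It only shows why summing per-client statistics reproduces the statistic of one large pooled batch.

```python
import math
import random

def local_stats(F, G):
    """Per-client sufficient statistics for paired encodings F, G (rows = examples)."""
    d = len(F[0])
    return {
        "n": len(F),
        "sum_f": [sum(f[i] for f in F) for i in range(d)],
        "sum_g": [sum(g[j] for g in G) for j in range(d)],
        "sum_ff": [sum(f[i] ** 2 for f in F) for i in range(d)],
        "sum_gg": [sum(g[j] ** 2 for g in G) for j in range(d)],
        "sum_fg": [[sum(f[i] * g[j] for f, g in zip(F, G)) for j in range(d)]
                   for i in range(d)],
    }

def aggregate(stats_list):
    """Server-side step (assumed): element-wise sum of the client statistics."""
    total = {"n": sum(s["n"] for s in stats_list)}
    for key in ("sum_f", "sum_g", "sum_ff", "sum_gg"):
        total[key] = [sum(vals) for vals in zip(*(s[key] for s in stats_list))]
    total["sum_fg"] = [[sum(cell) for cell in zip(*rows)]
                       for rows in zip(*(s["sum_fg"] for s in stats_list))]
    return total

def cross_correlation(s):
    """Cross-correlation matrix between encoding dimensions, from summed statistics."""
    n, d = s["n"], len(s["sum_f"])
    mean_f = [x / n for x in s["sum_f"]]
    mean_g = [x / n for x in s["sum_g"]]
    std_f = [math.sqrt(s["sum_ff"][i] / n - mean_f[i] ** 2) for i in range(d)]
    std_g = [math.sqrt(s["sum_gg"][j] / n - mean_g[j] ** 2) for j in range(d)]
    return [[(s["sum_fg"][i][j] / n - mean_f[i] * mean_g[j]) / (std_f[i] * std_g[j])
             for j in range(d)] for i in range(d)]

def cco_style_loss(C, lam=0.01):
    """Illustrative loss (an assumption): pull diagonal to 1, off-diagonal to 0."""
    d = len(C)
    on_diag = sum((1.0 - C[i][i]) ** 2 for i in range(d))
    off_diag = sum(C[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + lam * off_diag

# Toy non-IID setup: three small clients whose encodings have different means.
random.seed(0)
d = 3
clients = []
for k, n in enumerate([2, 3, 4]):
    F = [[random.gauss(k, 1.0) for _ in range(d)] for _ in range(n)]
    G = [[x + random.gauss(0.0, 0.5) for x in f] for f in F]
    clients.append((F, G))

# Statistic built from aggregated client statistics.
federated = cross_correlation(aggregate([local_stats(F, G) for F, G in clients]))

# Same statistic computed on all data pooled into one large batch.
pooled_F = [f for F, _ in clients for f in F]
pooled_G = [g for _, G in clients for g in G]
centralized = cross_correlation(local_stats(pooled_F, pooled_G))

max_gap = max(abs(a - b)
              for row_a, row_b in zip(federated, centralized)
              for a, b in zip(row_a, row_b))
print("max difference between federated and pooled statistics:", max_gap)
```

The two matrices should agree up to floating-point rounding, because every quantity involved is a sum over examples and sums can be split across clients. The sketch deliberately omits how gradients flow and how model updates are combined. Those steps belong to the method itself and are not described in the text available here.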
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
