On Sampling Strategies for Spectral Model Sharding

Denis Korzhenkov; Christos Louizos

arXiv:2410.24106 · cs.LG · November 1, 2024

TL;DR

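This paper introduces two sampling strategies for spectral model sharding in federated learning with heterogeneous clients. The first yields unbiased estimators of the original weights; the second aims to minimize the squared approximation error. Both can improve performance on commonly used datasets.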

Contribution

It proposes two sampling strategies for spectral model sharding, each obtained as the solution to a specific optimization problem, and discusses how to incorporate them into the federated learning loop.

Findings

01

The first strategy produces unbiased estimators of the original weights; the second aims to minimize the squared approximation error.

02

Empirically, both methods can lead to improved performance on various commonly used datasets.

03

Both estimators can be incorporated into the federated learning loop, subject to practical considerations that arise during local training.

Abstract

The problem of heterogeneous clients in federated learning has recently drawn a lot of attention. Spectral model sharding, i.e., partitioning the model parameters into low-rank matrices based on the singular value decomposition, has been one of the proposed solutions for more efficient on-device training in such settings. In this work, we present two sampling strategies for such sharding, obtained as solutions to specific optimization problems. The first produces unbiased estimators of the original weights, while the second aims to minimize the squared approximation error. We discuss how both of these estimators can be incorporated in the federated learning loop and practical considerations that arise during local training. Empirically, we demonstrate that both of these methods can lead to improved performance on various commonly used datasets.
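
To make the two objectives in the abstract concrete, the following NumPy sketch may help. It is not the paper's method: the sampling probabilities (`probs`, taken proportional to the singular values), the independent keep/drop rule, and all function names are assumptions chosen for illustration. It contrasts a generic inverse-probability estimator, which is unbiased, with plain top-k truncation, which minimizes the squared (Frobenius) error for a fixed rank but is biased.

```python
import numpy as np


def spectral_shards(W):
    """Factor W so that W = sum_i s[i] * outer(U[:, i], Vt[i])."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U, s, Vt


def reconstruct(U, s, Vt, coeffs):
    """Rebuild a matrix from the shards, scaling the i-th shard by coeffs[i]."""
    return (U * (s * coeffs)) @ Vt


def unbiased_sample(U, s, Vt, probs, rng):
    """Keep shard i with probability probs[i] and rescale it by 1 / probs[i].

    Assumed illustrative rule: independent keep/drop decisions per shard.
    Since E[keep_i / probs_i] = 1, the expected result equals W.
    """
    keep = rng.random(s.shape[0]) < probs
    coeffs = np.where(keep, 1.0 / probs, 0.0)
    return reconstruct(U, s, Vt, coeffs)


def top_k(U, s, Vt, k):
    """Keep only the k largest singular components (biased, minimal squared error)."""
    coeffs = (np.arange(s.shape[0]) < k).astype(float)
    return reconstruct(U, s, Vt, coeffs)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))          # stand-in for one layer's weight matrix
U, s, Vt = spectral_shards(W)

k = 3                                 # hypothetical shard budget per client
probs = np.minimum(1.0, k * s / s.sum())  # assumed probabilities, not from the paper

# Average many random draws to check unbiasedness empirically.
mean_est = np.mean(
    [unbiased_sample(U, s, Vt, probs, rng) for _ in range(5000)], axis=0
)
print("max abs deviation of averaged samples from W:", np.abs(mean_est - W).max())
print("Frobenius error of top-k truncation:", np.linalg.norm(W - top_k(U, s, Vt, k)))
```

Averaged over many draws, the rescaled estimator should approach `W`, whereas the truncated matrix always omits the discarded components. The paper derives its own strategies from specific optimization problems, which this sketch does not attempt to reproduce.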


Taxonomy

Topics: Machine Learning and Data Classification · Time Series Analysis and Forecasting · Face and Expression Recognition