A Historical Trajectory Assisted Optimization Method for Zeroth-Order Federated Learning

Chenlin Wu; Xiaoyu He; Zike Li; Jing Gong; Zibin Zheng

arXiv:2409.15955·cs.LG·October 25, 2024

Open Access

TL;DR

This paper introduces a non-isotropic sampling method for zeroth-order federated learning that leverages historical solution trajectories to improve gradient estimation and convergence without increasing computational costs.

Contribution

It proposes a novel subspace-based sampling approach using historical trajectories to enhance gradient estimation in zeroth-order federated learning.
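As an illustration of what a subspace "spanned by historical trajectories" could look like, the sketch below builds an orthonormal basis from a short list of past iterates. It is not taken from the paper: using successive differences between iterates as the spanning vectors, and the helper names `orthonormal_basis` and `trajectory_basis`, are assumptions made for this example only.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Modified Gram-Schmidt: orthonormal vectors spanning the same subspace."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components of w that already lie along earlier basis vectors.
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # Skip vectors that are (numerically) linearly dependent on the basis.
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def trajectory_basis(history):
    """Basis of the subspace spanned by successive steps x_{t+1} - x_t.

    Assumption for illustration: the paper may construct its subspace differently.
    """
    steps = [[b - a for a, b in zip(x_prev, x_next)]
             for x_prev, x_next in zip(history, history[1:])]
    return orthonormal_basis(steps)


# Hypothetical trajectory of four iterates in R^3 that never leaves the x-y plane.
history = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 2.0, 0.0], [3.0, 4.0, 0.0]]
basis = trajectory_basis(history)
```

The three steps here span only a two-dimensional plane, so the third step is dropped as linearly dependent and the resulting basis has two vectors.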

Findings

1. The convergence rate matches that of existing methods.

2. The method adds no significant overhead in communication or computation.

3. Numerical experiments show that it is effective when compared with standard algorithms.

Abstract

Federated learning heavily relies on distributed gradient descent techniques. In the situation where gradient information is not available, the gradients need to be estimated from zeroth-order information, which typically involves computing finite-differences along isotropic random directions. This method suffers from high estimation errors, as the geometric features of the objective landscape may be overlooked during the isotropic sampling. In this work, we propose a non-isotropic sampling method to improve the gradient estimation procedure. Gradients in our method are estimated in a subspace spanned by historical trajectories of solutions, aiming to encourage the exploration of promising regions and hence improve the convergence. The proposed method uses a covariance matrix for sampling which is a convex combination of two parts. The first part is a thin projection matrix containing…
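The following single-client sketch illustrates the two ideas the abstract describes: a finite-difference gradient estimate along random directions, and a sampling covariance formed as a convex combination of two parts. It is not the authors' algorithm. The abstract is truncated before it specifies the two parts, so the sketch assumes a covariance of the form `alpha * P P^T + (1 - alpha) * I`, where the columns of `P` are an orthonormal basis and the identity term is a placeholder; the names `sample_direction`, `zo_gradient`, `alpha`, and `mu` are likewise hypothetical.

```python
import math
import random


def sample_direction(basis, alpha, dim, rng):
    """Draw u ~ N(0, alpha * P P^T + (1 - alpha) * I), columns of P = basis.

    The identity term is a placeholder assumption, not the paper's definition.
    """
    # Isotropic component with covariance (1 - alpha) * I.
    u = [math.sqrt(1.0 - alpha) * rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Subspace component with covariance alpha * P P^T (basis must be orthonormal).
    for b in basis:
        w = math.sqrt(alpha) * rng.gauss(0.0, 1.0)
        u = [ui + w * bi for ui, bi in zip(u, b)]
    return u


def zo_gradient(f, x, basis, alpha=0.5, mu=1e-4, num_samples=20, rng=None):
    """Forward-difference zeroth-order estimate using only function values.

    For a sampling covariance C, the estimate approximates C @ grad f(x);
    alpha = 0 recovers the usual isotropic Gaussian estimator.
    """
    rng = rng or random.Random(0)
    dim = len(x)
    fx = f(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        u = sample_direction(basis, alpha, dim, rng)
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_pert) - fx) / mu  # directional derivative along u
        grad = [gi + slope * ui / num_samples for gi, ui in zip(grad, u)]
    return grad


def sphere(x):
    """Toy objective with known gradient: grad f(x) = x."""
    return 0.5 * sum(xi * xi for xi in x)


x = [1.0, -2.0, 0.5]
plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hypothetical trajectory basis
g_iso = zo_gradient(sphere, x, basis=[], alpha=0.0, num_samples=5000)
g_sub = zo_gradient(sphere, x, basis=plane, alpha=1.0, num_samples=5000)
```

With `alpha = 0.0` the estimate scatters around the true gradient in all coordinates, while with `alpha = 1.0` every sampled direction lies in the chosen plane, so the estimate has no component outside it. Intermediate values of `alpha` trade off the two behaviours; how the paper sets this weight is not stated in the truncated abstract.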

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM