Decentralized Hyper-Gradient Computation over Time-Varying Directed Networks
Naoyuki Terashita, Satoshi Hara

TL;DR
This paper proposes a communication-efficient method for hyper-gradient computation in decentralized federated learning over time-varying directed networks, enabling influence estimation and personalization with theoretical convergence guarantees.
Contribution
It introduces a new optimality condition and uses Push-Sum for hyper-gradient estimation, reducing communication costs and allowing operation over dynamic directed networks.
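To make the averaging step concrete, the following is a minimal sketch of Push-Sum consensus in its generic textbook form, not the paper's hyper-gradient estimator. The helper names `mixing_matrix` and `push_sum`, the rotating-plus-random edge schedule, and the round count are all illustrative assumptions. Each node keeps a value and a scalar weight, pushes equal shares of both along its current out-edges, and reads off the ratio, which approaches the network-wide average even though the directed graph changes every round.

```python
import numpy as np

def mixing_matrix(n, t, rng):
    """Column-stochastic matrix for round t of a hypothetical time-varying directed graph."""
    P = np.zeros((n, n))
    for i in range(n):
        # Self-loop plus a rotating out-neighbor, so the union of graphs
        # over any n - 1 consecutive rounds is strongly connected.
        targets = {i, (i + 1 + t % (n - 1)) % n}
        # One extra random out-edge (may coincide with an existing target);
        # this makes the matrix column-stochastic but generally not row-stochastic.
        targets.add(int(rng.integers(n)))
        for j in targets:
            P[j, i] = 1.0 / len(targets)  # node i splits its mass equally
    return P

def push_sum(values, rounds=200, seed=0):
    """Estimate the average of the rows of `values` (shape: nodes x dim) at every node."""
    rng = np.random.default_rng(seed)
    n = values.shape[0]
    x = values.astype(float).copy()  # per-node numerators
    w = np.ones(n)                   # per-node Push-Sum weights
    for t in range(rounds):
        P = mixing_matrix(n, t, rng)
        x = P @ x  # push values along directed edges
        w = P @ w  # push weights along the same edges
    return x / w[:, None]  # de-biased estimate held by each node

if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(5, 3))
    print(np.abs(push_sum(data) - data.mean(axis=0)).max())
```

Dividing by the weight is what corrects the bias that a column-stochastic (rather than doubly stochastic) mixing matrix would otherwise introduce, which is why the scheme needs only out-edges and no symmetric links.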
Findings
The estimator converges to the true hyper-gradient both theoretically and empirically.
It enables decentralized influence estimation and personalization in dynamic network settings.
The method reduces communication overhead compared to prior Hessian-based approaches.
Abstract
This paper addresses the communication issues when estimating hyper-gradients in decentralized federated learning (FL). Hyper-gradients in decentralized FL quantify how the performance of the globally shared optimal model is influenced by perturbations in clients' hyper-parameters. In prior work, clients trace this influence through the communication of Hessian matrices over a static undirected network, resulting in (i) excessive communication costs and (ii) inability to make use of more efficient and robust networks, namely, time-varying directed networks. To solve these issues, we introduce an alternative optimality condition for FL using an averaging operation on model parameters and gradients. We then employ Push-Sum as the averaging operation, which is a consensus optimization technique for time-varying directed networks. As a result, the hyper-gradient estimator derived from our…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms · Distributed Control Multi-Agent Systems
