Deep-Relative-Trust-Based Diffusion for Decentralized Deep Learning

Muyun Li; Aaron Fainman; Stefan Vlaski

arXiv:2501.03162 · cs.LG · January 24, 2025

Open Access

TL;DR

This paper introduces DRT diffusion, a decentralized deep learning method that promotes agreement on neural network outputs rather than parameters, improving generalization especially in sparse network topologies.

Contribution

The paper develops a novel decentralized learning algorithm based on deep relative trust, with convergence analysis and demonstrated benefits in image classification tasks.

Findings

1. Enhanced generalization in sparse topologies
2. Convergence guarantees for DRT diffusion
3. Improved performance over traditional averaging methods

Abstract

Decentralized learning strategies allow a collection of agents to learn efficiently from local data sets without the need for central aggregation or orchestration. Current decentralized learning paradigms typically rely on an averaging mechanism to encourage agreement in the parameter space. We argue that in the context of deep neural networks, which are often over-parameterized, encouraging consensus of the neural network outputs, as opposed to their parameters, can be more appropriate. This motivates the development of a new decentralized learning algorithm, termed DRT diffusion, based on deep relative trust (DRT), a recently introduced similarity measure for neural networks. We provide a convergence analysis for the proposed strategy and numerically establish its benefit to generalization, especially with sparse topologies, in an image classification task.
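
The motivation in the abstract can be illustrated with a small sketch. It is not the DRT diffusion algorithm from the paper; it only shows, under assumed toy settings, why agreement in parameter space and agreement in output space can differ for over-parameterized networks. Two hypothetical agents hold the same two-layer ReLU network up to a permutation of hidden units, so their outputs coincide while their parameters do not, and naive parameter averaging produces a network whose outputs differ from both. All names (`mlp`, `max_gap`, `average`) and sizes are illustrative assumptions.

```python
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]

def mlp(x, W1, W2):
    """Two-layer ReLU network: x -> W2 * relu(W1 * x)."""
    hidden = [max(h, 0.0) for h in matvec(W1, x)]
    return matvec(W2, hidden)

def max_gap(A, B):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def average(A, B):
    """Entrywise average of two matrices (naive parameter averaging)."""
    return [[0.5 * (a + b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

rng = random.Random(0)
n_in, n_hidden, n_out, n_samples = 4, 8, 3, 16  # assumed toy sizes

# Agent A's parameters and a shared batch of inputs.
W1_a = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W2_a = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
X = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_samples)]

# Agent B: same function, hidden units cyclically permuted.
perm = [(j + 1) % n_hidden for j in range(n_hidden)]
W1_b = [W1_a[j] for j in perm]
W2_b = [[row[j] for j in perm] for row in W2_a]

out_a = [mlp(x, W1_a, W2_a) for x in X]
out_b = [mlp(x, W1_b, W2_b) for x in X]

# Parameters disagree, outputs agree (up to floating-point rounding).
param_gap = max(max_gap(W1_a, W1_b), max_gap(W2_a, W2_b))
output_gap = max_gap(out_a, out_b)

# Averaging the two parameter sets gives a network that matches neither.
W1_avg, W2_avg = average(W1_a, W1_b), average(W2_a, W2_b)
out_avg = [mlp(x, W1_avg, W2_avg) for x in X]
avg_gap = max_gap(out_avg, out_a)

print(f"parameter gap: {param_gap:.3f}")
print(f"output gap:    {output_gap:.2e}")
print(f"output gap after parameter averaging: {avg_gap:.3f}")
```

In this toy setting the two agents already agree on every output, yet a parameter-averaging step moves both away from that shared function. This is the kind of mismatch that motivates measuring similarity between networks in terms of their outputs.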

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques