DistDD: Distributed Data Distillation Aggregation through Gradient Matching

Peiran Wang; Haohan Wang

arXiv:2410.08665 · cs.LG · October 14, 2024

Open Access

TL;DR

DistDD introduces a one-time data distillation method for federated learning that reduces communication costs, preserves privacy, and enables efficient model tuning and neural architecture search, especially in complex data scenarios.

Contribution

The paper presents a novel data distillation approach for federated learning that minimizes communication and supports multiple downstream tasks without iterative model updates.

Findings

1. Effective in non-i.i.d. and mislabeled data scenarios
2. Reduces communication costs significantly
3. Enables neural architecture search without full FL retraining

Abstract

In this paper, we introduce DistDD, a novel approach within the federated learning (FL) framework that reduces the need for repetitive communication by distilling data directly on clients' devices. Unlike traditional federated learning, which requires iterative model updates across nodes, DistDD performs a one-time distillation process that extracts a global distilled dataset, maintaining the privacy standards of federated learning while significantly cutting communication costs. By leveraging DistDD's distilled dataset, FL developers can perform just-in-time parameter tuning and neural architecture search without repeating the whole FL process multiple times. We provide a detailed convergence proof of the DistDD algorithm, reinforcing its mathematical stability and reliability for practical applications. Our experiments demonstrate the effectiveness and…
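
The gradient-matching idea named in the title can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a linear least-squares model, a single fixed parameter vector `w`, and a backtracking gradient step, all chosen only to keep the example short. The names `model_grad`, `matching_loss`, `matching_grad`, and `distill` are hypothetical. Each simulated client shares only a gradient computed on its local data, and the server adjusts a small synthetic dataset until its gradient matches the aggregated client gradient.

```python
import numpy as np

def model_grad(X, y, w):
    """Gradient of 0.5 * mean((X @ w - y)**2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def matching_loss(S, t, w, g_target):
    """Half the squared distance between synthetic and target gradients."""
    diff = model_grad(S, t, w) - g_target
    return 0.5 * float(diff @ diff)

def matching_grad(S, t, w, g_target):
    """Analytic gradient of matching_loss with respect to the synthetic inputs S."""
    r = S @ w - t                       # residuals on the synthetic data
    diff = model_grad(S, t, w) - g_target
    return (np.outer(r, diff) + np.outer(S @ diff, w)) / len(t)

def distill(clients, w, m=4, steps=200, seed=0):
    """Fit m synthetic points whose gradient matches the clients' average gradient.

    Assumption: one fixed w and fixed synthetic labels t; a real method would
    typically match gradients at many points along a training trajectory.
    """
    rng = np.random.default_rng(seed)
    S = rng.normal(size=(m, w.shape[0]))
    t = rng.normal(size=m)
    # Clients send gradients only, weighted here by local sample count.
    n_total = sum(len(y) for _, y in clients)
    g_target = sum(len(y) * model_grad(X, y, w) for X, y in clients) / n_total
    history = [matching_loss(S, t, w, g_target)]
    for _ in range(steps):
        G = matching_grad(S, t, w, g_target)
        step = 1.0
        # Backtracking: halve the step until the matching loss decreases.
        while step > 1e-8:
            candidate = S - step * G
            if matching_loss(candidate, t, w, g_target) < history[-1]:
                S = candidate
                break
            step *= 0.5
        history.append(matching_loss(S, t, w, g_target))
    return S, t, history

# Toy setup: three clients with random local data (illustrative only).
rng = np.random.default_rng(1)
w = rng.normal(size=5)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
S, t, history = distill(clients, w)
print("matching loss before:", history[0], "after:", history[-1])
```

In this toy setting the raw client data never leaves the `clients` list; only per-client gradients enter `g_target`, which is the property the abstract attributes to distilling on clients' devices.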

Taxonomy

Topics: Neural Networks and Applications · Data Stream Mining Techniques