Federated Learning with a Single Shared Image

Sunny Soni; Aaqib Saeed; Yuki M. Asano

arXiv:2406.12658 · cs.CV · June 19, 2024

TL;DR

This paper introduces a federated learning method in which knowledge distillation between clients and server relies on a single shared image. An adaptive dataset pruning algorithm selects the most informative crops of that image, which supports heterogeneous client models without requiring a large shared dataset.

Contribution

The paper presents a federated learning approach that generates crops from a single shared image and adaptively prunes them to the most informative ones, enabling knowledge distillation with minimal shared data and supporting heterogeneous client models.

Findings

01

A single shared image is enough to serve as the distillation dataset.

02

Adaptive dataset pruning selects the most informative crops of the image.

03

Method supports heterogeneous client architectures.
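To make the third finding concrete, here is a minimal sketch of distillation-based aggregation in the style of FedDF, assuming the server averages client logits on a shared input and trains a student against the softened average. The function names (`ensemble_targets`, `kl_divergence`) and the toy logits are hypothetical and are not taken from the paper or its repository. The point it illustrates is that clients only need to agree on the output shape, not on the architecture.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_targets(client_logits, temperature=1.0):
    """Average per-client logits for one shared input, then soften them.

    Assumption: logit averaging as the ensembling rule.
    """
    n_clients = len(client_logits)
    n_classes = len(client_logits[0])
    mean_logits = [
        sum(logits[k] for logits in client_logits) / n_clients
        for k in range(n_classes)
    ]
    return softmax(mean_logits, temperature)


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted): the distillation loss for one input."""
    return sum(
        t * math.log((t + eps) / (p + eps)) for t, p in zip(target, predicted)
    )


# Toy logits from three clients (architectures may differ; outputs share a shape).
client_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.5, 0.0, -1.5]]
teacher = ensemble_targets(client_logits)
student = softmax([0.0, 0.0, 0.0])  # an untrained student predicts uniformly
loss = kl_divergence(teacher, student)
```

In a real system the student would be updated by gradient descent on this loss over every distillation input; the sketch only computes the loss for one input.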

Abstract

Federated Learning (FL) enables multiple machines to collaboratively train a machine learning model without sharing private training data. Yet, especially for heterogeneous models, a key bottleneck remains the transfer of knowledge gained by each client model to the server. One popular method, FedDF, tackles this task with distillation, using a common, shared dataset on which predictions are exchanged. However, in many contexts such a dataset might be difficult to acquire due to privacy concerns, and the clients might not allow for storage of a large shared dataset. To this end, in this paper, we introduce a new method that improves this knowledge distillation method to rely only on a single image shared between clients and server. In particular, we propose a novel adaptive dataset pruning algorithm that selects the most informative crops generated from only a single image. With…
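The abstract describes generating crops from one image and pruning them to the most informative ones. The sketch below shows one way such a step could look, assuming random square crops and predictive entropy as the informativeness score. Both choices, the direction of the ranking (`highest`), and all names (`random_crops`, `prune_by_entropy`, `toy_predict`) are assumptions for illustration, not the paper's algorithm.

```python
import math
import random


def random_crops(height, width, crop_size, n_crops, seed=0):
    """Sample top-left corners of square crops that fit inside the image."""
    rng = random.Random(seed)
    return [
        (rng.randint(0, height - crop_size), rng.randint(0, width - crop_size))
        for _ in range(n_crops)
    ]


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def prune_by_entropy(crops, predict, keep, highest=True):
    """Rank crops by predictive entropy and keep `keep` of them.

    Assumption: which end of the ranking counts as "most informative"
    is a design choice, so it is exposed as the `highest` flag.
    """
    ranked = sorted(crops, key=lambda c: entropy(predict(c)), reverse=highest)
    return ranked[:keep]


def toy_predict(corner):
    """Hypothetical stand-in for a model: a two-class distribution per crop."""
    _, left = corner
    p = 0.5 + 0.049 * (left % 10)
    return [p, 1.0 - p]


crops = random_crops(height=64, width=64, crop_size=32, n_crops=20, seed=1)
selected = prune_by_entropy(crops, toy_predict, keep=5)
```

Here `toy_predict` replaces a real model's softmax output on the crop pixels; in practice the prediction would come from the current global or client model, and the selection could be repeated as that model changes.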

Peer Reviews

Decision · ICLR 2024 Conference Withdrawn Submission

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

* Using dataset pruning and single data KD is new in federated learning.
* Authors show results with different model architectures and domain datasets, which is valuable and interesting.
* The evaluations show the practicality of the method for the target datasets.

Weaknesses

* Authors should consider more recent baselines for KD-based FL methods.
* Could you please elaborate on how your method differs from synthetic data generation (by the server or clients) or dataset distillation in federated learning?
* Computation cost, especially for clients, is missing.

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

Knowledge distillation under the FL framework is an interesting area for research, as it allows clients with heterogeneous model architectures to be aggregated at the central server.

Weaknesses

- The writing in general requires significant improvement, with quite a lot of grammar mistakes and some confusing sentences.
- The main method of the paper is not presented well. Normally, in the method section (section 3), the authors should state the problem setups, the objective of the problem with clear definitions, etc. Also, it lacks detailed references; for example, if the patchification techniques are used previously in the KD methods, etc. Again, some notations in the 'entropy selecti…

Reviewer 03 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

1. It is impressive that only one image is needed to perform KD.
2. The provided dataset pruning strategies are helpful.

Weaknesses

1. What would happen if we increase the number of KD images? I would appreciate it if the authors could provide more related ablation results.
2. Comparisons against FedAvg and other federated distillation baselines are missing.
3. Some data-free approaches, such as (Zhu et al., 2021b) and “DENSE: Data-Free One-Shot Federated Learning” [NeurIPS 2022], have completely removed any shared image between server and client, so what would be the unique advantages of using a single image in this work?

Reviewer 04 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

1. It is interesting to apply knowledge distillation-based aggregation with a single image in federated learning.
2. The authors give a comprehensive discussion of the experiments, and the results seem reasonable and promising.

Weaknesses

1. Following from the first strength, I think this work is a bit overclaimed. In my opinion, the size of a single shared image should be the same as the training images. However, in this paper, the shared image is of high resolution, which misleads the readers.
2. Following from the first point, I am not convinced why a high-resolution image is more obtainable than a public dataset of the same size as the training data. According to Table 1, the high-resolution image cannot be randomly ge…

Code & Models

Repositories

sunnysoni97/single_image_fl
PyTorch · Official

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security

Methods: Dataset Pruning · Knowledge Distillation · Pruning