Federated Wasserstein Distance
Alain Rakotomamonjy, Kimia Nadjahi, Liva Ralaivola

TL;DR
This paper presents FedWad, a novel federated algorithm for estimating Wasserstein distances between distributed data samples without sharing raw data, leveraging geometric properties and geodesic manipulations.
Contribution
The paper introduces FedWad, a new method for federated Wasserstein distance estimation that preserves data privacy and is backed by convergence analysis.
Findings
FedWad accurately estimates Wasserstein distances in federated settings.
Empirical results demonstrate improved federated model performance.
The method is applicable to federated coreset construction and dataset distance measurement.
Abstract
We introduce a principled way of computing the Wasserstein distance between two distributions in a federated manner. Namely, we show how to estimate the Wasserstein distance between two samples stored and kept on different devices/clients whilst a central entity/server orchestrates the computations (again, without having access to the samples). To achieve this feat, we take advantage of the geometric properties of the Wasserstein distance (in particular, the triangle inequality) and those of the associated geodesics: our algorithm, FedWad (for Federated Wasserstein Distance), iteratively approximates the Wasserstein distance by manipulating and exchanging distributions from the space of geodesics in lieu of the input samples. In addition to establishing the convergence properties of FedWad, we provide empirical results on federated coresets and federated optimal transport…
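The geometric fact the abstract relies on can be checked numerically in a simple case. The sketch below is not the authors' FedWad implementation; it assumes one-dimensional samples of equal size with uniform weights, where the 2-Wasserstein distance and the geodesic interpolant both reduce to operations on sorted values. The helper names `w2_1d` and `interpolant_1d` are hypothetical. It shows that a distribution on the geodesic between $\mu$ and $\nu$ turns the triangle inequality into an equality, while an arbitrary distribution only gives an upper bound.

```python
import numpy as np

def w2_1d(x, y):
    """2-Wasserstein distance between two equal-size 1-D empirical
    distributions with uniform weights (optimal matching = sorted order)."""
    xs, ys = np.sort(x), np.sort(y)
    return float(np.sqrt(np.mean((xs - ys) ** 2)))

def interpolant_1d(x, y, t):
    """Point at time t on the Wasserstein geodesic from x (t=0) to y (t=1),
    under the same 1-D, equal-size assumption."""
    xs, ys = np.sort(x), np.sort(y)
    return (1.0 - t) * xs + t * ys

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=200)   # sample held by the first client
nu = rng.normal(3.0, 0.5, size=200)   # sample held by the second client

direct = w2_1d(mu, nu)

# A distribution on the geodesic: the two legs sum exactly to the distance.
xi_on = interpolant_1d(mu, nu, 0.3)
via_on = w2_1d(mu, xi_on) + w2_1d(xi_on, nu)

# An arbitrary distribution: the two legs only upper-bound the distance.
xi_off = rng.normal(10.0, 1.0, size=200)
via_off = w2_1d(mu, xi_off) + w2_1d(xi_off, nu)

print(f"direct={direct:.4f}  via geodesic={via_on:.4f}  via arbitrary={via_off:.4f}")
```

In this toy setting the distance could be recovered from the two legs alone if the intermediate distribution were known to lie on the geodesic, which is the property an iterative scheme of this kind would aim for.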
Peer Reviews
Decision: ICLR 2024 poster
The paper is overall well presented and the writing is clear. The contribution is original to the knowledge of the reviewer. Some main strengths: 1. The main problem of the paper, i.e. FL for WD, is novel and deserves attention. As WD heavily depends on both data sets at the same time, how to minimize the exposure of the raw data is of theoretical importance. The paper proposes a framework that is in line with FL. 2. The paper covers most major aspects of practicality of FL for WD, including both t
Although the idea of FL for WD is indeed interesting and deserves attention, I have a major concern about whether the proposed algorithm is really compliant with FL principles. Specifically, at each iteration, the server sends a distribution $\xi$ to the client holding data $\mu$, and a $t$-barycenter $\xi_\mu$ is then sent back to the server. By the structure of the WD geodesic, knowing $\xi$ and $\xi_\mu$ makes it straightforward to reconstruct $\mu$. For instance, if $\xi = \frac{1}{n}\sum_i \delta_
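The excerpt above is cut off before the reviewer's own example, so the following is a separate illustration of the concern rather than a reconstruction of it. It is a minimal sketch under strong assumptions: one-dimensional samples of equal size with uniform weights, an interpolant defined as $(1-t)\,\xi + t\,\mu$ along the sorted matching, and a server that knows $t$. The helper names `interpolant_1d` and `reconstruct_1d` are hypothetical.

```python
import numpy as np

def interpolant_1d(src, dst, t):
    """Geodesic point between src (t=0) and dst (t=1) for equal-size
    1-D empirical distributions with uniform weights."""
    return (1.0 - t) * np.sort(src) + t * np.sort(dst)

def reconstruct_1d(xi, xi_mu, t):
    """Invert the interpolation: recover the sorted client sample from the
    sent distribution xi, the returned interpolant xi_mu, and a known t."""
    return (np.sort(xi_mu) - (1.0 - t) * np.sort(xi)) / t

rng = np.random.default_rng(1)
mu = rng.normal(2.0, 1.0, size=50)   # private client data
xi = rng.normal(0.0, 1.0, size=50)   # distribution sent by the server
t = 0.4                              # assumed to be known to the server

xi_mu = interpolant_1d(xi, mu, t)    # what the client sends back
mu_hat = reconstruct_1d(xi, xi_mu, t)

# The recovered values match the private sample up to floating-point error.
max_err = float(np.max(np.abs(mu_hat - np.sort(mu))))
print(f"max reconstruction error: {max_err:.2e}")
```

Under these assumptions the inversion is exact, which is why the value of $t$ and the way it is chosen matter for the privacy discussion in the reviews.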
I think the authors propose an algorithm which could be valuable in federated learning and other data-science applications in the federated setting. The paper is well written and the concepts and development are easy to follow. I especially found Figure 1 to be very insightful. The Theorems support the claims and are valuable. While the setting of Theorem 3 is arguably a bit restrictive, the result is nonetheless very interesting. Specifically, the fact that you can compute the Wasserstein distanc
Two main desiderata of federated learning are privacy and low communication cost. While the problem of communication cost is addressed with (10), the problem of privacy remains largely unanswered. If the authors convincingly address the problem of privacy leak, I am open to changing my recommendation. If I correctly understand the reasoning at the bottom of page 4, the authors propose to randomly select $t$ such that the server cannot easily infer $d_{\mu,\xi^{(k)}}$. While I agree that this is im
I found the problem well motivated. The theoretical treatment is nice and sound. The numerical experiments are convincing.
The presentation of the manuscript could be better.
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Topological and Geometric Data Analysis
Methods: Coresets
