Approximating 1-Wasserstein Distance between Persistence Diagrams by Graph Sparsification

Tamal K. Dey; Simon Zhang

arXiv:2110.14734·cs.CG·May 13, 2025

TL;DR

This paper introduces a scalable, near-linear approximation scheme for computing the 1-Wasserstein distance between persistence diagrams, leveraging graph sparsification, condensation, and parallel computing to handle large datasets efficiently.

Contribution

It presents a novel algorithm combining graph sparsification, condensation, and parallelism to efficiently approximate PD distances with theoretical guarantees and practical improvements.

Findings

1. Achieves near-linear-time approximation with low empirical error.

2. Outperforms existing methods in speed and accuracy.

3. Provides the open-source software PDoptFlow for practical use.

Abstract

Persistence diagrams (PDs) play a central role in topological data analysis. This analysis requires computing distances between such diagrams, such as the $1$-Wasserstein distance. Accurate computation of these PD distances for large data sets that render large diagrams may not scale appropriately with the existing methods. The main source of difficulty ensues from the size of the bipartite graph on which a matching needs to be computed for determining these PD distances. We address this problem by making several algorithmic and computational observations in order to obtain, in theory, a near-linear fully polynomial-time approximation scheme. This is theoretically optimal assuming the $(1 + \epsilon)$-approximate EMD conjecture in constant dimension, which is that the EMD problem on the plane cannot be approximated by a PTAS in time $O(\frac{1}{\epsilon^{2}} n)$ up to polylog factors. In our…
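To make the matching formulation concrete, here is a minimal sketch of the exact problem that the paper's scheme approximates. It is not the authors' algorithm and not the PDoptFlow API: it builds the full bipartite cost matrix, with each point also allowed to match to the diagonal, and brute-forces every perfect matching, so it is only usable on very small diagrams. The function names are hypothetical, and the choice of the $L_\infty$ ground norm as the default is an assumption made for illustration.

```python
from itertools import permutations
import math


def ground_dist(u, v, p=math.inf):
    """L_p distance between two points in the birth-death plane (assumed ground norm)."""
    dx, dy = abs(u[0] - v[0]), abs(u[1] - v[1])
    if math.isinf(p):
        return max(dx, dy)
    return (dx ** p + dy ** p) ** (1.0 / p)


def diag_proj(u):
    """Project a (birth, death) point onto the diagonal birth == death."""
    mid = (u[0] + u[1]) / 2.0
    return (mid, mid)


def wasserstein1_bruteforce(A, B, p=math.inf):
    """Exact 1-Wasserstein distance between small diagrams A and B.

    Rows 0..n-1 are the points of A and rows n..n+m-1 are diagonal slots;
    columns 0..m-1 are the points of B and columns m..m+n-1 are diagonal slots.
    Diagonal-to-diagonal matches cost 0.
    """
    n, m = len(A), len(B)
    size = n + m
    cost = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(m):
            cost[i][j] = ground_dist(A[i], B[j], p)
        to_diag = ground_dist(A[i], diag_proj(A[i]), p)
        for j in range(m, size):
            cost[i][j] = to_diag
    for j in range(m):
        to_diag = ground_dist(B[j], diag_proj(B[j]), p)
        for i in range(n, size):
            cost[i][j] = to_diag
    # Enumerate all (n + m)! perfect matchings; feasible only for tiny inputs.
    return min(
        sum(cost[i][perm[i]] for i in range(size))
        for perm in permutations(range(size))
    )


A = [(0.0, 4.0), (1.0, 2.0)]
B = [(0.0, 3.0)]
print(wasserstein1_bruteforce(A, B))  # 1.5
```

In this example the optimal matching pairs (0, 4) with (0, 3) at cost 1 and sends (1, 2) to the diagonal at cost 0.5. The cost matrix has $(n+m)^2$ entries, which illustrates the bipartite-graph size that the abstract identifies as the main source of difficulty.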

Code & Models

Repositories

simonzhang00/pdoptflow (Official)

Taxonomy

Topics: Topological and Geometric Data Analysis · Advanced Neuroimaging Techniques and Applications