A generalization for the expected value of the earth mover's distance
William Q. Erickson

TL;DR
This paper extends the earth mover's distance to multiple distributions, introduces an efficient algorithm, and computes its expected value on random distributions, with applications to real-world data analysis.
Contribution
It generalizes the EMD to arbitrarily many distributions, provides an efficient algorithm inspired by combinatorics, and computes the expected EMD using a generating function that coincides with the Hilbert series of the Segre embedding.
Findings
The generalized EMD can compare more than two distributions.
In the special case of three distributions, the generalized EMD equals half the sum of the pairwise EMDs.
The expected value of the generalized EMD is computed using a generating function.
Abstract
The earth mover's distance (EMD), also called the first Wasserstein distance, can be naturally extended to compare arbitrarily many probability distributions, rather than only two, on the set $[n] = \{1, \ldots, n\}$. We present the details for this generalization, along with a highly efficient algorithm inspired by combinatorics; it turns out that in the special case of three distributions, the EMD is half the sum of the pairwise EMDs. Extending the methods of Bourn and Willenbring (arXiv:1903.03673), we compute the expected value of this generalized EMD on random $d$-tuples of distributions, using a generating function which coincides with the Hilbert series of the Segre embedding. We then use the EMD to analyze a real-world data set of grade distributions.
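To make the three-distribution statement concrete, here is a minimal Python sketch. It assumes the standard ground cost |i - j| on a finite ordered set, under which the pairwise EMD equals the sum of absolute differences between the two cumulative distribution functions. The function names `emd_pair` and `emd_three` are illustrative and not taken from the paper, and `emd_three` simply applies the half-sum identity stated in the abstract rather than the paper's general algorithm.

```python
from itertools import accumulate, combinations


def emd_pair(p, q):
    """Pairwise EMD on {1, ..., n}, assuming ground cost |i - j|.

    In one dimension this equals the sum of absolute differences
    between the two cumulative distribution functions.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def emd_three(p, q, r):
    """Three-way EMD via the identity quoted in the abstract:
    half the sum of the three pairwise EMDs (illustrative helper)."""
    pairs = combinations((p, q, r), 2)
    return 0.5 * sum(emd_pair(a, b) for a, b in pairs)


# Three point masses on {1, 2, 3}, located at 1, 3, and 2 respectively.
p = [1.0, 0.0, 0.0]
q = [0.0, 0.0, 1.0]
r = [0.0, 1.0, 0.0]
print(emd_pair(p, q), emd_three(p, q, r))
```

For these point masses the pairwise distances are 2, 1, and 1, so the three-way value is 2, which matches the spread between the largest and smallest locations.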
