Canonical Variates in Wasserstein Metric Space
Jia Li, Lin Lin

TL;DR
This paper introduces a novel dimension reduction method in Wasserstein space for classifying distributional data, improving accuracy and robustness over existing techniques through an iterative optimal transport-based algorithm.
Contribution
We propose a new approach for dimension reduction in Wasserstein space using Fisher ratio maximization, enhancing distributional data classification.
Findings
The method improves classification accuracy significantly.
It outperforms existing algorithms on distributional data.
The approach is robust to variations in data representations.
Abstract
In this paper, we address the classification of instances each characterized not by a singular point, but by a distribution on a vector space. We employ the Wasserstein metric to measure distances between distributions, which are then used by distance-based classification algorithms such as k-nearest neighbors, k-means, and pseudo-mixture modeling. Central to our investigation is dimension reduction within the Wasserstein metric space to enhance classification accuracy. We introduce a novel approach grounded in the principle of maximizing Fisher's ratio, defined as the quotient of between-class variation to within-class variation. The directions in which this ratio is maximized are termed discriminant coordinates or canonical variates axes. In practice, we define both between-class and within-class variations as the average squared distances between pairs of instances, with the pairs…
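As a rough illustration of the Fisher's ratio described in the abstract, the following minimal sketch evaluates the ratio along a single candidate direction. It is not the authors' iterative optimal transport-based algorithm. It rests on several assumptions that are not in the source: each instance is an equally weighted point cloud with the same number of points; between-class pairs are taken to be pairs of instances with different labels and within-class pairs those with the same label; and distances are computed after projecting onto the direction `w`, where the squared 2-Wasserstein distance between two equal-size 1-D samples reduces to the mean squared difference of their sorted values. The names `w2_squared_1d` and `fisher_ratio` and the toy data are hypothetical.

```python
import numpy as np


def w2_squared_1d(x, y):
    """Squared 2-Wasserstein distance between two 1-D empirical distributions.

    Assumes both samples have the same number of equally weighted points,
    in which case the optimal coupling matches sorted values.
    """
    return float(np.mean((np.sort(x) - np.sort(y)) ** 2))


def fisher_ratio(clouds, labels, w):
    """Between-class over within-class average squared distance along w.

    Assumption (not from the source): pairs with different labels count as
    between-class, pairs with equal labels as within-class.
    """
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)
    projected = [np.asarray(c) @ w for c in clouds]  # one 1-D sample per instance
    between, within = [], []
    for i in range(len(clouds)):
        for j in range(i + 1, len(clouds)):
            d2 = w2_squared_1d(projected[i], projected[j])
            (within if labels[i] == labels[j] else between).append(d2)
    return float(np.mean(between) / np.mean(within))


# Toy data: two classes of 2-D point clouds whose means differ along the first axis.
rng = np.random.default_rng(0)
clouds, labels = [], []
for label, shift in [(0, 0.0), (1, 3.0)]:
    for _ in range(5):
        center = np.array([shift, 0.0]) + 0.1 * rng.normal(size=2)
        clouds.append(center + rng.normal(size=(50, 2)))
        labels.append(label)

ratio_x = fisher_ratio(clouds, labels, [1.0, 0.0])  # direction separating the classes
ratio_y = fisher_ratio(clouds, labels, [0.0, 1.0])  # direction with no class signal
```

In this toy setup the ratio along the first axis should come out much larger than along the second, which is the sense in which a direction maximizing the ratio serves as a discriminant coordinate. A full method would search over directions rather than compare two fixed ones.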
Taxonomy
Topics: Geometric Analysis and Curvature Flows · Fixed Point Theorems Analysis · Advanced Differential Geometry Research
