Fused Gromov-Wasserstein Distance with Feature Selection
Harlin Lee, Ying Yu, Mingxin Li, Ranthony Clark

TL;DR
This paper introduces feature selection into Fused Gromov-Wasserstein distances, improving interpretability and robustness by adaptively downweighting irrelevant features during object alignment.
Contribution
The paper proposes FGW formulations with feature suppression, using either regularized (Lasso, Ridge) or simplex-constrained weights, together with theoretical analysis and an efficient optimization algorithm.
Findings
Feature suppression enhances interpretability in high-dimensional data.
The proposed methods improve robustness against noisy or irrelevant features.
Experiments demonstrate better task-relevant structure discovery, e.g., in redistricting.
Abstract
Fused Gromov-Wasserstein (FGW) distances provide a principled framework for comparing objects by jointly aligning structure and node features. However, existing FGW formulations treat all features uniformly, which limits interpretability and robustness in high-dimensional settings where many features may be irrelevant or noisy. We introduce FGW distances with feature selection, which incorporate adaptive feature suppression weights into the FGW objective to selectively downweight or suppress differentiating features during alignment. We propose two approaches: (1) regularized FGW with Lasso and Ridge penalties, and (2) FGW with simplex-constrained weights, including groupwise extensions. We analyze the resulting models and establish their key theoretical properties, including bounds relative to classical FGW and Gromov-Wasserstein distances, and metric behavior. An efficient alternating…
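To make the idea of feature weights inside an FGW objective concrete, here is a minimal NumPy sketch that evaluates a feature-weighted FGW cost for a fixed coupling. It is an illustration under stated assumptions, not the paper's formulation: the squared-difference per-feature cost, the `(1 - alpha)` / `alpha` trade-off convention, and the names `per_feature_costs` and `weighted_fgw_objective` are all hypothetical choices made for this example, and no penalty term or optimization over the weights is included.

```python
import numpy as np


def per_feature_costs(X, Y):
    """Return an array of shape (d, n, m) whose k-th slice holds the
    squared differences (X[i, k] - Y[j, k])**2 for feature k."""
    diff = X[:, None, :] - Y[None, :, :]      # shape (n, m, d)
    return np.moveaxis(diff ** 2, -1, 0)      # shape (d, n, m)


def weighted_fgw_objective(X, Y, C1, C2, T, w, alpha=0.5):
    """Evaluate a feature-weighted FGW cost for a fixed coupling T.

    Assumed (hypothetical) form:
        (1 - alpha) * sum_ij T_ij * sum_k w_k (X_ik - Y_jk)^2
        + alpha * sum_ijkl (C1_ik - C2_jl)^2 T_ij T_kl
    """
    # Feature term: combine per-feature cost matrices with the weights w.
    M_w = np.tensordot(w, per_feature_costs(X, Y), axes=1)   # shape (n, m)
    feature_term = np.sum(M_w * T)

    # Structure term, expanded so no 4-index tensor is materialized.
    p, q = T.sum(axis=1), T.sum(axis=0)       # marginals of the coupling
    structure_term = (
        p @ (C1 ** 2) @ p
        + q @ (C2 ** 2) @ q
        - 2.0 * np.sum((C1 @ T @ C2.T) * T)
    )
    return (1.0 - alpha) * feature_term + alpha * structure_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    X = rng.normal(size=(n, 2))
    Y = X.copy()
    Y[:, 1] += rng.normal(size=n)             # only feature 1 differs
    C = rng.random((n, n))
    C = (C + C.T) / 2                         # shared symmetric structure
    T = np.eye(n) / n                         # identity coupling

    print(weighted_fgw_objective(X, Y, C, C, T, w=np.array([1.0, 0.0])))
    print(weighted_fgw_objective(X, Y, C, C, T, w=np.array([0.0, 1.0])))
```

In this toy setup the two objects share the same structure and differ only in the second feature, so putting all weight on the first feature drives the cost to zero, while weighting the second feature makes it positive. This shows how the choice of weights determines which features contribute to the alignment cost.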
