Fused Partial Gromov-Wasserstein for Structured Objects
Yikun Bai, Shuang Wang, Huy Tran, Hengrong Du, Juexin Wang, Soheil Kolouri

TL;DR
This paper introduces Fused Partial Gromov-Wasserstein (FPGW), a novel extension of FGW that handles unbalanced structured data, with theoretical properties, efficient algorithms, and strong empirical performance in graph tasks.
Contribution
It extends FGW to unbalanced data, establishes its theoretical properties, and develops efficient algorithms for practical applications.
Findings
FPGW is effective for graph matching, classification, and clustering.
FPGW demonstrates robust performance across various structured data tasks.
The proposed algorithms enable scalable computation of FPGW distances.
Abstract
Structured data, such as graphs, is vital in machine learning due to its capacity to capture complex relationships and interactions. In recent years, the Fused Gromov-Wasserstein (FGW) distance has attracted growing interest because it enables the comparison of structured data by jointly accounting for feature similarity and geometric structure. However, as a variant of optimal transport (OT), classical FGW assumes an equal mass constraint on the compared data. In this work, we relax this mass constraint and propose the Fused Partial Gromov-Wasserstein (FPGW) framework, which extends FGW to accommodate unbalanced data. Theoretically, we establish the relationship between FPGW and FGW and prove the metric properties of FPGW. Numerically, we introduce Frank-Wolfe solvers and Sinkhorn solvers for the proposed FPGW framework. Finally, we evaluate the FPGW distance through graph matching,…
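As a rough illustration of the kind of objective the abstract describes, the sketch below evaluates a squared-loss FGW-style cost (a feature term plus a structure term) for a given coupling, and checks the relaxed "partial" constraint, under which the coupling's marginals may fall below the input masses instead of matching them exactly. The function names `fgw_objective` and `is_partial_coupling`, the trade-off weight `alpha`, and the toy matrices are assumptions made for this example. The paper's exact FPGW formulation, including any penalty on unmatched mass, may differ.

```python
import numpy as np

def fgw_objective(M, C1, C2, gamma, alpha=0.5):
    """Squared-loss FGW-style cost of a coupling gamma (illustrative only).

    M     : (n, m) feature cost between nodes of the two graphs
    C1    : (n, n) structure matrix of graph 1 (e.g. shortest-path distances)
    C2    : (m, m) structure matrix of graph 2
    gamma : (n, m) non-negative coupling
    alpha : weight on the structure term (hypothetical name)
    """
    # Feature term: sum_ij M[i, j] * gamma[i, j]
    linear = np.sum(M * gamma)
    # Structure term: sum_ijkl (C1[i, k] - C2[j, l])^2 * gamma[i, j] * gamma[k, l]
    diff = C1[:, None, :, None] - C2[None, :, None, :]  # shape (n, m, n, m)
    quad = np.einsum("ijkl,ij,kl->", diff ** 2, gamma, gamma)
    return (1.0 - alpha) * linear + alpha * quad

def is_partial_coupling(gamma, p, q, tol=1e-9):
    """True if gamma >= 0 and its marginals are dominated by p and q."""
    return bool(
        np.all(gamma >= -tol)
        and np.all(gamma.sum(axis=1) <= p + tol)
        and np.all(gamma.sum(axis=0) <= q + tol)
    )

# Toy example: two identical 2-node graphs with uniform mass 0.5 per node.
C1 = np.array([[0.0, 1.0], [1.0, 0.0]])
C2 = C1.copy()
M = np.array([[0.0, 1.0], [1.0, 0.0]])   # zero feature cost on matching nodes
p = q = np.array([0.5, 0.5])

full = np.diag([0.5, 0.5])                     # transports all mass
partial = np.array([[0.5, 0.0], [0.0, 0.0]])   # transports only half of it

print(fgw_objective(M, C1, C2, full), is_partial_coupling(full, p, q))
print(fgw_objective(M, C1, C2, partial), is_partial_coupling(partial, p, q))
```

The dense four-index tensor keeps the example readable but costs O(n²m²) memory, so it is only suitable for very small graphs. Practical solvers factor the quadratic term instead of materializing it.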
Peer Reviews
Decision · Submitted to ICLR 2026
- Introduce new variants for partial FGW matching, namely FPGW and an entropically regularized variant, EFPGW, together with solvers that estimate solutions to these OT problems: a conditional gradient solver for FPGW and a Sinkhorn-like solver for EFPGW.
- Provide theoretical results on the convergence of the CG solver for FPGW and FMPGW (supplementary).
- Provide theoretical results in Theorem 3.1 showing that: (1) FPGW can be formulated as a non-convex QP; (2) FMPGW and FPGW admit minimizers
- **W1: overall clarity of the paper**: I believe the paper contains many writing issues that are detrimental to its clarity, including:
  - L51: "similarity between datasets". I suggest the authors be more specific about what they mean here, as it does not include datasets of structured data like graphs with or without node features.
  - L57-58: "structured feature data". This terminology seems very strange to me and unique to the paper. I would advise the authors to change that for in
The strengths of the article are:
1. It is well written, with (almost) all required elements defined (e.g., there are 2 pages of notations in the supplements, and background on OT and all the elements up to GW distances).
2. There is an effort to prove some theoretical properties: existence of minimiser for the unbalanced FGW or FPGW; properties related to being a metric or semi-metric; details on the described algorithms and convergence properties.
3. I have found one point with some nov
The article has several weaknesses, and I find that they outweigh its strengths:
1. First of all, the work is way too **incremental**: the problem of FPGW (and its variants) was already well considered before, with methods to solve it and better discussions than in the present article about why the problem is relevant. Here, the main novelties are technical results and algorithms in a specific choice of distance; and these technicalities and algorithms come almost straight from t
+ The paper is overall well written and the proposed method is clearly explained.
+ In applications, one might have access to structured objects with features on the nodes that are only partially comparable. The proposed distance is a useful extension of the existing FGW distance to handle such cases.
+ The experiments suggest that the proposed distance indeed works better than FGW when only part of the objects are comparable.
+ The novelty of the contribution is extremely limited. The proposed distance is a straightforward extension of the existing FGW distance (Vayer 2019) to the partial case, following the same principles as Partial GW (Chapel 2020), with only a linear term added. While this is obviously useful in practice, the adaptation is easy to implement and probably already used in the community (see next point).
+ The optimization schemes are also straightforward adaptations of existing sche
Taxonomy
Topics: Digital Image Processing Techniques
