Differentiable Cluster Graph Neural Network
Yanfei Dong, Mohammed Haroon Dupty, Lambert Deng, Zhuanghua Liu, Yong Liang Goh, Wee Sun Lee

TL;DR
This paper introduces a differentiable clustering mechanism within Graph Neural Networks that enhances long-range information propagation and handles heterophilous neighborhoods through an optimal transport-based implicit clustering framework.
Contribution
It proposes a novel end-to-end trainable GNN framework with an implicit clustering objective solved via differentiable message passing steps, improving information propagation in complex graphs.
Findings
Effective in heterophilous and homophilous datasets
Improves long-range information propagation
Seamless integration of clustering into GNN training
Abstract
Graph Neural Networks often struggle with long-range information propagation and in the presence of heterophilous neighborhoods. We address both challenges with a unified framework that incorporates a clustering inductive bias into the message passing mechanism, using additional cluster-nodes. Central to our approach is the formulation of an optimal transport based implicit clustering objective function. However, the algorithm for solving the implicit objective function needs to be differentiable to enable end-to-end learning of the GNN. To facilitate this, we adopt an entropy regularized objective function and propose an iterative optimization process, alternating between solving for the cluster assignments and updating the node/cluster-node embeddings. Notably, our derived closed-form optimization steps are themselves simple yet elegant message passing steps operating seamlessly on a…
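The alternating scheme described in the abstract can be illustrated with a generic sketch. This is not the authors' algorithm: the paper's closed-form updates are message passing steps whose details are not given in this summary. The code below only shows the general idea of entropy-regularized optimal transport clustering, assuming squared Euclidean cost, uniform marginals, and log-domain Sinkhorn iterations. The names `sinkhorn_assign`, `soft_cluster`, `eps`, `n_iters`, and `n_outer` are hypothetical and chosen for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_assign(cost, eps=1.0, n_iters=50):
    """Soft assignment from entropy-regularized OT (assumed uniform marginals).

    cost: array of shape (n_nodes, n_clusters).
    Returns a transport plan P of the same shape whose entries sum to 1.
    """
    n, k = cost.shape
    log_K = -cost / eps                      # log of the Gibbs kernel
    log_a = np.full(n, -np.log(n))           # uniform mass over nodes
    log_b = np.full(k, -np.log(k))           # uniform mass over clusters
    f = np.zeros(n)
    g = np.zeros(k)
    for _ in range(n_iters):
        # Alternate scaling of rows and columns in the log domain.
        f = log_a - _logsumexp(log_K + g[None, :], axis=1)
        g = log_b - _logsumexp(log_K + f[:, None], axis=0)
    return np.exp(log_K + f[:, None] + g[None, :])


def soft_cluster(X, n_clusters, eps=1.0, n_outer=10, seed=0):
    """Alternate between solving for assignments and updating cluster embeddings."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize cluster embeddings from randomly chosen nodes.
    C = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_outer):
        # Squared Euclidean distance between every node and every cluster embedding.
        cost = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
        P = sinkhorn_assign(cost, eps=eps)
        # Each cluster embedding becomes the assignment-weighted mean of the nodes.
        C = (P.T @ X) / P.sum(axis=0)[:, None]
    return P, C
```

Every operation here is differentiable with respect to `X`, which is the property the abstract says is needed for end-to-end training; a trainable version would be written in an autodiff framework rather than NumPy. Smaller `eps` gives sharper assignments, and larger `eps` gives smoother ones.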
Peer Reviews
Decision·Submitted to ICLR 2025
1. The paper is clearly written and easy to read. 2. The proposed end-to-end differentiable model is convincing and the proposed model is trying to solve the important problems encountered in graph representation learning models such as over-squashing and heterophily. 3. Experimental results show that the model works well and experimental details are provided in the appendix.
1. The model has multiple hyperparameters, which makes it confusing for a potential user to select optimal values. 2. The proposed model seems difficult to train and converge, even though Theorem 3.3 provides some guarantee of convergence. I'm not sure whether the model converges only in a very narrow range of hyperparameters, and the code is not open-sourced. 3. $|\Omega|$ can not be removed from asymptotic time complexity $O(T|\mathcal{V}||\Omega|)$ simply beca
1. Converting the graph structure (adjacency or graph Laplacian) into a bipartite graph as GNN input is interesting.
1. No model access, no code access. (major) 2. Most baseline results reported in Table 1 do not align with the results reported in the original papers. Where did you get the baseline results? If you ran the baselines, please provide code, logs, or any other proof. If not, please cite the sources that you used. (major) 3. The overall contribution is limited and not very exciting. (minor)
1. The paper addresses two critical challenges, over-squashing and heterophilous neighborhood aggregation, with a unified framework. 2. The iterative optimization with soft cluster assignment makes it possible to learn both node and cluster-node embeddings efficiently. 3. Experimental results are sufficient to demonstrate the effectiveness of the proposed model.
1. The complexity analysis is very rough. Constructing the bipartite graph during preprocessing may take time, so the complexity of data preprocessing should be reported as well. 2. Introducing so-called "cluster nodes" basically amounts to assigning a pseudo label to each node, which is quite common among methods for graph heterophily. 3. The novelty of directly integrating clustering into the message-passing mechanism is somewhat weak.
Taxonomy
Topics: Neural Networks and Applications
