Differentiable Cluster Graph Neural Network

Yanfei Dong; Mohammed Haroon Dupty; Lambert Deng; Zhuanghua Liu; Yong Liang Goh; Wee Sun Lee

arXiv:2405.16185·cs.LG·May 28, 2024·1 cites

Open Access · 3 Reviews

TL;DR

This paper introduces a differentiable clustering mechanism within Graph Neural Networks that enhances long-range information propagation and handles heterophilous neighborhoods through an optimal transport-based implicit clustering framework.

Contribution

It proposes a novel end-to-end trainable GNN framework with an implicit clustering objective solved via differentiable message passing steps, improving information propagation in complex graphs.

Findings

1. Effective on both heterophilous and homophilous datasets
2. Improves long-range information propagation
3. Seamless integration of clustering into GNN training

Abstract

Graph Neural Networks often struggle with long-range information propagation and in the presence of heterophilous neighborhoods. We address both challenges with a unified framework that incorporates a clustering inductive bias into the message passing mechanism, using additional cluster-nodes. Central to our approach is the formulation of an optimal transport based implicit clustering objective function. However, the algorithm for solving the implicit objective function needs to be differentiable to enable end-to-end learning of the GNN. To facilitate this, we adopt an entropy regularized objective function and propose an iterative optimization process, alternating between solving for the cluster assignments and updating the node/cluster-node embeddings. Notably, our derived closed-form optimization steps are themselves simple yet elegant message passing steps operating seamlessly on a…
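The alternation described in the abstract (solve for soft cluster assignments under an entropy-regularized optimal transport objective, then update the embeddings) can be illustrated with a generic sketch. This is not the paper's algorithm, which operates through message passing with cluster-nodes and uses its own closed-form steps. It is a plain Sinkhorn-based soft clustering, assuming a squared Euclidean cost, uniform node marginals, and balanced cluster marginals. The names `sinkhorn_plan`, `soft_cluster`, `eps`, and the evenly spaced initialization are hypothetical choices made for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.1, n_iters=50):
    """Entropy-regularized OT plan between n nodes and k clusters.

    Assumption: uniform marginals (each node carries mass 1/n and each
    cluster receives mass 1/k). The source does not specify the marginals.
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)  # node marginal
    b = np.full(k, 1.0 / k)  # cluster marginal (balanced clusters)
    # Rescale the cost to [0, 1] so exp(-cost / eps) does not underflow.
    kernel = np.exp(-cost / max(cost.max(), 1e-12) / eps)
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # match row (node) marginals
        v = b / (kernel.T @ u)  # match column (cluster) marginals
    return u[:, None] * kernel * v[None, :]


def soft_cluster(x, k, eps=0.1, n_outer=10):
    """Alternate between soft assignments and cluster-embedding updates."""
    n = len(x)
    # Hypothetical deterministic init: evenly spaced rows of x as centers.
    centers = x[np.linspace(0, n - 1, k).astype(int)].copy()
    for _ in range(n_outer):
        # Squared Euclidean cost between every node and every center.
        cost = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        plan = sinkhorn_plan(cost, eps)
        # Each center becomes the plan-weighted mean of the node embeddings.
        centers = (plan.T @ x) / plan.sum(axis=0)[:, None]
    return plan, centers
```

Calling `soft_cluster(x, k)` on an `(n, d)` array returns an `(n, k)` transport plan whose rows act as soft cluster assignments, together with the `(k, d)` cluster embeddings. Every operation is a differentiable matrix product or elementwise function, which is the property the abstract relies on for end-to-end training. A smaller `eps` gives sharper assignments at the cost of slower Sinkhorn convergence.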

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. The paper is clearly written and easy to read.
2. The proposed end-to-end differentiable model is convincing, and it targets important problems in graph representation learning, such as over-squashing and heterophily.
3. Experimental results show that the model works well, and experimental details are provided in the appendix.

Weaknesses

1. The model has multiple hyperparameters, which makes it confusing for a potential user to select good values.
2. The proposed model seems difficult to train to convergence, even though Theorem 3.3 provides some convergence guarantee. I'm not sure whether the model converges only within a very narrow range of hyperparameters, and the code is not open-sourced.
3. $|\Omega|$ cannot be removed from the asymptotic time complexity $O(T|\mathcal{V}||\Omega|)$ simply beca…

Reviewer 02 · Rating 3 · Confidence 3

Strengths

1. Converting the graph structure (adjacency or graph Laplacian) into a bipartite graph as GNN input is interesting.

Weaknesses

1. No model access, no code access. (major)
2. Most baseline results reported in Table 1 do not match the results reported in the original papers. Where did the baseline results come from? If you ran the baselines yourselves, please provide code, logs, or other evidence. If not, please cite the sources you used. (major)
3. The overall contribution is limited and not very exciting. (minor)

Reviewer 03 · Rating 8 · Confidence 3

Strengths

1. The paper addresses two critical challenges, over-squashing and heterophilous neighborhood aggregation, with a unified framework.
2. The iterative optimization with soft cluster assignment makes it possible to learn both node and cluster-node embeddings efficiently.
3. Experimental results are sufficient to demonstrate the effectiveness of the proposed model.

Weaknesses

1. The complexity analysis is very rough. Constructing the bipartite graph during preprocessing may take time, so the complexity of data preprocessing should be reported as well.
2. Introducing so-called “cluster nodes” essentially assigns a pseudo-label to each node, which is quite common among methods for graph heterophily.
3. The novelty of directly integrating clustering into the message-passing mechanism is somewhat weak.

Taxonomy

Topics: Neural Networks and Applications