Rethinking Independent Cross-Entropy Loss For Graph-Structured Data
Rui Miao, Kaixiong Zhou, Yili Wang, Ninghao Liu, Ying Wang, Xin Wang

TL;DR
This paper introduces a joint-cluster supervised learning framework for graph neural networks that models node-cluster joint distributions, improving classification accuracy and robustness against adversarial attacks.
Contribution
It proposes a novel joint distribution modeling approach that enhances GNNs' discrimination ability and adversarial robustness compared to traditional independent loss methods.
Findings
Improved node classification accuracy on benchmark datasets.
Enhanced robustness of GNNs against adversarial attacks.
Effective utilization of local cluster information for training.
Abstract
Graph neural networks (GNNs) have exhibited prominent performance in learning graph-structured data. For the node classification task, under the i.i.d. assumption among node labels, traditional supervised learning simply sums the cross-entropy losses of the independent training nodes and uses the average loss to optimize the GNN's weights. Unlike other data formats, however, the nodes of a graph are naturally connected. It is found that modeling node labels as independently distributed restricts GNNs' capability to generalize over the entire graph and to defend against adversarial attacks. In this work, we propose a new framework, termed joint-cluster supervised learning, to model the joint distribution of each node with its corresponding cluster. We learn the joint distribution of node and cluster labels conditioned on their representations, and train GNNs with the obtained joint loss. In this…
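To make the contrast in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It compares the averaged independent cross-entropy with one possible joint node-cluster loss. Several details are assumptions made only for illustration: the cluster's soft label is taken as the label histogram of its labeled nodes, the model is assumed to emit C*C joint scores per node laid out as `node_class * C + cluster_class`, and all function names (`cluster_soft_labels`, `joint_cluster_ce`, `node_marginal`) are hypothetical.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a flat list of scores.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def independent_ce(node_logits, labels):
    """Baseline: average of per-node cross-entropy losses (i.i.d. assumption)."""
    losses = [-log_softmax(z)[y] for z, y in zip(node_logits, labels)]
    return sum(losses) / len(losses)

def cluster_soft_labels(labels, cluster_ids, num_classes):
    """Assumed cluster target: normalized label histogram of each node's cluster."""
    counts = {}
    for y, c in zip(labels, cluster_ids):
        counts.setdefault(c, [0] * num_classes)[y] += 1
    return [[n / sum(counts[c]) for n in counts[c]] for c in cluster_ids]

def joint_cluster_ce(joint_logits, labels, cluster_dists):
    """Cross-entropy against a joint (node label, cluster label) target.

    joint_logits[i] holds C*C scores for node i (assumed layout:
    node_class * C + cluster_class). The target is the one-hot node label
    combined with the soft cluster label.
    """
    total = 0.0
    for z, y, q in zip(joint_logits, labels, cluster_dists):
        C = len(q)
        logp = log_softmax(z)
        total += -sum(q[k] * logp[y * C + k] for k in range(C))
    return total / len(labels)

def node_marginal(joint_logits_i, num_classes):
    """Node class probabilities, obtained by summing the joint over cluster classes."""
    C = num_classes
    logp = log_softmax(joint_logits_i)
    return [sum(math.exp(logp[c * C + k]) for k in range(C)) for c in range(C)]
```

In this sketch a node's prediction at inference time would come from `node_marginal`, so the cluster label acts only as a training signal. Whether the paper uses exactly this marginalization is not stated in the text above.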
Peer Reviews
Decision: ICML 2024 Poster
1. The paper is well written and easy to follow. 2. The proposed method seems reasonable and sound. 3. The experiments cover many datasets and several GNN backbones.
The major concern with this work is potential over-claiming. The authors argue that existing approaches ignore the inter-dependence among nodes when computing the loss, which is incorrect. Quite a few existing works have already designed inter-dependent losses for graph learning tasks. For example, [1] proposes a new objective based on conditional random fields for node classification, and [2] harnesses label propagation as a re-weighted loss. Besides, there ar
1. The studied problem is important. 2. The idea of the proposed method is novel. 3. The method is effective in comparison to the baseline cross-entropy loss.
1. Some notations and definitions are not clearly explained. See the questions. 2. The connection between (5) and (4d) is not discussed, which makes (5) difficult to follow.
1. The proposed joint modeling of node and cluster is novel and sound. 2. The scope of the experimental evaluation is broad, including small graphs and large graphs. 3. The empirical analyses are comprehensive in terms of necessary discussions, comparisons, and visualizations.
1. The backbones adopted for the experiments are mostly not the best-performing ones on these benchmarks. It would be more convincing to see how the proposed loss boosts the performance of strong GNN models, e.g., GCNII [1] on Cora, that may reach state-of-the-art performance. 2. A similar concern applies to analyses such as Table 6 (with GCN) and Table 7 (with MLP). [1] Chen et al. Simple and deep graph convolutional networks. In ICML.
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Bayesian Modeling and Causal Inference
