BANGS: Game-Theoretic Node Selection for Graph Self-Training

Fangxin Wang; Kay Liu; Sourav Medya; Philip S. Yu

arXiv:2410.09348 · cs.LG · October 15, 2024

TL;DR

BANGS introduces a game-theoretic framework for node selection in graph self-training. It selects nodes as a collective set, with conditional mutual information as the objective, to improve the robustness and performance of GNNs.

Contribution

BANGS unifies the pseudo-labeling strategy with conditional mutual information as the node selection objective, and uses game theory to select nodes collectively with theoretical robustness guarantees under a noisy objective.

Findings

01 Outperforms existing methods across multiple datasets.

02 Demonstrates robustness under noisy conditions.

03 Provides theoretical guarantees for node selection robustness.

Abstract

Graph self-training is a semi-supervised learning method that iteratively selects a set of unlabeled data to retrain the underlying graph neural network (GNN) model and improve its prediction performance. While selecting highly confident nodes has proven effective for self-training, this pseudo-labeling strategy ignores the combinatorial dependencies between nodes and suffers from a local view of the distribution. To overcome these issues, we propose BANGS, a novel framework that unifies the labeling strategy with conditional mutual information as the objective of node selection. Our approach -- grounded in game theory -- selects nodes in a combinatorial fashion and provides theoretical guarantees for robustness under a noisy objective. More specifically, unlike traditional methods that rank and select nodes independently, BANGS considers nodes as a collective set in the self-training…
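The contrast the abstract draws, ranking nodes independently by confidence versus scoring them as members of a collective set, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a Banzhaf-style game-theoretic value estimated by Monte Carlo sampling (the abstract does not name the value it uses), and a toy coverage function, `coverage`, stands in for the conditional mutual information objective. The node IDs, confidences, and the `COVERS` table are all hypothetical.

```python
import random
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical toy data: nodes 0 and 1 are confident but redundant,
# because they "cover" exactly the same information.
CONFIDENCE = [0.99, 0.98, 0.80, 0.60]
COVERS = {0: {"a", "b", "c"}, 1: {"a", "b", "c"}, 2: {"d", "e"}, 3: {"f"}}


def coverage(nodes: FrozenSet[int]) -> float:
    """Toy set utility (an assumption): number of distinct items covered."""
    covered = set()
    for node in nodes:
        covered |= COVERS[node]
    return float(len(covered))


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Rank nodes by score (highest first) and keep the first k."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def banzhaf_estimates(
    n: int,
    utility: Callable[[FrozenSet[int]], float],
    num_samples: int = 2000,
    seed: int = 0,
) -> List[float]:
    """Monte Carlo estimate of a Banzhaf-style value for each node.

    For each sample, draw a uniformly random subset of the nodes, then
    record the marginal gain of node i relative to that subset without i.
    """
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(num_samples):
        subset = frozenset(j for j in range(n) if rng.random() < 0.5)
        for i in range(n):
            without = subset - {i}
            totals[i] += utility(without | {i}) - utility(without)
    return [total / num_samples for total in totals]


# Independent ranking picks the two most confident (but redundant) nodes.
baseline = top_k(CONFIDENCE, k=2)

# Set-aware ranking discounts a node whose contribution overlaps with others.
values = banzhaf_estimates(len(CONFIDENCE), coverage)
selected = top_k(values, k=2)

print("confidence top-2:", baseline, "utility:", coverage(frozenset(baseline)))
print("set-aware top-2: ", selected, "utility:", coverage(frozenset(selected)))
```

In this toy setup the confidence ranking keeps both redundant nodes, while the set-aware ranking keeps only one of them alongside a node that contributes new information. The real objective and estimator in BANGS may differ from this sketch.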

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

This paper introduces a new direction in graph self-training by integrating conditional mutual information into the pseudo-labeling process. The proposed method may have the potential to improve the effectiveness of semi-supervised learning on graphs, particularly in scenarios where node dependencies play a significant role. The empirical studies are solid.

Weaknesses

1. One main concern is the rationale behind forming a node set for graph self-training from a submodular optimization perspective. The paper argues that pseudo-labels should be evaluated and fed into the model as a set, contrasting with most existing self-training strategies that evaluate each pseudo-label individually. The justification of the traditional strategy is that adding pseudo-labels to the training set satisfies submodularity, allowing for the use of a greedy strategy to achieve an op

Reviewer 02 · Rating 5 · Confidence 3

Strengths

The structure of this article is clear and easy to understand. The experiments are detailed, and the analysis of the results is also quite clear. Moreover, the code has been made publicly available.

Weaknesses

1. The motivation of the article is unclear and not strong enough. The core work of self-training is to select suitable nodes and assign pseudo-labels. This article focuses on the combinatorial dependencies in node selection; however, the impact of such information on the model's performance is not discussed in depth and lacks supporting experiments.
2. Furthermore, I do not see self-training as an interesting research direction; it seems more like a variant of data augmentation to me. Assignin

Reviewer 03 · Rating 8 · Confidence 4

Strengths

The writing is clear. The discussion is comprehensive.

Weaknesses

The novelty of the studied problem is limited: the authors study the node selection problem using a novel formulation with mutual information. Technically, the paper relies heavily on (Wang & Jia, 2023). It would be good to demonstrate the authors' unique contribution in the context of (Wang & Jia, 2023): is it a straightforward extension of the referenced work?

Code & Models

Repositories

fangxin-wang/bangs
PyTorch · Official

Taxonomy

Topics: Advanced Graph Neural Networks · Recommender Systems and Techniques · Data Stream Mining Techniques

Methods: Graph Neural Network · Balanced Selection · Sparse Evolutionary Training