Confidence May Cheat: Self-Training on Graph Neural Networks under Distribution Shift

Hongrui Liu; Binbin Hu; Xiao Wang; Chuan Shi; Zhiqiang Zhang; Jun Zhou

arXiv:2201.11349·cs.LG·January 28, 2022

TL;DR

This paper shows that self-training on graph neural networks can introduce a distribution shift between the original labeled dataset and the pseudo-label-augmented dataset, and proposes a framework, DR-GST, that recovers the original data distribution to make self-training more effective.

Contribution

The paper introduces DR-GST, a new self-training framework that corrects distribution shifts in graph neural network training by weighting pseudo labels based on estimated information gain.
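The weighting idea can be illustrated with a small sketch. It assumes information gain is estimated from several stochastic forward passes (for example, under different dropout masks) as the entropy of the averaged prediction minus the average entropy of the individual predictions. The function names and the rescaling of the weights to an average of 1 are illustrative assumptions, not taken from the authors' repository.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def information_gain(samples):
    """Estimate information gain for one node.

    samples: list of T probability vectors, one per stochastic forward
    pass (hypothetical input; e.g. softmax outputs with dropout active).
    Returns H[mean prediction] - mean H[prediction].
    """
    t = len(samples)
    n_classes = len(samples[0])
    mean_p = [sum(s[c] for s in samples) / t for c in range(n_classes)]
    return entropy(mean_p) - sum(entropy(s) for s in samples) / t


def normalized_weights(gains):
    """Rescale gains so the weights average to 1 (an assumed convention)."""
    total = sum(gains)
    if total <= 0.0:
        # No measurable gain anywhere: fall back to uniform weights.
        return [1.0 for _ in gains]
    return [g * len(gains) / total for g in gains]


if __name__ == "__main__":
    # Two pseudo-labeled nodes: passes agree on the first, disagree on the second.
    stable = [[0.9, 0.1], [0.9, 0.1]]
    unstable = [[0.9, 0.1], [0.1, 0.9]]
    gains = [information_gain(stable), information_gain(unstable)]
    print(normalized_weights(gains))
```

Under this estimate, a node whose predictions agree across passes receives a weight near zero, while a node whose predictions disagree receives a larger weight. Such weights would then multiply each pseudo-labeled node's loss term.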

Findings

01. DR-GST effectively recovers the original data distribution.

02. The method improves GCN performance on benchmark datasets.

03. Loss correction enhances pseudo label quality.

Abstract

Graph Convolutional Networks (GCNs) have recently attracted vast interest and achieved state-of-the-art performance on graphs, but their success typically hinges on careful training with large amounts of expensive and time-consuming labeled data. To alleviate labeled data scarcity, self-training methods have been widely adopted on graphs: they label high-confidence unlabeled nodes and then add them to the training step. In this line, we conduct a thorough empirical study of current self-training methods on graphs. Surprisingly, we find that high-confidence unlabeled nodes are not always useful, and that they can even introduce a distribution shift between the original labeled dataset and the dataset augmented by self-training, severely hindering the capability of self-training on graphs. To this end, in this paper, we propose a novel Distribution Recovered Graph Self-Training framework…
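The conventional self-training step that the abstract examines can be sketched as follows. This is a generic confidence-threshold selection, not the DR-GST procedure. The function name, the dictionary-based inputs, and the default threshold of 0.9 are assumptions made for illustration.

```python
def select_pseudo_labels(probs, labeled, threshold=0.9):
    """Pick high-confidence unlabeled nodes and assign them pseudo labels.

    probs:     {node_id: list of class probabilities} from the current model
    labeled:   {node_id: class} for nodes that already carry true labels
    threshold: minimum top-class probability (assumed value)
    Returns {node_id: predicted_class} for the selected unlabeled nodes.
    """
    selected = {}
    for node, p in probs.items():
        if node in labeled:
            continue  # never overwrite a ground-truth label
        confidence = max(p)
        if confidence >= threshold:
            selected[node] = p.index(confidence)
    return selected


if __name__ == "__main__":
    probs = {0: [0.95, 0.05], 1: [0.60, 0.40], 2: [0.02, 0.98]}
    labeled = {0: 0}
    pseudo = select_pseudo_labels(probs, labeled)
    # The augmented training set is the union of true and pseudo labels.
    augmented = {**labeled, **pseudo}
    print(pseudo, augmented)
```

Because only nodes the model is already sure about are added, the augmented set can differ in distribution from the original labeled set, which is the issue the abstract raises.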

Code & Models

Repositories

bupt-gamma/dr-gst (PyTorch)

Taxonomy

Topics: Advanced Graph Neural Networks

Methods: Variational Inference · Dropout