HC-GST: Heterophily-aware Distribution Consistency based Graph Self-training
Fali Wang, Tianxiang Zhao, Junjie Xu, Suhang Wang

TL;DR
This paper introduces HC-GST, a novel graph self-training framework that reduces distribution shifts in homophily ratios on heterophilic graphs, improving performance by aligning pseudo-labels with global homophily distributions.
Contribution
It proposes a new method for estimating homophily ratios with soft labels and aligning pseudo-labels to the global distribution, addressing training bias in heterophilic graphs.
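As a rough illustration of estimating homophily ratios from soft labels, the sketch below scores each node by the average agreement between its predicted class distribution and those of its neighbors. This is a minimal sketch under stated assumptions, not the paper's estimator: the function name `soft_homophily_ratio`, the dot-product agreement measure, and the array layouts are all hypothetical choices made for this example.

```python
import numpy as np

def soft_homophily_ratio(edges, probs):
    """Hypothetical soft-label homophily estimate (an assumption, not the paper's formula).

    edges: (E, 2) integer array of undirected edges, each listed once.
    probs: (N, C) array of per-node class probabilities (rows sum to 1).
    Returns an (N,) array; isolated nodes get NaN.
    """
    n = probs.shape[0]
    # Treat every undirected edge as two directed edges.
    src = np.concatenate([edges[:, 0], edges[:, 1]])
    dst = np.concatenate([edges[:, 1], edges[:, 0]])
    # Probability that both endpoints share a class, assuming independent predictions.
    agree = np.sum(probs[src] * probs[dst], axis=1)
    total = np.bincount(src, weights=agree, minlength=n)
    deg = np.bincount(src, minlength=n)
    ratio = np.full(n, np.nan)
    has_nbrs = deg > 0
    ratio[has_nbrs] = total[has_nbrs] / deg[has_nbrs]
    return ratio

# Path graph 0 - 1 - 2 plus an isolated node 3, with one-hot predictions.
edges = np.array([[0, 1], [1, 2]])
probs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
ratios = soft_homophily_ratio(edges, probs)
```

With one-hot predictions this reduces to the usual node-level homophily ratio (the fraction of neighbors with the same label); with soft predictions it degrades gracefully when the classifier is uncertain.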
Findings
HC-GST reduces training bias and improves self-training accuracy.
The method effectively aligns pseudo-labels with the global homophily distribution.
Experiments show enhanced performance on both homophilic and heterophilic graphs.
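One simple way to picture aligning pseudo-labels with a global homophily distribution is quota-based selection: bin the estimated homophily ratios, then pick the most confident candidates within each bin in proportion to that bin's share of the whole graph. The sketch below is illustrative only. The function `select_aligned`, the equal-width bins, and the floor-based quotas are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def select_aligned(cand_ratio, cand_conf, global_ratio, budget, n_bins=5):
    """Hypothetical quota-based selection (an assumption, not the paper's procedure).

    cand_ratio:   (M,) estimated homophily ratios of candidate nodes, in [0, 1].
    cand_conf:    (M,) pseudo-label confidence of each candidate.
    global_ratio: (N,) estimated homophily ratios over the whole graph.
    budget:       total number of pseudo-labels to select (at most).
    Returns sorted indices into the candidate arrays.
    """
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map values in [0, 1] to bin ids 0 .. n_bins - 1.
    g_bin = np.digitize(global_ratio, bin_edges[1:-1])
    c_bin = np.digitize(cand_ratio, bin_edges[1:-1])
    target = np.bincount(g_bin, minlength=n_bins) / len(global_ratio)
    chosen = []
    for b in range(n_bins):
        quota = int(np.floor(budget * target[b]))
        idx = np.flatnonzero(c_bin == b)
        # Most confident candidates first, capped by this bin's quota.
        top = idx[np.argsort(-cand_conf[idx])][:quota]
        chosen.extend(top.tolist())
    return np.array(sorted(chosen), dtype=int)

# Global graph is half heterophilic, half homophilic.
global_ratio = np.array([0.1, 0.1, 0.9, 0.9])
cand_ratio = np.array([0.95, 0.90, 0.85, 0.05, 0.10])
cand_conf = np.array([0.99, 0.98, 0.97, 0.60, 0.70])
picked = select_aligned(cand_ratio, cand_conf, global_ratio, budget=2)
```

In this toy case, selecting purely by confidence would take candidates 0 and 1, both homophilic. The quota-based variant instead takes one homophilic and one heterophilic candidate, matching the global split.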
Abstract
Graph self-training (GST), which selects and assigns pseudo-labels to unlabeled nodes, is popular for tackling label sparsity in graphs. However, recent studies on homophilic graphs show that GST methods can introduce and amplify distribution shift between training and test nodes, because they tend to assign pseudo-labels to nodes on which they already perform well. As GNNs typically perform better on homophilic nodes, there could be potential shifts towards homophilic pseudo-nodes, which is underexplored. Our preliminary experiments on heterophilic graphs verify that these methods can cause shifts in homophily ratio distributions, leading to training bias that improves performance on homophilic nodes while degrading it on heterophilic ones. Therefore, we study a novel problem of reducing homophily ratio distribution shifts during self-training on heterophilic graphs. A key challenge is the accurate…
Taxonomy
Methods: ALIGN
