ScatterSample: Diversified Label Sampling for Data Efficient Graph Neural Network Learning
Zhenwei Dai, Vasileios Ioannidis, Soji Adeshina, Zak Jost, Christos Faloutsos, George Karypis

TL;DR
ScatterSample introduces a diversified active sampling method for GNN training that efficiently reduces labeling costs by selecting uncertain and representative nodes, outperforming existing methods across multiple datasets.
Contribution
The paper proposes ScatterSample, a novel active learning framework for GNNs that combines uncertainty and diversity in sampling, supported by theoretical analysis and empirical validation.
Findings
Reduces sampling cost by up to 50%
Achieves comparable test accuracy with fewer labels
Outperforms existing GNN active learning baselines
Abstract
What target labels are most effective for graph neural network (GNN) training? In some applications where GNNs excel, such as drug design or fraud detection, labeling new instances is expensive. We develop a data-efficient active sampling framework, ScatterSample, to train GNNs under an active learning setting. ScatterSample employs a sampling module termed DiverseUncertainty to collect instances with large uncertainty from different regions of the sample space for labeling. To ensure diversification of the selected nodes, DiverseUncertainty clusters the high-uncertainty nodes and selects the representative nodes from each cluster. Our ScatterSample algorithm is further supported by rigorous theoretical analysis demonstrating its advantage over standard active sampling methods that simply maximize uncertainty without diversifying the samples. In particular, we show that…
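The selection step described in the abstract (keep the high-uncertainty nodes, cluster them, label one representative per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes prediction entropy as the uncertainty score, plain k-means as the clustering method, and "closest to the cluster centroid" as the meaning of representative; none of these choices is stated in the abstract. The function names and the `pool_factor` parameter are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy of one predicted class distribution (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns centroids and assignments."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members; empty clusters keep their centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def diverse_uncertainty_sample(embeddings, probs, budget, pool_factor=3, seed=0):
    """Return up to `budget` node indices that are uncertain and spread across clusters."""
    if budget > len(embeddings):
        raise ValueError("budget exceeds the number of candidate nodes")
    scores = [entropy(p) for p in probs]
    # Step 1: keep only the most uncertain nodes (pool_factor is a hypothetical knob).
    pool_size = min(len(scores), budget * pool_factor)
    pool = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:pool_size]
    points = [embeddings[i] for i in pool]
    # Step 2: cluster the uncertain pool into `budget` groups.
    centroids, assign = kmeans(points, budget, seed=seed)
    # Step 3: from each non-empty cluster, pick the member nearest its centroid.
    chosen = []
    for c in range(budget):
        members = [j for j, a in enumerate(assign) if a == c]
        if members:
            best = min(members, key=lambda j: sq_dist(points[j], centroids[c]))
            chosen.append(pool[best])
    return chosen


# Toy data: nodes 0-5 are uncertain and form two distant groups; nodes 6-11 are confident.
toy_embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (10.0, 10.0), (10.1, 10.0), (10.0, 10.1),
] + [(5.0, 5.0)] * 6
toy_probs = [(0.5, 0.5)] * 6 + [(0.99, 0.01)] * 6

picked = diverse_uncertainty_sample(toy_embeddings, toy_probs, budget=2)
print(sorted(picked))
```

With a budget of two, a pure maximum-uncertainty rule could spend both labels inside the same group of near-identical nodes, since all six uncertain nodes score equally. The clustering step instead forces one pick from each region of the embedding space. If k-means leaves a cluster empty, this sketch returns fewer than `budget` indices.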
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning and Algorithms · Machine Learning in Materials Science
Methods: Graph Neural Network · Test
