GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning
Guibin Zhang, Haonan Dong, Yuchen Zhang, Zhixun Li, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang

TL;DR
GDeR is a novel graph pruning method that dynamically selects balanced and representative training subsets using trainable prototypes, significantly reducing data and computational costs while maintaining or improving GNN performance on large, imbalanced, and noisy datasets.
Contribution
Introduces GDeR, a dynamic soft-pruning approach with trainable prototypes that constructs a graph embedding hypersphere for balanced subset sampling, enhancing efficiency and robustness.
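The idea of prototype-guided, class-balanced subset sampling can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `balanced_prototype_sample`, the equal per-class budget, and the weighting rule (samples farther from their class prototype are drawn more often) are all assumptions made for illustration. The prototypes here are fixed inputs, whereas the paper describes them as trainable.

```python
import math
import random


def normalize(v):
    """Project a vector onto the unit hypersphere (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity of two unit vectors (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))


def balanced_prototype_sample(embeddings, labels, prototypes, keep_ratio, seed=0):
    """Return sorted indices of a class-balanced subset.

    Hypothetical helper. Assumed rule: each class gets the same budget, and
    within a class a sample's weight is 1 - cos(embedding, class prototype).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    # Split the overall budget evenly across classes.
    budget = max(1, int(keep_ratio * len(labels)))
    per_class = max(1, budget // len(by_class))

    kept = []
    for y, members in by_class.items():
        proto = normalize(prototypes[y])
        # Small epsilon keeps every weight strictly positive.
        weights = {
            i: 1.0 - cosine(normalize(embeddings[i]), proto) + 1e-6
            for i in members
        }
        pool = list(members)
        # Weighted sampling without replacement; small classes are kept whole.
        for _ in range(min(per_class, len(pool))):
            chosen = rng.choices(pool, weights=[weights[i] for i in pool], k=1)[0]
            kept.append(chosen)
            pool.remove(chosen)
    return sorted(kept)


# Toy imbalanced data: 8 samples of class 0, 2 samples of class 1.
embs = [[1.0, 0.1 * i] for i in range(8)] + [[-1.0, 0.5], [-1.0, -0.5]]
labels = [0] * 8 + [1] * 2
protos = {0: [1.0, 0.0], 1: [-1.0, 0.0]}
kept = balanced_prototype_sample(embs, labels, protos, keep_ratio=0.4)
```

In a dynamic soft-pruning setting, a selection like this would presumably be redrawn during training rather than fixed once, so samples dropped in one round can return in a later one.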
Findings
Achieves 30-50% data reduction while maintaining performance.
Provides up to a 2.81x training speedup without performance loss.
Outperforms existing pruning methods in imbalanced and noisy scenarios.
Abstract
Training high-quality deep models necessitates vast amounts of data, resulting in overwhelming computational and memory demands. Recently, data pruning, distillation, and coreset selection have been developed to streamline data volume by retaining, synthesizing, or selecting a small yet informative subset from the full set. Among these methods, data pruning incurs the least additional training cost and offers the most practical acceleration benefits. However, it is the most vulnerable, often suffering significant performance degradation with imbalanced or biased data schema, thus raising concerns about its accuracy and reliability in on-device deployment. Therefore, there is a looming need for a new data pruning paradigm that maintains the efficiency of previous practices while ensuring balance and robustness. Unlike the fields of computer vision and natural language processing, where…
Taxonomy
Topics: Software Engineering Research
Methods: Dataset Pruning · Pruning
