Simple yet Effective Graph Distillation via Clustering

Yurui Lai; Taiyan Zhang; Renchi Yang

arXiv:2505.20807·cs.LG·May 28, 2025

Open Access

TL;DR

This paper introduces ClustGDD, a graph data distillation method that uses clustering to build compact graphs efficiently, so that GNNs can be trained faster with minimal loss in performance.

Contribution

ClustGDD is a novel graph data distillation approach that combines clustering with attribute refinement, improving efficiency and quality over existing methods.

Findings

1. ClustGDD achieves node classification accuracy comparable to or better than existing methods.

2. It significantly reduces distillation time compared to state-of-the-art methods.

3. Experimental results on five datasets validate its effectiveness.

Abstract

Despite the many successes achieved by graph representation learning in various domains, the training of graph neural networks (GNNs) remains challenging due to the tremendous computational overhead needed for sizable graphs in practice. Recently, graph data distillation (GDD), which seeks to distill large graphs into compact and informative ones, has emerged as a promising technique to enable efficient GNN training. However, most existing GDD works rely on heuristics that align model gradients or representation distributions on condensed and original graphs, leading to compromised result quality, expensive training for distilling large graphs, or both. Motivated by this, this paper presents an efficient and effective GDD approach, ClustGDD. Under the hood, ClustGDD resorts to synthesizing the condensed graph and node attributes through fast and theoretically-grounded…
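
The abstract is truncated here and this page gives no algorithmic detail beyond "clustering", so the following is only a generic sketch of how a graph can be condensed by clustering its nodes. It is not ClustGDD itself. Everything in it is an assumption made for illustration: plain k-means on the raw node attributes, cluster means as the condensed attributes, and inter-cluster edge counts (the product C^T A C for a one-hot assignment matrix C) as the condensed adjacency. The function names `kmeans` and `condense` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def kmeans(points: Sequence[Sequence[float]], k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd's k-means (illustrative assumption, not the paper's clustering).

    Centroids are initialised deterministically from evenly spaced points.
    """
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def condense(adj: Matrix, feats: Matrix, labels: Sequence[int], k: int) -> Tuple[Matrix, Matrix]:
    """Collapse every cluster into one synthetic node.

    Condensed attributes are cluster means; condensed adjacency sums the
    original adjacency entries between (and within) clusters, i.e. C^T A C.
    """
    n, d = len(feats), len(feats[0])
    sizes = [0] * k
    new_feats = [[0.0] * d for _ in range(k)]
    for i in range(n):
        sizes[labels[i]] += 1
        for j in range(d):
            new_feats[labels[i]][j] += feats[i][j]
    for c in range(k):
        if sizes[c]:
            new_feats[c] = [v / sizes[c] for v in new_feats[c]]
    new_adj = [[0.0] * k for _ in range(k)]
    for i in range(n):
        for j in range(n):
            new_adj[labels[i]][labels[j]] += adj[i][j]
    return new_adj, new_feats


# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0
feats = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [1.0, 1.0], [1.2, 1.0], [1.0, 1.2]]

labels = kmeans(feats, k=2)
small_adj, small_feats = condense(adj, feats, labels, k=2)
```

On this toy input the six nodes collapse into two synthetic nodes, one per triangle. A GNN would then be trained on `small_adj` and `small_feats` in place of the original graph. Any refinement of the condensed attributes, which the Contribution section mentions, is outside this sketch.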

Taxonomy

Topics: Text and Document Classification Technologies · Neural Networks and Applications · Data Mining Algorithms and Applications

Methods: ALIGN