Bonsai: Gradient-free Graph Condensation for Node Classification
Mridul Gupta, Samyak Jain, Vansh Ramani, Hariprasad Kodamana, and Sayan Ranu

TL;DR
Bonsai introduces a gradient-free, linear-time graph condensation method that efficiently compresses datasets for node classification, outperforming existing methods in speed and accuracy while being model-agnostic and robust.
Contribution
Bonsai is the first linear-time, model-agnostic graph condensation algorithm based on computation trees, addressing key limitations of prior gradient-based methods.
Findings
Outperforms existing baselines in accuracy across 7 datasets.
Operates 22 times faster on average than previous methods.
Provides mathematical guarantees on approximation strategies.
Abstract
Graph condensation has emerged as a promising avenue to enable scalable training of GNNs by compressing the training dataset while preserving essential graph characteristics. Our study uncovers significant shortcomings in current graph condensation techniques. First, the majority of the algorithms paradoxically require training on the full dataset to perform condensation. Second, due to their gradient-emulating approach, these methods require fresh condensation for any change in hyperparameters or GNN architecture, limiting their flexibility and reusability. Finally, they fail to achieve substantial size reduction due to synthesizing fully-connected, edge-weighted graphs. To address these challenges, we present Bonsai, a novel graph condensation method empowered by the observation that computation trees form the fundamental processing units of message-passing GNNs. Bonsai…
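The abstract's observation that computation trees are the processing units of message-passing GNNs can be illustrated with a small sketch. This is not Bonsai's algorithm: the function names (`computation_tree`, `message_pass`), the toy path graph, and the plain sum aggregation are all assumptions chosen for illustration. The sketch only shows that nodes with identical depth-L computation trees receive identical embeddings after L rounds of message passing.

```python
def computation_tree(adj, feats, node, depth):
    """Canonical, hashable encoding of the depth-`depth` computation tree
    rooted at `node` (hypothetical helper, not taken from the paper)."""
    if depth == 0:
        return (feats[node], ())
    # Sort child encodings so that neighbor order does not matter.
    children = sorted(
        computation_tree(adj, feats, nb, depth - 1) for nb in adj[node]
    )
    return (feats[node], tuple(children))


def message_pass(adj, feats, layers):
    """Toy message passing: each layer adds the neighbors' values to a
    node's own value (sum aggregation, an assumption for this sketch)."""
    h = list(feats)
    for _ in range(layers):
        h = [h[v] + sum(h[nb] for nb in adj[v]) for v in range(len(h))]
    return h


# Toy path graph 0 - 1 - 2 - 3 with identical scalar features.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = [1, 1, 1, 1]

trees = [computation_tree(adj, feats, v, 2) for v in range(4)]
emb = message_pass(adj, feats, 2)

# The two end nodes share a computation tree, and so do the two middle nodes;
# nodes that share a tree end up with the same embedding.
print(trees[0] == trees[3], emb[0] == emb[3])
print(trees[1] == trees[2], emb[1] == emb[2])
```

In this toy setting, keeping one representative per distinct computation tree would lose no information for a 2-layer model, which gives some intuition for why a condensation method might reason about trees rather than gradients.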
Peer Reviews
Decision: ICLR 2025 Poster
1. Compared to previous works, BONSAI is novel. 2. The experimental results look very good, especially regarding the training time. 3. The theoretical analysis is solid.
1. This paper is not easy to understand. 2. In some cases, BONSAI does not perform the best, such as on Citeseer. 3. Regarding Table 5, can you provide experimental results for other compression rates? 4. PPR and RkNN involve many parameters, and the ablation study in Fig. 4(b) is insufficient.
1. The proposed gradient-free approach bypasses the need for computationally expensive gradient calculations, resulting in a significantly faster distillation process. This efficiency makes Bonsai highly scalable even for large datasets. 2. This model-agnostic method is interesting and saves effort in hyperparameter tuning when changing condensation models. 3. It is the first distillation method that retains the original node features and synthesizes graphs with unweighted edges, which more fai
1. The idea of Bonsai is very similar to MIRAGE[1], as both methods select frequent trees. This similarity makes Bonsai appear to be a minor adaptation of MIRAGE. Furthermore, much of the theoretical analysis, such as Graph Isomorphism, is borrowed from MIRAGE. Although these two works focus on different tasks, it's strongly recommended to discuss the differences between Bonsai and MIRAGE in the related work section. 2. This paper claims that "distilling to fully-condensed graph" is a problem fo
1. The performance of Bonsai is impressive, including the cross-architecture results. 2. The proof that maximizing the representative power of exemplars is NP-hard is simple but appealing.
See questions.
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning and ELM
