TL;DR
This paper introduces a low-bit quantization method for deep GNNs that maintains accuracy and mitigates oversmoothing, enabling efficient processing and significant speedups in resource-constrained environments.
Contribution
The paper proposes a novel quantization approach with a smoothness-aware message propagation mechanism, effectively reducing model size and addressing oversmoothing in deep GNNs.
Findings
Achieves high accuracy with INT2 quantization across all GNN stages.
Demonstrates over 5x speedup in inference with low-bit models.
Outperforms existing quantization and deep GNN methods.
Abstract
Graph Neural Network (GNN) training and inference face significant scalability challenges with respect to both model size and number of layers, which degrade efficiency and accuracy for large and deep GNNs. We present an end-to-end solution that aims to address these challenges for efficient GNNs in resource-constrained environments while avoiding the oversmoothing problem in deep GNNs. We introduce a quantization-based approach for all stages of GNNs, from message passing in training to node classification, compressing the model and enabling efficient processing. The proposed GNN quantizer learns quantization ranges and reduces the model size with comparable accuracy even under low-bit quantization. To scale with the number of layers, we devise a message propagation mechanism in training that controls layer-wise changes of similarities between neighboring nodes.…
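To make "low-bit quantization with a quantization range" concrete, here is a minimal sketch of uniform quantize-then-dequantize in plain Python. It is an illustration under stated assumptions, not the paper's quantizer: the function name `fake_quantize` and the parameters `lo`, `hi`, and `bits` are hypothetical, and the range is passed in by hand, whereas the abstract describes a quantizer that learns its ranges during training.

```python
def fake_quantize(values, lo, hi, bits):
    """Snap each value to one of 2**bits evenly spaced levels in [lo, hi].

    Hypothetical helper for illustration only. `lo` and `hi` stand in for
    a quantization range; in the paper the ranges are learned, here they
    are fixed arguments.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    steps = 2 ** bits - 1            # number of gaps between adjacent levels
    scale = (hi - lo) / steps        # width of one quantization step
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)          # clamp into the range
        code = round((clipped - lo) / scale)   # integer code in [0, steps]
        out.append(lo + code * scale)          # map the code back to a float
    return out


# INT2 example: 2 bits give 4 representable levels in [-1.5, 1.5].
features = [-2.0, -0.4, 0.2, 1.1, 3.0]
quantized = fake_quantize(features, lo=-1.5, hi=1.5, bits=2)
print(quantized)  # [-1.5, -0.5, 0.5, 1.5, 1.5]
```

With 2 bits only four distinct values survive, and anything outside the range is clipped to its nearest endpoint. This shows why the choice of range matters at such low bit widths: too narrow a range clips many values, too wide a range wastes the few available levels.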
