TL;DR
This paper presents MixQ-GNN, a mixed precision quantization framework for GNNs that systematically selects integer bit-widths for the individual components of each GNN layer. It reduces bit operations by roughly 5x while keeping prediction accuracy comparable to full precision.
Contribution
It introduces a theorem for efficient integer message aggregation and a systematic method for selecting mixed precision bit-widths in GNN layers, enhancing efficiency without sacrificing performance.
Findings
Achieved 5.5x reduction in bit operations for node classification
Achieved 5.1x reduction in bit operations for graph classification
Maintained comparable prediction performance with reduced computational cost
Abstract
Graph Neural Networks (GNNs) have become essential for handling large-scale graph applications. However, the computational demands of GNNs necessitate the development of efficient methods to accelerate inference. Mixed precision quantization emerges as a promising solution to enhance the efficiency of GNN architectures without compromising prediction performance. Compared to conventional deep learning architectures, GNN layers contain a wider set of components that can be quantized, including message passing functions, aggregation functions, update functions, the inputs, learnable parameters, and outputs of these functions. In this paper, we introduce a theorem for efficient quantized message passing to aggregate integer messages. It guarantees numerical equality of the aggregated messages using integer values with respect to those obtained with full (FP32) precision. Based on this…
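The idea of aggregating integer messages without losing agreement with the floating-point result can be illustrated with a small sketch. This is not the paper's theorem or implementation. It only shows the underlying identity for sum aggregation, assuming every message is quantized with the same scale and zero point (the function names, the affine quantization scheme, and the example values are all hypothetical choices made for illustration). Under that assumption, summing the raw integers and rescaling once gives the same value as dequantizing each message and summing in floating point.

```python
# Illustrative sketch only: shared-scale affine quantization with sum aggregation.
# All names and parameters are assumptions, not taken from the MixQ-GNN paper.

def quantize(values, scale, zero_point, bits):
    """Uniform affine quantization to unsigned `bits`-bit integers."""
    qmax = (1 << bits) - 1
    return [min(max(round(v / scale) + zero_point, 0), qmax) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to real values: x = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


def aggregate_float(q_messages, scale, zero_point):
    """Reference path: dequantize every message, then sum in floating point."""
    return sum(dequantize(q_messages, scale, zero_point))


def aggregate_int(q_messages, scale, zero_point):
    """Integer path: sum the raw integers, remove the zero-point offset, rescale once."""
    int_sum = sum(q_messages) - len(q_messages) * zero_point
    return scale * int_sum


# Messages arriving at one node from four neighbours (made-up values).
messages = [0.5, -1.25, 2.0, 0.75]
# A power-of-two scale keeps every float operation in this example exact.
scale, zero_point, bits = 0.125, 128, 8

q = quantize(messages, scale, zero_point, bits)
result_int = aggregate_int(q, scale, zero_point)
result_float = aggregate_float(q, scale, zero_point)
print(q, result_int, result_float)
```

With a scale that is not a power of two, the two paths can differ by floating-point rounding in the reference path, so a comparison with a tolerance (for example `math.isclose`) would be the safer check in that setting.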
Taxonomy
Methods: Sparse Evolutionary Training
