GHN-Q: Parameter Prediction for Unseen Quantized Convolutional Architectures via Graph Hypernetworks
Stone Yun, Alexander Wong

TL;DR
This paper introduces GHN-Q, a graph hypernetwork-based method that predicts quantization-robust parameters for unseen CNN architectures, enabling effective 8-bit and even 4-bit quantized models without retraining.
Contribution
GHN-Q is the first method to use graph hypernetworks to predict parameters for unseen quantized CNNs, and it shows promising results for quantization robustness.
Findings
GHN-Q can predict quantization-robust parameters for unseen CNNs.
The predicted parameters reach decent accuracy with 8-bit quantization, even without training on quantized data.
Quantized finetuning at lower bitwidths could bring further improvements.
Abstract
Deep convolutional neural network (CNN) training via iterative optimization has had incredible success in finding optimal parameters. However, modern CNN architectures often contain millions of parameters. Thus, any given model for a single architecture resides in a massive parameter space. Models with similar loss could have drastically different characteristics such as adversarial robustness, generalizability, and quantization robustness. For deep learning on the edge, quantization robustness is often crucial. Finding a model that is quantization-robust can sometimes require significant efforts. Recent works using Graph Hypernetworks (GHN) have shown remarkable performance predicting high-performant parameters of varying CNN architectures. Inspired by these successes, we wonder if the graph representations of GHN-2 can be leveraged to predict quantization-robust parameters as well,…
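To make the notion of quantization robustness concrete, here is a minimal sketch of symmetric uniform "fake quantization" (quantize, then dequantize) applied to a list of weights at 8 and 4 bits. It is a generic illustration, not the quantization scheme used in the paper, which this summary does not specify. The names `fake_quantize` and `mean_abs_error`, and the randomly generated weights, are assumptions made for this example.

```python
import random


def fake_quantize(weights, num_bits):
    """Symmetric uniform quantize-then-dequantize of a list of floats.

    Illustrative only: a per-tensor scheme chosen for simplicity,
    not taken from the GHN-Q paper.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # nothing to scale
    scale = max_abs / qmax  # width of one quantization step
    dequantized = []
    for w in weights:
        q = round(w / scale)            # nearest integer level
        q = max(-qmax, min(qmax, q))    # clamp to the representable range
        dequantized.append(q * scale)   # map back to floating point
    return dequantized


def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weights standing in for one layer of a CNN.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]

err8 = mean_abs_error(weights, fake_quantize(weights, 8))
err4 = mean_abs_error(weights, fake_quantize(weights, 4))
print(f"8-bit mean abs error: {err8:.6f}")
print(f"4-bit mean abs error: {err4:.6f}")
```

The 4-bit step size is much coarser than the 8-bit one, so the rounding error per weight is larger. A quantization-robust model is one whose accuracy degrades little under this kind of perturbation.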
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Adversarial Robustness in Machine Learning
