QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization
Xiantao Jiang

TL;DR
QuIDE introduces a unified metric for evaluating quantized neural networks, balancing compression, accuracy, and latency, and guides optimal quantization strategies across diverse tasks.
Contribution
It proposes the Intelligence Index as a comprehensive evaluation metric and offers a reproducible protocol for mixed-precision quantization search.
Findings
4-bit quantization is optimal for MNIST and large LLMs.
8-bit quantization is optimal for ResNet-18 on ImageNet.
4-bit PTQ collapses accuracy on complex CNN tasks.
Abstract
There is currently no unified metric for evaluating the efficiency of quantized neural networks. We propose QuIDE, built around the Intelligence Index I = (C × P) / log_2(T + 1), which collapses the compression-accuracy-latency trade-off into a single score. Experiments across six settings (SimpleCNN on MNIST and CIFAR, ResNet-18 on ImageNet-1K, and Llama-3-8B) show a task-dependent Pareto knee: 4-bit quantization is optimal for MNIST and large LLMs, while 8-bit is the sweet spot for complex CNN tasks (ResNet-18 on ImageNet), where 4-bit PTQ collapses accuracy catastrophically. The accuracy-gated variant I' correctly flags these non-viable configurations that the raw I would reward. QuIDE provides a reproducible evaluation protocol and a ready-to-use fitness function for mixed-precision search.
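As a rough illustration of how the index could serve as a fitness function, the sketch below computes I from the formula in the abstract. The abstract does not define the symbols, so the reading used here is an assumption: C is taken as the compression ratio, P as task accuracy, and T as latency in some fixed unit. The gating rule in `gated_intelligence_index` is also hypothetical, since the abstract says only that I' is "accuracy-gated"; the `baseline_performance` and `min_retention` parameters are invented for this example and are not taken from the paper.

```python
import math


def intelligence_index(compression, performance, latency):
    """Raw index I = (C * P) / log2(T + 1).

    Assumed meanings (not confirmed by the abstract):
    compression = C, performance = P (accuracy), latency = T.
    """
    if latency <= 0:
        # log2(T + 1) must be strictly positive to avoid division by zero.
        raise ValueError("latency must be positive")
    return (compression * performance) / math.log2(latency + 1)


def gated_intelligence_index(compression, performance, latency,
                             baseline_performance, min_retention=0.95):
    """Hypothetical accuracy gate standing in for I'.

    Scores a configuration as 0 when its accuracy falls below a fraction
    (min_retention) of the full-precision baseline; otherwise returns the
    raw index. The actual gating rule in the paper may differ.
    """
    if performance < min_retention * baseline_performance:
        return 0.0
    return intelligence_index(compression, performance, latency)


# Made-up numbers for illustration only, not results from the paper.
raw = intelligence_index(compression=4.0, performance=0.9, latency=3.0)
collapsed = gated_intelligence_index(8.0, 0.1, 3.0, baseline_performance=0.9)
```

Under this assumed gate, a configuration whose accuracy has collapsed scores zero even though its high compression ratio would give it a nonzero raw I, which is the behavior the abstract attributes to I'.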
