Uncertainty Makes It Stable: Curiosity-Driven Quantized Mixture-of-Experts
Sebastián Andrés Cajas Ordóñez, Luis Fernando Torres Torres, Mackenzie J. Meni, Carlos Andrés Duran Paredes, Eric Arazo, Cristian Bosch, Ricardo Simon Carbajo, Yuan Lai, Leo Anthony Celi

TL;DR
This paper introduces a curiosity-driven quantized Mixture-of-Experts framework that maintains high accuracy and stability in resource-constrained neural network deployment by leveraging Bayesian uncertainty for dynamic expert routing.
Contribution
It presents a novel uncertainty-based routing mechanism in quantized Mixture-of-Experts models, improving accuracy, stability, and interpretability for edge AI applications.
Findings
4-bit quantization retains 99.9% of the full-precision F1 score (0.858 vs 0.859) with 4x compression.
Curiosity-driven routing reduces cross-fold variance by up to 85% (reported on Quinn).
High-precision experts automatically handle uncertain samples, enhancing stability.
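To make the low-bit experts concrete, here is a minimal sketch of two weight quantizers: a symmetric uniform k-bit quantizer and an absmean ternary quantizer in the style of BitNet b1.58. The function names, the per-tensor scaling, and the example weights are illustrative assumptions; the paper's actual quantization procedure is not given in this summary.

```python
def quantize_uniform(weights, bits):
    """Symmetric per-tensor quantization to `bits` bits (assumes bits >= 2).

    Returns (dequantized_weights, integer_codes).
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return [c * scale for c in codes], codes


def quantize_ternary(weights):
    """Absmean ternary quantization: every code is -1, 0, or +1.

    Returns (dequantized_weights, ternary_codes).
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return [c * scale for c in codes], codes


# Hypothetical weights, for illustration only.
example_weights = [0.8, -0.4, 0.05, -1.2]
_, codes_4bit = quantize_uniform(example_weights, bits=4)
_, codes_ternary = quantize_ternary(example_weights)
print(codes_4bit)      # integer codes in [-7, 7]
print(codes_ternary)   # codes in {-1, 0, 1}
```

Storing the integer codes plus a single scale per tensor is what yields the memory saving; the dequantized values show how much precision each weight loses.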
Abstract
Deploying deep neural networks on resource-constrained devices faces two critical challenges: maintaining accuracy under aggressive quantization while ensuring predictable inference latency. We present a curiosity-driven quantized Mixture-of-Experts framework that addresses both through Bayesian epistemic uncertainty-based routing across heterogeneous experts (BitNet ternary, 1-16 bit BitLinear, post-training quantization). Evaluated on audio classification benchmarks (ESC-50, Quinn, UrbanSound8K), our 4-bit quantization maintains 99.9 percent of full-precision F1 (0.858 vs 0.859) with 4x compression and 31 percent energy savings versus 8-bit, while both achieve statistical parity with full precision (p > 0.05). Crucially, curiosity-driven routing simultaneously improves accuracy and stability: on Quinn, F1 increases from 0.802 to 0.809 while cross-fold variance drops by 85 percent (p…
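The routing idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes epistemic uncertainty is estimated as the mutual information across several stochastic forward passes (for example, MC dropout or an ensemble), and that routing is a hard threshold between one low-precision and one high-precision expert. The names `epistemic_uncertainty`, `route`, `EXPERTS`, and the threshold value are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def epistemic_uncertainty(samples):
    """Mutual-information estimate of epistemic uncertainty.

    samples: list of class-probability vectors, one per stochastic pass.
    Computes H(mean prediction) - mean(H(each prediction)).
    """
    n, k = len(samples), len(samples[0])
    mean_p = [sum(s[j] for s in samples) / n for j in range(k)]
    total = entropy(mean_p)                           # predictive entropy
    aleatoric = sum(entropy(s) for s in samples) / n  # expected entropy
    return max(0.0, total - aleatoric)                # clamp float noise


def route(samples, experts, threshold=0.1):
    """Send uncertain inputs to the high-precision expert (assumed rule)."""
    u = epistemic_uncertainty(samples)
    key = "high_precision" if u > threshold else "low_precision"
    return experts[key], u


# Hypothetical expert labels, for illustration only.
EXPERTS = {"low_precision": "ternary", "high_precision": "16-bit"}

agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]           # passes agree
disagree = [[0.95, 0.05], [0.05, 0.95], [0.5, 0.5]]    # passes disagree
print(route(agree, EXPERTS)[0])
print(route(disagree, EXPERTS)[0])
```

When the stochastic passes agree, the mutual information is close to zero and the cheap expert handles the input; when they disagree, the input is sent to the high-precision expert, which matches the finding that high-precision experts handle uncertain samples.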
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Wireless Signal Modulation Classification
