MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI

Zijun Jiang; Yangdi Lyu

arXiv:2508.09500·cs.LG·August 14, 2025

TL;DR

MiCo is an end-to-end framework that optimizes mixed-precision quantization schemes for neural networks on edge devices, balancing accuracy and latency for efficient deployment.

Contribution

The paper introduces MiCo, a novel framework that efficiently explores and deploys mixed-precision quantized neural networks tailored for edge AI hardware.

Findings

01. Searches for mixed-precision quantization schemes that maximize accuracy under a latency constraint.

02. Builds hardware-aware latency models that enable fast exploration.

03. Deploys models directly from PyTorch to C with minimal accuracy loss.
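To make the idea of latency-constrained exploration concrete, here is a minimal sketch of choosing per-layer bitwidths under a latency budget. It is not MiCo's algorithm: the page does not describe the search method, so this uses plain exhaustive enumeration as a stand-in. The tables `LATENCY_MS` and `ACC_DROP`, the function names, and every number in them are hypothetical and chosen only for illustration; the additive per-layer latency model is likewise an assumption.

```python
from itertools import product

# Hypothetical per-layer latency (ms) for each bitwidth. Invented numbers,
# standing in for what a hardware-aware latency model might predict.
LATENCY_MS = [
    {8: 4.0, 4: 2.5, 2: 1.5},  # layer 0
    {8: 6.0, 4: 3.5, 2: 2.0},  # layer 1
    {8: 3.0, 4: 2.0, 2: 1.2},  # layer 2
]

# Hypothetical per-layer accuracy drop (percentage points) for each bitwidth.
ACC_DROP = [
    {8: 0.0, 4: 0.4, 2: 2.5},
    {8: 0.0, 4: 0.2, 2: 1.0},
    {8: 0.0, 4: 0.8, 2: 4.0},
]


def predicted_latency(scheme):
    """Sum per-layer latencies (assumes layers contribute additively)."""
    return sum(LATENCY_MS[i][bits] for i, bits in enumerate(scheme))


def predicted_drop(scheme):
    """Sum per-layer accuracy drops (a crude proxy, assumed additive)."""
    return sum(ACC_DROP[i][bits] for i, bits in enumerate(scheme))


def best_scheme(budget_ms, bitwidths=(8, 4, 2)):
    """Return the feasible scheme with the smallest predicted accuracy drop.

    Enumerates every per-layer bitwidth assignment, so it only scales to
    toy networks. Returns None when no scheme meets the budget.
    """
    best = None
    for scheme in product(bitwidths, repeat=len(LATENCY_MS)):
        if predicted_latency(scheme) > budget_ms:
            continue  # violates the latency constraint
        if best is None or predicted_drop(scheme) < predicted_drop(best):
            best = scheme
    return best


if __name__ == "__main__":
    scheme = best_scheme(budget_ms=9.2)
    print(scheme, predicted_latency(scheme), predicted_drop(scheme))
```

Exhaustive search grows exponentially with the number of layers, which is why a real framework needs both a cheap latency model and a smarter search strategy than this one.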

Abstract

Quantized Neural Networks (QNNs) with extremely low-bitwidth data have proven promising for efficient storage and computation on edge devices. To further reduce the accuracy drop while increasing speedup, layer-wise mixed-precision quantization (MPQ) has become a popular solution. However, existing algorithms for exploring MPQ schemes are limited in flexibility and efficiency. Comprehending the complex impacts of different MPQ schemes on post-training quantization and quantization-aware training results is a challenge for conventional methods. Furthermore, an end-to-end framework for the optimization and deployment of MPQ models is missing in existing work. In this paper, we propose the MiCo framework, a holistic MPQ exploration and deployment framework for edge AI applications. The framework adopts a novel optimization algorithm to search for optimal quantization schemes with the highest…
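For readers unfamiliar with low-bitwidth quantization, the sketch below shows one common scheme: symmetric uniform quantization with a single shared scale per tensor. This is a generic textbook technique and an assumption here; the page does not say which quantizer MiCo uses. The function names `quantize_symmetric` and `dequantize` are hypothetical.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer levels."""
    return [x * scale for x in q]


if __name__ == "__main__":
    weights = [0.3, -1.0, 0.7, 0.0]
    for bits in (8, 4, 2):
        q, scale = quantize_symmetric(weights, bits)
        approx = dequantize(q, scale)
        worst = max(abs(w - a) for w, a in zip(weights, approx))
        print(bits, q, round(worst, 4))
```

Running the loop shows the reconstruction error growing as the bitwidth shrinks, which is the accuracy-versus-cost trade-off that layer-wise mixed precision tries to manage by giving sensitive layers more bits.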
