NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN

Jianhang Xie; Chuntao Ding; Xiaqing Li; Shenyuan Ren; Yidong Li; Zhichao Lu

arXiv:2506.17870·cs.LG·June 24, 2025

TL;DR

NestQuant is a resource-efficient post-training quantization method for IoT devices. It uses integer weight nesting and adaptive weight decomposition to enable dynamic on-device model switching with low storage and switching overheads.

Contribution

The paper proposes an integer-nesting quantization technique that lets a device switch between quantized models without retraining and without storing multiple separate models, reducing storage consumption and switching overheads.
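The nesting idea can be illustrated with a minimal sketch. It assumes an 8-bit signed integer weight that is split by plain floor division into a 4-bit high part and a 4-bit low residual. The function names `decompose` and `recompose`, the 8/4 bit split, and the floor rounding are illustrative assumptions, not taken from the paper or its repository; the paper's adaptive decomposition may round differently.

```python
def decompose(w, low_bits=4):
    """Split a signed integer weight into (high, low) parts.

    Hypothetical helper: uses floor division via an arithmetic right shift,
    so `low` is always a non-negative residual in [0, 2**low_bits).
    """
    high = w >> low_bits              # floor(w / 2**low_bits), also for negative w
    low = w - (high << low_bits)      # residual that restores the full-bit weight
    return high, low


def recompose(high, low, low_bits=4):
    """Rebuild the full-bit weight from the nested high and low parts."""
    return (high << low_bits) + low


# One stored weight serves two precisions.
w_full = -77                          # example INT8 weight
high, low = decompose(w_full)         # high fits in signed 4 bits, low in unsigned 4 bits
w_part = high << 4                    # part-bit approximation using the high part only
assert recompose(high, low) == w_full
assert abs(w_full - w_part) < 2 ** 4  # dropping the low part costs less than one high-part step
```

Under these assumptions, a device could run the part-bit model from the high parts alone and load the low parts only when resources allow the full-bit model, so the two models share one set of stored weights.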

Findings

01 Achieves high accuracy with nested quantized models on ImageNet.

02 Reduces switching overheads by approximately 78%.

03 Enables resource-adaptive model deployment on IoT devices.

Abstract

Deploying quantized deep neural network (DNN) models with resource adaptation capabilities on ubiquitous Internet of Things (IoT) devices to provide high-quality AI services can leverage the benefits of compression and meet multi-scenario resource requirements. However, existing dynamic/mixed-precision quantization requires retraining or special hardware, whereas post-training quantization (PTQ) has two limitations for resource adaptation: (i) state-of-the-art PTQ methods provide only one fixed-bitwidth model, which makes it challenging to adapt to the dynamic resources of IoT devices; (ii) deploying multiple PTQ models with diverse bitwidths incurs large storage and switching overheads. To this end, this paper introduces a resource-friendly post-training integer-nesting quantization, i.e., NestQuant, for on-device quantized model switching on IoT devices. The proposed…

Code & Models

Repositories

jianhayes/nestquant (PyTorch, official)

Taxonomy

Methods: NesT