Quantization-Guided Training for Compact TinyML Models

Sedigh Ghamari; Koray Ozcan; Thu Dinh; Andrey Melnikov; Juan Carvajal; Jan Ernst; Sek Chai

arXiv:2103.06231·cs.LG·March 11, 2021·5 cites

Open Access

TL;DR

This paper introduces Quantization Guided Training (QGT), a method that compresses deep neural networks below 8-bit precision (down to 2 bits for the reported person detection model), yielding tiny models for resource-constrained environments with a small accuracy loss.

Contribution

The paper presents a novel quantization-aware training approach that uses customized regularization to optimize low-bit-precision models and identify compression bottlenecks.

Findings

01

Achieved 17.7x size reduction with 2-bit precision on a tiny person detection model.

02

Limited the accuracy drop to 3% relative to the floating-point baseline.

03

Validated effectiveness on state-of-the-art architectures and vision datasets.

Abstract

We propose a Quantization Guided Training (QGT) method to guide DNN training towards optimized low-bit-precision targets and reach extreme compression levels below 8-bit precision. Unlike standard quantization-aware training (QAT) approaches, QGT uses customized regularization to encourage weight values towards a distribution that maximizes accuracy while reducing quantization errors. One of the main benefits of this approach is the ability to identify compression bottlenecks. We validate QGT using state-of-the-art model architectures on vision datasets. We also demonstrate the effectiveness of QGT with an 81KB tiny model for person detection down to 2-bit precision (representing 17.7x size reduction), while maintaining an accuracy drop of only 3% compared to a floating-point baseline.
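The idea of a regularizer that pulls weights towards low-bit targets can be illustrated with a small sketch. This is not the authors' QGT formulation, which the abstract does not spell out; it is a generic example that assumes a uniform min-max quantization grid and a mean-squared quantization-error penalty. The names `uniform_quantize`, `quantization_penalty`, `total_loss`, and the weighting factor `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def uniform_quantize(w, num_bits):
    """Snap each weight to the nearest point of a uniform grid spanning [w.min(), w.max()].

    Assumption: a plain min-max uniform quantizer, not necessarily the one used in the paper.
    """
    steps = 2 ** num_bits - 1  # number of intervals between the lowest and highest level
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:
        return w.copy()  # constant tensor: already representable
    scale = (w_max - w_min) / steps
    return np.round((w - w_min) / scale) * scale + w_min


def quantization_penalty(w, num_bits):
    """Mean squared distance between the weights and their quantized values."""
    return float(np.mean((w - uniform_quantize(w, num_bits)) ** 2))


def total_loss(task_loss, weights, num_bits, lam):
    """Task loss plus a weighted quantization-error penalty summed over weight tensors.

    `lam` is a hypothetical hyperparameter trading accuracy against quantization error.
    """
    penalty = sum(quantization_penalty(w, num_bits) for w in weights)
    return task_loss + lam * penalty


# Weights that already sit on the 2-bit grid incur no penalty; off-grid weights do.
on_grid = np.array([0.0, 1.0, 2.0, 3.0])
off_grid = np.array([0.0, 0.4, 2.0, 3.0])
print(quantization_penalty(on_grid, 2))
print(quantization_penalty(off_grid, 2))
```

In a sketch like this, computing the penalty separately for each layer also gives a rough per-layer measure of quantization error, which is one way to see how a regularization-based approach could expose compression bottlenecks.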

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques