Batch Normalization-Free Fully Integer Quantized Neural Networks via Progressive Tandem Learning

Pengfei Sun; Wenyu Jiang; Piew Yoong Chee; Paul Devos; Dick Botteldooren

arXiv:2512.16476·cs.LG·December 19, 2025

Open Access

TL;DR

This paper introduces a method to train fully integer quantized neural networks without batch normalization, using progressive layer-wise distillation, enabling efficient integer-only inference suitable for edge devices.
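To make "integer-only inference" concrete, here is a minimal sketch of one dense layer with ReLU that uses only integer operations at inference time. It is not taken from the paper: the symmetric per-tensor scales (`s_x`, `s_w`, `s_y`), the 8-bit width, the 16-bit fixed-point shift, and all function names are assumptions chosen for illustration. Floating point appears only in the offline preparation step.

```python
def quantize(values, scale, bits=8):
    """Offline step: map real values to signed integers, q = clamp(round(v / scale))."""
    qmin, qmax = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def fixed_point_multiplier(real_multiplier, shift=16):
    """Offline step: approximate a positive real multiplier as mult / 2**shift."""
    return round(real_multiplier * (1 << shift)), shift


def int_dense_relu(q_x, q_w, q_b, mult, shift, bits=8):
    """Inference step: dense layer + ReLU using integer arithmetic only."""
    qmax = (1 << (bits - 1)) - 1
    out = []
    for row, bias in zip(q_w, q_b):
        # Wide integer accumulator (int32 on typical hardware).
        acc = bias + sum(w * x for w, x in zip(row, q_x))
        # Requantize to the output scale: integer multiply, then rounding right shift.
        y = (acc * mult + (1 << (shift - 1))) >> shift
        # ReLU and saturation to the activation range.
        out.append(max(0, min(qmax, y)))
    return out


# Hypothetical scales and parameters, chosen only for this example.
s_x, s_w, s_y = 0.05, 0.02, 0.1
x = [1.0, -0.5, 0.25]
w = [[0.5, -1.0, 0.3], [-0.2, 0.4, 1.0]]
b = [0.1, -0.3]

# Offline preparation (floating point is allowed here).
q_x = quantize(x, s_x)
q_w = [quantize(row, s_w) for row in w]
q_b = [round(v / (s_x * s_w)) for v in b]  # bias stored at the accumulator scale
mult, shift = fixed_point_multiplier(s_x * s_w / s_y)

# Inference (integers only).
q_y = int_dense_relu(q_x, q_w, q_b, mult, shift)
print(q_y)
```

The layer has no normalisation step between the accumulator and the output. A BN layer placed there would need its running mean and variance applied at inference, which is the dependency the paper sets out to remove.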

Contribution

A novel progressive, layer-wise distillation approach that trains BN-free, fully integer quantized neural networks from pretrained teachers, compatible with existing low-bit pipelines.

Findings

01. Achieves competitive Top-1 accuracy on ImageNet with AlexNet under aggressive quantization.

02. Enables end-to-end integer-only inference compatible with standard workflows.

03. Facilitates deployment on resource-constrained edge and embedded devices.

Abstract

Quantised neural networks (QNNs) shrink models and reduce inference energy through low-bit arithmetic, yet most still depend on batch normalisation (BN) layers that rely on running statistics, preventing true integer-only deployment. Prior attempts remove BN by parameter folding or tailored initialisation; while helpful, they rarely recover BN's stability and accuracy and often impose bespoke constraints. We present a BN-free, fully integer QNN trained via a progressive, layer-wise distillation scheme that slots into existing low-bit pipelines. Starting from a pretrained BN-enabled teacher, we use layer-wise targets and progressive compensation to train a student that performs inference exclusively with integer arithmetic and contains no BN operations. On ImageNet with AlexNet, the BN-free model attains competitive Top-1 accuracy under aggressive quantisation. The procedure integrates directly with…
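The "parameter folding" that the abstract lists among prior approaches is commonly understood as absorbing BN's inference-time affine transform into the preceding layer's weights and bias. The sketch below illustrates that general idea for a small dense layer in floating point. It is an assumption about what folding means here, not the paper's procedure, and the function names and numeric values are hypothetical.

```python
import math


def dense(x, w, b):
    """Plain dense layer: z_i = b_i + sum_j w_ij * x_j."""
    return [bias + sum(wi * xi for wi, xi in zip(row, x)) for row, bias in zip(w, b)]


def batchnorm(z, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN using fixed running statistics (per output channel)."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(z, gamma, beta, mean, var)]


def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN into the dense layer: w' = k * w, b' = k * (b - mean) + beta,
    where k = gamma / sqrt(var + eps) for each output channel."""
    w_folded, b_folded = [], []
    for row, bias, g, bt, m, s in zip(w, b, gamma, beta, mean, var):
        k = g / math.sqrt(s + eps)
        w_folded.append([k * wi for wi in row])
        b_folded.append(k * (bias - m) + bt)
    return w_folded, b_folded


# Hypothetical layer parameters and running statistics.
x = [1.0, -0.5, 0.25]
w = [[0.5, -1.0, 0.3], [-0.2, 0.4, 1.0]]
b = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
running_mean, running_var = [0.4, -0.2], [0.9, 1.6]

reference = batchnorm(dense(x, w, b), gamma, beta, running_mean, running_var)
w_f, b_f = fold_bn(w, b, gamma, beta, running_mean, running_var)
folded = dense(x, w_f, b_f)

max_abs_diff = max(abs(r - f) for r, f in zip(reference, folded))
print(max_abs_diff)
```

In floating point the folded layer reproduces the BN output up to rounding error. The sketch does not cover what happens once the folded weights are quantised to a few bits, which is the regime the abstract is concerned with.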

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques