True 4-Bit Quantized Convolutional Neural Network Training on CPU: Achieving Full-Precision Parity
Shivnath Tathe

TL;DR
This paper introduces a practical method for training 4-bit quantized convolutional neural networks on standard CPUs, achieving accuracy comparable to full-precision models while significantly reducing memory usage.
Contribution
The author presents a novel 4-bit training technique that uses only standard CPU operations and matches full-precision accuracy without specialized hardware or post-training quantization.
Findings
Achieves 92.34% test accuracy on CIFAR-10 with 4-bit training on CPU
Matches full-precision baseline accuracy without a GPU or specialized kernels
Demonstrates generalization to CIFAR-100 and rapid convergence on mobile devices
Abstract
Low-precision neural network training has emerged as a promising direction for reducing computational costs and democratizing access to deep learning research. However, existing 4-bit quantization methods either rely on expensive GPU infrastructure or suffer from significant accuracy degradation. In this work, we present a practical method for training convolutional neural networks at true 4-bit precision using standard PyTorch operations on commodity CPUs. We introduce a novel tanh-based soft weight clipping technique that, combined with symmetric quantization, dynamic per-layer scaling, and straight-through estimators, achieves stable convergence and competitive accuracy. Training a VGG-style architecture with 3.25 million parameters from scratch on CIFAR-10, our method achieves 92.34% test accuracy on Google Colab's free CPU tier -- matching full-precision baseline performance…
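The four ingredients named in the abstract (tanh-based soft clipping, symmetric quantization, dynamic per-layer scaling, and a straight-through estimator) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the clip is taken to be c * tanh(w / c) with a hypothetical bound c, the 4-bit grid is taken to be the symmetric integer range [-7, 7], and the scale is recomputed from the largest clipped weight in the layer on every call. The sketch uses plain Python rather than the PyTorch operations the paper mentions, and the function names are illustrative only.

```python
import math

QMAX = 7  # assumed symmetric signed 4-bit grid: integer codes in [-7, 7]


def soft_clip(w, c=1.0):
    """Smoothly squash a weight into (-c, c) using tanh.

    The bound c is a hypothetical hyperparameter, not taken from the paper.
    """
    return c * math.tanh(w / c)


def quantize_layer(weights, c=1.0):
    """Fake-quantize one layer's weights to the assumed 4-bit grid.

    Returns (dequantized weights, integer codes, scale).
    """
    clipped = [soft_clip(w, c) for w in weights]
    # Dynamic per-layer scale: derived from the current weights on each call.
    max_abs = max(abs(v) for v in clipped)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    # Symmetric quantization: round to the nearest integer code, then clamp.
    codes = [max(-QMAX, min(QMAX, round(v / scale))) for v in clipped]
    dequant = [q * scale for q in codes]
    return dequant, codes, scale


def ste_backward(weights, grad_out, c=1.0):
    """Backward pass under a straight-through estimator.

    round() is treated as the identity, so only the tanh clip contributes:
    d/dw [c * tanh(w / c)] = 1 - tanh(w / c) ** 2.
    The dependence of the scale on the weights is ignored (a simplification).
    """
    return [g * (1.0 - math.tanh(w / c) ** 2) for w, g in zip(weights, grad_out)]


if __name__ == "__main__":
    layer = [0.9, -0.3, 0.05, -2.0]
    dequant, codes, scale = quantize_layer(layer)
    grads = ste_backward(layer, [1.0] * len(layer))
    print(codes, scale, grads)
```

One consequence of the soft clip in this sketch is that large weights still receive a small, nonzero gradient, whereas a hard clamp would zero it outside the clipping range. Whether this matches the paper's motivation is not stated in the abstract.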
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Network Packet Processing and Optimization
