Efficiera Residual Networks: Hardware-Friendly Fully Binary Weight with 2-bit Activation Model Achieves Practical ImageNet Accuracy
Shuntaro Takahashi, Takuya Wakisaka, Hiroyuki Tokunaga

TL;DR
Efficiera Residual Networks (ERNs) are fully ultra-low-bit quantized neural networks optimized for edge devices, achieving high accuracy and fast inference on ImageNet with minimal model size and hardware resource usage.
Contribution
ERNs introduce a shared constant scaling factor technique that enables fully binary weights and 2-bit activations, with no floating-point operations until the final layer.
Findings
Achieve 72.5% top-1 accuracy on ImageNet with a ResNet50 architecture.
Run at 300 FPS on an FPGA with models under 1 MB.
Maintain high accuracy and speed with fully binary weights and 2-bit activations.
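To put the sub-1 MB figure in perspective, the following back-of-the-envelope sketch computes how many binary weights fit in 1 MB and how much space the same number of weights would need at 32-bit float precision. It is illustrative arithmetic only, not taken from the paper: it assumes a decimal megabyte (1,000,000 bytes), counts weight storage alone, and the helper name `weight_storage_bytes` is hypothetical.

```python
def weight_storage_bytes(num_weights: int, bits_per_weight: int) -> int:
    """Bytes needed to store num_weights at bits_per_weight, rounded up."""
    return (num_weights * bits_per_weight + 7) // 8


ONE_MB = 1_000_000  # assumption: decimal megabyte

# At 1 bit per weight, every byte holds 8 weights.
max_binary_weights = ONE_MB * 8
print(max_binary_weights)  # 8000000

# The same number of weights stored as 32-bit floats.
print(weight_storage_bytes(max_binary_weights, 32))  # 32000000
```

Under these assumptions, a 1 MB budget holds about 8 million binary weights, which would occupy 32 MB as 32-bit floats.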
Abstract
The edge-device environment imposes severe resource limitations, encompassing computation costs, hardware resource usage, and energy consumption for deploying deep neural network models. Ultra-low-bit quantization and hardware accelerators have been explored as promising approaches to address these challenges. Ultra-low-bit quantization significantly reduces the model size and the computational cost. Despite progress so far, many competitive ultra-low-bit models still partially rely on float or non-ultra-low-bit quantized computation such as the input and output layer. We introduce Efficiera Residual Networks (ERNs), a model optimized for low-resource edge devices. ERNs achieve full ultra-low-bit quantization, with all weights, including the initial and output layers, being binary, and activations set at 2 bits. We introduce the shared constant scaling factor technique to enable…
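As a rough illustration of what binary weights with 2-bit activations mean in practice, here is a minimal NumPy sketch. It is not the authors' implementation: the text above does not specify how the shared constant scaling factor is defined, so the sketch assumes a single hypothetical constant `SCALE` applied once to the integer accumulator. The function names, the clipping range, and the rounding scheme are likewise assumptions made for illustration.

```python
import numpy as np

SCALE = 0.05  # hypothetical shared constant; the paper's actual value/form is not given here


def binarize_weights(w: np.ndarray) -> np.ndarray:
    """Map real-valued weights to {-1, +1}; zero maps to +1."""
    return np.where(w >= 0, 1, -1).astype(np.int8)


def quantize_activations_2bit(x: np.ndarray, max_value: float = 1.0) -> np.ndarray:
    """Clip to [0, max_value], then round to one of four integer levels 0..3."""
    levels = 3  # 2 bits give 4 levels: 0, 1, 2, 3
    clipped = np.clip(x, 0.0, max_value)
    return np.rint(clipped / max_value * levels).astype(np.int32)


def quantized_dot(w_bin: np.ndarray, a_q: np.ndarray) -> int:
    """Integer-only accumulation of binary weights times 2-bit activations."""
    return int(np.dot(w_bin.astype(np.int32), a_q))


w_bin = binarize_weights(np.array([-0.3, 0.0, 2.0]))
a_q = quantize_activations_2bit(np.array([0.4, 0.7, 1.0]))
acc = quantized_dot(w_bin, a_q)  # stays an integer throughout
output = SCALE * acc             # the only float operation, applied at the end
```

Because the weights are restricted to -1 and +1 and the activations to four integer levels, the dot product reduces to integer additions and subtractions, which is the kind of arithmetic that maps well onto FPGA logic.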
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Adversarial Robustness in Machine Learning
Methods: Convolution · Sparse Evolutionary Training
