Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang

TL;DR
This paper introduces BNext, a binary neural network architecture that, combined with improved optimization, knowledge distillation, and data augmentation, achieves over 80% top-1 accuracy on ImageNet, surpassing previous binary models.
Contribution
The paper presents BNext, a new binary neural network architecture, together with enhanced training techniques that achieve state-of-the-art accuracy among binary neural networks on ImageNet.
Findings
BNext achieves 80.57% top-1 accuracy on ImageNet.
It significantly outperforms existing binary neural networks.
The proposed methods substantially narrow the accuracy gap to full-precision models.
Abstract
Binary neural networks are the extreme case of network quantization, which has long been thought of as a potential edge machine learning solution. However, the significant accuracy gap to the full-precision counterparts restricts their creative potential for mobile applications. In this work, we revisit the potential of binary neural networks and focus on a compelling but unanswered problem: how can a binary neural network achieve the crucial accuracy level (e.g., 80%) on ILSVRC-2012 ImageNet? We achieve this goal by enhancing the optimization process from three complementary perspectives: (1) We design a novel binary architecture BNext based on a comprehensive study of binary architectures and their optimization process. (2) We propose a novel knowledge-distillation technique to alleviate the counter-intuitive overfitting problem observed when attempting to train extremely accurate…
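To make "the extreme case of network quantization" concrete, here is a minimal sketch of what binarization means in general. It is not BNext's implementation. It assumes plain sign binarization of weights and activations to +1/-1 and the commonly used XNOR-plus-popcount formulation of the binary dot product. All function names (`binarize`, `pack_bits`, `binary_dot`) and the sample values are hypothetical and chosen only for illustration.

```python
def binarize(values):
    """Map each real value to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int, one bit per element (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    # XNOR marks positions where both signs agree; count those positions.
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * agree - n


# Illustrative values only (not taken from the paper).
weights = [0.7, -1.2, 0.1, -0.4]
activations = [0.5, 0.3, -0.9, -0.2]

w_sign = binarize(weights)
a_sign = binarize(activations)

# Reference result with ordinary multiply-accumulate on the +1/-1 values.
reference = sum(w * a for w, a in zip(w_sign, a_sign))
# Same result using only bitwise operations on the packed representation.
fast = binary_dot(pack_bits(w_sign), pack_bits(a_sign), len(weights))
```

Because every value is stored in a single bit and the multiply-accumulate reduces to bitwise operations, such networks are attractive for edge devices. The price is the loss of information in each value, which is the source of the accuracy gap the abstract refers to.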
Taxonomy
Topics: Advanced Neural Network Applications · COVID-19 diagnosis using AI · Machine Learning and Data Classification
