FATNN: Fast and Accurate Ternary Neural Networks
Peng Chen, Bohan Zhuang, Chunhua Shen

TL;DR
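FATNN is a ternary neural network framework that halves the computational complexity of the ternary inner product under mild constraints and pairs this with an implementation-dependent ternary quantization algorithm to narrow the accuracy gap to full-precision networks. On image classification tasks it reports faster inference and higher accuracy than existing ternary methods.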
Contribution
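The paper makes two contributions: it shows that, under mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2, and it designs an implementation-dependent ternary quantization algorithm to reduce the accuracy gap between TNNs and full-precision networks.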
Findings
Surpasses state-of-the-art accuracy in image classification
Reduces ternary inner product complexity by a factor of 2
Provides comprehensive speedup analysis across platforms
Abstract
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts. However, 2 bits are required to encode the ternary representation with only 3 quantization levels leveraged. As a result, conventional TNNs have similar memory consumption and speed compared with the standard 2-bit models, but have worse representational capability. Moreover, there is still a significant gap in accuracy between TNNs and full-precision networks, hampering their deployment to real applications. To tackle these two challenges, in this work, we first show that, under some mild constraints, computational complexity of the ternary inner product can be reduced by a factor of 2. Second, to mitigate the performance gap, we elaborately design an implementation-dependent ternary quantization…
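To make the encoding cost described in the abstract concrete, here is a minimal sketch of a conventional bitwise ternary inner product. It is not the reduced-complexity scheme proposed in the paper. It only illustrates why a value in {-1, 0, +1} takes 2 bits (here one "nonzero" bit and one "sign" bit) and how the inner product then reduces to logical operations and popcounts. The names `pack_ternary`, `popcount`, and `ternary_dot`, and the two-bit-plane layout, are assumptions made for this illustration.

```python
def pack_ternary(values):
    """Pack ternary values into two bit-planes stored as Python ints.

    Hypothetical layout for illustration: bit i of `mask` is 1 when
    values[i] != 0, and bit i of `sign` is 1 when values[i] == -1.
    """
    mask = 0
    sign = 0
    for i, v in enumerate(values):
        if v not in (-1, 0, 1):
            raise ValueError(f"not a ternary value: {v!r}")
        if v != 0:
            mask |= 1 << i
        if v < 0:
            sign |= 1 << i
    return mask, sign


def popcount(x):
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def ternary_dot(a, b):
    """Inner product of two packed ternary vectors.

    A position contributes only when both values are nonzero; it
    contributes -1 when the signs differ and +1 otherwise.
    """
    a_mask, a_sign = a
    b_mask, b_sign = b
    both_nonzero = a_mask & b_mask
    signs_differ = both_nonzero & (a_sign ^ b_sign)
    return popcount(both_nonzero) - 2 * popcount(signs_differ)


x = [1, -1, 0, 1, -1, 0, 1, 1]
y = [1, 1, -1, 0, -1, 0, -1, 1]

# Cross-check the bitwise result against the plain arithmetic definition.
reference = sum(p * q for p, q in zip(x, y))
assert ternary_dot(pack_ternary(x), pack_ternary(y)) == reference
```

In this baseline, each packed word costs two ANDs, one XOR, and two popcounts. The factor-of-2 reduction claimed in the abstract refers to the paper's own formulation under its stated constraints, which this sketch does not attempt to reproduce.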
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
