Low-bit Shift Network for End-to-End Spoken Language Understanding

Anderson R. Avila; Khalil Bibi; Rui Heng Yang; Xinlin Li; Chao Xing; Xiao Chen

arXiv:2207.07497·cs.SD·July 18, 2022

TL;DR

This paper introduces a low-bit shift neural network using power-of-two quantization for end-to-end spoken language understanding, achieving high accuracy with reduced computational complexity suitable for edge devices.

Contribution

It proposes a novel low-bit power-of-two quantization method for shift neural networks, enabling efficient SLU with performance comparable to full-precision models.

Findings

01. Achieved 98.76% accuracy on the test set.

02. Reduced computational complexity by removing multiplications.

03. Performed comparably to state-of-the-art solutions.
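
The second finding rests on a simple arithmetic fact: multiplying an integer by a signed power of two can be done with a bit shift and, if needed, a negation. The sketch below illustrates that fact in plain Python. The function name `shift_multiply` and the sample values are hypothetical and are not taken from the paper, and the paper's actual kernel may differ.

```python
def shift_multiply(x: int, exponent: int, sign: int = 1) -> int:
    """Compute sign * x * 2**exponent using only a shift and a negation.

    x is an integer activation (an assumption of this sketch); the weight
    is represented by its sign and its power-of-two exponent.
    """
    if exponent >= 0:
        shifted = x << exponent       # same as x * 2**exponent
    else:
        shifted = x >> -exponent      # floor division by 2**(-exponent)
    return shifted if sign >= 0 else -shifted


# A dot product with power-of-two weights needs no multiplications.
activations = [3, -5, 12]
exponents = [2, 0, -1]                # weight magnitudes 4, 1, 0.5
signs = [1, -1, 1]                    # weights +4, -1, +0.5

acc = sum(
    shift_multiply(a, e, s)
    for a, e, s in zip(activations, exponents, signs)
)
print(acc)  # 23, equal to 3*4 + (-5)*(-1) + 12*0.5
```

Note that a right shift floors the result, so negative exponents can lose the low bits of an activation; how a real implementation handles that rounding is a design choice this sketch does not address.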

Abstract

Deep neural networks (DNNs) have achieved impressive success in multiple domains. Over the years, the accuracy of these models has increased with the proliferation of deeper and more complex architectures. As a result, state-of-the-art solutions are often computationally expensive, which makes them unfit for deployment on edge computing platforms. To mitigate the high computation, memory, and power requirements of inference with convolutional neural networks (CNNs), we propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity both by removing expensive multiplication operations and by using low-bit weights. ResNet is adopted as the building block of our solution, and the proposed model is evaluated on a spoken language understanding (SLU) task. Experimental results show improved performance…
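
To make the idea of quantizing continuous parameters into low-bit power-of-two values concrete, here is a minimal sketch. It assumes one particular scheme that the abstract does not specify: one sign bit plus `bits - 1` bits selecting one of several consecutive exponents ending at `max_exp`, with rounding done in the log2 domain. The names `quantize_pow2`, `dequantize`, `bits`, and `max_exp` are illustrative, and the paper's training procedure and exact quantizer may differ.

```python
import math


def quantize_pow2(w: float, bits: int = 3, max_exp: int = 0):
    """Return (sign, exponent) so that sign * 2**exponent approximates w.

    Assumed scheme: 2**(bits - 1) allowed exponents, from
    max_exp - 2**(bits - 1) + 1 up to max_exp.
    """
    n_levels = 2 ** (bits - 1)
    min_exp = max_exp - n_levels + 1
    if w == 0.0:
        # Sign 0 marks an exact zero here for simplicity; a real bit
        # budget would have to account for this extra state.
        return 0, min_exp
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))          # nearest exponent in log2 domain
    exp = max(min_exp, min(max_exp, exp))   # clamp to the representable range
    return sign, exp


def dequantize(sign: int, exp: int) -> float:
    """Recover the power-of-two value encoded by (sign, exp)."""
    return sign * 2.0 ** exp


weights = [0.30, -0.07, 0.9, 0.0]
quantized = [quantize_pow2(w) for w in weights]
restored = [dequantize(s, e) for s, e in quantized]
print(quantized)  # [(1, -2), (-1, -3), (1, 0), (0, -3)]
print(restored)   # [0.25, -0.125, 1.0, 0.0]
```

Because only the sign and a small exponent are stored, each weight fits in a few bits, and applying it to an integer activation reduces to a shift rather than a multiplication.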

Taxonomy

Topics: Speech Recognition and Synthesis · Geophysical Methods and Applications · Speech and Audio Processing

Methods: Test · Residual Connection · 1x1 Convolution · Batch Normalization · Kaiming Initialization · Max Pooling · Average Pooling · Convolution · Global Average Pooling