A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)

Yi-Te Hsu; Yu-Chen Lin; Szu-Wei Fu; Yu Tsao; Tei-Wei Kuo

arXiv:1808.06474 · eess.AS · November 1, 2018

TL;DR

This paper introduces EOFP-QNN, a novel exponent-only floating-point quantization method for neural networks, significantly reducing model size for speech enhancement tasks without degrading performance.

Contribution

The paper proposes a two-stage quantization approach, mantissa quantization followed by exponent quantization, aimed at regression tasks such as speech enhancement, and shows that it reduces model size effectively.

Findings

01. Model sizes were reduced to 18.75% (BLSTM) and 21.89% (FCN) of the original models.

02. Speech enhancement performance was maintained after quantization.

03. First application of exponent-only floating-point quantization in speech processing.

Abstract

Numerous studies have investigated the effectiveness of neural network quantization on pattern classification tasks. The present study, for the first time, investigated the performance of speech enhancement (a regression task in speech processing) using a novel exponent-only floating-point quantized neural network (EOFP-QNN). The proposed EOFP-QNN consists of two stages: mantissa-quantization and exponent-quantization. In the mantissa-quantization stage, EOFP-QNN learns how to quantize the mantissa bits of the model parameters while preserving the regression accuracy using the least mantissa precision. In the exponent-quantization stage, the exponent part of the parameters is further quantized without causing any additional performance degradation. We evaluated the proposed EOFP quantization technique on two types of neural networks, namely, bidirectional long short-term memory (BLSTM)…
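To make the mantissa-quantization idea concrete, here is a minimal sketch that simply zeroes the low-order mantissa bits of IEEE 754 float32 weights. It is an illustration under stated assumptions, not the paper's procedure: the abstract describes a learned quantization, whereas this is plain truncation, and the function name `truncate_mantissa` and parameter `keep_bits` are hypothetical.

```python
import numpy as np

def truncate_mantissa(weights, keep_bits):
    """Keep only the top `keep_bits` of the 23 float32 mantissa bits.

    Hypothetical helper for illustration: it truncates bits directly and
    does not reproduce the learned quantization described in the abstract.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    w = np.asarray(weights, dtype=np.float32)
    bits = w.view(np.uint32)                # reinterpret the raw bit pattern
    drop = 23 - keep_bits                   # low-order mantissa bits to clear
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    return (bits & mask).view(np.float32)   # sign and exponent are untouched

weights = np.array([1.75, -0.3, 3.0])
print(truncate_mantissa(weights, 0))   # exponent-only: signed powers of two
print(truncate_mantissa(weights, 23))  # full float32 precision, unchanged
```

With `keep_bits=0` every value collapses to a signed power of two, which is the "exponent-only" limit: each weight is then described by its sign and exponent alone.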

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Max Pooling · Convolution · Fully Convolutional Network