MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder

You-Jin Li; Syu-Siang Wang; Yu Tsao; Borching Su

arXiv:2005.11704 · eess.AS · June 8, 2021 · APSIPA ASC · 1 citation

Open Access

TL;DR

This paper introduces a multi-input multi-output (MIMO) speech compression and enhancement system built on convolutional denoising autoencoders, which improves speech quality while reducing the amount of data transmitted in IoT environments.

Contribution

It proposes a new MIMO-SCE framework based on CDAE models, enhancing speech quality and compression efficiency over traditional systems.

Findings

01 Improves speech quality and intelligibility.

02 Reduces data transmission by a factor of 7.

03 Effective in handling multiple acoustic signals.

Abstract

For speech-related applications in IoT environments, identifying effective methods to handle interference noises and compress the amount of data in transmissions is essential to achieve high-quality services. In this study, we propose a novel multi-input multi-output speech compression and enhancement (MIMO-SCE) system based on a convolutional denoising autoencoder (CDAE) model to simultaneously improve speech quality and reduce the dimensions of transmission data. Compared with conventional single-channel and multi-input single-output systems, MIMO systems can be employed in applications where multiple acoustic signals need to be handled. We investigated two CDAE models, a fully convolutional network (FCN) and a Sinc FCN, as the core models in MIMO systems. The experimental results confirm that the proposed MIMO-SCE framework effectively improves speech quality and…
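To make the idea of a multi-input multi-output convolutional autoencoder concrete, the following is a minimal NumPy sketch of the data flow only: several noisy channels go in, a strided convolution shrinks them to a smaller latent representation (the part that would be transmitted), and a decoder maps the latent back to the same number of output channels. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper `conv1d`, the three-channel input, the single encoder and decoder layers, the kernel sizes, and the random untrained weights. The stride of 7 is chosen only so that the size ratio mirrors the factor of 7 listed in the findings; this excerpt does not describe how the paper actually obtains that figure.

```python
import numpy as np

def conv1d(x, w, stride=1):
    """Strided 1-D convolution without padding.
    x: (in_ch, T), w: (out_ch, in_ch, K) -> (out_ch, T_out)."""
    _, T = x.shape
    out_ch, _, K = w.shape
    T_out = (T - K) // stride + 1
    y = np.zeros((out_ch, T_out))
    for t in range(T_out):
        seg = x[:, t * stride : t * stride + K]            # (in_ch, K) window
        y[:, t] = np.tensordot(w, seg, axes=([1, 2], [0, 1]))
    return y

# Hypothetical sizes: 3 microphones, 1400 samples per channel.
rng = np.random.default_rng(0)
n_mics, n_samples, stride, latent_ch = 3, 1400, 7, 3
noisy = rng.standard_normal((n_mics, n_samples))           # stand-in for noisy speech

# Untrained random weights; a real CDAE would learn these from noisy/clean pairs.
w_enc = rng.standard_normal((latent_ch, n_mics, stride)) * 0.1
w_dec = rng.standard_normal((n_mics, latent_ch, 1)) * 0.1

# Encoder: strided convolution over all input channels, then a nonlinearity.
latent = np.tanh(conv1d(noisy, w_enc, stride=stride))      # (3, 200)

# Decoder: nearest-neighbour upsampling, then a pointwise channel-mixing conv.
upsampled = np.repeat(latent, stride, axis=1)              # (3, 1400)
enhanced = conv1d(upsampled, w_dec, stride=1)              # (3, 1400)

ratio = noisy.size / latent.size
print(latent.shape, enhanced.shape, ratio)
```

In this toy setup the latent holds one seventh as many values as the input, and the output has one channel per input channel, which is what distinguishes a MIMO arrangement from a multi-input single-output one. With random weights the output is not an enhanced signal; the sketch only illustrates shapes and the compression ratio.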

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques

Methods: Convolution · Max Pooling · Fully Convolutional Network