MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder
You-Jin Li, Syu-Siang Wang, Yu Tsao, Borching Su

TL;DR
This paper introduces a novel MIMO speech compression and enhancement system using convolutional denoising autoencoders, significantly improving speech quality and reducing data transmission needs in IoT environments.
Contribution
It proposes a new MIMO-SCE framework based on CDAE models, enhancing speech quality and compression efficiency over traditional systems.
Findings
Improves speech quality and intelligibility.
Reduces data transmission by a factor of 7.
Effective in handling multiple acoustic signals.
Abstract
For speech-related applications in IoT environments, identifying effective methods to handle interference noises and compress the amount of data in transmissions is essential to achieve high-quality services. In this study, we propose a novel multi-input multi-output speech compression and enhancement (MIMO-SCE) system based on a convolutional denoising autoencoder (CDAE) model to simultaneously improve speech quality and reduce the dimensions of transmission data. Compared with conventional single-channel and multi-input single-output systems, MIMO systems can be employed in applications where multiple acoustic signals need to be handled. We investigated two CDAE models, a fully convolutional network (FCN) and a Sinc FCN, as the core models in MIMO systems. The experimental results confirm that the proposed MIMO-SCE framework effectively improves speech quality and…
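The encode, transmit, decode idea behind a convolutional autoencoder with multiple input and output channels can be sketched as follows. This is a minimal illustration in NumPy, not the authors' architecture: the layer count, kernel size, pooling factor, channel counts, and all function names (`conv1d`, `max_pool`, `upsample`, `encode`, `decode`) are assumptions chosen for readability, and the weights are random rather than trained. In a denoising setup the weights would be fitted so that the decoder output approximates clean speech given noisy input.

```python
import numpy as np

def conv1d(x, w, b):
    """'Same'-padded 1-D convolution (cross-correlation), odd kernel size.
    x: (C_in, T), w: (C_out, C_in, K), b: (C_out,) -> (C_out, T)."""
    k = w.shape[2]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    # Sliding windows over time: shape (C_in, T, K).
    win = np.lib.stride_tricks.sliding_window_view(xp, k, axis=1)
    return np.einsum("oik,itk->ot", w, win) + b[:, None]

def max_pool(x, p):
    """Non-overlapping max pooling along time: (C, T) -> (C, T // p)."""
    c, t = x.shape
    return x[:, : t // p * p].reshape(c, t // p, p).max(axis=2)

def upsample(x, p):
    """Nearest-neighbour upsampling along time: (C, T) -> (C, T * p)."""
    return np.repeat(x, p, axis=1)

def encode(x, params, pool):
    """Sender side: convolution + ReLU + pooling yields a compact latent."""
    h = np.maximum(conv1d(x, *params["enc"]), 0.0)
    return max_pool(h, pool)

def decode(z, params, pool):
    """Receiver side: upsample the latent and map it back to C_out channels."""
    return conv1d(upsample(z, pool), *params["dec"])

# Hypothetical sizes (assumptions, not taken from the paper).
C_IN, C_LAT, C_OUT, T, K, POOL = 3, 1, 3, 512, 9, 4

rng = np.random.default_rng(0)
params = {  # untrained random weights, for shape illustration only
    "enc": (0.1 * rng.standard_normal((C_LAT, C_IN, K)), np.zeros(C_LAT)),
    "dec": (0.1 * rng.standard_normal((C_OUT, C_LAT, K)), np.zeros(C_OUT)),
}

noisy = rng.standard_normal((C_IN, T))    # multi-channel noisy input
latent = encode(noisy, params, POOL)      # what would be transmitted
enhanced = decode(latent, params, POOL)   # multi-channel output
ratio = noisy.size / latent.size          # values in vs. values transmitted

print(latent.shape, enhanced.shape, ratio)
```

With these illustrative sizes the latent holds fewer values than the input, which is where the reduction in transmitted data would come from; the ratio produced here follows only from the assumed sizes and is unrelated to the factor reported in the paper.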
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
Methods: Convolution · Max Pooling · Fully Convolutional Network
