Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement
Haoyu Li, Junichi Yamagishi

TL;DR
This paper introduces a GAN-based deep learning system that enhances speech intelligibility in noisy environments by optimizing multiple speech metrics while maintaining constant signal power, suitable for both real-time and offline applications.
Contribution
The paper presents a GAN-based multi-metric optimization framework for near-end speech intelligibility enhancement, which the authors report improves performance and robustness over optimizing a single metric.
Findings
Significant improvements in speech intelligibility metrics.
Robust performance under noisy and reverberant conditions.
Effective in both real-time and offline scenarios.
Abstract
The intelligibility of speech severely degrades in the presence of environmental noise and reverberation. In this paper, we propose a novel deep learning based system for modifying the speech signal to increase its intelligibility under the equal-power constraint, i.e., signal power before and after modification must be the same. To achieve this, we use generative adversarial networks (GANs) to obtain time-frequency dependent amplification factors, which are then applied to the input raw speech to reallocate the speech energy. Instead of optimizing only a single, simple metric, we train a deep neural network (DNN) model to simultaneously optimize multiple advanced speech metrics, including both intelligibility- and quality-related ones, which results in notable improvements in performance and robustness. Our system can not only work in non-realtime mode for offline audio playback but…
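The equal-power constraint described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `apply_gains_equal_power`, the non-overlapping rectangular framing, and the random `gains` array (standing in for the amplification factors a trained generator would produce) are all assumptions made for illustration.

```python
import numpy as np

def apply_gains_equal_power(x, gains, frame_len=256):
    """Apply per-frame, per-frequency gains to x, then rescale to the input power.

    Hypothetical helper: uses non-overlapping rectangular frames for simplicity,
    which is an assumption and not the paper's analysis/synthesis setup.
    """
    n_frames = len(x) // frame_len
    x = x[: n_frames * frame_len]                 # drop the incomplete last frame
    frames = x.reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)            # shape: (n_frames, frame_len // 2 + 1)
    # Reallocate energy across time-frequency bins.
    modified = np.fft.irfft(spec * gains, n=frame_len, axis=1).reshape(-1)
    # Equal-power constraint: match the total energy of the (trimmed) input.
    scale = np.sqrt(np.sum(x ** 2) / (np.sum(modified ** 2) + 1e-12))
    return x, modified * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)                     # placeholder for a speech signal
gains = rng.uniform(0.5, 2.0, size=(16, 129))     # placeholder for generator output
x_ref, y = apply_gains_equal_power(x, gains)
```

After the final rescaling, `y` has the same mean power as `x_ref`, so any change in intelligibility would come from how the energy is redistributed rather than from making the signal louder.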
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research
