Optimization of a Real-Time Wavelet-Based Algorithm for Improving Speech Intelligibility
Tianqu Kang, Anh-Dung Dinh, Binghong Wang, Tianyuan Du, Yijia Chen, and Kevin Chau (Hong Kong University of Science and Technology)

TL;DR
This paper presents a real-time wavelet-based algorithm that enhances speech intelligibility by adjusting sub-band gains, improving transcription accuracy under various noise and hearing loss conditions, with applications in hearing aids and speech processing.
Contribution
It introduces a simplified, real-time wavelet-based method for speech enhancement that effectively improves intelligibility across different noise levels and hearing impairments.
Findings
16.9 percentage point average increase in transcription accuracy for clean (noise-free) speech
9.5% increase in transcription accuracy for noisy speech
Universal sub-band gains effective up to 4.8 dB noise-to-signal ratio
Abstract
The optimization of a wavelet-based algorithm to improve speech intelligibility along with the full data set and results are reported. The discrete-time speech signal is split into frequency sub-bands via a multi-level discrete wavelet transform. Various gains are applied to the sub-band signals before they are recombined to form a modified version of the speech. The sub-band gains are adjusted while keeping the overall signal energy unchanged, and the speech intelligibility under various background interference and simulated hearing loss conditions is enhanced and evaluated objectively and quantitatively using Google Speech-to-Text transcription. A universal set of sub-band gains can work over a range of noise-to-signal ratios up to 4.8 dB. For noise-free speech, overall intelligibility is improved, and the Google transcription accuracy is increased by 16.9 percentage points on average…
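The pipeline described in the abstract (split into sub-bands, scale each band, recombine, keep the total energy unchanged) can be illustrated with a minimal sketch. Several assumptions here are not taken from the paper: a Haar wavelet is used for simplicity, the function names (`haar_dwt`, `apply_subband_gains`, and so on) are hypothetical, the gain values are placeholders rather than the optimized gains the authors report, and energy is preserved by a single global rescaling after reconstruction, which may differ from how the authors constrain the gains.

```python
import math

def haar_dwt(x):
    """One level of orthonormal Haar analysis; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def wavedec(x, levels):
    """Multi-level DWT. Returns [approx_L, detail_L, ..., detail_1],
    i.e. sub-bands ordered from lowest to highest frequency."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return [approx] + details[::-1]

def waverec(bands):
    """Inverse of wavedec."""
    approx = bands[0]
    for d in bands[1:]:
        approx = haar_idwt(approx, d)
    return approx

def energy(x):
    return sum(v * v for v in x)

def apply_subband_gains(x, gains):
    """Scale each wavelet sub-band by its gain, reconstruct, then rescale
    so the output energy equals the input energy (assumed normalization)."""
    levels = len(gains) - 1
    if levels < 1 or len(x) % (2 ** levels) != 0:
        raise ValueError("need >= 2 gains and len(x) divisible by 2**levels")
    bands = wavedec(x, levels)
    scaled = [[g * v for v in band] for g, band in zip(gains, bands)]
    y = waverec(scaled)
    e_in, e_out = energy(x), energy(y)
    if e_out == 0.0:
        return y  # nothing to normalize
    k = math.sqrt(e_in / e_out)
    return [k * v for v in y]
```

For example, `apply_subband_gains(frame, [0.5, 1.5, 2.0, 1.0])` performs a three-level decomposition of `frame` and shifts energy toward the middle sub-bands while leaving the total energy of the frame unchanged. With all gains equal to 1.0 the input is returned unchanged (up to rounding), since the Haar transform is perfectly invertible.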
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Structural Health Monitoring Techniques
