DeepGESI: A Non-Intrusive Objective Evaluation Model for Predicting Speech Intelligibility in Hearing-Impaired Listeners
Wenyu Luo, Jinhui Chen

TL;DR
DeepGESI is a novel deep learning model that non-intrusively predicts speech intelligibility for hearing-impaired listeners, outperforming existing metrics in accuracy and speed without needing reference signals.
Contribution
It introduces a non-intrusive deep learning approach that estimates speech intelligibility for hearing-impaired listeners without reference signals, addressing a key limitation of existing intrusive metrics.
Findings
Strong correlation with actual GESI scores on the CPC2 dataset
Faster prediction than traditional methods
Effective for assessing speech intelligibility in hearing-impaired listeners
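To make the first finding concrete, here is a minimal sketch of how agreement between predicted scores and reference GESI scores could be quantified. It assumes Pearson correlation as the measure; the summary does not say which correlation statistic the authors report. The function name `pearson` and the score values are hypothetical and are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not results from the paper).
gesi_scores = [0.20, 0.45, 0.60, 0.85]       # reference (intrusive) GESI scores
predicted_scores = [0.25, 0.40, 0.65, 0.80]  # scores from a non-intrusive predictor

r = pearson(gesi_scores, predicted_scores)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that the non-intrusive predictions track the reference GESI scores closely.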
Abstract
Speech intelligibility assessment is essential for many speech-related applications. However, most objective intelligibility metrics are intrusive: they require clean reference speech in addition to the degraded or processed signal. Furthermore, existing metrics such as STOI are designed primarily for normal-hearing listeners, and their accuracy in predicting intelligibility for hearing-impaired listeners remains limited. GESI (Gammachirp Envelope Similarity Index), by contrast, can estimate intelligibility for hearing-impaired listeners, but it is also intrusive because it depends on reference signals. This requirement limits its applicability in real-world scenarios. To overcome this limitation, this study proposes DeepGESI, a non-intrusive deep learning-based model capable of accurately and efficiently predicting the speech intelligibility of…
Taxonomy
Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Voice and Speech Disorders
