Non Intrusive Intelligibility Predictor for Hearing Impaired Individuals using Self Supervised Speech Representations
George Close, Thomas Hain, Stefan Goetze

TL;DR
This paper explores the use of self-supervised speech representations for non-intrusive speech intelligibility prediction tailored for hearing-impaired users, demonstrating competitive results and highlighting the need for more data for better generalization.
Contribution
It extends non-intrusive speech quality prediction techniques based on self-supervised speech representations to the prediction of speech intelligibility for hearing-impaired individuals, showing that these representations are effective input features for non-intrusive models.
Findings
Self-supervised speech representations are effective for intelligibility prediction.
Models achieve performance comparable to more complex systems.
More data is needed for better generalization to unknown systems and users.
Abstract
Self-supervised speech representations (SSSRs) have been successfully applied to a number of speech-processing tasks, e.g. as a feature extractor for speech quality (SQ) prediction, which is, in turn, relevant for assessing and training speech enhancement systems for users with normal or impaired hearing. However, why and how quality-related information is encoded well in such representations remains poorly understood. In this work, techniques for non-intrusive prediction of SQ ratings are extended to the prediction of intelligibility for hearing-impaired users. It is found that self-supervised representations are useful as input features to non-intrusive prediction models, achieving performance competitive with more complex systems. A detailed analysis of the performance depending on Clarity Prediction Challenge 1 listeners and enhancement systems indicates that more…
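The idea of feeding self-supervised features into a non-intrusive predictor can be illustrated with a minimal sketch. It is not the authors' model: the random frames stand in for the output of an SSSR encoder, and the mean pooling, the linear layer with a sigmoid output, the feature dimension, and the names `mean_pool` and `predict_intelligibility` are all assumptions made for illustration. What it shows is the non-intrusive property itself: the score is computed from the processed signal's features alone, with no clean reference signal.

```python
import math
import random


def mean_pool(frames):
    """Average frame-level feature vectors (T x D) into one D-dim utterance vector."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def predict_intelligibility(frames, weights, bias):
    """Map pooled features to a score in (0, 1) using a linear layer and a sigmoid.

    Hypothetical prediction head; the weights would normally be learned
    from listener intelligibility scores.
    """
    pooled = mean_pool(frames)
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Stand-in for SSSR encoder output: 50 frames of 8-dim random features
# (assumption; real representations are much higher-dimensional).
random.seed(0)
frames = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]

# Untrained placeholder parameters, for illustration only.
weights = [0.1] * 8
bias = 0.0

score = predict_intelligibility(frames, weights, bias)
print(f"predicted intelligibility: {100 * score:.1f}%")
```

In a trained system the placeholder parameters would be fitted to listener scores, and the pooling and prediction head would typically be more expressive than this sketch.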
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Hearing Loss and Rehabilitation
