Non Intrusive Intelligibility Predictor for Hearing Impaired Individuals using Self Supervised Speech Representations

George Close; Thomas Hain; Stefan Goetze

arXiv:2307.13423 · cs.SD · December 8, 2023 · 1 citation

TL;DR

This paper explores the use of self-supervised speech representations for non-intrusive speech intelligibility prediction tailored for hearing-impaired users, demonstrating competitive results and highlighting the need for more data for better generalization.

Contribution

The paper extends non-intrusive speech quality prediction techniques based on self-supervised speech representations to intelligibility prediction for hearing-impaired individuals, and shows that these representations are effective input features for non-intrusive models.
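The idea of feeding frame-level representations into a non-intrusive predictor can be sketched as follows. This is a minimal illustration, not the authors' architecture: the mean pooling, the single linear layer, the sigmoid output, the feature dimension, and the function names are all assumptions, and random numbers stand in for real encoder output. "Non-intrusive" here means the predictor sees only the signal under test, with no clean reference.

```python
import math
import random


def mean_pool(frames):
    """Average frame-level feature vectors (T x D) into one D-dim utterance vector."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


def predict_intelligibility(frames, weights, bias):
    """Map pooled features to a score in (0, 1) with a linear layer and a sigmoid."""
    pooled = mean_pool(frames)
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Stand-in for encoder output: random numbers, NOT real speech representations.
rng = random.Random(0)
FEATURE_DIM = 8   # hypothetical; real encoders emit much larger vectors
N_FRAMES = 50     # hypothetical utterance length in frames
frames = [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)] for _ in range(N_FRAMES)]

# Untrained placeholder parameters; in practice these would be learned
# from listener responses.
weights = [rng.uniform(-0.1, 0.1) for _ in range(FEATURE_DIM)]
bias = 0.0

score = predict_intelligibility(frames, weights, bias)
print(f"predicted intelligibility score: {score:.3f}")
```

A real system would replace the random frames with the output of a pretrained self-supervised encoder and train the prediction head on labelled listening-test data.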

Findings

01. Self-supervised speech representations are effective for intelligibility prediction.

02. Models achieve performance comparable to more complex systems.

03. More data is needed for better generalization to unknown systems and users.

Abstract

Self-supervised speech representations (SSSRs) have been successfully applied to a number of speech-processing tasks, e.g. as feature extractors for speech quality (SQ) prediction, which is, in turn, relevant for assessing and training speech enhancement systems for users with normal or impaired hearing. However, exactly why and how quality-related information is encoded well in such representations remains poorly understood. In this work, techniques for non-intrusive prediction of SQ ratings are extended to the prediction of intelligibility for hearing-impaired users. It is found that self-supervised representations are useful as input features to non-intrusive prediction models, achieving performance competitive with more complex systems. A detailed analysis of the performance depending on Clarity Prediction Challenge 1 listeners and enhancement systems indicates that more…
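An analysis of performance broken down by listener and by enhancement system could be organised along the following lines. This is only a sketch under stated assumptions: RMSE is an assumed error metric, the record layout and the listener and system identifiers are hypothetical, and the scores are made-up illustrative numbers rather than results from the paper.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root-mean-square error over (predicted, true) score pairs."""
    return math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))


def rmse_by_group(records, key):
    """Group records by `key` (e.g. listener or system) and compute RMSE per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append((rec["predicted"], rec["true"]))
    return {group: rmse(pairs) for group, pairs in groups.items()}


# Made-up example records; IDs and scores are purely illustrative.
records = [
    {"listener": "L01", "system": "S1", "predicted": 80.0, "true": 90.0},
    {"listener": "L01", "system": "S2", "predicted": 40.0, "true": 40.0},
    {"listener": "L02", "system": "S1", "predicted": 55.0, "true": 25.0},
    {"listener": "L02", "system": "S2", "predicted": 70.0, "true": 60.0},
]

per_listener = rmse_by_group(records, "listener")
per_system = rmse_by_group(records, "system")

for name, table in (("listener", per_listener), ("system", per_system)):
    for group, error in sorted(table.items()):
        print(f"{name} {group}: RMSE = {error:.2f}")
```

Large differences between groups in such a breakdown would point to listeners or systems on which a predictor generalizes poorly.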

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Hearing Loss and Rehabilitation