Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners

Zehai Tu; Ning Ma; Jon Barker

arXiv:2204.04287·eess.AS·July 7, 2022

TL;DR

This paper uses hidden representations from a DNN-based speech recogniser as features for predicting speech intelligibility in hearing-impaired listeners. On a hearing aid intelligibility database, the approach predicts better than a widely used STOI-based binaural measure.

Contribution

The work leverages deep neural network hidden layers as features for intelligibility prediction, providing a new perspective beyond conventional acoustic feature-based methods.

Findings

01  The proposed method predicts intelligibility more accurately than a widely used STOI-based binaural measure.

02  Hidden representations from a DNN-based speech recogniser serve as effective features for speech intelligibility prediction.

03  Experimental results on a hearing aid intelligibility database support the approach.

Abstract

Accurate objective speech intelligibility prediction algorithms are of great interest for many applications, such as speech enhancement for hearing aids. Most algorithms measure the signal-to-noise ratios or correlations between the acoustic features of clean reference signals and degraded signals. However, these hand-picked acoustic features are usually not explicitly correlated with recognition. Meanwhile, deep neural network (DNN) based automatic speech recognisers (ASRs) are approaching human performance in some speech recognition tasks. This work leverages the hidden representations from a DNN-based ASR as features for speech intelligibility prediction in hearing-impaired listeners. Experiments on a hearing aid intelligibility database show that the proposed method makes better predictions than a widely used short-time objective intelligibility (STOI) based binaural…
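The general idea of comparing a clean reference with a degraded signal in a learned feature space, rather than in a hand-picked acoustic one, can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_encoder` is a hypothetical stand-in for one hidden layer of an ASR model (a fixed random projection followed by `tanh`), the input arrays are random placeholders rather than real speech features, and mean frame-wise cosine similarity is only one plausible choice of comparison.

```python
import numpy as np


def toy_encoder(features, weights):
    """Hypothetical stand-in for one hidden layer of an ASR model.

    A real system would take activations from a trained recogniser;
    here a fixed random projection plus tanh is used for illustration only.
    """
    return np.tanh(features @ weights)


def representation_similarity(ref_hidden, deg_hidden, eps=1e-8):
    """Mean frame-wise cosine similarity between two (frames, dims) arrays."""
    if ref_hidden.shape != deg_hidden.shape:
        raise ValueError("representations must have the same shape")
    dot = np.sum(ref_hidden * deg_hidden, axis=1)
    norms = np.linalg.norm(ref_hidden, axis=1) * np.linalg.norm(deg_hidden, axis=1)
    return float(np.mean(dot / (norms + eps)))


# Placeholder data (assumption): 100 frames of 40-dimensional features.
rng = np.random.default_rng(0)
weights = rng.standard_normal((40, 64)) / np.sqrt(40)
clean = rng.standard_normal((100, 40))
mild = clean + 0.1 * rng.standard_normal(clean.shape)    # lightly degraded
severe = clean + 2.0 * rng.standard_normal(clean.shape)  # heavily degraded

ref_h = toy_encoder(clean, weights)
score_mild = representation_similarity(ref_h, toy_encoder(mild, weights))
score_severe = representation_similarity(ref_h, toy_encoder(severe, weights))
print(f"mild: {score_mild:.3f}  severe: {score_severe:.3f}")
```

In this toy setting the heavily degraded input yields a lower similarity than the lightly degraded one. Turning such a similarity into an intelligibility score for a given listener would need further steps, such as modelling hearing loss and fitting a mapping to listening-test results, which this sketch does not attempt.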

Code & Models

Repositories

claritychallenge/clarity/tree/main/recipes/cpc1/e032_sheffield (Official)

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Speech Recognition and Synthesis