Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks

Nils L. Westhausen; Rainer Huber; Hannah Baumgartner; Ragini Sinha; Jan Rennies; Bernd T. Meyer

arXiv:2111.01914·eess.AS·November 4, 2021

TL;DR

This paper presents a speech enhancement system using recurrent neural networks to separate speech from background noise in TV broadcasts, reducing listening effort and improving perceived sound quality for listeners.

Contribution

It introduces an RNN-based approach that separates speech from background sounds and remixes the two at a higher SNR, improving speech clarity in broadcast audio.

Findings

01 Reduces listening effort by around 2 points on a 13-point scale

02 Increases perceived sound quality compared to the original mixture

03 Effective in real TV-broadcast scenarios

Abstract

Listening to the audio of TV broadcast signals can be challenging for hearing-impaired as well as normal-hearing listeners, especially when background sounds are prominent or too loud compared to the speech signal. This can reduce listener satisfaction and increase listening effort. Since the broadcast sound is usually premixed, we perform a subjective evaluation for quantifying the potential of speech enhancement systems based on audio source separation and recurrent neural networks (RNN). Recently, RNNs have shown promising results in the context of sound source separation and real-time signal processing. In this paper, we separate the speech from the background signals and remix the separated sounds at a higher signal-to-noise ratio. This differs from classic speech enhancement, where usually only the extracted speech signal is exploited. The subjective…
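
The remixing idea in the abstract (separate speech and background, then recombine them at a higher signal-to-noise ratio) can be illustrated with a minimal sketch. It is not the authors' implementation: the RNN separation stage is not shown, the two input signals are synthetic stand-ins for the separated estimates, and the function names and the 6 dB SNR increase are assumptions chosen for illustration only.

```python
import math


def power(signal):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def snr_db(speech, background):
    """Speech-to-background power ratio in dB."""
    return 10.0 * math.log10(power(speech) / power(background))


def remix_at_higher_snr(speech, background, snr_gain_db):
    """Attenuate the background so that the SNR rises by snr_gain_db,
    then add it back to the speech instead of discarding it."""
    gain = 10.0 ** (-snr_gain_db / 20.0)  # amplitude gain applied to background
    attenuated = [gain * b for b in background]
    remixed = [s + a for s, a in zip(speech, attenuated)]
    return remixed, attenuated


# Synthetic stand-ins for the separated signals (assumption: in a real
# system these would be the outputs of the separation network).
sample_rate = 16000
speech_est = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate)
              for n in range(sample_rate)]
background_est = [0.3 * math.sin(2 * math.pi * 50 * n / sample_rate)
                  for n in range(sample_rate)]

snr_gain_db = 6.0  # hypothetical value, not taken from the paper
remixed, attenuated = remix_at_higher_snr(speech_est, background_est, snr_gain_db)

snr_before = snr_db(speech_est, background_est)
snr_after = snr_db(speech_est, attenuated)
print(f"SNR before: {snr_before:.2f} dB, after: {snr_after:.2f} dB")
```

Keeping an attenuated background in the output, rather than only the extracted speech, is the point of contrast with classic speech enhancement that the abstract draws.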
