A Data-Driven Investigation of Noise-Adaptive Utterance Generation with Linguistic Modification

Anupama Chingacham; Vera Demberg; Dietrich Klakow

arXiv:2210.10252·cs.CL·October 20, 2022

TL;DR

This paper investigates how choosing among paraphrases of an intended message can improve speech intelligibility in noisy environments. It takes a data-driven approach: a speech perception experiment yields a dataset of paraphrases heard in babble noise, and an intelligibility-aware ranking model is proposed to select the more intelligible paraphrase.

Contribution

The paper introduces a dataset of 900 paraphrases in babble noise, analyzes the noise-robust acoustic cues that drive intelligibility differences, and proposes an intelligibility-aware paraphrase ranking model that outperforms baselines.

Findings

01

Careful paraphrase selection improves intelligibility by 33% at SNR -5 dB.

02

Intelligibility differences are mainly driven by noise-robust acoustic cues.

03

The proposed ranking model outperforms baselines with a 31.37% relative improvement.

Abstract

In noisy environments, speech can be hard to understand for humans. Spoken dialog systems can help to enhance the intelligibility of their output, either by modifying the speech synthesis (e.g., imitate Lombard speech) or by optimizing the language generation. We here focus on the second type of approach, by which an intended message is realized with words that are more intelligible in a specific noisy environment. By conducting a speech perception experiment, we created a dataset of 900 paraphrases in babble noise, perceived by native English speakers with normal hearing. We find that careful selection of paraphrases can improve intelligibility by 33% at SNR -5 dB. Our analysis of the data shows that the intelligibility differences between paraphrases are mainly driven by noise-robust acoustic cues. Furthermore, we propose an intelligibility-aware paraphrase ranking model, which…
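For readers unfamiliar with the SNR figure quoted in the abstract: at SNR -5 dB the noise carries roughly 3.16 times the power of the speech. The following is a minimal sketch of how a speech signal might be mixed with noise at a chosen SNR; it is not the authors' stimulus pipeline. The function names `rms`, `mix_at_snr`, and `measured_snr_db` are hypothetical, and the sine wave and Gaussian noise are stand-ins for real speech and babble recordings.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db`, then add it.

    Returns (mixture, scaled_noise). Hypothetical helper for illustration.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 20 * log10(rms_speech / rms_noise), solved for rms_noise.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB computed from the two component signals."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Stand-in signals (assumptions): a 440 Hz tone for speech, Gaussian noise for babble.
rng = random.Random(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=-5.0)
achieved = measured_snr_db(speech, scaled_noise)
```

Scaling the noise rather than the speech keeps the speech level fixed across conditions, which is one common design choice; whether the paper's experiment did this is not stated in the abstract.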

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Speech and Audio Processing