Covertly improving intelligibility with data-driven adaptations of speech timing

Paige Tuttösí; Angelica Lim; H. Henny Yeung; Yue Wang; Jean-Julien Aucouturier

arXiv:2603.30032·cs.CL·April 1, 2026

TL;DR

This study demonstrates that targeted, data-driven adjustments to speech timing can significantly improve speech intelligibility for diverse listeners, often without their awareness, and introduces a novel speech synthesis method to implement these findings.

Contribution

The paper introduces a data-driven approach to modify speech timing in synthesis, enhancing comprehension for both native and non-native listeners under challenging conditions.

Findings

01

Targeted speech-rate adjustments improve comprehension in noisy environments.

02

Listeners are unaware of the benefits of targeted slowing, perceiving it as clearer.

03

A new text-to-speech algorithm replicates beneficial temporal structures in speech.

Abstract

Human talkers often address listeners with language-comprehension challenges, such as hard-of-hearing or non-native adults, by globally slowing down their speech. However, it remains unclear whether this strategy actually makes speech more intelligible. Here, we take advantage of recent advancements in machine-generated speech allowing more precise control of speech rate in order to systematically examine how targeted speech-rate adjustments may improve comprehension. We first use reverse-correlation experiments to show that the temporal influence of speech rate prior to a target vowel contrast (e.g., the tense-lax distinction) in fact manifests in a scissor-like pattern, with opposite effects in early versus late context windows; this pattern is remarkably stable both within individuals and across native L1-English listeners and L2-English listeners with French, Mandarin, and Japanese…
