# Assessing Visual Contributions to the Perception of Speech in Noise

**Authors:** Lida C. Alampounti, Hannah Cooper, Stuart Rosen, Jennifer K. Bizley

PMC · DOI: 10.1177/23312165261428755 · 2026-03-16

## TL;DR

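This study shows that seeing a talker helps listeners understand speech among competing talkers in two ways: through lipreading and through improved auditory streaming. A benefit remained even when the video was interrupted during the target words, so that no lipreading cues were available.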

## Contribution

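The study introduces the vCCRMn, a video version of the Children's Coordinate Response Measure with a noun as the second keyword. This audiovisual speech-in-noise test separates the benefit of visually conveyed phonetic information (lipreading) from visual contributions to auditory streaming.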

## Key findings

- Both the target-coherent full audiovisual (AV) and interrupted (Inter) visual conditions improved speech reception relative to the static image with audio (A) condition.
- Target-coherent visual information provided the greatest listening advantage in the full audiovisual condition.
- The interrupted condition still offered a robust benefit, even though listeners could not lipread the target words.

## Abstract

Investigations of the role of audiovisual integration in speech-in-noise perception have largely focused on the benefits provided by lipreading cues. Nonetheless, audiovisual temporal coherence can offer a complementary advantage in auditory selective attention tasks. We developed an audiovisual speech-in-noise test to assess the benefit of visually conveyed phonetic information and visual contributions to auditory streaming. The test was a video version of the Children's Coordinate Response Measure with a noun as the second keyword (vCCRMn). The vCCRMn allowed us to measure speech reception thresholds in the presence of two competing talkers under three visual conditions: a full naturalistic video (AV); a video that was interrupted during the target word presentation (Inter), thus providing no lipreading cues; and a static image of a talker with audio (A). In each case, the video or image could display either the target talker or one of the two competing maskers. We assessed speech reception thresholds in each visual condition in 37 young (≤35 years old) normal-hearing participants. Lipreading ability was independently assessed with the Test of Adult Speechreading (TAS). Results showed that both the target-coherent AV and Inter visual conditions offered participants a listening benefit over the static image with audio condition. Target-coherent visual information provided the greatest listening advantage in the full audiovisual condition, but a robust advantage was also seen in the interrupted condition, where listeners were unable to lipread the target words. Together, our results are consistent with visual information providing multiple benefits to listening, through lipreading and enhanced auditory streaming.
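The comparison described in the abstract, a visual condition measured against the static image with audio (A) baseline, can be illustrated as a difference in speech reception thresholds (SRTs). The following is a minimal sketch, not the authors' analysis: the function name `visual_benefit`, the assumption that SRTs are expressed in dB with lower values meaning better performance, and all numbers are hypothetical and are not data from the paper.

```python
from statistics import mean


def visual_benefit(srt_static_db, srt_visual_db):
    """Per-participant benefit in dB (hypothetical helper).

    Assumes lower SRT = better performance, so a positive result means
    the visual condition allowed a lower threshold than the static-image
    baseline for that participant.
    """
    if len(srt_static_db) != len(srt_visual_db):
        raise ValueError("need one SRT per participant in each condition")
    return [a - v for a, v in zip(srt_static_db, srt_visual_db)]


# Made-up SRTs (dB) for three hypothetical listeners; NOT results from the paper.
srt_a = [-4.0, -2.5, -3.0]      # static image with audio (A)
srt_av = [-9.0, -6.5, -8.0]     # full naturalistic video (AV)
srt_inter = [-6.0, -4.0, -5.5]  # interrupted video (Inter)

av_benefit = visual_benefit(srt_a, srt_av)
inter_benefit = visual_benefit(srt_a, srt_inter)

print("mean AV benefit (dB):", mean(av_benefit))
print("mean Inter benefit (dB):", mean(inter_benefit))
```

Under these assumptions, a positive Inter benefit would reflect an advantage that cannot come from lipreading the target words, while the AV benefit would combine lipreading with any streaming advantage.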

## Full-text entities

- **Diseases:** psychiatric (MESH:D001523), related difficulties (MESH:D051346), hearing loss (MESH:D034381), tinnitus (MESH:D014012), Deafness and Hearing Problems (MESH:D003638)
- **Chemicals:** CCRM (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Ovis aries (domestic sheep, species) [taxon 9940], Bos taurus (bovine, species) [taxon 9913], Felis catus (cat, species) [taxon 9685], Sus scrofa (pig, species) [taxon 9823], Canis lupus familiaris (dog, subspecies) [taxon 9615]

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13009578/full.md

---
Source: https://tomesphere.com/paper/PMC13009578