# Neural speech tracking in a virtual acoustic environment: audio-visual benefit for unscripted continuous speech

**Authors:** Mareike Daeglau, Jürgen Otten, Giso Grimm, Bojana Mirkovic, Volker Hohmann, Stefan Debener

PMC · DOI: 10.3389/fnhum.2025.1560558 · 2025-04-09

## TL;DR

This study shows that seeing a speaker's lips strengthens cortical tracking of natural, unscripted speech when background noise is present.

## Contribution

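The study examines the audio-visual benefit using unscripted speech from untrained speakers in a virtual acoustic environment, measured with EEG-based cortical speech tracking, rather than scripted stimuli in highly controlled laboratory settings.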

## Key findings

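- Cortical speech tracking was significantly enhanced in the audio-visual condition when background noise was present.
- The masked-lip condition performed similarly to the audio-only condition, pointing to lip movements as the source of the benefit in adverse listening conditions.
- Individual speaker characteristics, analysed through acoustic and visual features such as pitch, jitter, and lip-openness, influence the audio-visual speech tracking benefit.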

## Abstract

The audio-visual benefit in speech perception—where congruent visual input enhances auditory processing—is well-documented across age groups, particularly in challenging listening conditions and among individuals with varying hearing abilities. However, most studies rely on highly controlled laboratory environments with scripted stimuli. Here, we examine the audio-visual benefit using unscripted, natural speech from untrained speakers within a virtual acoustic environment. Using electroencephalography (EEG) and cortical speech tracking, we assessed neural responses across audio-visual, audio-only, visual-only, and masked-lip conditions to isolate the role of lip movements. Additionally, we analysed individual differences in acoustic and visual features of the speakers, including pitch, jitter, and lip-openness, to explore their influence on the audio-visual speech tracking benefit. Results showed a significant audio-visual enhancement in speech tracking with background noise, with the masked-lip condition performing similarly to the audio-only condition, emphasizing the importance of lip movements in adverse listening situations. Our findings reveal the feasibility of cortical speech tracking with naturalistic stimuli and underscore the impact of individual speaker characteristics on audio-visual integration in real-world listening contexts.
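The abstract does not spell out how cortical speech tracking was computed. As a rough illustration of the general idea only, the sketch below correlates a speech envelope with a single EEG channel over a range of time lags and reports the lag with the strongest correlation. It is not the authors' pipeline: the function names (`pearson`, `lagged_tracking`, `simulate`), the toy white-noise "envelope", the 12-sample delay, and the noise level are all assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant signal
    return cov / math.sqrt(vx * vy)


def lagged_tracking(envelope, eeg, max_lag):
    """Correlate envelope[t] with eeg[t + lag] for lag = 0..max_lag (in samples).

    Returns a dict mapping each lag to its correlation coefficient.
    """
    if len(envelope) != len(eeg):
        raise ValueError("envelope and eeg must have the same length")
    result = {}
    for lag in range(max_lag + 1):
        n = len(envelope) - lag
        result[lag] = pearson(envelope[:n], eeg[lag:lag + n])
    return result


def simulate(n=2000, true_lag=12, noise_sd=0.5, seed=0):
    """Toy data (hypothetical): EEG = delayed copy of the envelope + noise."""
    rng = random.Random(seed)
    # Rectified white noise stands in for a real speech envelope.
    envelope = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
    eeg = []
    for t in range(n):
        driven = envelope[t - true_lag] if t >= true_lag else 0.0
        eeg.append(driven + rng.gauss(0.0, noise_sd))
    return envelope, eeg


if __name__ == "__main__":
    env, eeg = simulate()
    r_by_lag = lagged_tracking(env, eeg, max_lag=30)
    best_lag = max(r_by_lag, key=r_by_lag.get)
    print(f"peak correlation {r_by_lag[best_lag]:.2f} at lag {best_lag} samples")
```

In a design like the one described, a measure of this kind could be computed separately per condition (audio-visual, audio-only, visual-only, masked-lip) and then compared across conditions. Lags are in samples, so converting them to milliseconds requires the recording's sampling rate, which is not given here.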

## Full-text entities

- **Diseases:** COVID-19 (MESH:D000086382), hearing impairments (MESH:D034381), auditory deterioration (MESH:D006311), neurological or psychological disorders (MESH:D020018), AAD (MESH:D001289)
- **Chemicals:** alcohol (MESH:D000438)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12014754/full.md

---
Source: https://tomesphere.com/paper/PMC12014754