# Audiovisual Perception of Sentence Stress in Cochlear Implant Recipients

**Authors:** Hartmut Meister, Moritz Wächtler, Pascale Sandmann, Ruth Lang-Roth, Khaled H. A. Abdel-Latif

PMC · DOI: 10.3390/audiolres15040077 · 2025-06-24

## TL;DR

This study compares how cochlear implant users, whose access to acoustic prosody cues is limited, and typical-hearing peers use visual cues such as facial movements to perceive sentence stress.

## Contribution

The study introduces a novel use of virtual characters to investigate audiovisual sentence stress perception in cochlear implant recipients.

## Key findings

- CI users and TH participants performed similarly in the congruent audiovisual condition, but CI users were better at using visual cues.
- TH participants showed greater task-evoked pupil dilation in the visual condition; the study uses this measure to assess task load.
- With incongruent audiovisual stimuli, modality use varied individually among CI recipients, whereas TH participants relied primarily on auditory cues.

## Abstract

Background/Objectives: Sentence stress as part of linguistic prosody plays an important role for verbal communication. It emphasizes particularly important words in a phrase and is reflected by acoustic cues such as the voice fundamental frequency. However, visual cues, especially facial movements, are also important for sentence stress perception. Since cochlear implant (CI) recipients are limited in their use of acoustic prosody cues, the question arises as to what extent they are able to exploit visual features.

Methods: Virtual characters were used to provide highly realistic but controllable stimuli for investigating sentence stress in groups of experienced CI recipients and typical-hearing (TH) peers. In addition to the proportion of correctly identified stressed words, task load was assessed via reaction times (RTs) and task-evoked pupil dilation (TEPD), and visual attention was estimated via eye tracking. Experiment 1 considered congruent combinations of auditory and visual cues, while Experiment 2 presented incongruent stimuli.

Results: In Experiment 1, CI users and TH participants performed similarly in the congruent audiovisual condition, while the former were better at using visual cues. RTs were generally faster in the AV condition, whereas TEPD revealed a more detailed picture, with TH subjects showing greater pupil dilation in the visual condition. The incongruent stimuli in Experiment 2 showed that modality use varied individually among CI recipients, while TH participants relied primarily on auditory cues.

Conclusions: Visual cues are generally useful for perceiving sentence stress. As a group, CI users are better at using facial cues than their TH peers. However, CI users show individual differences in the reliability of the various cues.
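The abstract names task-evoked pupil dilation (TEPD) as a task-load measure but does not say how it was computed. A common convention, assumed here and not taken from the paper, is to subtract the mean pupil size in a short pre-stimulus baseline window from the samples recorded after stimulus onset. The function name, parameters, and the 1 s baseline below are hypothetical and purely illustrative.

```python
from statistics import mean


def task_evoked_pupil_dilation(samples, sample_rate_hz, stimulus_onset_s,
                               baseline_window_s=1.0):
    """Return (peak, mean) baseline-corrected pupil dilation for one trial.

    samples: pupil sizes for a single trial, in any consistent unit.
    The baseline window length is an assumption, not a value from the study.
    """
    onset_idx = round(stimulus_onset_s * sample_rate_hz)
    baseline_len = round(baseline_window_s * sample_rate_hz)
    if baseline_len <= 0 or onset_idx < baseline_len or onset_idx >= len(samples):
        raise ValueError("trial too short for the requested baseline and onset")

    # Mean pupil size just before stimulus onset serves as the reference level.
    baseline = mean(samples[onset_idx - baseline_len:onset_idx])

    # Dilation relative to baseline for every sample from onset onward.
    corrected = [s - baseline for s in samples[onset_idx:]]
    return max(corrected), mean(corrected)


# Synthetic trial: 1 s of flat baseline at 10 Hz, then a brief dilation.
trial = [4.0] * 10 + [4.0, 4.2, 4.5, 4.3, 4.0]
peak, average = task_evoked_pupil_dilation(trial, sample_rate_hz=10,
                                           stimulus_onset_s=1.0)
```

Under this convention, a larger peak or mean value for a condition would be read as higher task load, which is how the greater dilation of TH participants in the visual condition is interpreted above. Real pipelines usually also handle blinks and smooth the signal, which this sketch omits.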

## Full-text entities

- **Diseases:** pupil dilation (MESH:D011681)

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12285941/full.md

---
Source: https://tomesphere.com/paper/PMC12285941