# Accuracy of Speech-to-Text Transcription in a Digital Cognitive Assessment for Older Adults

**Authors:** Ariel M. Gordon, Peter E. Wais

PMC · DOI: 10.3390/brainsci15101090 · 2025-10-09

## TL;DR

This study finds that Apple's speech-to-text (STT) engine transcribes older adults' verbal responses in a digital cognitive assessment well enough that its errors do not practically affect standardized scores, supporting its use in digital assessments.

## Contribution

The study demonstrates, in a sample of 223 older adults, that speech-to-text transcription errors are not large enough to practically affect standardized cognitive scores in a digital assessment.

## Key findings

- Speech-to-text (STT) transcriptions differed from human-corrected ground-truth transcriptions (RQ1).
- These transcription differences were not large enough to practically affect standardized cognitive performance scores (RQ2).
- Apple’s STT engine is practically useful for digital neuropsychological assessments.

## Abstract

Background/Objectives: Neuropsychological assessments are valuable tools for evaluating the cognitive performance of older adults. Limitations associated with these in-person paper-and-pencil tests have inspired efforts to develop digital assessments, which would expand access to cognitive screening. Digital tests, however, often lack validity relative to gold-standard paper-and-pencil versions that have been robustly validated. Speech-to-text (STT) technology has the potential to improve the validity of digital tests through its ability to capture verbal responses, yet the effect of its performance on standardized scores used for cognitive characterization is unknown. Methods: The present study evaluated the accuracy of Apple’s STT engine relative to ground-truth transcriptions (RQ1), as well as the effect of the engine’s transcription errors on resulting standardized scores (RQ2). Our study analyzed data from 223 older adults who completed a digital assessment on an iPad that used STT to transcribe and score task responses. These automated transcriptions were then compared against ground-truth transcriptions that were human-corrected via external recordings. Results: Results showed differences between STT and ground-truth transcriptions (RQ1). Nevertheless, these differences were not large enough to practically affect standardized measures of cognitive performance (RQ2). Conclusions: Our results demonstrate the practical utility of Apple’s STT engine for digital neuropsychological assessment and cognitive characterization. These findings support the possibility that speech-to-text, with its ability to capture and process verbal responses, will be a viable tool for increasing the validity of digital neuropsychological assessments.
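The abstract does not state which metric was used to compare STT output against the ground-truth transcriptions. As an illustration only, one common way to quantify such differences is word error rate (WER): the word-level edit distance between the two transcriptions divided by the number of words in the ground truth. The sketch below is a minimal example under that assumption; the function name `word_error_rate` and the sample responses are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` plays the role of a human-corrected (ground-truth)
    transcription and `hypothesis` the automated STT output. This is an
    illustrative metric, not necessarily the one used in the paper.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcription must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from STT output
                curr[j - 1] + 1,     # insertion: extra word in STT output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical responses: one substituted word and one dropped word.
ground_truth = "apple banana cherry dog"
stt_output = "apple bandana cherry"
print(word_error_rate(ground_truth, stt_output))
```

A per-response metric like this addresses only the accuracy question (RQ1). Whether the errors matter for cognitive characterization (RQ2) depends on how each task is scored, since a transcription error changes a standardized score only when it changes the scored response.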

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12563218/full.md

---
Source: https://tomesphere.com/paper/PMC12563218