EasyCall corpus: a dysarthric speech dataset
Rosanna Turrisi, Arianna Braccia, Marco Emanuele, Simone Giulietti, Maura Pugliatti, Mariachiara Sensi, Luciano Fadiga, Leonardo Badino

TL;DR
The EasyCall corpus is a comprehensive Italian dysarthric speech dataset designed to aid the development of speech recognition assistive technologies, highlighting current system limitations and providing a valuable resource for future research.
Contribution
This paper introduces the largest and most detailed dysarthric speech dataset to date, including diverse commands and non-commands for improved ASR system training.
Findings
Commercial ASR systems perform poorly on the dataset
The corpus includes 21,386 recordings from 55 speakers
The dataset supports development of voice-controlled assistive apps
Abstract
This paper introduces a new dysarthric speech command dataset in Italian, called EasyCall corpus. The dataset consists of 21,386 audio recordings from 24 healthy and 31 dysarthric speakers, whose individual degree of speech impairment was assessed by neurologists through the Therapy Outcome Measure. The corpus aims at providing a resource for the development of ASR-based assistive technologies for patients with dysarthria. In particular, it may be exploited to develop a voice-controlled contact application for commercial smartphones, aiming at improving dysarthric patients' ability to communicate with their family and caregivers. Before recording the dataset, participants were administered a survey to evaluate which commands are more likely to be employed by dysarthric individuals in a voice-controlled contact application. In addition, the dataset includes a list of non-commands (i.e.,…
