Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors

Manuel Sam Ribeiro; Joanne Cleland; Aciel Eshky; Korin Richmond; Steve Renals

arXiv:2103.00324·eess.AS·March 2, 2021

TL;DR

This study explores using ultrasound tongue imaging combined with audio analysis to automatically detect speech articulation errors in children, showing promising accuracy and potential for clinical speech therapy applications.

Contribution

It introduces a novel automated system leveraging ultrasound and audio data for detecting speech articulation errors, trained on both child and adult speech datasets.

Findings

01

The best system, pre-trained on adult speech and using ultrasound and audio jointly, achieved 86.9% accuracy on typically developing speech.

02

The best system for velar fronting detection correctly identified 86.6% of errors.

03

System shows potential for integration into speech therapy tools.
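
The two headline figures appear to measure different things: accuracy counts every correct decision, while "correctly identified 86.6% of errors" reads as the share of true errors that the system flagged (recall). The sketch below illustrates the distinction under that assumption. The function names and the confusion counts are hypothetical and are not taken from the paper.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all segments that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp: int, fn: int) -> float:
    """Fraction of true articulation errors that the system flagged."""
    return tp / (tp + fn)


# Hypothetical counts for illustration only (NOT results from the paper):
# tp = errors flagged as errors, fn = errors that were missed,
# fp = correct segments flagged as errors, tn = correct segments accepted.
tp, fn, fp, tn = 80, 20, 10, 90

acc = accuracy(tp, tn, fp, fn)
rec = recall(tp, fn)
print(f"accuracy = {acc:.3f}, recall = {rec:.3f}")
```

With these made-up counts the two measures differ, which is why the figures in findings 01 and 02 should not be compared directly.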

Abstract

Speech sound disorders are a common communication impairment in childhood. Because speech disorders can negatively affect the lives and the development of children, clinical intervention is often recommended. To help with diagnosis and treatment, clinicians use instrumented methods such as spectrograms or ultrasound tongue imaging to analyse speech articulations. Analysis with these methods can be laborious for clinicians; therefore, there is growing interest in its automation. In this paper, we investigate the contribution of ultrasound tongue imaging for the automatic detection of speech articulation errors. Our systems are trained on typically developing child speech and augmented with a database of adult speech using audio and ultrasound. Evaluation on typically developing speech indicates that pre-training on adult speech and jointly using ultrasound and audio gives the best results…
