DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech

Kazuki Kawamura; Jun Rekimoto

arXiv:2212.04930·eess.AS·December 12, 2022

Open Access

TL;DR

DDSupport is a language learning system that visualizes pronunciation differences without needing extensive annotated data or comparison to a specific native speaker, aiding learners in improving their speech.

Contribution

The paper introduces a deep learning-based system that calculates pronunciation scores and differences from a small amount of unannotated speech data, allowing flexible and intuitive pronunciation training.

Findings

01 Improved speech intelligibility in learners using DDSupport

02 Effective visualization of pronunciation differences and distances

03 No need for extensive annotated speech data or comparison to specific speakers

Abstract

When beginners learn to speak a non-native language, it is difficult for them to judge for themselves whether they are speaking well. Therefore, computer-assisted pronunciation training systems are used to detect learner mispronunciations. These systems typically compare the user's speech with that of a specific native speaker as a model in units of rhythm, phonemes, or words and calculate the differences. However, they require extensive speech data with detailed annotations or can only compare with one specific native speaker. To overcome these problems, we propose a new language learning support system that calculates speech scores and detects mispronunciations by beginners based on a small amount of unannotated speech data without comparison to a specific person. The proposed system uses deep learning-based speech processing to display the pronunciation score of the learner's speech…

Taxonomy

Topics: Speech Recognition and Synthesis