DDSupport: Language Learning Support System that Displays Differences and Distances from Model Speech
Kazuki Kawamura, Jun Rekimoto

TL;DR
DDSupport is a language learning system that visualizes pronunciation differences without needing extensive annotated data or comparison to a specific native speaker, aiding learners in improving their speech.
Contribution
The paper introduces a deep learning-based system that calculates pronunciation scores and differences from a small amount of unannotated speech data, allowing flexible and intuitive pronunciation training.
Findings
Improved speech intelligibility in learners using DDSupport
Effective visualization of pronunciation differences and distances
No need for extensive annotated speech data or comparison to specific speakers
Abstract
When beginners learn to speak a non-native language, it is difficult for them to judge for themselves whether they are speaking well. Therefore, computer-assisted pronunciation training systems are used to detect learner mispronunciations. These systems typically compare the user's speech with that of a specific native speaker as a model in units of rhythm, phonemes, or words and calculate the differences. However, they require extensive speech data with detailed annotations or can only compare with one specific native speaker. To overcome these problems, we propose a new language learning support system that calculates speech scores and detects mispronunciations by beginners based on a small amount of unannotated speech data without comparison to a specific person. The proposed system uses deep learning-based speech processing to display the pronunciation score of the learner's speech…
Taxonomy
Topics: Speech Recognition and Synthesis
