Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

Georgios Karakasidis; Tamás Grósz; Mikko Kurimo

arXiv:2208.05782 · eess.AS · August 12, 2022

TL;DR

This paper explores the impact of curriculum learning on end-to-end automatic speech recognition, showing that ordering the training data by difficulty can improve training efficiency and model performance.

Contribution

The paper introduces several curriculum strategies for speech recognition and shows empirically that they can improve training efficiency and accuracy.

Findings

1. Structured curricula can reduce training time.
2. Ordering training examples by difficulty can improve recognition accuracy.
3. The choice of scoring function influences how effective a curriculum is.

Abstract

It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training examples are arranged is also of crucial importance. Curriculum Learning is built on the observation that organized and structured assimilation of knowledge has the ability to enable faster training and better comprehension. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. This methodology is known as Curriculum Learning, and we employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when provided with an organized training set consisting of examples that exhibit an increasing level of…
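
The core idea in the abstract, presenting training examples in order of increasing difficulty, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Utterance` type, the `duration_score` scoring function (which assumes shorter utterances are easier), and the helper names are all hypothetical and chosen only for illustration. Any other scoring function with the same signature could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class Utterance:
    """Hypothetical training example: an ID, audio length, and transcript."""
    utt_id: str
    duration_s: float
    transcript: str


def duration_score(utt: Utterance) -> float:
    """Assumed difficulty proxy: longer utterances count as harder."""
    return utt.duration_s


def build_curriculum(
    utterances: List[Utterance],
    scoring_fn: Callable[[Utterance], float],
) -> List[Utterance]:
    """Return the utterances sorted from lowest to highest difficulty score."""
    return sorted(utterances, key=scoring_fn)


def curriculum_batches(
    ordered: List[Utterance], batch_size: int
) -> Iterator[List[Utterance]]:
    """Yield consecutive batches so that easier examples are seen first."""
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


if __name__ == "__main__":
    data = [
        Utterance("utt_a", 7.2, "a longer sentence with several words"),
        Utterance("utt_b", 1.1, "yes"),
        Utterance("utt_c", 3.4, "turn it off"),
    ]
    ordered = build_curriculum(data, duration_score)
    for batch in curriculum_batches(ordered, batch_size=2):
        print([u.utt_id for u in batch])
```

Keeping the scoring function as a parameter reflects the finding that different scoring functions lead to different curricula: only the function passed to `build_curriculum` changes, while the batching logic stays the same.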

Code & Models

Repositories

aalto-speech/speechbrain-cl (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Neural Networks and Applications · Topic Modeling