Mixtures of Deep Neural Experts for Automated Speech Scoring
Sara Papi, Edmondo Trentin, Roberto Gretter, Marco Matassoni, Daniele Falavigna

TL;DR
This paper presents a hybrid system combining speech recognition and deep neural classifiers to automatically assess second language proficiency from spoken responses, achieving state-of-the-art results.
Contribution
It introduces a novel mixture of deep neural experts framework that integrates multiple classifier architectures and representations for automated speech scoring.
Findings
Achieved highest evaluation metrics on the third Spoken CALL Shared Task dataset.
Demonstrated effectiveness of combining diverse neural models and representations.
Validated the approach's applicability to real-world language assessment tasks.
Abstract
The paper addresses the automatic assessment of second language proficiency from language learners' spoken responses to test prompts. The task is highly relevant to the field of computer-assisted language learning. The approach relies on two separate modules: (1) an automatic speech recognition system that yields text transcripts of the spoken interactions involved, and (2) a multiple classifier system based on deep learners that classifies the transcripts into proficiency classes. Different deep neural network architectures (both feed-forward and recurrent) are specialized over diverse representations of the texts in terms of: a reference grammar, the outcome of probabilistic language models, several word embeddings, and two bag-of-words models. The individual classifiers are combined either via a probabilistic pseudo-joint model, or…
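The abstract is cut off before it describes how the individual classifiers are combined, so the following is only a minimal sketch of the general idea of fusing several experts' class posteriors. It is not the paper's pseudo-joint model: it shows two generic, well-known fusion rules (a normalized product of posteriors and a weighted average). The function names, the three-class setup, and the example probabilities are all hypothetical.

```python
from math import exp, log


def softmax(scores):
    """Turn arbitrary log-scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_product(posteriors, eps=1e-12):
    """Product rule (assumed here, not taken from the paper): multiply the
    experts' posteriors per class, working in log space, then renormalize."""
    n_classes = len(posteriors[0])
    log_scores = [
        sum(log(max(p[c], eps)) for p in posteriors)
        for c in range(n_classes)
    ]
    return softmax(log_scores)


def combine_average(posteriors, weights=None):
    """Weighted average of the experts' posteriors (uniform by default)."""
    n_experts = len(posteriors)
    n_classes = len(posteriors[0])
    if weights is None:
        weights = [1.0 / n_experts] * n_experts
    return [
        sum(w * p[c] for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ]


def predict(combined):
    """Index of the most probable class."""
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical posteriors from three experts over three proficiency classes,
# e.g. one expert per text representation.
experts = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
]

by_product = combine_product(experts)
by_average = combine_average(experts)
print(predict(by_product), predict(by_average))
```

The product rule rewards classes on which all experts agree and sharply penalizes a class that any single expert considers unlikely, while the average is more tolerant of one dissenting expert. Which behavior is preferable depends on how reliable and how correlated the experts are.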
