Addressing Cold Start Problem for End-to-end Automatic Speech Scoring
Jungbae Park, Seungtaek Choi

TL;DR
This paper addresses the cold start problem in end-to-end automatic speech scoring for second-language learners by proposing methods to improve performance in new question contexts, demonstrating robustness and superiority over baselines.
Contribution
It introduces a novel framework combining prompt embeddings, question context embeddings, and pretrained acoustic models to mitigate cold start issues in speech scoring.
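The combination described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the dimensions, the random stand-in vectors, the `fuse` and `score` helpers, the linear scoring head, and the shared fallback prompt row for unseen questions are all hypothetical. In a real system the acoustic frames would come from a pretrained acoustic model and the context vector from a BERT or CLIP encoder of the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical, chosen only to keep the sketch small.
ACOUSTIC_DIM = 16        # width of the (stand-in) acoustic encoder output
CONTEXT_DIM = 8          # width of the (stand-in) question context embedding
PROMPT_DIM = 4           # width of a learned per-question prompt vector
NUM_KNOWN_QUESTIONS = 3  # questions seen during training

# One prompt vector per known question, plus one shared fallback row
# (an assumption of this sketch) used for questions never seen in training.
prompt_table = rng.normal(size=(NUM_KNOWN_QUESTIONS + 1, PROMPT_DIM))
UNSEEN = NUM_KNOWN_QUESTIONS

# Hypothetical linear scoring head over the fused feature vector.
weights = rng.normal(size=ACOUSTIC_DIM + CONTEXT_DIM + PROMPT_DIM)
bias = 0.0


def fuse(acoustic_frames, context_vec, question_id):
    """Concatenate pooled acoustic, question-context, and prompt features."""
    pooled = acoustic_frames.mean(axis=0)  # mean-pool over time frames
    known = 0 <= question_id < NUM_KNOWN_QUESTIONS
    row = question_id if known else UNSEEN
    return np.concatenate([pooled, context_vec, prompt_table[row]])


def score(acoustic_frames, context_vec, question_id):
    """Map the fused features to a single scalar score."""
    return float(fuse(acoustic_frames, context_vec, question_id) @ weights + bias)


# Usage: a response to a question that was not in the training set (id 99).
frames = rng.normal(size=(50, ACOUSTIC_DIM))  # stand-in for encoder output
context = rng.normal(size=CONTEXT_DIM)        # stand-in for a BERT/CLIP vector
new_question_score = score(frames, context, question_id=99)
```

The point of the sketch is that the question context vector is computed from the question itself, so it is still available for a question with no training responses, whereas a purely per-question lookup has nothing to return in that case.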
Findings
Framework outperforms baselines in cold-start scenarios
Proposed methods improve robustness in new question contexts
Experimental results on TOEIC dataset validate effectiveness
Abstract
Integrating automatic speech scoring/assessment systems has become a critical aspect of second-language speaking education. With advances in self-supervised learning, end-to-end speech scoring approaches have shown promising results. However, this study highlights a significant decrease in the performance of speech scoring systems in new question contexts, and identifies this as a cold start problem with respect to items. Having identified this cold-start phenomenon, the paper seeks to alleviate it through the following methods: 1) prompt embeddings, 2) question context embeddings using BERT or CLIP models, and 3) the choice of pretrained acoustic model. Experiments are conducted on TOEIC speaking test datasets collected from English-as-a-second-language (ESL) learners and rated by professional TOEIC speaking evaluators. The results demonstrate that the proposed framework not only…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Phonetics and Phonology Research
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Attention Dropout · WordPiece · Dense Connections · Adam · Residual Connection
