Towards few-shot isolated word reading assessment
Reuben Smit, Retief Louw, Herman Kamper

TL;DR
This paper proposes a few-shot, ASR-free method for isolated word reading assessment using self-supervised learning (SSL) models, and highlights both its potential and its limitations for low-resource child speech analysis.
Contribution
It introduces a novel few-shot approach for child speech assessment that leverages SSL models and explores design options like discretisation and barycentre averaging.
Findings
Reasonable performance for adult speech
Significant performance drop for child speech
SSL representations have limitations for child data in few-shot systems
Abstract
We explore an ASR-free method for isolated word reading assessment in low-resource settings. Our few-shot approach compares input child speech to a small set of adult-provided reference templates. Inputs and templates are encoded using intermediate layers from large self-supervised learned (SSL) models. Using an Afrikaans child speech benchmark, we investigate design options such as discretising SSL features and barycentre averaging of the templates. Idealised experiments show reasonable performance for adults, but a substantial drop for child speech input, even with child templates. Despite the success of employing SSL representations in low-resource speech tasks, our work highlights the limitations of SSL representations for processing child data when used in a few-shot classification system.
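The template-comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each utterance has already been encoded as a (frames x dimensions) array of SSL features, that sequences are compared with dynamic time warping (DTW) under cosine distance (an assumption suggested by the mention of barycentre averaging, which is commonly paired with DTW), and that a reading attempt is accepted when the nearest template belongs to the prompted word and its cost falls below a threshold. The function names `dtw_cost`, `classify`, and `assess`, and the default threshold value, are hypothetical.

```python
import numpy as np


def dtw_cost(x, y):
    """Length-normalised DTW cost between two feature sequences.

    x and y are arrays of shape (frames, dims). Cosine distance between
    frames is an assumption, not something the abstract specifies.
    """
    xn = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-8)
    yn = y / np.maximum(np.linalg.norm(y, axis=1, keepdims=True), 1e-8)
    dist = 1.0 - xn @ yn.T  # pairwise frame distances, shape (n, m)

    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)


def classify(query, templates):
    """Return (word, cost) of the closest template.

    templates maps each word to a list of reference feature arrays
    (the "few shots" for that word).
    """
    best_word, best_cost = None, np.inf
    for word, refs in templates.items():
        cost = min(dtw_cost(query, ref) for ref in refs)
        if cost < best_cost:
            best_word, best_cost = word, cost
    return best_word, best_cost


def assess(query, target_word, templates, threshold=0.2):
    """Accept the attempt if the nearest template is the prompted word
    and the match is close enough (threshold is a hypothetical value)."""
    word, cost = classify(query, templates)
    return word == target_word and cost <= threshold
```

In this sketch, `templates` would hold a handful of adult recordings per word, and `query` the features of a child's attempt. Replacing each word's list of references with a single averaged template, or replacing the continuous features with discrete unit sequences, corresponds to the design options the abstract mentions but is not shown here.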
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Text Readability and Simplification
