Speaker Recognition for Children's Speech
Saeid Safavi, Maryam Najafian, Abualsoud Hanani, Martin J Russell, Peter Jancovic, Michael J Carey

TL;DR
This study investigates speaker recognition for children's speech, identifying key spectral regions and evaluating recognition accuracy across different ages and settings using GMM-based systems.
Contribution
It provides new insights into spectral features relevant for children's speaker recognition and evaluates system performance across age groups and real-world scenarios.
Findings
The spectral regions most useful for SR occur at frequencies 11% to 38% higher for children than for adults.
In a class of 30 children of similar age, SR accuracy varies with age from 90% to 99%.
In a school of 288 children of varying ages, the identification rate is 81%.
Abstract
This paper presents results on Speaker Recognition (SR) for children's speech, using the OGI Kids corpus and GMM-UBM and GMM-SVM SR systems. Regions of the spectrum containing important speaker information for children are identified by conducting SR experiments over 21 frequency bands. As for adults, the spectrum can be split into four regions, with the first (containing primary vocal tract resonance information) and third (corresponding to high frequency speech sounds) being most useful for SR. However, the frequencies at which these regions occur are from 11% to 38% higher for children. It is also noted that subband SR rates are lower for younger children. Finally, results are presented for SR experiments to identify a child in a class (30 children of similar age) and in a school (288 children of varying ages). Class performance depends on age, with accuracy varying from 90% for young children…
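The abstract names GMM-UBM as one of the two SR systems but does not spell out how such a system scores an utterance. The sketch below illustrates the general idea of closed-set identification with a GMM-UBM style score: each enrolled speaker's model is compared with a shared background model (UBM) through an average per-frame log-likelihood ratio, and the highest-scoring speaker is chosen. It is not the authors' implementation. The function names, the two-dimensional features, and all model parameters are hypothetical toy values, and the shifted speaker means only stand in for the MAP adaptation a real system would typically use.

```python
import math

def gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(comp)
    return peak + math.log(sum(math.exp(c - peak) for c in comp))

def llr_score(frames, speaker, ubm):
    """Average per-frame log-likelihood ratio: speaker model vs. UBM."""
    total = sum(gmm_loglik(f, *speaker) - gmm_loglik(f, *ubm) for f in frames)
    return total / len(frames)

def identify(frames, speakers, ubm):
    """Closed-set identification: return the best-scoring enrolled speaker."""
    return max(speakers, key=lambda name: llr_score(frames, speakers[name], ubm))

# Hypothetical toy models as (weights, means, variances); not data from the paper.
ubm = ([0.5, 0.5], [[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]])
speakers = {
    # Means shifted away from the UBM, standing in for MAP-adapted models.
    "child_a": ([0.5, 0.5], [[-0.5, 0.0], [1.5, 2.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "child_b": ([0.5, 0.5], [[0.5, 0.0], [2.5, 2.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Hypothetical test utterance: two frames lying close to child_b's means.
test_frames = [[0.6, 0.1], [2.4, 1.9]]

best = identify(test_frames, speakers, ubm)
```

In closed-set identification the UBM term is the same for every candidate, so it does not change which speaker wins. It matters when the score is compared against a threshold, as in speaker verification.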
