Meta-learning for robust child-adult classification from speech

Nithin Rao Koluguri, Manoj Kumar, So Hyun Kim, Catherine Lord, and Shrikanth Narayanan

arXiv:1910.11400·eess.AS·October 30, 2019

TL;DR

This paper applies meta-learning with prototypical networks to child-adult speaker classification in clinical conversations, reporting relative improvements of up to 14.53% in F1-score for weakly supervised classification and up to 9.66% in cluster purity.

Contribution

The paper applies meta-learning with prototypical networks to make speaker classification in child-adult interactions more robust, an approach described as new to this domain.

Findings

01. Up to 14.53% relative improvement in F1-scores for weakly supervised classification.

02. Up to 9.66% relative improvement in cluster purity.

03. Prototypical networks outperform traditional speaker embeddings in this task.
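
The findings are stated as relative improvements and, for clustering, in terms of cluster purity. As a reading aid, the sketch below shows the usual definitions of both quantities. The function names and the toy labels are hypothetical and purely illustrative; they are not data from the paper, and the paper's exact evaluation protocol is not given on this page.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of samples whose true label matches the majority label of their cluster."""
    counts_by_cluster = {}
    for cid, label in zip(cluster_ids, true_labels):
        counts_by_cluster.setdefault(cid, Counter())[label] += 1
    # For each cluster, count the samples that carry its most common label.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(true_labels)


def relative_improvement(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline


# Toy example (made-up labels, not results from the paper).
clusters = [0, 0, 0, 1, 1, 1]
labels = ["child", "child", "adult", "adult", "adult", "child"]
purity = cluster_purity(clusters, labels)
```

A relative improvement is measured against the baseline score, so the same absolute gain yields a larger percentage when the baseline is low.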

Abstract

Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use-case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we address a specific sub-problem of speaker diarization, namely child-adult speaker classification in such dyadic conversations with specified roles. Training a speaker classification system robust to speaker and channel conditions is challenging due to inherent variability in the speech within children and the adult interlocutors. In this work, we propose the use of meta-learning, in particular, prototypical networks which optimize a metric space across multiple tasks. By modeling every child-adult pair in the training set as a separate task during meta-training, we learn a representation with improved generalizability compared to…
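
To make the abstract's description concrete, here is a minimal sketch of one prototypical-network episode, with a single child-adult pair treated as one task. It assumes the common formulation in which each class prototype is the mean of its support embeddings and queries are scored by a softmax over negative squared Euclidean distances; the distance choice, the function names, and the 2-D toy "embeddings" are assumptions for illustration, not details taken from the paper. In a real system the embeddings would come from a trained neural encoder and the loss would be backpropagated through it.

```python
import math
from collections import defaultdict


def prototypes(support):
    """Mean embedding per class, computed from (embedding, label) support pairs."""
    sums, counts = {}, defaultdict(int)
    for emb, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_probs(query, protos):
    """Softmax over negative squared distances from the query to each prototype."""
    logits = {c: -sq_dist(query, p) for c, p in protos.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {c: math.exp(v - top) for c, v in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def episode_loss(support, queries):
    """Average negative log-likelihood of the query labels for one episode (task)."""
    protos = prototypes(support)
    nll = [-math.log(class_probs(q, protos)[y]) for q, y in queries]
    return sum(nll) / len(nll)


# Toy episode for one hypothetical child-adult pair (made-up 2-D embeddings).
support = [([0.0, 0.0], "child"), ([0.0, 2.0], "child"),
           ([4.0, 0.0], "adult"), ([4.0, 2.0], "adult")]
queries = [([0.5, 1.0], "child"), ([3.5, 1.0], "adult")]

protos = prototypes(support)
predicted = []
for q, _ in queries:
    probs = class_probs(q, protos)
    predicted.append(max(probs, key=probs.get))
loss = episode_loss(support, queries)
```

During meta-training, a loss of this kind would be averaged over many episodes, each drawn from a different child-adult pair, so that the learned metric space is not tuned to any single speaker pair.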
