Meta-Learning Approaches for Speaker-Dependent Voice Fatigue Models
Roseline Polle, Agnes Norbury, Alexandra Livia Georgescu, Nicholas Cummins, Stefano Goria

TL;DR
This paper reformulates speaker adaptation for voice fatigue models as a meta-learning problem and reports that all three approaches tested outperform traditional models, with the transformer-based sequence model performing best.
Contribution
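The paper reformulates speaker adaptation as a meta-learning problem and compares three approaches of increasing complexity (ensemble-based distance models, prototypical networks, and transformer-based sequence models), all built on pre-trained speech embeddings.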
Findings
All meta-learning methods outperformed traditional models.
The transformer-based approach achieved the best performance.
Meta-learning improves speaker-dependent fatigue prediction accuracy.
Abstract
Speaker-dependent modelling can substantially improve performance in speech-based health monitoring applications. While mixed-effect models are commonly used for such speaker adaptation, they require computationally expensive retraining for each new observation, making them impractical in a production environment. We reformulate this task as a meta-learning problem and explore three approaches of increasing complexity: ensemble-based distance models, prototypical networks, and transformer-based sequence models. Using pre-trained speech embeddings, we evaluate these methods on a large longitudinal dataset of shift workers (N=1,185, 10,286 recordings), predicting time since sleep from speech as a function of fatigue, a symptom commonly associated with ill-health. Our results demonstrate that all meta-learning approaches tested outperformed both cross-sectional and conventional…
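As a rough illustration of why a distance-based formulation avoids per-observation retraining, the sketch below predicts a value for a new recording by weighting the labels of the same speaker's earlier recordings by embedding similarity. It is not the authors' method: the function name `predict_from_support`, the `temperature` parameter, the softmax weighting, and the toy 2-D "embeddings" are all assumptions made for illustration.

```python
import numpy as np


def predict_from_support(query, support_x, support_y, temperature=1.0):
    """Distance-weighted estimate for one query embedding (hypothetical helper).

    support_x: (n, d) embeddings of a speaker's earlier labelled recordings.
    support_y: (n,) labels for those recordings (e.g. hours since sleep).
    No model parameters are updated; adaptation happens only through the
    support set passed in at inference time.
    """
    query = np.asarray(query, dtype=float)
    support_x = np.asarray(support_x, dtype=float)
    support_y = np.asarray(support_y, dtype=float)

    # Squared Euclidean distance from the query to each support embedding.
    d2 = np.sum((support_x - query) ** 2, axis=1)

    # Softmax over negative distances; subtract the max for numerical stability.
    logits = -d2 / temperature
    logits -= logits.max()
    weights = np.exp(logits)
    weights /= weights.sum()

    # Weighted average of the support labels.
    return float(weights @ support_y)


# Toy data (assumed, not from the paper): three earlier recordings of one speaker.
support_x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
support_y = np.array([2.0, 8.0, 14.0])

estimate = predict_from_support([0.9, 0.1], support_x, support_y, temperature=0.1)
print(estimate)
```

Adding a new labelled recording for a speaker only means appending a row to `support_x` and `support_y`, which is the practical contrast with refitting a mixed-effect model after every observation.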
Taxonomy
Topics: Sleep and Work-Related Fatigue · Obstructive Sleep Apnea Research · Voice and Speech Disorders
