Multi-Teacher Language-Aware Knowledge Distillation for Multilingual Speech Emotion Recognition

Mehedi Hasan Bijoy; Dejan Porjazovski; Tamás Grósz; Mikko Kurimo

arXiv:2506.08717 · cs.CL · January 27, 2026

TL;DR

This paper presents a novel language-aware multi-teacher knowledge distillation approach to develop a multilingual speech emotion recognition model, achieving state-of-the-art results across English, Finnish, and French datasets.

Contribution

Introduces a new language-aware multi-teacher knowledge distillation method leveraging Wav2Vec2.0 for multilingual speech emotion recognition.

Findings

01  State-of-the-art weighted recall of 72.9 on the English dataset

02  Unweighted recall of 63.4 on the Finnish dataset

03  Improved recall for the sad and neutral emotions
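
The two recall figures above are different averages, so they are not directly comparable. The sketch below assumes the standard definitions, which this page does not spell out: unweighted recall is the plain mean of per-class recalls, while weighted recall weights each class by its number of samples and therefore equals overall accuracy. The function names and the toy labels are illustrative only and are not taken from the paper or its repository.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return ({class: recall}, {class: support}) for the classes in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = {c: hits[c] / n for c, n in support.items()}
    return recalls, support


def unweighted_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally."""
    recalls, _ = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_recall(y_true, y_pred):
    """Per-class recalls weighted by class support (equals accuracy)."""
    recalls, support = per_class_recall(y_true, y_pred)
    total = sum(support.values())
    return sum(recalls[c] * support[c] for c in recalls) / total


# Toy, made-up labels: an imbalanced set where the minority class is half wrong.
y_true = ["neutral"] * 6 + ["sad"] * 2
y_pred = ["neutral"] * 6 + ["neutral", "sad"]

print(unweighted_recall(y_true, y_pred))  # 0.75
print(weighted_recall(y_true, y_pred))    # 0.875
```

On imbalanced data the weighted figure is pulled toward the majority class, which is why the two metrics can differ noticeably for the same model.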

Abstract

Speech Emotion Recognition (SER) is crucial for improving human-computer interaction. Despite strides in monolingual SER, extending them to build a multilingual system remains challenging. Our goal is to train a single model capable of multilingual SER by distilling knowledge from multiple teacher models. To address this, we introduce a novel language-aware multi-teacher knowledge distillation method to advance SER in English, Finnish, and French. It leverages Wav2Vec2.0 as the foundation of monolingual teacher models and then distills their knowledge into a single multilingual student model. The student model demonstrates state-of-the-art performance, with a weighted recall of 72.9 on the English dataset and an unweighted recall of 63.4 on the Finnish dataset, surpassing fine-tuning and knowledge distillation baselines. Our method excels in improving recall for sad and neutral…
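
The abstract does not give the training objective, so the following is only a minimal sketch of how a language-aware multi-teacher distillation loss could look. It assumes the common temperature-scaled formulation (KL divergence to a teacher's softened outputs plus cross-entropy on the true label) and assumes that "language-aware" means each utterance is matched to the teacher for its own language. The function `language_aware_kd_loss`, its parameters `temperature` and `alpha`, and the example logits are all hypothetical; the authors' actual method may differ. It is written in plain Python so that it runs without dependencies.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def language_aware_kd_loss(student_logits, teacher_logits_by_lang, lang,
                           label, temperature=2.0, alpha=0.5):
    """Distillation loss for one utterance (assumed formulation).

    The teacher is chosen by the utterance's language; this routing is an
    assumption about what "language-aware" means, not the paper's definition.
    """
    teacher_logits = teacher_logits_by_lang[lang]
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # Soft-target term, scaled by T^2 as in standard knowledge distillation.
    soft_loss = kl_divergence(p_teacher, p_student_soft) * temperature ** 2
    # Hard-label cross-entropy on the unsoftened student distribution.
    hard_loss = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Made-up logits over four emotion classes for three monolingual teachers.
teachers = {
    "en": [2.0, 0.5, -1.0, 0.1],
    "fi": [0.1, 1.8, 0.3, -0.7],
    "fr": [-0.4, 0.2, 1.5, 0.0],
}
loss = language_aware_kd_loss([1.5, 0.2, -0.5, 0.0], teachers, "fi", label=1)
print(loss)
```

Here `alpha` trades off imitation of the teacher against fitting the ground-truth label; with `alpha=1.0` the loss vanishes when the student reproduces the selected teacher's logits exactly.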

Code & Models

Repositories

aalto-speech/mtkd4ser
PyTorch · Official

Taxonomy

Methods: Knowledge Distillation