Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition
Ya Zhao, Yinfeng Yu, Liejun Wang

TL;DR
This paper introduces a semi-supervised cross-lingual speech emotion recognition framework called SERE, which leverages emotional resonance and dynamic features to improve recognition in low-resource languages without relying on target language labels.
Contribution
The paper presents a novel semi-supervised paradigm using Semantic-Emotional Resonance Embedding and a Triple-Resonance Interaction Chain loss, enabling effective cross-lingual emotion recognition with minimal labeled data.
Findings
Remains effective across multiple languages with only 5-shot labels from the source language
Outperforms existing methods in low-resource cross-lingual settings
Self-organizes unlabeled samples into emotion-semantic structures
Abstract
Cross-lingual Speech Emotion Recognition (CLSER) aims to identify emotional states in unseen languages. However, existing methods heavily rely on the semantic synchrony of complete labels and static feature stability, hindering low-resource languages from reaching high-resource performance. To address this, we propose a semi-supervised framework based on Semantic-Emotional Resonance Embedding (SERE), a cross-lingual dynamic feature paradigm that requires neither target language labels nor translation alignment. Specifically, SERE constructs an emotion-semantic structure using a small number of labeled samples. It learns human emotional experiences through an Instantaneous Resonance Field (IRF), enabling unlabeled samples to self-organize into this structure. This achieves semi-supervised semantic guidance and structural discovery. Additionally, we design a Triple-Resonance Interaction…
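The idea of building a structure from a few labeled samples and letting unlabeled samples organize around it can be illustrated with a generic prototype-based pseudo-labeling sketch. This is not the authors' SERE, IRF, or loss formulation, which the abstract does not specify. It is a minimal illustration that assumes utterances are already mapped to fixed-size embeddings; the function names (`build_prototypes`, `assign_unlabeled`), the cosine-similarity rule, and the `threshold` value are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def build_prototypes(labeled):
    """Average the few labeled embeddings of each emotion into one prototype.

    labeled: list of (embedding, emotion_label) pairs, e.g. 5 per class.
    """
    groups = defaultdict(list)
    for emb, label in labeled:
        groups[label].append(normalize(emb))
    protos = {}
    for label, embs in groups.items():
        mean = [sum(col) / len(embs) for col in zip(*embs)]
        protos[label] = normalize(mean)
    return protos


def assign_unlabeled(unlabeled, protos, threshold=0.5):
    """Attach each unlabeled embedding to its most similar prototype.

    Samples whose best similarity is below `threshold` (an assumed value)
    stay unassigned and are returned as None.
    """
    assignments = []
    for emb in unlabeled:
        e = normalize(emb)
        label, score = max(
            ((lab, cosine(e, p)) for lab, p in protos.items()),
            key=lambda pair: pair[1],
        )
        assignments.append(label if score >= threshold else None)
    return assignments


# Toy 2-D embeddings standing in for source-language labeled speech features.
labeled = [
    ([1.0, 0.1], "happy"), ([0.9, 0.2], "happy"),
    ([0.1, 1.0], "sad"), ([0.2, 0.9], "sad"),
]
# Toy embeddings standing in for unlabeled target-language utterances.
unlabeled = [[0.8, 0.1], [0.0, 0.7], [-1.0, -1.0]]

protos = build_prototypes(labeled)
pseudo = assign_unlabeled(unlabeled, protos)
print(pseudo)
```

In this toy setup the first two unlabeled points fall near a class prototype and receive a pseudo-label, while the third points away from both prototypes and stays unassigned. A real system would replace the hand-written vectors with learned speech embeddings and would likely refine the prototypes iteratively.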
