Speech Emotion Recognition Using Fine-Tuned DWFormer: A Study on Track 1 of the IERP Challenge 2024

Honghong Wang; Xupeng Jia; Jing Deng; Rong Zheng

arXiv:2508.11371·cs.SD·August 18, 2025

TL;DR

This paper fine-tunes DWFormer, a pre-trained speech emotion recognition model, with data augmentation and score fusion, and takes first place in Track 1 of the IERP Challenge 2024 using only audio features.

Contribution

It applies fine-tuning of DWFormer, combined with data augmentation and score fusion, to audio-based emotion recognition, and outperforms the other methods submitted to the challenge.

Findings

1. Achieved first place in Track 1 of the IERP Challenge 2024.

2. Demonstrated the effectiveness of data augmentation and score fusion.

3. Outperformed other participating teams in emotion recognition accuracy.

Abstract

Emotion recognition is a topic of strong interest in artificial intelligence. Most existing emotion recognition models aim to improve the accuracy of discrete emotion label prediction. Given the direct relationship between human personality and emotion, as well as the significant inter-individual differences in subjective emotional expression, the IERP Challenge 2024 incorporates personality traits into emotion recognition research. This paper presents the Fosafer submissions to Track 1 of the IERP Challenge 2024. This task primarily concerns the recognition of emotions in audio, while also providing text and audio features. In Track 1, we utilized exclusively audio-based features and fine-tuned a pre-trained speech emotion recognition model, DWFormer, through the integration of data augmentation and score fusion strategies, thereby…
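
The abstract names score fusion but this page does not say how the scores were combined. As a rough illustration only, the sketch below shows one common form of score-level fusion: each model's class logits are converted to probabilities with a softmax, and the probabilities are averaged (optionally with weights) before picking the top class. The function names, the label set, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(per_model_logits, weights=None):
    """Weighted average of per-model class probabilities.

    per_model_logits: one list of class logits per model, all the same length.
    weights: optional per-model weights; defaults to a plain average.
    This is an assumed fusion rule, not the one confirmed by the paper.
    """
    n_models = len(per_model_logits)
    if n_models == 0:
        raise ValueError("need at least one model's scores")
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")

    probs = [softmax(logits) for logits in per_model_logits]
    n_classes = len(probs[0])
    if any(len(p) != n_classes for p in probs):
        raise ValueError("all models must score the same classes")

    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / weight_sum
        for c in range(n_classes)
    ]


def predict(per_model_logits, labels, weights=None):
    """Return the label with the highest fused probability, plus the fused scores."""
    fused = fuse_scores(per_model_logits, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical label set and logits from two fine-tuned checkpoints.
LABELS = ["neutral", "happy", "sad"]
example_logits = [
    [2.0, 1.0, 0.0],  # model A leans "neutral"
    [0.0, 3.0, 0.0],  # model B strongly prefers "happy"
]
label, fused = predict(example_logits, LABELS)
print(label, [round(p, 3) for p in fused])
```

Averaging probabilities rather than raw logits keeps each model's contribution on the same scale, which matters when checkpoints trained on differently augmented data produce logits of different magnitudes. Per-model weights would typically be tuned on a validation set.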
