Speech Emotion Recognition using Semantic Information

Panagiotis Tzirakis; Anh Nguyen; Stefanos Zafeiriou; Björn W. Schuller

arXiv:2103.02993·cs.SD·March 5, 2021

TL;DR

This paper introduces a novel speech emotion recognition framework that combines semantic and paralinguistic features using an attention mechanism and LSTM, achieving state-of-the-art results on the SEWA dataset.

Contribution

It proposes a new framework that integrates semantic and paralinguistic speech features with an attention mechanism for improved emotion recognition.

Findings

01

Achieves state-of-the-art results on valence and liking dimensions.

02

Effectively captures both semantic and paralinguistic information.

03

Outperforms previous top models on the SEWA dataset.

Abstract

Speech emotion recognition is a crucial problem manifesting in a multitude of applications such as human-computer interaction and education. Although several advancements have been made in recent years, especially with the advent of Deep Neural Networks (DNNs), most studies in the literature fail to consider the semantic information in the speech signal. In this paper, we propose a novel framework that can capture both the semantic and the paralinguistic information in the signal. In particular, our framework comprises a semantic feature extractor, which captures the semantic information, and a paralinguistic feature extractor, which captures the paralinguistic information. Both semantic and paralinguistic features are then combined into a unified representation using a novel attention mechanism. The unified feature vector is passed through an LSTM to capture the temporal…
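To make the fusion step in the abstract more concrete, here is a minimal sketch of one generic way to combine two feature vectors with attention weights. It is not the paper's mechanism, which the abstract does not specify: the function `attention_fuse`, the `query` vector, and the toy feature values are all assumptions for illustration, and the LSTM stage is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(semantic, paralinguistic, query):
    """Fuse two equal-length feature vectors using softmax attention weights.

    `query` is a hypothetical scoring vector (learned in a real model);
    it is an assumption of this sketch, not something the paper defines.
    """
    if not (len(semantic) == len(paralinguistic) == len(query)):
        raise ValueError("all vectors must have the same length")
    # One relevance score per modality: dot product with the query vector.
    scores = [
        sum(q * x for q, x in zip(query, semantic)),
        sum(q * x for q, x in zip(query, paralinguistic)),
    ]
    w_sem, w_par = softmax(scores)
    # Unified representation: attention-weighted sum of the two modalities.
    fused = [w_sem * s + w_par * p for s, p in zip(semantic, paralinguistic)]
    return fused, (w_sem, w_par)


# Toy features for a single time step (made-up values).
semantic = [1.0, 0.0, 2.0]
paralinguistic = [0.0, 1.0, 0.0]
query = [0.5, -0.5, 0.25]

fused, weights = attention_fuse(semantic, paralinguistic, query)
print("attention weights:", weights)
print("fused vector:", fused)
```

In a full model, one such fused vector would be produced per time step and the resulting sequence fed to a recurrent layer such as an LSTM.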

Code & Models

Repositories

glam-imperial/semantic_speech_emotion_recognition (TensorFlow, official)
