# Attentive Convolutional Neural Network based Speech Emotion Recognition: A Study on the Impact of Input Features, Signal Length, and Acted Speech

**Authors:** Michael Neumann, Ngoc Thang Vu

arXiv: 1706.00612 · 2017-06-05

## TL;DR

This study evaluates an attentive convolutional neural network for speech emotion recognition, analyzing how input features, signal length, and speech type affect performance, and achieves state-of-the-art results on improvised speech data.

## Contribution

The paper presents an attentive CNN trained with a multi-view learning objective for speech emotion recognition and systematically examines how input signal length, acoustic feature type, and speech type (improvised vs. scripted) affect performance.

## Key findings

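- Recognition performance on IEMOCAP depends strongly on the type of speech data (improvised vs. scripted).
- This dependence on speech type holds regardless of which input features are used.
- State-of-the-art results are achieved on the improvised speech portion of IEMOCAP.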

## Abstract

Speech emotion recognition is an important and challenging task in the realm of human-computer interaction. Prior work proposed a variety of models and feature sets for training a system. In this work, we conduct extensive experiments using an attentive convolutional neural network with a multi-view learning objective function. We compare system performance using different lengths of the input signal, different types of acoustic features and different types of emotion speech (improvised/scripted). Our experimental results on the Interactive Emotional Motion Capture (IEMOCAP) database reveal that the recognition performance strongly depends on the type of speech data independent of the choice of input features. Furthermore, we achieved state-of-the-art results on the improvised speech data of IEMOCAP.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.00612/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1706.00612/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1706.00612/full.md

---
Source: https://tomesphere.com/paper/1706.00612