The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems
Adaeze Adigwe, Noé Tits, Kevin El Haddad, Sarah Ostadabbas, and Thierry Dutoit

TL;DR
This paper introduces an open-source emotional speech database designed for voice synthesis, demonstrating its effectiveness through a simple conversion system that controls emotional expression in speech.
Contribution
The paper provides a new emotional speech database, intended to be open-sourced, with recordings of male and female actors in English and a male actor in French across 5 emotion classes, to enable emotion control in voice generation systems.
Findings
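The database is intended to support synthesis and voice transformation systems that control emotion across its 5 emotion classes.
A simple MLP system trained on the data converts neutral speech to an angry speech style.
A CMOS perception test of that system indicates that the data is effective, which is promising for future work.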
Abstract
In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes. It contains data for male and female actors in English and a male actor in French. The database covers 5 emotion classes, so it could be suitable for building synthesis and voice transformation systems with the potential to control the emotional dimension in a continuous way. We show the data's efficiency by building a simple MLP system that converts neutral speech to an angry speech style, and we evaluate it via a CMOS perception test. Even though the system is a very simple one, the test shows the efficiency of the data, which is promising for future work.
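As an illustration of how a CMOS (comparison mean opinion score) result like the one mentioned in the abstract is typically summarized, here is a minimal Python sketch. It rests on assumptions not stated above: listeners rate each sample pair on an integer scale from -3 to +3, a positive score favors the converted sample, and a normal-approximation 95% confidence interval is used. The function name `cmos_summary` and the ratings are hypothetical and invented for illustration; they are not results or a protocol from the paper.

```python
import math
import statistics


def cmos_summary(ratings):
    """Return (mean, half_width) for a list of comparison ratings.

    half_width is the half-width of a 95% confidence interval under a
    normal approximation (an assumption; small listener panels may
    warrant a t-distribution instead).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(ratings) / math.sqrt(n)
    return mean, 1.96 * sem


# Hypothetical listener ratings on an assumed -3..+3 scale.
example_ratings = [2, 1, 0, 1, -1, 2, 1, 0]
mean, half_width = cmos_summary(example_ratings)
print(f"CMOS = {mean:+.2f} +/- {half_width:.2f}")
```

A mean whose confidence interval lies entirely above zero would suggest that listeners preferred the converted samples under the assumed rating convention.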
Taxonomy
Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Speech and Audio Processing
