# A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach

**Authors:** Noé Tits

arXiv: 1907.02784 · 2019-07-08

## TL;DR

This paper presents a deep learning methodology for controlling emotional expressiveness in synthetic speech, involving data collection, automatic annotation, and a TTS system that incorporates emotion control.

## Contribution

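It proposes a three-step methodology for building emotionally controllable TTS systems: collecting emotional speech data, automatically annotating it with emotion/expressiveness features obtained through transfer learning, and training a deep learning-based TTS system that takes text and those features as input.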

## Key findings

- Transfer learning effectively extracts emotional features.
- Fine-tuning improves emotional expressiveness without compromising intelligibility.
- A visualization method helps interpret the correlation between vocal and emotional features.

## Abstract

In this project, we aim to build a Text-to-Speech system able to produce speech with a controllable emotional expressiveness. We propose a methodology for solving this problem in three main steps. The first is the collection of emotional speech data. We discuss the various formats of existing datasets and their usability in speech generation. The second step is the development of a system to automatically annotate data with emotion/expressiveness features. We compare several techniques using transfer learning to extract such a representation through other tasks and propose a method to visualize and interpret the correlation between vocal and emotional features. The third step is the development of a deep learning-based system taking text and emotion/expressiveness as input and producing speech as output. We study the impact of fine tuning from a neutral TTS towards an emotional TTS in terms of intelligibility and perception of the emotion.
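The second step mentions interpreting the correlation between vocal and emotional features. As a rough illustration of that idea only, the sketch below computes a Pearson correlation table between per-utterance vocal features and emotion dimensions. It is not the paper's method: the feature names (`mean_f0`, `jitter`), the emotion dimensions (`arousal`, `valence`), and the synthetic data are all assumptions made for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_table(vocal, emotion):
    """Correlate every vocal feature with every emotion dimension.

    Both arguments map a name to a list holding one value per utterance.
    """
    return {
        v_name: {e_name: pearson(v_vals, e_vals) for e_name, e_vals in emotion.items()}
        for v_name, v_vals in vocal.items()
    }


# Synthetic toy data (hypothetical names, not taken from the paper).
random.seed(0)
n = 200
arousal = [random.gauss(0, 1) for _ in range(n)]
valence = [random.gauss(0, 1) for _ in range(n)]
vocal = {
    # Built to depend strongly on arousal.
    "mean_f0": [2.0 * a + random.gauss(0, 0.1) for a in arousal],
    # Built to be independent of both emotion dimensions.
    "jitter": [random.gauss(0, 1) for _ in range(n)],
}
emotion = {"arousal": arousal, "valence": valence}

table = correlation_table(vocal, emotion)
for name, row in table.items():
    print(name, {k: round(r, 2) for k, r in row.items()})
```

In this toy setup, `mean_f0` correlates strongly with `arousal` by construction, while `jitter` stays near zero for both dimensions. With real data, the emotion columns would come from the automatic annotation step, and the resulting table could be rendered as a heatmap.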

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.02784/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.02784/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1907.02784/full.md

---
Source: https://tomesphere.com/paper/1907.02784