Emotion Intensity and its Control for Emotional Voice Conversion

Kun Zhou; Berrak Sisman; Rajib Rana; Björn W. Schuller; Haizhou Li

arXiv:2201.03967 · cs.SD · July 19, 2022

TL;DR

This paper introduces a method for emotional voice conversion that explicitly models and controls emotion intensity, enabling more expressive and nuanced speech synthesis while preserving content and speaker identity.

Contribution

It proposes a novel approach to disentangle speaker style from content and encode emotion intensity in a continuous space, improving emotional expressiveness in voice conversion.

Findings

1. Effective control of emotion intensity demonstrated
2. Improved emotional expressiveness validated through evaluations
3. Disentanglement of style and content enhances conversion quality

Abstract

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while preserving the linguistic content and speaker identity. In EVC, emotions are usually treated as discrete categories, overlooking the fact that speech also conveys emotions with various intensity levels that the listener can perceive. In this paper, we aim to explicitly characterize and control the intensity of emotion. We propose to disentangle the speaker style from linguistic content and encode the speaker style into a style embedding in a continuous space that forms the prototype of emotion embedding. We further learn the actual emotion encoder from an emotion-labelled database and study the use of relative attributes to represent fine-grained emotion intensity. To ensure emotional intelligibility, we incorporate emotion classification loss and emotion embedding similarity loss into the…
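The abstract mentions relative attributes as a way to represent fine-grained emotion intensity. As a rough illustration of that general idea, and not the authors' implementation, the sketch below learns a ranking direction `w` so that emotional samples score higher than neutral ones, then rescales the scores to [0, 1] as an intensity value. Everything here is an assumption made for illustration: the synthetic 8-dimensional features, the pairwise hinge-loss training loop, and the hypothetical helper names `fit_ranking_direction` and `intensity`.

```python
import numpy as np

def fit_ranking_direction(neutral, emotional, lr=0.1, epochs=200, margin=1.0, reg=1e-3):
    """Learn w so that w @ x_emotional exceeds w @ x_neutral by a margin.

    Uses a pairwise hinge loss over all (emotional, neutral) pairs,
    minimized with plain gradient descent (an illustrative choice).
    """
    dim = neutral.shape[1]
    # Difference vectors for every (emotional_i, neutral_j) pair.
    diffs = (emotional[:, None, :] - neutral[None, :, :]).reshape(-1, dim)
    w = np.zeros(dim)
    for _ in range(epochs):
        violated = diffs @ w < margin  # pairs not yet ranked with enough margin
        grad = reg * w - diffs[violated].sum(axis=0) / len(diffs)
        w -= lr * grad
    return w

def intensity(w, feats, lo, hi):
    """Map ranking scores to [0, 1] using min-max bounds lo and hi."""
    scores = feats @ w
    return np.clip((scores - lo) / (hi - lo), 0.0, 1.0)

# Toy data: synthetic stand-ins for acoustic features (assumption).
rng = np.random.default_rng(0)
neutral = rng.normal(0.0, 1.0, size=(50, 8))
emotional = rng.normal(0.0, 1.0, size=(50, 8))
emotional[:, 0] += 3.0  # emotional samples differ along the first feature

w = fit_ranking_direction(neutral, emotional)
all_scores = np.concatenate([neutral, emotional]) @ w
lo, hi = all_scores.min(), all_scores.max()
neutral_int = intensity(w, neutral, lo, hi)
emotional_int = intensity(w, emotional, lo, hi)
print(neutral_int.mean(), emotional_int.mean())
```

In a scheme like this, the scalar intensity could be supplied to a conversion model as a control input at run time; how the paper actually does this is not stated in the truncated abstract above.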
