ANGUS: Real-time manipulation of vocal roughness for emotional speech transformations

Marco Liuni; Luc Ardaillon; Louise Bonal; Lou Seropian; Jean-Julien Aucouturier

arXiv:2008.11241·cs.SD·August 27, 2020·5 cites

TL;DR

This paper introduces ANGUS, a real-time, computationally efficient voice transformation algorithm that simulates vocal roughness to enhance emotional expression in speech, with perceptually validated effects and an open-source implementation.

Contribution

ANGUS is a novel real-time algorithm that parametrically manipulates vocal roughness, enabling emotional speech transformation with perceptually validated effects.

Findings

01

ANGUS allows parametric control over the spectral features of roughness, such as the presence of sub-harmonics and noise.

02

It increases the emotional negativity perceived by listeners to a level comparable to that of a non-real-time, state-of-the-art analysis/resynthesis algorithm.

03

Listeners cannot distinguish transformed from untransformed sounds above chance level.

Abstract

Vocal arousal, the non-linear acoustic features taken on by human and animal vocalizations when highly aroused, has an important communicative function because it signals aversive states such as fear, pain or distress. In this work, we present a computationally efficient, real-time voice transformation algorithm, ANGUS, which uses amplitude modulation and time-domain filtering to simulate roughness, an important component of vocal arousal, in arbitrary voice recordings. In a series of 4 studies, we show that ANGUS allows parametric control over the spectral features of roughness, such as the presence of sub-harmonics and noise; that ANGUS increases the emotional negativity perceived by listeners to a level comparable to that of a non-real-time, state-of-the-art analysis/resynthesis algorithm; that listeners cannot distinguish transformed and non-transformed sounds above chance level; and…
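The abstract names two building blocks, amplitude modulation and time-domain filtering. The Python sketch below illustrates the general idea on a synthetic harmonic tone; it is not the authors' implementation. The function names, the modulator at half the fundamental frequency, the modulation depth, and the first-order high-pass filter with a 50 Hz cutoff are all assumptions chosen for illustration. Modulating at f0/2 produces sidebands halfway between the harmonics, which is one simple way to obtain sub-harmonic energy.

```python
import numpy as np


def amplitude_modulate(signal, sample_rate, f0, subharmonic_order=2, depth=0.5):
    """Multiply the signal by 1 + depth * sin(2*pi*(f0/order)*t).

    Hypothetical helper: a modulator at f0/order creates sidebands
    at each harmonic frequency plus and minus f0/order.
    """
    t = np.arange(len(signal)) / sample_rate
    modulator = 1.0 + depth * np.sin(2.0 * np.pi * (f0 / subharmonic_order) * t)
    return signal * modulator


def one_pole_highpass(signal, sample_rate, cutoff_hz):
    """First-order high-pass filter applied sample by sample in the time domain.

    Hypothetical stand-in for the filtering stage; the filter design
    used by ANGUS is not specified in the abstract.
    """
    rc = 1.0 / (2.0 * np.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = np.zeros(len(signal))
    for n in range(1, len(signal)):
        out[n] = alpha * (out[n - 1] + signal[n] - signal[n - 1])
    return out


def band_magnitude(signal, sample_rate, freq_hz):
    """Magnitude of the Hann-windowed spectrum at the bin nearest freq_hz."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    bin_index = int(round(freq_hz * len(signal) / sample_rate))
    return spectrum[bin_index]


# Synthetic "voice": one second of a 200 Hz tone with five harmonics.
sample_rate = 16000
f0 = 200.0
t = np.arange(sample_rate) / sample_rate
voice = sum(np.sin(2.0 * np.pi * k * f0 * t) / k for k in range(1, 6))

# Assumed processing chain: modulate first, then filter.
modulated = amplitude_modulate(voice, sample_rate, f0)
rough = one_pole_highpass(modulated, sample_rate, cutoff_hz=50.0)

# Compare energy between the first and second harmonics (1.5 * f0).
before = band_magnitude(voice, sample_rate, 1.5 * f0)
after = band_magnitude(rough, sample_rate, 1.5 * f0)
print(f"magnitude at {1.5 * f0:.0f} Hz before: {before:.4f}, after: {after:.4f}")
```

In this sketch, `depth` and `subharmonic_order` play the role of the parametric controls: raising `depth` strengthens the sidebands, and changing `subharmonic_order` moves them to other fractions of the fundamental. A real-time system would also need to estimate f0 from the incoming audio, which is omitted here.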

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis