ANGUS: Real-time manipulation of vocal roughness for emotional speech transformations
Marco Liuni, Luc Ardaillon, Louise Bonal, Lou Seropian, Jean-Julien Aucouturier

TL;DR
This paper introduces ANGUS, a real-time, computationally efficient voice transformation algorithm that simulates vocal roughness to enhance emotional expression in speech, with perceptually validated effects and an open-source implementation.
Contribution
ANGUS is a novel real-time algorithm that parametrically manipulates vocal roughness, enabling emotional speech transformation with perceptually validated effects.
Findings
ANGUS allows control over spectral roughness features.
It increases perceived emotional negativity to levels comparable with state-of-the-art methods.
Listeners cannot reliably distinguish transformed from untransformed sounds.
Abstract
Vocal arousal, the non-linear acoustic features taken on by human and animal vocalizations when highly aroused, has an important communicative function because it signals aversive states such as fear, pain or distress. In this work, we present a computationally efficient, real-time voice transformation algorithm, ANGUS, which uses amplitude modulation and time-domain filtering to simulate roughness, an important component of vocal arousal, in arbitrary voice recordings. In a series of four studies, we show that ANGUS allows parametric control over the spectral features of roughness, such as the presence of sub-harmonics and noise; that ANGUS increases the emotional negativity perceived by listeners to a level comparable to that of a state-of-the-art, non-real-time analysis/resynthesis algorithm; that listeners cannot distinguish transformed and non-transformed sounds above chance level; and…
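To make the amplitude-modulation idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a fixed, known fundamental frequency and omits the time-domain filtering stage the abstract mentions. The names `add_roughness`, `tone_amplitude`, `depth` and `subharmonic` are hypothetical and chosen only for illustration. Multiplying a tone at f0 by a slow gain oscillating at f0/2 creates sidebands at f0 ± f0/2, which is one simple way sub-harmonics can appear in the spectrum.

```python
import math


def add_roughness(signal, sample_rate, f0_hz, depth=0.5, subharmonic=2):
    """Amplitude-modulate `signal` at f0_hz / subharmonic (illustrative helper).

    Assumption: f0_hz is known and constant over the whole signal.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    mod_hz = f0_hz / subharmonic
    step = 2.0 * math.pi * mod_hz / sample_rate
    rough = []
    for n, x in enumerate(signal):
        # Gain oscillates between (1 - depth) and 1, so it never boosts the input.
        gain = 1.0 - 0.5 * depth * (1.0 + math.cos(step * n))
        rough.append(x * gain)
    return rough


def tone_amplitude(signal, sample_rate, freq_hz):
    """Estimate the amplitude of one sinusoidal component by correlation.

    Accurate when the signal holds a whole number of cycles of freq_hz.
    """
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(w * n) for n, x in enumerate(signal))
    im = sum(x * math.sin(w * n) for n, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)


if __name__ == "__main__":
    sr, f0 = 8000, 200.0
    # One second of a pure tone standing in for a voiced sound.
    clean = [math.sin(2.0 * math.pi * f0 * n / sr) for n in range(sr)]
    rough = add_roughness(clean, sr, f0, depth=0.5, subharmonic=2)
    print("energy at f0/2 before:", tone_amplitude(clean, sr, f0 / 2))
    print("energy at f0/2 after: ", tone_amplitude(rough, sr, f0 / 2))
```

In this sketch, `depth` plays the role of a single roughness control: at 0 the signal is unchanged, and larger values put more energy into the f0/2 and 3·f0/2 sidebands. A real-time system would also need to estimate f0 frame by frame, which is outside the scope of this example.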
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
