Emo-StarGAN: A Semi-Supervised Any-to-Many Non-Parallel Emotion-Preserving Voice Conversion

Suhita Ghosh, Arnab Das, Yamini Sinha, Ingo Siegert, Tim Polzehl, and Sebastian Stober

arXiv:2309.07586 · eess.AS · September 15, 2023

TL;DR

This paper introduces Emo-StarGAN, a semi-supervised voice conversion model that preserves emotion during anonymization. It uses emotion-aware losses and direct supervision from an emotion classifier to improve emotion preservation over vanilla StarGANv2-VC.

Contribution

It presents a novel semi-supervised any-to-many voice conversion approach that enhances emotion preservation while maintaining anonymization on non-parallel data.

Findings

01. Significant improvement in emotion preservation demonstrated
02. Effective across diverse datasets and emotions
03. Maintains intelligibility and anonymization

Abstract

Speech anonymisation prevents misuse of spoken data by removing any personal identifier while preserving at least linguistic content. However, emotion preservation is crucial for natural human-computer interaction. The well-known voice conversion technique StarGANv2-VC achieves anonymisation but fails to preserve emotion. This work presents an any-to-many semi-supervised StarGANv2-VC variant trained on partially emotion-labelled non-parallel data. We propose emotion-aware losses computed on the emotion embeddings and acoustic features correlated to emotion. Additionally, we use an emotion classifier to provide direct emotion supervision. Objective and subjective evaluations show that the proposed approach significantly improves emotion preservation over the vanilla StarGANv2-VC. This considerable improvement is seen over diverse datasets, emotions, target speakers, and inter-group…
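The abstract names three emotion-related training signals: a loss on emotion embeddings, a loss on acoustic features correlated with emotion, and supervision from an emotion classifier on data that is only partially emotion-labelled. The sketch below shows one way such terms could be combined. It is an illustration under assumptions, not the authors' implementation: the abstract does not give the loss forms, so the L1 distance, the cross-entropy term, the weights, and every function and parameter name here are hypothetical.

```python
import math


def l1_distance(a, b):
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cross_entropy(logits, label):
    """Cross-entropy for one example, computed from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def emotion_aware_loss(src_emb, conv_emb, src_feats, conv_feats,
                       conv_logits, emotion_label=None,
                       w_emb=1.0, w_feat=1.0, w_cls=1.0):
    """Hypothetical combination of the three emotion terms.

    src_* come from the source utterance, conv_* from the converted one.
    emotion_label is None for unlabelled samples (assumed semi-supervised
    handling: the classifier term is simply skipped for them).
    """
    # Assumption: embeddings of source and converted speech should match.
    loss = w_emb * l1_distance(src_emb, conv_emb)
    # Assumption: emotion-correlated acoustic features should match too.
    loss += w_feat * l1_distance(src_feats, conv_feats)
    # Classifier supervision only where an emotion label exists.
    if emotion_label is not None:
        loss += w_cls * cross_entropy(conv_logits, emotion_label)
    return loss
```

In this sketch an unlabelled sample still contributes through the embedding and feature terms, which is one plausible reading of how partially labelled data could be used.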

Code & Models

Repositories

suhitaghosh10/emo-stargan (PyTorch, official)
