SingIt! Singer Voice Transformation

Amit Eliav; Aaron Taub; Renana Opochinsky; Sharon Gannot

arXiv:2405.04627·eess.AS·May 9, 2024

TL;DR

This paper introduces SingIt!, a system that converts ordinary speech into a singing voice using zero-shot, many-to-many style transfer. It is built from simple, modular components and aims to let anyone sing any song quickly.

Contribution

The paper presents a zero-shot, many-to-many style transfer model that generates a singing voice from speech by combining several available blocks with a modified auto-encoder.

Findings

01 The system's applicability for speech-to-singing conversion is demonstrated with a group of 25 non-expert listeners.

02 Samples of the singing voices generated by the model are provided.

03 Combining several available blocks with a modified auto-encoder shows that this complex task can be handled with rather simple components.

Abstract

In this paper, we propose a model that generates a singing voice from a normal speech utterance by harnessing zero-shot, many-to-many style transfer learning. Our goal is to give anyone the opportunity to sing any song in a timely manner. We present a system comprising several available blocks, as well as a modified auto-encoder, and show how this highly complex challenge can be met by combining rather simple solutions. We demonstrate the applicability of the proposed system using a group of 25 non-expert listeners. Samples of the data generated by our model are provided.

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing