From Real to Cloned Singer Identification
Dorian Desblancs, Gabriel Meseguer-Brocal, Romain Hennequin, and Manuel Moussallam

TL;DR
This paper explores how well singer identification models can recover the original singer behind a cloned voice, revealing current limitations and biases when the input is synthetic.
Contribution
It introduces three embedding models trained with contrastive learning to identify singers and evaluates their performance on real and cloned voices, highlighting existing biases.
Findings
All three models identify real singers well.
Performance deteriorates when the models classify cloned versions of the singers in the evaluation set.
The drop is most pronounced for models that use mixtures.
Abstract
Cloned voices of popular singers sound increasingly realistic and have gained popularity over the past few years. They however pose a threat to the industry due to personality rights concerns. As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. We present three embedding models that are trained using a singer-level contrastive learning scheme, where positive pairs consist of segments with vocals from the same singers. These segments can be mixtures for the first model, vocals for the second, and both for the third. We demonstrate that all three models are highly capable of identifying real singers. However, their performance deteriorates when classifying cloned versions of singers in our evaluation set. This is especially true for models that use mixtures as an…
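The singer-level contrastive scheme described in the abstract can be illustrated with a minimal sketch. The abstract only states that positive pairs are segments containing vocals from the same singer; everything else here is an assumption made for illustration: the InfoNCE-style loss, the temperature value, the use of other singers in the batch as negatives, and the helper names `sample_positive_pairs` and `info_nce`. Embeddings are plain lists of floats standing in for the output of an unspecified encoder.

```python
import math
import random


def sample_positive_pairs(segments_by_singer, batch_size, rng):
    """Draw one (anchor, positive) pair of segments per sampled singer.

    Hypothetical helper: both segments in a pair come from the same singer,
    which is the only pairing rule stated in the abstract.
    """
    eligible = [s for s, segs in segments_by_singer.items() if len(segs) >= 2]
    singers = rng.sample(eligible, batch_size)
    return [tuple(rng.sample(segments_by_singer[s], 2)) for s in singers]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch of embedding pairs.

    Assumption: positives[i] is the positive for anchors[i], and the other
    rows of the batch (other singers) serve as negatives.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)
```

In this sketch, the three models from the abstract would differ only in what `segments_by_singer` contains: mixture segments, vocal segments, or both. The loss is small when each anchor embedding is closest to the embedding of its own singer's other segment, and large when it is closer to another singer's.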
Taxonomy
Topics: Diverse Musicological Studies
Methods: Contrastive Learning
