AI-Generated Song Detection via Lyrics Transcripts
Markus Frohmann, Elena V. Epure, Gabriel Meseguer-Brocal, Markus Schedl, Romain Hennequin

TL;DR
This paper introduces a method for detecting AI-generated music by transcribing lyrics with automatic speech recognition (ASR) models and analyzing the transcripts with language models, outperforming audio-based detectors, especially under audio perturbations.
Contribution
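It proposes detecting AI-generated music from ASR lyrics transcripts analyzed with language models. This addresses two practical limitations of earlier work: audio-based detectors struggle to generalize to unseen generators and perturbed audio, and earlier lyrics-based detection relied on clean, accurately formatted lyrics from a provider database, which are not available when only the audio is.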
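The two-stage idea (audio to transcript, transcript to detector) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class name `TranscriptDetector`, the `transcribe` and `embed` callables, and the nearest-centroid classifier are all assumptions made for the example. In practice `transcribe` would wrap an ASR model and `embed` a language-model text encoder; the summary does not say which detectors the paper trains on top of the transcripts.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def _centroid(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TranscriptDetector:
    """Hypothetical pipeline: audio -> transcript -> embedding -> label."""

    def __init__(self, transcribe: Callable[[str], str],
                 embed: Callable[[str], Vector]) -> None:
        self.transcribe = transcribe  # assumed ASR wrapper: audio path -> lyrics text
        self.embed = embed            # assumed text encoder: lyrics text -> vector
        self.centroids: Dict[int, Vector] = {}

    def _features(self, audio_path: str) -> Vector:
        # Only the audio is needed; lyrics are recovered by transcription.
        return self.embed(self.transcribe(audio_path))

    def fit(self, audio_paths: Sequence[str],
            labels: Sequence[int]) -> "TranscriptDetector":
        """labels: 1 = AI-generated, 0 = human-made; both must be present."""
        for label in (0, 1):
            vectors = [self._features(path)
                       for path, y in zip(audio_paths, labels) if y == label]
            if not vectors:
                raise ValueError(f"no training examples for label {label}")
            self.centroids[label] = _centroid(vectors)
        return self

    def predict(self, audio_path: str) -> int:
        # Stand-in classifier: pick the class whose centroid is most similar.
        features = self._features(audio_path)
        return max(self.centroids,
                   key=lambda label: _cosine(features, self.centroids[label]))
```

Keeping the transcriber and the encoder as injected callables means either stage can be swapped without touching the detector, which is convenient when comparing several ASR models or text encoders.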
Findings
High detection accuracy across multiple languages and genres.
Greater robustness than audio-based detectors under audio perturbations.
Effective detection of various AI music generators.
Abstract
The recent rise in capabilities of AI-based music generation tools has created an upheaval in the music industry, necessitating the creation of accurate methods to detect such AI-generated content. This can be done using audio-based detectors; however, it has been shown that they struggle to generalize to unseen generators or when the audio is perturbed. Furthermore, recent work used accurate and cleanly formatted lyrics sourced from a lyrics provider database to detect AI-generated music. However, in practice, such perfect lyrics are not available (only the audio is); this leaves a substantial gap in applicability in real-life use cases. In this work, we instead propose closing this gap by transcribing songs using general automatic speech recognition (ASR) models. We do this using several detectors. The results on diverse, multi-genre, and multi-lingual lyrics show generally strong…
