SONICS: Synthetic Or Not -- Identifying Counterfeit Songs

Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah

arXiv:2408.14080 · cs.SD · February 26, 2025 · 3 cites



TL;DR

This paper introduces SONICS, a large dataset for detecting AI-generated synthetic songs, emphasizing the importance of modeling long-range dependencies with a new efficient architecture, SpecTTTra, to improve detection accuracy and efficiency.

Contribution

The paper presents SONICS, a comprehensive dataset for end-to-end fake song detection, and proposes SpecTTTra, an innovative architecture that enhances long-range temporal modeling with better efficiency.

Findings

01

SpecTTTra outperforms ViT by 8% in F1 score on long songs.

02

On long songs, SpecTTTra is 38% faster and uses 26% less memory than ViT.

03

The SONICS dataset includes over 97k songs (4,751 hours), addressing gaps in earlier datasets: limited music-lyrics diversity, no long-duration songs, and no open-access fake songs.
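
For readers unfamiliar with the metric behind the first finding, F1 is the harmonic mean of precision and recall. The snippet below is a minimal sketch of how it could be computed for a binary real-versus-synthetic classifier. The label convention (1 = synthetic, 0 = real), the function names, and the toy labels are assumptions made for illustration; this is not the paper's evaluation code.

```python
from typing import Sequence


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return F1 from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        # With no true positives, precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_from_labels(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Count tp/fp/fn for the `positive` class, then compute F1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return f1_from_counts(tp, fp, fn)


# Hypothetical toy labels: 1 = synthetic song, 0 = real song.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_from_labels(y_true, y_pred)  # tp=2, fp=1, fn=1, so F1 = 2/3
```

In this toy case precision and recall are both 2/3, so F1 is also 2/3. Because F1 ignores true negatives, it stays informative when the real and synthetic classes are not balanced.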

Abstract

The recent surge in AI-generated songs presents exciting possibilities and challenges. These innovations necessitate the ability to distinguish between human-composed and synthetic songs to safeguard artistic integrity and protect human musical artistry. Existing research and datasets in fake song detection only focus on singing voice deepfake detection (SVDD), where the vocals are AI-generated but the instrumental music is sourced from real songs. However, these approaches are inadequate for detecting contemporary end-to-end artificial songs where all components (vocals, music, lyrics, and style) could be AI-generated. Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k…


Code & Models

Datasets

awsaf49/sonics
dataset · 608 downloads

Videos

SONICS: Synthetic Or Not - Identifying Counterfeit Songs · SlidesLive

Taxonomy

Topics: Diverse Musicological Studies · Music History and Culture · Music and Audio Processing

Methods: ConvNeXt · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus