TL;DR
This paper systematically evaluates machine-generated music (MGM) detection models, compares their performance, and applies explainability techniques to interpret their decisions, with the aim of improving MGM detection methods.
Contribution
It provides the first comprehensive systematic evaluation of MGM detection models across various architectures and introduces explainability analysis to interpret model decisions.
Findings
ResNet18 outperforms other models in detection accuracy.
Multimodal models are effective due to music's inherent multimodal nature.
Explainability tools reveal key features influencing model decisions.
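To make the explainability finding concrete, here is a minimal sketch of occlusion sensitivity, one common way to see which spectrogram regions drive a detector's score. It is an illustration only: this summary does not name the explainability tools used in the paper, and `toy_score` is a hypothetical stand-in for a trained detector such as ResNet18.

```python
from typing import Callable, List

# Rows are frequency bins, columns are time frames.
Spectrogram = List[List[float]]


def toy_score(spec: Spectrogram) -> float:
    """Hypothetical detector stand-in: mean energy in the upper half of the bins."""
    upper = spec[len(spec) // 2:]
    values = [v for row in upper for v in row]
    return sum(values) / len(values)


def occlusion_map(
    spec: Spectrogram,
    score: Callable[[Spectrogram], float],
    patch: int = 2,
) -> List[List[float]]:
    """Zero out each patch in turn and record how much the score drops."""
    base = score(spec)
    n_rows, n_cols = len(spec), len(spec[0])
    heat = [[0.0] * n_cols for _ in range(n_rows)]
    for r0 in range(0, n_rows, patch):
        for c0 in range(0, n_cols, patch):
            rows = range(r0, min(r0 + patch, n_rows))
            cols = range(c0, min(c0 + patch, n_cols))
            occluded = [row[:] for row in spec]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = 0.0
            drop = base - score(occluded)
            # A larger drop means the patch mattered more to the score.
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


# Assumed toy input: a 4x4 spectrogram with uniform energy.
spec = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(spec, toy_score)
```

With this toy scorer, only patches in the upper half of the bins produce a non-zero drop, which is the kind of localised evidence an occlusion map is meant to surface. A real analysis would replace `toy_score` with the trained model's output probability.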
Abstract
Machine-generated music (MGM) has become a groundbreaking innovation with wide-ranging applications, such as music therapy, personalised editing, and creative inspiration within the music industry. However, the unregulated proliferation of MGM presents considerable challenges to the entertainment, education, and arts sectors by potentially undermining the value of high-quality human compositions. Consequently, MGM detection (MGMD) is crucial for preserving the integrity of these fields. Despite its significance, the MGMD domain lacks the comprehensive systematic evaluation results necessary to drive meaningful progress. To address this gap, we conduct experiments on existing large-scale datasets using a range of foundational models for audio processing, establishing systematic evaluation results tailored to the MGMD task. Our selection includes traditional machine learning models, deep neural…
