Automatic Identification of Samples in Hip-Hop Music via Multi-Loss Training and an Artificial Dataset
Huw Cheston, Jan Van Balen, Simon Durand

TL;DR
This paper presents a convolutional neural network, trained on an artificial dataset, that automatically identifies samples in hip-hop music even when they have been altered with audio effects, with the aim of improving music discovery tools.
Contribution
The study introduces a novel training method using an artificial dataset and multi-loss optimization to enhance sample detection in real-world hip-hop tracks.
Findings
Achieves 13% higher precision than existing fingerprinting methods.
Successfully detects pitch-shifted and time-stretched samples.
Locates sample positions within five seconds in half of tested recordings.
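To make the last finding concrete, the short Python sketch below computes the share of recordings in which a predicted sample position falls within a tolerance of the annotated position. It is only an illustration under stated assumptions: the function name `localization_hit_rate`, the five-second default tolerance applied as an absolute difference, and the timestamps are all hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
def localization_hit_rate(predicted_s, actual_s, tolerance_s=5.0):
    """Return the fraction of recordings whose predicted sample position
    (in seconds) lies within tolerance_s of the annotated position.

    Assumption: one predicted and one annotated position per recording.
    """
    if len(predicted_s) != len(actual_s):
        raise ValueError("predicted_s and actual_s must have the same length")
    if not predicted_s:
        raise ValueError("at least one recording is required")
    # Count recordings where the absolute error is inside the tolerance.
    hits = sum(
        abs(pred - true) <= tolerance_s
        for pred, true in zip(predicted_s, actual_s)
    )
    return hits / len(predicted_s)


# Hypothetical timestamps (seconds), invented purely for illustration.
predicted = [12.0, 40.5, 3.0, 95.0]
actual = [10.0, 48.0, 4.5, 60.0]
rate = localization_hit_rate(predicted, actual)
print(f"Located within 5 s in {rate:.0%} of recordings")
```

With these invented values, two of the four predictions land within five seconds, which mirrors the "half of tested recordings" phrasing above without reproducing the paper's data.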
Abstract
Sampling, the practice of reusing recorded music or sounds from another source in a new work, is common in popular music genres like hip-hop and rap. Numerous services have emerged that allow users to identify connections between samples and the songs that incorporate them, with the goal of enhancing music discovery. Designing a system that can perform the same task automatically is challenging, as samples are commonly altered with audio effects like pitch-shifting and time-stretching and may only be seconds long. Progress on this task has been minimal and is further blocked by the limited availability of training data. Here, we show that a convolutional neural network trained on an artificial dataset can identify real-world samples in commercial hip-hop music. We extract vocal, harmonic, and percussive elements from several databases of non-commercial music recordings using audio source…
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies
