Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings
Iroro Orife

TL;DR
This paper presents a zero-shot audio retrieval system that uses speech activity detection, music structure analysis, and CLAP embeddings to identify DJ tools in personal music collections, aiding DJs in live and studio settings.
Contribution
The paper introduces a novel approach that combines speech/music activity detection, music boundary analysis, and CLAP embeddings for zero-shot DJ tool retrieval.
Findings
Demonstrates retrieval (or rediscovery) of DJ tools from personal music collections.
Targets both live performance and studio use.
Builds on open-source libraries for speech/music activity detection and music boundary analysis, plus a CLAP model for zero-shot audio classification.
Abstract
In genres like Hip-Hop, RnB, Reggae, Dancehall and just about every Electronic/Dance/Club style, DJ tools are a special set of audio files curated to heighten the DJ's musical performance and creative mixing choices. In this work we demonstrate an approach to discovering DJ tools in personal music collections. Leveraging open-source libraries for speech/music activity, music boundary analysis and a Contrastive Language-Audio Pretraining (CLAP) model for zero-shot audio classification, we demonstrate a novel system designed to retrieve (or rediscover) compelling DJ tools for use live or in the studio.
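The zero-shot classification step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that an audio segment and a set of text prompts have already been embedded into a shared space (as a CLAP model's audio and text encoders would do), and it scores the segment against each prompt by cosine similarity followed by a softmax. The label prompts, the toy three-dimensional vectors, the `temperature` value, and the function names are all hypothetical placeholders chosen for illustration; they are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_scores(audio_emb, label_embs, temperature=0.07):
    """Softmax over scaled cosine similarities between one audio
    embedding and each text-prompt embedding.

    temperature is an assumed value, not one reported in the paper.
    """
    labels = list(label_embs)
    logits = [cosine(audio_emb, label_embs[name]) / temperature for name in labels]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(labels, exps)}


# Hypothetical prompt embeddings; a real system would obtain these
# from a CLAP text encoder rather than hand-written vectors.
label_embs = {
    "a cappella vocal": [1.0, 0.0, 0.0],
    "instrumental beat": [0.0, 1.0, 0.0],
    "full song": [0.0, 0.0, 1.0],
}

# Hypothetical embedding of one audio segment (e.g. a span between
# two detected music boundaries).
segment_emb = [0.9, 0.1, 0.2]

scores = zero_shot_scores(segment_emb, label_embs)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

In a fuller pipeline, speech/music activity detection and boundary analysis would presumably decide which segments are worth embedding and scoring; this sketch covers only the final scoring step.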
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies
