Robust Neural Audio Fingerprinting using Music Foundation Models

Shubhr Singh; Kiran Bhat; Xavier Riley; Benjamin Resnick; John Thickstun; Walter De Brouwer

arXiv:2511.05399·cs.SD·November 10, 2025

TL;DR

This paper introduces a neural audio fingerprinting method that pairs a pretrained music foundation model backbone with broad data augmentation, improving identification robustness under distortions such as time stretching, pitch modulation, compression, and filtering.

Contribution

It proposes using pretrained music foundation models as the backbone and expanding data augmentation techniques for more robust neural audio fingerprinting.

Findings

01 Music foundation models outperform non-musical pretrained models.

02 Enhanced robustness to audio manipulations like time stretching and filtering.

03 Effective localization of fingerprint matches at segment level.

Abstract

The proliferation of distorted, compressed, and manipulated music on modern media platforms like TikTok motivates the development of more robust audio fingerprinting techniques to identify the sources of musical recordings. In this paper, we develop and evaluate new neural audio fingerprinting techniques with the aim of improving their robustness. We make two contributions to neural fingerprinting methodology: (1) we use a pretrained music foundation model as the backbone of the neural architecture and (2) we expand the use of data augmentation to train fingerprinting models under a wide variety of audio manipulations, including time stretching, pitch modulation, compression, and filtering. We systematically evaluate our methods in comparison to two state-of-the-art neural fingerprinting models: NAFP and GraFPrint. Results show that fingerprints extracted with music foundation models…
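The idea of training on manipulated copies of each clip can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`change_speed`, `moving_average_lowpass`, `augment`), the parameter ranges, and the choice of NumPy are all assumptions made for illustration. In particular, the naive resampling below changes tempo and pitch together, so it only stands in for the separate time-stretching and pitch-modulation operations the abstract names, and the moving average is a crude stand-in for filtering.

```python
import numpy as np

def change_speed(audio, rate):
    """Resample by linear interpolation; rate > 1 shortens the clip.

    Assumption: this crude speed change shifts tempo and pitch together,
    unlike a true time stretch or pitch shift.
    """
    n_out = int(round(len(audio) / rate))
    positions = np.linspace(0, len(audio) - 1, n_out)
    return np.interp(positions, np.arange(len(audio)), audio)

def moving_average_lowpass(audio, width):
    """Smooth the signal with a moving average, a very rough low-pass filter."""
    kernel = np.ones(width) / width
    return np.convolve(audio, kernel, mode="same")

def augment(audio, rng):
    """Produce one distorted view of a clip (hypothetical parameter ranges)."""
    rate = rng.uniform(0.8, 1.2)       # illustrative speed range
    width = int(rng.integers(1, 9))    # illustrative filter width, 1 to 8 samples
    out = change_speed(audio, rate)
    out = moving_average_lowpass(out, width)
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # peak-normalize to [-1, 1]

# Example: a one-second 440 Hz tone at an assumed 8 kHz sample rate.
sr = 8000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440 * t)
rng = np.random.default_rng(0)
view = augment(clip, rng)
```

In a contrastive training setup, the original clip and a view like this would typically be treated as a matching pair, so the model learns fingerprints that stay close under the distortion.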

Taxonomy

Topics: Music and Audio Processing · Digital Media Forensic Detection · Music Technology and Sound Studies