Audio Denoising for Robust Audio Fingerprinting

Kamil Akesbi

arXiv:2212.11277·cs.SD·December 23, 2022·1 cites

TL;DR

This paper introduces a hybrid deep learning approach to improve the robustness of audio fingerprinting systems against background noise by integrating a denoising model before peak extraction.

Contribution

It proposes a novel hybrid strategy combining deep learning denoising with spectral peak-based fingerprinting, including a new loss function tailored for this purpose.
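As a rough illustration of the hybrid idea, the sketch below places a denoising step in front of a generic local-maximum peak picker. It is a minimal Python sketch under stated assumptions, not the paper's algorithm: `spectral_peaks`, `hybrid_peaks`, the `radius` and `min_mag` parameters, and the toy thresholding denoiser are hypothetical names chosen for illustration. The paper's denoiser is a trained deep learning model, and its loss function is not reproduced here.

```python
def spectral_peaks(spec, radius=1, min_mag=0.0):
    """Return (frame, bin) pairs that are strict local maxima of a 2D grid.

    spec[t][f] is the magnitude at time frame t and frequency bin f.
    A cell is a peak if it exceeds min_mag and is strictly greater than
    every in-bounds neighbour within `radius` cells (hypothetical rule).
    """
    n_t = len(spec)
    n_f = len(spec[0]) if n_t else 0
    peaks = []
    for t in range(n_t):
        for f in range(n_f):
            v = spec[t][f]
            if v <= min_mag:
                continue
            is_peak = True
            for dt in range(-radius, radius + 1):
                for df in range(-radius, radius + 1):
                    if dt == 0 and df == 0:
                        continue
                    tt, ff = t + dt, f + df
                    if 0 <= tt < n_t and 0 <= ff < n_f and spec[tt][ff] >= v:
                        is_peak = False
                        break
                if not is_peak:
                    break
            if is_peak:
                peaks.append((t, f))
    return peaks


def hybrid_peaks(noisy_spec, denoise, radius=1):
    """Hybrid pipeline: denoise first, then extract peaks."""
    return spectral_peaks(denoise(noisy_spec), radius=radius)


# Toy spectrogram: one strong "music" peak and one weaker "noise" peak.
noisy = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]


def toy_denoise(spec):
    # Stand-in for a learned model: zero out everything below 3 (assumption).
    return [[v if v >= 3 else 0.0 for v in row] for row in spec]


raw_peaks = spectral_peaks(noisy)               # music peak and noise peak
clean_peaks = hybrid_peaks(noisy, toy_denoise)  # only the music peak remains
```

In this toy case the noise-induced peak disappears after denoising, so any fingerprint built from the remaining peaks is closer to the one computed from the clean recording.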

Findings

01. Enhanced robustness of AFP systems in noisy environments

02. Improved spectral peak accuracy with the denoising model

03. First testing of a hybrid deep learning and peak-based AFP approach

Abstract

Music discovery services let users identify songs from short mobile recordings. These solutions are often based on Audio Fingerprinting (AFP), and more specifically rely on the extraction of spectral peaks in order to be robust to a number of distortions. Few works have studied the robustness of these algorithms to background noise captured in real environments. In particular, AFP systems still struggle when the signal-to-noise ratio is low, i.e., when the background noise is strong. In this project, we tackle this problem with Deep Learning (DL). We test a new hybrid strategy which consists of inserting a denoising DL model in front of a peak-based AFP algorithm. We simulate noisy music recordings using a realistic data augmentation pipeline, and train a DL model to denoise them. The denoising model limits the impact of background noise on the AFP system's extracted peaks, improving…
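To make the signal-to-noise ratio condition concrete, here is a minimal Python sketch of mixing a clean signal with background noise at a chosen SNR. It shows only the standard power-ratio scaling. The abstract does not detail the augmentation pipeline, so `mix_at_snr`, `measured_snr_db`, and the synthetic sine and random signals are hypothetical stand-ins.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(music, noise, snr_db):
    """Scale `noise` so that music/noise power ratio equals snr_db, then add.

    Returns (mixture, gain), where gain is the factor applied to the noise.
    SNR_dB = 10 * log10(P_music / P_noise).
    """
    if len(music) != len(noise):
        raise ValueError("music and noise must have the same length")
    p_music = power(music)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    target_noise_power = p_music / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    mixture = [m + gain * n for m, n in zip(music, noise)]
    return mixture, gain


def measured_snr_db(music, scaled_noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10 * math.log10(power(music) / power(scaled_noise))


# Synthetic stand-ins for a music excerpt and a background-noise recording.
rng = random.Random(0)
music = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(8000)]

# A low-SNR case: noise as strong as the music (0 dB).
mixture, gain = mix_at_snr(music, noise, snr_db=0.0)
residual = [x - m for x, m in zip(mixture, music)]
check_db = measured_snr_db(music, residual)
```

Lower `snr_db` values give stronger background noise relative to the music, which is the regime the abstract describes as difficult for AFP systems.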

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Test