Audio Fingerprinting with Holographic Reduced Representations

Yusuke Fujita; Tatsuya Komatsu

arXiv:2406.13139·eess.AS·June 21, 2024

TL;DR

This paper introduces an audio fingerprinting approach that uses holographic reduced representations (HRR) to reduce the number of stored fingerprints, with modest accuracy degradation and no loss of time resolution.

Contribution

It presents an HRR-based aggregation method that combines multiple audio fingerprints into a composite fingerprint of the same dimension, reducing storage needs while retaining the original time resolution at a modest cost in accuracy.

Findings

01 Reduces the number of stored fingerprints with only modest accuracy degradation

02 Maintains the original time resolution with fewer fingerprints

03 Outperforms simple aggregation methods

Abstract

This paper proposes an audio fingerprinting model with holographic reduced representation (HRR). The proposed method reduces the number of stored fingerprints, whereas conventional neural audio fingerprinting requires many fingerprints for each audio track to achieve high accuracy and time resolution. We utilize HRR to aggregate multiple fingerprints into a composite fingerprint via circular convolution and summation, resulting in fewer fingerprints with the same dimensional space as the original. Our search method efficiently finds a combined fingerprint in which a query fingerprint exists. Using HRR's inverse operation, it can recover the relative position within a combined fingerprint, retaining the original time resolution. Experiments show that our method can reduce the number of fingerprints with modest accuracy degradation while maintaining the time resolution, outperforming…
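The aggregation and inverse operation described in the abstract can be sketched with NumPy. This is a minimal illustration of generic HRR binding and unbinding, not the authors' implementation: the random fingerprints, the random position keys, and the names `DIM`, `NUM_SEGMENTS`, and `recover_position` are all assumptions made for the example, and the paper's trained fingerprinter and search procedure are not reproduced here.

```python
import numpy as np

DIM = 1024        # fingerprint dimension (hypothetical value)
NUM_SEGMENTS = 8  # fingerprints aggregated per composite (hypothetical value)


def circular_convolve(a, b):
    """Bind two vectors with circular convolution, computed via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))


def circular_correlate(a, b):
    """Approximately unbind `a` from `b` using circular correlation."""
    return np.fft.irfft(np.conj(np.fft.rfft(a)) * np.fft.rfft(b), n=len(a))


def unit_rows(x):
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Stand-ins for neural fingerprints of consecutive audio segments (assumption:
# random unit vectors; a real system would use a trained encoder).
fingerprints = unit_rows(rng.standard_normal((NUM_SEGMENTS, DIM)))

# One random key per relative position inside the composite (assumption).
position_keys = unit_rows(rng.standard_normal((NUM_SEGMENTS, DIM)))

# Aggregate: bind each fingerprint to its position key, then sum.
# The composite has the same dimension as a single fingerprint.
composite = np.sum(
    [circular_convolve(p, f) for p, f in zip(position_keys, fingerprints)],
    axis=0,
)


def recover_position(query, composite, position_keys):
    """Unbind the query from the composite and pick the closest position key."""
    noisy_key = circular_correlate(query, composite)
    scores = position_keys @ noisy_key
    return int(np.argmax(scores))


recovered = [recover_position(f, composite, position_keys) for f in fingerprints]
print(recovered)
```

Unbinding the query leaves a noisy copy of the position key it was bound to, so the dot product with each candidate key should peak at the matching position. The noise grows with the number of fingerprints summed into one composite, which is consistent with the accuracy trade-off the abstract mentions.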

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Digital Media Forensic Detection

Methods: Holographic Reduced Representation · Convolution