GraFPrint: A GNN-Based Approach for Audio Identification
Aditya Bhattacharjee, Shubhr Singh, Emmanouil Benetos

TL;DR
GraFPrint is a novel GNN-based framework that creates robust audio fingerprints by leveraging graph structures and self-supervised learning, showing superior scalability and resilience to distortions in large-scale datasets.
Contribution
The paper introduces GraFPrint, a GNN-based audio identification method that constructs k-NN graphs from time-frequency representations and uses self-supervised contrastive training to improve robustness and scalability.
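As a rough illustration of the contrastive training idea, the sketch below computes an NT-Xent style loss between fingerprints of clean audio segments and fingerprints of distorted copies of the same segments. The choice of NT-Xent, the temperature value, and the names `nt_xent_loss`, `z_clean`, and `z_aug` are assumptions made for this example; the summary above only states that training is self-supervised and contrastive.

```python
import numpy as np

def nt_xent_loss(z_clean, z_aug, temperature=0.1):
    """NT-Xent style contrastive loss (assumed form, not confirmed by the source).

    z_clean, z_aug: arrays of shape (n, d); row i of each is a positive pair.
    """
    n = z_clean.shape[0]
    z = np.concatenate([z_clean, z_aug], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length fingerprints

    sim = (z @ z.T) / temperature      # scaled cosine similarities, (2n, 2n)
    np.fill_diagonal(sim, -np.inf)     # a sample is never its own candidate

    # The positive of row i is row i + n, and vice versa.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob = sim[np.arange(2 * n), pos] - log_norm
    return -log_prob.mean()

rng = np.random.default_rng(0)
clean = rng.standard_normal((8, 16))                      # hypothetical fingerprints
distorted = clean + 0.01 * rng.standard_normal((8, 16))   # mildly perturbed copies
unrelated = rng.standard_normal((8, 16))                  # no correspondence to clean

loss_aligned = nt_xent_loss(clean, distorted)
loss_random = nt_xent_loss(clean, unrelated)
```

In this toy setup the loss is lower when each distorted fingerprint stays close to its clean counterpart, which is the property a contrastive objective rewards.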
Findings
Outperforms existing methods on large-scale datasets
Resilient to ambient distortions due to contrastive training
Lightweight and scalable for real-world applications
Abstract
This paper introduces GraFPrint, an audio identification framework that leverages the structural learning capabilities of Graph Neural Networks (GNNs) to create robust audio fingerprints. Our method constructs a k-nearest neighbor (k-NN) graph from time-frequency representations and applies max-relative graph convolutions to encode local and global information. The network is trained using a self-supervised contrastive approach, which enhances resilience to ambient distortions by optimizing feature representation. GraFPrint demonstrates superior performance on large-scale datasets at various levels of granularity, proving to be both lightweight and scalable, making it suitable for real-world applications with extensive reference databases.
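The two graph operations named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the node features stand in for embeddings of time-frequency patches, the max-relative convolution follows the commonly used form (concatenate each node's feature with the element-wise maximum of its neighbor differences, then apply a linear layer), and the names `knn_graph`, `max_relative_conv`, and all shapes are hypothetical.

```python
import numpy as np

def knn_graph(x, k):
    """Return the indices of the k nearest neighbors of each node, shape (n, k)."""
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                      # exclude self-loops
    return np.argsort(dist, axis=1)[:, :k]

def max_relative_conv(x, neighbors, weight, bias):
    """One max-relative graph convolution layer (assumed form) followed by ReLU."""
    rel = x[neighbors] - x[:, None, :]     # neighbor minus center, (n, k, d)
    agg = rel.max(axis=1)                  # element-wise max over neighbors, (n, d)
    h = np.concatenate([x, agg], axis=1)   # (n, 2d)
    return np.maximum(h @ weight + bias, 0.0)

rng = np.random.default_rng(0)
nodes = rng.standard_normal((32, 16))      # 32 hypothetical patch embeddings, 16-dim
nbrs = knn_graph(nodes, k=3)
W = rng.standard_normal((32, 8)) * 0.1     # maps 2 * 16 inputs to 8 outputs
b = np.zeros(8)
out = max_relative_conv(nodes, nbrs, W, b)
```

Because the graph is built from feature distances rather than grid adjacency, a node can aggregate information from patches that are far apart in time or frequency, which is one way to read the abstract's "local and global information".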
Taxonomy
Topics: Music and Audio Processing
