AudioNet: Supervised Deep Hashing for Retrieval of Similar Audio Events
Sagar Dutta, Vipul Arora

TL;DR
AudioNet is a supervised deep hashing approach for efficient retrieval of similar audio events. It reports high retrieval performance on multiple standard datasets and sets new benchmarks, using a novel loss function and discrete gradient propagation.
Contribution
The paper presents a novel deep hashing method with a new loss function and discrete gradient propagation for improved audio event retrieval performance.
Findings
High retrieval accuracy on multiple datasets
Effective handling of imbalanced datasets
Establishes a new benchmark in audio retrieval
Abstract
This work presents a supervised deep hashing method for retrieving similar audio events. The proposed method, named AudioNet, is a deep-learning-based system for efficient hashing and retrieval of similar audio events using an audio example as a query. AudioNet achieves high retrieval performance on multiple standard datasets by generating binary hash codes for similar audio events, setting new benchmarks in the field and demonstrating its effectiveness compared with other hashing methods. Through comprehensive experiments on standard datasets, our research represents a pioneering effort in evaluating the retrieval performance of similar audio events. A novel loss function is proposed that combines weighted contrastive and weighted pairwise losses with hash code balancing to improve the efficiency of audio event retrieval. The method adopts discrete gradient…
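To make the retrieval step concrete, here is a minimal sketch of query-by-example lookup over binary hash codes ranked by Hamming distance. It is an illustration under stated assumptions, not the authors' implementation: the function names `to_hash_codes` and `hamming_rank`, the zero threshold used for binarization, the 4-bit code length, and the toy embedding values are all hypothetical, and the network that would produce the embeddings is omitted.

```python
import numpy as np


def to_hash_codes(embeddings):
    """Binarize real-valued embeddings into {0, 1} codes.

    Thresholding at zero is an assumption made for this sketch.
    """
    return (np.asarray(embeddings) > 0).astype(np.uint8)


def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to the query code.

    Returns (order, distances), where order lists database indices
    from nearest to farthest.
    """
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # A stable sort keeps the original order among equally distant items.
    order = np.argsort(distances, kind="stable")
    return order, distances


# Toy embeddings standing in for the outputs of a trained hashing network.
database = to_hash_codes([
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.6, -0.4, -0.3, -0.9],
])
query = to_hash_codes([0.7, -0.1, 0.2, -0.6])

order, distances = hamming_rank(query, database)
print(order.tolist(), distances.tolist())
```

Because the codes are binary, comparing a query against the database reduces to counting differing bits, which is why hashing-based retrieval is typically cheaper than comparing real-valued embeddings.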
Taxonomy
Topics: Music and Audio Processing · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization
