Anchor-aware Deep Metric Learning for Audio-visual Retrieval
Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu

TL;DR
This paper introduces Anchor-aware Deep Metric Learning (AADML), a novel approach that leverages correlation graphs and attention mechanisms to improve audio-visual retrieval by better representing data distributions.
Contribution
The paper proposes AADML, a method that builds anchor-based correlation graphs and applies attention to improve metric learning for cross-modal retrieval.
Findings
Significantly outperforms state-of-the-art models on benchmark datasets.
Effectively captures underlying data correlations through anchor-aware mechanisms.
Enhances the quality of shared embedding space for audio-visual retrieval.
Abstract
Metric learning minimizes the gap between similar (positive) pairs of data points and increases the separation of dissimilar (negative) pairs, aiming at capturing the underlying data structure and enhancing the performance of tasks like audio-visual cross-modal retrieval (AV-CMR). Recent works employ sampling methods to select impactful data points from the embedding space during training. However, the model training fails to fully explore the space due to the scarcity of training data points, resulting in an incomplete representation of the overall positive and negative distributions. In this paper, we propose an innovative Anchor-aware Deep Metric Learning (AADML) method to address this challenge by uncovering the underlying correlations among existing data points, which enhances the quality of the shared embedding space. Specifically, our method establishes a correlation graph-based…
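To make the opening sentence of the abstract concrete, here is a minimal sketch of a generic triplet loss, one common way of pulling positive pairs together and pushing negative pairs apart. It is not the AADML objective, whose details are cut off in the abstract above. The function names, the margin value, and the toy 2-D embeddings are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss (illustrative, not the paper's loss).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is; otherwise it is positive.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical): an audio anchor and two visual candidates.
audio_anchor = [0.0, 0.0]
visual_positive = [0.0, 1.0]   # same class as the anchor, distance 1
visual_negative = [3.0, 4.0]   # different class, distance 5

# Well-separated triplet: the negative is already far enough away.
loss_easy = triplet_loss(audio_anchor, visual_positive, visual_negative)

# Swapped roles: the "positive" is far and the "negative" is near.
loss_hard = triplet_loss(audio_anchor, visual_negative, visual_positive)

print(loss_easy, loss_hard)
```

In this sketch only the second triplet contributes to the loss. That is why sampling-based methods, as the abstract notes, focus on selecting impactful data points during training.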
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
