MarginNCE: Robust Sound Localization with a Negative Margin
Sooyoung Park, Arda Senocak, Joon Son Chung

TL;DR
This paper introduces MarginNCE, a contrastive learning method that adds a negative margin to the loss, making sound source localization more robust to noisy audio-visual correspondences and performing on par with or better than existing methods.
Contribution
The paper proposes a novel contrastive loss modification using a negative margin to handle noisy correspondences in sound localization tasks.
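To make the idea concrete, here is a minimal NumPy sketch of an InfoNCE-style loss with a margin applied to the positive pairs. It is an illustration under stated assumptions, not the authors' implementation. The function name `margin_nce`, the sign convention (the margin is subtracted from the positive similarity), and the default values for `margin` and `tau` are all hypothetical choices made for this example. The similarity matrix is taken as given; how the paper derives it from audio and visual features is not covered here.

```python
import numpy as np

def margin_nce(sim, margin=-0.2, tau=0.07):
    """InfoNCE-style loss over a (B, B) similarity matrix with a margin.

    sim[i, j] is the similarity between audio i and image j; the diagonal
    holds the assumed-positive pairs. The margin is subtracted from the
    positive similarities only (an assumed convention), so a negative
    margin raises the positive logit and relaxes the decision boundary.
    Setting margin=0 gives the plain InfoNCE loss.
    """
    sim = np.asarray(sim, dtype=np.float64)
    idx = np.arange(sim.shape[0])

    logits = sim.copy()
    logits[idx, idx] -= margin          # shift positives only
    logits /= tau                       # temperature scaling

    # Numerically stable log-softmax over each row.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-likelihood of the positive pairs.
    return -log_prob[idx, idx].mean()

# Toy similarities for a batch of three audio-visual pairs.
sim = np.array([[0.9, 0.1, 0.2],
                [0.0, 0.8, 0.3],
                [0.2, 0.1, 0.7]])
loss_relaxed = margin_nce(sim, margin=-0.2)
loss_plain = margin_nce(sim, margin=0.0)
```

With the same similarities, the negative margin yields a smaller loss than the zero-margin case, so a positive pair that is only weakly related is penalized less. This is one way to read the "less strict decision boundary" described in the abstract.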
Findings
MarginNCE performs on par with or better than state-of-the-art methods.
Introducing a negative margin consistently improves existing contrastive approaches.
The approach effectively mitigates noise in audio-visual correspondence for localization.
Abstract
The goal of this work is to localize sound sources in visual scenes with a self-supervised approach. Contrastive learning in the context of sound source localization leverages the natural correspondence between audio and visual signals where the audio-visual pairs from the same source are assumed as positive, while randomly selected pairs are negatives. However, this approach brings in noisy correspondences; for example, positive audio and visual pair signals that may be unrelated to each other, or negative pairs that may contain semantically similar samples to the positive one. Our key contribution in this work is to show that using a less strict decision boundary in contrastive learning can alleviate the effect of noisy correspondences in sound source localization. We propose a simple yet effective approach by slightly modifying the contrastive loss with a negative margin. Extensive…
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation
Methods: Contrastive Learning
