Controlling Contrastive Self-Supervised Learning with Knowledge-Driven Multiple Hypothesis: Application to Beat Tracking
Antonin Gagnere, Slim Essid, Geoffroy Peeters

TL;DR
This paper introduces a contrastive self-supervised learning method that considers multiple hypotheses about which samples count as positives and uses domain knowledge to keep the most plausible ones. This lets the model cope with ambiguous data and differing rhythmic interpretations, and improves beat tracking performance.
Contribution
It presents a novel multi-hypothesis contrastive learning framework that integrates domain knowledge for better music representation learning, especially in beat and downbeat tracking.
Findings
Outperforms existing methods on standard benchmarks
Effectively handles ambiguous rhythmic interpretations
Demonstrates the benefit of knowledge-driven hypothesis selection
Abstract
Ambiguities in data and problem constraints can lead to diverse, equally plausible outcomes for a machine learning task. In beat and downbeat tracking, for instance, different listeners may adopt various rhythmic interpretations, none of which would necessarily be incorrect. To address this, we propose a contrastive self-supervised pre-training approach that leverages multiple hypotheses about possible positive samples in the data. Our model is trained to learn representations compatible with several such hypotheses, which are selected with a knowledge-based scoring function to retain the most plausible ones. When fine-tuned on labeled data, our model outperforms existing methods on standard benchmarks, showcasing the advantages of integrating domain knowledge with multi-hypothesis selection, particularly in music representation learning.
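
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the loss is a generic InfoNCE, each hypothesis is assumed to be a candidate period in frames (the positive of frame i is frame i + p), and the `scores` values are hypothetical placeholders standing in for a knowledge-based scoring function. The names `info_nce` and `multi_hypothesis_loss` are invented for this example.

```python
import numpy as np

def info_nce(z, pos_idx, temperature=0.1):
    """Generic InfoNCE: row i's positive is row pos_idx[i], all other rows are negatives."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    sim = sim - sim.max(axis=1, keepdims=True)         # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(z)), pos_idx].mean())

def multi_hypothesis_loss(z, hypotheses, scores, keep=2):
    """Average the loss over the `keep` highest-scoring hypotheses (assumed selection rule)."""
    best = np.argsort(scores)[::-1][:keep]
    return float(np.mean([info_nce(z, hypotheses[h]) for h in best]))

# Toy embeddings that repeat every 4 frames, mimicking a periodic rhythm.
rng = np.random.default_rng(0)
n_frames, dim = 16, 8
z = rng.normal(size=(4, dim))[np.arange(n_frames) % 4]

# Hypotheses: candidate periods of 2, 3 and 4 frames (wrapping around the excerpt).
periods = [2, 3, 4]
hypotheses = [(np.arange(n_frames) + p) % n_frames for p in periods]

# Hypothetical plausibility scores; a real system would derive them from musical knowledge.
scores = [0.2, 0.5, 0.9]

loss = multi_hypothesis_loss(z, hypotheses, scores, keep=2)
loss_period_3 = info_nce(z, hypotheses[1])
loss_period_4 = info_nce(z, hypotheses[2])
```

In this toy setup the period-4 hypothesis matches the structure of the embeddings, so its loss is lower than that of the period-3 hypothesis. Averaging over the retained hypotheses, rather than committing to one, is what keeps the representation compatible with more than one interpretation.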
