Controlling Contrastive Self-Supervised Learning with Knowledge-Driven Multiple Hypothesis: Application to Beat Tracking

Antonin Gagnere; Slim Essid; Geoffroy Peeters

arXiv:2510.25560·cs.SD·October 30, 2025

TL;DR

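This paper introduces a contrastive self-supervised pre-training method that learns from multiple hypotheses about which samples count as positives, retaining the most plausible ones with a knowledge-based scoring function. After fine-tuning on labeled data, it improves beat and downbeat tracking, a task where several rhythmic interpretations can be equally valid.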

Contribution

The paper presents a multi-hypothesis contrastive learning framework that uses domain knowledge to select plausible positive-sample hypotheses, with the aim of learning better music representations for beat and downbeat tracking.

Findings

1. Outperforms existing methods on standard benchmarks

2. Effectively handles ambiguous rhythmic interpretations

3. Demonstrates the benefit of knowledge-driven hypothesis selection

Abstract

Ambiguities in data and problem constraints can lead to diverse, equally plausible outcomes for a machine learning task. In beat and downbeat tracking, for instance, different listeners may adopt various rhythmic interpretations, none of which would necessarily be incorrect. To address this, we propose a contrastive self-supervised pre-training approach that leverages multiple hypotheses about possible positive samples in the data. Our model is trained to learn representations compatible with different such hypotheses, which are selected with a knowledge-based scoring function to retain the most plausible ones. When fine-tuned on labeled data, our model outperforms existing methods on standard benchmarks, showcasing the advantages of integrating domain knowledge with multi-hypothesis selection in music representation learning in particular.
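
The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hypotheses, the scoring function, or how losses are combined, so everything below is an assumption made for illustration. Here each hypothesis is a candidate beat period in frames, the positive for frame i is the frame one period later (with circular wrap-around for simplicity), the hypothetical `tempo_prior_score` favors tempi near 120 BPM, and the losses of the retained hypotheses are averaged.

```python
import numpy as np

def info_nce(z, pos_index, temperature=0.1):
    """InfoNCE loss where the positive for frame i is frame pos_index[i]."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)      # unit-norm embeddings
    logits = z @ z.T / temperature                         # scaled cosine similarities
    np.fill_diagonal(logits, -np.inf)                      # a frame is never its own candidate
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), pos_index].mean()

def tempo_prior_score(period_frames, frame_rate, center_bpm=120.0, width_octaves=1.0):
    """Hypothetical knowledge-based score: log-Gaussian tempo prior (assumption)."""
    bpm = 60.0 * frame_rate / period_frames
    return -0.5 * (np.log2(bpm / center_bpm) / width_octaves) ** 2

def multi_hypothesis_loss(z, periods, frame_rate, keep=2):
    """Keep the `keep` best-scoring period hypotheses and average their losses."""
    scores = [tempo_prior_score(p, frame_rate) for p in periods]
    kept = [periods[i] for i in np.argsort(scores)[::-1][:keep]]
    idx = np.arange(len(z))
    # Circular shift: frame i is paired with frame (i + period) mod T (simplification).
    losses = [info_nce(z, (idx + p) % len(z)) for p in kept]
    return float(np.mean(losses)), kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(64, 8))                           # 64 frames, 8-dim embeddings
    loss, kept = multi_hypothesis_loss(z, [6, 10, 20, 40], frame_rate=20.0)
    print(loss, kept)
```

In this sketch, embeddings that repeat with a given period incur a lower loss under the matching period hypothesis than under a mismatched one, while the prior discards hypotheses whose implied tempo is implausible. A real system would compute this on learned audio embeddings inside a differentiable framework rather than with NumPy.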
