Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport

Bernardo Torres; Alain Riou; Gaël Richard; Geoffroy Peeters

arXiv:2508.01493·cs.SD·October 28, 2025

Open Access

TL;DR

This paper introduces an optimal transport objective for self-supervised training of translation-equivariant pitch estimators, offering a simpler, more numerically stable, and theoretically grounded alternative to existing approaches.

Contribution

The paper presents an Optimal Transport objective for training one-dimensional translation-equivariant systems and applies it to single pitch estimation, improving numerical stability and theoretical grounding.

Findings

01. Enhanced numerical stability in pitch estimation models

02. Theoretically grounded training method

03. Simpler alternative to existing self-supervised approaches

Abstract

In this paper, we propose an Optimal Transport objective for learning one-dimensional translation-equivariant systems and demonstrate its applicability to single pitch estimation. Our method provides a theoretically grounded, more numerically stable, and simpler alternative for training state-of-the-art self-supervised pitch estimators.
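
The abstract does not spell out the objective, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes the model outputs a histogram over pitch bins, that a pitch shift of `k` bins in the input should translate that histogram by `k` bins, and that the mismatch is measured with the closed-form one-dimensional Wasserstein-1 distance (the sum of absolute differences between cumulative distributions). The function names `wasserstein_1d`, `shift_bins`, and `equivariance_loss` are hypothetical.

```python
import numpy as np


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same unit-spaced grid.

    In one dimension, optimal transport has a closed form: the sum of
    absolute differences between the two cumulative distributions.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize so both inputs carry unit mass
    q = q / q.sum()
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def shift_bins(p, k):
    """Translate a histogram by k bins with zero padding (no wrap-around).

    Assumption: mass pushed past either edge is dropped.
    """
    p = np.asarray(p, dtype=float)
    out = np.zeros_like(p)
    if k > 0:
        out[k:] = p[:-k]
    elif k < 0:
        out[:k] = p[-k:]
    else:
        out[:] = p
    return out


def equivariance_loss(p_orig, p_shifted, k):
    """Hypothetical loss: distance between the translated prediction for the
    original input and the prediction for the input shifted by k bins."""
    return wasserstein_1d(shift_bins(p_orig, k), p_shifted)


# Two one-hot predictions whose peaks are 3 bins apart.
p = np.zeros(32)
p[10] = 1.0
q = np.zeros(32)
q[13] = 1.0

print(equivariance_loss(p, q, 3))  # shift matches the peak offset
print(equivariance_loss(p, q, 1))  # shift is off by 2 bins
```

In this sketch the loss grows with the number of bins separating the two peaks, so it still carries a signal when the predictions do not overlap at all, which is one intuition for why a transport-based distance suits translation-equivariant outputs.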

Taxonomy

Topics: Advanced Neural Network Applications · Music and Audio Processing · Domain Adaptation and Few-Shot Learning