CREPE: A Convolutional Representation for Pitch Estimation

Jong Wook Kim; Justin Salamon; Peter Li; Juan Pablo Bello

arXiv:1802.06182·eess.AS·February 20, 2018

TL;DR

CREPE is a deep learning-based pitch estimation method that directly analyzes raw audio waveforms, achieving state-of-the-art accuracy and robustness, and is available as an open-source tool.

Contribution

The paper introduces a convolutional neural network for pitch tracking that operates directly on the time-domain waveform and performs as well as or better than traditional DSP-based methods such as pYIN.

Findings

01 CREPE achieves state-of-the-art pitch estimation accuracy.

02 The model performs well under noisy conditions.

03 It is freely available as an open-source Python module.

Abstract

The task of estimating the fundamental frequency of a monophonic sound recording, also known as pitch tracking, is fundamental to audio processing, with multiple applications in speech processing and music information retrieval. To date, the best-performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics. While such techniques perform very well on average, there remain many cases in which they fail to correctly estimate the pitch. In this paper, we propose a data-driven pitch tracking algorithm, CREPE, which is based on a deep convolutional neural network that operates directly on the time-domain waveform. We show that the proposed model produces state-of-the-art results, performing as well as or better than pYIN. Furthermore, we evaluate the model's generalizability in terms of noise robustness. A pre-trained version of CREPE is made…
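To make the task concrete, the following is a minimal sketch of fundamental-frequency estimation on a single frame of monophonic audio. It is neither CREPE nor pYIN: it uses a plain autocorrelation peak pick, a much simpler DSP baseline, chosen only because it fits in a few lines. The function names (`synth_tone`, `estimate_f0_autocorr`), the 16 kHz sample rate, the 1024-sample frame, and the 50 to 800 Hz search range are all assumptions made for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate=16000, n_samples=1024):
    """Generate a pure sine tone as a stand-in for a monophonic recording."""
    return [
        math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def estimate_f0_autocorr(frame, sample_rate=16000, fmin=50.0, fmax=800.0):
    """Estimate f0 (Hz) as sample_rate / lag, where lag maximizes the
    raw (unnormalized) autocorrelation within the allowed pitch range.

    Summing over the overlapping samples without dividing by their count
    makes the score shrink at longer lags, which biases the pick toward
    the shortest period and away from its multiples (octave errors).
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag = min_lag
    best_score = float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = len(frame) - lag
        score = sum(frame[i] * frame[i + lag] for i in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = synth_tone(250.0)
    print(estimate_f0_autocorr(tone))  # should come out close to 250 Hz
```

A heuristic of this kind works on clean, strongly periodic input but is easily misled by noise, weak fundamentals, or octave ambiguity, which is the sort of failure case that motivates a data-driven estimator operating on the waveform directly.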

Code & Models

Models

bpiyush/sound-of-water-models (model, 3 likes)

Datasets

bpiyush/sound-of-water (dataset, 331 downloads)
