Comparing Conventional Pitch Detection Algorithms with a Neural Network Approach
Anja Kroon (McGill University)

TL;DR
This paper compares two conventional pitch detection algorithms, pYIN and YAAPT, with the neural network-based CREPE, evaluating each on several error types to determine which performs best at pitch prediction.
Contribution
The paper compares classical algorithms (pYIN and YAAPT) with a neural network-based method (CREPE) and highlights how their pitch detection accuracy differs.
Findings
CREPE outperforms traditional algorithms in certain error metrics
Neural network approach reduces gross pitch errors
Classical methods still have competitive accuracy in some scenarios
Abstract
Despite much research, traditional methods of pitch prediction are still not perfect. With the emergence of neural networks (NNs), researchers hope to create an NN-based pitch predictor that outperforms traditional methods. Three pitch detection algorithms (PDAs), pYIN, YAAPT, and CREPE, are compared in this paper. pYIN and YAAPT are conventional approaches that use time-domain and frequency-domain processing. CREPE uses a data-trained deep convolutional neural network to estimate pitch. It involves 6 densely connected convolutional hidden layers and determines pitch probabilities for a given input signal. The performance of CREPE, representing neural network pitch predictors, is compared to more classical approaches, represented by pYIN and YAAPT. The figure of merit (FOM) will include the number of unvoiced-to-voiced errors, voiced-to-voiced errors, gross pitch errors, and fine…
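To make the error types named in the abstract concrete, here is a minimal sketch of how frame-level pitch errors are often tallied. It is not the paper's evaluation code. The function name `pitch_error_summary`, the convention that 0 Hz marks an unvoiced frame, and the 20% gross-error threshold are all assumptions chosen for illustration, and the paper's exact definitions may differ.

```python
from statistics import mean


def pitch_error_summary(ref_hz, est_hz, gross_threshold=0.2):
    """Tally frame-level pitch errors between a reference and an estimated track.

    Assumptions (not taken from the paper): a value of 0 Hz marks an unvoiced
    frame, and a voiced frame counts as a gross error when the estimate
    deviates from the reference by more than `gross_threshold` (relative).
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("tracks must have the same number of frames")

    unvoiced_to_voiced = 0  # reference unvoiced, estimate voiced
    voiced_to_unvoiced = 0  # reference voiced, estimate unvoiced
    gross = 0               # both voiced, deviation above the threshold
    fine_deviations = []    # both voiced, deviation within the threshold
    n_ref_unvoiced = 0
    n_both_voiced = 0

    for ref, est in zip(ref_hz, est_hz):
        ref_voiced, est_voiced = ref > 0, est > 0
        if not ref_voiced:
            n_ref_unvoiced += 1
            if est_voiced:
                unvoiced_to_voiced += 1
        elif not est_voiced:
            voiced_to_unvoiced += 1
        else:
            n_both_voiced += 1
            relative_dev = abs(est - ref) / ref
            if relative_dev > gross_threshold:
                gross += 1
            else:
                fine_deviations.append(relative_dev)

    n_ref_voiced = len(ref_hz) - n_ref_unvoiced
    return {
        "unvoiced_to_voiced_rate": unvoiced_to_voiced / n_ref_unvoiced if n_ref_unvoiced else 0.0,
        "voiced_to_unvoiced_rate": voiced_to_unvoiced / n_ref_voiced if n_ref_voiced else 0.0,
        "gross_error_rate": gross / n_both_voiced if n_both_voiced else 0.0,
        "fine_error_mean": mean(fine_deviations) if fine_deviations else 0.0,
    }


# Toy tracks in Hz, one value per frame (made-up numbers, not data from the paper).
reference = [0, 0, 100, 100, 200, 200]
estimate = [0, 120, 100, 150, 0, 210]
summary = pitch_error_summary(reference, estimate)
```

Separating gross from fine errors keeps occasional large misses, such as octave jumps, from swamping the small deviations that describe how precise an algorithm is when it is roughly right.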
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
