RMVPE: A Robust Model for Vocal Pitch Estimation in Polyphonic Music
Haojie Wei, Xueke Cao, Tangpeng Dan, Yueguo Chen

TL;DR
RMVPE is a robust model that estimates vocal pitch directly from polyphonic music. It outperforms previous methods, whose accuracy depends on a source separation step, and remains robust to noise.
Contribution
The paper introduces RMVPE, a model that predicts vocal pitches directly from polyphonic music without relying on source separation, which improves robustness and accuracy.
Findings
RMVPE achieves higher raw pitch accuracy (RPA) and raw chroma accuracy (RCA) than existing methods.
RMVPE remains robust across different types of noise and signal-to-noise ratios (SNRs).
Experimental results demonstrate RMVPE's effectiveness over existing methods.
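The two metrics named in the findings can be illustrated with a minimal sketch. It assumes the common convention in melody-extraction evaluation: a frame counts as correct when the estimate lies within 50 cents of the reference, and RCA additionally ignores octave errors. The function name, the 50-cent default, and the use of 0 Hz to mark unvoiced frames are assumptions for illustration; the summary above does not state the exact evaluation settings used in the paper.

```python
import numpy as np

def raw_pitch_and_chroma_accuracy(ref_hz, est_hz, tol_cents=50.0):
    """Return (RPA, RCA) over frames that are voiced in the reference.

    Assumption: a frequency of 0 Hz marks an unvoiced frame.
    """
    ref_hz = np.asarray(ref_hz, dtype=float)
    est_hz = np.asarray(est_hz, dtype=float)
    if ref_hz.shape != est_hz.shape:
        raise ValueError("reference and estimate must have the same length")

    voiced = ref_hz > 0
    if not voiced.any():
        return 0.0, 0.0
    ref = ref_hz[voiced]
    est = est_hz[voiced]

    # Frames where the estimator reports no pitch count as errors.
    has_est = est > 0
    diff_cents = np.full(ref.shape, np.inf)
    diff_cents[has_est] = 1200.0 * np.log2(est[has_est] / ref[has_est])

    # RPA: estimate within the tolerance of the reference pitch.
    rpa = np.mean(np.abs(diff_cents) < tol_cents)

    # RCA: same test after folding the error into one octave,
    # so an estimate that is off by whole octaves still counts.
    folded = diff_cents.copy()
    folded[has_est] = (diff_cents[has_est] + 600.0) % 1200.0 - 600.0
    rca = np.mean(np.abs(folded) < tol_cents)

    return float(rpa), float(rca)
```

For example, with a reference of 220 Hz on three voiced frames and estimates of 220, 440, and 300 Hz, only the first frame counts toward RPA, while the octave error at 440 Hz also counts toward RCA.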
Abstract
Vocal pitch is an important high-level feature in music audio processing. However, extracting vocal pitch in polyphonic music is more challenging due to the presence of accompaniment. To eliminate the influence of the accompaniment, most previous methods adopt music source separation models to obtain clean vocals from polyphonic music before predicting vocal pitches. As a result, the performance of vocal pitch estimation is affected by the music source separation models. To address this issue and directly extract vocal pitches from polyphonic music, we propose a robust model named RMVPE. This model can extract effective hidden features and accurately predict vocal pitches from polyphonic music. The experimental results demonstrate the superiority of RMVPE in terms of raw pitch accuracy (RPA) and raw chroma accuracy (RCA). Additionally, experiments conducted with different types of noise…
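Noise-robustness experiments of the kind the abstract mentions are usually set up by mixing a noise recording into the audio at a chosen signal-to-noise ratio. The sketch below shows one standard way to do that, using the definition SNR (dB) = 10 log10(P_signal / P_noise). The function name `mix_at_snr` is hypothetical, and the excerpt does not say how the authors added noise, so this is an illustration of the general technique rather than their procedure.

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal + noise has the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if signal.shape != noise.shape:
        raise ValueError("signal and noise must have the same shape")

    # Average power of each waveform.
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    if p_noise == 0:
        raise ValueError("noise must not be silent")

    # Noise power needed to reach the target SNR, and the matching gain.
    target_p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_p_noise / p_noise)
    return signal + gain * noise
```

Sweeping `snr_db` over a range of values (for instance from high to low) and re-evaluating pitch accuracy on each mix gives a robustness curve per noise type.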
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
