SLASH: Self-Supervised Speech Pitch Estimation Leveraging DSP-derived Absolute Pitch
Ryo Terashima, Yuma Shirahata, Masaya Kawamura

TL;DR
SLASH is a novel self-supervised speech pitch estimation method that integrates DSP-derived absolute pitch and a differentiable spectrogram to improve accuracy over existing SSL and DSP approaches.
Contribution
It introduces a new approach combining self-supervised learning with DSP-derived absolute pitch and a differentiable spectrogram for enhanced speech pitch estimation.
Findings
Outperforms baseline DSP and SSL methods in pitch estimation accuracy.
Effectively predicts aperiodic components in speech signals.
Broadens the method's applicability to speech signal processing through differentiable DSP.
Abstract
We present SLASH, a pitch estimation method of speech signals based on self-supervised learning (SSL). To enhance the performance of conventional SSL-based approaches that primarily depend on the relative pitch difference derived from pitch shifting, our method incorporates absolute pitch values by 1) introducing a prior pitch distribution derived from digital signal processing (DSP), and 2) optimizing absolute pitch through gradient descent with a loss between the target and differentiable DSP-derived spectrograms. To stabilize the optimization, a novel spectrogram generation method is used that skips complicated waveform generation. In addition, the aperiodic components in speech are accurately predicted through differentiable DSP, enhancing the method's applicability to speech signal processing. Experimental results showed that the proposed method outperformed both baseline DSP and…
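The abstract's motivation can be illustrated with a small sketch: a loss built only on the relative pitch difference between an utterance and its pitch-shifted copy cannot tell a correct pitch track from one that is offset by a constant, which is why an absolute reference is useful. The code below is not the SLASH implementation. The function names, the semitone scale referenced to 440 Hz, the mean-absolute-error form of both losses, and the F0 values are all assumptions made for illustration. In particular, `absolute_anchor_loss` is a simplified stand-in for the DSP-derived prior pitch distribution described in the abstract, not its actual form.

```python
import numpy as np


def hz_to_semitones(f_hz, ref_hz=440.0):
    """Convert frequency in Hz to semitones relative to ref_hz (assumed scale)."""
    return 12.0 * np.log2(np.asarray(f_hz, dtype=float) / ref_hz)


def relative_shift_loss(pitch_orig, pitch_shifted, shift_semitones):
    """Hypothetical relative loss: the predicted pitch difference between the
    original and the pitch-shifted input should equal the known shift."""
    diff = np.asarray(pitch_shifted, dtype=float) - np.asarray(pitch_orig, dtype=float)
    return float(np.mean(np.abs(diff - shift_semitones)))


def absolute_anchor_loss(pitch_pred, pitch_dsp):
    """Hypothetical absolute term: distance to a DSP-derived pitch estimate.
    A simplified stand-in for a prior distribution, used only for illustration."""
    pred = np.asarray(pitch_pred, dtype=float)
    ref = np.asarray(pitch_dsp, dtype=float)
    return float(np.mean(np.abs(pred - ref)))


# Made-up per-frame F0 track (Hz), converted to semitones.
true_st = hz_to_semitones([110.0, 123.0, 147.0, 131.0])
shift = 3.0   # known pitch shift applied to the input, in semitones
offset = 12.0  # a constant one-octave error in the predictions

# Predictions that are correct, and predictions that are one octave too high.
good_orig, good_shifted = true_st, true_st + shift
bad_orig, bad_shifted = true_st + offset, true_st + shift + offset

# The relative loss is (numerically) zero for both: it cannot see the offset.
rel_good = relative_shift_loss(good_orig, good_shifted, shift)
rel_bad = relative_shift_loss(bad_orig, bad_shifted, shift)

# The absolute term separates them, here using the true track as the
# DSP-derived reference (an idealized assumption).
abs_good = absolute_anchor_loss(good_orig, true_st)
abs_bad = absolute_anchor_loss(bad_orig, true_st)

print(rel_good, rel_bad, abs_good, abs_bad)
```

In this toy setting the relative term alone leaves the octave error undetected, while the absolute term penalizes it by the size of the offset. A real DSP pitch estimate would be noisy, so treating it as a prior rather than a hard target, as the abstract describes, is the more plausible design.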
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Advanced Data Compression Techniques
