Modified Group Delay Based MultiPitch Estimation in Co-Channel Speech
Rajeev Rajan, Hema A. Murthy

TL;DR
This paper introduces a modified group delay method for multipitch estimation in co-channel speech. It extracts multiple concurrent pitches by flattening the spectrum and analyzing it iteratively, and reports promising results on standard datasets.
Contribution
It presents a multipitch estimation algorithm that applies modified group delay functions to a flattened spectrum and analyzes it iteratively, with the aim of improving multipitch detection in co-channel speech.
Findings
High pitch accuracy on standard datasets
Effective separation of concurrent pitches
Robust performance in real speech recordings
Abstract
Phase processing has been replaced by group delay processing for the extraction of source and system parameters from speech. Group delay functions are ill-behaved when the transfer function has zeros that are close to the unit circle in the z-domain. The modified group delay function addresses this problem and has been successfully used for formant and monopitch estimation. In this paper, modified group delay functions are used for multipitch estimation in concurrent speech. The power spectrum of the speech is first flattened in order to annihilate the system characteristics, while retaining the source characteristics. Group delay analysis on this flattened spectrum picks the predominant pitch in the first pass and a comb filter is used to filter out the estimated pitch along with its harmonics. The residual spectrum is again analyzed for the next candidate pitch estimate in the second…
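To make the abstract's reference to the modified group delay function more concrete, here is a minimal sketch in Python of the commonly cited form of that function: the group delay numerator is divided by a cepstrally smoothed spectrum rather than the raw magnitude spectrum, then compressed. This is not the authors' implementation. The function names, the lifter length `n_lifter`, and the default values of `alpha` and `gamma` are illustrative assumptions, and the sketch covers only the group delay step, not the spectrum flattening, comb filtering, or the iterative passes.

```python
import numpy as np

def cepstral_smooth(mag, n_lifter):
    """Smooth a magnitude spectrum by keeping only low-quefrency cepstral terms.

    `n_lifter` (number of cepstral coefficients kept) is an assumed parameter.
    """
    log_mag = np.log(np.maximum(mag, 1e-12))   # floor avoids log(0)
    cep = np.fft.ifft(log_mag).real            # real cepstrum
    lifter = np.zeros_like(cep)
    lifter[:n_lifter] = 1.0
    if n_lifter > 1:
        lifter[-(n_lifter - 1):] = 1.0         # keep the mirrored coefficients too
    return np.exp(np.fft.fft(cep * lifter).real)

def modified_group_delay(x, n_fft=512, alpha=0.4, gamma=0.9, n_lifter=8):
    """Sketch of a modified group delay function for one frame `x`.

    alpha, gamma and n_lifter are illustrative defaults, not values from the paper.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.fft(x, n_fft)                   # spectrum of x[n]
    Y = np.fft.fft(n * x, n_fft)               # spectrum of n * x[n]
    S = cepstral_smooth(np.abs(X), n_lifter)   # smoothed spectrum replaces |X|
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2.0 * gamma))
    return np.sign(tau) * np.abs(tau) ** alpha

# Sanity check: with alpha = gamma = 1 and a flat spectrum, the function reduces
# to the ordinary group delay, which for an impulse delayed by 5 samples is 5.
impulse = np.zeros(64)
impulse[5] = 1.0
tau = modified_group_delay(impulse, n_fft=64, alpha=1.0, gamma=1.0)
```

Dividing by the smoothed spectrum instead of the raw magnitude is what keeps the function from spiking when zeros lie close to the unit circle, which is the ill-behaved case the abstract describes.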
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
