Modified Group Delay Based MultiPitch Estimation in Co-Channel Speech

Rajeev Rajan; Hema A. Murthy

arXiv:1603.05435 · cs.SD · March 18, 2016 · 1 citation

TL;DR

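This paper applies modified group delay functions to multipitch estimation in co-channel speech. The power spectrum is flattened to suppress system characteristics, group delay analysis picks the predominant pitch, and a comb filter removes that pitch and its harmonics so that the residual can be analyzed for the next pitch. Promising results are reported on standard datasets.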

Contribution

The paper presents a novel multipitch estimation algorithm that combines modified group delay functions with spectrum flattening and iterative, pass-by-pass analysis, with the aim of improving multipitch detection in co-channel speech.

Findings

01. High pitch accuracy on standard datasets
02. Effective separation of concurrent pitches
03. Robust performance in real speech recordings

Abstract

Phase processing has been replaced by group delay processing for the extraction of source and system parameters from speech. Group delay functions are ill-behaved when the transfer function has zeros that are close to the unit circle in the z-domain. The modified group delay function addresses this problem and has been successfully used for formant and monopitch estimation. In this paper, modified group delay functions are used for multipitch estimation in concurrent speech. The power spectrum of the speech is first flattened in order to annihilate the system characteristics, while retaining the source characteristics. Group delay analysis on this flattened spectrum picks the predominant pitch in the first pass, and a comb filter is used to filter out the estimated pitch along with its harmonics. The residual spectrum is again analyzed for the next candidate pitch estimate in the second…
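
The group delay functions named in the abstract can be illustrated with a minimal sketch. It uses the standard definition, in which the group delay of x[n] is (X_R Y_R + X_I Y_I) / |X|^2 with Y the Fourier transform of n·x[n], and the commonly cited modified form, in which |X|^2 is replaced by a cepstrally smoothed spectrum raised to 2·gamma and the result is compressed by an exponent alpha. The function names, the FFT size, the lifter length, and the default alpha and gamma values are assumptions chosen for illustration and are not taken from the paper. The sketch covers only the group delay computation, not the spectrum flattening, comb filtering, or second-pass stages.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) and division by zero


def cepstral_envelope(mag, lifter_len=8):
    """Smooth a full-length magnitude spectrum by keeping low-quefrency terms."""
    cep = np.fft.ifft(np.log(mag + EPS)).real
    lifter = np.zeros_like(cep)
    lifter[:lifter_len] = 1.0
    if lifter_len > 1:
        # Keep the mirrored half as well so the smoothed spectrum stays real.
        lifter[-(lifter_len - 1):] = 1.0
    return np.exp(np.fft.fft(cep * lifter).real)


def _spectra(x, nfft):
    """Return X = FFT(x) and Y = FFT(n * x), the two terms of the formula."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    return np.fft.fft(x, nfft), np.fft.fft(n * x, nfft)


def group_delay(x, nfft=512):
    """Standard group delay: (X_R*Y_R + X_I*Y_I) / |X|^2, bins 0..nfft/2."""
    X, Y = _spectra(x, nfft)
    num = X.real * Y.real + X.imag * Y.imag
    return (num / (np.abs(X) ** 2 + EPS))[: nfft // 2 + 1]


def modified_group_delay(x, nfft=512, alpha=0.4, gamma=0.9, lifter_len=8):
    """Modified group delay; alpha, gamma, lifter_len are illustrative values."""
    X, Y = _spectra(x, nfft)
    # Assumption: the denominator is a cepstrally smoothed version of |X|,
    # which avoids the spikes caused by zeros close to the unit circle.
    S = cepstral_envelope(np.abs(X), lifter_len)
    tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2 * gamma) + EPS)
    return (np.sign(tau) * np.abs(tau) ** alpha)[: nfft // 2 + 1]
```

A quick sanity check is a unit impulse delayed by d samples: its spectrum has unit magnitude and linear phase, so `group_delay` returns d at every bin, and `modified_group_delay` with `alpha=1.0, gamma=1.0` does the same.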

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis