On Harmonic Approximations of Inharmonic Signals
Filip Elvander, Jie Ding, Andreas Jakobsson

TL;DR
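This paper derives the misspecified Gaussian Cramér-Rao lower bound for estimating the fundamental frequency (pitch) when a harmonic model is fitted to an almost, but not quite, harmonic signal. It shows that for moderately inharmonic signals, such as voiced speech, the misspecified harmonic model achieves a lower mean squared error than a correctly specified unstructured model.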
Contribution
It provides a closed-form, large-sample expression for the bound on the pseudo-true fundamental frequency and shows that misspecified harmonic models are effective for estimating pitch in practical speech signals.
Findings
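The bound is sharp: in simulations it is attained by maximum likelihood estimators derived under the misspecified harmonic assumption.
For moderately inharmonic signals, misspecified harmonic models achieve a lower mean squared error than correctly specified unstructured models.
Voiced speech from a speech database falls within this class of moderately inharmonic signals, which supports using a harmonic model for voiced speech.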
Abstract
In this work, we present the misspecified Gaussian Cramér-Rao lower bound for the parameters of a harmonic signal, or pitch, when signal measurements are collected from an almost, but not quite, harmonic model. For the asymptotic case of large sample sizes, we present a closed-form expression for the bound corresponding to the pseudo-true fundamental frequency. Using simulation studies, it is shown that the bound is sharp and is attained by maximum likelihood estimators derived under the misspecified harmonic assumption. It is shown that misspecified harmonic models achieve a lower mean squared error than correctly specified unstructured models for moderately inharmonic signals. Examining voices from a speech database, we conclude that human speech belongs to this class of signals, verifying that the use of a harmonic model for voiced speech is preferable.
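As a rough illustration of the setting in the abstract, the Python sketch below fits a purely harmonic model to a slightly inharmonic signal and estimates the fundamental frequency by a grid-search nonlinear least-squares fit, which coincides with the maximum likelihood estimate under white Gaussian noise. It is not the authors' implementation. The function name `harmonic_nls_pitch`, the frequency offsets, the noise level, and the search grid are all hypothetical choices made for illustration.

```python
import numpy as np

def harmonic_nls_pitch(x, num_harmonics, grid):
    """Grid-search least-squares fit of a harmonic model (hypothetical helper).

    Returns the candidate fundamental frequency (radians per sample) whose
    harmonic subspace captures the most signal energy.
    """
    n = np.arange(len(x))
    harmonic_idx = np.arange(1, num_harmonics + 1)
    best_w, best_energy = grid[0], -np.inf
    for w in grid:
        # Columns are complex sinusoids at integer multiples of the candidate.
        A = np.exp(1j * np.outer(n, w * harmonic_idx))
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        energy = np.linalg.norm(A @ coef) ** 2
        if energy > best_energy:
            best_w, best_energy = w, energy
    return best_w

# Assumed example values: 4 components, each slightly off its harmonic position.
rng = np.random.default_rng(0)
N, K = 200, 4
w0 = 0.3                                          # nominal fundamental
offsets = np.array([0.0, 0.002, -0.003, 0.001])   # assumed inharmonic deviations
freqs = w0 * np.arange(1, K + 1) + offsets

n = np.arange(N)
signal = np.exp(1j * np.outer(n, freqs)).sum(axis=1)
noise = 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
x = signal + noise

grid = np.linspace(0.25, 0.35, 2001)
w_hat = harmonic_nls_pitch(x, K, grid)
print(f"nominal fundamental: {w0:.4f}, harmonic-model estimate: {w_hat:.4f}")
```

The estimate converges to a pseudo-true fundamental frequency, the best harmonic fit to the inharmonic signal, rather than to any single true parameter. The harmonic model needs one frequency parameter where an unstructured model needs one per component, which is the trade-off the mean squared error comparison in the abstract concerns.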
