Inferring Pitch from Coarse Spectral Features

Danni Ma; Neville Ryant; Mark Liberman

arXiv:2204.04579 · cs.SD · December 14, 2022

TL;DR

This paper shows that linear regression on single frames of coarse spectra, in which no overtone series is available, can predict the pitch of simple vocalizations. This supports the view that pitch has acoustic covariates beyond F0.

Contribution

It uses linear regression on single frames of coarse spectra to predict pitch, challenging the treatment of F0 as the physical definition of pitch.

Findings

1. Coarse spectral features can predict pitch in simple vocalizations.

2. Prediction accuracy decreases as vocalizations become more complex.

3. For more complex vocalizations, the covariates for pitch are more complex but remain accessible to more sophisticated models.

Abstract

Fundamental frequency (F0) has long been treated as the physical definition of "pitch" in phonetic analysis. But there have been many demonstrations that F0 is at best an approximation to pitch, both in production and in perception: pitch is not F0, and F0 is not pitch. Changes in pitch involve many articulatory and acoustic covariates; pitch perception often deviates from what F0 analysis predicts; and in fact, quasi-periodic signals from a single voice source are often incompletely characterized by an attempt to define a single time-varying F0. In this paper, we find strong support for the existence of covariates for pitch in aspects of relatively coarse spectra, in which an overtone series is not available. Thus linear regression can predict the pitch of simple vocalizations, produced by an articulatory synthesizer or by a human, from single frames of such coarse spectra. Across…
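The idea of regressing pitch on single frames of coarse spectra can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the toy harmonic signal (`toy_frame`), the 25 ms frame at 16 kHz, the eight equal-width bands, and the function names. It only shows the general shape of the pipeline: reduce each frame to a handful of wide-band log energies, then fit ordinary least squares from those features to pitch.

```python
import numpy as np

def toy_frame(f0, sr=16000, n=400, rng=None):
    """Hypothetical test signal: harmonics of f0 below Nyquist, amplitude 1/k."""
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(n) / sr
    ks = np.arange(1, int(np.ceil(sr / 2 / f0)))  # harmonic numbers with k*f0 < sr/2
    phases = rng.uniform(0, 2 * np.pi, len(ks))
    partials = np.sin(2 * np.pi * f0 * np.outer(ks, t) + phases[:, None])
    return (partials / ks[:, None]).sum(axis=0)

def coarse_spectrum(frame, n_bands=8):
    """Log energy in n_bands equal-width bands of one Hann-windowed frame."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    bands = np.array_split(power, n_bands)  # wide bands blur individual harmonics
    return np.log(np.array([b.sum() for b in bands]) + 1e-12)

def fit_linear(X, y):
    """Ordinary least squares; the last weight is the intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(X, w):
    """Apply weights from fit_linear to a feature matrix."""
    return np.column_stack([X, np.ones(len(X))]) @ w

# Toy experiment (assumed setup): one frame per random pitch between 100 and 300 Hz.
rng = np.random.default_rng(0)
f0s = rng.uniform(100.0, 300.0, size=200)
X = np.array([coarse_spectrum(toy_frame(f, rng=rng)) for f in f0s])

w = fit_linear(X[:150], f0s[:150])  # fit on the first 150 frames
rmse = np.sqrt(np.mean((predict(X[150:], w) - f0s[150:]) ** 2))
print(f"held-out RMSE on toy data: {rmse:.1f} Hz")
```

The toy signal is far simpler than synthesized or human vocalizations, so the printed error says nothing about the paper's results; the sketch is only meant to make the phrase "linear regression from single frames of coarse spectra" concrete.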

Code & Models

Repositories

dannima/inferringpitch (Official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing