Gaussian Processes for Music Audio Modelling and Content Analysis
Pablo A. Alvarado, Dan Stowell

TL;DR
This paper introduces a Bayesian Gaussian process framework for joint modeling of music audio signals, improving tasks like pitch estimation and missing segment inference by incorporating rich prior information about musical structure.
Contribution
It presents a novel Gaussian process-based Bayesian approach that unifies multiple music analysis tasks and models complex, non-stationary musical signals jointly.
Findings
Enhanced pitch estimation accuracy
Effective inference of missing audio segments
Demonstrated benefits of joint modeling over separate tasks
Abstract
Real music signals are highly variable, yet they have strong statistical structure. Prior information about the underlying physical mechanisms by which sounds are generated, and about the rules by which complex sound structure is constructed (notes, chords, a complete musical score), can be naturally unified using Bayesian modelling techniques. Typically, algorithms for Automatic Music Transcription carry out individual tasks such as multiple-F0 detection and beat tracking independently. The challenge remains to perform joint estimation of all parameters. We present a Bayesian approach for modelling music audio and for content analysis. The proposed methodology, based on Gaussian processes, seeks joint estimation of multiple music concepts by incorporating into the kernel prior information about the non-stationary behaviour, dynamics, and rich spectral content present in the modelled music signal. We…
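To make the idea of encoding prior musical knowledge in a kernel more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' kernel, which the truncated abstract does not specify. It assumes a single note whose spectral content is a few harmonics of a known fundamental `f0`, whose dynamics are an exponentially decaying correlation, and whose non-stationarity is a decaying amplitude envelope. The function names (`envelope`, `note_kernel`, `gp_predict`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def envelope(t, decay_rate=20.0):
    """Assumed amplitude envelope: a note that decays exponentially after onset at t = 0."""
    return np.exp(-decay_rate * t)

def note_kernel(t1, t2, f0=220.0, n_harmonics=3, lengthscale=0.05, decay_rate=20.0):
    """Illustrative non-stationary kernel for one note (an assumption, not the paper's kernel).

    Stationary part: exponential decay times a weighted sum of cosines at
    harmonics of f0 (a quasi-periodic covariance).
    Non-stationary part: the outer product of the amplitude envelope.
    """
    tau = t1[:, None] - t2[None, :]
    decay = np.exp(-np.abs(tau) / lengthscale)
    harmonics = np.arange(1, n_harmonics + 1)
    weights = 1.0 / harmonics**2  # assumed spectral roll-off
    spectral = sum(w * np.cos(2 * np.pi * h * f0 * tau) for h, w in zip(harmonics, weights))
    amp = envelope(t1, decay_rate)[:, None] * envelope(t2, decay_rate)[None, :]
    return amp * decay * spectral

def gp_predict(t_obs, y_obs, t_new, noise_var=1e-4, **kernel_args):
    """Posterior mean of a zero-mean GP at t_new given noisy observations."""
    K = note_kernel(t_obs, t_obs, **kernel_args) + noise_var * np.eye(len(t_obs))
    K_star = note_kernel(t_new, t_obs, **kernel_args)
    return K_star @ np.linalg.solve(K, y_obs)

# Synthetic decaying harmonic tone with a missing segment (made-up data).
fs = 8000.0
t = np.arange(400) / fs
y = envelope(t) * (np.cos(2 * np.pi * 220.0 * t) + 0.25 * np.cos(2 * np.pi * 440.0 * t))
observed = np.ones(len(t), dtype=bool)
observed[150:200] = False  # samples treated as missing

# Fill the gap with the GP posterior mean conditioned on the observed samples.
y_filled = gp_predict(t[observed], y[observed], t[~observed])
```

Here `y_filled` holds the posterior-mean estimate for the 50 missing samples. In a fuller treatment the fundamental frequency and envelope would be inferred rather than fixed, which is where pitch estimation would enter; this sketch holds them constant to keep the example short.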
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
