Nonnegative tensor factorization with frequency modulation cues for blind audio source separation
Elliot Creager, Noah D. Stein, Roland Badeau, Philippe Depalle

TL;DR
This paper introduces Vibrato Nonnegative Tensor Factorization, an unsupervised algorithm that separates musical sources with nonstationary pitch, such as vibrato, by incorporating local frequency modulation cues into tensor factorization.
Contribution
The paper extends Nonnegative Matrix Factorization to include local frequency modulation cues for better separation of vibrato and glissando sources in audio recordings.
Findings
Successfully separates vibrato and glissando sources
Outperforms baseline methods on synthetic vibrato notes
Effective in real musical recordings with frequency modulations
Abstract
We present Vibrato Nonnegative Tensor Factorization, an algorithm for single-channel unsupervised audio source separation with an application to separating instrumental or vocal sources with nonstationary pitch from music recordings. Our approach extends Nonnegative Matrix Factorization for audio modeling by including local estimates of frequency modulation as cues in the separation. This permits the modeling and unsupervised separation of vibrato or glissando musical sources, which is not possible with the basic matrix factorization formulation. The algorithm factorizes a sparse nonnegative tensor comprising the audio spectrogram and local frequency-slope-to-frequency ratios, which are estimated at each time-frequency bin using the Distributed Derivative Method. The use of local frequency modulations as separation cues is motivated by the principle of common fate partial grouping…
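The frequency-slope-to-frequency ratio mentioned in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation and does not use the Distributed Derivative Method; it assumes a hypothetical note whose partials follow an idealized sinusoidal vibrato, and computes the ratio analytically. The function name and all parameter values are illustrative assumptions. Under that assumption, every harmonic of one note has the same ratio at each instant, while a note with a different vibrato has a different one, which is why the ratio can serve as a grouping cue.

```python
import numpy as np

def slope_to_frequency_ratio(f0, rate_hz, depth, harmonic, t):
    """Return f'(t) / f(t) for one partial of a hypothetical vibrato note.

    Assumed model (illustrative only): the partial's instantaneous frequency is
    harmonic * f0 * (1 + depth * sin(2*pi*rate_hz*t)).
    """
    phase = 2.0 * np.pi * rate_hz * t
    freq = harmonic * f0 * (1.0 + depth * np.sin(phase))
    # Analytic time derivative of the instantaneous frequency.
    slope = harmonic * f0 * depth * 2.0 * np.pi * rate_hz * np.cos(phase)
    return slope / freq

t = np.linspace(0.0, 1.0, 1000)

# Four harmonics of the same note: 440 Hz, 6 Hz vibrato, 1% depth (assumed values).
ratios = np.stack(
    [slope_to_frequency_ratio(440.0, 6.0, 0.01, k, t) for k in (1, 2, 3, 4)]
)

# A second note with a different vibrato rate and depth (assumed values).
other = slope_to_frequency_ratio(330.0, 4.5, 0.02, 1, t)

# The harmonic index and f0 cancel in the ratio, so all partials of a note agree.
same_within_note = np.allclose(ratios, ratios[0])
differs_across_notes = not np.allclose(ratios[0], other)
```

Because the harmonic index and the fundamental cancel in the ratio, partials belonging to one source share a common value at each time, whereas the raw frequency slope alone would scale with the harmonic number.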
Taxonomy
Topics: Speech and Audio Processing · Tensor decomposition and applications · Blind Source Separation Techniques
