A general framework for online audio source separation
Laurent S. R. Simon (INRIA - IRISA), Emmanuel Vincent (INRIA - IRISA)

TL;DR
This paper introduces a general online audio source separation framework that combines spatial and spectral cues, estimates the model parameters with a GEM algorithm, and compares its separation performance to that of an offline algorithm.
Contribution
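It proposes a general framework for online audio source separation that combines the sliding block and stochastic gradient approaches as well as spatial and spectral cues. Model parameters are estimated in the ML sense using a GEM algorithm with multiplicative updates.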
Findings
Separation performance varies with the block size and the step size.
The framework approaches offline accuracy levels.
It outperforms existing online methods in certain conditions.
Abstract
We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, which is faster but less accurate. Also, they rely either on spatial cues or on spectral cues and cannot separate certain mixtures. In this paper, we design a general online audio source separation framework that combines both approaches and both types of cues. The model parameters are estimated in the Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation (GEM) algorithm with multiplicative updates. The separation performance is evaluated as a function of the block size and the step size and compared to that of an offline algorithm.
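To make the roles of the block size and the step size concrete, here is a minimal single-channel sketch. It is not the authors' algorithm: it uses spectral cues only, swaps in plain Itakura-Saito NMF with multiplicative updates for the paper's GEM algorithm, and assumes the block estimate is blended into the running estimate by a simple convex combination. The names `is_nmf_block`, `online_nmf`, `block_size`, and `step_size` are hypothetical and chosen for illustration.

```python
import numpy as np


def is_nmf_block(V, W, H, n_iter=10, eps=1e-12):
    """Run a few Itakura-Saito multiplicative updates of W and H on one block.

    V is a (n_freq, n_block_frames) array of strictly positive power spectra.
    """
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
    return W, H


def online_nmf(V, n_components, block_size, step_size, n_iter=10, seed=0):
    """Process frames one by one, refitting on a sliding block of recent frames.

    Assumption (not taken from the paper): the running spectral basis W is
    updated as (1 - step_size) * W + step_size * W_block.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + 0.1
    H_out = np.zeros((n_components, n_frames))
    for end in range(1, n_frames + 1):
        start = max(0, end - block_size)
        block = V[:, start:end]
        # Fresh activations for the block; the basis starts from the running W.
        H = rng.random((n_components, block.shape[1])) + 0.1
        W_block, H = is_nmf_block(block, W.copy(), H, n_iter)
        # Blend the block estimate into the running estimate.
        W = (1.0 - step_size) * W + step_size * W_block
        # Keep the activations of the newest frame (fitted with W_block).
        H_out[:, end - 1] = H[:, -1]
    return W, H_out


# Toy usage on random positive "power spectra" (illustrative data only).
V_demo = np.random.default_rng(1).random((16, 20)) + 0.1
W_demo, H_demo = online_nmf(V_demo, n_components=3, block_size=5, step_size=0.5)
```

In this sketch, `step_size = 1` discards the previous basis and behaves like a pure sliding block method, while a small `block_size` with `step_size < 1` behaves more like a stochastic gradient style update. The paper's framework additionally models spatial cues across channels, which this sketch omits.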
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques
