Independent Low-Rank Matrix Analysis Based on Complex Student's $t$-Distribution for Blind Audio Source Separation
Shinichi Mogami, Daichi Kitamura, Yoshiki Mitsui, Norihiro Takamune, Hiroshi Saruwatari, Nobutaka Ono

TL;DR
This paper enhances blind audio source separation by integrating a complex Student's t-distribution into ILRMA, improving performance and stability in separating music and speech sources.
Contribution
It introduces a source model based on the isotropic complex Student's t-distribution within ILRMA, generalizing the conventional isotropic complex Gaussian assumption to improve separation performance and stability.
Findings
Improved separation quality for music and speech tasks.
Enhanced stability of the source separation process.
Abstract
In this paper, we generalize the source generative model in a state-of-the-art blind source separation (BSS) method, independent low-rank matrix analysis (ILRMA). ILRMA unifies frequency-domain independent component analysis and nonnegative matrix factorization and can provide better performance for audio BSS tasks. To further improve the performance and stability of the separation, we introduce an isotropic complex Student's $t$-distribution as the source generative model, which includes the isotropic complex Gaussian distribution used in conventional ILRMA as a special case. Experiments are conducted on both music and speech BSS tasks, and the results show the validity of the proposed method.
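The statement that the Student's $t$ model includes the Gaussian model can be checked numerically. The sketch below is illustrative only and is not the authors' code: it assumes one common parametrization of the isotropic complex Student's $t$ density, p(x) = (1 / (pi * sigma2)) * (1 + (2 / nu) * |x|^2 / sigma2)^(-(2 + nu) / 2), with scale `sigma2` and degrees of freedom `nu` (both names are chosen here for illustration). Under that assumption, the density approaches the isotropic complex Gaussian as `nu` grows, and has heavier tails for small `nu`.

```python
import numpy as np

def complex_gaussian_pdf(x, sigma2):
    """Isotropic complex Gaussian density with variance sigma2."""
    return np.exp(-np.abs(x) ** 2 / sigma2) / (np.pi * sigma2)

def complex_student_t_pdf(x, sigma2, nu):
    """Isotropic complex Student's t density (assumed parametrization).

    sigma2 is the scale parameter and nu the degrees of freedom.
    """
    a = np.abs(x) ** 2 / sigma2
    return (1.0 + 2.0 * a / nu) ** (-(2.0 + nu) / 2.0) / (np.pi * sigma2)

# A few complex test points, including one far from the origin.
x = np.array([0.0, 0.5 + 0.5j, 1.0 - 2.0j])
gauss = complex_gaussian_pdf(x, 1.0)

# The maximum gap to the Gaussian density shrinks as nu increases.
for nu in (1.0, 10.0, 1e8):
    gap = np.max(np.abs(complex_student_t_pdf(x, 1.0, nu) - gauss))
    print(f"nu={nu:g}: max |t - gaussian| = {gap:.3e}")
```

In this parametrization, a small `nu` (for example `nu=1`) assigns much more probability to large-magnitude values than the Gaussian does, which is the usual motivation for a heavy-tailed source model.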
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Music and Audio Processing
