Low Latency Time Domain Multichannel Speech and Music Source Separation
Gerald Schuller (Ilmenau University of Technology)

TL;DR
This paper introduces a low latency, low complexity multichannel source separation method in the time domain, suitable for real-time applications like teleconferencing and hearing aids, outperforming traditional frequency domain approaches.
Contribution
A novel probabilistic optimization method called 'Random Directions' is proposed for time domain source separation, reducing latency and complexity compared to frequency domain methods.
Findings
Outperforms frequency domain methods in low latency scenarios
Effective in separating speech, music, and noise sources
Scalable for portable device applications
Abstract
The goal is to obtain a simple multichannel source separation method with very low latency. Applications include teleconferencing, hearing aids, augmented reality, and selective active noise cancellation. These real-time applications need very low latency, usually less than about 6 ms, and low complexity, because they usually run on small portable devices. For that we don't need the best separation, but "useful" separation, and not just of speech, but also of music and noise. Usual frequency domain approaches have higher latency and complexity. Hence we introduce a novel probabilistic optimization method, which we call "Random Directions", that can overcome local minima, is applied to a simple time domain unmixing structure, and is scalable for low complexity. It is then compared to frequency domain approaches on separating speech and music sources, and on 3D microphone setups.
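The abstract does not spell out the update rule of "Random Directions", so the following is only a minimal sketch of a generic random-direction search, not the paper's algorithm. It assumes the simplest variant: perturb the coefficient vector along a random Gaussian direction and keep the candidate only if the objective decreases. The function name `random_directions_minimize`, the parameters `scale` and `iterations`, and the quadratic toy objective are all hypothetical stand-ins; in a separation setting the objective would be some measure of statistical dependence between the unmixed channels.

```python
import numpy as np


def random_directions_minimize(objective, x0, iterations=2000, scale=0.1, seed=0):
    """Generic random-direction search (an assumed variant, not the paper's exact method).

    Each iteration draws a random Gaussian step and accepts the candidate
    only if it lowers the objective, so no gradient is required.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    best = objective(x)
    for _ in range(iterations):
        # Random direction and step length drawn in one go.
        candidate = x + scale * rng.standard_normal(x.shape)
        value = objective(candidate)
        if value < best:  # keep only improvements
            x, best = candidate, value
    return x, best


def toy_objective(coeffs):
    # Hypothetical stand-in for a separation cost; minimum is 0 at the origin.
    return float(np.sum(coeffs ** 2))


start = np.array([1.0, 1.0])
x_best, f_best = random_directions_minimize(toy_objective, start)
print(x_best, f_best)
```

Because a candidate is accepted only when it improves the objective, the returned value can never be worse than the starting value. Larger values of `scale` make longer jumps more likely, which is one plausible way such a search can leave a local minimum, at the cost of slower fine convergence.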
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Music and Audio Processing
