Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
Arthur N. dos Santos, Bruno S. Masiero, Túlio C. L. Mateus

TL;DR
This paper investigates whether single-channel speech enhancement models can be effectively adapted for multi-channel scenarios by processing each channel independently, aiming to simplify system design and resource requirements.
Contribution
It experimentally compares single-channel and multi-channel enhancement models, demonstrating the viability of single-channel methods for multi-channel applications with some trade-offs.
Findings
Single-channel models can be adapted for multi-channel enhancement.
Multi-channel models better preserve spatial information.
A trade-off exists between spatial preservation and intelligibility gains.
Abstract
One key aspect differentiating data-driven single- and multi-channel speech enhancement and dereverberation methods is that both the problem formulation and complexity of the solutions are considerably more challenging in the latter case. Additionally, with limited computational resources, it is cumbersome to train models that require the management of larger datasets or those with more complex designs. In this scenario, an unverified hypothesis that single-channel methods can be adapted to multi-channel scenarios simply by processing each channel independently holds significant implications, boosting compatibility between sound scene capture and system input-output formats, while also allowing modern research to focus on other challenging aspects, such as full-bandwidth audio enhancement, competitive noise suppression, and unsupervised learning. This study verifies this hypothesis by…
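The hypothesis of processing each channel independently can be illustrated with a minimal sketch. It is not the authors' implementation: the names `enhance_per_channel` and `toy_enhancer` are hypothetical, and the moving-average filter is only a placeholder for a trained single-channel enhancement model, which the abstract does not specify.

```python
import numpy as np
from typing import Callable


def enhance_per_channel(
    audio: np.ndarray,
    enhance_mono: Callable[[np.ndarray], np.ndarray],
) -> np.ndarray:
    """Apply a single-channel enhancer to every channel independently.

    audio: array of shape (channels, samples).
    enhance_mono: any function mapping a 1-D signal to a 1-D signal of the
        same length (assumed here; a real model may need padding or framing).
    """
    if audio.ndim != 2:
        raise ValueError("expected audio with shape (channels, samples)")
    # Each channel is processed in isolation: no information is shared
    # across channels, so spatial cues are not explicitly modeled.
    return np.stack([enhance_mono(channel) for channel in audio], axis=0)


def toy_enhancer(signal: np.ndarray) -> np.ndarray:
    """Placeholder for a trained model: a 5-tap moving-average smoother."""
    kernel = np.ones(5) / 5.0
    return np.convolve(signal, kernel, mode="same")


# Usage: a hypothetical 4-channel recording, 1000 samples per channel.
rng = np.random.default_rng(0)
multichannel = rng.standard_normal((4, 1000))
enhanced = enhance_per_channel(multichannel, toy_enhancer)
```

Because the same function is applied to each channel without any cross-channel input, the input and output formats stay identical, which is the compatibility benefit the abstract points to. Any nonlinear or signal-dependent enhancer may still alter inter-channel level and phase relationships, which is consistent with the reported finding that multi-channel models better preserve spatial information.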
Taxonomy
Topics: Speech and Audio Processing · Noise Effects and Management · Music and Audio Processing
