Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding
Jiahui Pan, Pengjie Shen, Hui Zhang, Xueliang Zhang

TL;DR
This paper presents a novel multi-channel speech enhancement method that uses spherical harmonics transform coefficients alongside traditional STFT inputs, effectively capturing spatial information to improve performance with fewer resources.
Contribution
The paper introduces a dual-encoder model that fuses spherical harmonics coefficients with STFT for enhanced spatial cue utilization in speech enhancement.
Findings
Outperforms benchmarks on TIMIT with noise and reverberation
Uses fewer computations and parameters than existing methods
Effectively incorporates directional cues via spherical harmonics
Abstract
Multi-channel speech enhancement extracts speech using multiple microphones that capture spatial cues. Effectively utilizing directional information is key for multi-channel enhancement. Deep learning shows great potential for multi-channel speech enhancement and often takes the short-time Fourier transform (STFT) directly as input. To fully leverage the spatial information, we introduce a method using spherical harmonics transform (SHT) coefficients as auxiliary model inputs. These coefficients concisely represent spatial distributions. Specifically, our model has two encoders, one for the STFT and another for the SHT. By fusing both encoders in the decoder to estimate the enhanced STFT, we effectively incorporate spatial context. Evaluations on TIMIT under varying noise and reverberation show our model outperforms established benchmarks. Remarkably, this is achieved with fewer…
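To make the idea of SHT coefficients as an auxiliary input concrete, here is a minimal sketch of a discrete spherical harmonics transform applied to a multi-channel STFT. It is not the authors' implementation: the truncation at order 1, the real-valued basis, the least-squares (pseudo-inverse) solve, the tetrahedral 4-microphone geometry, and the function names real_sh_basis_order1 and sht_coefficients are all assumptions made for illustration. The abstract does not specify the array, the SH order, or any radial compensation, and none is modeled here.

```python
import numpy as np

def real_sh_basis_order1(colatitude, azimuth):
    """Real spherical harmonics up to order 1 at the given directions.

    Returns an array of shape (num_mics, 4) with columns
    Y_0^0, Y_1^-1, Y_1^0, Y_1^1 (orthonormal real convention).
    """
    c0 = 0.5 / np.sqrt(np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.stack(
        [
            c0 * np.ones_like(colatitude),
            c1 * np.sin(colatitude) * np.sin(azimuth),
            c1 * np.cos(colatitude),
            c1 * np.sin(colatitude) * np.cos(azimuth),
        ],
        axis=-1,
    )

def sht_coefficients(stft, colatitude, azimuth):
    """Least-squares SHT of a multi-channel STFT.

    stft: complex array of shape (num_mics, num_freqs, num_frames).
    Returns a complex array of shape (4, num_freqs, num_frames).
    """
    basis = real_sh_basis_order1(colatitude, azimuth)   # (num_mics, 4)
    pinv = np.linalg.pinv(basis)                        # (4, num_mics)
    # Apply the same spatial transform to every time-frequency bin.
    return np.einsum("cm,mft->cft", pinv, stft)

# Hypothetical tetrahedral 4-microphone array (directions on the unit sphere).
xyz = np.array(
    [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float
) / np.sqrt(3.0)
colat = np.arccos(xyz[:, 2])
azim = np.arctan2(xyz[:, 1], xyz[:, 0])

# Random stand-in for a 4-channel STFT: 257 frequency bins, 10 frames.
rng = np.random.default_rng(0)
stft = rng.standard_normal((4, 257, 10)) + 1j * rng.standard_normal((4, 257, 10))

sh = sht_coefficients(stft, colat, azim)
```

In a dual-encoder setup like the one described, an array such as sh would feed the SHT encoder while stft feeds the STFT encoder. With four microphones and four basis functions the transform is invertible, so in this toy configuration the SHT is a fixed linear re-expression of the channels in a direction-oriented basis rather than new information.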
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Indoor and Outdoor Localization Technologies
