Dual-Stage Low-Complexity Reconfigurable Speech Enhancement
Jun Yang, Nico Brailovsky

TL;DR
This paper introduces a dual-stage, low-complexity, reconfigurable speech enhancement method for telecommunication applications. A coarse processing stage is followed by a fine one, and the method significantly improves both speech quality and noise suppression.
Contribution
The novelty lies in a dual-stage design driven by the input data and audio content, which makes the method reconfigurable and improves speech quality metrics relative to traditional methods.
Findings
Significantly improves 3QUEST metrics (SMOS and NMOS)
Enhances SNR and subjective listening experience
Can be integrated into various speech communication systems
Abstract
This paper proposes a dual-stage, low-complexity, and reconfigurable technique to enhance speech contaminated by various types of noise sources. Driven by input data and audio content, the proposed dual-stage speech enhancement approach performs coarse processing in the first stage and fine processing in the second stage. We demonstrate that the proposed solution significantly improves the metrics of 3-fold QUality Evaluation of Speech in Telecommunication (3QUEST), which consist of the speech mean opinion score (SMOS) and the noise MOS (NMOS), for near-field and far-field applications. Moreover, the proposed approach greatly improves both the signal-to-noise ratio (SNR) and the subjective listening experience. By comparison, traditional speech enhancement methods reduce SMOS even though they increase NMOS and SNR. In addition, the…
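As an illustration of the SNR metric mentioned in the abstract, the following is a minimal sketch of one common way to compute SNR in decibels when a clean reference signal is available. It is not taken from the paper: the function name `snr_db`, the use of the sample-wise difference as the noise estimate, and the toy signals are all assumptions made for this example, and the authors' actual measurement procedure may differ.

```python
import math


def snr_db(clean, processed):
    """Return the SNR in dB, treating (processed - clean) as residual noise.

    Hypothetical helper for illustration only; assumes both sequences are
    time-aligned and of equal length.
    """
    signal_power = sum(s * s for s in clean)
    noise_power = sum((p - s) ** 2 for s, p in zip(clean, processed))
    if noise_power == 0:
        return math.inf   # no residual noise at all
    if signal_power == 0:
        return -math.inf  # reference is silent, only noise remains
    return 10.0 * math.log10(signal_power / noise_power)


# Toy example: a constant offset of 0.1 on a unit-amplitude signal.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 1.1, -0.9]
print(round(snr_db(clean, noisy), 3))
```

A higher value means less residual noise relative to the speech. As the abstract notes, a higher SNR alone does not guarantee better perceived speech quality, which is why SMOS and NMOS are reported alongside it.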
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis
