Single-channel speech enhancement by using psychoacoustical model inspired fusion framework
Suman Samui

TL;DR
This paper proposes a fusion framework combining acoustic and modulation domain approaches for single-channel speech enhancement, improving perceived quality and intelligibility in noisy conditions.
Contribution
It introduces a novel fusion framework that leverages psychoacoustic models to enhance speech quality and intelligibility, overcoming limitations of individual methods.
Findings
Consistent improvements in speech quality and intelligibility across noise conditions.
Effective noise reduction at high frequencies with reduced speech distortion.
Superior performance compared to baseline techniques.
Abstract
When the parameters of the Bayesian Short-time Spectral Amplitude (STSA) estimator for speech enhancement are selected based on the characteristics of the human auditory system, the gain function of the estimator becomes more flexible. Although this type of estimator in the acoustic domain is quite effective in reducing the background noise at high frequencies, it produces more speech distortions, which make the high-frequency contents of the speech, such as fricatives, less perceptible in heavy noise conditions, resulting in intelligibility reduction. On the other hand, the speech enhancement scheme, which exploits the psychoacoustic evidence of frequency selectivity in the modulation domain, is found to be able to increase the intelligibility of noisy speech by a substantial amount, but also suffers from the temporal slurring problem due to its essential design constraint. In order to achieve…
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Acoustic Wave Phenomena Research
