Comparative Study between Adversarial Networks and Classical Techniques for Speech Enhancement
Tito Spadini, Ricardo Suyama

TL;DR
This paper compares classical speech enhancement techniques (the Wiener filter and LogMMSE) with a deep learning approach, SEGAN, across 85 noise conditions. Classical methods give better results in most scenarios, but SEGAN performs better in severe noise.
Contribution
It compares classical and deep learning speech enhancement methods across 85 noise conditions in terms of quality, intelligibility, and distortion, highlighting their relative strengths.
Findings
Classical techniques (Wiener filter, LogMMSE) outperform SEGAN in most scenarios.
SEGAN performs better in severe noise conditions.
SEGAN shows lower variance than the classical techniques in severe noise scenarios.
Abstract
Speech enhancement is a crucial task for several applications. Among the most explored techniques are the Wiener filter and the LogMMSE, but approaches exploring deep learning adapted to this task, such as SEGAN, have presented relevant results. This study compared the performance of the mentioned techniques in 85 noise conditions regarding quality, intelligibility, and distortion; and concluded that classical techniques continue to exhibit superior results for most scenarios, but, in severe noise scenarios, SEGAN performed better and with lower variance.
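For readers unfamiliar with the classical baseline named in the abstract, the following is a minimal sketch of a textbook per-frequency-bin Wiener gain. It is not the implementation evaluated in the paper: the function name `wiener_gain`, the `floor` parameter, and the simple power-subtraction estimate of the a priori SNR are all assumptions made here for illustration.

```python
def wiener_gain(noisy_power, noise_power, floor=0.0):
    """Return a per-bin Wiener gain G = xi / (1 + xi).

    xi is the a priori SNR, estimated here (an assumption of this sketch)
    as max(P_y / P_n - 1, 0), where P_y is the noisy-speech power and
    P_n is the estimated noise power in the same frequency bin.
    """
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        if p_n <= 0.0:
            # No noise estimated in this bin: pass the signal through.
            gains.append(1.0)
            continue
        xi = max(p_y / p_n - 1.0, 0.0)
        # The optional floor limits how strongly any bin is attenuated.
        gains.append(max(xi / (1.0 + xi), floor))
    return gains


# Hypothetical power values for three frequency bins of one frame.
gains = wiener_gain([4.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print(gains)
```

Bins where the noisy power barely exceeds the noise estimate receive a gain near zero, while bins dominated by speech are left almost untouched. In a full enhancer these gains would be applied to the short-time spectrum of each frame before resynthesis.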
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques
