Comparative Study between Adversarial Networks and Classical Techniques for Speech Enhancement

Tito Spadini; Ricardo Suyama

arXiv:1910.09522·eess.AS·October 22, 2019

TL;DR

This paper compares classical speech enhancement techniques (the Wiener filter and LogMMSE) with the deep learning approach SEGAN across 85 noise conditions. It finds that the classical methods give better results in most scenarios, while SEGAN performs better, and with lower variance, in severe noise.

Contribution

It compares the Wiener filter, LogMMSE, and SEGAN in 85 noise conditions in terms of quality, intelligibility, and distortion, showing where each approach is stronger.

Findings

01

Classical techniques (Wiener filter, LogMMSE) outperform SEGAN in most scenarios.

02

SEGAN performs better in severe noise conditions.

03

SEGAN's results show lower variance in severe noise scenarios.

Abstract

Speech enhancement is a crucial task for several applications. Among the most explored techniques are the Wiener filter and LogMMSE, but deep learning approaches adapted to this task, such as SEGAN, have also presented relevant results. This study compared the performance of these techniques in 85 noise conditions with respect to quality, intelligibility, and distortion. It concluded that classical techniques continue to exhibit superior results in most scenarios, but that in severe noise scenarios SEGAN performed better and with lower variance.
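
For readers unfamiliar with the classical baseline, the snippet below is a minimal sketch of the textbook per-bin Wiener gain, G = xi / (1 + xi), where xi is an estimate of the a priori SNR. It is not the implementation used in the paper. The function name `wiener_gain`, the power-subtraction SNR estimate, and the sample power values are all assumptions chosen for illustration.

```python
def wiener_gain(noisy_psd, noise_psd, floor=1e-12):
    """Per-bin Wiener gain G = xi / (1 + xi), with xi the estimated SNR.

    Illustrative sketch only: xi is estimated by simple power subtraction,
    which is an assumption and not taken from the paper.
    """
    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        p_n = max(p_n, floor)            # guard against division by zero
        xi = max(p_y - p_n, 0.0) / p_n   # crude SNR estimate, clipped at 0
        gains.append(xi / (1.0 + xi))
    return gains


# Hypothetical power values for three frequency bins.
gains = wiener_gain([4.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print(gains)
```

Bins where the noisy power barely exceeds the noise estimate get a gain near zero, which may help explain why this family of methods can struggle when noise is severe across most bins.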

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques