EffiFusion-GAN: Efficient Fusion Generative Adversarial Network for Speech Enhancement

Bin Wen; Tien-Ping Tan

arXiv:2508.14525 · cs.SD · August 21, 2025

Open Access

TL;DR

EffiFusion-GAN is a lightweight speech enhancement model that combines depthwise separable convolutions, an attention mechanism with dual normalization and residual refinement, and dynamic pruning to reduce model size while maintaining performance in resource-constrained settings.

Contribution

The paper introduces EffiFusion-GAN, a lightweight GAN architecture for speech enhancement that integrates a multi-scale block, an enhanced attention mechanism, and dynamic pruning.

Findings

01. Achieves a PESQ score of 3.45 on the VoiceBank+DEMAND dataset.

02. Outperforms existing models with similar parameter counts.

03. Maintains performance while significantly reducing model size.
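
To make the size-reduction finding concrete, here is a minimal sketch of one-shot magnitude pruning. This is a generic technique used for illustration only: the summary does not say how the paper's dynamic pruning selects weights or schedules sparsity, so the function name `magnitude_prune`, the toy weights, and the 50% sparsity target are all assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`.

    Generic magnitude pruning, NOT the paper's dynamic pruning scheme,
    whose details are not given in the summary.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Hypothetical toy weights; real layers hold thousands of values.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, 0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

A dynamic scheme would presumably revisit the pruning mask during training rather than apply it once, but that behavior is not specified here.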

Abstract

We introduce EffiFusion-GAN (Efficient Fusion Generative Adversarial Network), a lightweight yet powerful model for speech enhancement. The model integrates depthwise separable convolutions within a multi-scale block to capture diverse acoustic features efficiently. An enhanced attention mechanism with dual normalization and residual refinement further improves training stability and convergence. Additionally, dynamic pruning is applied to reduce model size while maintaining performance, making the framework suitable for resource-constrained environments. Experimental evaluation on the public VoiceBank+DEMAND dataset shows that EffiFusion-GAN achieves a PESQ score of 3.45, outperforming existing models under the same parameter settings.
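
As a rough illustration of why depthwise separable convolutions keep a multi-scale block small, the sketch below counts weights for a block with several parallel 2-D convolution branches. The channel width (64), the kernel sizes (3, 5, 7), and the branch layout are hypothetical; the abstract does not give the actual configuration. The counting formulas themselves are the standard ones: a full k×k convolution has C_in·C_out·k² weights, while a depthwise separable one has C_in·k² (depthwise) plus C_in·C_out (pointwise).

```python
def conv2d_params(c_in, c_out, k):
    """Weight count of a standard k x k 2-D convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv mixing channels
    return depthwise + pointwise


def multi_scale_params(channels, kernel_sizes, counter):
    """Total weights across parallel branches, one branch per kernel size."""
    return sum(counter(channels, channels, k) for k in kernel_sizes)


# Hypothetical block configuration, not taken from the paper.
CHANNELS = 64
KERNEL_SIZES = (3, 5, 7)

standard = multi_scale_params(CHANNELS, KERNEL_SIZES, conv2d_params)
separable = multi_scale_params(CHANNELS, KERNEL_SIZES, depthwise_separable_params)

print(standard)                         # 339968
print(separable)                        # 17600
print(round(standard / separable, 1))   # 19.3
```

Under these assumed settings the separable variant needs roughly 19 times fewer weights, and the saving grows with kernel size, which is why the technique pairs naturally with multi-scale branches.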

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Infant Health and Development