PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement
Nan Zhou, Youhai Jiang, Jialin Tan, Chongmin Qi

TL;DR
PLDNet is a lightweight dual-microphone speech enhancement model that combines power level difference guidance with an attention-augmented U-Net, achieving high performance with significantly reduced computational cost for mobile devices.
Contribution
This paper introduces PLDNet, a lightweight deep network that combines PLD guidance with a new gated convolution augmented frequency attention (GCAFA) module for efficient dual-microphone speech enhancement on mobile phones.
Findings
Achieves speech enhancement performance competitive with recent top-performing models.
Reduces computational cost by over 90% relative to those models.
Suitable for low-power mobile devices.
Abstract
Low-complexity speech enhancement on mobile phones is crucial in the era of 5G. Focusing on the handheld mobile phone communication scenario, and building on the power level difference (PLD) algorithm and a lightweight U-Net, we propose the PLD-guided lightweight deep network (PLDNet), an extremely lightweight dual-microphone speech enhancement method that integrates guidance from a signal processing algorithm with a lightweight attention-augmented U-Net. For the guidance information, we employ the PLD algorithm to pre-process the dual-microphone spectrum and feed the output into the subsequent deep neural network, which uses a lightweight U-Net with our proposed gated convolution augmented frequency attention (GCAFA) module to extract the desired clean speech. Experimental results demonstrate that our proposed method achieves competitive performance with recent top-performing models while reducing computational…
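To make the PLD pre-processing step mentioned in the abstract more concrete, here is a minimal sketch of a normalized power level difference computed per time-frequency bin from two microphone spectra. This is not the authors' implementation: the function name `normalized_pld`, the recursive smoothing factor `alpha`, and the normalized form (difference of smoothed powers divided by their sum) are assumptions chosen for illustration, following a commonly used formulation of PLD for dual-microphone phones.

```python
import numpy as np

def normalized_pld(primary_stft, secondary_stft, alpha=0.9, eps=1e-12):
    """Per-bin normalized power level difference between two microphones.

    primary_stft, secondary_stft: complex arrays of shape (frames, bins).
    alpha: recursive smoothing factor for the power estimates (assumed value).
    Returns a float array of shape (frames, bins) with values in [-1, 1].
    """
    # Instantaneous power of each microphone spectrum.
    p1 = np.abs(np.asarray(primary_stft, dtype=complex)) ** 2
    p2 = np.abs(np.asarray(secondary_stft, dtype=complex)) ** 2

    # First-order recursive smoothing along the frame axis.
    s1 = np.zeros_like(p1)
    s2 = np.zeros_like(p2)
    prev1, prev2 = p1[0], p2[0]
    for t in range(p1.shape[0]):
        prev1 = alpha * prev1 + (1.0 - alpha) * p1[t]
        prev2 = alpha * prev2 + (1.0 - alpha) * p2[t]
        s1[t] = prev1
        s2[t] = prev2

    # Near +1: energy dominated by the primary (mouth-side) microphone.
    # Near 0: similar energy at both microphones (typical of diffuse noise).
    return (s1 - s2) / (s1 + s2 + eps)
```

In a handheld scenario the primary microphone sits close to the mouth, so bins dominated by the talker tend to show a large positive value, while far-field noise yields values near zero. A feature map of this kind could serve as guidance for a downstream network, though how PLDNet actually feeds the PLD output into its U-Net is not specified in the text above.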
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
Methods: 1x1 Convolution · Concatenated Skip Connection · Max Pooling · U-Net · Gated Linear Unit · Gated Convolution · Convolution
