DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN

Daniel Rika; Nino Sapir; Ido Gus

arXiv:2512.16420·cs.SD·January 14, 2026

TL;DR

DPDFNet is a causal speech enhancement model that adds dual-path RNN blocks to DeepFilterNet2 to improve long-range temporal and cross-band modeling. With an added loss component and a fine-tuning phase, it outperforms larger causal models on real-world, low-SNR, multilingual recordings in everyday noise scenarios, and it remains feasible for edge deployment.

Contribution

We introduce DPDFNet, a causal speech enhancement architecture that adds dual-path blocks to the DeepFilterNet2 encoder, together with a loss component that mitigates over-attenuation and a new evaluation set of long, low-SNR recordings. The model runs in real time on edge NPUs.

Findings

01. DPDFNet outperforms larger causal models on a new real-world evaluation set.

02. The PRISM metric correlates with model scalability and performance.

03. DPDFNet runs in real time on edge NPUs while maintaining high quality.
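The real-time claim in the last finding can be made concrete with a small check. The sketch below is illustrative only: the function names and the 10 ms hop are assumptions, not values taken from the paper. It computes a real-time factor (processing time divided by audio duration) and, because a causal streaming model must finish each frame before the next one arrives, also checks the worst-case per-frame latency against the hop size.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Ratio of compute time to audio duration; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def meets_streaming_deadline(per_frame_seconds, hop_seconds):
    """True if every frame was processed within one hop (worst case, not average)."""
    return max(per_frame_seconds) <= hop_seconds


# Hypothetical measurements: 2 s of compute for 10 s of audio, 10 ms hop (assumed).
rtf = real_time_factor(2.0, 10.0)
on_time = meets_streaming_deadline([0.004, 0.006, 0.005], hop_seconds=0.010)
```

An average real-time factor below 1.0 is necessary but not sufficient for streaming use, which is why the per-frame check looks at the slowest frame.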

Abstract

We present DPDFNet, a causal single-channel speech enhancement model that extends the DeepFilterNet2 architecture with dual-path blocks in the encoder, strengthening long-range temporal and cross-band modeling while preserving the original enhancement framework. In addition, we demonstrate that adding a loss component to mitigate over-attenuation in the enhanced speech, combined with a fine-tuning phase tailored for "always-on" applications, leads to substantial improvements in overall model performance. To compare our proposed architecture with a variety of causal open-source models, we created a new evaluation set comprising long, low-SNR recordings in 12 languages across everyday noise scenarios, better reflecting real-world conditions than commonly used benchmarks. On this evaluation set, DPDFNet delivers superior performance to other causal open-source models, including some that are…
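To illustrate what "dual-path" means here, the following is a minimal sketch of the general idea, not the DPDFNet implementation. Everything in it is an assumption made for illustration: the toy scalar tanh cell with fixed weights, the residual connections, and the names `rnn_scan` and `dual_path_block`. One path scans across frequency bands inside each frame (cross-band modeling), and the other scans forward along time for each band (long-range temporal modeling). Only the time path carries state between frames, and it runs forward only, so the block stays causal.

```python
import math


def rnn_scan(seq, w_in=0.5, w_rec=0.5, reverse=False):
    """Toy scalar Elman RNN over a list of floats; returns one hidden value per step."""
    order = range(len(seq) - 1, -1, -1) if reverse else range(len(seq))
    out = [0.0] * len(seq)
    h = 0.0
    for i in order:
        h = math.tanh(w_in * seq[i] + w_rec * h)
        out[i] = h
    return out


def dual_path_block(spec):
    """spec: list of T frames, each a list of F band values. Returns the same shape."""
    n_frames, n_bands = len(spec), len(spec[0])

    # Path 1 (cross-band): scan across bands in both directions within one frame.
    # It only sees the current frame, so it adds no look-ahead in time.
    intra = []
    for frame in spec:
        fwd = rnn_scan(frame)
        bwd = rnn_scan(frame, reverse=True)
        intra.append([x + 0.5 * (f + b) for x, f, b in zip(frame, fwd, bwd)])

    # Path 2 (temporal): for each band, scan forward in time only (causal).
    out = [[0.0] * n_bands for _ in range(n_frames)]
    for k in range(n_bands):
        track = [intra[t][k] for t in range(n_frames)]
        hidden = rnn_scan(track)
        for t in range(n_frames):
            out[t][k] = intra[t][k] + hidden[t]
    return out


# Tiny example input: 5 frames, 4 bands.
spec = [[0.1 * (t + k) for k in range(4)] for t in range(5)]
enhanced = dual_path_block(spec)
```

In a real model each scalar would be a feature vector and the toy cell would be a learned recurrent layer such as a GRU; the scan order over the two axes is the part this sketch is meant to show.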

Code & Models

Models

Ceva-IP/DPDFNet (Hugging Face model; 88 downloads, 2 likes)

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Hearing Loss and Rehabilitation