Improving DF-Conformer Using Hydra For High-Fidelity Generative Speech Enhancement on Discrete Codec Token
Shogo Seki, Shaoxiang Dang, Li Li

TL;DR
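This paper replaces the FAVOR+ attention in DF-Conformer with Hydra, a bidirectional extension of Mamba built on selective structured state-space sequence models. The change removes the approximation that FAVOR+ introduces into global sequence modeling while keeping complexity linear in sequence length, and it is reported to improve a generative speech enhancement model that operates on discrete codec tokens.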
Contribution
The paper proposes integrating Hydra into DF-Conformer in place of FAVOR+ to improve global sequential modeling in speech enhancement while keeping complexity linear in sequence length.
Findings
The Hydra-based model outperforms the FAVOR+-based DF-Conformer in the reported experiments.
It keeps complexity linear in sequence length while improving global sequence modeling.
It achieves superior results on discrete codec token data.
Abstract
The Dilated FAVOR Conformer (DF-Conformer) is an efficient variant of the Conformer architecture designed for speech enhancement (SE). It employs fast attention through positive orthogonal random features (FAVOR+) to mitigate the quadratic complexity associated with self-attention, while utilizing dilated convolution to expand the receptive field. This combination results in impressive performance across various SE models. In this paper, we propose replacing FAVOR+ with bidirectional selective structured state-space sequence models to achieve two main objectives: (1) enhancing global sequential modeling by eliminating the approximations inherent in FAVOR+, and (2) maintaining linear complexity relative to the sequence length. Specifically, we utilize Hydra, a bidirectional extension of Mamba, framed within the structured matrix mixer framework. Experiments conducted using a generative SE…
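As a rough illustration of the approximation the abstract refers to, the sketch below compares exact softmax attention with a random-feature estimate in the spirit of FAVOR+. It is not taken from the paper: the function names, shapes, and feature count are assumptions chosen for the example, and it draws plain i.i.d. Gaussian projections rather than the orthogonal ones FAVOR+ uses. It does not implement Hydra.

```python
import numpy as np

def positive_random_features(x, w):
    """Map rows of x (T, d) to positive features (T, m) using projections w (m, d).

    For w drawn from N(0, I), the expected inner product of two feature
    vectors equals exp(x . y), i.e. the unnormalized softmax kernel.
    """
    m = w.shape[0]
    proj = x @ w.T
    half_sq_norm = 0.5 * np.sum(x * x, axis=-1, keepdims=True)
    return np.exp(proj - half_sq_norm) / np.sqrt(m)

def random_feature_attention(q, k, v, w):
    """Approximate softmax attention in O(T * m * d) time (linear in T)."""
    scale = q.shape[-1] ** -0.25          # splits the 1/sqrt(d) factor across q and k
    q_feat = positive_random_features(q * scale, w)
    k_feat = positive_random_features(k * scale, w)
    kv = k_feat.T @ v                     # (m, d_v): keys and values summarized once
    norm = q_feat @ k_feat.sum(axis=0)    # (T,): estimated softmax denominators
    return (q_feat @ kv) / norm[:, None]

def softmax_attention(q, k, v):
    """Exact softmax attention; builds the full (T, T) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
T, d, m = 16, 8, 4096
q = 0.3 * rng.standard_normal((T, d))
k = 0.3 * rng.standard_normal((T, d))
v = rng.standard_normal((T, d))
w = rng.standard_normal((m, d))           # i.i.d. Gaussian, not orthogonalized

approx = random_feature_attention(q, k, v, w)
exact = softmax_attention(q, k, v)
max_abs_error = np.max(np.abs(approx - exact))
```

The gap between `approx` and `exact` shrinks as the number of random features `m` grows, but it does not vanish for any finite `m`. That residual error is the kind of approximation the proposed replacement is meant to avoid while keeping the cost linear in the sequence length.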
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
