TL;DR
This paper introduces adaptive convolution, an efficient module for CNN-based speech enhancement that generates time-varying kernels frame by frame, improving performance with minimal added complexity.
Contribution
The paper proposes adaptive convolution with a lightweight attention mechanism, which enhances CNN models' ability to adaptively represent speech signals, and integrates it into an ultra-lightweight model, AdaptCRN.
Findings
Adaptive convolution improves speech enhancement performance.
The method achieves better results with negligible computational overhead.
Kernel selection in adaptive convolution correlates with speech signal characteristics.
Abstract
Deep learning-based speech enhancement methods have significantly improved speech quality and intelligibility. Convolutional neural networks (CNNs) have been proven to be essential components of many high-performance models. In this paper, we introduce adaptive convolution, an efficient and versatile convolutional module that enhances the model's capability to adaptively represent speech signals. Adaptive convolution performs frame-wise causal dynamic convolution, generating time-varying kernels for each frame by assembling multiple parallel candidate kernels. A lightweight attention mechanism is proposed for adaptive convolution, leveraging both current and historical information to assign adaptive weights to each candidate kernel. This enables the convolution operation to adapt to frame-level speech spectral features, leading to more efficient extraction and reconstruction. We…
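The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`kernel_attention`, `adaptive_conv`), the tensor layout, the frequency-averaged frame descriptor, and the exponential moving average used to carry historical information are all assumptions made for illustration. The only parts taken from the abstract are the overall structure: several parallel candidate kernels, per-frame attention weights computed from current and past frames only, and a causal convolution that uses the weighted sum of the candidates as its kernel for that frame.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kernel_attention(x, proj, bias, decay=0.9):
    """Causal per-frame weights over K candidate kernels.

    x:    (C_in, T, F) input feature map (channels, frames, frequency bins)
    proj: (K, 2 * C_in) hypothetical linear projection
    bias: (K,)
    Returns weights of shape (T, K); each row sums to 1.
    """
    desc = x.mean(axis=2).T              # (T, C_in): one descriptor per frame
    state = np.zeros(desc.shape[1])      # running summary of past frames (assumed EMA)
    weights = np.empty((desc.shape[0], proj.shape[0]))
    for t, d in enumerate(desc):
        state = decay * state + (1.0 - decay) * d
        # Current descriptor plus history; no future frame is ever read.
        weights[t] = softmax(proj @ np.concatenate([d, state]) + bias)
    return weights


def adaptive_conv(x, kernels, weights):
    """Frame-wise causal convolution with a time-varying kernel.

    kernels: (K, C_out, C_in, kt, kf) candidate kernels
    weights: (T, K) output of kernel_attention
    Returns y of shape (C_out, T, F).
    """
    _, c_out, _, kt, kf = kernels.shape
    _, T, F = x.shape
    pad_f = kf // 2
    # Left-pad time by kt - 1 so frame t only sees frames <= t (causal);
    # pad frequency on both sides to keep F bins.
    xp = np.pad(x, ((0, 0), (kt - 1, 0), (pad_f, kf - 1 - pad_f)))
    y = np.zeros((c_out, T, F))
    for t in range(T):
        # Assemble this frame's kernel as a weighted sum of the candidates.
        w_t = np.tensordot(weights[t], kernels, axes=1)   # (C_out, C_in, kt, kf)
        window = xp[:, t:t + kt, :]                       # (C_in, kt, F + kf - 1)
        for f in range(F):
            patch = window[:, :, f:f + kf]                # (C_in, kt, kf)
            y[:, t, f] = np.tensordot(w_t, patch, axes=3)
    return y


# Toy usage with random data (all sizes are arbitrary).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 6, 8))            # C_in=2, T=6, F=8
kernels = rng.standard_normal((4, 3, 2, 2, 3))  # K=4, C_out=3, kt=2, kf=3
proj = rng.standard_normal((4, 4))
bias = np.zeros(4)
weights = kernel_attention(x, proj, bias)
y = adaptive_conv(x, kernels, weights)
print(y.shape)
```

In this sketch the convolution itself runs once per frame regardless of the number of candidates; the extra work is the attention step and the weighted sum that assembles each frame's kernel. A trained model would learn `kernels`, `proj`, and `bias`, and would likely use a more expressive history model than the moving average assumed here.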
Taxonomy
Methods: Softmax · Attention Is All You Need · Convolution
