Improving Deep Learning-based Respiratory Sound Analysis with Frequency Selection and Attention Mechanism
Nouhaila Fraihi, Ouassim Karrakchou, and Mounir Ghogho

TL;DR
This paper introduces a compact CNN with integrated self-attention and a frequency band selection module to improve respiratory sound classification, achieving high accuracy with reduced computational costs and robustness across diverse patient groups.
Contribution
We propose a compact CNN-Temporal Self-Attention (CNN-TSA) network with a Frequency Band Selection (FBS) module that improves both accuracy and efficiency in respiratory sound analysis, achieving state-of-the-art results on the public SPRSound and ICBHI datasets.
Findings
FBS suppresses noisy and non-informative frequency regions, improving accuracy and reducing FLOPs by up to 50%.
CNN-TSA achieves state-of-the-art results on SPRSound and ICBHI datasets.
FBS can be integrated into transformer models as an effective enhancement.
Abstract
Accurate classification of respiratory sounds requires deep learning models that effectively capture fine-grained acoustic features and long-range temporal dependencies. Convolutional Neural Networks (CNNs) are well-suited for extracting local time-frequency patterns but are limited in modeling global context. In contrast, transformer-based models can capture long-range dependencies, albeit with higher computational demands. To address these limitations, we propose a compact CNN-Temporal Self-Attention (CNN-TSA) network that integrates lightweight self-attention into an efficient CNN backbone. Central to our approach is a Frequency Band Selection (FBS) module that suppresses noisy and non-informative frequency regions, substantially improving accuracy and reducing FLOPs by up to 50%. We also introduce age-specific models to enhance robustness across diverse patient groups. Evaluated on…
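The link between dropping frequency bands and lower compute can be illustrated with a small sketch. The abstract does not say how FBS chooses its bands, so the code below assumes a fixed contiguous crop of a (frequency, time) spectrogram. The function names, the 128 x 256 input size, and the single 3x3 convolution layer are all hypothetical and are not taken from the paper. It only shows that the cost of a convolution layer scales linearly with the number of frequency bins kept.

```python
import numpy as np


def select_band(spec, lo, hi):
    """Keep frequency bins lo..hi-1 of a (freq, time) spectrogram.

    Assumption: a fixed contiguous crop stands in for the paper's FBS module.
    """
    return spec[lo:hi, :]


def conv2d_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates of one stride-1, 'same'-padded conv layer.

    FLOPs are roughly twice this count; ratios between inputs are unchanged.
    """
    return height * width * c_in * c_out * kernel * kernel


# Hypothetical input: 128 mel bins x 256 time frames.
spec = np.random.default_rng(0).random((128, 256))

# Hypothetical band: keep the lower half of the frequency axis.
band = select_band(spec, 0, 64)

full = conv2d_macs(*spec.shape, c_in=1, c_out=16)
reduced = conv2d_macs(*band.shape, c_in=1, c_out=16)
saving = 1 - reduced / full  # fraction of layer cost removed by the crop
```

Under these assumptions, keeping half of the frequency bins halves the cost of the layer, which is consistent in spirit with the "up to 50%" reduction reported in the abstract. The actual saving depends on the architecture and on which bands the module retains.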
