Improving the Robustness and Clinical Applicability of Automatic Respiratory Sound Classification Using Deep Learning-Based Audio Enhancement: Algorithm Development and Validation

Jing-Tong Tzeng; Jeng-Lin Li; Huan-Yu Chen; Chun-Hsiang Huang; Chi-Hsin Chen; Cheng-Yi Fan; Edward Pei-Chuan Huang; Chi-Chun Lee

arXiv:2407.13895·eess.AS·May 1, 2025

Open Access

TL;DR

This study shows that adding a deep learning-based audio enhancement step improves the robustness and clinical utility of automatic respiratory sound classification systems in noisy conditions, with gains in classification performance and in user trust.

Contribution

The paper integrates a deep learning-based audio enhancement module into respiratory sound classification and shows improved classification performance and clinical applicability compared with noise injection data augmentation.

Findings

01. 21.9% increase in classification score on the ICBHI dataset

02. 4.1% improvement on the FABS dataset in noisy scenarios

03. 11.6% improvement in diagnostic sensitivity with enhanced workflows

Abstract

Deep learning techniques have shown promising results in the automatic classification of respiratory sounds. However, accurately distinguishing these sounds in real-world noisy conditions remains challenging for clinical deployment. In addition, predicting signals with only background noise may reduce user trust in the system. This study explores the feasibility and effectiveness of incorporating a deep learning-based audio enhancement step into automatic respiratory sound classification systems to improve robustness and clinical applicability. We conducted extensive experiments using various audio enhancement model architectures, including time-domain and time-frequency-domain approaches, combined with multiple classification models to evaluate the module's effectiveness. The classification performance was compared against the noise injection data augmentation method. These experiments…
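The noise injection baseline mentioned in the abstract is commonly implemented by mixing a noise recording into a clean signal at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch of that general idea, not the authors' pipeline: the function names `mix_at_snr` and `measured_snr_db` are hypothetical, the "clean" signal is a synthetic sine wave standing in for a respiratory recording, and Gaussian noise stands in for clinical background noise.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_clean / P_scaled_noise); solve for the noise gain.
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR of a mixture, given the clean reference signal."""
    residual = [y - x for x, y in zip(clean, noisy)]
    clean_power = sum(x * x for x in clean) / len(clean)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(clean_power / residual_power)


# Illustrative data only (assumption): 1 s of a 200 Hz tone sampled at 4 kHz.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 200 * t / 4000) for t in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

In an augmentation setting, a classifier would be trained on mixtures like `noisy` across a range of SNR values; the enhancement approach studied in the paper instead aims to clean the audio before classification.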

Taxonomy

Topics: Phonocardiography and Auscultation Techniques · Noise Effects and Management · Chronic Obstructive Pulmonary Disease (COPD) Research

Methods: Autoencoders