RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung

TL;DR
This paper introduces RepAugment, a novel data augmentation method for respiratory sound classification that improves accuracy, especially for minority disease classes, by bridging the gap between speech and lung sound representations.
Contribution
The paper proposes RepAugment, a representation-level augmentation technique that is compatible with models pretrained on speech waveforms and that outperforms the spectrogram-based SpecAugment.
Findings
RepAugment outperforms SpecAugment in accuracy.
RepAugment improves detection of minority disease classes by up to 7.14%.
The results demonstrate the effectiveness of input-agnostic augmentation for respiratory sounds.
Abstract
Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrained speech models for respiratory sound classification. We find that there is a characterization gap between speech and lung sound samples, and to bridge this gap, data augmentation is essential. However, the most widely used augmentation technique for audio and speech, SpecAugment, requires 2-dimensional spectrogram format and cannot be applied to models pretrained on speech waveforms. To address this, we propose RepAugment, an input-agnostic representation-level augmentation technique that…
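To make the contrast with SpecAugment concrete, the following is a minimal sketch of what augmenting at the representation level can look like. It is not RepAugment itself, whose procedure the truncated abstract above does not describe. It is a generic random feature-masking illustration. The function name `mask_representation`, the `mask_rate` parameter, and the toy embeddings are all assumptions made for this example. Because the masking acts on the encoder's output vectors rather than on the input, it does not depend on whether the model consumed a waveform or a spectrogram.

```python
import random


def mask_representation(features, mask_rate=0.2, rng=None):
    """Zero out a random subset of dimensions in each feature vector.

    Hypothetical helper for illustration only; it is NOT the paper's method.
    `features` is a list of equal-length lists of floats, standing in for
    the embeddings a pretrained encoder would produce.
    """
    rng = rng or random.Random()
    masked = []
    for vec in features:
        dim = len(vec)
        n_mask = int(round(mask_rate * dim))
        # Pick distinct dimensions to drop for this sample.
        drop = set(rng.sample(range(dim), n_mask))
        masked.append([0.0 if i in drop else v for i, v in enumerate(vec)])
    return masked


# Toy batch: two 10-dimensional embeddings standing in for encoder outputs.
batch = [[1.0] * 10, [2.0] * 10]
augmented = mask_representation(batch, mask_rate=0.2, rng=random.Random(0))
```

In a training loop, a function like this would sit between the pretrained encoder and the classification head, so the same augmentation applies regardless of the input format the encoder expects.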
Taxonomy
Topics: Phonocardiography and Auscultation Techniques · Music and Audio Processing · Chronic Obstructive Pulmonary Disease (COPD) Research
