Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation

David McShannon; Anthony Mella; Nicholas Dietrich

arXiv:2602.02955·cs.SD·February 4, 2026

Open Access

TL;DR

This study evaluates three generative data augmentation techniques (variational autoencoders, generative adversarial networks, and diffusion models) for respiratory sound classification. It finds limited improvements and highlights the need for task-specific strategies and better evaluation frameworks.

Contribution

The paper provides a systematic comparison of three generative augmentation methods and their effect on a baseline CNN for medical audio classification, finding limited benefit.

Findings

01

Synthetic augmentation did not improve F1-score in most cases.

02

An ensemble of augmented models achieved a modest performance gain.

03

Synthetic augmentation may not be universally effective for medical audio tasks.
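
The summary does not say how the ensemble in finding 02 was built. One common approach is soft voting, where each model's positive-class probability is averaged before thresholding. The following is a minimal sketch under that assumption; the function name `soft_vote`, the 0.5 threshold, and the probability values are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold.

    prob_lists holds one list per model, each with one probability per clip.
    Soft voting is an assumed combination rule, not the paper's stated method.
    """
    averaged = [mean(clip_probs) for clip_probs in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical outputs from three augmented models on four audio clips.
model_probs = [
    [0.9, 0.4, 0.2, 0.6],
    [0.8, 0.6, 0.1, 0.4],
    [0.7, 0.7, 0.3, 0.3],
]
predictions = soft_vote(model_probs)
print(predictions)
```

Averaging lets models that disagree on a borderline clip offset one another, which is one plausible reason an ensemble could gain modestly even when no single augmented model does.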

Abstract

Medical audio classification remains challenging due to low signal-to-noise ratios, subtle discriminative features, and substantial intra-class variability, often compounded by class imbalance and limited training data. Synthetic data augmentation has been proposed as a potential strategy to mitigate these constraints; however, prior studies report inconsistent methodological approaches and mixed empirical results. In this preliminary study, we explore the impact of synthetic augmentation on respiratory sound classification using a baseline deep convolutional neural network trained on a moderately imbalanced dataset (73%:27%). Three generative augmentation strategies (variational autoencoders, generative adversarial networks, and diffusion models) were assessed under controlled experimental conditions. The baseline model without augmentation achieved an F1-score of 0.645. Across…
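
To illustrate why F1-score is a more informative metric than accuracy on a 73%:27% split, the sketch below scores a trivial classifier that always predicts the majority class. It assumes the minority class is the positive class and uses 100 synthetic labels; neither detail comes from the paper, and `f1_score` here is a hand-written helper, not the authors' evaluation code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive label; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic labels matching the 73%:27% class ratio (assumed: 1 = minority class).
y_true = [0] * 73 + [1] * 27
always_majority = [0] * 100  # trivial classifier that never predicts the minority class

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
f1_minority = f1_score(y_true, always_majority)
print(accuracy, f1_minority)
```

The trivial classifier reaches 73% accuracy while its F1 on the minority class is zero, so gains or losses from augmentation are easier to interpret in F1 terms.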

Taxonomy

Topics: Phonocardiography and Auscultation Techniques · Voice and Speech Disorders · COVID-19 diagnosis using AI