Optimizing Audio Augmentations for Contrastive Learning of Health-Related Acoustic Signals
Louis Blankemeier, Sebastien Baur, Wei-Hung Weng, Jake Garrison, Yossi Matias, Shruthi Prabhakara, Diego Ardila, Zaid Nabulsi

TL;DR
This paper explores how to optimize audio augmentations within a self-supervised contrastive learning framework to improve the generalizability of health-related acoustic signal analysis across various medical tasks.
Contribution
It introduces an in-depth analysis of augmentation strategies for contrastive learning of health acoustics using a Slowfast NFNet backbone, highlighting the importance of augmentation combinations.
Findings
Augmentation strategies significantly improve model performance.
Combined augmentations have synergistic effects.
Optimized augmentations enhance generalizability across tasks.
Abstract
Health-related acoustic signals, such as cough and breathing sounds, are relevant for medical diagnosis and continuous health monitoring. Most existing machine learning approaches for health acoustics are trained and evaluated on specific tasks, limiting their generalizability across various healthcare applications. In this paper, we leverage a self-supervised learning framework, SimCLR with a Slowfast NFNet backbone, for contrastive learning of health acoustics. A crucial aspect of optimizing Slowfast NFNet for this application lies in identifying effective audio augmentations. We conduct an in-depth analysis of various audio augmentation strategies and demonstrate that an appropriate augmentation strategy enhances the performance of the Slowfast NFNet audio encoder across a diverse set of health acoustic tasks. Our findings reveal that when augmentations are combined, they can produce…
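The contrastive setup described in the abstract can be sketched roughly as follows: two augmented views of each recording are embedded, and an NT-Xent loss (the objective used by SimCLR) pulls the views of the same recording together. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The two augmentations (`time_shift`, `add_noise`), the `compose` helper, and the random linear projection standing in for the encoder are all hypothetical and chosen only for illustration; the paper's actual augmentation set and its Slowfast NFNet encoder are not reproduced here.

```python
import numpy as np


def add_noise(x, rng, snr_db=20.0):
    """Add white Gaussian noise at the given signal-to-noise ratio in dB."""
    signal_power = np.mean(x ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)


def time_shift(x, rng, max_fraction=0.2):
    """Circularly shift the waveform by a random number of samples."""
    max_shift = int(max_fraction * x.shape[-1])
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(x, shift, axis=-1)


def compose(*augmentations):
    """Chain augmentations so they are applied in the order given."""
    def apply(x, rng):
        for aug in augmentations:
            x = aug(x, rng)
        return x
    return apply


def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent loss for N positive pairs; z1 and z2 have shape (N, D)."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    # Row i is paired with row i + N, and row i + N with row i.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max  # numerically stable log-softmax
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), positives].mean())


# Toy usage: random "waveforms" and a stand-in linear encoder (assumption).
rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 1600))
augment = compose(time_shift, add_noise)
view1 = np.stack([augment(x, rng) for x in batch])
view2 = np.stack([augment(x, rng) for x in batch])
projection = rng.normal(size=(1600, 32))  # hypothetical encoder weights
loss = nt_xent_loss(view1 @ projection, view2 @ projection)
```

Swapping the arguments passed to `compose` is one simple way to compare single augmentations against combinations, which is the kind of comparison the abstract describes.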
Taxonomy
Topics: Phonocardiography and Auscultation Techniques · Music and Audio Processing · Respiratory and Cough-Related Research
Methods: Average Pooling · Residual Connection · Batch Normalization · Max Pooling · Kaiming Initialization · Global Average Pooling · Convolution · 1x1 Convolution
