Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection

Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, and Kunio Kashino

arXiv:2404.17107·eess.AS·April 29, 2024

TL;DR

This paper investigates pre-trained general-purpose audio representations, in particular Masked Modeling Duo (M2D), for heart murmur detection, a task where heart sound data is limited. On the CirCor DigiScope dataset, M2D outperforms previous methods with a weighted accuracy of 0.832 and an unweighted average recall of 0.713.

Contribution

The paper applies general-purpose audio models pre-trained on large-scale datasets to heart murmur detection and shows that transfer learning from them improves on previous methods.

Findings

1. M2D outperforms previous methods with 0.832 weighted accuracy.

2. Ensembling M2D with other models further improves results.

3. The approach demonstrates the effectiveness of general-purpose audio representations for medical audio analysis.
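The page does not say how the models in the second finding were ensembled. As an illustration only, the sketch below assumes the simplest common scheme: averaging the per-class probabilities of several models and picking the highest-scoring class. The function names, the class order, and the probability values are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: probability-averaging ensemble (an assumed scheme,
# not necessarily the one used in the paper).

CLASSES = ["Present", "Unknown", "Absent"]  # assumed class order


def ensemble_average(prob_sets):
    """Average per-class probabilities across models.

    prob_sets: list of model outputs, each a list of per-recording
    probability rows with one value per class.
    """
    n_models = len(prob_sets)
    n_items = len(prob_sets[0])
    averaged = []
    for i in range(n_items):
        rows = [probs[i] for probs in prob_sets]
        averaged.append([sum(col) / n_models for col in zip(*rows)])
    return averaged


def argmax_labels(probs, classes):
    """Map each probability row to the label with the highest score."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in probs]


# Made-up outputs of two models for two recordings.
model_a = [[0.6, 0.1, 0.3], [0.2, 0.2, 0.6]]
model_b = [[0.2, 0.1, 0.7], [0.3, 0.1, 0.6]]

labels = argmax_labels(ensemble_average([model_a, model_b]), CLASSES)
```

In this made-up case the two models disagree on the first recording, and the averaged probabilities decide the final label.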

Abstract

To reduce the need for skilled clinicians in heart sound interpretation, recent studies on automating cardiac auscultation have explored deep learning approaches. However, despite the demands for large data for deep learning, the size of the heart sound datasets is limited, and no pre-trained model is available. On the contrary, many pre-trained models for general audio tasks are available as general-purpose audio representations. This study explores the potential of general-purpose audio representations pre-trained on large-scale datasets for transfer learning in heart murmur detection. Experiments on the CirCor DigiScope heart sound dataset show that the recent self-supervised learning Masked Modeling Duo (M2D) outperforms previous methods with the results of a weighted accuracy of 0.832 and an unweighted average recall of 0.713. Experiments further confirm improved performance by…
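The two metrics named in the abstract can be sketched as follows. Unweighted average recall (UAR) is the mean of the per-class recalls. The weighted accuracy shown here assumes the class weights used for murmur scoring in the George B. Moody PhysioNet Challenge 2022 (Present = 5, Unknown = 3, Absent = 1); this page does not state the weights, so treat them as an assumption. The labels and predictions are made up for illustration.

```python
# Minimal sketch of the two metrics; weights and data are assumptions.

CLASSES = ["Present", "Unknown", "Absent"]
# Assumed class weights (PhysioNet Challenge 2022 murmur scoring).
WEIGHTS = {"Present": 5, "Unknown": 3, "Absent": 1}


def unweighted_average_recall(y_true, y_pred, classes):
    """Mean of per-class recall, skipping classes absent from y_true."""
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        if total == 0:
            continue
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred, weights):
    """Accuracy where each recording counts with the weight of its true class."""
    num = sum(weights[t] for t, p in zip(y_true, y_pred) if t == p)
    den = sum(weights[t] for t in y_true)
    return num / den


# Made-up labels for six recordings.
y_true = ["Present", "Present", "Unknown", "Absent", "Absent", "Absent"]
y_pred = ["Present", "Absent", "Unknown", "Absent", "Absent", "Present"]

uar = unweighted_average_recall(y_true, y_pred, CLASSES)
w_acc = weighted_accuracy(y_true, y_pred, WEIGHTS)
```

Because the weights favor the Present class, a missed murmur lowers the weighted accuracy more than a false alarm on an Absent recording does.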

Taxonomy

Topics: Phonocardiography and Auscultation Techniques · Music and Audio Processing

Methods: Masked Modeling Duo