Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation
Jingping Nie, Dung T. Tran, Karan Thakkar, Vasudha Kowtha, Jon Huang, Carlos Avendano, Erdrin Azemi, Vikramjit Mitra

TL;DR
This study investigates how pre-trained foundation models encode heart sound information for heart rate estimation. Overall, their representations perform comparably to a traditional acoustic-feature baseline, and the audio encoder of an in-house CLAP model outperforms that baseline.
Contribution
The paper provides a layer-wise analysis of the representations of six acoustic foundation models (HuBERT, wav2vec2, wavLM, Whisper, CLAP, and an in-house CLAP model) for auscultation-based heart rate estimation, highlighting the effectiveness of the in-house CLAP model.
Findings
Overall, representation vectors from pre-trained foundation models perform comparably to the acoustic-feature baseline of Nie et al., 2024.
The in-house CLAP model's audio encoder outperforms baseline methods.
Pre-trained models encode relevant auscultation information despite domain mismatch.
Abstract
Auscultation, particularly heart sound, is a non-invasive technique that provides essential vital sign information. Recently, self-supervised acoustic representation foundation models (FMs) have been proposed to offer insights into acoustics-based vital signs. However, there has been little exploration of the extent to which auscultation is encoded in these pre-trained FM representations. In this work, using a publicly available phonocardiogram (PCG) dataset and a heart rate (HR) estimation model, we conduct a layer-wise investigation of six acoustic representation FMs: HuBERT, wav2vec2, wavLM, Whisper, Contrastive Language-Audio Pretraining (CLAP), and an in-house CLAP model. Additionally, we implement the baseline method from Nie et al., 2024 (which relies on acoustic features) and show that overall, representation vectors from pre-trained foundation models (FMs) offer comparable…
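To make the idea of a layer-wise investigation concrete, the following is a minimal sketch of probing each layer of a model separately and comparing heart-rate errors. It is not the authors' pipeline: the ridge-regression probe, the function names (`fit_ridge`, `layerwise_mae`), and the synthetic arrays standing in for pooled FM hidden states are all assumptions made for illustration. In practice, the per-layer vectors would come from a pre-trained model's hidden states on PCG clips, pooled over time.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Mean absolute error, in the same unit as the labels (here bpm)."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def layerwise_mae(layer_feats, hr, n_train, alpha=1.0):
    """Fit one probe per layer and return the held-out MAE of each.

    layer_feats: array (n_layers, n_clips, dim), one pooled vector per clip and layer.
    hr: array (n_clips,) of heart-rate labels in bpm.
    """
    maes = []
    for X in layer_feats:
        w, b = fit_ridge(X[:n_train], hr[:n_train], alpha)
        pred = X[n_train:] @ w + b
        maes.append(mean_absolute_error(hr[n_train:], pred))
    return maes

# Synthetic stand-in for real hidden states (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n_layers, n_clips, dim = 4, 200, 16
hr = rng.uniform(50.0, 150.0, size=n_clips)
direction = rng.normal(size=dim)
# Hypothetical per-layer signal strength: layer 2 carries the most HR information.
strength = [0.0, 0.02, 0.2, 0.05]
layer_feats = np.stack([
    s * np.outer(hr - hr.mean(), direction) + rng.normal(size=(n_clips, dim))
    for s in strength
])

maes = layerwise_mae(layer_feats, hr, n_train=150)
best_layer = int(np.argmin(maes))
print("MAE per layer (bpm):", [round(m, 2) for m in maes])
print("Best layer:", best_layer)
```

Comparing the per-layer MAE values in this way is what indicates which depth of a model carries the most heart-rate information. A stronger probe or a different pooling scheme could be swapped in without changing the overall structure.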
Taxonomy
Topics: Phonocardiography and Auscultation Techniques · ECG Monitoring and Analysis · Healthcare Technology and Patient Monitoring
