Large Language Model-based Nonnegative Matrix Factorization For Cardiorespiratory Sound Separation
Yasaman Torabi, Shahram Shirani, James P. Reilly

TL;DR
This paper combines large language models (LLMs) with non-negative matrix factorization (NMF) to separate heart and lung sounds, with the aim of supporting disease diagnosis from the separated signals.
Contribution
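The authors present this as the first integration of LLMs with NMF for sound separation. The LLM plays two roles: it interprets the separated outputs to give insights for disease prediction, and it runs in a feedback loop that tunes a fundamental frequency penalty added to the NMF cost function.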
Findings
Outperforms existing separation methods on both test sets: 100 synthesized mixtures of real measurements and 210 heart and lung recordings from a clinical manikin.
Enhances disease prediction accuracy from heart and lung sounds.
Demonstrates potential for improved medical sound analysis.
Abstract
This study represents the first integration of large language models (LLMs) with non-negative matrix factorization (NMF), marking a novel advancement in the source separation field. The LLM is employed in two unique ways: enhancing the separation results by providing detailed insights for disease prediction and operating in a feedback loop to optimize a fundamental frequency penalty added to the NMF cost function. We tested the algorithm on two datasets: 100 synthesized mixtures of real measurements, and 210 recordings of heart and lung sounds from a clinical manikin including both individual and mixed sounds, captured using a digital stethoscope. The approach consistently outperformed existing methods, demonstrating its potential to significantly enhance medical sound analysis for disease diagnostics.
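The abstract describes two mechanical pieces: an NMF cost function with an added fundamental frequency penalty, and a feedback loop that adjusts that penalty. The sketch below shows one way such a setup could look. It is not the authors' implementation. The penalty form (a linear term that discourages spectral energy far from an assumed fundamental frequency per component), the helper names `band_mask` and `tune_penalty`, and the `suggest_lam` callable that stands in for the LLM are all assumptions made for illustration.

```python
import numpy as np


def penalized_nmf(V, rank, penalty_mask, lam, n_iter=200, seed=0, eps=1e-9):
    """Factor V ~ W @ H with multiplicative updates.

    Assumed cost (illustrative, not taken from the paper):
        0.5 * ||V - W H||_F^2 + lam * sum(penalty_mask * W)
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Standard Lee-Seung update for the activations.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # The penalty gradient lam * penalty_mask joins the denominator,
        # shrinking penalized entries while keeping W nonnegative.
        W *= (V @ H.T) / (W @ H @ H.T + lam * penalty_mask + eps)
    return W, H


def band_mask(freqs, f0_per_component, width):
    """1.0 where a bin lies more than `width` Hz from a component's assumed f0."""
    f0 = np.asarray(f0_per_component, dtype=float)
    return (np.abs(freqs[:, None] - f0[None, :]) > width).astype(float)


def tune_penalty(V, rank, mask, suggest_lam, lam=0.1, rounds=3):
    """Refit, report the reconstruction error, and let `suggest_lam`
    (a hypothetical stand-in for the LLM) propose the next penalty weight."""
    history = []
    for _ in range(rounds):
        W, H = penalized_nmf(V, rank, mask, lam)
        err = float(np.linalg.norm(V - W @ H))
        history.append((lam, err))
        lam = suggest_lam(lam, err)
    return W, H, history
```

In use, `V` would be a nonnegative magnitude spectrogram of the mixed recording, `freqs` the center frequency of each spectrogram row, and `suggest_lam` a function that sends the current weight and error to a model and parses a new weight from its reply. How the paper actually prompts the LLM and what it reports back are not stated in this summary.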
Taxonomy
Topics: Phonocardiography and Auscultation Techniques · Voice and Speech Disorders · Speech Recognition and Synthesis
