Fused Audio Instance and Representation for Respiratory Disease Detection
Tuan Truong, Matthias Lenga, Antoine Serrurier, Sadegh Mohammadi

TL;DR
This paper introduces FAIR, a method combining waveform and spectrogram features from various body sounds to improve respiratory disease detection, demonstrating enhanced performance over single-representation models.
Contribution
FAIR is a novel approach that fuses waveform and spectrogram features from multiple body sounds for better respiratory disease detection accuracy.
Findings
Self-attention fusion yields the highest AUC of 0.8658
Combining waveform and spectrogram representations improves detection performance
Multi-sound fusion outperforms single-sound models
Abstract
Audio-based classification techniques on body sounds have long been studied to aid in the diagnosis of respiratory diseases. While most research is centered on the use of cough as the main biomarker, other body sounds also have the potential to detect respiratory diseases. Recent studies on COVID-19 have shown that breath and speech sounds, in addition to cough, correlate with the disease. Our study proposes Fused Audio Instance and Representation (FAIR) as a method for respiratory disease detection. FAIR relies on constructing a joint feature vector from various body sounds represented in waveform and spectrogram form. We conducted experiments on the use case of COVID-19 detection by combining waveform and spectrogram representations of body sounds. Our findings show that the use of self-attention to combine extracted features from cough, breath, and speech sounds leads to the best…
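To make the idea of a joint feature vector concrete, here is a minimal sketch of self-attention fusion over six feature vectors (three body sounds, two representations each). It is an illustration only, not the authors' implementation: the feature dimension, the random placeholder features, the identity query/key/value projections, and the helper names `self_attention` and `fuse` are all assumptions. A real model would use learned encoders and learned projection matrices.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        # Similarity of this token to every token, including itself.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        weights = softmax(scores)
        # Weighted average of all tokens.
        attended.append([
            sum(w * tok[i] for w, tok in zip(weights, tokens))
            for i in range(dim)
        ])
    return attended


def fuse(features):
    """Attend across all (sound, representation) features, then concatenate.

    `features` maps (sound, representation) -> list of floats of equal length.
    """
    tokens = [features[key] for key in sorted(features)]
    attended = self_attention(tokens)
    return [value for token in attended for value in token]


# Hypothetical placeholder features standing in for encoder outputs.
random.seed(0)
SOUNDS = ("breath", "cough", "speech")
REPRESENTATIONS = ("spectrogram", "waveform")
FEATURE_DIM = 4  # assumed size, chosen only for illustration

features = {
    (sound, rep): [random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
    for sound in SOUNDS
    for rep in REPRESENTATIONS
}

joint = fuse(features)
print(len(joint))  # 6 tokens x 4 dimensions = 24
```

The joint vector would then be passed to a classifier head. Concatenating after attention is one possible design; pooling the attended tokens into a single vector is another.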
Taxonomy
Topics: Phonocardiography and Auscultation Techniques · COVID-19 diagnosis using AI · Respiratory and Cough-Related Research
