Disentangling Dual-Encoder Masked Autoencoder for Respiratory Sound Classification

Peidong Wei; Shiyu Miao; Lin Li

arXiv:2506.10698 · eess.AS · June 16, 2025

TL;DR

This paper introduces DDE-MAE, a dual-encoder masked autoencoder model that disentangles disease-related and irrelevant features to improve respiratory sound classification amid data scarcity and domain mismatch.

Contribution

The paper proposes a novel DDE-MAE model with two independent encoders for feature disentanglement, addressing domain mismatch in respiratory sound classification.

Findings

01

Achieves competitive performance on the ICBHI dataset.

02

Effectively reduces domain mismatch through feature disentanglement.

03

Improves classification accuracy with limited data.

Abstract

Deep neural networks have been applied to audio spectrograms for respiratory sound classification, but achieving satisfactory performance remains challenging because available data are scarce. Moreover, domain mismatch may be introduced into trained models because respiratory sound samples are collected with various electronic stethoscopes, from different patient demographics, and in different recording environments. To tackle this issue, we propose a modified Masked Autoencoder (MAE) model, named Disentangling Dual-Encoder MAE (DDE-MAE), for respiratory sound classification. Two independent encoders are designed to capture disease-related and disease-irrelevant information separately, achieving feature disentanglement to reduce the domain mismatch. Our method achieves competitive performance on the ICBHI dataset.
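The dual-encoder idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give architectural details, so the linear "encoders", the 0.75 mask ratio, the mean pooling, and the choice to feed both feature vectors to a decoder while classifying on the disease-related one only are all assumptions made here for illustration. All function and variable names are hypothetical.

```python
import random

def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into (visible, masked) sets, MAE-style."""
    idx = list(range(num_patches))
    rng.shuffle(idx)
    num_keep = int(num_patches * (1 - mask_ratio))
    return sorted(idx[:num_keep]), sorted(idx[num_keep:])

def make_linear(in_dim, out_dim, rng):
    """Hypothetical stand-in for an encoder: one random weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]

def apply_linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def encode(weights, patches):
    """Encode each patch, then mean-pool into a single feature vector."""
    encoded = [apply_linear(weights, p) for p in patches]
    return [sum(col) / len(encoded) for col in zip(*encoded)]

rng = random.Random(0)
num_patches, patch_dim, feat_dim = 16, 8, 4

# Toy "spectrogram": 16 flattened patches of 8 values each (random data).
patches = [[rng.random() for _ in range(patch_dim)]
           for _ in range(num_patches)]

# Assumed mask ratio of 0.75; the abstract does not state one.
visible, masked = random_mask(num_patches, mask_ratio=0.75, rng=rng)
visible_patches = [patches[i] for i in visible]

# Two independent encoders with separate weights see the same visible patches.
disease_encoder = make_linear(patch_dim, feat_dim, rng)
irrelevant_encoder = make_linear(patch_dim, feat_dim, rng)
z_disease = encode(disease_encoder, visible_patches)
z_irrelevant = encode(irrelevant_encoder, visible_patches)

# Assumption: reconstruction uses both feature vectors,
# while classification uses the disease-related features only.
decoder_input = z_disease + z_irrelevant
classifier_input = z_disease
```

In this sketch, separation of the two kinds of information would have to come from the training objectives, which are omitted here; the code only shows how two encoders with independent weights can share the same masked input.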

Taxonomy

Topics: Phonocardiography and Auscultation Techniques · Voice and Speech Disorders · Respiratory and Cough-Related Research

Methods: Masked autoencoder