An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis
Yingchen Wei, Xihe Qiu, Xiaoyu Tan, Jingjing Huang, Wei Chu, Yinghui Xu, Yuan Qi

TL;DR
This paper introduces a multimodal dual-encoder deep learning framework that combines visual facial features and semantic data for accurate, efficient, and non-invasive diagnosis of obstructive sleep apnea-hypopnea syndrome, outperforming existing methods.
Contribution
The paper presents a novel multimodal dual-encoder model integrating visual and semantic information with attention mechanisms for OSAHS diagnosis, achieving state-of-the-art accuracy.
Findings
Achieved 91.3% top-1 accuracy in four-class severity classification.
Improved diagnostic accuracy over existing facial image analysis methods.
Demonstrated the effectiveness of cross-attention for fusing image and text features and of ordered regression loss for stable training.
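The summary does not spell out the ordered regression loss, so the following is only a minimal sketch under a stated assumption: a common cumulative-threshold formulation in which a four-class severity label is encoded as three binary "is the severity above level j?" targets and trained with binary cross-entropy. The function names (`ordinal_targets`, `ordinal_loss`, `predict_class`) are hypothetical and are not taken from the paper.

```python
import math


def ordinal_targets(label, num_classes):
    """Encode class `label` as K-1 binary targets: target[j] = 1 if label > j."""
    return [1.0 if label > j else 0.0 for j in range(num_classes - 1)]


def ordinal_loss(logits, label, num_classes):
    """Mean binary cross-entropy over the K-1 threshold logits.

    Assumed formulation (not confirmed by the source): each logit scores
    whether the true severity exceeds one threshold.
    """
    targets = ordinal_targets(label, num_classes)
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable BCE-with-logits: max(z, 0) - z*t + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(targets)


def predict_class(logits):
    """Predicted class = number of thresholds whose logit is positive."""
    return sum(1 for z in logits if z > 0.0)


# Logits that confidently pass the first two thresholds but not the third.
example_logits = [3.0, 1.0, -2.0]
loss_true_2 = ordinal_loss(example_logits, 2, 4)
loss_true_1 = ordinal_loss(example_logits, 1, 4)
loss_true_0 = ordinal_loss(example_logits, 0, 4)
```

With this encoding the penalty grows as the true label moves further from the prediction, which is the property that distinguishes an ordinal loss from plain softmax cross-entropy over unordered classes.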
Abstract
Obstructive sleep apnea-hypopnea syndrome (OSAHS) is a common sleep disorder caused by upper airway blockage, leading to oxygen deprivation and disrupted sleep. Traditional diagnosis using polysomnography (PSG) is expensive, time-consuming, and uncomfortable. Existing deep learning methods using facial image analysis lack accuracy due to poor facial feature capture and limited sample sizes. To address this, we propose a multimodal dual-encoder model that integrates visual and language inputs for automated OSAHS diagnosis. The model balances data using RandomOverSampler, extracts key facial features with attention grids, and converts physiological data into meaningful text. Cross-attention combines image and text data for better feature extraction, and ordered regression loss ensures stable learning. Our approach improves diagnostic efficiency and accuracy, achieving 91.3% top-1 accuracy…
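The class-balancing step mentioned in the abstract can be illustrated with a small standard-library sketch. It mimics the idea behind random oversampling (as provided by the `RandomOverSampler` class in imbalanced-learn): minority classes are padded with randomly chosen duplicates until every class matches the largest one. The function `random_oversample`, its `seed` parameter, and the toy data are assumptions for illustration, not the authors' code.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy data: four severity classes with an imbalanced distribution (hypothetical).
toy_samples = ["a1", "a2", "a3", "a4", "b1", "b2", "c1", "d1"]
toy_labels = [0, 0, 0, 0, 1, 1, 2, 3]
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
class_counts = Counter(balanced_labels)
```

Oversampling of this kind should be applied to the training split only, since duplicating samples before a train/test split would leak copies of the same record into evaluation.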
Taxonomy
Topics: Cardiovascular and Diving-Related Complications
Methods: Softmax · Attention Is All You Need
