MultiSense-Pneumo: A Multimodal Learning Framework for Pneumonia Screening in Resource-Constrained Settings

Dineth Jayakody; Pasindu Thenahandi; Chameli Dommanige

arXiv:2605.02207 · cs.CV · May 5, 2026

TL;DR

MultiSense-Pneumo is a multimodal framework combining symptoms, cough audio, speech, and radiographs for pneumonia screening in resource-limited settings, emphasizing transparency and offline capability.

Contribution

The paper introduces a modular, interpretable multimodal system optimized for low-resource environments, integrating symptoms, cough audio, speech, and chest radiographs for pneumonia screening.

Findings

01. Radiograph analysis remains robust under domain shifts.

02. The system operates fully offline on standard hardware.

03. Minority class recall for acoustic signals needs improvement.
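
To make the third finding concrete, here is a minimal sketch of how per-class recall exposes a weak minority class that overall accuracy hides. The labels and predictions are made-up illustration data, not results from the paper, and `recall_per_class` is a hypothetical helper name.

```python
from typing import Dict, Sequence


def recall_per_class(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Return recall for each label that appears in y_true."""
    recalls = {}
    for label in sorted(set(y_true)):
        # Support: how many samples truly belong to this class.
        support = sum(1 for t in y_true if t == label)
        # Hits: how many of those samples were predicted correctly.
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / support
    return recalls


# Illustrative data only: 8 negative samples (0) and 2 positive samples (1).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two positives is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = recall_per_class(y_true, y_pred)
print(accuracy, recalls)
```

In this toy case accuracy is 0.9 while recall on the minority class is only 0.5, which is the kind of gap the finding refers to.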

Abstract

Pneumonia remains a leading global cause of morbidity and mortality, particularly in low-resource settings where access to imaging, laboratory testing, and specialist care is limited. Clinical assessment relies on heterogeneous evidence, including symptoms, respiratory patterns, and chest imaging, making screening inherently multimodal. However, many existing computational approaches remain unimodal and focus primarily on radiographs. In this work, we present MultiSense-Pneumo, a multimodal framework for pneumonia-oriented screening and triage support that integrates structured symptom descriptors, cough audio, spoken language, and chest radiographs. The system combines deterministic symptom triage, LightGBM-based acoustic classification, domain-adversarial radiograph analysis using ResNet-18, transformer-based speech recognition, and an interpretable multimodal fusion operator. Each…
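
The abstract does not specify the fusion operator, so the following is only a sketch of one common interpretable choice: a weighted average over whichever modality scores are available, reported together with each modality's contribution. The function name `fuse`, the modality keys, and the weights are all assumptions made for illustration; treating speech as a route to symptom input rather than a separately scored modality is also an assumption.

```python
from typing import Dict, Optional, Tuple

# Hypothetical weights, chosen for illustration only (not taken from the paper).
WEIGHTS: Dict[str, float] = {"symptoms": 0.3, "cough": 0.2, "radiograph": 0.5}


def fuse(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, float] = WEIGHTS,
) -> Tuple[float, Dict[str, float]]:
    """Fuse per-modality risk scores in [0, 1] into one risk score.

    Missing modalities are passed as None and skipped; the remaining
    weights are renormalized so the result stays in [0, 1].
    """
    present = {m: s for m, s in scores.items() if s is not None}
    if not present:
        raise ValueError("at least one modality score is required")
    total_weight = sum(weights[m] for m in present)
    # Each contribution is that modality's share of the final score,
    # which is what makes the fused output easy to inspect.
    contributions = {m: weights[m] * s / total_weight for m, s in present.items()}
    return sum(contributions.values()), contributions


# Example: no cough recording is available, so only two modalities contribute.
risk, parts = fuse({"symptoms": 1.0, "cough": None, "radiograph": 0.5})
print(risk, parts)
```

Because the fused score is the sum of the per-modality contributions, a reviewer can see which input drove a given triage decision, and a missing modality degrades the estimate rather than blocking it.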
