TL;DR
DeepGB-TB is a non-invasive AI screening system that estimates TB risk from cough audio and basic demographic data. It fuses the two modalities with bidirectional cross-attention and trains with a loss that penalizes false negatives more heavily, reaching an AUROC of 0.903 and an F1-score of 0.851 while staying interpretable and light enough for rapid deployment in low-resource settings.
Contribution
It introduces a novel cross-modal attention mechanism and a risk-balanced loss function, enabling accurate, interpretable TB risk assessment from cough audio and demographics.
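To make the idea of cross-modal attention concrete, here is a minimal, dependency-free sketch of two token sets attending to each other over a few rounds. It is an illustration under stated assumptions, not the paper's CM-BCA module: the names `attend` and `bidirectional_cross_attention`, the identity projections (no learned weights), the residual update, and the `rounds` parameter are all hypothetical, and the summary does not say how audio and tabular features are turned into tokens.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Scaled dot-product attention with identity projections.

    Assumption: a real module would apply learned query/key/value
    projections; they are omitted to keep the sketch short.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * kv[j] for w, kv in zip(weights, keys_values))
             for j in range(d)]
        )
    return out


def bidirectional_cross_attention(audio_tokens, tabular_tokens, rounds=2):
    """Let each modality query the other, `rounds` times (hypothetical)."""
    for _ in range(rounds):
        # Both updates are computed from the same pre-update state.
        audio_update = attend(audio_tokens, tabular_tokens)
        tabular_update = attend(tabular_tokens, audio_tokens)
        # Residual connection: add the attended context to each token.
        audio_tokens = [
            [a + u for a, u in zip(tok, upd)]
            for tok, upd in zip(audio_tokens, audio_update)
        ]
        tabular_tokens = [
            [t + u for t, u in zip(tok, upd)]
            for tok, upd in zip(tabular_tokens, tabular_update)
        ]
    return audio_tokens, tabular_tokens


# Toy input: three 2-dimensional audio tokens, two tabular tokens.
audio = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
tabular = [[0.2, 0.8], [0.9, 0.1]]
fused_audio, fused_tabular = bidirectional_cross_attention(audio, tabular)
```

Each modality keeps its own token count and dimensionality; only the content of the tokens changes as context from the other modality is mixed in.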
Findings
Achieved an AUROC of 0.903 and an F1-score of 0.851 on a diverse dataset.
Operates efficiently on mobile devices for real-time screening.
Provides clinically validated explanations to support trust.
Abstract
Large-scale tuberculosis (TB) screening is limited by the high cost and operational complexity of traditional diagnostics, creating a need for artificial-intelligence solutions. We propose DeepGB-TB, a non-invasive system that instantly assigns TB risk scores using only cough audio and basic demographic data. The model couples a lightweight one-dimensional convolutional neural network for audio processing with a gradient-boosted decision tree for tabular features. Its principal innovation is a Cross-Modal Bidirectional Cross-Attention module (CM-BCA) that iteratively exchanges salient cues between modalities, emulating the way clinicians integrate symptoms and risk factors. To meet the clinical priority of minimizing missed cases, we design a Tuberculosis Risk-Balanced Loss (TRBL) that places stronger penalties on false-negative predictions, thereby reducing high-risk…
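The exact form of TRBL is not given in the text above. As a rough illustration of the general idea of penalizing false negatives more heavily, the sketch below uses a positively weighted binary cross-entropy as a stand-in. The function name `risk_balanced_bce` and the parameter `fn_weight` are hypothetical and are not taken from the paper.

```python
import math


def risk_balanced_bce(y_true, p_pred, fn_weight=3.0, eps=1e-7):
    """Binary cross-entropy with extra weight on the positive-class term.

    Assumption: a stand-in for a false-negative-averse loss, not the
    paper's TRBL. `fn_weight` > 1 makes a missed TB case (y=1, low p)
    cost more than an equally confident false alarm (y=0, high p).
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(fn_weight * y * math.log(p)
                   + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# A confident miss on a positive case versus a confident false alarm.
missed_case = risk_balanced_bce([1], [0.1])
false_alarm = risk_balanced_bce([0], [0.9])
```

With `fn_weight=1.0` the function reduces to ordinary binary cross-entropy, so the two errors above cost the same. Raising `fn_weight` makes the missed case proportionally more expensive, which pushes a trained model toward higher sensitivity at the cost of more false alarms.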