Multi-Attention Stacked Ensemble for Lung Cancer Detection in CT Scans
Uzzal Saha, Surya Prakash

TL;DR
This paper introduces a multi-attention stacked ensemble of deep neural networks that significantly improves lung cancer detection accuracy in CT scans, offering a robust automated tool for radiologists.
Contribution
It proposes a novel multi-level attention ensemble with a custom training strategy, achieving state-of-the-art performance in lung nodule classification.
Findings
Achieved 98.09% accuracy and 0.9961 AUC on the LIDC-IDRI dataset.
Reduced the error rate by 35% relative to previous methods.
Balanced sensitivity and specificity, especially on challenging cases.
Abstract
In this work, we address the challenge of binary lung nodule classification (benign vs. malignant) using CT images by proposing a multi-level attention stacked ensemble of deep neural networks. Three pretrained backbones (EfficientNet V2 S, MobileViT XXS, and DenseNet201) are each adapted with a custom classification head tailored to 96 × 96 pixel inputs. A two-stage attention mechanism learns both model-wise and class-wise importance scores from concatenated logits, and a lightweight meta-learner refines the final prediction. To mitigate class imbalance and improve generalization, we employ dynamic focal loss with empirically calculated class weights, MixUp augmentation during training, and test-time augmentation at inference. Experiments on the LIDC-IDRI dataset demonstrate exceptional performance, achieving 98.09% accuracy and 0.9961 AUC, representing a 35 percent reduction in…
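As a rough illustration of the pipeline the abstract describes, the pure-Python sketch below shows one way a two-stage attention fusion over concatenated logits, a class-weighted focal loss, and MixUp could be written. The function names, the attention layout (two linear maps over the concatenated logits), and the defaults gamma = 2.0 and alpha = 0.2 are assumptions made for illustration, not details taken from the paper. The meta-learner, the "dynamic" part of the loss, and test-time augmentation are left out.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fuse_logits(model_logits, model_attn, class_attn):
    """Two-stage attention over per-model logits (hypothetical layout).

    model_logits: list of M lists, each holding C class logits.
    model_attn:   (weights, bias) mapping the M*C concatenated logits to M scores.
    class_attn:   (weights, bias) mapping the M*C concatenated logits to C scores.
    Returns fused class probabilities of length C.
    """
    concat = [v for logits in model_logits for v in logits]
    # Stage 1: model-wise importance, normalised across the M backbones.
    model_w = softmax(linear(*model_attn, concat))
    num_classes = len(model_logits[0])
    blended = [sum(w * logits[c] for w, logits in zip(model_w, model_logits))
               for c in range(num_classes)]
    # Stage 2: class-wise importance rescales each blended logit.
    class_w = softmax(linear(*class_attn, concat))
    return softmax([cw * b for cw, b in zip(class_w, blended)])


def focal_loss(prob_true_class, class_weight, gamma=2.0):
    """Standard focal loss for one sample: -alpha * (1 - p)^gamma * log(p)."""
    p = min(max(prob_true_class, 1e-12), 1.0)
    return -class_weight * (1.0 - p) ** gamma * math.log(p)


def mixup(x_a, x_b, y_a, y_b, alpha=0.2, rng=random):
    """MixUp: convex combination of two samples and their one-hot labels."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


if __name__ == "__main__":
    # Three backbones, two classes (benign, malignant); untrained (zero) attention.
    logits = [[0.0, 2.0], [0.5, 1.5], [-0.2, 2.4]]
    zero_model_attn = ([[0.0] * 6 for _ in range(3)], [0.0] * 3)
    zero_class_attn = ([[0.0] * 6 for _ in range(2)], [0.0] * 2)
    probs = fuse_logits(logits, zero_model_attn, zero_class_attn)
    print("fused probabilities:", probs)
    print("focal loss on true class 1:", focal_loss(probs[1], class_weight=1.0))
```

With all attention weights at zero, both stages fall back to uniform weighting, so the fusion reduces to a scaled average of the backbones' logits. In a trained system the two linear maps would be learned jointly with the rest of the ensemble.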
