Sensitivity and Specificity Evaluation of Deep Learning Models for Detection of Pneumoperitoneum on Chest Radiographs
Manu Goyal, Judith Austin-Strohbehn, Sean J. Sun, Karen Rodriguez, Jessica M. Sin, Yvonne Y. Cheung, Saeed Hassanpour

TL;DR
This study evaluates how well deep learning models detect pneumoperitoneum on chest X-rays from 13 hospitals and multiple imaging systems. The best model, DenseNet161, reached an AUC of 95.7%.
Contribution
The paper validates four deep learning models (ResNet101, InceptionV3, DenseNet161, and ResNeXt101) for pneumoperitoneum detection across diverse imaging systems, highlighting their potential clinical utility.
Findings
DenseNet161 achieved the highest AUC, 95.7%
Models maintained high sensitivity and specificity across systems
DenseNet161 classified images from different systems with 90.8% accuracy
Abstract
Background: Deep learning has great potential to assist with detecting and triaging critical findings such as pneumoperitoneum on medical images. To be clinically useful, the performance of this technology still needs to be validated for generalizability across different types of imaging systems. Materials and Methods: This retrospective study included 1,287 chest X-ray images of patients who underwent initial chest radiography at 13 different hospitals between 2011 and 2019. The chest X-ray images were labelled independently by four radiologist experts as positive or negative for pneumoperitoneum. State-of-the-art deep learning models (ResNet101, InceptionV3, DenseNet161, and ResNeXt101) were trained on a subset of this dataset, and the automated classification performance was evaluated on the rest of the dataset by measuring the AUC, sensitivity, and specificity for each model.…
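The abstract reports AUC, sensitivity, and specificity for each model. The following is a minimal sketch of how these three metrics can be computed for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
# Illustrative sketch only: names, threshold, and data are hypothetical.

def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Return (sensitivity, specificity) after thresholding the scores.

    y_true: iterable of 0/1 labels (1 = pneumoperitoneum present).
    scores: iterable of model scores, higher means more likely positive.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, model_scores)
auc = roc_auc(labels, model_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality over all thresholds. This may be why studies of this kind report all three.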
