# Uncertainty-Aware Explainable AI for Pancreatic Cysts: Identifying Deep Learning Vulnerabilities and Ensuring Safe Clinical Triage in IPMN Management

**Authors:** Halil Ertugrul Aktas, Gorkem Durak, Andrea Mia Bejar, Ziliang Hong, Rutger Hendrix, Hongyi Pan, Elif Keles, Fergan Bol, Yavuz B Taktak, Cagdas Topel, Yalcin Tur, Zheyuan Zhang, Neil R Chatterjee, Yuri Velichko, Concetto Spampinato, Ivo G Schoots, Marco J Bruno, Chenchan Huang, Tamas Gonda, Alpay Medetalibeyoglu, Gulbiz Dagoglu Kartal, Sukru Mehmet Erturk, Lili Zhao, Candice Bolan, Frank H Miller, Michael B Wallace, Rajesh N Keswani, Ulas Bagci

PMC · DOI: 10.21203/rs.3.rs-9096790/v1 · 2026-03-22

## TL;DR

This study introduces an uncertainty-aware AI system for pancreatic cysts, showing how deep learning models can be unreliable for high-risk cases and how uncertainty quantification can help clinicians make safer decisions.

## Contribution

The authors present this as the first multi-center study to integrate explainable AI with uncertainty quantification for malignancy risk stratification of intraductal papillary mucinous neoplasms (IPMNs).

## Key findings

- Model uncertainty was significantly lower for correct predictions than for incorrect ones (0.72 vs. 0.78, p < 0.001).
- Accuracy on low-risk lesions was significantly higher than on high-risk lesions (81.3% vs. 42.9%, p < 0.001).
- Body and whole-pancreas cysts were stratified more accurately than tail cysts (77.8% and 76.6% vs. 47.8%).

## Abstract

Despite growing interest in AI for pancreatic cyst management, no prior study has systematically investigated how lesion characteristics influence model behavior or provided a comprehensive explainability analysis in IPMN malignancy risk stratification. We present the first multi-center study integrating explainable AI with uncertainty quantification to evaluate how cyst type, size, and location affect deep learning performance in IPMN malignancy prediction. Our retrospective study analyzed 170 IPMNs from seven centers using a radiomics-deep learning fusion model. Cases were stratified by dysplasia grade, IPMN type, size, and location, with model interpretability assessed using SHAP, LIME, Grad-CAM, and uncertainty quantification. Overall accuracy was 67.1%, with model uncertainty lower for correct versus incorrect predictions (0.72 vs. 0.78, p < 0.001). On subgroup analysis, low-risk lesion accuracy exceeded high-risk lesion accuracy (81.3% vs. 42.9%, p < 0.001), branch-duct (BD) accuracy was inversely correlated with cyst size while main-duct (MD) accuracy remained stable, and body and whole-pancreas cysts were stratified more accurately than tail cysts (77.8% and 76.6% vs. 47.8%). AI models face significant, previously uncharacterized performance drops when evaluating complex high-risk IPMNs. Our study shows that integrating uncertainty quantification can successfully flag unreliable predictions, enabling a selective-prediction framework in which clinicians can confidently rely on AI for low-risk, low-uncertainty lesions while deferring high-uncertainty cases to expert review.
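
The selective-prediction idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration, not the authors' method: the abstract does not say which uncertainty measure was used, so normalized binary entropy stands in for it, and the names `binary_entropy`, `triage`, and `max_uncertainty`, as well as the 0.5 cutoff, are hypothetical. The sketch also assumes that every high-risk prediction is deferred, since the abstract only endorses relying on the model for low-risk, low-uncertainty lesions.

```python
import math
from dataclasses import dataclass


@dataclass
class Triage:
    label: str          # predicted class: "low-risk" or "high-risk"
    uncertainty: float  # normalized entropy in [0, 1] (assumed measure)
    defer: bool         # True -> route the case to expert review


def binary_entropy(p: float) -> float:
    """Base-2 entropy of a Bernoulli(p): 0 = certain, 1 = maximally uncertain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def triage(p_high_risk: float, max_uncertainty: float = 0.5) -> Triage:
    """Accept only low-risk, low-uncertainty predictions; defer the rest.

    p_high_risk is the model's predicted probability of a high-risk lesion.
    max_uncertainty is a hypothetical cutoff that would need tuning on
    held-out data; it is not a value reported in the paper.
    """
    label = "high-risk" if p_high_risk >= 0.5 else "low-risk"
    uncertainty = binary_entropy(p_high_risk)
    # Assumption: high-risk predictions are always reviewed by an expert.
    defer = label == "high-risk" or uncertainty > max_uncertainty
    return Triage(label, uncertainty, defer)


if __name__ == "__main__":
    for p in (0.05, 0.40, 0.99):
        print(p, triage(p))
```

Under these assumptions, a confident low-risk prediction (`p_high_risk = 0.05`) is accepted, whereas a borderline low-risk prediction (`0.40`) and any high-risk prediction are routed to expert review. In practice the cutoff trades coverage against accuracy on the accepted cases.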

## Linked entities

- **Diseases:** IPMN (MONDO:0004286)

## Full-text entities

- **Diseases:** cyst (MESH:D003560), BD (MESH:D001528), dysplasia (MESH:D015792), MD (MESH:C535955), Pancreatic Cysts (MESH:D010181), IPMN (MESH:D000077779)

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13015609/full.md

---
Source: https://tomesphere.com/paper/PMC13015609