# Interpretable and intervenable ultrasonography-based machine learning models for pediatric appendicitis

**Authors:** Ričards Marcinkevičs, Patricia Reis Wolfertstetter, Ugne Klimiene, Kieran Chin-Cheong, Alyssia Paschke, Julia Zerres, Markus Denzinger, David Niederberger, Sven Wellmann, Ece Ozkan, Christian Knorr, Julia E. Vogt

arXiv: 2302.14460 · 2023-11-27

## TL;DR

This paper develops interpretable machine learning models that use ultrasound images to predict the diagnosis, management, and severity of suspected pediatric appendicitis, letting clinicians inspect and intervene on the model's high-level concepts without sacrificing predictive performance.

## Contribution

It extends concept bottleneck models (CBMs) to prediction problems with multiple views and incomplete concept sets, and applies them to ultrasound-based prediction for suspected appendicitis.

## Key findings

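- For predicting the diagnosis, the extended multiview CBM attained an AUROC of 0.80 and an AUPR of 0.92.
- The models are human-understandable and intervenable, and do not require time-consuming image annotation when deployed.
- Performance is comparable to similar black-box neural networks trained and tested on the same dataset.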
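
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUROC and AUPR can be computed from binary labels and predicted scores. It is not the authors' evaluation code: the function names `auroc` and `average_precision` and the toy data are illustrative, and AUPR is approximated here by average precision, which may differ from the estimator used in the paper.

```python
import numpy as np

def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    diff = pos[:, None] - neg[None, :]          # all positive-negative score pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve; assumes untied scores."""
    y_true = np.asarray(y_true, dtype=bool)
    order = np.argsort(-np.asarray(y_score, dtype=float), kind="stable")
    hits = y_true[order]                        # labels sorted by descending score
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())   # mean precision at each true positive

# Toy example (made-up labels and scores, not data from the paper).
toy_labels = [1, 1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_auroc = auroc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
```

AUPR depends on class prevalence, so it should be read alongside the positive rate in the evaluation set rather than compared directly with AUROC.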

## Abstract

Appendicitis is among the most frequent reasons for pediatric abdominal surgeries. Previous decision support systems for appendicitis have focused on clinical, laboratory, scoring, and computed tomography data and have ignored abdominal ultrasound, despite its noninvasive nature and widespread availability. In this work, we present interpretable machine learning models for predicting the diagnosis, management and severity of suspected appendicitis using ultrasound images. Our approach utilizes concept bottleneck models (CBM) that facilitate interpretation and interaction with high-level concepts understandable to clinicians. Furthermore, we extend CBMs to prediction problems with multiple views and incomplete concept sets. Our models were trained on a dataset comprising 579 pediatric patients with 1709 ultrasound images accompanied by clinical and laboratory data. Results show that our proposed method enables clinicians to utilize a human-understandable and intervenable predictive model without compromising performance or requiring time-consuming image annotation when deployed. For predicting the diagnosis, the extended multiview CBM attained an AUROC of 0.80 and an AUPR of 0.92, performing comparably to similar black-box neural networks trained and tested on the same dataset.
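
The idea of a multiview concept bottleneck with clinician intervention can be sketched as follows. This is a toy illustration under stated assumptions, not the architecture from the paper: views are represented as precomputed feature vectors rather than ultrasound images, views are fused by mean pooling, weights are random and untrained, and the class name `ToyMultiviewCBM` and all parameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ToyMultiviewCBM:
    """Toy concept bottleneck: views -> concepts -> target (untrained, illustrative)."""

    def __init__(self, feat_dim, n_concepts, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(scale=0.5, size=(feat_dim, feat_dim))    # shared per-view encoder
        self.W_con = rng.normal(scale=0.5, size=(feat_dim, n_concepts))  # concept head
        self.w_out = rng.normal(scale=0.5, size=n_concepts)              # target head

    def predict_concepts(self, views):
        # views: array of shape (n_views, feat_dim); n_views may differ per patient.
        hidden = np.tanh(np.asarray(views, dtype=float) @ self.W_enc)
        pooled = hidden.mean(axis=0)  # assumption: mean pooling fuses the views
        return sigmoid(pooled @ self.W_con)

    def predict_target(self, concepts):
        # The target depends on the input only through the concepts (the bottleneck).
        return float(sigmoid(np.asarray(concepts, dtype=float) @ self.w_out))

    def predict(self, views, interventions=None):
        concepts = self.predict_concepts(views)
        for idx, value in (interventions or {}).items():
            concepts[idx] = value     # a clinician overrides a predicted concept
        return self.predict_target(concepts), concepts

# Usage with made-up inputs: one patient with four views of eight features each.
model = ToyMultiviewCBM(feat_dim=8, n_concepts=3)
views = np.random.default_rng(1).normal(size=(4, 8))
p_before, c_before = model.predict(views)
p_after, c_after = model.predict(views, interventions={1: 1.0})
```

Because the target head sees only the concept vector, replacing a predicted concept with a clinician-supplied value changes the final prediction in a traceable way, which is what makes such a model intervenable.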

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14460/full.md

## Figures

45 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14460/full.md

## References

136 references — full list in the complete paper: https://tomesphere.com/paper/2302.14460/full.md

---
Source: https://tomesphere.com/paper/2302.14460