Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases
Xiangde Luo, Zihan Li, Shaoting Zhang, Wenjun Liao, and Guotai Wang

TL;DR
This paper introduces the RAOS dataset, a comprehensive benchmark of 413 CT scans with detailed organ annotations, to evaluate and improve the robustness of abdominal organ segmentation models, especially in challenging clinical cases.
Contribution
The paper presents the RAOS dataset, a new benchmark with challenging cases and diverse clinical scenarios, enabling systematic robustness evaluation of segmentation models.
Findings
State-of-the-art methods show varying robustness across clinical groups.
Cross-dataset generalization is limited, highlighting the need for more robust models.
RAOS provides a challenging testbed for future research in clinical segmentation robustness.
Abstract
Deep learning has enabled great strides in abdominal multi-organ segmentation, even surpassing junior oncologists on common cases or organs. However, robustness on corner cases and complex organs remains a challenging open problem for clinical adoption. To investigate model robustness, we collected and annotated the RAOS dataset comprising 413 CT scans (80k 2D images, 8k 3D organ annotations) from 413 patients each with 17 (female) or 19 (male) labelled organs, manually delineated by oncologists. We grouped scans based on clinical information into 1) diagnosis/radiotherapy (317 volumes), 2) partial excision without the whole organ missing (22 volumes), and 3) excision with the whole organ missing (74 volumes). RAOS provides a potential benchmark for evaluating model robustness including organ hallucination. It also includes some organs that can be very hard to access on…
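The abstract mentions organ hallucination, i.e. a model predicting an organ that has been surgically removed. A minimal sketch of how such a check might look is given below. It is not the authors' evaluation protocol: the function names, the integer label ids, and the flat-list mask representation are all assumptions made for illustration. Only the three group sizes are taken from the abstract.

```python
from typing import Dict, List, Sequence

# Group sizes as stated in the abstract; the key names are hypothetical.
GROUP_SIZES: Dict[str, int] = {
    "diagnosis_radiotherapy": 317,
    "partial_excision": 22,
    "whole_organ_excision": 74,
}


def dice(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice overlap for one organ label over flattened label maps.

    Assumption: returns 1.0 when the label is absent from both maps.
    """
    pred_count = sum(1 for v in pred if v == label)
    truth_count = sum(1 for v in truth if v == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if pred_count + truth_count == 0:
        return 1.0
    return 2.0 * overlap / (pred_count + truth_count)


def hallucinated_labels(
    pred: Sequence[int], truth: Sequence[int], labels: Sequence[int]
) -> List[int]:
    """Labels that appear in the prediction but not in the reference annotation."""
    pred_set, truth_set = set(pred), set(truth)
    return [lab for lab in labels if lab in pred_set and lab not in truth_set]


# Toy flattened label maps: 0 = background, 1 and 2 = hypothetical organ ids.
truth = [0, 1, 1, 0, 0, 0]  # organ 2 is missing (e.g. excised)
pred = [0, 1, 0, 2, 2, 0]   # the model still predicts organ 2

total_scans = sum(GROUP_SIZES.values())
dice_organ_1 = dice(pred, truth, 1)
dice_organ_2 = dice(pred, truth, 2)
hallucinated = hallucinated_labels(pred, truth, [1, 2])
```

In this toy case organ 2 is flagged as hallucinated and scores a Dice of 0, which is why a whole-organ-missing group can expose failures that an average Dice over intact organs would hide.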
Taxonomy
Topics: Colorectal Cancer Screening and Detection
