SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

Hao Ding; Yuqian Zhang; Tuxun Lu; Ruixing Liang; Hongchao Shu; Lalithkumar Seenivasan; Yonghao Long; Qi Dou; Cong Gao; Yicheng Leng; Seok Bong Yoo; Eung-Joo Lee; Negin Ghamsarian; Klaus Schoeffmann; Raphael Sznitman; Zijian Wu; Yuxin Chen; Septimiu E. Salcudean; Samra Irshad; Shadi Albarqouni; Seong Tae Kim; Yueyi Sun; An Wang; Long Bai; Hongliang Ren; Ihsan Ullah; Ho-Gun Ha; Attaullah Khan; Hyunki Lee; Satoshi Kondo; Satoshi Kasai; Kousuke Hirasawa; Sita Tailor; Ricardo Sanchez-Matilla; Imanol Luengo; Tianhao Fu; Jun Ma; Bo Wang; Marcos Fernández-Rodríguez; Estevao Lima; João L. Vilaça; Mathias Unberath

arXiv:2407.11906·cs.CV·May 12, 2026

TL;DR

SegSTRONG-C is a challenge focused on evaluating and improving the robustness of surgical tool segmentation models against non-adversarial corruptions using a specialized dataset and community benchmarking.

Contribution

It introduces a new dataset and challenge for assessing robustness of surgical tool segmentation models under realistic corruptions, highlighting effective strategies and future directions.

Findings

01

Top models achieved over 93% DSC and NSD on corrupted test sets.

02

Prior knowledge and architectural choices significantly improve robustness.

03

Conventional data augmentation methods have limitations in non-adversarial robustness.

Abstract

Surgical data science has seen rapid advancement with the excellent performance of end-to-end deep neural networks (DNNs). Despite their successes, DNNs have been proven susceptible to minor "corruptions," introducing a major concern for the translation of cutting-edge technology, especially in high-stakes scenarios. We introduce the SegSTRONG-C challenge dedicated to better understanding model deterioration under unforeseen but plausible non-adversarial "corruption" and the capabilities of contemporary methods that seek to improve it. Built on a dataset generated through counterfactual robotic replay, SegSTRONG-C provides paired clean and "corrupted" samples, enabling reproducible evaluation of model robustness. Participants are challenged to train tool segmentation algorithms on "uncorrupted" data and evaluate them on "corrupted" test domains for the binary robot tool segmentation…
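The evaluation idea described above, scoring a binary tool segmentation model separately on clean and corrupted test domains, can be illustrated with a minimal sketch. It computes only the Dice similarity coefficient (DSC) on toy masks; it is not the challenge's official evaluation code. The function names `dice_score` and `robustness_report`, the domain labels `"clean"` and `"corrupted"`, and the assumption that paired samples share one ground-truth mask are all illustrative choices, not taken from the paper.

```python
def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention assumed here).
        return 1.0
    return 2.0 * intersection / total


def robustness_report(ground_truth, preds_by_domain):
    """Mean DSC per test domain.

    Assumes (hypothetically) that every domain contains the same frames in the
    same order, so one list of ground-truth masks serves all domains.
    """
    report = {}
    for domain, preds in preds_by_domain.items():
        scores = [dice_score(p, g) for p, g in zip(preds, ground_truth)]
        report[domain] = sum(scores) / len(scores)
    return report


# Toy example: one 2x3 frame, predicted once on clean and once on corrupted input.
gt = [[[0, 1, 1], [0, 1, 0]]]
preds = {
    "clean": [[[0, 1, 1], [0, 1, 0]]],      # matches the ground truth exactly
    "corrupted": [[[0, 1, 0], [0, 1, 0]]],  # misses one tool pixel
}
report = robustness_report(gt, preds)
print(report)
```

The gap between the clean and corrupted scores is one simple way to read off how much a model deteriorates under corruption; a fuller evaluation would also include a boundary-based metric such as NSD.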
