CARES: Collaborative Agentic Reasoning for Error Detection in Surgery

Chang Han Low; Zhu Zhuo; Ziyue Wang; Jialang Xu; Haofeng Liu; Nazir Sirajudeen; Matthew Boal; Philip J. Edwards; Danail Stoyanov; Nader Francis; Jiehui Zhong; Di Gu; Evangelos B. Mazomenos; Yueming Jin

arXiv:2508.08764 · cs.MA · August 13, 2025

Open Access

TL;DR

This paper introduces CARES, a zero-shot agentic reasoning framework that detects surgical errors using clinically informed prompts and no task-specific training, together with MERP, a new dataset for error detection in robotic prostatectomy.

Contribution

It presents a novel zero-shot, clinically informed agentic reasoning architecture for multi-class surgical error detection and introduces the MERP dataset, which provides frame-level annotations across six clinically aligned error categories.

Findings

01. Achieved 54.3 mF1 on the RARP dataset.

02. Outperformed existing zero-shot methods by up to 14%.

03. Demonstrated the effectiveness of specialized agents and risk-aware routing.
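
The mF1 figure above is read here as a macro-averaged F1, that is, the unweighted mean of per-class F1 scores. That reading is an assumption; this summary does not define the metric. The sketch below shows how such a score could be computed for single-label frame predictions. The function name `macro_f1` and the placeholder labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed meaning of mF1).

    A class with no true positives, false positives, or false
    negatives contributes an F1 of 0.0.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Placeholder labels; the dataset's actual category names are not listed here.
labels = ["class_a", "class_b", "class_c"]
y_true = ["class_a", "class_a", "class_b", "class_c"]
y_pred = ["class_a", "class_b", "class_b", "class_c"]
score = macro_f1(y_true, y_pred, labels)
print(round(100 * score, 1))  # scaled to 0-100, as scores like 54.3 usually are
```

Because every class is weighted equally, a rare error category affects the score as much as a frequent one, which is why macro averaging is a common choice for imbalanced multi-class problems.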

Abstract

Robotic-assisted surgery (RAS) introduces complex challenges that current surgical error detection methods struggle to address effectively due to limited training data and methodological constraints. Therefore, we construct MERP (Multi-class Error in Robotic Prostatectomy), a comprehensive dataset for error detection in robotic prostatectomy with frame-level annotations featuring six clinically aligned error categories. In addition, we propose CARES (Collaborative Agentic Reasoning for Error Detection in Surgery), a novel zero-shot clinically-informed and risk-stratified agentic reasoning architecture for multi-class surgical error detection. CARES implements adaptive generation of medically informed, error-specific Chain-of-Thought (CoT) prompts across multiple expertise levels. The framework employs risk-aware routing to assign each error task to expertise-matched reasoning pathways based…

Taxonomy

Topics: Surgical Simulation and Training · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education