Sum-of-Checks: Structured Reasoning for Surgical Safety with Large Vision-Language Models

Weiqiu You; Cassandra Goldberg; Amin Madani; Daniel A. Hashimoto; Eric Wong

arXiv:2604.22156·cs.LG·April 27, 2026

TL;DR

Sum-of-Checks enhances surgical safety assessment by decomposing complex visual reasoning into expert-defined checks, improving accuracy and transparency of large vision-language models in critical laparoscopic procedures.

Contribution

The paper introduces a structured framework that decomposes surgical reasoning into verifiable checks, improving the reliability and auditability of AI in surgical safety assessment.

Findings

01. Sum-of-Checks improves average frame-level mean average precision by 12-14%.

02. LVLMs are reliable on observational checks but variable on anatomical evidence.

03. Explicitly separating evidence from decision-making enhances AI transparency in surgery.

Abstract

Purpose: Accurate assessment of the Critical View of Safety (CVS) during laparoscopic cholecystectomy is essential to prevent bile duct injury, a complication associated with significant morbidity and mortality. While large vision-language models (LVLMs) offer flexible reasoning, their predictions remain difficult to audit and unreliable on safety-critical surgical tasks. Methods: We introduce Sum-of-Checks, a framework that decomposes each CVS criterion into expert-defined reasoning checks reflecting clinically relevant visual evidence. Given a laparoscopic frame, an LVLM evaluates each check, producing a binary judgment and justification. Criterion-level scores are computed via fixed, weighted aggregation of check outcomes. We evaluate on the Endoscapes2023 benchmark using three frontier LVLMs, comparing against direct prompting, chain-of-thought, and sub-question decomposition,…
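The Methods portion of the abstract describes a per-check evaluation followed by a fixed, weighted aggregation. The Python sketch below illustrates that flow under stated assumptions and is not the authors' implementation: the names `Check`, `CheckResult`, `score_criterion`, and `stub_judge` are hypothetical, the check questions and weights are invented for illustration, the LVLM call is replaced by a canned stub, and normalizing the score by the total weight is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Check:
    name: str        # short identifier (hypothetical)
    question: str    # yes/no question posed to the model
    weight: float    # fixed weight, chosen before any frame is seen


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool         # binary judgment for this check
    justification: str   # free-text rationale kept for auditing


# A judge maps (frame, question) to (binary judgment, justification).
# In practice this would wrap an LVLM call; here it is left abstract.
Judge = Callable[[object, str], Tuple[bool, str]]


def score_criterion(frame, checks: List[Check], judge: Judge):
    """Return (score in [0, 1], per-check results) for one criterion.

    Assumption: the score is the weight of passed checks divided by
    the total weight. The source only says "fixed, weighted aggregation".
    """
    total_weight = sum(c.weight for c in checks)
    if total_weight <= 0:
        raise ValueError("check weights must sum to a positive value")

    results = []
    passed_weight = 0.0
    for check in checks:
        passed, justification = judge(frame, check.question)
        results.append(CheckResult(check.name, passed, justification))
        if passed:
            passed_weight += check.weight
    return passed_weight / total_weight, results


# Hypothetical checks for a single criterion (not taken from the paper).
CHECKS = [
    Check("two_structures",
          "Are exactly two tubular structures seen entering the gallbladder?", 2.0),
    Check("tissue_cleared",
          "Is the hepatocystic triangle free of fat and fibrous tissue?", 1.0),
    Check("view_unobstructed",
          "Is the view free of instruments or smoke?", 1.0),
]

# Canned answers standing in for model output.
CANNED = {
    CHECKS[0].question: True,
    CHECKS[1].question: False,
    CHECKS[2].question: True,
}


def stub_judge(frame, question):
    passed = CANNED[question]
    return passed, f"stub answer: {'yes' if passed else 'no'}"


score, results = score_criterion(frame=None, checks=CHECKS, judge=stub_judge)
print(score)  # 0.75, since checks weighing 2.0 and 1.0 pass out of 4.0 total
for r in results:
    print(r.name, r.passed, r.justification)
```

Because the aggregation is fixed rather than learned, each criterion score can be traced back to the individual judgments and justifications in `results`, which is the kind of separation between evidence and decision that the findings above describe.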


Code & Models

Repositories

BrachioLab/SumOfChecks (GitHub)
