A Comparative Study in Surgical AI: Potential and Limitations of Data, Compute, and Scaling

Kirill Skobelev; Eric Fithian; Yegor Baranovski; Jack Cook; Sandeep Angara; Shauna Otto; Zhuang-Fang Yi; John Zhu; Daniel A. Donoho; X.Y. Han; Neeraj Mainkar; Margaux Masson-Forsythe

arXiv:2603.27341·cs.AI·May 19, 2026

TL;DR

This study evaluates current AI models on surgical tasks. It finds limited tool-detection accuracy and diminishing returns from scaling, both of which point to significant obstacles for surgical AI deployment.

Contribution

The paper provides a case study of surgical tool detection with state-of-the-art models, emphasizing the challenges and limitations of scaling AI for surgical applications.

Findings

01 Current models fall short in neurosurgical tool detection.

02 Scaling model size yields diminishing performance gains.

03 Obstacles in surgical AI are not solely due to data or compute limitations.

Abstract

Recent Artificial Intelligence (AI) models have matched or exceeded human experts on several benchmarks of biomedical task performance, but surgical benchmarks in particular are often missing from prominent medical benchmark suites. Since surgery requires integrating disparate tasks, generally capable AI models could be particularly attractive as a collaborative tool if performance could be improved. On the one hand, the canonical approach of scaling architecture size and training data is attractive, especially since millions of hours of surgical video data are generated per year. On the other hand, preparing surgical data for AI training requires significantly higher levels of professional expertise, and training on that data requires expensive computational resources. These trade-offs paint an uncertain picture of whether and to what extent modern AI could aid surgical practice.…
