Deep learning approaches to surgical video segmentation and object detection: A Scoping Review
Devanish N. Kamtam, Joseph B. Shrager, Satya Deepya Malla, Nicole Lin, Juan J. Cardona, Jake J. Kim, Clarence Hu

TL;DR
This scoping review evaluates deep learning models for surgical video segmentation and object detection, highlighting progress in real-time applications for larger organs and identifying challenges with smaller structures and data limitations.
Contribution
It provides a comprehensive overview of the current state-of-the-art in DL-based surgical video segmentation, including model performance, clinical relevance, and existing challenges.
Findings
Semantic segmentation is the primary computer vision (CV) task in surgical videos.
U-Net and DeepLab are the most widely used models.
Models achieve higher accuracy on larger organs like the liver.
Abstract
Introduction: Computer vision (CV) has had a transformative impact in biomedical fields such as radiology, dermatology, and pathology. Its real-world adoption in surgical applications, however, remains limited. We review the current state-of-the-art performance of deep learning (DL)-based CV models for segmentation and object detection of anatomical structures in videos obtained during surgical procedures. Methods: We conducted a scoping review of studies on semantic segmentation and object detection of anatomical structures published between 2014 and 2024 from three major databases: PubMed, Embase, and IEEE Xplore. The primary objective was to evaluate the state-of-the-art performance of semantic segmentation in surgical videos. Secondary objectives included examining DL models, progress toward clinical applications, and the specific challenges with segmentation of organs/tissues in…
Taxonomy
Topics: Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging
Methods: Dense Connections · Dilated Convolution · Feedforward Network · Conditional Random Field · Max Pooling · DeepLab · Convolution · Concatenated Skip Connection · U-Net
