Identifying Surgical Instruments in Pedagogical Cataract Surgery Videos through an Optimized Aggregation Network

Sanya Sinha; Michal Balazia; Francois Bremond

arXiv:2501.02618·cs.CV·January 7, 2025

Open Access

TL;DR

This paper introduces a deep learning model based on the YOLOV9 architecture, enhanced with Programmable Gradient Information (PGI) and Go-ELAN, for real-time identification of surgical instruments in cataract surgery videos. It achieves a mAP of 73.74 at IoU 0.5 on a custom dataset of 615 images with 10 instrument classes.

Contribution

The paper proposes a new optimized aggregation network, the Generally-Optimized Efficient Layer Aggregation Network (Go-ELAN), and pairs it with a PGI mechanism in a YOLOV9-inspired architecture to improve instrument detection in surgical videos.

Findings

01. Achieved a mAP of 73.74 at IoU 0.5 on the dataset.

02. Outperformed YOLO v5, v7, v8, vanilla YOLOV9, Laptool, and DETR.

03. Demonstrated real-time detection capability with high accuracy.

Abstract

Instructional cataract surgery videos are crucial for ophthalmologists and trainees to observe surgical details repeatedly. This paper presents a deep learning model for real-time identification of surgical instruments in these videos, using a custom dataset scraped from open-access sources. Inspired by the architecture of YOLOV9, the model employs a Programmable Gradient Information (PGI) mechanism and a novel Generally-Optimized Efficient Layer Aggregation Network (Go-ELAN) to address the information bottleneck problem, enhancing mean Average Precision (mAP) at higher Non-Maximum Suppression Intersection over Union (NMS IoU) scores. The Go-ELAN YOLOV9 model, evaluated against YOLO v5, v7, v8, vanilla YOLOV9, Laptool, and DETR, achieves a superior mAP of 73.74 at IoU 0.5 on a dataset of 615 images with 10 instrument classes, demonstrating the effectiveness of the proposed model.
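To make the reported metric concrete, the following is a minimal sketch of how mAP at an IoU threshold of 0.5 is commonly computed. It is not the authors' evaluation code. The function names, the corner-format boxes, the greedy matching to the best unmatched ground-truth box, and the all-point interpolation are all assumptions chosen for illustration; evaluation toolkits differ in these details.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: List[Tuple[float, Box]], truths: List[Box], iou_thr: float = 0.5
) -> float:
    """AP for one class: area under the precision-recall curve.

    preds holds (confidence, box) pairs; truths holds ground-truth boxes.
    Simplification: all boxes are assumed to come from a single image.
    """
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    hits: List[int] = []
    # Visit predictions from most to least confident.
    for _score, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            overlap = iou(box, truth)
            if not matched[j] and overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True  # true positive
            hits.append(1)
        else:
            hits.append(0)  # false positive
    # Running precision and recall after each prediction.
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(truths))
    # Make precision non-increasing from right to left (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap) if per_class_ap else 0.0
```

In this sketch a prediction counts as correct only if it overlaps an unmatched ground-truth box with IoU of at least 0.5, so the AP of each instrument class depends on both localization and confidence ranking; the mean over the classes gives the mAP.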

Taxonomy

Topics: Digital Imaging in Medicine · Surgical Simulation and Training