Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection
Duygu Sarikaya, Jason J. Corso, Khurshid A. Guru

TL;DR
This paper presents a deep learning approach for detecting and localizing robotic tools in robot-assisted surgery (RAS) videos, reporting 91% Average Precision at 0.1 seconds per frame, and introduces a new dataset for RAS video analysis.
Contribution
To the authors' knowledge, the first work to apply deep neural networks to tool detection and localization in RAS videos. The architecture combines a Region Proposal Network (RPN) with a multimodal two-stream convolutional network.
Findings
Average Precision of 91% in tool detection
Detection speed of 0.1 seconds per frame (about 10 frames per second)
Superior to conventional medical imaging methods
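For readers unfamiliar with the Average Precision figure above, the following is a minimal sketch of how such a metric is commonly computed for bounding-box detections. It is not the authors' evaluation code. The IoU threshold of 0.5, the all-point interpolation, the single-frame setting, and the function names `iou` and `average_precision` are all assumptions chosen for illustration; the summary does not state which protocol the paper used.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> float:
    """Area under the precision-recall curve for one class in one frame."""
    if not ground_truth:
        return 0.0

    matched = [False] * len(ground_truth)
    tp_flags: List[bool] = []

    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        # A detection is a true positive only if it overlaps an unmatched box.
        if best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each ranked detection.
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision non-increasing from right to left (interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [
        (0.9, (0, 0, 10, 10)),    # matches the first tool
        (0.8, (50, 50, 60, 60)),  # false positive
        (0.7, (20, 20, 30, 30)),  # matches the second tool
    ]
    print(f"AP = {average_precision(dets, gt_boxes):.4f}")
```

A full evaluation would pool detections across all frames of a test set before ranking them; the single-frame version here only keeps the sketch short.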
Abstract
Video understanding of robot-assisted surgery (RAS) videos is an active research area. Modeling the gestures and skill level of surgeons presents an interesting problem. The insights drawn may be applied in effective skill acquisition, objective skill assessment, real-time feedback, and human-robot collaborative surgeries. We propose a solution to the tool detection and localization open problem in RAS video understanding, using a strictly computer vision approach and the recent advances of deep learning. We propose an architecture using multimodal convolutional neural networks for fast detection and localization of tools in RAS videos. To our knowledge, this approach will be the first to incorporate deep neural networks for tool detection and localization in RAS videos. Our architecture applies a Region Proposal Network (RPN), and a multi-modal two stream convolutional network for…
Taxonomy
Methods: Region Proposal Network
