A Novel Deep ML Architecture by Integrating Visual Simultaneous Localization and Mapping (vSLAM) into Mask R-CNN for Real-time Surgical Video Analysis
Ella Selina Lan

TL;DR
This paper introduces vSLAM-CNN, a novel deep learning architecture that integrates vSLAM with Mask R-CNN for real-time surgical video analysis, significantly improving accuracy and speed over previous methods.
Contribution
The paper presents the first integration of vSLAM with Mask R-CNN, combining geometric and semantic features for enhanced real-time surgical tool and workflow detection.
Findings
Achieved 96.8 mAP for tool detection
Reached 97.5 mean Jaccard score for workflow detection
Operates at 50 FPS, 10x faster than traditional CNNs
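For readers unfamiliar with the workflow metric above, the following is a minimal sketch of a mean Jaccard score computed over frame-level phase labels. It assumes the common per-phase definition (frames where prediction and ground truth both carry the phase, divided by frames where either does, averaged over phases and reported in percent). The paper's exact evaluation protocol is not given in this summary, and the function name `mean_jaccard` and the phase names are hypothetical.

```python
from typing import Sequence


def mean_jaccard(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Mean per-phase Jaccard index, in percent, over frame-level labels.

    Assumption: one label per video frame, and the mean is taken over the
    phases that occur in the ground truth.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    if not truth:
        raise ValueError("at least one frame is required")

    scores = []
    for label in sorted(set(truth)):
        # Frames where both sequences carry this phase.
        inter = sum(p == label and t == label for p, t in zip(pred, truth))
        # Frames where either sequence carries this phase (never zero here,
        # because the label occurs in the ground truth).
        union = sum(p == label or t == label for p, t in zip(pred, truth))
        scores.append(inter / union)
    return 100.0 * sum(scores) / len(scores)


# Hypothetical five-frame clip with three made-up phase names.
truth = ["prep", "prep", "cut", "cut", "close"]
pred = ["prep", "cut", "cut", "cut", "close"]
score = mean_jaccard(pred, truth)
```

In this toy clip the per-phase scores are 1/1 for "close", 2/3 for "cut", and 1/2 for "prep", so the mean is about 72.2. A single mislabeled frame lowers the score of both phases it touches, which is why the metric is sensitive to errors at phase boundaries.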
Abstract
Seven million people suffer surgical complications each year, but with sufficient surgical training and review, 50% of these complications could be prevented. To improve surgical performance, existing research uses various deep learning (DL) technologies, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), to automate surgical tool and workflow detection. However, there is room to improve accuracy, and real-time analysis remains limited because of the complexity of CNNs. This research proposes a novel DL architecture that integrates visual simultaneous localization and mapping (vSLAM) into Mask R-CNN. This architecture, vSLAM-CNN (vCNN), for the first time combines the best of both worlds: (1) vSLAM for object detection, by focusing on geometric information for region proposals, and (2) CNN for object recognition, by focusing on semantic…
Taxonomy
Topics: Surgical Simulation and Training · Augmented Reality Applications · Anatomy and Medical Technology
Methods: Region Proposal Network · Softmax · RoIAlign · Convolution · Mask R-CNN
