Automatic Operating Room Surgical Activity Recognition for Robot-Assisted Surgery
Aidean Sharghi, Helene Haugerud, Daniel Oh, Omid Mohareri

TL;DR
This paper presents a large-scale dataset and a deep learning approach for automatic recognition of surgical activities in robot-assisted surgery, reaching 88% mean Average Precision as a step toward workflow monitoring and support systems in the operating room.
Contribution
The study introduces the first large-scale multi-view dataset of robot-assisted surgery (400 full-length videos, densely annotated with 10 activity classes) and adapts state-of-the-art action recognition models to the operating room.
Findings
Achieved 88% mean Average Precision in activity recognition
Developed a novel combination of I3D, Gaussian Mixture, and LSTM models
Provided insights into model performance and potential for real-time applications
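For readers unfamiliar with the headline metric, the following is a minimal sketch of how mean Average Precision is commonly computed: per-class Average Precision over ranked predictions, then a mean over classes. The function names, the toy scores, and the convention of skipping classes with no positives are illustrative assumptions, not the authors' evaluation code.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one class: mean of the precision values measured at each positive,
    after ranking samples by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix: Sequence[Sequence[float]],
                           label_matrix: Sequence[Sequence[int]]) -> float:
    """mAP: mean of per-class AP. Rows are samples (e.g. clips), columns are classes.
    Classes without any positive sample are skipped (an assumed convention)."""
    num_classes = len(score_matrix[0])
    aps: List[float] = []
    for c in range(num_classes):
        scores = [row[c] for row in score_matrix]
        labels = [row[c] for row in label_matrix]
        if any(labels):
            aps.append(average_precision(scores, labels))
    return sum(aps) / len(aps) if aps else 0.0


# Toy example: 4 clips, 2 hypothetical activity classes.
toy_scores = [[0.9, 0.1], [0.8, 0.7], [0.7, 0.2], [0.6, 0.9]]
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 1]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

In the toy data, class 0 has one negative ranked above a positive, so its AP drops below 1, while class 1 is ranked perfectly; the mAP is the average of the two.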
Abstract
Automatic recognition of surgical activities in the operating room (OR) is a key technology for creating next-generation intelligent surgical devices and workflow monitoring/support systems. Such systems can potentially enhance efficiency in the OR, resulting in lower costs and improved care delivery to patients. In this paper, we investigate automatic surgical activity recognition in robot-assisted operations. We collect the first large-scale dataset, including 400 full-length multi-perspective videos from a variety of robotic surgery cases captured using Time-of-Flight cameras. We densely annotate the videos with the 10 most recognized and clinically relevant classes of activities. Furthermore, we investigate state-of-the-art computer vision action recognition techniques and adapt them for the OR environment and the dataset. First, we fine-tune the Inflated 3D ConvNet (I3D) for…
Taxonomy
Methods: Memory Network
