Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows
Yutong Ban, Guy Rosman, Thomas Ward, Daniel Hashimoto, Taisei Kondo, Hidekazu Iwaki, Ozanan Meireles, Daniela Rus

TL;DR
This paper introduces a new temporal neural network architecture that effectively captures long-term dependencies in surgical workflow analysis, improving phase recognition accuracy in laparoscopic and robotic surgeries.
Contribution
It proposes a task-specific network with a sufficient statistics model integrated into an LSTM backbone, enhancing long-term context aggregation for surgical workflow recognition.
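As a rough illustration of what aggregating long-term context can mean in practice, the sketch below keeps causal running statistics (mean and max) of per-frame phase probabilities and appends them to each frame's features before they would be passed to a temporal model such as an LSTM. This is not the paper's sufficient statistics model: the choice of statistics and the names `running_phase_statistics` and `augment_features` are assumptions made purely for illustration.

```python
from typing import List


def running_phase_statistics(frame_probs: List[List[float]]) -> List[List[float]]:
    """For each frame t, return the running mean followed by the running max
    of the per-phase probabilities over frames 0..t.

    Only past and current frames are used, so this could run online.
    Hypothetical helper; the statistics are an illustrative assumption.
    """
    if not frame_probs:
        return []
    num_phases = len(frame_probs[0])
    sums = [0.0] * num_phases
    maxima = [0.0] * num_phases
    stats = []
    for t, probs in enumerate(frame_probs):
        for k, p in enumerate(probs):
            sums[k] += p
            maxima[k] = max(maxima[k], p)
        means = [s / (t + 1) for s in sums]
        stats.append(means + list(maxima))
    return stats


def augment_features(frame_feats: List[List[float]],
                     stats: List[List[float]]) -> List[List[float]]:
    """Concatenate each frame's visual features with its long-term statistics.

    The result is what a temporal model would consume in this sketch.
    """
    return [feats + s for feats, s in zip(frame_feats, stats)]


# Two frames, two phases: frame 0 looks like phase 0, frame 1 like phase 1.
probs = [[1.0, 0.0], [0.0, 1.0]]
stats = running_phase_statistics(probs)
augmented = augment_features([[0.2], [0.7]], stats)
```

With these inputs, the second frame's statistics record that both phases have already been observed with high confidence, which is information a per-frame model alone would not carry.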
Findings
Outperforms existing state-of-the-art segmentation methods.
Achieves superior accuracy on Cholec80 and MGH100 datasets.
Demonstrates robustness on challenging, clinically meaningful labels.
Abstract
Analyzing surgical workflow is crucial for surgical assistance robots to understand surgeries. With an understanding of the complete surgical workflow, the robots are able to assist surgeons in intra-operative events, such as by giving a warning when the surgeon is entering specific key or high-risk phases. Deep learning techniques have recently been widely applied to recognizing surgical workflows. Many of the existing temporal neural network models are limited in their capability to handle long-term dependencies in the data, relying instead on the strong performance of the underlying per-frame visual models. We propose a new temporal network structure that leverages task-specific network representation to collect long-term sufficient statistics that are propagated by a sufficient statistics model (SSM). We implement our approach within an LSTM backbone for the task of…
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
