Multi-Task Temporal Convolutional Networks for Joint Recognition of Surgical Phases and Steps in Gastric Bypass Procedures
Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

TL;DR
This paper introduces a multi-task temporal convolutional network that jointly recognizes surgical phases and steps in gastric bypass videos, improving accuracy over single-task models and demonstrating the benefits of multi-level activity modeling.
Contribution
The work presents a novel multi-task multi-stage TCN for joint recognition of surgical phases and steps, leveraging their complementarity for better activity understanding.
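To make the idea of joint recognition concrete, the following is a minimal sketch of a per-frame multi-task objective: one set of logits for phases, one for steps, and a loss that sums the two cross-entropy terms. This is an illustration under stated assumptions, not the authors' implementation. The function names, the `step_weight` parameter, the equal default weighting, and the toy class counts are all hypothetical; the summary above does not specify the loss used in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the target class for a single frame."""
    return -math.log(softmax(logits)[target])


def joint_loss(phase_logits, step_logits, phase_labels, step_labels,
               step_weight=1.0):
    """Mean over frames of CE(phase) + step_weight * CE(step).

    step_weight is a hypothetical knob (assumption): 1.0 weights both
    tasks equally.
    """
    n = len(phase_labels)
    total = 0.0
    for p_log, s_log, p_y, s_y in zip(phase_logits, step_logits,
                                      phase_labels, step_labels):
        total += cross_entropy(p_log, p_y)
        total += step_weight * cross_entropy(s_log, s_y)
    return total / n


# Toy 3-frame clip with 2 phases and 3 steps (made-up numbers).
phase_logits = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0]]
step_logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
phase_labels = [0, 0, 1]
step_labels = [0, 1, 2]

loss = joint_loss(phase_logits, step_logits, phase_labels, step_labels)
print(f"joint loss: {loss:.4f}")
```

In a trained model both sets of logits would come from a shared temporal backbone, so gradients from the phase term and the step term update the same features. That sharing is what lets the two levels of granularity inform each other.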
Findings
MTMS-TCN outperforms single-task models in accuracy, precision, and recall.
MTMS-TCN achieves 3-6% higher performance than LSTM-based models on step recognition.
Joint modeling of phases and steps improves overall surgical activity recognition.
Abstract
Purpose: Automatic segmentation and classification of surgical activity is crucial for providing advanced support in computer-assisted interventions and autonomous functionalities in robot-assisted surgeries. Prior works have focused on recognizing either coarse activities, such as phases, or fine-grained activities, such as gestures. This work aims at jointly recognizing two complementary levels of granularity directly from videos, namely phases and steps. Method: We introduce two correlated surgical activities, phases and steps, for the laparoscopic gastric bypass procedure. We propose a Multi-task Multi-Stage Temporal Convolutional Network (MTMS-TCN) along with a multi-task Convolutional Neural Network (CNN) training setup to jointly predict the phases and steps and benefit from their complementarity to better evaluate the execution of the procedure. We evaluate the proposed method…
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
