Bounded Future MS-TCN++ for surgical gesture recognition
Adam Goldbraikh, Netanell Avisdris, Carla M. Pugh, Shlomi Laufer

TL;DR
This paper introduces a bounded future MS-TCN++ model for surgical gesture recognition, optimizing the trade-off between response delay and accuracy in real-time surgical video analysis.
Contribution
The study proposes a novel MS-TCN++ based approach that effectively manages the performance-delay trade-off for online surgical gesture recognition.
Findings
Limiting the future frames the network can access yields better recognition performance than reducing network depth.
Naive reduction of network depth is sub-optimal for small delays.
The proposed method outperforms naive approaches in delay-sensitive scenarios.
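The depth-versus-delay relationship behind these findings can be illustrated with a small calculation. This is only a sketch under simplifying assumptions that are not taken from the paper: a single stack of kernel-size-3 temporal convolutions whose dilation doubles at each layer, with the future bound applied per layer. The function name `future_frames` and the parameter `max_future_per_layer` are hypothetical.

```python
def future_frames(num_layers, max_future_per_layer=None):
    """Count the future frames a stack of dilated temporal convolutions needs.

    Assumptions (illustrative, not the paper's exact architecture):
    - kernel size 3, so an acausal layer looks `dilation` frames ahead
    - dilation doubles per layer: 1, 2, 4, ...
    - an optional bound caps how far ahead each layer may look
    """
    total = 0
    for layer in range(num_layers):
        dilation = 2 ** layer
        if max_future_per_layer is None:
            look_ahead = dilation  # fully acausal layer
        else:
            look_ahead = min(dilation, max_future_per_layer)
        total += look_ahead
    return total


# Naive approach: shrink the network until the delay is acceptable.
print(future_frames(10))     # 1023 future frames for a full 10-layer stack
print(future_frames(4))      # 15 future frames, but only 4 layers remain

# Bounded future: keep all 10 layers and cap each layer's look-ahead.
print(future_frames(10, 2))  # 19 future frames with the full depth
```

Under these assumptions, a comparable delay budget can be met either by cutting the stack to 4 layers or by keeping all 10 layers with a per-layer cap, which suggests why bounding the future leaves more freedom in the network design than reducing depth.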
Abstract
Video-based applications for surgical purposes are developing rapidly. Some of these applications can work offline after the procedure ends, while others must react immediately. However, in some cases the response is needed during the procedure but some delay is acceptable. The online-offline performance gap is known in the literature. Our goal in this study was to learn the performance-delay trade-off and design an MS-TCN++-based algorithm that can exploit it. To this end, we used our open surgery simulation dataset containing 96 videos of 24 participants performing a suturing task on a variable tissue simulator. In this study, we used video data captured from the side view. The networks were trained to identify the performed surgical gestures. The naive approach is to reduce the MS-TCN++ depth; as a result, the…
Taxonomy
Topics: Surgical Simulation and Training · Human Pose and Action Recognition · Augmented Reality Applications
