TL;DR
This paper introduces self-supervised learning methods based on temporal coherence for laparoscopic workflow analysis, reducing the need for annotated data and improving neural network performance in surgical phase segmentation.
Contribution
It presents and compares different self-supervised pretraining approaches using unlabeled videos, enhancing surgical workflow analysis without extensive annotations.
Findings
Achieved a maximum F1 score of 84.6 on the Cholec80 dataset.
Pretraining increased the F1 score by up to 10 points.
Demonstrated effectiveness of temporal coherence-based pretraining in surgical video analysis.
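For readers unfamiliar with how an F1 score on a phase segmentation task might be reported in "points", the following is a minimal sketch. It assumes frame-level phase labels and a macro average over phases, scaled to 0-100; the averaging protocol actually used in the paper is not stated in this summary, and the function names `f1_per_phase` and `macro_f1` are illustrative only.

```python
def f1_per_phase(y_true, y_pred, phase):
    """F1 for one phase, treating it as the positive class (frame-level)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == phase and p == phase)
    fp = sum(1 for t, p in pairs if t != phase and p == phase)
    fn = sum(1 for t, p in pairs if t == phase and p != phase)
    if tp == 0:
        # No correct detections: precision and/or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-phase F1 over phases present in y_true, in points (0-100).

    Assumption: macro averaging; other protocols (e.g. per-video averaging) exist.
    """
    phases = sorted(set(y_true))
    return 100.0 * sum(f1_per_phase(y_true, y_pred, ph) for ph in phases) / len(phases)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]       # ground-truth phase per frame
    predicted = [0, 1, 1, 1]   # predicted phase per frame
    print(round(macro_f1(truth, predicted), 2))
```

Under this convention, a gain of "10 points" means a difference of 10 on the 0-100 scale, not a relative 10% improvement.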
Abstract
In order to provide the right type of assistance at the right time, computer-assisted surgery systems need context awareness. To achieve this, methods for surgical workflow analysis are crucial. Currently, convolutional neural networks provide the best performance for video-based workflow analysis tasks. For training such networks, large amounts of annotated data are necessary. However, collecting a sufficient amount of data is often costly, time-consuming, and not always feasible. In this paper, we address this problem by presenting and comparing different approaches for self-supervised pretraining of neural networks on unlabeled laparoscopic videos using temporal coherence. We evaluate our pretrained networks on Cholec80, a publicly available dataset for surgical phase segmentation, on which a maximum F1 score of 84.6 was reached. Furthermore, we were able to achieve an increase of…
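To make the idea of temporal coherence concrete, here is a minimal sketch of one common formulation: a contrastive loss that pulls together the embeddings of frames close in time and pushes apart those far in time. This is an illustration under assumptions, not the paper's implementation. The names `sample_frame_triplet` and `temporal_contrastive_loss`, the parameters `near_window`, `far_gap`, and `margin`, and the use of plain Python lists as embeddings are all hypothetical; in practice the embeddings would come from a convolutional network.

```python
import math
import random


def sample_frame_triplet(num_frames, near_window, far_gap, rng):
    """Pick (anchor, near, far) frame indices from one unlabeled video.

    near: within `near_window` frames of the anchor (temporally coherent).
    far:  at least `far_gap` frames away (assumed to look different).
    """
    if num_frames < 2 or near_window < 1:
        raise ValueError("need at least 2 frames and near_window >= 1")
    if far_gap > (num_frames - 1) // 2:
        raise ValueError("far_gap too large for this video length")
    anchor = rng.randrange(num_frames)
    near_candidates = [i for i in range(num_frames)
                       if i != anchor and abs(i - anchor) <= near_window]
    far_candidates = [i for i in range(num_frames)
                      if abs(i - anchor) >= far_gap]
    return anchor, rng.choice(near_candidates), rng.choice(far_candidates)


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def temporal_contrastive_loss(anchor_emb, near_emb, far_emb, margin=1.0):
    """Penalize distance to the near frame and closeness (within margin) to the far frame."""
    d_near = euclidean(anchor_emb, near_emb)
    d_far = euclidean(anchor_emb, far_emb)
    return d_near ** 2 + max(0.0, margin - d_far) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    a, n, f = sample_frame_triplet(num_frames=1000, near_window=5, far_gap=200, rng=rng)
    print(a, n, f)
    # Toy embeddings: the near frame is close to the anchor, the far frame is not.
    print(temporal_contrastive_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0]))
```

Because the loss needs only frame timestamps, no phase annotations are required during pretraining; the labeled data is used only afterwards, when the network is fine-tuned for phase segmentation.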
Taxonomy
Methods: Contrastive Learning
