Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization
Negin Ghamsarian, Mario Taschwer, Doris Putzgruber-Adamitsch, Stephanie Sarny, Klaus Schoeffmann

TL;DR
This paper presents a three-module framework for automatic relevance detection in cataract surgery videos. It localizes and classifies the relevant surgical phases using spatio-temporal analysis, supporting training, skill assessment, and irregularity detection.
Contribution
The proposed method introduces a novel three-module system combining idle frame recognition, cornea detection, and recurrent CNNs for phase classification in cataract videos, outperforming existing static and recurrent models.
Findings
Outperforms static CNNs in relevance detection accuracy.
Effective segmentation of idle and action phases in videos.
Improved classification of surgical phases using spatiotemporal features.
Abstract
In cataract surgery, the operation is performed with the help of a microscope. Since the microscope enables watching real-time surgery by up to two people only, a major part of surgical training is conducted using the recorded videos. To optimize the training procedure with the video content, the surgeons require an automatic relevance detection approach. In addition to relevance-based retrieval, these results can be further used for skill assessment and irregularity detection in cataract surgery videos. In this paper, a three-module framework is proposed to detect and classify the relevant phase segments in cataract videos. Taking advantage of an idle frame recognition network, the video is divided into idle and action segments. To boost the performance in relevance detection, the cornea where the relevant surgical actions are conducted is detected in all frames using Mask R-CNN. The…
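The abstract's first step, dividing a video into idle and action segments, can be illustrated with a minimal sketch. It assumes per-frame labels have already been produced by some idle frame recognition network; the function name `frames_to_segments` and the label strings are hypothetical and not taken from the paper, which may derive its segments differently.

```python
from itertools import groupby
from typing import List, Tuple


def frames_to_segments(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Group consecutive identical per-frame labels into segments.

    Returns (label, start_frame, end_frame) tuples with inclusive indices.
    The labels are assumed to be the output of a per-frame classifier.
    """
    segments = []
    index = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of frames in this run
        segments.append((label, index, index + length - 1))
        index += length
    return segments


# Hypothetical per-frame predictions for a six-frame clip.
labels = ["idle", "idle", "action", "action", "action", "idle"]
print(frames_to_segments(labels))
```

Only the action segments would then be passed on for phase classification; smoothing of noisy per-frame predictions is omitted here.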
Taxonomy
Methods: Region Proposal Network · Softmax · Convolution · RoIAlign · Mask R-CNN
