Event Recognition in Laparoscopic Gynecology Videos with Hybrid Transformers
Sahar Nasirihaghighi, Negin Ghamsarian, Heinrich Husslein, Klaus Schoeffmann

TL;DR
This paper introduces a hybrid transformer-based approach for recognizing key events in laparoscopic gynecology videos, utilizing a new annotated dataset and a frame sampling strategy to improve accuracy amidst challenging surgical conditions.
Contribution
The paper presents a novel hybrid transformer architecture and a specialized training-inference framework for event recognition in laparoscopic videos, along with a new dataset and sampling strategy.
Findings
Hybrid transformer architecture outperforms CNN-RNN models in accuracy.
The dataset includes annotations for intra-operative challenges and complications.
Frame sampling strategy enhances temporal resolution and robustness.
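The summary does not say how the frame sampling strategy works, so the following is only a minimal sketch of one common approach and should not be read as the authors' method. It assumes segment-based sampling: a clip is split into equal segments, one frame is drawn at random from each segment during training, and the segment midpoint is used at inference. The function name `sample_frame_indices` and its parameters are hypothetical.

```python
import random


def sample_frame_indices(num_frames, num_samples, train=True, rng=None):
    """Pick `num_samples` frame indices from a clip of `num_frames` frames.

    Assumed scheme (not taken from the paper): split the clip into
    `num_samples` equal segments and take one index per segment,
    random in training and the segment midpoint at inference.
    """
    if num_samples <= 0 or num_frames < num_samples:
        raise ValueError("need 0 < num_samples <= num_frames")
    rng = rng or random
    indices = []
    for i in range(num_samples):
        # Segment i covers the half-open index range [start, end).
        start = i * num_frames // num_samples
        end = (i + 1) * num_frames // num_samples
        if train:
            # A random offset per segment varies the input across epochs.
            indices.append(rng.randrange(start, end))
        else:
            # A deterministic midpoint keeps inference reproducible.
            indices.append((start + end - 1) // 2)
    return indices
```

Under this assumption, every part of the clip contributes a frame, while the random offsets during training expose the model to different frames from the same clip across epochs.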
Abstract
Analyzing laparoscopic surgery videos presents a complex and multifaceted challenge, with applications including surgical training, intra-operative surgical complication prediction, and post-operative surgical assessment. Identifying crucial events within these videos is a significant prerequisite in a majority of these applications. In this paper, we introduce a comprehensive dataset tailored for relevant event recognition in laparoscopic gynecology videos. Our dataset includes annotations for critical events associated with major intra-operative challenges and post-operative complications. To validate the precision of our annotations, we assess event recognition performance using several CNN-RNN architectures. Furthermore, we introduce and evaluate a hybrid transformer architecture coupled with a customized training-inference framework to recognize four specific events in laparoscopic…
Taxonomy
Topics: Surgical Simulation and Training · Radiology practices and education
Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer
