Human Gaze Guided Attention for Surgical Activity Recognition

Abdishakour Awale; Duygu Sarikaya

arXiv:2203.04752 · eess.IV · November 15, 2022

TL;DR

This paper uses human gaze data to supervise spatio-temporal attention for surgical activity recognition, reaching 85.4% accuracy on the Suturing task of the public JIGSAWS dataset.

Contribution

The work is the first to use human gaze as supervision for attention in surgical video activity recognition, which improves model performance.

Findings

01. Achieved 85.4% accuracy on the JIGSAWS Suturing task.

02. Showed that gaze-guided attention is effective relative to state-of-the-art models.

03. Confirmed the importance of gaze supervision through ablation studies.

Abstract

Modeling and automatically recognizing surgical activities are fundamental steps toward automation in surgery and play important roles in providing timely feedback to surgeons. Accurately recognizing surgical activities in video is a challenging problem that requires an effective means of learning both spatial and temporal dynamics. Human gaze and visual saliency carry important information about visual attention and can be used to extract more relevant features that better reflect these spatial and temporal dynamics. In this study, we propose to use human gaze with a spatio-temporal attention mechanism for activity recognition in surgical videos. Our model is an I3D-based architecture that learns spatio-temporal features using 3D convolutions and learns an attention map using human gaze as supervision. We evaluate our model on the Suturing task of JIGSAWS, which is a…
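The idea of learning an attention map with human gaze as supervision can be illustrated with a small sketch. The abstract does not state the loss or the layer shapes, so everything below is an assumption made for illustration: the attention map is reduced to a single 2D grid (the paper's attention is spatio-temporal), the supervision term is a KL divergence between a normalised gaze heatmap and the predicted map, and the helper names `softmax2d`, `normalise`, `kl_divergence`, and `attention_pool` are hypothetical. It uses only the Python standard library and is not the authors' implementation.

```python
import math


def softmax2d(logits):
    """Turn a 2D grid of attention logits into a probability map."""
    peak = max(v for row in logits for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def normalise(heatmap, eps=1e-8):
    """Scale a non-negative gaze heatmap so that its cells sum to 1."""
    total = sum(sum(row) for row in heatmap)
    cells = sum(len(row) for row in heatmap)
    return [[(v + eps) / (total + eps * cells) for v in row] for row in heatmap]


def kl_divergence(target, predicted, eps=1e-8):
    """KL(target || predicted): assumed gaze-supervision loss (not from the paper)."""
    loss = 0.0
    for t_row, p_row in zip(target, predicted):
        for t, p in zip(t_row, p_row):
            loss += t * math.log((t + eps) / (p + eps))
    return loss


def attention_pool(features, attention):
    """Weight an H x W x C feature grid by the attention map and sum over space."""
    channels = len(features[0][0])
    pooled = [0.0] * channels
    for f_row, a_row in zip(features, attention):
        for cell, weight in zip(f_row, a_row):
            for c in range(channels):
                pooled[c] += weight * cell[c]
    return pooled


if __name__ == "__main__":
    # Toy 2 x 2 example: the gaze heatmap is concentrated on the bottom-right cell.
    gaze = normalise([[0.0, 0.0], [0.0, 4.0]])
    uniform_attention = softmax2d([[0.0, 0.0], [0.0, 0.0]])
    focused_attention = softmax2d([[0.0, 0.0], [0.0, 5.0]])

    # Attention that matches the gaze target should give the smaller loss.
    print("loss, uniform attention:", kl_divergence(gaze, uniform_attention))
    print("loss, focused attention:", kl_divergence(gaze, focused_attention))

    # Two-channel features pooled with the focused attention map.
    features = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 2.0]]]
    print("pooled features:", attention_pool(features, focused_attention))
```

In a full model, a term like this would typically be added to the activity classification loss with a weighting factor, so that the network is trained to classify activities and to attend where the human gaze falls. That weighting scheme is likewise an assumption here, not something the abstract specifies.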

Taxonomy

Topics: Surgical Simulation and Training · Delphi Technique in Research · Augmented Reality Applications