Where are they looking in the operating room?

Keqi Chen; Séraphin Baributsa; Lilien Schewski; Vinkle Srivastav; Didier Mutter; Guido Beldi; Sandra Keller; Nicolas Padoy

arXiv:2604.20574 · cs.CV · April 23, 2026

TL;DR

This paper introduces gaze-following in the operating room, demonstrating its potential to improve understanding of surgical workflows, roles, and team communication through novel datasets and models.

Contribution

The paper extends the 4D-OR and Team-OR datasets with gaze-following annotations, adds team communication activity annotations to Team-OR, and proposes new approaches for clinical role prediction, surgical phase recognition, and team communication detection.

Findings

01

Achieved state-of-the-art F1 scores of 0.92 for role prediction and 0.95 for phase recognition.

02

Significantly outperformed baselines in team communication detection by over 30%.

03

Validated the effectiveness of gaze-based models in complex surgical environments.

Abstract

Purpose: Gaze-following, the task of inferring where individuals are looking, has been widely studied in computer vision, advancing research in visual attention modeling, social scene understanding, and human-robot interaction. However, gaze-following has never been explored in the operating room (OR), a complex, high-stakes environment where visual attention plays an important role in surgical workflow analysis. In this work, we introduce the concept of gaze-following to the surgical domain and demonstrate its great potential for understanding clinical roles, surgical phases, and team communications in the OR. Methods: We extend the 4D-OR dataset with gaze-following annotations, and extend the Team-OR dataset with gaze-following and new team communication activity annotations. Then, we propose novel approaches to address clinical role prediction, surgical phase recognition, and team…
