SurgAtt-Tracker: Online Surgical Attention Tracking via Temporal Proposal Reranking and Motion-Aware Refinement
Rulin Zhou, Guankun Wang, An Wang, Yujie Ma, Lixin Ouyang, Bolin Cui, Junyan Li, Chaowei Zhu, Mingyang Li, Ming Chen, Xiaopin Zhong, Peng Lu, Jiankun Wang, Xianming Liu, Hongliang Ren

TL;DR
SurgAtt-Tracker is a framework for real-time surgical attention tracking that combines temporal proposal reranking with motion-aware refinement. Supported by a large annotated dataset, it improves robustness and provides actionable field-of-view (FoV) guidance.
Contribution
It introduces SurgAtt-Tracker, a new spatio-temporal learning approach for surgical attention tracking, and SurgAtt-1.16M, a large-scale benchmark dataset for training and evaluation.
Findings
Achieves state-of-the-art attention tracking performance.
Demonstrates robustness under occlusion and interference.
Provides effective frame-wise FoV guidance for robotic surgery.
Abstract
Accurate and stable field-of-view (FoV) guidance is critical for safe and efficient minimally invasive surgery, yet existing approaches often conflate visual attention estimation with downstream camera control or rely on direct object-centric assumptions. In this work, we formulate surgical attention tracking as a spatio-temporal learning problem and model surgeon focus as a dense attention heatmap, enabling continuous and interpretable frame-wise FoV guidance. We propose SurgAtt-Tracker, a holistic framework that robustly tracks surgical attention by exploiting temporal coherence through proposal-level reranking and motion-aware refinement, rather than direct regression. To support systematic training and evaluation, we introduce SurgAtt-1.16M, a large-scale benchmark with a clinically grounded annotation protocol that enables comprehensive heatmap-based attention analysis across…
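The three ideas named in the abstract (a dense attention heatmap, proposal-level reranking that exploits temporal coherence, and motion-aware refinement) can be illustrated with a toy sketch. This is not the authors' architecture, which is learned and is not specified in the text above. The function names, the Gaussian heatmap, the exponential coherence term, the constant-velocity prediction, and the parameters `sigma`, `tau`, `weight`, and `alpha` are all hypothetical choices made for illustration only.

```python
import numpy as np


def gaussian_heatmap(height, width, center, sigma):
    """Toy dense attention heatmap: a 2D Gaussian peaked at center=(x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def rerank_proposals(centers, scores, prev_center, tau=20.0, weight=0.5):
    """Rerank candidate attention centers for the current frame.

    Each proposal's per-frame score is mixed with a temporal-coherence
    term that decays with distance from the previous frame's center.
    Returns the index of the best proposal and all combined scores.
    """
    centers = np.asarray(centers, dtype=float)
    scores = np.asarray(scores, dtype=float)
    dist = np.linalg.norm(centers - np.asarray(prev_center, dtype=float), axis=1)
    coherence = np.exp(-dist / tau)  # assumed form, not from the paper
    combined = (1.0 - weight) * scores + weight * coherence
    return int(np.argmax(combined)), combined


def refine_with_motion(selected_center, prev_center, prev_velocity, alpha=0.7):
    """Blend the selected proposal with a constant-velocity prediction."""
    predicted = np.asarray(prev_center, dtype=float) + np.asarray(prev_velocity, dtype=float)
    return alpha * np.asarray(selected_center, dtype=float) + (1.0 - alpha) * predicted


# Two hypothetical proposals: the second has the higher per-frame score,
# but the first is far closer to where attention was in the previous frame.
centers = [(100.0, 100.0), (300.0, 200.0)]
scores = [0.6, 0.7]
prev_center = (105.0, 98.0)
prev_velocity = (-2.0, 1.0)

best, combined = rerank_proposals(centers, scores, prev_center)
refined = refine_with_motion(centers[best], prev_center, prev_velocity)
heatmap = gaussian_heatmap(256, 384, refined, sigma=15.0)
```

In this example the reranking step prefers the first proposal even though its raw score is lower, which is the intuition behind using temporal coherence instead of regressing the attention location independently in each frame.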
Taxonomy
Topics: Surgical Simulation and Training · Robotics and Sensor-Based Localization · Soft Robotics and Applications
