SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos
Chinedu Innocent Nwoye, Nicolas Padoy

TL;DR
SurgiTrack is a deep learning approach that combines YOLOv7 detection with an attention-based re-identification mechanism to track multiple surgical tools of multiple classes across three trajectory perspectives (intraoperative, intracorporeal, and visibility), aiming for higher tracking accuracy at real-time inference speed.
Contribution
The paper presents SurgiTrack, a novel method that incorporates operator cues and a bipartite matching framework to advance multi-tool tracking in complex surgical scenarios.
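As a rough illustration of what a bipartite matching step does in a tracker, the sketch below assigns existing tracks to new detections so that the total cost is minimal and no track or detection is used twice. It is not the paper's implementation: the function name `match_tracks`, the cost values, and the `max_cost` gate are assumptions made for this example, and the brute-force search stands in for a proper assignment solver such as the Hungarian algorithm.

```python
from itertools import permutations


def match_tracks(cost, max_cost=0.7):
    """Return the minimum-total-cost one-to-one (track, detection) pairs.

    cost[i][j] is a hypothetical dissimilarity between track i and detection j.
    The search is brute force over permutations, which is acceptable only
    because a surgical frame contains a handful of tools.
    """
    n_tracks = len(cost)
    n_dets = len(cost[0]) if cost else 0
    if n_tracks == 0 or n_dets == 0:
        return []

    # Enumerate every one-to-one assignment of the smaller side to the larger.
    if n_tracks <= n_dets:
        candidates = (
            list(zip(range(n_tracks), perm))
            for perm in permutations(range(n_dets), n_tracks)
        )
    else:
        candidates = (
            list(zip(perm, range(n_dets)))
            for perm in permutations(range(n_tracks), n_dets)
        )

    best_pairs, best_total = [], float("inf")
    for pairs in candidates:
        total = sum(cost[i][j] for i, j in pairs)
        if total < best_total:
            best_pairs, best_total = pairs, total

    # Gate: drop pairs that are too dissimilar (assumed threshold).
    return sorted((i, j) for i, j in best_pairs if cost[i][j] <= max_cost)


# Two existing tracks, three new detections (made-up costs).
cost = [
    [0.9, 0.1, 0.8],
    [0.2, 0.6, 0.7],
]
matches = match_tracks(cost)
print(matches)  # [(0, 1), (1, 0)]
```

Detection 2 stays unmatched here; a tracker would typically start a new track for it. In practice the cost matrix would mix cues such as box overlap, appearance, and, per the summary above, operator information.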
Findings
Outperforms existing methods on the CholecTrack20 dataset
Achieves real-time inference with high accuracy
Effectively re-identifies tools after occlusion or re-insertion
Abstract
Accurate tool tracking is essential for the success of computer-assisted intervention. Previous efforts often modeled tool trajectories rigidly, overlooking the dynamic nature of surgical procedures, especially tracking scenarios like out-of-body and out-of-camera views. Addressing this limitation, the new CholecTrack20 dataset provides detailed labels that account for multiple tool trajectories in three perspectives: (1) intraoperative, (2) intracorporeal, and (3) visibility, representing the different types of temporal duration of tool tracks. These fine-grained labels enhance tracking flexibility but also increase the task complexity. Re-identifying tools after occlusion or re-insertion into the body remains challenging due to high visual similarity, especially among tools of the same category. This work recognizes the critical role of the tool operators in distinguishing tool track…
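The three perspectives named in the abstract can be illustrated with a toy example that assigns identities to one tool over a sequence of frames. This is a sketch under assumptions, not the dataset's labeling procedure. It assumes an intraoperative track lasts for the whole procedure, an intracorporeal track restarts each time the tool re-enters the body, and a visibility track restarts each time the tool re-enters the camera view. The state names and the function `assign_track_ids` are hypothetical.

```python
def assign_track_ids(states):
    """Return per-frame (intraoperative, intracorporeal, visibility) IDs.

    states: per-frame state of ONE tool, each one of
            "visible", "out_of_camera" (in the body, outside the view),
            or "out_of_body". These labels are assumptions for the sketch.
    A perspective whose track is inactive in a frame gets None.
    """
    ids = []
    intracorporeal_id = 0
    visibility_id = 0
    prev = "out_of_body"  # the tool starts outside the body
    for state in states:
        # Re-insertion into the body starts a new intracorporeal track.
        if state != "out_of_body" and prev == "out_of_body":
            intracorporeal_id += 1
        # Re-entering the camera view starts a new visibility track.
        if state == "visible" and prev != "visible":
            visibility_id += 1
        ids.append((
            1,  # intraoperative: one identity for the whole procedure
            intracorporeal_id if state != "out_of_body" else None,
            visibility_id if state == "visible" else None,
        ))
        prev = state
    return ids


states = ["visible", "out_of_camera", "visible", "out_of_body", "visible"]
for frame, triple in enumerate(assign_track_ids(states)):
    print(frame, triple)
# 0 (1, 1, 1)
# 1 (1, 1, None)
# 2 (1, 1, 2)
# 3 (1, None, None)
# 4 (1, 2, 3)
```

The same physical tool ends up with one, two, or three identities depending on the perspective, which is why re-identification after leaving the view or the body matters most for the longer-lived tracks.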
Taxonomy
Topics: Digital Imaging in Medicine · Surgical Simulation and Training
