Stabilizing Temporal Inference Dynamics for Online Surgical Phase Recognition
Yang Liu, Ning Zhu, Jingjing Peng, Xiwu Chen, Alejandro Granados, Guotai Wang, Sebastien Ourselin

TL;DR
This paper introduces a unified framework with novel loss and inference components to improve the temporal stability of online surgical phase recognition models, reducing prediction fragmentation and increasing reliability.
Contribution
The paper proposes a model-agnostic, plug-and-play framework that stabilizes temporal inference dynamics through the Temporal Error-Cascade (TEC) loss for training and EGTP for inference, and it introduces the TFI metric for evaluation.
Findings
Significant reduction in temporal fragmentation across datasets.
Improved stability without sacrificing frame-wise accuracy.
Framework is effective across multiple backbone models.
Abstract
Online Surgical Phase Recognition (SPR) models can reach high frame-wise accuracy, yet their predictions often lack temporal stability, fragmenting workflow understanding and reducing the reliability of downstream assistance. We show that this instability is not random noise but arises from two mechanisms: early misclassifications corrupt temporal feature states and propagate forward to form error cascades, and phase transitions follow evidence-accumulation dynamics whereas most online SPR systems rely on memoryless frame-wise decisions, making them sensitive to transient confidence fluctuations. We propose a unified Train-Inference-Evaluation framework that explicitly stabilizes temporal inference dynamics using model-agnostic, plug-and-play components. For training, the Temporal Error-Cascade (TEC) loss suppresses error onset and mitigates forward error propagation by stabilizing…
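The second mechanism named in the abstract, memoryless frame-wise decisions reacting to transient confidence fluctuations, can be illustrated with a toy sketch. The code below is not the paper's EGTP procedure or its TFI metric, neither of which is specified in the text above. The function names, the log-odds accumulator, and the `threshold` and `decay` values are all assumptions chosen for illustration. It compares a per-frame argmax against a simple rule that switches phase only after a challenger phase has accumulated enough evidence, and counts the contiguous segments each produces.

```python
import numpy as np

def count_segments(labels):
    """Count contiguous runs of identical labels (a crude fragmentation proxy)."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return 0
    return int(1 + np.count_nonzero(labels[1:] != labels[:-1]))

def framewise_decisions(probs):
    """Memoryless baseline: pick the most probable phase at every frame."""
    return np.asarray(probs).argmax(axis=1)

def accumulated_decisions(probs, threshold=3.0, decay=0.8):
    """Hypothetical evidence-accumulation rule (an assumption, not EGTP).

    A challenger phase replaces the current phase only once its decayed,
    accumulated log-odds against the current phase exceed `threshold`.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-8  # avoids log(0)
    current = int(probs[0].argmax())
    evidence = np.zeros(probs.shape[1])
    out = []
    for p in probs:
        # Log-odds of every phase relative to the currently held phase.
        llr = np.log(p + eps) - np.log(p[current] + eps)
        # Leaky accumulation, clipped at zero so old counter-evidence fades.
        evidence = np.maximum(0.0, decay * evidence + llr)
        evidence[current] = 0.0
        challenger = int(evidence.argmax())
        if evidence[challenger] > threshold:
            current = challenger
            evidence[:] = 0.0
        out.append(current)
    return np.array(out)

# Synthetic two-phase sequence: phase 0 for frames 0-9, phase 1 for 10-19,
# with a single transient confidence dip at frame 4.
probs = np.array([[0.9, 0.1]] * 10 + [[0.1, 0.9]] * 10)
probs[4] = [0.4, 0.6]

raw = framewise_decisions(probs)
smooth = accumulated_decisions(probs)

print("frame-wise segments: ", count_segments(raw))
print("accumulated segments:", count_segments(smooth))
```

On this synthetic input the frame-wise rule follows the single-frame dip and so produces extra segments, while the accumulating rule ignores it at the cost of confirming the real transition one frame late. That latency trade-off is a property of this toy rule and says nothing about the behaviour of the paper's method.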
