StepAL: Step-aware Active Learning for Cataract Surgical Videos
Nisarg A. Shah, Bardia Safaei, Shameema Sikder, S. Swaroop Vedula, and Vishal M. Patel

TL;DR
StepAL is a novel active learning framework tailored for surgical videos that selects entire videos for annotation by leveraging step-aware features and uncertainty, significantly reducing labeling effort while maintaining high recognition accuracy.
Contribution
We introduce StepAL, a step-aware active learning method that selects full surgical videos for annotation, improving labeling efficiency over existing approaches that select individual frames or clips.
Findings
Outperforms existing active learning methods on cataract datasets
Achieves higher step recognition accuracy with fewer labeled videos
Reduces annotation effort in surgical video analysis
Abstract
Active learning (AL) can reduce annotation costs in surgical video analysis while maintaining model performance. However, traditional AL methods, developed for images or short video clips, are suboptimal for surgical step recognition due to inter-step dependencies within long, untrimmed surgical videos. These methods typically select individual frames or clips for labeling, which is ineffective for surgical videos where annotators require the context of the entire video for annotation. To address this, we propose StepAL, an active learning framework designed for full video selection in surgical step recognition. StepAL integrates a step-aware feature representation, which leverages pseudo-labels to capture the distribution of predicted steps within each video, with an entropy-weighted clustering strategy. This combination prioritizes videos that are both uncertain and exhibit diverse…
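The selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the step-aware feature of a video is simply the normalized histogram of its per-frame pseudo-labels, that video-level uncertainty is the mean per-frame predictive entropy, and that "entropy-weighted clustering" means k-means in which each video's contribution to a centroid is weighted by that entropy. All function names (`step_histogram`, `video_entropy`, `weighted_kmeans`, `select_videos`) are hypothetical.

```python
import numpy as np

def step_histogram(probs):
    """Assumed step-aware feature: distribution of pseudo-labels in one video.

    probs: (T, S) array of per-frame softmax outputs over S surgical steps.
    """
    pseudo_labels = probs.argmax(axis=1)
    counts = np.bincount(pseudo_labels, minlength=probs.shape[1]).astype(float)
    return counts / counts.sum()

def video_entropy(probs, eps=1e-12):
    """Assumed video-level uncertainty: mean per-frame predictive entropy."""
    frame_entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return float(frame_entropy.mean())

def weighted_kmeans(X, weights, k, iters=50, seed=0):
    """Plain k-means whose centroid update is a weighted mean of its members."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = assign == j
            if members.any():  # keep the old center if a cluster empties
                centers[j] = np.average(X[members], axis=0, weights=weights[members])
    return centers

def select_videos(video_probs, budget):
    """Pick `budget` unlabeled videos, one per entropy-weighted cluster."""
    X = np.stack([step_histogram(p) for p in video_probs])
    # Small offset keeps weights strictly positive for np.average.
    w = np.array([video_entropy(p) for p in video_probs]) + 1e-8
    centers = weighted_kmeans(X, w, budget)
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    selected = []
    for j in range(budget):
        # Take the closest video to this centroid that is not already chosen.
        for idx in dists[:, j].argsort():
            if int(idx) not in selected:
                selected.append(int(idx))
                break
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic pool: 6 videos of varying length, 4 steps (made-up data).
    pool = [rng.dirichlet(np.ones(4), size=int(t)) for t in rng.integers(50, 200, size=6)]
    print(select_videos(pool, budget=2))
```

Because high-entropy videos pull centroids toward themselves, the video nearest each centroid tends to be both uncertain and representative of a distinct step distribution, which is the trade-off the abstract describes. In practice `video_probs` would come from the current step-recognition model applied to the unlabeled pool.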
