Analysis of Transferability Estimation Metrics for Surgical Phase Recognition
Prabhant Singh, Yiping Li, Yasmina Al Khalil

TL;DR
This paper evaluates transferability estimation metrics for surgical phase recognition, benchmarking LogME, H-Score, and TransRate, and provides practical guidelines for model selection in surgical video analysis.
Contribution
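It formalizes source-independent transferability estimation (SITE) for surgical phase recognition and offers the first comprehensive benchmark of three transferability metrics (LogME, H-Score, and TransRate) on two diverse datasets, RAMIE and AutoLaparo.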
Findings
LogME best predicts fine-tuning accuracy, especially with minimum per-subset aggregation.
H-Score shows weak predictive power for transferability.
TransRate often inverts true model rankings, reducing its reliability.
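To make the findings above concrete, the following is a minimal sketch of how a transferability metric's ranking could be compared against fine-tuning accuracy. It rests on several assumptions not confirmed by this summary: that each candidate model is scored once per subset of the target data, that the per-subset scores are aggregated by taking their minimum, and that agreement is measured with Kendall's tau (the paper may use a different correlation measure). The model names and all numbers are hypothetical and are not results from the paper.

```python
from itertools import combinations
from typing import Sequence


def aggregate_min(subset_scores: Sequence[float]) -> float:
    """Aggregate per-subset metric scores by taking the minimum."""
    if not subset_scores:
        raise ValueError("need at least one subset score")
    return min(subset_scores)


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for illustration only; NOT taken from the paper.
per_subset_scores = {
    "model_a": [0.62, 0.55, 0.70],
    "model_b": [0.40, 0.48, 0.35],
    "model_c": [0.81, 0.77, 0.79],
}
finetune_accuracy = {"model_a": 0.74, "model_b": 0.69, "model_c": 0.83}

models = sorted(per_subset_scores)
aggregated = [aggregate_min(per_subset_scores[m]) for m in models]
accuracy = [finetune_accuracy[m] for m in models]

# A tau near +1 means the metric ranks models like fine-tuning does;
# a tau near -1 means the metric inverts the true ranking.
tau = kendall_tau(aggregated, accuracy)
inverted_tau = kendall_tau([-s for s in aggregated], accuracy)
print(f"tau = {tau:.2f}, inverted tau = {inverted_tau:.2f}")
```

Under this reading, a metric that "inverts true model rankings" would produce a negative correlation, while a weakly predictive metric would produce a correlation near zero.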
Abstract
Fine-tuning pre-trained models has become a cornerstone of modern machine learning, allowing practitioners to achieve high performance with limited labeled data. In surgical video analysis, where expert annotations are especially time-consuming and costly, identifying the most suitable pre-trained model for a downstream task is both critical and challenging. Source-independent transferability estimation (SITE) offers a solution by predicting how well a model will fine-tune on target data using only its embeddings or outputs, without requiring full retraining. In this work, we formalize SITE for surgical phase recognition and provide the first comprehensive benchmark of three representative metrics, LogME, H-Score, and TransRate, on two diverse datasets (RAMIE and AutoLaparo). Our results show that LogME, particularly when aggregated by the minimum per-subset score, aligns most closely…
