Self-distillation for surgical action recognition
Amine Yamlahi, Thuy Nuong Tran, Patrick Godau, Melanie Schellenberg, Dominik Michael, Finn-Henri Smidt, Jan-Hinrich Noelke, Tim Adler, Minu Dietlinde Tizabi, Chinedu Nwoye, Nicolas Padoy, Lena Maier-Hein

TL;DR
This paper introduces a novel self-distillation approach using Swin Transformers for surgical action recognition, effectively addressing class imbalance and label ambiguity, and outperforming existing solutions in a major challenge.
Contribution
It is the first to apply self-distillation with a heterogeneous ensemble of Swin Transformer models to surgical video analysis, demonstrating significant performance improvements.
Findings
In cross-validated ablation studies on CholecT45, soft labels obtained by self-distillation give the largest performance boost.
The method outperforms all other submissions to the challenge.
External validation confirms robustness and effectiveness.
Abstract
Surgical scene understanding is a key prerequisite for context-aware decision support in the operating room. While deep learning-based approaches have already reached or even surpassed human performance in various fields, the task of surgical action recognition remains a major challenge. With this contribution, we are the first to investigate the concept of self-distillation as a means of addressing class imbalance and potential label ambiguity in surgical video analysis. Our proposed method is a heterogeneous ensemble of three models that use Swin Transformers as the backbone and the concepts of self-distillation and multi-task learning as core design choices. According to ablation studies performed with the CholecT45 challenge data via cross-validation, the biggest performance boost is achieved by the usage of soft labels obtained by self-distillation. External validation of our method on…
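To make the idea of soft labels obtained by self-distillation more concrete, here is a minimal sketch for a multi-label setting. It is not the authors' implementation: the blending weight `alpha`, the helper names `soft_labels` and `bce_with_soft_targets`, and the use of binary cross-entropy are all assumptions made for illustration, since this summary does not specify the exact formulation. The sketch assumes a teacher (a previously trained copy of the same architecture) produces per-class logits, which are mixed with the ground-truth labels to form the targets for the student.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_labels(hard_labels, teacher_logits, alpha=0.5):
    """Blend ground-truth labels with teacher probabilities.

    alpha is a hypothetical mixing weight: 0 keeps the hard labels,
    1 uses only the teacher's predictions.
    """
    return [
        (1.0 - alpha) * y + alpha * sigmoid(z)
        for y, z in zip(hard_labels, teacher_logits)
    ]


def bce_with_soft_targets(student_logits, targets, eps=1e-7):
    """Mean binary cross-entropy of student predictions against soft targets."""
    total = 0.0
    for z, t in zip(student_logits, targets):
        # Clamp the probability to avoid log(0).
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Toy example with three action classes (values are made up).
hard = [1.0, 0.0, 0.0]             # annotated labels for one frame
teacher_logits = [2.0, 0.5, -3.0]  # teacher is unsure about class 1
targets = soft_labels(hard, teacher_logits, alpha=0.5)
student_logits = [1.0, 0.0, -1.0]
loss = bce_with_soft_targets(student_logits, targets)
print(targets, loss)
```

In this toy case the second class receives a non-zero target even though it is annotated as absent, which is one way soft labels can reflect potential label ambiguity instead of forcing a hard 0/1 decision.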
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Image Segmentation Techniques
Methods: Test
