TL;DR
This paper introduces a self-supervised learning approach using knowledge distillation to train lightweight 2D/3D human pose estimation models from unlabeled data, enabling real-time deployment in operating rooms.
Contribution
It presents a novel teacher/student framework that leverages unlabeled data and a complex teacher network to improve lightweight pose estimation models.
Findings
The student network achieves performance comparable to the teacher on the MVOR+ dataset.
The method effectively utilizes unlabeled data for training pose estimation models.
The approach enables real-time 2D/3D pose estimation in clinical settings.
Abstract
2D/3D human pose estimation is needed to develop novel intelligent tools for the operating room that can analyze and support the clinical activities. The lack of annotated data and the complexity of state-of-the-art pose estimation approaches limit, however, the deployment of such techniques inside the OR. In this work, we propose to use knowledge distillation in a teacher/student framework to harness the knowledge present in a large-scale non-annotated dataset and in an accurate but complex multi-stage teacher network to train a lightweight network for joint 2D/3D pose estimation. The teacher network also exploits the unlabeled data to generate both hard and soft labels useful in improving the student predictions. The easily deployable network trained using this effective self-supervision strategy performs on par with the teacher network on MVOR+, an extension of the public MVOR…
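The hard/soft label idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 0.5 confidence threshold, and the confidence-weighted squared-error loss are all assumptions chosen for illustration. Here the teacher's confident keypoints act as hard labels on unlabeled frames, and the teacher's confidence scores act as soft labels that weight each keypoint's contribution to the student loss.

```python
from typing import List, Optional, Tuple

# Hypothetical teacher output for one person: (x, y, confidence) per keypoint.
TeacherKeypoint = Tuple[float, float, float]
Point = Tuple[float, float]


def make_pseudo_labels(
    teacher_keypoints: List[TeacherKeypoint], threshold: float = 0.5
) -> Tuple[List[Optional[Point]], List[float]]:
    """Turn teacher predictions on an unlabeled frame into pseudo labels.

    Hard labels: keypoint coordinates, kept only when the teacher's
    confidence reaches the (assumed) threshold, otherwise None.
    Soft labels: the raw teacher confidences.
    """
    hard = [(x, y) if c >= threshold else None for x, y, c in teacher_keypoints]
    soft = [c for _, _, c in teacher_keypoints]
    return hard, soft


def distillation_loss(
    student_xy: List[Point], hard: List[Optional[Point]], soft: List[float]
) -> float:
    """Confidence-weighted mean squared error over keypoints with a hard label."""
    total, weight = 0.0, 0.0
    for (sx, sy), target, w in zip(student_xy, hard, soft):
        if target is None:
            continue  # teacher was not confident; skip this keypoint
        tx, ty = target
        total += w * ((sx - tx) ** 2 + (sy - ty) ** 2)
        weight += w
    return total / weight if weight > 0 else 0.0


# Usage: three keypoints, the second is below the confidence threshold.
teacher = [(10.0, 20.0, 0.9), (30.0, 40.0, 0.2), (50.0, 60.0, 0.6)]
student = [(11.0, 20.0), (0.0, 0.0), (50.0, 62.0)]
hard_labels, soft_labels = make_pseudo_labels(teacher)
loss = distillation_loss(student, hard_labels, soft_labels)
```

In a real training loop the student would be a neural network and this loss would be minimized by gradient descent; the sketch only shows how the two kinds of teacher-generated labels could enter the objective.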
Taxonomy
Methods: Knowledge Distillation
