HOLa: HoloLens Object Labeling
Michael Schwimmbeck, Serouj Khajarian, Konstantin Holzapfel, Johannes Schmidt, Stefanie Remmele

TL;DR
HOLa is an automatic object labeling tool for HoloLens 2 that leverages SAM-Track for high-speed, high-accuracy segmentation, significantly reducing manual effort in medical AR applications.
Contribution
The paper introduces HOLa, a fully automatic single-object annotation system for HoloLens 2 that is based on SAM-Track, requires minimal human interaction, and is applicable across various medical AR scenarios.
Findings
Labeling speed increased by over 500 times.
Dice scores between 0.875 and 0.982, comparable to human annotators.
Evaluated at different degrees of image complexity in open liver surgery and in medical phantom experiments.
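For readers unfamiliar with the Dice score quoted above, the following is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), applied to two binary masks. It is not the authors' evaluation code: the function name `dice_score`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are given as lists of rows containing 0/1 (or False/True) values.
    """
    # Flatten both 2-D masks into flat lists of booleans.
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # Count pixels that are foreground in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 3x3 masks: an automatic mask and a human reference mask.
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]

score = dice_score(predicted, reference)
print(f"Dice = {score:.3f}")
```

A score of 1.0 means the two masks overlap perfectly and 0.0 means they share no foreground pixels, so the reported range of 0.875 to 0.982 indicates close agreement with the reference masks.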
Abstract
In the context of medical Augmented Reality (AR) applications, object tracking is a key challenge and requires a significant number of annotation masks. As segmentation foundation models like the Segment Anything Model (SAM) begin to emerge, zero-shot segmentation requires only minimal human participation to obtain high-quality object masks. We introduce a HoloLens-Object-Labeling (HOLa) Unity and Python application based on the SAM-Track algorithm that offers fully automatic single object annotation for HoloLens 2 while requiring minimal human participation. HOLa does not have to be adjusted to a specific image appearance and could thus facilitate AR research in any application field. We evaluate HOLa for different degrees of image complexity in open liver surgery and in medical phantom experiments. Using HOLa for image annotation can increase the labeling speed by more than 500 times…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques · Image Retrieval and Classification Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
