HOLa: HoloLens Object Labeling

Michael Schwimmbeck; Serouj Khajarian; Konstantin Holzapfel; Johannes Schmidt; Stefanie Remmele

arXiv:2412.04945 · cs.CV · January 3, 2025

TL;DR

HOLa is an automatic object labeling tool for HoloLens 2 that leverages SAM-Track for high-speed, high-accuracy segmentation, significantly reducing manual effort in medical AR applications.

Contribution

The paper introduces HOLa, a Unity and Python application built on SAM-Track that annotates single objects for HoloLens 2 fully automatically with minimal human interaction and is intended to apply across medical AR scenarios.

Findings

01 Labeling speed increased by more than 500 times.

02 Dice scores between 0.875 and 0.982, comparable to human annotators.

03 Evaluated at different degrees of image complexity, in open liver surgery and in medical phantom experiments.
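
For readers unfamiliar with the Dice score reported in the findings, the following is a minimal sketch of how the metric is commonly computed for two binary masks. It is not taken from the HOLa repository: the function name `dice_score`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Illustrative helper (hypothetical name, not part of HOLa).
    """
    # Flatten both masks into lists of booleans.
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labeled as object in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted pixels, 2 reference pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 1],
         [0, 0, 0]]
print(dice_score(pred, truth))  # 2 * 2 / (3 + 2) = 0.8
```

A value of 1.0 means the automatic mask and the reference mask coincide exactly, and 0.0 means they share no object pixels.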

Abstract

In the context of medical Augmented Reality (AR) applications, object tracking is a key challenge and requires a significant number of annotation masks. As segmentation foundation models like the Segment Anything Model (SAM) begin to emerge, zero-shot segmentation requires only minimal human participation to obtain high-quality object masks. We introduce a HoloLens-Object-Labeling (HOLa) Unity and Python application based on the SAM-Track algorithm that offers fully automatic single object annotation for HoloLens 2 while requiring minimal human participation. HOLa does not have to be adjusted to a specific image appearance and could thus facilitate AR research in any application field. We evaluate HOLa for different degrees of image complexity in open liver surgery and in medical phantom experiments. Using HOLa for image annotation can increase the labeling speed by more than 500 times…

Code & Models

Repositories

mschwimmbeck/hola (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques · Image Retrieval and Classification Techniques

Methods: SPEED (Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings)