# NeuroLens: organ localization using natural language commands for anatomical recognition in surgical training

**Authors:** Nevin M. Matasyoh, Daniel Delev, Waseem Masalha, Franziska Mathis-Ullrich, Ramy A. Zeineldin

PMC · DOI: 10.1007/s11548-025-03463-5 · 2025-06-24

## TL;DR

NeuroLens is a system that combines neuroendoscopic video with text or voice commands to help surgical trainees identify and learn about brain structures during training.

## Contribution

The paper introduces NeuroLens, a multimodal deep learning system that localizes anatomical structures in surgical training videos in response to natural language commands.

## Key findings

- The localization model achieved 100% predicted class accuracy, 79.69% localization accuracy, and an mIoU of 67.10%.
- The system scored 71.5 on the System Usability Scale, indicating acceptable usability.
- Participants suggested improvements like 3D visualization to enhance the system.

## Abstract

This study introduces NeuroLens, a multimodal system designed to enhance anatomical recognition by integrating video with textual and voice inputs. It aims to provide an interactive learning platform for surgical trainees.

NeuroLens employs a multimodal deep learning localization model trained on an Endoscopic Third Ventriculostomy dataset. It processes neuroendoscopic videos together with textual or voice descriptions to identify and localize anatomical structures, displaying them as labeled bounding boxes. Usability was evaluated through a questionnaire completed by five participants, including surgical students and practicing surgeons. The questionnaire had a quantitative and a qualitative section. The quantitative section covered the System Usability Scale (SUS) and assessments of system appearance, functionality, and overall usability, while the qualitative section gathered user feedback and improvement suggestions. The localization model’s performance was assessed using accuracy and mean Intersection over Union (mIoU) metrics.
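As a rough illustration of the localization metrics named above, the sketch below computes IoU for a pair of bounding boxes, the mean IoU over a set of predictions, and a threshold-based localization accuracy. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the 0.5 IoU threshold are assumptions made for this example; the abstract does not state how the authors define a correct localization.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over paired predicted and ground-truth boxes."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equal length")
    return sum(iou(p, t) for p, t in zip(preds, targets)) / len(preds)


def localization_accuracy(
    preds: Sequence[Box], targets: Sequence[Box], threshold: float = 0.5
) -> float:
    """Fraction of predictions whose IoU meets the threshold.

    The 0.5 default is an assumption for illustration, not the paper's setting.
    """
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equal length")
    hits = sum(1 for p, t in zip(preds, targets) if iou(p, t) >= threshold)
    return hits / len(preds)
```

Under this reading, class accuracy (whether the predicted label matches the requested structure) is scored separately from how well the box overlaps the ground truth.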

The system demonstrates strong usability, with an average SUS score of 71.5, exceeding the threshold for acceptable usability. The localization model achieves a predicted class accuracy of 100%, a localization accuracy of 79.69%, and an mIoU of 67.10%. Participant feedback highlights the intuitive design, organization, and responsiveness while suggesting enhancements like 3D visualization.
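For readers unfamiliar with how a SUS score such as 71.5 is derived, here is a minimal sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute `rating - 1`, even-numbered items contribute `5 - rating`, and the sum is multiplied by 2.5 to give a 0 to 100 score. The function names and the example ratings are hypothetical and are not the study's data. A score of 68 is commonly cited as the benchmark for average usability; whether the authors use that exact cutoff is an assumption here.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's ten SUS ratings (each 1..5) on a 0..100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    total = 0
    for index, rating in enumerate(responses):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        # Items 1, 3, 5, ... (index 0, 2, 4, ...) are positively worded.
        total += (rating - 1) if index % 2 == 0 else (5 - rating)
    return total * 2.5


def mean_sus(all_responses: Sequence[Sequence[int]]) -> float:
    """Average SUS score across participants."""
    if not all_responses:
        raise ValueError("at least one participant is required")
    return sum(sus_score(r) for r in all_responses) / len(all_responses)


# Hypothetical ratings for two participants, for illustration only.
example = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
average = mean_sus(example)
```

With only five participants, as in this study, a single respondent's ratings can shift the average by several points, which is worth keeping in mind when comparing the score against a benchmark.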

NeuroLens integrates multimodal inputs for accurate anatomical detection and localization, addressing limitations of traditional training. Its usability and technical performance make it a valuable tool for enhancing anatomical learning in surgical training. However, the small sample size of the usability study (five participants) limits the generalizability of these findings. Further evaluation with more students and enhancements like 3D visualization will strengthen its effectiveness.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12350519/full.md

---
Source: https://tomesphere.com/paper/PMC12350519