Integrating GPT-4o Into Data Mining in Neurosurgery: Feasibility and Proof-of-Concept Study

Arthur Henrique Almeida Sales; Jürgen Beck; Jürgen Grauvogel

PMC · DOI: 10.2196/77114 · March 9, 2026

Open Access

TL;DR

This study shows that GPT-4o can accurately extract structured data from neurosurgical reports, especially for simple variables, but needs prompt refinement for more complex information.

Contribution

The study presents a proof-of-concept evaluation of whether GPT-4o can feasibly extract structured data from neurosurgical documentation.

Findings

01

GPT-4o achieved 100% accuracy for structured variables like patient ID and surgery date.

02

Prompt refinement improved accuracy for complex variables like intraoperative complications from 50% to 90-100%.

03

Accuracy varied by variable type, with categorical variables performing best and conditional text variables worst.
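The per-variable-type comparison in the findings above can be sketched as a small scoring routine. This is a minimal illustration only, not the authors' evaluation code: the function names, the string normalization rule, and the toy records are all hypothetical, and the toy values are invented for demonstration rather than taken from the study.

```python
from collections import defaultdict


def normalize(value):
    # Hypothetical matching rule: ignore case and surrounding whitespace.
    return str(value).strip().lower()


def accuracy_by_type(records):
    """Return accuracy per variable type.

    records: iterable of (variable_type, extracted_value, reference_value),
    where reference_value is the manually verified ground truth.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for var_type, extracted, reference in records:
        total[var_type] += 1
        if normalize(extracted) == normalize(reference):
            correct[var_type] += 1
    return {t: correct[t] / total[t] for t in total}


# Invented toy records (NOT study data), for illustration only.
toy_records = [
    ("categorical", "Left", "left"),
    ("categorical", "right", "right"),
    ("conditional_text", "none", "CSF leak"),
    ("conditional_text", "CSF leak ", "csf leak"),
]

scores = accuracy_by_type(toy_records)
```

Exact string matching after normalization suits categorical and date-like fields; free-text variables such as complications would likely need manual or rubric-based review instead.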

Abstract

Large language models offer new possibilities for transforming unstructured clinical text into structured datasets. However, their performance in specialized and complex documentation environments, such as neurosurgery, remains insufficiently characterized. GPT-4o is a large language model with enhanced natural language capabilities, but its accuracy in extracting structured data from neurosurgical reports has not been systematically assessed. This proof-of-concept study evaluated the feasibility and accuracy of GPT-4o for extracting predefined structured variables from unstructured neurosurgical reports of patients with vestibular schwannoma. Specific aims were to measure accuracy across variable types, assess the impact of prompt refinement, and explore the model’s potential utility for research-oriented data mining. In this retrospective single-center study, 10 consecutive patients…

Linked entities

Species (1)

Homo sapiens (human · species)

Chemicals (1)

GPT-4o

Diseases (2)

vestibular schwannoma postoperative

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Genomics and Rare Diseases