Automated Spinal MRI Labelling from Reports Using a Large Language Model

Robin Y. Park; Rhydian Windsor; Amir Jamaludin; Andrew Zisserman

arXiv:2410.17235 · eess.IV · November 8, 2024

TL;DR

This paper introduces a pipeline that uses large language models to automatically extract labels from spinal MRI reports, enabling efficient annotation and training of imaging classifiers that perform comparably to those trained on manual labels.

Contribution

The study demonstrates that open-source large language models can extract clinical labels from radiology reports as well as or better than GPT-4, and that these labels can be used to train MRI classifiers without manual annotation.

Findings

01

Open-source models match or outperform GPT-4 at labelling a held-out set of reports.

02

Classifiers trained on automated labels perform comparably to classifiers trained on scans manually annotated by clinicians.

03

The pipeline is validated on five spinal conditions: spinal cancer, stenosis, spondylolisthesis, cauda equina compression and herniation.

Abstract

We propose a general pipeline to automate the extraction of labels from radiology reports using large language models, which we validate on spinal MRI reports. The efficacy of our labelling method is measured on five distinct conditions: spinal cancer, stenosis, spondylolisthesis, cauda equina compression and herniation. Using open-source models, our method equals or surpasses GPT-4 on a held-out set of reports. Furthermore, we show that the extracted labels can be used to train imaging models to classify the identified conditions in the accompanying MR scans. All classifiers trained using automated labels achieve comparable performance to models trained using scans manually annotated by clinicians. Code can be found at https://github.com/robinyjpark/AutoLabelClassifier.
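As a rough illustration of the report-labelling step described in the abstract, the sketch below asks a language model one yes/no question per condition and turns the answers into binary labels. It is not the authors' implementation (see the linked repository for that): the prompt wording, the function names `build_prompt`, `parse_answer` and `label_report`, and the `keyword_stub` stand-in for a real model call are all assumptions made for this example.

```python
from typing import Callable, Dict, Optional

# The five conditions named in the abstract.
CONDITIONS = [
    "spinal cancer",
    "stenosis",
    "spondylolisthesis",
    "cauda equina compression",
    "herniation",
]


def build_prompt(report: str, condition: str) -> str:
    """Hypothetical prompt template; the wording is not taken from the paper."""
    return (
        "You are reading a spinal MRI report.\n"
        f"Report: {report}\n"
        f"Question: Does the report indicate {condition}? Answer yes or no."
    )


def parse_answer(text: str) -> Optional[int]:
    """Map a free-text answer to 1 (yes), 0 (no), or None if unparseable."""
    tokens = text.strip().lower().split()
    if not tokens:
        return None
    first = tokens[0].strip(".,:;!")
    if first == "yes":
        return 1
    if first == "no":
        return 0
    return None


def label_report(report: str, llm: Callable[[str], str]) -> Dict[str, Optional[int]]:
    """Query the model once per condition and collect the binary labels."""
    return {c: parse_answer(llm(build_prompt(report, c))) for c in CONDITIONS}


def keyword_stub(prompt: str) -> str:
    """Stand-in for a real model call: naive keyword match, ignores negation."""
    body, question = prompt.rsplit("Question:", 1)
    condition = question.split("indicate ", 1)[1].split("?")[0]
    return "yes" if condition in body.lower() else "no"


if __name__ == "__main__":
    report = "Moderate central canal stenosis at L4/5 with grade 1 spondylolisthesis."
    print(label_report(report, keyword_stub))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM library; in practice `keyword_stub` would be replaced by a call to an open-source model or an API, and the resulting labels would be paired with the corresponding scans to train the imaging classifiers.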

Code & Models

Repositories

robinyjpark/AutoLabelClassifier (PyTorch, official)

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Multi-Head Attention · Adam · Softmax · Dropout · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding