Labelling imaging datasets on the basis of neuroradiology reports: a validation study

David A. Wood; Sina Kafiabadi; Aisha Al Busaidi; Emily Guilhem; Jeremy Lynch; Matthew Townend; Antanas Montvila; Juveria Siddiqui; Naveen Gadapa; Matthew Benger; Gareth Barker; Sebastian Ourselin; James H. Cole; Thomas C. Booth

arXiv:2007.04226·eess.IV·March 10, 2021

TL;DR

This study validates the use of NLP for labeling neuroradiology MRI datasets, showing high accuracy for binary labels but variable results for detailed labels, and highlights the importance of specialist involvement.

Contribution

It provides a thorough validation of NLP-based labeling, compares specialist and non-specialist performance, and offers tools for streamlined report labeling.

Findings

01

Binary labels are highly accurate when derived from reports.

02

Granular labeling accuracy varies by category.

03

Non-specialist labeling reduces downstream model performance.

Abstract

Natural language processing (NLP) shows promise as a means to automate the labelling of hospital-scale neuroradiology magnetic resonance imaging (MRI) datasets for computer vision applications. To date, however, there has been no thorough investigation into the validity of this approach, including determining the accuracy of report labels compared to image labels as well as examining the performance of non-specialist labellers. In this work, we draw on the experience of a team of neuroradiologists who labelled over 5000 MRI neuroradiology reports as part of a project to build a dedicated deep learning-based neuroradiology report classifier. We show that, in our experience, assigning binary labels (i.e. normal vs abnormal) to images from reports alone is highly accurate. In contrast to the binary labels, however, the accuracy of more granular labelling is dependent on the category, and…

Code & Models

Repositories

MIDIconsortium/RadReports (official)
