Augmenting Chest X-ray Datasets with Non-Expert Annotations
Veronika Cheplygina, Cathrine Damgaard, Trine Naja Eriksen, Dovile Juodelyte, Amelia Jiménez-Sánchez

TL;DR
This paper introduces NEATX, a dataset of non-expert annotations of tubes in chest X-rays. It shows that non-expert labels can augment existing datasets and train reliable detectors, and that they reach moderate to almost perfect agreement with expert annotations.
Contribution
The paper presents a new dataset with non-expert annotations of tubes in chest X-rays and shows these annotations can be used to train effective detection models.
Findings
Non-expert annotations achieve moderate to almost perfect agreement with experts.
A chest drain detector trained on non-expert annotations generalizes well to expert labels.
The dataset enhances existing datasets and raises awareness about annotation quality.
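The phrase "moderate to almost perfect" matches the Landis and Koch bands commonly used to interpret Cohen's kappa. This summary does not state which agreement statistic the paper reports, so the sketch below assumes Cohen's kappa on binary per-image labels. The function names and the example labels are hypothetical and made up for illustration. They are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' marginal label rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    classes = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in classes) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch verbal band."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up labels: 1 = tube present, 0 = tube absent.
non_expert = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
expert = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]

kappa = cohen_kappa(non_expert, expert)
print(round(kappa, 3), landis_koch(kappa))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label (for example, "no tube") dominates the dataset.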
Abstract
The advancement of machine learning algorithms in medical image analysis requires the expansion of training datasets. A popular and cost-effective approach is automated annotation extraction from free-text medical reports, primarily due to the high costs associated with expert clinicians annotating medical images, such as chest X-rays. However, it has been shown that the resulting datasets are susceptible to biases and shortcuts. Another strategy to increase the size of a dataset is crowdsourcing, a widely adopted practice in general computer vision with some success in medical image analysis. In a similar vein to crowdsourcing, we enhance two publicly available chest X-ray datasets by incorporating non-expert annotations. However, instead of using diagnostic labels, we annotate shortcuts in the form of tubes. We collect 3.5k chest drain annotations for NIH-CXR14, and 1k annotations for…
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging · Colorectal Cancer Screening and Detection
