Semi-Supervised Siamese Network for Identifying Bad Data in Medical Imaging Datasets

Niamh Belton; Aonghus Lawlor; Kathleen M. Curran

arXiv:2108.07130 · cs.CV · August 17, 2021 · 1 citation

TL;DR

This paper introduces a semi-supervised Siamese network that identifies bad data in medical imaging datasets, improving data quality for robust model training while requiring only a small pool of reference images reviewed by a non-expert.

Contribution

The novel semi-supervised Siamese network method efficiently detects bad medical images using only a small reference set and outperforms previous approaches.

Findings

01 Achieved an AUC of 0.989 in bad data detection

02 Requires only a small pool of reference images, reviewed by a non-expert

03 Effective in identifying images lacking major anatomical structures
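The AUC reported above can be read as the probability that a randomly chosen bad image receives a higher anomaly score than a randomly chosen good one. The sketch below computes it from that pairwise definition. The function name `roc_auc`, the toy scores, and the convention that label 1 means "bad" are illustrative assumptions, not taken from the paper's code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: anomaly scores, higher means "more likely bad" (assumed convention).
    labels: 1 for bad images, 0 for good images (assumed convention).
    """
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    if not bad or not good:
        raise ValueError("AUC needs at least one bad and one good example")

    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0   # bad image correctly ranked above good image
            elif b == g:
                wins += 0.5   # ties count as half a win
    return wins / (len(bad) * len(good))


# Toy example with made-up scores: one bad image (0.4) is ranked
# below one good image (0.6), so 3 of the 4 bad/good pairs are ordered correctly.
toy_auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

This quadratic version is only meant to make the definition concrete; for large datasets a rank-based implementation from an established library would be the more practical choice.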

Abstract

Noisy data present in medical imaging datasets can often aid the development of robust models that are equipped to handle real-world data. However, if the bad data contains insufficient anatomical information, it can have a severe negative effect on the model's performance. We propose a novel methodology using a semi-supervised Siamese network to identify bad data. This method requires only a small pool of 'reference' medical images to be reviewed by a non-expert human to ensure the major anatomical structures are present in the Field of View. The model trains on this reference set and identifies bad data by using the Siamese network to compute the distance between the reference set and all other medical images in the dataset. This methodology achieves an Area Under the Curve (AUC) of 0.989 for identifying bad data. Code will be available at https://git.io/JYFuV.
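The scoring step described in the abstract, measuring the distance between each image and the reference set, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a hypothetical stand-in for the trained Siamese branch, the images are toy feature vectors, and averaging Euclidean distances over the reference set is one plausible aggregation that the abstract does not specify.

```python
import math
from statistics import mean


def anomaly_score(embed, image, reference_images):
    """Mean Euclidean distance between one image and every reference image.

    The same `embed` function is applied to both inputs, which is the
    defining property of a Siamese setup (shared weights).
    """
    z = embed(image)
    return mean(math.dist(z, embed(ref)) for ref in reference_images)


def rank_by_score(embed, images, reference_images):
    """Return image indices sorted from most to least suspicious, plus the scores."""
    scores = [anomaly_score(embed, img, reference_images) for img in images]
    order = sorted(range(len(images)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy usage: the "network" is the identity function and the "images" are
# 2-D points (both hypothetical). The second image lies far from the
# reference set, so it is ranked first.
identity_embed = lambda v: v
reference = [(0.0, 0.0), (0.0, 2.0)]
images = [(0.0, 1.0), (3.0, 1.0)]
order, scores = rank_by_score(identity_embed, images, reference)
```

In practice `embed` would be the trained network's forward pass, and images with the largest scores would be the candidates flagged as bad data.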

Code & Models

Repositories

niamhbelton/Siamese_Network_Bad_Data
pytorch

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Machine Learning in Healthcare · Artificial Intelligence in Healthcare

Methods: Siamese Network