Deep Cervix Model Development from Heterogeneous and Partially Labeled Image Datasets
Anabik Pal, Zhiyun Xue, Sameer Antani

TL;DR
This paper introduces a self-supervised learning approach to develop a robust cervical image classification model from heterogeneous, partially labeled datasets, enhancing accuracy and addressing data sharing restrictions.
Contribution
It presents a novel SSL-based method for pre-training and fine-tuning cervical image models across diverse datasets with different labeling criteria and limited labels.
Findings
SSL initialization improves classification accuracy by at least 2.5%.
Including data from multiple datasets further enhances performance.
Federated SSL can outperform models trained on individual datasets.
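The federated finding above can be illustrated with a minimal sketch. The summary does not say which aggregation scheme the authors use, so this example assumes FedAvg-style weighted averaging of per-site model weights; the function name `federated_average`, the weight names, and the site sizes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# A model is represented as a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Average per-site weights, weighting each site by its number of images.

    Assumption: FedAvg-style aggregation; the source does not specify the scheme.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need at least one site and exactly one size per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total number of images must be positive")

    averaged: Weights = {}
    for name in site_weights[0]:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical toy weights from two sites that never exchange images.
site_a = {"encoder.w": [1.0, 2.0], "encoder.b": [0.0]}
site_b = {"encoder.w": [3.0, 6.0], "encoder.b": [4.0]}
global_weights = federated_average([site_a, site_b], [100, 300])
```

Only weights cross site boundaries in this sketch, which is how a federated setup can sidestep the data sharing restrictions mentioned in the TL;DR.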
Abstract
Cervical cancer is the fourth most common cancer in women worldwide. A robust automated cervical image classification system can help clinical care providers overcome the limitations of traditional visual inspection with acetic acid (VIA). However, cervical inspection objectives vary widely, and this affects the labeling criteria used to develop criteria-specific prediction models. Moreover, because confirmatory test results are often lacking and raters disagree on labels, many images are left unlabeled. Motivated by these challenges, we propose a self-supervised learning (SSL) based approach that produces a pre-trained cervix model from unlabeled cervical images. This model is then fine-tuned with the available labeled images to produce criteria-specific classification models. We demonstrate the effectiveness of the proposed approach using two cervical…
Taxonomy
Topics: Cervical Cancer and HPV Research
