# Validation of conformal prediction in cervical atypia classification

**Authors:** Misgina Tsighe Hagos, Antti Suutala, Dmitrii Bychkov, Hakan Kücükel, Joar von Bahr, Milda Poceviciute, Johan Lundin, Nina Linder, Claes Lundström

PMC · DOI: 10.1038/s41598-026-44850-5 · 2026-03-22

## TL;DR

This paper validates conformal prediction sets for cervical atypia classification against annotations from multiple experts, and finds that coverage-based validation alone overestimates how well those sets reflect diagnostic uncertainty.

## Contribution

The study validates conformal prediction sets against expert annotation sets collected from multiple annotators, covering three conformal prediction approaches applied to three deep-learning models trained for cervical atypia classification.

## Key findings

- Conventional coverage-based validation overestimates conformal prediction performance.
- Current conformal prediction methods often produce prediction sets misaligned with human labels.
- Conformal prediction methods can identify ambiguous and out-of-distribution data.
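
The gap between coverage and agreement with human labels can be illustrated with a small sketch. The class names, the toy data, and the choice of Jaccard overlap as the alignment measure are assumptions made for illustration; the paper's actual metrics are not stated in this summary.

```python
# Illustrative sketch: coverage vs. agreement with expert annotation sets.
# Jaccard overlap is an assumed alignment measure, not the paper's metric.

def coverage(pred_sets, true_labels):
    """Fraction of samples whose prediction set contains the true class."""
    hits = sum(1 for s, y in zip(pred_sets, true_labels) if y in s)
    return hits / len(true_labels)

def mean_jaccard(pred_sets, annot_sets):
    """Average |intersection| / |union| between prediction and annotation sets."""
    overlaps = [len(p & a) / len(p | a) for p, a in zip(pred_sets, annot_sets)]
    return sum(overlaps) / len(overlaps)

# Hypothetical toy data with generic class names "A", "B", "C".
pred_sets = [{"A", "B", "C"}, {"B", "C"}, {"A"}]
true_labels = ["A", "B", "A"]
annot_sets = [{"A"}, {"B"}, {"A", "B"}]  # classes chosen by the annotators

cov = coverage(pred_sets, true_labels)
jac = mean_jaccard(pred_sets, annot_sets)
print(f"coverage={cov:.2f}, mean Jaccard={jac:.2f}")
```

In this toy case every prediction set covers the true class, so coverage is perfect, yet the extra classes in the first two sets pull the overlap with the annotation sets well below 1.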

## Abstract

Deep-learning-based cervical cancer classification can potentially increase access to screening in low-resource regions. However, deep learning models are often overconfident and do not reliably reflect diagnostic uncertainty. Moreover, they are typically optimized to generate maximum-likelihood predictions, which fail to convey uncertainty or ambiguity in their results. Such challenges can be addressed using conformal prediction, a model-agnostic framework for generating prediction sets that contain likely classes for trained deep-learning models. The size of these prediction sets indicates model uncertainty, contracting as model confidence increases. However, existing validation of conformal prediction primarily focuses on whether the prediction set includes or covers the true class, often overlooking the presence of extraneous classes. We argue that prediction sets should be truthful and valuable to end users, ensuring that the listed likely classes align with human expectations rather than being overly relaxed and including false positives or unlikely classes. In this study, we comprehensively validate conformal prediction sets using expert annotation sets collected from multiple annotators. We evaluate three conformal prediction approaches applied to three deep-learning models trained for cervical atypia classification. Our expert annotation-based analysis reveals that conventional coverage-based validations overestimate performance and that current conformal prediction methods often produce prediction sets that are not well aligned with human labels. Additionally, we explore the capabilities of the conformal prediction methods in identifying ambiguous and out-of-distribution data.
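
A minimal sketch of how split conformal prediction turns softmax outputs into prediction sets may help make the abstract concrete. It uses a simple thresholding score (one minus the probability of the true class), which is an assumption here and not necessarily one of the three approaches the paper evaluates. The function names, the calibration data, and `alpha` are all hypothetical.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibrate a score threshold on held-out data (split conformal).

    Score = 1 - probability assigned to the true class (assumed score).
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return 1.0  # too few calibration points: include every class
    return scores[k - 1]

def prediction_set(probs, qhat):
    """All class indices whose score falls at or below the threshold."""
    return {c for c, p in enumerate(probs) if 1.0 - p <= qhat}

# Hypothetical calibration softmax outputs for a 3-class model.
cal_probs = [
    [0.9, 0.05, 0.05],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
cal_labels = [0, 1, 2, 0]

qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

confident_set = prediction_set([0.95, 0.03, 0.02], qhat)
ambiguous_set = prediction_set([0.5, 0.5, 0.0], qhat)
print(confident_set, ambiguous_set)
```

The confident input yields a single-class set, while the ambiguous input yields a two-class set, which is the sense in which set size signals model uncertainty. Real calibration sets are far larger than the four samples used here.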

## Linked entities

- **Diseases:** cervical cancer (MONDO:0002974)

## Full-text entities

- **Genes:** LCT (lactase) [NCBI Gene 3938] {aka LAC, LPH, LPH1}, SH2B2 (SH2B adaptor protein 2) [NCBI Gene 10603] {aka APS}
- **Diseases:** RAPS (MESH:D018489), cancer (MESH:D009369), cervical atypia (MESH:D002575), SCC (MESH:D002294), AIS (MESH:D065311), ASC-H (MESH:D000081483), deaths (MESH:D003643), OOD (MESH:D020243), IC (MESH:D009361), ASC (MESH:D065309), H (MESH:D000848), Cervical cancer (MESH:D002583)
- **Chemicals:** oil (MESH:D009821), OOD (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13009466/full.md

---
Source: https://tomesphere.com/paper/PMC13009466