# Three-Blind Validation Strategy of Deep Learning Models for Image Segmentation

**Authors:** Andrés Larroza, Francisco Javier Pérez-Benito, Raquel Tendero, Juan Carlos Perez-Cortes, Marta Román, Rafael Llobet

PMC · DOI: 10.3390/jimaging11050170 · Journal of Imaging · 2025-05-21

## TL;DR

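This paper introduces a validation method for image segmentation models in which a third independent expert, blind to who produced each result, assesses a shuffled set of segmentations from multiple human annotators and/or automated models, reducing bias when ground truth is subjective.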

## Contribution

The three-blind validation strategy, a generalizable framework for evaluating segmentation models in tasks where ground truth is subjective and labels vary between and within observers.

## Key findings

- The three-blind validation strategy helps uncover patterns of disagreement that may indicate systematic issues in human or machine annotations.
- The method is demonstrated on a mammography use case involving dense tissue segmentation.
- The authors present the approach as potentially applicable to a broad range of segmentation tasks with high subjectivity.

## Abstract

Image segmentation plays a central role in computer vision applications such as medical imaging, industrial inspection, and environmental monitoring. However, evaluating segmentation performance can be particularly challenging when ground truth is not clearly defined, as is often the case in tasks involving subjective interpretation. These challenges are amplified by inter- and intra-observer variability, which complicates the use of human annotations as a reliable reference. To address this, we propose a novel validation framework—referred to as the three-blind validation strategy—that enables rigorous assessment of segmentation models in contexts where subjectivity and label variability are significant. The core idea is to have a third independent expert, blind to the labeler identities, assess a shuffled set of segmentations produced by multiple human annotators and/or automated models. This allows for the unbiased evaluation of model performance and helps uncover patterns of disagreement that may indicate systematic issues with either human or machine annotations. The primary objective of this study is to introduce and demonstrate this validation strategy as a generalizable framework for robust model evaluation in subjective segmentation tasks. We illustrate its practical implementation in a mammography use case involving dense tissue segmentation while emphasizing its potential applicability to a broad range of segmentation scenarios.
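The blinding step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `build_blinded_set` and `unblind_and_summarize`, the `case-0000` style codes, and the numeric rating scale are all assumptions introduced here. The sketch pools segmentations from every source, shuffles them, hides the source behind an opaque code, and only maps the expert's ratings back to sources after the review is complete.

```python
import random
from collections import defaultdict


def build_blinded_set(segmentations, seed=0):
    """Pool, shuffle, and anonymize segmentations for a blind reviewer.

    segmentations: dict of image_id -> dict of source name -> mask.
    Returns (blinded_items, key). The reviewer sees only blinded_items;
    key (code -> source) is kept by someone else until rating is done.
    All names and the data layout are hypothetical.
    """
    rng = random.Random(seed)
    pooled = [
        (image_id, source, mask)
        for image_id, per_source in segmentations.items()
        for source, mask in per_source.items()
    ]
    rng.shuffle(pooled)  # removes any ordering that could reveal the source

    blinded_items, key = [], {}
    for index, (image_id, source, mask) in enumerate(pooled):
        code = f"case-{index:04d}"
        # The source name is deliberately left out of the reviewer's item.
        blinded_items.append({"code": code, "image_id": image_id, "mask": mask})
        key[code] = source
    return blinded_items, key


def unblind_and_summarize(ratings, key):
    """Map blind ratings back to their sources and average them.

    ratings: dict of code -> numeric score given by the blind expert
    (the scoring scale is an assumption, not taken from the paper).
    """
    scores_by_source = defaultdict(list)
    for code, score in ratings.items():
        scores_by_source[key[code]].append(score)
    return {
        source: sum(scores) / len(scores)
        for source, scores in scores_by_source.items()
    }
```

Keeping `key` separate from `blinded_items` is the point of the design: the reviewer's judgments cannot be influenced by knowing whether a mask came from a colleague or a model, and a source whose average rating is consistently lower than the others is a candidate for the kind of systematic issue the strategy is meant to surface.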

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12113085/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12113085/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12113085/full.md

---
Source: https://tomesphere.com/paper/PMC12113085