# An artificial intelligence platform for automated measurement and count estimation of ovarian follicles during ovarian stimulation and IVF: a multicenter study

**Authors:** Piotr Wygocki, Andrzej Zapała, Mateusz Ulfig, Marcin Zieleń, Krystian Zieliński, Natalia Gajewska, Damian Drzyzga, Marcin Wrochna, Piotr Sankowski, Gerard Letterie

PMC · DOI: 10.1007/s10815-025-03777-y · 2026-01-03

## TL;DR

An AI platform counts and measures ovarian follicles on two-dimensional ultrasound during IVF stimulation, reaching 98.2% precision for follicles ≥10 mm against expert sonographers and cutting annotation time 2.5-fold.

## Contribution

A novel AI platform for automated follicle measurement in IVF that matches expert performance and improves annotation efficiency.

## Key findings

- The AI model achieved 98.2% precision and 88.9% recall for follicles ≥10 mm, and 94.2% precision and 68.9% recall for all follicles.
- Annotation time was reduced 2.5-fold with AI assistance, and model performance was stable across ultrasound systems.
- Expert adjustments averaged 0.54 per scan, indicating high AI accuracy in real-world settings.

## Abstract

Ultrasound measurement of follicle diameter is essential in IVF monitoring. This study evaluates the analytical performance of follicle counts and size measurements from two-dimensional images using an AI-based platform, compared to assessments by certified sonographers.

A total of 5508 transvaginal ultrasound (TVUS) scans from 1689 patients undergoing controlled ovarian stimulation at four IVF centers (Poland, Argentina, Colombia, and the USA) were retrospectively analyzed. All visible follicles were marked with bounding boxes. The dataset comprised three subsets: a training/validation set for model development, an independent test set for evaluating performance across ultrasound systems, and a consensus test set (102 scans from 27 patients) annotated by three expert sonographers. Model performance was assessed using precision, recall, and F1 score. Annotation efficiency was measured by comparing manual and AI-assisted annotation times. Real-world performance was evaluated on a prospective cohort of 904 scans from 269 patients, based on the number of expert adjustments made to AI annotations.
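To make the evaluation metrics concrete, here is a minimal Python sketch of how precision, recall, and F1 could be computed from predicted and reference bounding boxes. The abstract does not state how predictions were matched to expert annotations, so the greedy intersection-over-union (IoU) matching, the 0.5 threshold, and all function names (`iou`, `match_counts`, `precision_recall_f1`) are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# Box as (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_counts(pred: List[Box], ref: List[Box], thr: float = 0.5):
    """Greedily match each prediction to its best unmatched reference box.

    Returns (true positives, false positives, false negatives).
    The 0.5 IoU threshold is a hypothetical choice.
    """
    unmatched = list(range(len(ref)))
    tp = 0
    for p in pred:
        best_idx, best_iou = None, thr
        for j in unmatched:
            value = iou(p, ref[j])
            if value >= best_iou:
                best_idx, best_iou = j, value
        if best_idx is not None:
            unmatched.remove(best_idx)
            tp += 1
    return tp, len(pred) - tp, len(ref) - tp


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard definitions; returns 0.0 where a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example with made-up boxes: two follicles found, one missed, one spurious.
reference = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
predicted = [(1, 1, 10, 10), (21, 20, 31, 30), (60, 60, 70, 70)]
tp, fp, fn = match_counts(predicted, reference)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

In this toy case the counts are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all equal 2/3. A pattern of high precision with lower recall, as reported in the results, corresponds to few spurious boxes but more missed follicles.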

For follicles ≥10 mm, the model achieved 98.2% precision (95% CI, 96.5–99.2), 88.9% recall (85.0–91.8), and 93.3% F1 score (90.7–95.1). For all follicles, precision and recall were 94.2% (92.8–95.4) and 68.9% (65.9–71.9), respectively. Annotation time was reduced 2.5-fold (p < 0.01), and experts made an average of 0.54 adjustments per scan (CI, 0.47–0.62). Model performance was stable across ultrasound platforms.
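As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so the reported 93.3% can be reproduced from the other two figures. The short Python sketch below does this; the helper name `f1_from` is hypothetical. The abstract does not report an F1 score for all follicles, so the second value is only what the reported precision and recall imply, not a figure from the paper.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Follicles >= 10 mm: precision 98.2%, recall 88.9% (from the abstract).
f1_large = f1_from(0.982, 0.889)

# All follicles: precision 94.2%, recall 68.9% (from the abstract).
# The resulting F1 is derived here, not reported in the source.
f1_all = f1_from(0.942, 0.689)

print(f"F1, follicles >= 10 mm: {100 * f1_large:.1f}%")  # 93.3%
print(f"F1, all follicles (derived): {100 * f1_all:.1f}%")  # 79.6%
```

The first value agrees with the reported F1 score. The gap between the two is driven almost entirely by recall, which suggests that the follicles the model misses are mostly those smaller than 10 mm.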

This AI platform enables accurate, automated follicle counting and measurement during ovarian stimulation. It matches expert-level performance, improves efficiency, and supports scalable, cost-effective fertility care without compromising quality.

The online version contains supplementary material available at 10.1007/s10815-025-03777-y.

## Full-text entities

- **Diseases:** IVF (MESH:C537182)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12982739/full.md

---
Source: https://tomesphere.com/paper/PMC12982739