# Outliers and anomalies in training and testing datasets for AI-powered morphometry—evidence from CT scans of the spleen

**Authors:** Yuriy Vasilev, Anastasia Pamova, Tatiana Bobrovskaya, Anton Vladzimirskyy, Olga Omelyanskaya, Elena Astapenko, Artem Kruchinkin, Novik Vladimir, Kirill Arzamasov

PMC · DOI: 10.3389/frai.2025.1607348 · 2025-07-15

## TL;DR

This study explores methods to detect outliers and anomalies in medical datasets used for training AI to measure organ sizes, using spleen CT scans as an example.

## Contribution

The study identifies effective methods for detecting anomalies in morphometric datasets, combining visual, statistical, and machine learning approaches.

## Key findings

- Visual methods like boxplots and histograms were effective for identifying outliers.
- Machine learning algorithms such as one-class support vector machines (OSVM), k-nearest neighbors (KNN), and autoencoders also proved useful.
- A total of 32 outlier anomalies were detected in the spleen dataset.
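
The boxplot-based screening mentioned above rests on the 1.5 × interquartile range rule. The following is a minimal sketch of that rule, not the authors' implementation: the function name `iqr_outliers`, the sample values, and the choice of the "inclusive" quantile method are all assumptions made for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    # Quartiles of the sample; "inclusive" treats the data as the full population.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Anything beyond the fences would be drawn as a separate point on a boxplot.
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Hypothetical spleen lengths in mm (illustrative values, not from the study).
# The last value mimics an input error such as a misplaced decimal point.
lengths_mm = [98, 102, 105, 107, 110, 111, 113, 115, 118, 120, 1100]

lower, upper, flagged = iqr_outliers(lengths_mm)
print(f"fences: [{lower:.2f}, {upper:.2f}] mm, flagged: {flagged}")
```

A flagged value only indicates that a measurement deserves a second look; whether it is an entry error or a genuinely enlarged organ still has to be decided clinically.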

## Abstract

Creating training and testing datasets for machine learning algorithms to measure linear dimensions of organs is a tedious task. There are no universally accepted methods for evaluating outliers or anomalies in such datasets. This can cause errors in machine learning and compromise the quality of end products. The goal of this study is to identify optimal methods for detecting organ anomalies and outliers in medical datasets designed to train and test neural networks in morphometrics.

A dataset was created containing linear measurements of the spleen obtained from CT scans. Labelling was performed by three radiologists. The sample included studies of N = 197 patients. The measurements were analyzed using visual methods (1.5 × interquartile range rule; heat map; boxplot; histogram; scatter plot), machine learning algorithms (Isolation Forest; Density-Based Spatial Clustering of Applications with Noise; K-nearest neighbors algorithm; Local outlier factor; One-class support vector machines; EllipticEnvelope; Autoencoders), and mathematical statistics (z-score; Grubbs' test; Rosner's test).
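
Of the statistical methods listed, the z-score is the simplest to illustrate. The sketch below is an assumption-laden example rather than the study's procedure: the threshold of 3.0 is a common convention that the source does not state, and the function names and sample values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize each value against the sample mean and standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(v - mu) / sigma for v in values]


def z_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    # threshold=3.0 is an assumed convention, not a value taken from the study.
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical spleen lengths in mm (illustrative values, not from the study).
lengths_mm = [95, 98, 100, 102, 103, 105, 106, 107, 108, 110,
              110, 111, 112, 113, 115, 116, 118, 120, 122, 1100]

flagged = z_outliers(lengths_mm)
print(flagged)
```

Because an extreme value inflates both the mean and the standard deviation, the largest attainable |z| in a sample of size n is (n - 1) / sqrt(n). In very small samples a fixed threshold of 3 can therefore miss even gross errors.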

We identified measurement errors, input errors, abnormal size values and non-standard shapes of the organ (sickle-shaped, round, triangular, additional lobules). The most effective methods included visual techniques (including boxplots and histograms) and machine learning algorithms such as OSVM, KNN and autoencoders. A total of 32 outlier anomalies were found.
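
One common way to use k-nearest neighbors for outlier detection is to score each case by its mean distance to its k closest neighbors. The sketch below shows that idea under stated assumptions: the source does not say how KNN was configured, so the scoring rule, `k=3`, the function name `knn_scores`, and the paired measurements are all hypothetical.

```python
from math import dist


def knn_scores(points, k=3):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances to every other point, smallest first.
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(distances[:k]) / k)
    return scores


# Hypothetical (length_mm, width_mm) pairs, not taken from the study.
# The last pair has an ordinary length but an unusual width, so a check
# on length alone would not single it out.
measurements = [
    (100, 40), (104, 42), (108, 41), (110, 44), (112, 43),
    (115, 45), (118, 46), (120, 44), (106, 43), (111, 70),
]

scores = knn_scores(measurements)
# Rank cases from most to least isolated.
ranked = sorted(zip(scores, measurements), reverse=True)
most_isolated = ranked[0][1]
print(most_isolated, round(ranked[0][0], 1))
```

Both features here share the same unit, so no rescaling is applied; features on different scales would normally be standardized first so that no single dimension dominates the distance.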

Curation of complex morphometric datasets must involve thorough mathematical and clinical analyses. Relying solely on mathematical statistics or machine learning methods appears inadequate.

## Full-text entities

- **Genes:** PKD2 (polycystin 2, transient receptor potential cation channel) [NCBI Gene 5311] {aka APKD2, PC2, PKD4, Pc-2, TRPP2}, PCSK1 (proprotein convertase subtilisin/kexin type 1) [NCBI Gene 5122] {aka BMIQ12, NEC1, PC1, PC1/3, PC3, SPC3}
- **Diseases:** abnormalities of the spleen (MESH:D013159), AI (MESH:C538142), anomalies (MESH:D000013), abnormalities (MESH:D000014), splenomegaly (MESH:D013163)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12303909/full.md

---
Source: https://tomesphere.com/paper/PMC12303909