Unsupervised Machine Learning for Detecting Structural Anomalies in European Regional Statistics

Bogdan Oancea

arXiv:2605.02884 · cs.LG · May 5, 2026

TL;DR

This paper introduces an unsupervised machine learning framework to detect structurally atypical regional profiles in Europe using Eurostat data, improving validation of high-dimensional socio-economic statistics.

Contribution

It compares five anomaly detection methods and proposes a scalable, reproducible approach for identifying meaningful regional anomalies in European statistics.

Findings

01. Identified regions with divergent socio-economic profiles, including major cities and disadvantaged areas.

02. Machine learning methods consistently flagged regions with significant profile deviations.

03. The framework is compatible with existing validation workflows and scalable for broader use.

Abstract

Ensuring the coherence of regional socio-economic statistics is a central task for national statistical institutes. Traditional validation tools, such as range edits, ratio checks, or univariate outlier detection, are effective for identifying extreme values in individual series but are less suited to detecting unusual combinations of indicators in high-dimensional settings. This paper proposes an unsupervised machine learning framework for identifying structurally atypical regional profiles within Europe using publicly available Eurostat data. We construct a cross-sectional dataset of NUTS2 regions (2022) covering four key indicators: GDP per capita in PPS, unemployment rate, tertiary educational attainment, and population density. We apply and compare five anomaly detection techniques: univariate z-scores, Mahalanobis distance, Isolation Forest, Local Outlier Factor, and One-Class…
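
The contrast the abstract draws between univariate checks and multivariate profile detection can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, only two hypothetical indicators are used instead of the four Eurostat ones, and the thresholds (|z| > 3 and a 0.99 chi-square cutoff) are assumptions chosen for illustration. The last record has unremarkable values on each indicator separately but an unusual combination of the two, so the z-score check misses it while the Mahalanobis distance flags it.

```python
import math
import numpy as np

# Synthetic, hypothetical data: 21 "typical" regions whose two indicators
# move together, plus one region whose indicators point in opposite directions.
x = np.linspace(-2.0, 2.0, 21)
noise = 0.1 * np.where(np.arange(21) % 2 == 0, 1.0, -1.0)
y = x + noise
X = np.vstack([np.column_stack([x, y]), [1.0, -1.0]])  # last row: atypical combination

mean = X.mean(axis=0)

# Univariate check: flag a region if any single indicator has |z| > 3
# (threshold is an assumption, not taken from the paper).
std = X.std(axis=0, ddof=1)
z = (X - mean) / std
z_flags = (np.abs(z) > 3.0).any(axis=1)

# Multivariate check: Mahalanobis distance using the sample covariance.
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mean
md = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# For 2 degrees of freedom the chi-square quantile has the closed form
# -2 * ln(1 - p); the 0.99 level is an illustrative assumption.
cutoff = math.sqrt(-2.0 * math.log(1.0 - 0.99))
md_flags = md > cutoff

print("flagged by z-score:", np.flatnonzero(z_flags))
print("flagged by Mahalanobis:", np.flatnonzero(md_flags))
```

The same pattern extends to more indicators by adding columns to `X`, although the closed-form cutoff used here holds only for two dimensions; a general chi-square quantile function would be needed otherwise.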
