# Mean Reversion and Heavy Tails: Characterizing Time-Series Data Using Ornstein–Uhlenbeck Processes and Machine Learning

**Authors:** Sebastian Raubitzek, Sebastian Schrittwieser, Georg Goldenits, Alexander Schatten, Kevin Mallinger

PMC · DOI: 10.3390/s26041263 · 2026-02-14

## TL;DR

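This paper introduces a supervised learning method that estimates two local descriptors, the mean-reversion rate θ and a heavy-tail estimate α, from short windows of time-series data. The models are trained on synthetic Ornstein–Uhlenbeck processes with α-stable noise and then applied unchanged to financial, solar, and climate data.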

## Contribution

The paper contributes a supervised learning framework that uses gradient-boosted trees (CatBoost) to estimate the mean-reversion rate θ and the heavy-tail parameter α from short time-series windows.

## Key findings

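- On synthetic Ornstein–Uhlenbeck processes with α-stable noise, the models map window-level statistical features to discrete θ and α categories with high accuracy; confusion occurs predominantly between adjacent classes.
- Applied to daily financial returns, daily sunspot numbers, and NASA POWER climate fields for Austria, the method detects changes in local dynamics: shifts in the financial tail structure after 2010, weaker and more irregular solar cycles after 2005, and a redistribution in clear-sky shortwave irradiance around 2000.
- Because training uses α-stable noise, the framework handles non-Gaussian and heavy-tailed inputs, and the same trained models are used across all three domains without domain-specific tuning.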
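
The synthetic signals mentioned in the first finding can be illustrated with a small simulation. This is a minimal sketch, not the authors' generator: it assumes an Euler discretization of a zero-mean Ornstein–Uhlenbeck process, symmetric α-stable increments drawn with the Chambers-Mallows-Stuck method, and a noise scaling of `dt ** (1 / alpha)`. The function names and the parameters `dt`, `sigma`, `x0`, and `seed` are hypothetical.

```python
import math
import random


def symmetric_alpha_stable(alpha, rng):
    """Draw one standard symmetric alpha-stable variate (Chambers-Mallows-Stuck)."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = max(rng.expovariate(1.0), 1e-300)       # unit exponential, kept away from 0
    # For alpha = 1 the second factor equals 1 and the draw reduces to tan(u) (Cauchy).
    first = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    second = (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    return first * second


def simulate_ou_alpha_stable(theta, alpha, n_steps, dt=0.01, sigma=1.0, x0=0.0, seed=None):
    """Euler scheme for dx = -theta * x * dt + sigma * dL, with L symmetric alpha-stable.

    Assumption: increments over a step of length dt scale as dt ** (1 / alpha).
    Returns a list of n_steps + 1 values, starting at x0.
    """
    rng = random.Random(seed)
    scale = sigma * dt ** (1.0 / alpha)
    x = x0
    path = [x]
    for _ in range(n_steps):
        x = x - theta * x * dt + scale * symmetric_alpha_stable(alpha, rng)
        path.append(x)
    return path


# Example: a short, strongly mean-reverting path with heavy-tailed jumps.
example_path = simulate_ou_alpha_stable(theta=5.0, alpha=1.5, n_steps=500, seed=1)
print(len(example_path), min(example_path), max(example_path))
```

In this sketch a larger `theta` pulls the path back toward zero faster, while a smaller `alpha` makes large jumps more frequent.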

## Abstract

We present a supervised learning method to estimate two local descriptors of time-series dynamics, the mean-reversion rate θ and a heavy-tail estimate α, from short windows of data. These parameters summarize recovery behavior and tail heaviness and are useful for interpreting stochastic signals in sensing applications. The method is trained on synthetic, dimensionless Ornstein–Uhlenbeck processes with α-stable noise, ensuring robustness for non-Gaussian and heavy-tailed inputs. Gradient-boosted tree models (CatBoost) map window-level statistical features to discrete α and θ categories with high accuracy and predominantly adjacent-class confusion. Using the same trained models, we analyze daily financial returns, daily sunspot numbers, and NASA POWER climate fields for Austria. The method detects changes in local dynamics, including shifts in the financial tail structure after 2010, weaker and more irregular solar cycles after 2005, and a redistribution in clear-sky shortwave irradiance around 2000. Because it relies only on short windows and requires no domain-specific tuning, the framework provides a compact diagnostic tool for signal processing, supporting the characterization of local variability, detection of regime changes, and decision making in settings where long-term stationarity is not guaranteed.
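
The abstract's "window-level statistical features" can be pictured with a short sketch. The paper's actual feature set is not given in this summary, so the two features below are assumptions chosen for illustration: the lag-1 autocorrelation of the window (related to how quickly the signal returns to its mean) and a quantile-based tail ratio of the increments (used instead of kurtosis, which is not finite for α < 2). All function names are hypothetical.

```python
import statistics


def sliding_windows(series, width, step=1):
    """Yield consecutive windows of `width` samples, advancing by `step`."""
    for start in range(0, len(series) - width + 1, step):
        yield series[start:start + width]


def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation; returns 0.0 for a constant input."""
    mean = statistics.fmean(values)
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(dev, dev[1:])) / denom


def tail_ratio(values):
    """Width of the 5%-95% quantile range divided by the interquartile range."""
    # n=20 gives 19 cut points at 5%, 10%, ..., 95%.
    q = statistics.quantiles(values, n=20, method="inclusive")
    iqr = q[14] - q[4]  # 75% minus 25%
    if iqr == 0.0:
        return 0.0
    return (q[18] - q[0]) / iqr


def window_features(window):
    """Illustrative feature row for one window (needs at least 3 samples)."""
    increments = [b - a for a, b in zip(window, window[1:])]
    return {
        "lag1_autocorr": lag1_autocorrelation(window),
        "increment_tail_ratio": tail_ratio(increments),
    }


# Example: one feature row per window of 50 samples, advancing by 25.
series = [((i * 37) % 11) - 5.0 for i in range(200)]
rows = [window_features(w) for w in sliding_windows(series, width=50, step=25)]
print(len(rows), rows[0])
```

Rows like these, labeled with the θ and α classes used to generate each synthetic window, could then be passed to a gradient-boosted classifier such as CatBoost, which is the model family the abstract names.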

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12944041/full.md

---
Source: https://tomesphere.com/paper/PMC12944041