PYRREGULAR: A Unified Framework for Irregular Time Series, with Classification Benchmarks

Francesco Spinnato; Cristiano Landi

arXiv:2505.06047·cs.LG·January 28, 2026

TL;DR

This paper introduces PYRREGULAR, a comprehensive framework and standardized dataset repository for irregular time series classification, facilitating unified evaluation and advancing research across multiple domains.

Contribution

It presents the first unified framework and dataset repository for irregular time series classification, promoting interoperability and standardized benchmarking.

Findings

01

Benchmarking 12 classifiers across 34 datasets

02

Improved evaluation consistency for irregular time series methods

03

Facilitated cross-domain research and comparison

Abstract

Irregular temporal data, characterized by varying recording frequencies, differing observation durations, and missing values, presents significant challenges across fields like mobility, healthcare, and environmental science. Existing research communities often overlook or address these challenges in isolation, leading to fragmented tools and methods. To bridge this gap, we introduce a unified framework, and the first standardized dataset repository for irregular time series classification, built on a common array format to enhance interoperability. This repository comprises 34 datasets on which we benchmark 12 classifier models from diverse domains and communities. This work aims to centralize research efforts and enable a more robust evaluation of irregular temporal data analysis methods.
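To make the idea of a common array format concrete, the following is a minimal sketch of a coordinate (COO-style) layout for irregular series. It is an illustration only and not pyrregular's API: the function names `to_coo` and `density`, the input layout, and the (instance, signal, time) axis order are all assumptions introduced here. Only observed values are stored, so uneven sampling, differing durations, and missing values need no padding.

```python
def to_coo(series_list):
    """Flatten ragged series into COO-style coordinates and values.

    Hypothetical input layout (an assumption, not pyrregular's format):
    a list of instances, each a dict mapping a signal index to a list of
    (timestamp, value) pairs. Missing observations are simply absent.
    """
    # Shared, sorted time axis built from every timestamp that occurs.
    timestamps = sorted(
        {t for inst in series_list for obs in inst.values() for t, _ in obs}
    )
    t_index = {t: k for k, t in enumerate(timestamps)}

    coords, values = [], []
    for i, inst in enumerate(series_list):
        for s, obs in inst.items():
            for t, v in obs:
                # One (instance, signal, time) coordinate per observed value.
                coords.append((i, s, t_index[t]))
                values.append(v)

    n_signals = max((s for inst in series_list for s in inst), default=-1) + 1
    shape = (len(series_list), n_signals, len(timestamps))
    return coords, values, timestamps, shape


def density(values, shape):
    """Fraction of cells in the dense (instance, signal, time) cube that are observed."""
    total = shape[0] * shape[1] * shape[2]
    return len(values) / total if total else 0.0


# Two instances with different sampling times, durations, and signal coverage.
data = [
    {0: [(0.0, 1.5), (0.7, 1.7), (2.0, 2.1)], 1: [(0.0, 10.0)]},
    {0: [(0.5, 3.0), (3.0, 3.2)]},
]
coords, values, timestamps, shape = to_coo(data)
print(shape, density(values, shape))
```

In this toy example only 6 of the 20 cells in the dense cube are observed, which is the kind of sparsity a coordinate layout avoids materializing.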

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

The paper addresses long-standing pain points in irregular time series research, such as fragmented tools, the lack of standardized benchmarks, and reliance on artificially induced irregularity. The benchmark design is rigorous and comprehensive: the authors evaluate a total of twelve methods over 34 datasets.

Weaknesses

The paper focuses exclusively on classification and excludes other important time series tasks, such as forecasting and anomaly detection, from the benchmark. While the paper mentions potential extensions, the authors do not provide technical details or preliminary results for these tasks. Despite emphasizing practicality, the authors provide incomplete details on computational cost: while the paper reports training/inference delay for classifiers, it does not analyze how the framework scales wit

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The paper clearly identifies a major gap in the field: the lack of interoperable tools and standardized benchmarks for irregular time series classification, which has long hindered cross-domain reproducibility and comparison.
2. The proposed array format elegantly combines xarray and sparse COO representations to achieve both flexibility and memory efficiency, while supporting multiple types of irregularity (uneven sampling, partial observation, raggedness).
3. The authors assemble a substa

Weaknesses

1. While the benchmark covers a diverse set of classical and neural classifiers, recent developments in foundation or LLM-based time-series models (e.g., Time-LLM, CALF, or multimodal pretraining frameworks) are not discussed. Including such models, even conceptually, could contextualize where pyrregular fits within the broader trend toward generalist temporal modeling.
2. The paper briefly mentions runtime comparisons, but a deeper discussion of computational efficiency, scalability with data

Reviewer 03 · Rating 6 · Confidence 4

Strengths

+ It addresses a real problem in ITS learning research, despite not being a traditional ICLR paper.
+ The framework covers multiple ITS settings, improving comparability across papers.
+ The proposed pipeline for converting raw ITS into model-ready tensors can accelerate experimentation and deployment.

Weaknesses

+ As the main contribution is the framework itself, the paper could focus more on how it was implemented rather than on evaluating the included models.
+ Although the framework is proposed as unified for ITS, some relevant model families seem absent from the main implementation/benchmark (e.g., latent ODE [1] and RNN variants beyond NCDE, state-space models (SSMs), and foundation models such as TabPFN [2]), despite having been used for ITS with relevant results.
+ TIMESNET is reported as a transformer-based model, but it is not based on t

Code & Models

Repositories

fspinna/pyrregular · Official

Taxonomy

Topics: Time Series Analysis and Forecasting · Machine Learning in Healthcare · Data Stream Mining Techniques