TimeSeriesExam: A time series understanding exam

Yifu Cai; Arjun Choudhry; Mononito Goswami; Artur Dubrawski

arXiv:2410.14752·cs.AI·October 22, 2024

TL;DR

This paper introduces TimeSeriesExam, a comprehensive multiple-choice question test to evaluate large language models' understanding of core time series concepts, revealing strengths and weaknesses across different models.

Contribution

The paper presents a configurable, scalable multiple-choice exam of over 700 questions that systematically assesses LLMs' understanding of time series data, addressing the limited knowledge of what these models actually understand about such data.

Findings

01. GPT-4 and Gemini outperform open-source models on simple concepts

02. All models struggle with causality analysis in time series

03. Question generation is key for assessing LLM understanding

Abstract

Large Language Models (LLMs) have recently demonstrated a remarkable ability to model time series data. These capabilities can be partly explained if LLMs understand basic time series concepts. However, our knowledge of what these models understand about time series data remains relatively limited. To address this gap, we introduce TimeSeriesExam, a configurable and scalable multiple-choice question exam designed to assess LLMs across five core time series understanding categories: pattern recognition, noise understanding, similarity analysis, anomaly detection, and causality analysis. TimeSeriesExam comprises over 700 questions, procedurally generated using 104 carefully curated templates and iteratively refined to balance difficulty with their ability to discriminate good models from bad ones. We test 7 state-of-the-art LLMs on TimeSeriesExam and provide the first comprehensive…
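
The abstract describes questions that are procedurally generated from templates. As a rough illustration of that idea only, the sketch below builds one multiple-choice item about trend direction from a synthetic series. It is not the authors' implementation: the function names (`make_series`, `trend_question`), the question wording, the slope and noise values, and the output fields are all assumptions made for this example.

```python
import random

def make_series(n, slope, noise_std, rng):
    """Return a synthetic series: a linear trend plus Gaussian noise."""
    return [slope * t + rng.gauss(0.0, noise_std) for t in range(n)]

def trend_question(seed, n=64):
    """Hypothetical template: build one multiple-choice item about trend direction."""
    rng = random.Random(seed)  # seeded so the same item can be regenerated
    options = ["upward trend", "downward trend", "no trend"]
    answer = rng.choice(options)
    # Assumed slopes per answer; a real template would expose these as parameters.
    slope = {"upward trend": 0.5, "downward trend": -0.5, "no trend": 0.0}[answer]
    series = make_series(n, slope, noise_std=1.0, rng=rng)
    shuffled = options[:]
    rng.shuffle(shuffled)  # randomize option order so position carries no signal
    return {
        "question": "Which option best describes the overall trend of the series?",
        "series": series,
        "options": shuffled,
        "answer_index": shuffled.index(answer),
    }

item = trend_question(seed=0)
print(item["question"])
for i, opt in enumerate(item["options"]):
    print(f"  ({chr(65 + i)}) {opt}")
```

Because every item is derived from a seed and a small set of parameters, a template of this kind can be sampled repeatedly to produce many distinct questions, which is one plausible reading of "configurable and scalable" in the abstract.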

Code & Models

Repositories

moment-timeseries-foundation-model/timeseriesexam
pytorch

Datasets

AutonLab/TimeSeriesExam1
dataset · 938 downloads

Taxonomy

Topics: Time Series Analysis and Forecasting

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Multi-Head Attention · Adam · Dropout