The UCR Time Series Archive

Hoang Anh Dau; Anthony Bagnall; Kaveh Kamgar; Chin-Chia Michael Yeh; Yan Zhu; Shaghayegh Gharghabi; Chotirat Ann Ratanamahatana; Eamonn Keogh

arXiv:1810.07758·cs.LG·September 10, 2019

TL;DR

The paper presents the expansion of the UCR Time Series Archive from 85 to 128 data sets, offers practical advice for evaluating new algorithms on it, and argues that many papers reporting an improvement over the 1-nearest neighbor baseline may be mis-attributing the source of that improvement.

Contribution

The paper introduces the newly added data sets, provides evaluation guidance for anyone testing an algorithm on the archive, and suggests that many reported improvements may be due to simpler modifications rather than to the novel algorithms credited for them.

Findings

01

Archive expanded from 85 to 128 datasets

02

Many papers may misattribute improvements to complex algorithms

03

Simple modifications can often replicate reported gains

Abstract

The UCR Time Series Archive, introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets, but since that time it has gone through periodic expansions. The last expansion took place in the summer of 2015, when the archive grew from 45 to 85 data sets. This paper introduces and focuses on the new expansion from 85 to 128 data sets. Beyond expanding this valuable resource, this paper offers pragmatic advice to anyone who may wish to evaluate a new algorithm on the archive. Finally, this paper makes a novel and yet actionable claim: of the hundreds of papers that show an improvement over the standard baseline (1-nearest neighbor classification), a large fraction may be mis-attributing the…
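For readers unfamiliar with the baseline the abstract mentions, the following is a minimal sketch of 1-nearest neighbor classification for time series. The abstract does not say which distance measure the baseline uses, so Euclidean distance is an assumption here. The function names and the toy series are hypothetical, chosen only for illustration, and are not taken from the archive.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train_X, train_y, query):
    """Return the label of the training series closest to `query`."""
    best_index = min(
        range(len(train_X)),
        key=lambda i: euclidean(train_X[i], query),
    )
    return train_y[best_index]


def error_rate(train_X, train_y, test_X, test_y):
    """Fraction of test series whose predicted label is wrong."""
    wrong = sum(
        predict_1nn(train_X, train_y, query) != label
        for query, label in zip(test_X, test_y)
    )
    return wrong / len(test_X)


# Toy data, made up for illustration only (not from the archive).
train_X = [[0.0, 0.1, 0.0, -0.1], [1.0, 0.9, 1.1, 1.0]]
train_y = ["flat", "high"]
test_X = [[0.05, 0.0, -0.05, 0.0], [0.9, 1.0, 1.0, 1.1]]
test_y = ["flat", "high"]

print(error_rate(train_X, train_y, test_X, test_y))
```

A new algorithm evaluated on the archive would typically be compared against an error rate computed this way on the same train and test split.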
