MESSI: In-Memory Data Series Indexing
Botao Peng, Panagiota Fatourou, Themis Palpanas

TL;DR
MESSI is an in-memory data series index that exploits modern hardware parallelism (SIMD instructions, multi-core and multi-socket architectures) to enable real-time similarity search on large datasets, answering queries up to 11x faster than previous methods.
Contribution
This paper introduces MESSI, the first data series index designed for in-memory operation on modern hardware, which speeds up both index construction and query answering on large datasets.
Findings
Up to 4x faster index construction
Up to 11x faster query answering
Real-time similarity search on 100GB datasets in under 75ms
Abstract
Data series similarity search is a core operation for several data series analysis applications across many different domains. However, state-of-the-art techniques fail to deliver the time performance required for interactive exploration or analysis of large data series collections. In this work, we propose MESSI, the first data series index designed for in-memory operation on modern hardware. Our index takes advantage of modern hardware parallelization opportunities (i.e., SIMD instructions, multi-core and multi-socket architectures) in order to accelerate both index construction and similarity search processing times. Moreover, it benefits from a careful design in the setup and coordination of the parallel workers and data structures, so that it maximizes its performance for in-memory operations. Our experiments with synthetic and real datasets demonstrate that overall MESSI…
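To make the operation concrete, here is a minimal sketch of exact nearest-neighbor similarity search over an in-memory collection, with the collection split among parallel workers. It is not MESSI's algorithm: it builds no index and uses no SIMD, and the function names (`squared_euclidean`, `scan_chunk`, `nearest_neighbor`), the choice of Euclidean distance, and the even chunking scheme are all assumptions made for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def squared_euclidean(a, b, best_so_far=math.inf):
    """Squared Euclidean distance with early abandoning.

    Returns math.inf as soon as the partial sum can no longer
    beat best_so_far, so hopeless candidates are cut short.
    """
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= best_so_far:
            return math.inf
    return total


def scan_chunk(query, collection, start, end):
    """Return (best_squared_distance, best_index) within collection[start:end]."""
    best_dist, best_idx = math.inf, -1
    for i in range(start, end):
        d = squared_euclidean(query, collection[i], best_dist)
        if d < best_dist:
            best_dist, best_idx = d, i
    return best_dist, best_idx


def nearest_neighbor(query, collection, workers=4):
    """Exact 1-NN search: each worker scans one contiguous chunk."""
    n = len(collection)
    if n == 0:
        raise ValueError("collection is empty")
    step = math.ceil(n / workers)
    bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda b: scan_chunk(query, collection, *b), bounds))
    # Merge the per-worker answers; ties resolve to the smaller index.
    best_dist, best_idx = min(results)
    return best_idx, math.sqrt(best_dist)


if __name__ == "__main__":
    series = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0], [1.0, 2.0, 1.0]]
    idx, dist = nearest_neighbor([1.0, 1.9, 1.0], series, workers=2)
    print(idx, dist)
```

In CPython, threads running pure-Python arithmetic are serialized by the global interpreter lock, so this sketch shows only the partition-and-merge structure of a parallel scan, not a real speedup. An index such as the one the abstract describes aims to avoid scanning every series in the first place.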
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Management and Algorithms · Music and Audio Processing
