Compact Representations of Event Sequences
Nieves R. Brisaboa, Guillermo de Bernardo, Gonzalo Navarro, Tirso V. Rodeiro, Diego Seco

TL;DR
This paper presents a flexible, space-efficient method for compressing and querying large multidimensional event sequences, leveraging data regularities and customizable query-driven representations.
Contribution
It introduces two novel compressed representations for multidimensional sequences that optimize storage and query performance based on domain-specific needs.
Findings
Significant space savings demonstrated on real datasets
Efficient aggregation query performance achieved
Flexible representation adapts to different query types
Abstract
We introduce a new technique for the efficient management of large sequences of multidimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation queries. More importantly, our representation is flexible: the dimensions and queries relevant to the domain can guide the construction process, yielding a space-time tradeoff tailored to those queries. We provide two alternative representations for sequences of multidimensional data and describe the techniques to efficiently store the datasets and to perform aggregation queries over the compressed representation. We perform an experimental evaluation on realistic datasets, showing the space efficiency and query capabilities of our proposal.
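To make the idea of answering aggregation queries directly over a compressed sequence concrete, here is a minimal toy sketch. It is not either of the paper's two representations, which the abstract does not detail; it assumes a single-dimension event sequence, swaps in plain run-length encoding as the compression scheme, and uses a hypothetical class name (`RunLengthSequence`) and method (`count`) chosen only for illustration.

```python
from bisect import bisect_right
from itertools import groupby


class RunLengthSequence:
    """Toy run-length encoding of an event sequence (illustrative only,
    not the representation proposed in the paper)."""

    def __init__(self, events):
        self.values = []  # value of each run
        self.starts = []  # starting position of each run
        pos = 0
        for value, group in groupby(events):
            self.values.append(value)
            self.starts.append(pos)
            pos += sum(1 for _ in group)
        self.length = pos  # total number of events

    def count(self, value, lo, hi):
        """Count occurrences of `value` in positions [lo, hi)
        without decompressing the sequence."""
        lo, hi = max(lo, 0), min(hi, self.length)
        if lo >= hi:
            return 0
        total = 0
        # Binary search for the run that contains position `lo`.
        run = bisect_right(self.starts, lo) - 1
        while run < len(self.starts) and self.starts[run] < hi:
            if run + 1 < len(self.starts):
                run_end = self.starts[run + 1]
            else:
                run_end = self.length
            if self.values[run] == value:
                # Add only the part of the run that overlaps [lo, hi).
                total += min(run_end, hi) - max(self.starts[run], lo)
            run += 1
        return total


events = ["idle"] * 4 + ["move"] * 3 + ["idle"] * 2 + ["stop"]
seq = RunLengthSequence(events)
idle_count = seq.count("idle", 2, 9)
```

In this sketch, long repeated stretches (the kind of regularity the abstract refers to) collapse into a single run each, and a range count touches only the runs that overlap the query interval. A real multidimensional structure would need considerably more than this.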
