I/O-Efficient Data Structures for Colored Range and Prefix Reporting
Kasper Green Larsen, Rasmus Pagh

TL;DR
This paper introduces optimal I/O-efficient data structures for colored range and prefix reporting problems, significantly improving query efficiency and space usage in the external memory model.
Contribution
It presents the first optimal linear-space data structures for colored range and prefix reporting that exploit full machine capabilities, breaking previous indivisibility assumptions.
Findings
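Answers colored range reporting queries in O(1 + k/B) I/Os using linear space, where k is the output size and B is the disk block size in words.
Obtains the same bound for the harder problem of three-sided orthogonal range reporting.
Extends the results to colored prefix reporting, with an efficient top-k extension.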
Abstract
Motivated by information retrieval applications, we consider the one-dimensional colored range reporting problem in rank space. The goal is to build a static data structure for sets C_1,...,C_m \subseteq {1,...,sigma} that supports queries of the kind: Given indices a,b, report the set Union_{a <= i <= b} C_i. We study the problem in the I/O model, and show that there exists an optimal linear-space data structure that answers queries in O(1+k/B) I/Os, where k denotes the output size and B the disk block size in words. In fact, we obtain the same bound for the harder problem of three-sided orthogonal range reporting. In this problem, we are to preprocess a set of n two-dimensional points in rank space, such that all points inside a query rectangle of the form [x_1,x_2] x (-infinity,y] can be reported. The best previous bounds for this problem are either O(n lg^2_B n) space and O(1+k/B)…
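To make the two query types in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's data structure: it scans the input on every query, so it does not achieve the O(1+k/B) I/O bound and serves only as a reference for what a correct answer looks like. The function names `colored_range_report` and `three_sided_report` are hypothetical, chosen for illustration.

```python
from typing import List, Set, Tuple


def colored_range_report(sets: List[Set[int]], a: int, b: int) -> Set[int]:
    """Return the union of C_a, ..., C_b (1-indexed, inclusive).

    Brute-force baseline: scans every set in the range, so the cost grows
    with the total size of the scanned sets, not with the output size k.
    """
    if not (1 <= a <= b <= len(sets)):
        raise ValueError("need 1 <= a <= b <= m")
    result: Set[int] = set()
    for colors in sets[a - 1:b]:
        result |= colors
    return result


def three_sided_report(
    points: List[Tuple[int, int]], x1: int, x2: int, y: int
) -> List[Tuple[int, int]]:
    """Return all points inside [x1, x2] x (-infinity, y].

    Brute-force baseline: checks every point against the query rectangle.
    """
    return [(px, py) for (px, py) in points if x1 <= px <= x2 and py <= y]


# Illustrative data (assumed, not taken from the paper).
sets = [{1, 2}, {2, 3}, {5}, {1}]
colors = colored_range_report(sets, 2, 3)  # union of C_2 and C_3

points = [(1, 4), (2, 1), (3, 3), (4, 2)]
hits = three_sided_report(points, 2, 4, 2)  # x in [2, 4], y-coordinate <= 2
```

In this toy instance, color 2 appears in both C_1 and C_2 but is reported only once in a union. Avoiding repeated work on such duplicates is what separates colored reporting from plain range reporting.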
Taxonomy
Topics: Algorithms and Data Compression · Machine Learning and Algorithms · Advanced Image and Video Retrieval Techniques
