A Communication-Efficient Distributed Data Structure for Top-k and k-Select Queries
Felix Biermeier, Björn Feldkord, Manuel Malatyali, Friedhelm Meyer auf der Heide

TL;DR
This paper introduces communication-efficient distributed data structures and protocols for top-k and k-select queries over streaming data from sensor nodes, optimizing message complexity while maintaining accuracy.
Contribution
It presents novel memoryless protocols for top-k and k-select queries and a dynamic data structure for tracking approximate ranks, reducing communication costs in distributed streaming.
Findings
Top-k query protocol uses O(k + log m + log log n) messages.
k-Select query protocol uses O((1/ε^2) log(1/δ) + log m + log^2 log n) messages.
Results are asymptotically tight for certain parameters.
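To get a rough feel for how the two bounds scale, the expressions inside the O(·) can be evaluated directly. The following is a minimal sketch, not part of the paper: the function names are hypothetical, base-2 logarithms are an assumption, log^2 log n is read as (log log n)^2, and the hidden constants of the asymptotic notation are ignored, so the numbers only indicate growth, not actual message counts.

```python
import math


def top_k_bound(k: int, m: int, n: int) -> float:
    """Evaluate k + log m + log log n (base-2 logs and constant factor 1 are assumptions)."""
    return k + math.log2(m) + math.log2(math.log2(n))


def k_select_bound(eps: float, delta: float, m: int, n: int) -> float:
    """Evaluate (1/eps^2) log(1/delta) + log m + (log log n)^2 under the same assumptions."""
    sampling_term = (1.0 / eps ** 2) * math.log2(1.0 / delta)
    return sampling_term + math.log2(m) + math.log2(math.log2(n)) ** 2


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print(top_k_bound(k=10, m=1024, n=65536))
    print(k_select_bound(eps=0.5, delta=0.25, m=1024, n=65536))
```

Under these assumptions, the Top-k expression grows linearly in k but only doubly logarithmically in n, while the k-Select expression does not depend on k at all and is dominated by the 1/ε^2 term for small ε.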
Abstract
We consider the scenario of sensor nodes observing streams of data. The nodes are connected to a central server whose task is to compute some function over all data items observed by the nodes. In our case, there exists a total order on the data items observed by the nodes. Our goal is to compute the k currently lowest observed values or a value with rank in [(1-ε)k, (1+ε)k] with probability (1-δ). We propose solutions for these problems in an extension of the distributed monitoring model where the server can send broadcast messages to all nodes for unit cost. We want to minimize communication over multiple time steps where there are updates to a node's value in between queries. The result is composed of two main parts, each of which may be of independent interest: (1) Protocols which answer Top-k and k-Select queries. These protocols are memoryless…
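The two query types can be pinned down with a centralized reference check that sees all node values at once. This is only a sketch of what counts as a correct answer, not the distributed protocol from the paper: the function names are hypothetical, values are assumed to be distinct, the rank of a value is taken to be its 1-based position in sorted order, and the approximate-rank window is assumed to be [(1-ε)k, (1+ε)k].

```python
from typing import List, Sequence


def top_k(values: Sequence[int], k: int) -> List[int]:
    """Return the k lowest values in ascending order (centralized reference, no communication model)."""
    return sorted(values)[:k]


def rank(values: Sequence[int], x: int) -> int:
    """1-based rank of x among distinct values: the number of values that are <= x."""
    return sum(1 for v in values if v <= x)


def is_valid_k_select(values: Sequence[int], x: int, k: int, eps: float) -> bool:
    """Check that x is an observed value whose rank lies in the assumed window [(1-eps)k, (1+eps)k]."""
    if x not in values:
        return False
    return (1 - eps) * k <= rank(values, x) <= (1 + eps) * k


if __name__ == "__main__":
    # Hypothetical current values held by seven sensor nodes.
    node_values = [7, 3, 9, 1, 5, 8, 2]
    print(top_k(node_values, 3))
    print(is_valid_k_select(node_values, 5, k=4, eps=0.25))
```

A probabilistic k-Select protocol would be considered correct under this reading if its output passes `is_valid_k_select` with probability at least 1-δ.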
