Timestamp tokens: a better coordination primitive for data-processing systems
Andrea Lattuada, Frank McSherry

TL;DR
This paper introduces the timestamp token, a coordination primitive for dataflow systems that minimizes the information shared between a computation and its host system without losing precision about concurrency. Projects built on it have expressed computational idioms that are difficult or impossible to express on other platforms.
Contribution
The paper proposes the timestamp token as a coordination primitive for dataflow systems. Research projects that used it were able to express new computational idioms without designing and implementing whole systems of their own.
Findings
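Timestamp tokens minimize the volume of information shared between the computation and the host system, without surrendering precision about concurrency.
Several projects have used timestamp tokens to explore computational idioms that could not be expressed easily, if at all, on other platforms.
These projects did not need to design and implement whole systems to support their research.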
Abstract
Distributed data processing systems have advanced through models that expose more and more opportunities for concurrency within a computation. The scheduling of these increasingly sophisticated models has become the bottleneck for improved throughput and reduced latency. We present a new coordination primitive for dataflow systems, the timestamp token, which minimizes the volume of information shared between the computation and host system, without surrendering precision about concurrency. Several projects have now used timestamp tokens, and were able to explore computational idioms that could not be expressed easily, if at all, in other platforms. Importantly, these projects did not need to design and implement whole systems to support their research.
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
