Online Analysis of Distributed Dataflows with Timely Dataflow
Malte Sandstede

TL;DR
ST2 is an online analysis system for distributed dataflows. Built on Timely Dataflow, it supports real-time visualization, debugging, and optimization of running data-parallel computations.
Contribution
The paper introduces ST2, a system powered by Timely Dataflow for real-time analysis, visualization, and debugging of distributed dataflows. It expands on its predecessor, SnailTrail 1, and outperforms it in benchmarks.
Findings
ST2 can analyze distributed dataflows in real time with high efficiency.
ST2 outperforms SnailTrail 1 in performance benchmarks.
Case studies show that the system detects issues in dataflow computations.
Abstract
We present ST2, an end-to-end solution to analyze distributed dataflows in an online setting. It is powered by Timely Dataflow, a low-latency, distributed data-parallel dataflow computational framework, and expands on its predecessor SnailTrail 1, a system to run online critical path analysis on program activity graphs derived from dataflow execution traces. ST2 connects to a running Timely computation, creates the program activity graph representation, and runs multiple analyses on top of it. Analyses include aggregate metrics, progress and temporal invariant checking, and graph pattern matching. Through a command-line interface and a real-time dashboard, users are able to interact with and visualize ST2's analysis results. For ST2's implementation, we discuss Differential Dataflow, a framework that uses differential computation to incrementalize even complex relational dataflow…
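To make the pipeline in the abstract more concrete, the following is a minimal sketch of turning execution trace events into a program activity graph and running an aggregate metric and a simple temporal invariant check on it. It is not ST2's implementation: the names `LogEvent`, `PagEdge`, `build_pag`, `aggregate_metrics`, and `check_max_duration` are hypothetical, the trace format is assumed, and the sketch models only per-worker timelines, leaving out cross-worker communication and the incremental, online evaluation that ST2 performs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical trace record: one operator invocation on one worker."""
    worker: int
    operator: str
    start_ns: int
    end_ns: int


@dataclass(frozen=True)
class PagEdge:
    """An activity spanning two points in time on a single worker."""
    worker: int
    kind: str       # "processing" or "waiting"
    operator: str   # empty string for waiting edges
    start_ns: int
    end_ns: int

    @property
    def duration_ns(self) -> int:
        return self.end_ns - self.start_ns


def build_pag(events):
    """Create processing edges from events and fill idle gaps with waiting edges."""
    by_worker = defaultdict(list)
    for ev in events:
        by_worker[ev.worker].append(ev)

    edges = []
    for worker in sorted(by_worker):
        timeline = sorted(by_worker[worker], key=lambda e: e.start_ns)
        previous_end = None
        for ev in timeline:
            # A gap between two activities on the same worker becomes a waiting edge.
            if previous_end is not None and ev.start_ns > previous_end:
                edges.append(PagEdge(worker, "waiting", "", previous_end, ev.start_ns))
            edges.append(PagEdge(worker, "processing", ev.operator, ev.start_ns, ev.end_ns))
            previous_end = ev.end_ns if previous_end is None else max(previous_end, ev.end_ns)
    return edges


def aggregate_metrics(edges):
    """Aggregate metric: total duration per (worker, activity kind)."""
    totals = defaultdict(int)
    for edge in edges:
        totals[(edge.worker, edge.kind)] += edge.duration_ns
    return dict(totals)


def check_max_duration(edges, limit_ns):
    """Temporal invariant: return every edge that runs longer than limit_ns."""
    return [edge for edge in edges if edge.duration_ns > limit_ns]


# Illustrative trace, not data from the paper.
events = [
    LogEvent(worker=0, operator="map", start_ns=0, end_ns=40),
    LogEvent(worker=0, operator="reduce", start_ns=60, end_ns=100),
    LogEvent(worker=1, operator="map", start_ns=0, end_ns=90),
]
edges = build_pag(events)
metrics = aggregate_metrics(edges)
violations = check_max_duration(edges, limit_ns=50)
```

With this illustrative trace, `metrics` separates processing time from waiting time for each worker, and `violations` holds the one activity that exceeds the 50 ns limit. An online system would maintain these results continuously as new events arrive instead of recomputing them from a full trace.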
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
