StreamBed: capacity planning for stream processing
Guillaume Rosinosky, Donatien Schmitz, Etienne Rivière

TL;DR
StreamBed is a capacity planning system for stream processing that accurately predicts resource needs for large-scale queries using small testbeds, aiding efficient deployment.
Contribution
It introduces a novel capacity planning approach that models resource requirements through small-scale testing to predict large-scale deployment needs.
Findings
Accurately predicts capacity requirements for jobs spanning more than 1,000 cores
Builds its predictions from runs on a testbed of only 48 cores
Evaluated with large-scale queries of the Nexmark benchmark on the Flink DSP engine
Abstract
StreamBed is a capacity planning system for stream processing. It predicts, ahead of any production deployment, the resources that a query will require to process an incoming data rate sustainably, and the appropriate configuration of these resources. StreamBed builds a capacity planning model by piloting a series of runs of the target query in a small-scale, controlled testbed. We implement StreamBed for the popular Flink DSP engine. Our evaluation with large-scale queries of the Nexmark benchmark demonstrates that StreamBed can effectively and accurately predict capacity requirements for jobs spanning more than 1,000 cores using a testbed of only 48 cores.
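The idea of extrapolating from small testbed runs to a large deployment can be illustrated with a minimal sketch. This is not StreamBed's actual model: it assumes, purely for illustration, that the maximum sustainable rate grows linearly with the number of cores, and the measurement values and the names `fit_line` and `cores_needed` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def cores_needed(target_rate, slope, intercept):
    """Smallest core count whose predicted sustainable rate meets target_rate."""
    return math.ceil((target_rate - intercept) / slope)


# Hypothetical testbed measurements (illustrative values, not from the paper):
# core count and the maximum rate (events/s) sustained at that core count.
testbed_cores = [8, 16, 32, 48]
testbed_rates = [80_000, 160_000, 320_000, 480_000]

slope, intercept = fit_line(testbed_cores, testbed_rates)

# Extrapolate beyond the testbed size to a hypothetical target input rate.
estimate = cores_needed(10_000_000, slope, intercept)
print(f"estimated cores for 10M events/s: {estimate}")
```

A real query rarely scales perfectly linearly, so a practical model would need to account for per-operator behavior and configuration; the sketch only shows the shape of the measure, fit, and extrapolate loop.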
Taxonomy
Topics: Advanced Database Systems and Queries · Cloud Computing and Resource Management · Data Management and Algorithms
