Scout: An Experienced Guide to Find the Best Cloud Configuration
Chin-Jung Hsu, Vivek Nair, Tim Menzies, Vincent W. Freeh

TL;DR
SCOUT is a novel cloud configuration search method that leverages prior measurements and low-level performance metrics to efficiently find optimal setups, outperforming existing methods in cost and performance.
Contribution
The paper introduces SCOUT, a new approach that uses historical data and low-level metrics to improve cloud configuration search efficiency and effectiveness.
Findings
SCOUT outperforms state-of-the-art methods in finding better configurations.
Using prior measurements reduces search cost significantly.
Low-level performance metrics are crucial for effective configuration tuning.
Abstract
Finding the right cloud configuration for workloads is an essential step to ensure good performance and contain running costs. A poor choice of cloud configuration decreases application performance and increases running cost significantly. While Bayesian Optimization is effective and applicable to any workload, it is fragile because performance and workload are hard to model (and thus to predict). In this paper, we propose a novel method, SCOUT. The central insight of SCOUT is that using prior measurements, even those for different workloads, improves search performance and reduces search cost. At its core, SCOUT extracts search hints (inferences of resource requirements) from low-level performance metrics. Such hints enable SCOUT to navigate the search space more efficiently: only the spotlight region is searched. We evaluate SCOUT with 107 workloads on Apache Hadoop and Spark.…
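To make the idea of a search hint concrete, here is a minimal Python sketch of how low-level metrics could narrow a search to a "spotlight region". It is an illustration under stated assumptions, not the paper's algorithm: the `Config` type, the candidate list, the metric names, and the fixed 0.8 utilization threshold are all hypothetical, and the threshold rule stands in for whatever SCOUT infers from prior measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical cloud configuration (not taken from the paper)."""
    name: str
    vcpus: int
    memory_gb: int


def infer_hint(metrics, high=0.8):
    """Return the resources whose utilization looks saturated.

    Assumption: a fixed threshold replaces the inference that SCOUT
    derives from prior measurements.
    """
    return {res for res, util in metrics.items() if util >= high}


def spotlight(current, candidates, hint):
    """Keep only candidates that add capacity for every saturated resource."""
    region = []
    for c in candidates:
        if c == current:
            continue
        if "cpu" in hint and c.vcpus <= current.vcpus:
            continue
        if "memory" in hint and c.memory_gb <= current.memory_gb:
            continue
        region.append(c)
    return region


# Hypothetical search space.
CANDIDATES = [
    Config("small", 2, 4),
    Config("cpu-heavy", 8, 4),
    Config("mem-heavy", 2, 32),
    Config("large", 8, 32),
]

current = CANDIDATES[0]
# Made-up low-level metrics observed while running on `current`.
hint = infer_hint({"cpu": 0.35, "memory": 0.93})
region = spotlight(current, CANDIDATES, hint)
print([c.name for c in region])
```

In this toy run memory is the only saturated resource, so configurations that do not add memory are dropped before any of them is measured. That pruning is the source of the cost saving the abstract describes.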
Taxonomy
Topics: Cloud Computing and Resource Management · Data Stream Mining Techniques · Machine Learning and Data Classification
