LeJOT: An Intelligent Job Cost Orchestration Solution for Databricks Platform
Lizhi Ma, Yi-Xiang Hu, Yuke Wang, Yifang Zhao, Yihui Ren, Jian-Xiang Liao, Feng Wu, Xiang-Yang Li

TL;DR
LeJOT is an intelligent, machine learning-based job orchestration framework for Databricks that predicts workload demands and optimizes resource allocation to reduce operational costs effectively.
Contribution
The paper introduces a proactive scheduling approach that combines workload prediction with solver-based optimization for cost-efficient resource management on Databricks.
Findings
Achieves 20% cost reduction on real workloads
Outperforms static allocation strategies
Operates within minute-level scheduling timeframe
Abstract
With the rapid advancements in big data technologies, the Databricks platform has become a cornerstone for enterprises and research institutions, offering high computational efficiency and a robust ecosystem. However, managing the escalating operational costs associated with job execution remains a critical challenge. Existing solutions rely on static configurations or reactive adjustments, which fail to adapt to the dynamic nature of workloads. To address this, we introduce LeJOT, an intelligent job cost orchestration framework that leverages machine learning for execution time prediction and a solver-based optimization model for real-time resource allocation. Unlike conventional scheduling techniques, LeJOT proactively predicts workload demands, dynamically allocates computing resources, and minimizes costs while ensuring performance requirements are met. Experimental results on…
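The predict-then-optimize idea described in the abstract can be illustrated with a minimal sketch. This is not LeJOT's implementation: the Amdahl-style runtime formula stands in for the learned execution-time predictor, an exhaustive search stands in for the solver, and the cluster names, worker counts, and prices are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ClusterOption:
    """A hypothetical cluster configuration (not a real Databricks SKU)."""
    name: str
    workers: int
    cost_per_hour: float  # assumed price in arbitrary currency units


def predict_runtime_hours(base_hours: float, workers: int,
                          serial_fraction: float = 0.1) -> float:
    """Stand-in for a learned predictor: Amdahl-style scaling.

    base_hours is the assumed single-worker runtime; serial_fraction is
    the assumed share of the job that does not parallelize.
    """
    return base_hours * (serial_fraction + (1.0 - serial_fraction) / workers)


def cheapest_feasible(
    base_hours: float,
    deadline_hours: float,
    options: Sequence[ClusterOption],
) -> Optional[Tuple[ClusterOption, float, float]]:
    """Return (option, predicted_runtime, predicted_cost) with the lowest
    cost among options whose predicted runtime meets the deadline, or
    None when no option is feasible. Exhaustive search replaces the solver.
    """
    best = None
    for opt in options:
        runtime = predict_runtime_hours(base_hours, opt.workers)
        if runtime > deadline_hours:
            continue  # violates the performance requirement
        cost = runtime * opt.cost_per_hour
        if best is None or cost < best[2]:
            best = (opt, runtime, cost)
    return best


# Hypothetical catalogue of cluster sizes.
OPTIONS = [
    ClusterOption("small", workers=2, cost_per_hour=1.0),
    ClusterOption("medium", workers=4, cost_per_hour=2.0),
    ClusterOption("large", workers=8, cost_per_hour=4.0),
]

if __name__ == "__main__":
    choice = cheapest_feasible(base_hours=4.0, deadline_hours=1.5,
                               options=OPTIONS)
    if choice is None:
        print("no configuration meets the deadline")
    else:
        opt, runtime, cost = choice
        print(f"{opt.name}: {runtime:.2f} h, cost {cost:.2f}")
```

Under these assumed numbers, the smallest cluster is cheapest but misses a 1.5-hour deadline, so the search settles on the medium one. A static allocation that always picks the largest cluster would meet the deadline at a higher predicted cost, which is the kind of gap a proactive scheduler aims to close.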
Taxonomy
Topics: Cloud Computing and Resource Management · Big Data and Digital Economy · Software System Performance and Reliability
