Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing

Chenghao Lyu; Qi Fan; Fei Song; Arnab Sinha; Yanlei Diao; Wei Chen; Li Ma; Yihui Feng; Yaliang Li; Kai Zeng; Jingren Zhou

arXiv:2207.02026·cs.DB·September 24, 2024

TL;DR

This paper introduces a fine-grained, multi-objective resource optimization system for big data processing, significantly reducing latency and cost through hierarchical modeling and optimization techniques.

Contribution

It presents a novel architecture with instance-level modeling and optimization methods that improve resource management efficiency in big data systems.

Findings

01. Reduced latency by 37-72% in production workloads.
02. Lowered costs by 43-78% compared to existing systems.
03. Achieved fast optimization in 0.02-0.23 seconds.

Abstract

Big data processing at production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, the placement of parallel instances on machines, and the resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system that supports multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these…
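
To make the multi-objective aspect concrete, here is a minimal Python sketch of Pareto filtering over candidate configurations that trade latency against cost. It is not the paper's optimizer: the `Config` type, the `dominates` and `pareto_front` helpers, and every number below are hypothetical, and the latency and cost values stand in for whatever a predictive model would produce.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Config:
    """Hypothetical candidate: a resource decision plus predicted objectives."""
    partitions: int    # partition count
    cores: int         # cores allocated to each instance
    latency_s: float   # predicted latency (made-up value)
    cost: float        # predicted cost (made-up value)


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.latency_s <= b.latency_s and a.cost <= b.cost
    strictly_better = a.latency_s < b.latency_s or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Illustrative candidates only; none of these values come from the paper.
candidates = [
    Config(partitions=8, cores=1, latency_s=120.0, cost=8.0),
    Config(partitions=16, cores=1, latency_s=70.0, cost=9.5),
    Config(partitions=16, cores=2, latency_s=45.0, cost=16.0),
    Config(partitions=32, cores=2, latency_s=50.0, cost=30.0),  # slower and costlier than 16x2
    Config(partitions=8, cores=2, latency_s=125.0, cost=14.0),  # slower and costlier than 8x1
]

front = pareto_front(candidates)
for c in front:
    print(c)
```

The quadratic scan is fine for a handful of candidates. Under the stringent scheduling time constraints the abstract mentions, a production system would need something far more efficient than exhaustive pairwise comparison.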
