Performance evaluation of job schedulers on Hadoop YARN
Jia-Chun Lin, Ming-Chang Lee

TL;DR
This paper evaluates the performance impacts of different scheduling policies and queue structures in Hadoop YARN, providing insights for better resource management in mixed application environments.
Contribution
It systematically compares four scheduling-policy combinations (SPCs) under three queue structures in YARN, offering guidance for optimizing application performance.
Findings
Different SPCs significantly affect application performance.
Queue structures influence resource sharing efficiency.
The paper offers recommendations for selecting suitable SPCs and queue structures.
Abstract
To solve the limitation of Hadoop on scalability, resource sharing, and application support, the open-source community proposes the next generation of Hadoop's compute platform called Yet Another Resource Negotiator (YARN) by separating resource management functions from the programming model. This separation enables various application types to run on YARN in parallel. To achieve fair resource sharing and high resource utilization, YARN provides the capacity scheduler and the fair scheduler. However, the performance impacts of the two schedulers are not clear when mixed applications run on a YARN cluster. Therefore, in this paper, we study four scheduling-policy combinations (SPCs for short) derived from the two schedulers and then evaluate the four SPCs in extensive scenarios, which consider not only four application types, but also three different queue structures for organizing…
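As a rough illustration of why the choice of scheduling policy matters when several applications compete for the same queue, the toy sketch below contrasts FIFO ordering with simple max-min fair sharing of a fixed pool of containers. It is not the paper's experimental setup and not YARN's actual scheduler code; the function names `fifo_allocate` and `fair_allocate`, the container counts, and the demand figures are all hypothetical and chosen only for illustration.

```python
def fifo_allocate(capacity, demands):
    """Grant containers in submission order until capacity runs out.

    Hypothetical helper: `demands` lists each application's requested
    container count, in the order the applications were submitted.
    """
    allocation = []
    remaining = capacity
    for demand in demands:
        granted = min(demand, remaining)
        allocation.append(granted)
        remaining -= granted
    return allocation


def fair_allocate(capacity, demands):
    """Max-min fair sharing, approximated by round-robin hand-out.

    Hypothetical helper: gives one container at a time to every
    application that still wants more, until capacity or demand is exhausted.
    """
    allocation = [0] * len(demands)
    remaining = capacity
    while remaining > 0:
        progressed = False
        for i, demand in enumerate(demands):
            if remaining == 0:
                break
            if allocation[i] < demand:
                allocation[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            # Every application is fully satisfied; leave the rest idle.
            break
    return allocation


if __name__ == "__main__":
    demands = [8, 6, 4]  # assumed container requests of three applications
    print("FIFO:", fifo_allocate(10, demands))
    print("Fair:", fair_allocate(10, demands))
```

Under FIFO the first application can take most of the pool while later ones wait, whereas fair sharing spreads containers across all running applications. This is the kind of trade-off that an evaluation of scheduling-policy combinations has to measure.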
