Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server
Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

TL;DR
This paper analyzes the micro-architectural performance bottlenecks of in-memory data analytics on modern cloud servers using Apache Spark, highlighting scalability issues and memory latency as key limiting factors.
Contribution
It provides a detailed characterization of Spark workloads on a NUMA architecture, identifying micro-architectural bottlenecks and scalability limitations.
Findings
Spark workloads do not scale linearly beyond twelve threads.
Memory-bound latency is the primary cause of work time inflation.
Workload inefficiencies are linked to memory latency and thread imbalance.
Abstract
In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance. Further, at the micro-architecture level, we observe memory-bound latency to be the major cause of work time inflation.
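The scalability terms in the abstract (speedup, work time inflation, thread-level load imbalance) can be made concrete with a small sketch. The definitions below are common working definitions assumed here for illustration, not necessarily the exact ones the authors use, and the function names and sample timings are hypothetical rather than measurements from the paper.

```python
from typing import Sequence


def speedup(serial_time: float, parallel_time: float) -> float:
    """Wall-clock speedup of a parallel run over the single-thread run."""
    return serial_time / parallel_time


def work_time_inflation(serial_work: float, per_thread_work: Sequence[float]) -> float:
    """Assumed definition: total busy time summed over all threads,
    divided by the busy time of the single-thread run.
    A value of 1.0 means the parallel run did no extra work."""
    return sum(per_thread_work) / serial_work


def load_imbalance(per_thread_work: Sequence[float]) -> float:
    """Assumed definition: busiest thread's time over the mean thread time.
    A value of 1.0 means perfectly balanced threads."""
    mean = sum(per_thread_work) / len(per_thread_work)
    return max(per_thread_work) / mean


# Made-up timings in seconds, for illustration only (not from the paper).
serial_work = 100.0
per_thread_work = [30.0, 30.0, 30.0, 50.0]

inflation = work_time_inflation(serial_work, per_thread_work)
imbalance = load_imbalance(per_thread_work)
# Assume the run finishes when the busiest thread finishes.
achieved = speedup(serial_work, max(per_thread_work))

print(f"work time inflation: {inflation:.2f}")
print(f"load imbalance:      {imbalance:.2f}")
print(f"speedup on 4 threads: {achieved:.2f}")
```

With these made-up timings, four threads deliver only a 2x speedup: the threads together spend more time working than the serial run did (inflation above 1.0), and one thread finishes well after the others (imbalance above 1.0). Under these assumed definitions, those are the two effects the abstract names as limits on linear scaling.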
