Reshape: Adaptive Result-aware Skew Handling for Exploratory Analysis on Big Data
Avinash Kumar, Sadeem Alsudais, Shengquan Ni, Zuozhi Wang, Yicong Huang, Chen Li

TL;DR
Reshape is a framework that adaptively mitigates partitioning skew during data analysis workflows, improving initial result representativeness and efficiency in big data systems.
Contribution
It introduces a novel, adaptive, two-phase skew handling framework that adjusts load distribution during execution, reducing user burden and supporting multiple operators.
Findings
Reshape effectively reduces skew during execution.
It improves the representativeness of initial results.
It demonstrates efficiency on the Amber and Flink systems.
Abstract
The process of data analysis, especially in GUI-based analytics systems, is highly exploratory. The user iteratively refines a workflow multiple times before arriving at the final workflow. In such an exploratory setting, it is valuable to the user if the initial results of the workflow are representative of the final answers so that the user can refine the workflow without waiting for the completion of its execution. Partitioning skew may lead to the production of misleading initial results during the execution. In this paper, we explore skew and its mitigation strategies from the perspective of the results shown to the user. We present a novel framework called Reshape that can adaptively handle partitioning skew in pipelined execution. Reshape employs a two-phase approach that transfers load in a fine-tuned manner to mitigate skew iteratively during execution, thus enabling it to…
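To make the idea of partitioning skew and iterative load transfer concrete, here is a minimal Python sketch. It is not Reshape's actual algorithm: the function names, the modulo partitioner, the gap threshold, and the "move half the gap" rule are all assumptions chosen for illustration, and the sketch works on static tuple counts rather than on a live pipelined execution.

```python
def hash_partition(keys, num_workers):
    """Count how many tuples each worker receives under key-based partitioning.

    Hypothetical partitioner: integer keys are assigned by key % num_workers.
    """
    load = {w: 0 for w in range(num_workers)}
    for k in keys:
        load[k % num_workers] += 1
    return load


def rebalance_step(load, threshold):
    """Run one mitigation iteration on the per-worker load (illustrative rule).

    If the gap between the most and least loaded workers exceeds `threshold`
    (assumed >= 1), move half of the gap from the former to the latter.
    Returns True if any load was moved.
    """
    skewed = max(load, key=load.get)
    helper = min(load, key=load.get)
    gap = load[skewed] - load[helper]
    if gap <= threshold:
        return False
    moved = gap // 2
    load[skewed] -= moved
    load[helper] += moved
    return True


# Key 0 is a heavy hitter, so worker 0 starts with most of the tuples.
keys = [0] * 80 + [1] * 10 + [2] * 10
load = hash_partition(keys, num_workers=3)
initial_load = dict(load)

# Mitigate iteratively until no pair of workers differs by more than the threshold.
while rebalance_step(load, threshold=5):
    pass

print("before:", initial_load)
print("after: ", load)
```

In this toy setting the skewed worker starts with 80 of the 100 tuples, and repeated small transfers bring all workers to within the threshold of each other without changing the total amount of work.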
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
