Optimizing Data-Intensive Computations in Existing Libraries with Split Annotations

Shoumik Palkar; Matei Zaharia

arXiv:1810.12297 · cs.DC · September 20, 2019

TL;DR

This paper introduces split annotations (SAs), a technique that enables data movement optimizations in existing data-intensive libraries without modifying their code, with reported speedups of up to 15x.

Contribution

The paper presents split annotations (SAs), a method that enables data movement optimizations over unmodified library functions. Developers only annotate functions and implement an API that specifies how to partition the library's data.

Findings

1. Up to 15x speedup on workloads that use libraries such as Intel MKL and Pandas.

2. Performance gains comparable or superior to approaches that require rewriting library functions.

3. Cross-function data pipelining and parallelization over unmodified library functions.

Abstract

Data movement between main memory and the CPU is a major bottleneck in parallel data-intensive applications. In response, researchers have proposed using compilers and intermediate representations (IRs) that apply optimizations such as loop fusion under existing high-level APIs such as NumPy and TensorFlow. Even though these techniques generally do not require changes to user applications, they require intrusive changes to the library itself: often, library developers must rewrite each function using a new IR. In this paper, we propose a new technique called split annotations (SAs) that enables key data movement optimizations over unmodified library functions. SAs only require developers to annotate functions and implement an API that specifies how to partition data in the library. The annotation and API describe how to enable cross-function data pipelining and parallelization, while…
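The pipelining idea in the abstract can be illustrated with a minimal sketch. This is not the paper's API: the names `split`, `merge`, `pipeline`, `lib_scale`, and `lib_add_one` are hypothetical, and plain Python lists stand in for library arrays. The assumption is that each "library" function is left untouched, and a small driver partitions the input and passes each chunk through every function before moving on to the next chunk, so a chunk can stay in cache across calls.

```python
from functools import partial
from typing import Callable, List

Chunk = List[float]


def split(data: Chunk, chunk_size: int) -> List[Chunk]:
    """Hypothetical split function: partition data into fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def merge(pieces: List[Chunk]) -> Chunk:
    """Hypothetical merge function: concatenate chunk results in order."""
    out: Chunk = []
    for piece in pieces:
        out.extend(piece)
    return out


# Stand-ins for unmodified library functions that operate on whole arrays.
def lib_scale(xs: Chunk, factor: float) -> Chunk:
    return [x * factor for x in xs]


def lib_add_one(xs: Chunk) -> Chunk:
    return [x + 1.0 for x in xs]


def pipeline(data: Chunk,
             funcs: List[Callable[[Chunk], Chunk]],
             chunk_size: int) -> Chunk:
    """Apply every function to one chunk before touching the next chunk."""
    results = []
    for chunk in split(data, chunk_size):
        for func in funcs:
            chunk = func(chunk)  # the same small chunk flows through each call
        results.append(chunk)
    return merge(results)


data = [float(i) for i in range(10)]
result = pipeline(data, [partial(lib_scale, factor=2.0), lib_add_one], chunk_size=4)
```

The result matches calling the two functions back to back on the full input; only the order in which memory is traversed changes. Because the chunks in this sketch are independent, they could in principle also be processed by separate threads, which is the parallelization half of the idea.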
