High-Performance Code Generation though Fusion and Vectorization

Jason Sewall; Simon J. Pennycook

arXiv:1710.08774 · cs.PF · October 25, 2017 · 2 cites

TL;DR

This paper introduces a technique, prototyped as HFAV, that automatically fuses and vectorizes kernels in nested loops to reduce intermediate storage and improve performance on contemporary hardware, driven by a declarative, inference-based front-end.

Contribution

It presents a novel method for automatic kernel transformation involving fusion and vectorization, with a prototype implementation that improves HPC code performance.

Findings

01. Reduced intermediate storage in transformed kernels
02. Improved performance on contemporary hardware
03. Effective automatic transformation for nested loops

Abstract

We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware. We introduce representations for the abstract relationships and data dependencies of kernels in loop nests, along with algorithms for manipulating them into a more efficient form. We similarly introduce techniques for determining data access patterns for stencil-like array accesses and show how these can be used to elide storage and improve vectorization. We discuss our prototype implementation of these ideas, named HFAV, and its use of a declarative, inference-based front-end to drive transformations, and we present results for some prominent codes in HPC.
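To make the idea of fusion with storage elision concrete, here is a minimal hand-written sketch. It is not HFAV output and does not use HFAV's front-end; the function names `unfused` and `fused` and the choice of a second-difference computation are assumptions made purely for illustration. The first version runs two kernels as separate loops and stores the full intermediate array. The second fuses them and, because the consumer only reads a two-point stencil of the intermediate, keeps a single rolling value instead.

```python
def unfused(a):
    """Two separate loops; the intermediate array `diff` is stored in full."""
    n = len(a)
    # Kernel 1: first differences (intermediate array of length n - 1).
    diff = [a[i + 1] - a[i] for i in range(n - 1)]
    # Kernel 2: two-point stencil over the intermediate array.
    return [diff[i + 1] - diff[i] for i in range(n - 2)]


def fused(a):
    """One loop; only the previous intermediate value is kept live."""
    n = len(a)
    out = []
    if n < 3:
        return out
    prev = a[1] - a[0]              # intermediate value at index 0
    for i in range(1, n - 1):
        cur = a[i + 1] - a[i]       # kernel 1, computed on the fly
        out.append(cur - prev)      # kernel 2 consumes it immediately
        prev = cur                  # rolling window of size one
    return out


if __name__ == "__main__":
    squares = [1, 4, 9, 16, 25]
    print(unfused(squares))
    print(fused(squares))
```

Both functions compute the same result; the fused form replaces an intermediate array of length n - 1 with one scalar. Vectorization, which the abstract also targets, is not shown in this sketch.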

Code & Models

Repositories

intel/HFAV (Official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Embedded Systems Design Techniques