Autovesk: Automatic vectorized code generation from unstructured static kernels using graph transformations

Hayfa Tayeb; Ludovic Paillat; Berenger Bramas

arXiv:2301.01018 · cs.DC · July 3, 2023

TL;DR

Autovesk introduces an automatic method to vectorize scalar code with irregular data access patterns by transforming instruction graphs, enabling efficient use of SIMD capabilities on modern CPUs.

Contribution

It presents a novel graph transformation approach for automatic vectorization of chaotic data access codes, extending beyond regular algorithms.

Findings

01 Effective vectorization of irregular data access kernels

02 Demonstrated improvements on Intel AVX-512 and ARM SVE

03 Reduces instruction count and transformation costs

Abstract

Leveraging the SIMD capability of modern CPU architectures is mandatory to take full benefit of their increasing performance. To exploit this feature, binary executables must be explicitly vectorized by the developers or by an automatic vectorization tool. This is why the compilation research community has created several strategies to transform scalar code into a vectorized implementation. However, the majority of the approaches focus on regular algorithms, such as affine loops, that can be vectorized with few data transformations. In this paper, we present a new approach that allows automatically vectorizing scalar codes with chaotic data accesses as long as their operations can be statically inferred. We describe how our method transforms a graph of scalar instructions into a vectorized one using different heuristics with the aim of reducing the number or cost of the instructions.…
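To make the idea of turning a graph of scalar instructions into a vectorized one more concrete, here is a minimal Python sketch. It is not Autovesk's algorithm or any of its heuristics: it assumes a simple greedy strategy that groups independent scalar instructions with the same opcode and the same dependency depth into vector instructions of a fixed width. The function `pack_by_level`, the tuple layout, and the toy kernel are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

def pack_by_level(instrs, inputs, width):
    """Greedily pack scalar instructions into vector instructions.

    instrs: list of (dest, op, src_a, src_b), assumed to be in topological order.
    inputs: names of the kernel inputs (dependency depth 0).
    width:  number of lanes per vector instruction (an assumption of this sketch).
    """
    level = {name: 0 for name in inputs}
    groups = defaultdict(list)
    for dest, op, a, b in instrs:
        # An instruction sits one level below its deepest operand, so two
        # instructions on the same level can never depend on each other.
        level[dest] = max(level[a], level[b]) + 1
        groups[(level[dest], op)].append((dest, a, b))

    packed = []
    for (_, op), lanes in sorted(groups.items()):
        for i in range(0, len(lanes), width):
            chunk = lanes[i:i + width]
            packed.append((
                op,
                [d for d, _, _ in chunk],  # destination lanes
                [a for _, a, _ in chunk],  # first-operand lanes
                [b for _, _, b in chunk],  # second-operand lanes
            ))
    return packed

# Hypothetical kernel with irregular (non-contiguous) operand accesses.
INPUTS = ["x0", "x1", "x2", "x3", "y0", "y1", "y2", "y3"]
KERNEL = [
    ("t0", "mul", "x3", "y0"),
    ("t1", "mul", "x1", "y2"),
    ("t2", "add", "x0", "y3"),
    ("t3", "mul", "x2", "y1"),
    ("r0", "add", "t0", "t1"),
    ("r1", "add", "t2", "t3"),
]

packed = pack_by_level(KERNEL, INPUTS, width=4)
for vec in packed:
    print(vec)
```

In this toy case the six scalar instructions collapse into three vector instructions. The operand lanes of the packed `mul` come out as `x3, x1, x2` and `y0, y2, y1`, which are not contiguous in memory, so a real implementation would still have to pay for permutations or gathers to build those vectors. That data-movement cost is the kind of overhead the abstract refers to when it mentions reducing the number or cost of the instructions.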

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Interconnection Networks and Systems