The ARM Scalable Vector Extension

Nigel Stephens; Stuart Biles; Matthias Boettcher; Jacob Eapen; Mbou Eyole; Giacomo Gabrielli; Matt Horsnell; Grigorios Magklis; Alejandro Martinez; Nathanael Premillieu; Alastair Reid; Alejandro Rico; Paul Walker

arXiv:1803.06185 · cs.AR · March 19, 2018

TL;DR

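The ARM Scalable Vector Extension (SVE) extends AArch64 vector processing with implementation-chosen vector lengths. It targets domains such as high-performance computing, data analytics, computer vision, and machine learning, and it improves compiler auto-vectorization so that software does not need to be reworked as the vector length changes.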

Contribution

The paper introduces a scalable, vector-length-agnostic architecture that extends the reach of compiler auto-vectorization and lets CPU designers choose the vector length best suited to their power, performance, and area targets.

Findings

01. Supports vector lengths from 128 to 2048 bits

02. Enables code to scale automatically across vector lengths

03. Introduces features to improve auto-vectorization
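
The second finding, code that scales across vector lengths without change, can be illustrated with a small simulation. The sketch below is not SVE code and is not taken from the paper; it is a plain-Python model, under stated assumptions, of a vector-length-agnostic loop. The names `whilelt` and `daxpy_vla` are hypothetical helpers chosen for this example. The loop never hard-codes a lane count: it derives it from the vector length it is given and uses a per-lane predicate to handle the final partial chunk.

```python
def whilelt(i, n, lanes):
    """Hypothetical helper: lane k is active while i + k < n.

    This models the idea of a loop-control predicate; it is a
    simulation, not an actual hardware instruction.
    """
    return [i + k < n for k in range(lanes)]


def daxpy_vla(a, x, y, vl_bits, elem_bits=64):
    """Return a*x + y, processed in chunks sized by the vector length.

    vl_bits is the simulated vector register length. The range check
    mirrors the 128 to 2048 bit range listed in the findings.
    """
    if not 128 <= vl_bits <= 2048 or vl_bits % elem_bits != 0:
        raise ValueError("unsupported simulated vector length")
    lanes = vl_bits // elem_bits   # lanes per simulated vector register
    n = len(x)
    out = list(y)
    i = 0
    while i < n:
        pred = whilelt(i, n, lanes)          # masks off lanes past the end
        for k, active in enumerate(pred):
            if active:
                out[i + k] = a * x[i + k] + out[i + k]
        i += lanes                           # advance by one vector
    return out


# The same function body runs unchanged for every simulated vector length.
x = [float(v) for v in range(10)]
y = [1.0] * 10
results = {vl: daxpy_vla(2.0, x, y, vl) for vl in (128, 256, 512, 2048)}
```

Every entry in `results` holds the same values, which is the point of the sketch: only the number of loop iterations changes with the vector length, not the source code or the result.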

Abstract

This article describes the ARM Scalable Vector Extension (SVE). Several goals guided the design of the architecture. First was the need to extend the vector processing capability associated with the ARM AArch64 execution state to better address the computational requirements in domains such as high-performance computing, data analytics, computer vision, and machine learning. Second was the desire to introduce an extension that can scale across multiple implementations, both now and into the future, allowing CPU designers to choose the vector length most suitable for their power, performance, and area targets. Finally, the architecture should avoid imposing a software development cost as the vector length changes and where possible reduce it by improving the reach of compiler auto-vectorization technologies. SVE achieves these goals. It allows implementations to choose a vector register…
