Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI
Matheus Cavalcante, Fabian Schuiki, Florian Zaruba, Michael Schaffner,, Luca Benini

TL;DR
Ara is a scalable, energy-efficient 64-bit RISC-V vector processor implemented in 22nm FD-SOI technology, achieving over 1 GHz and high floating-point performance with multi-precision support.
Contribution
This paper introduces Ara, a novel scalable vector processor based on RISC-V, optimized for high performance and energy efficiency in 22nm FD-SOI technology.
Findings
Achieves over 1 GHz clock speed.
Reaches up to 33 DP-GFLOPS performance.
Attains up to 41 DP-GFLOPS/W energy efficiency.
Abstract
In this paper, we present Ara, a 64-bit vector processor based on the version 0.5 draft of RISC-V's vector extension, implemented in GlobalFoundries 22FDX FD-SOI technology. Ara's microarchitecture is scalable, as it is composed of a set of identical lanes, each containing part of the processor's vector register file and functional units. It achieves up to 97% FPU utilization when running a 256 x 256 double precision matrix multiplication on sixteen lanes. Ara runs at more than 1 GHz in the typical corner (TT/0.80V/25 oC) achieving a performance up to 33 DP-GFLOPS. In terms of energy efficiency, Ara achieves up to 41 DP-GFLOPS/W under the same conditions, which is slightly superior to similar vector processors found in literature. An analysis on several vectorizable linear algebra computation kernels for a range of different matrix and vector sizes gives insight into performance…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
