Generation of the Single Precision BLAS library for the Parallella platform, with Epiphany co-processor acceleration, using the BLIS framework

Miguel Tasende

arXiv:1608.05265·cs.DC·November 17, 2016

TL;DR

This paper presents a single-precision BLAS library for the Parallella platform, built with the BLIS framework and accelerated by the Epiphany co-processor to improve linear algebra performance.

Contribution

The paper introduces an Epiphany-accelerated BLAS library for the Parallella platform, aimed at improving performance for scientific computing applications on hybrid architectures.

Findings

01. Achieved improved BLAS performance on the Parallella platform.

02. Demonstrated potential for practical scientific computing applications.

03. Identified bandwidth limitations affecting full platform performance.

Abstract

The Parallella is a hybrid computing platform that originated from a Kickstarter project by Adapteva. It combines the high-performance, energy-efficient, manycore Epiphany chip (used as a co-processor) with a Zynq-7000 series chip, which normally runs a regular Linux OS, serves as the main processor, and implements "glue logic" in its internal FPGA to communicate with the many interfaces of the Parallella. This paper presents an Epiphany-accelerated BLAS library for the Parallella platform (which could also be suitable for similar hybrid platforms that include the Epiphany chip as a co-processor). The BLIS framework was used for the actual instantiation of the BLAS. Previous implementations of matrix-matrix multiplication on this platform achieved very good performance inside the Epiphany chip (up to 85% of…
