Mapping Stencils on Coarse-grained Reconfigurable Spatial Architecture

Jesmin Jahan Tithi; Fabrizio Petrini; Hongbo Rong; Andrei Valentin; Carl Ebeling

arXiv:2011.05160 · cs.DC · March 24, 2021 · 1 citation

Open Access

TL;DR

This paper presents novel methods for mapping stencil computations onto coarse-grained reconfigurable spatial architectures (CGRA), leveraging data reuse and parallelism to outperform GPUs in scientific computing tasks.

Contribution

It introduces new mapping techniques for stencil computations on CGRA, fully exploiting data reuse and parallelism for enhanced performance.

Findings

01. Mappings are efficient and outperform GPUs in simulations.

02. Data reuse significantly improves performance.

03. Parallelism in CGRA benefits stencil computations.

Abstract

Stencils are a class of computational patterns in which each output grid point depends on a fixed-shape neighborhood of points in an input grid. Stencil computations are prevalent in scientific applications that consume a significant portion of supercomputing resources, so optimizing stencil programs for the best performance has always been important. A rich body of research has focused on optimizing stencil computations on almost all parallel architectures. Stencil applications have regular dependency patterns, inherent pipeline parallelism, and plenty of data reuse, which makes them a natural match for a coarse-grained reconfigurable spatial architecture (CGRA). A CGRA consists of many simple, small processing elements (PEs) connected by an on-chip network. Each PE can be configured to execute part of a stencil computation and all PEs run in parallel; the…
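
To make the stencil pattern and its data reuse concrete, here is a minimal Python sketch. It is not the paper's CGRA mapping: the 1D 3-point shape, the weights, and the function names `stencil_1d` and `stencil_1d_streaming` are all assumptions chosen for illustration. The first function re-reads every neighbor for each output; the second streams the input through a small window, loosely analogous to values being passed between neighboring PEs, so each input is loaded only once.

```python
from collections import deque


def stencil_1d(grid, weights=(0.25, 0.5, 0.25)):
    """Naive stencil: every interior output re-reads all of its neighbors.

    Returns (outputs, number_of_input_loads). Weights are hypothetical.
    """
    radius = len(weights) // 2
    out = []
    loads = 0
    for i in range(radius, len(grid) - radius):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * grid[i - radius + k]  # one load per neighbor
            loads += 1
        out.append(acc)
    return out, loads


def stencil_1d_streaming(grid, weights=(0.25, 0.5, 0.25)):
    """Streaming stencil: each input is loaded once and reused from a window.

    The fixed-size deque plays the role of a shift register; once it is
    full, every new input produces one output.
    """
    window = deque(maxlen=len(weights))
    out = []
    loads = 0
    for x in grid:
        window.append(x)  # the only load of this input value
        loads += 1
        if len(window) == len(weights):
            out.append(sum(w * v for w, v in zip(weights, window)))
    return out, loads


if __name__ == "__main__":
    grid = [1.0, 2.0, 3.0, 4.0, 5.0]
    naive_out, naive_loads = stencil_1d(grid)
    stream_out, stream_loads = stencil_1d_streaming(grid)
    print(naive_out, naive_loads)
    print(stream_out, stream_loads)
```

Both functions produce the same interior outputs; the difference is only in how many times the input is read. For a grid of n points and a 3-point stencil, the naive version performs 3(n - 2) loads and the streaming version performs n.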

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Interconnection Networks and Systems