Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

Peter Zaspel

arXiv:1708.09707 · cs.DC · September 4, 2017 · 1 citation

TL;DR

This paper develops and implements a fully GPU-based hierarchical matrix library, enabling efficient matrix operations on many-core processors with significant speedups over traditional CPU-based methods.

Contribution

The paper introduces novel parallel algorithmic patterns for $\mathcal{H}$-matrix construction and multiplication tailored to GPU hardware, resulting in the first entirely GPU-based open-source $\mathcal{H}$-matrix library.

Findings

01 Achieves significant speedups compared to CPU-based libraries.

02 Successfully maps complex $\mathcal{H}$-matrix algorithms to GPU architecture.

03 Provides an in-depth performance analysis and validation.

Abstract

In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations by many-core processors, we here aim at totally relying on that processor type. As main contribution, we introduce the necessary parallel algorithmic patterns allowing to map the full $\mathcal{H}$ matrix construction and the fast…
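
To illustrate the compression idea behind the abstract's complexity claim, the following Python sketch approximates one kernel block between two well-separated point clusters by a low-rank factorization and applies it to a vector. It is a minimal illustration under stated assumptions, not the paper's algorithm or the hmglib API: the kernel 1/(y - x), the 1D point sets, the truncated geometric-series (Taylor) expansion, and all function names are hypothetical choices made for this example.

```python
# Illustrative sketch only: low-rank approximation of one "far-field" kernel
# block. Kernel, point sets, and expansion are assumptions for this example
# and are not taken from the paper or from hmglib.

def dense_block(xs, ys):
    """Dense block A[i][j] = 1 / (ys[j] - xs[i])."""
    return [[1.0 / (y - x) for y in ys] for x in xs]

def low_rank_factors(xs, ys, rank):
    """Factors U, V with A ~ U V^T from the truncated geometric series
    1/(y - x) = sum_k (x - c)^k / (y - c)^(k+1), valid for |x - c| < |y - c|."""
    c = 0.5 * (min(xs) + max(xs))  # expansion center of the x cluster
    U = [[(x - c) ** k for k in range(rank)] for x in xs]
    V = [[1.0 / (y - c) ** (k + 1) for k in range(rank)] for y in ys]
    return U, V

def dense_matvec(A, z):
    """Standard product A z, costing len(xs) * len(ys) multiplications."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

def low_rank_matvec(U, V, z):
    """Product U (V^T z), costing rank * (len(xs) + len(ys)) multiplications."""
    rank = len(U[0])
    t = [sum(V[j][k] * z[j] for j in range(len(z))) for k in range(rank)]
    return [sum(u * tk for u, tk in zip(row, t)) for row in U]

n = 200
rank = 10
xs = [i / (n - 1) for i in range(n)]        # cluster in [0, 1]
ys = [3.0 + j / (n - 1) for j in range(n)]  # well-separated cluster in [3, 4]
z = [1.0 if j % 2 == 0 else -0.5 for j in range(n)]

A = dense_block(xs, ys)
U, V = low_rank_factors(xs, ys, rank)

# Largest entrywise error of the rank-10 approximation.
entry_err = max(
    abs(A[i][j] - sum(U[i][k] * V[j][k] for k in range(rank)))
    for i in range(n)
    for j in range(n)
)

# Largest componentwise difference between the two matrix-vector products.
matvec_err = max(
    abs(p - q)
    for p, q in zip(dense_matvec(A, z), low_rank_matvec(U, V, z))
)

dense_storage = n * n            # numbers stored for the dense block
low_rank_storage = 2 * n * rank  # numbers stored for U and V

print("entrywise error:", entry_err)
print("matvec error:", matvec_err)
print("storage dense vs. low rank:", dense_storage, low_rank_storage)
```

For this block the factored form stores 2 · n · rank numbers instead of n², and the product costs on the order of rank · n operations. A hierarchical matrix applies factorizations of this kind to all well-separated blocks of a recursive partition of the matrix, which is where the log-linear cost of the matrix-vector product comes from.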

Code & Models

Repositories

zaspel/hmglib (official)

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Electromagnetic Scattering and Analysis