Towards Programmable Memory Controller for Tensor Decomposition

Sasindu Wijeratne; Ta-Yang Wang; Rajgopal Kannan; Viktor Prasanna

arXiv:2207.08298·cs.DC·July 19, 2022

TL;DR

This paper proposes a custom FPGA-based memory controller to accelerate the sparse MTTKRP kernel in tensor decomposition, addressing irregular memory access challenges for improved efficiency.

Contribution

It introduces a novel design approach for a programmable memory controller on FPGA tailored for sparse MTTKRP acceleration in tensor decomposition.

Findings

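01 A custom FPGA memory controller is proposed to speed up sparse MTTKRP by handling its irregular memory accesses.

02 The parameter space of the memory controller is explored to guide the FPGA implementation.

03 FPGAs are attractive targets for MTTKRP because of their energy efficiency and inherent parallelism.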

Abstract

Tensor decomposition has become an essential tool in many data science applications. Sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the pivotal kernel in tensor decomposition algorithms that decompose large, higher-order, real-world tensors into multiple matrices. Accelerating MTTKRP can speed up the tensor decomposition process immensely. Sparse MTTKRP is a challenging kernel to accelerate because of its irregular memory access characteristics. Implementing accelerators on Field-Programmable Gate Arrays (FPGAs) for kernels such as MTTKRP is attractive because of the energy efficiency and inherent parallelism of FPGAs. This paper explores the opportunities, the key challenges, and an approach for designing a custom memory controller on FPGA for MTTKRP, along with the parameter space of such a controller.
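To make the irregular memory access concrete, here is a minimal software sketch of sparse MTTKRP for a 3-way tensor. It is an illustration only, not the paper's FPGA design. It assumes the tensor is stored in coordinate (COO) form as (i, j, k, value) tuples and computes the mode-0 update A[i, :] += value * B[j, :] * C[k, :]. The function name `sparse_mttkrp_mode0`, the factor matrices `B` and `C`, and the sample data are all hypothetical.

```python
def sparse_mttkrp_mode0(nonzeros, B, C, num_rows):
    """Mode-0 MTTKRP for a 3-way sparse tensor stored in COO form.

    nonzeros: iterable of (i, j, k, value) tuples (assumed layout)
    B, C:     factor matrices as lists of rows, each with R columns
    num_rows: size of mode 0, i.e. number of rows in the output
    """
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, value in nonzeros:
        # The rows fetched here depend on the indices of each nonzero,
        # so consecutive iterations may touch unrelated memory locations.
        b_row = B[j]
        c_row = C[k]
        out_row = out[i]
        for r in range(rank):
            out_row[r] += value * b_row[r] * c_row[r]
    return out


# Hypothetical 2x2x2 tensor with three nonzeros and rank-2 factors.
nonzeros = [(0, 1, 0, 2.0), (1, 0, 1, 3.0), (0, 0, 1, 1.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
result = sparse_mttkrp_mode0(nonzeros, B, C, num_rows=2)
print(result)  # [[37.0, 64.0], [21.0, 48.0]]
```

The reads of `B[j]` and `C[k]` and the update of `out[i]` follow the order in which nonzeros happen to be stored, which is the data-dependent access pattern that makes the kernel hard to accelerate.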

Taxonomy

Topics: Tensor decomposition and applications · Parallel Computing and Optimization Techniques · Computational Physics and Python Applications
