IndexMAC: A Custom RISC-V Vector Instruction to Accelerate Structured-Sparse Matrix Multiplications
V. Titopoulos, K. Alexandridis, C. Peltekis, C. Nicopoulos, G. Dimitrakopoulos

TL;DR
This paper introduces IndexMAC, a custom RISC-V vector instruction designed to accelerate structured-sparse matrix multiplications in machine learning, achieving 1.80x-2.14x speedups over state-of-the-art kernels with minimal hardware overhead.
Contribution
The paper proposes a novel vector instruction, IndexMAC, that efficiently handles structured sparsity in matrix multiplications on RISC-V vector processors.
Findings
Achieves 1.80x-2.14x speedup over state-of-the-art kernels.
Integrates with minimal hardware cost.
Effective for CNN layer computations with structured sparsity.
Abstract
Structured sparsity has been proposed as an efficient way to prune the complexity of modern Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. The acceleration of ML models - for both training and inference - relies primarily on equivalent matrix multiplications that can be executed efficiently on vector processors or custom matrix engines. The goal of this work is to incorporate the simplicity of structured sparsity into vector execution, thereby accelerating the corresponding matrix multiplications. Toward this objective, a new vector index-multiply-accumulate instruction is proposed, which enables the implementation of low-cost indirect reads from the vector register file. This reduces unnecessary memory traffic and increases data locality. The proposed new instruction was integrated in a decoupled RISC-V vector processor with negligible…
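The idea of an index-multiply-accumulate with an indirect register-file read can be illustrated with a small software model. The sketch below is not the paper's instruction definition: the function names (compress_n_m, index_mac, sparse_matmul), the N:M storage layout, and the choice to treat each row of the dense matrix as one "vector register" are all assumptions made for illustration, and register-file capacity is ignored. It only shows how a stored column index can select an already-loaded row of B, so that no extra memory gather is needed per nonzero.

```python
def compress_n_m(dense_row, n, m):
    """Compress one row under N:M structured sparsity (assumed layout).

    Every block of m consecutive elements may hold at most n nonzeros.
    Returns parallel lists: the kept values and their column indices.
    """
    values, indices = [], []
    for start in range(0, len(dense_row), m):
        block = dense_row[start:start + m]
        nonzeros = [(v, start + j) for j, v in enumerate(block) if v != 0]
        if len(nonzeros) > n:
            raise ValueError("row violates the N:M sparsity pattern")
        # Pad with explicit zeros so each block stores exactly n entries.
        nonzeros += [(0, start)] * (n - len(nonzeros))
        for v, k in nonzeros:
            values.append(v)
            indices.append(k)
    return values, indices


def index_mac(acc, scalar, vrf, index):
    """Hypothetical model of the operation: acc += scalar * vrf[index].

    'vrf' stands in for the vector register file; 'index' picks the
    register to read indirectly instead of fetching data from memory.
    """
    selected = vrf[index]
    return [a + scalar * b for a, b in zip(acc, selected)]


def sparse_matmul(a_dense, b, n, m):
    """Row-wise product C = A * B with A stored in compressed N:M form."""
    vrf = [list(row) for row in b]  # rows of B assumed resident in registers
    result = []
    for a_row in a_dense:
        values, indices = compress_n_m(a_row, n, m)
        acc = [0] * len(b[0])
        for v, k in zip(values, indices):
            acc = index_mac(acc, v, vrf, k)
        result.append(acc)
    return result


if __name__ == "__main__":
    # 1:4 sparsity: one nonzero in every block of four elements.
    A = [[0, 2, 0, 0, 3, 0, 0, 0],
         [1, 0, 0, 0, 0, 0, 0, 4]]
    B = [[1, 2], [3, 4], [5, 6], [7, 8],
         [9, 10], [11, 12], [13, 14], [15, 16]]
    print(sparse_matmul(A, B, n=1, m=4))
```

In this model the inner loop touches only the stored nonzeros of A, and each one reuses a row of B that is already held in vrf, which is the kind of data locality the abstract refers to.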
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Tensor decomposition and applications
