Low-Rank Compression for IMC Arrays

Kang Eun Jeon; Johnny Rhe; Jong Hwan Ko

arXiv:2502.07820·cs.AR·February 13, 2025

TL;DR

This paper introduces a low-rank compression method for in-memory computing architectures that improves efficiency and accuracy, addressing limitations of traditional pruning approaches.

Contribution

We propose a low-rank compression technique for in-memory computing (IMC) arrays that combines shift and duplicate kernel (SDK) mapping with group low-rank convolution to improve array utilization and accuracy.
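
To make the utilization concern concrete, the following minimal sketch estimates what fraction of an IMC array's cells hold weights when a layer's weight matrix is mapped directly onto it. The array size (256 x 256), the layer shape, the rank, and the helper name `array_utilization` are illustrative assumptions, not values from the paper. The mapping model (one unrolled kernel input per row, one output channel per column, tiled across arrays) is a common simplification and may differ from the paper's setup.

```python
import math

def array_utilization(rows_needed, cols_needed, array_rows=256, array_cols=256):
    """Return (utilization, number_of_arrays) for a weight matrix of shape
    (rows_needed x cols_needed) tiled over fixed-size arrays.
    Hypothetical helper; assumes a simple im2col-style mapping."""
    n_arrays = math.ceil(rows_needed / array_rows) * math.ceil(cols_needed / array_cols)
    used_cells = rows_needed * cols_needed
    total_cells = n_arrays * array_rows * array_cols
    return used_cells / total_cells, n_arrays

# Assumed 3x3 convolution with 64 input channels and 64 output channels.
k, c_in, c_out, rank = 3, 64, 64, 8

# Dense layer: k*k*c_in rows, c_out columns.
dense_util, dense_arrays = array_utilization(k * k * c_in, c_out)

# First low-rank factor: same number of rows, but only `rank` columns,
# so most columns of each array sit idle under this direct mapping.
lr_util, lr_arrays = array_utilization(k * k * c_in, rank)

print(f"dense:    {dense_arrays} arrays, utilization {dense_util:.4f}")
print(f"low-rank: {lr_arrays} arrays, utilization {lr_util:.4f}")
```

Under these assumed numbers the low-rank factor occupies the same number of arrays as the dense layer while filling far fewer columns, which is the kind of idle capacity that SDK mapping is described as exploiting.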

Findings

01. Achieves up to 2.5x speedup over pruning methods.

02. Provides up to +20.9% accuracy improvement.

03. Reduces area and energy overheads in IMC architectures.

Abstract

In this study, we address the challenge of low-rank model compression in the context of in-memory computing (IMC) architectures. Traditional pruning approaches, while effective at reducing model size, require additional peripheral circuitry to manage complex dataflows and mitigate dislocation issues, which increases area and energy overheads. To avoid these drawbacks, we propose leveraging low-rank compression techniques, which, unlike pruning, streamline the dataflow and integrate seamlessly with IMC architectures. However, low-rank compression presents its own challenges, namely i) suboptimal IMC array utilization and ii) compromised accuracy. To address these issues, we introduce a novel approach that employs i) the shift and duplicate kernel (SDK) mapping technique, which exploits idle IMC columns for parallel processing, and ii) group low-rank convolution, which…
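
As a rough illustration of what low-rank compression does to a convolution layer, the sketch below factorizes a weight tensor with a truncated SVD and compares parameter counts. This is a generic technique shown under stated assumptions: the excerpt above does not specify the paper's decomposition, and the function name `low_rank_factors`, the layer shape, the rank, and the random weights are all hypothetical.

```python
import numpy as np

def low_rank_factors(weight, rank):
    """Factor a conv weight of shape (c_out, c_in, k, k) with a truncated SVD.

    Returns a of shape (rank, c_in*k*k) and b of shape (c_out, rank)
    such that b @ a approximates the flattened weight matrix.
    Illustrative helper, not the paper's method.
    """
    c_out = weight.shape[0]
    w2d = weight.reshape(c_out, -1)               # (c_out, c_in*k*k)
    u, s, vt = np.linalg.svd(w2d, full_matrices=False)
    a = np.diag(s[:rank]) @ vt[:rank]             # keep the top `rank` components
    b = u[:, :rank]
    return a, b

# Assumed layer: 64 output channels, 64 input channels, 3x3 kernel.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64, 3, 3))

a, b = low_rank_factors(w, rank=8)
approx = (b @ a).reshape(w.shape)

rel_err = np.linalg.norm(w - approx) / np.linalg.norm(w)
dense_params = w.size
low_rank_params = a.size + b.size

print(f"parameters: {dense_params} -> {low_rank_params}")
print(f"relative reconstruction error: {rel_err:.3f}")
```

Random weights have no low-rank structure, so the reconstruction error here is large; trained layers usually compress better. The residual error is one way to picture the accuracy loss the abstract lists as a challenge of low-rank compression.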

Taxonomy

Topics: Advanced Data Compression Techniques · Advanced Wireless Communication Techniques · Advanced MIMO Systems Optimization

Methods: Pruning · Sparse Evolutionary Training