ORIGAMI: A Heterogeneous Split Architecture for In-Memory Acceleration of Learning
Hajar Falahati, Pejman Lotfi-Kamran, Mohammad Sadrosadati, and Hamid Sarbazi-Azad

TL;DR
ORIGAMI introduces a heterogeneous in-memory acceleration architecture supporting diverse ML algorithms by combining pattern matching, specialized compute engines, and split execution with external platforms, achieving significant performance and energy efficiency improvements.
Contribution
It presents a novel heterogeneous in-memory accelerator design with pattern matching and computation splitting, enabling efficient support for various ML algorithms within strict power and area constraints.
Findings
Outperforms state-of-the-art in-memory accelerators by up to 1.6x in performance.
Reduces energy-delay product by up to 31x.
Achieves results within 1% of an ideal unlimited-resource system.
Abstract
The memory bandwidth bottleneck is a major challenge in processing machine learning (ML) algorithms. In-memory acceleration has the potential to address this problem; however, it must overcome two challenges. First, an in-memory accelerator should be general enough to support a large set of different ML algorithms. Second, it should be efficient enough to utilize the available bandwidth while meeting the limited power and area budgets of the logic layer of a 3D-stacked memory. We observe that previous work fails to address both challenges simultaneously. We propose ORIGAMI, a heterogeneous set of in-memory accelerators that supports the compute demands of different ML algorithms and also uses an off-the-shelf compute platform (e.g., FPGA, GPU, TPU) to utilize bandwidth without violating strict area and power budgets. ORIGAMI offers a pattern-matching technique to identify similar computation patterns of ML algorithms…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Algorithms
