CapStore: Energy-Efficient Design and Management of the On-Chip Memory for CapsuleNet Inference Accelerators
Alberto Marchisio, Muhammad Abdullah Hanif, Mohammad Taghi Teimoori, Muhammad Shafique

TL;DR
This paper introduces an energy-efficient on-chip memory hierarchy for CapsuleNet inference accelerators, reducing energy consumption by up to 86% through specialized design and power-gating techniques.
Contribution
It proposes a novel on-chip memory architecture and methodology tailored for CapsuleNet accelerators, addressing high memory demands and energy efficiency challenges.
Findings
The proposed memory design reduces energy consumption by up to 86%.
Power-gating further decreases energy use based on operational utilization.
The proposed architecture effectively minimizes off-chip memory accesses.
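As a rough illustration of why power-gating savings track utilization, the sketch below models only the leakage energy of a sectored on-chip memory. It is a toy model under stated assumptions, not the paper's methodology: the function name, the sector granularity, and all numeric values are hypothetical, and wake-up overhead and sleep-transistor leakage are ignored.

```python
import math


def leakage_energy(sectors, leak_power_per_sector_w, duration_s,
                   active_fraction, power_gated):
    """Leakage energy in joules of a sectored memory over one operation.

    Hypothetical toy model: every powered sector leaks at a constant rate.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    if power_gated:
        # Only the sectors the operation actually uses stay powered.
        powered = math.ceil(sectors * active_fraction)
    else:
        # Without gating, every sector leaks for the whole operation.
        powered = sectors
    return powered * leak_power_per_sector_w * duration_s


# Illustrative numbers only (not taken from the paper).
baseline = leakage_energy(8, 1e-3, 2.0, 0.25, power_gated=False)
gated = leakage_energy(8, 1e-3, 2.0, 0.25, power_gated=True)
print(f"leakage saving: {1 - gated / baseline:.0%}")
```

In this model an operation that uses a quarter of the sectors leaks a quarter of the ungated energy, while an operation that uses the whole memory saves nothing, which is the sense in which the saving depends on operational utilization.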
Abstract
Deep Neural Networks (DNNs) have been established as the state-of-the-art algorithm for advanced machine learning applications. Recently, CapsuleNets have improved the generalization ability, as compared to DNNs, due to their multi-dimensional capsules. However, they pose high computational and memory requirements, which makes energy-efficient inference a challenging task. In this paper, we perform an extensive analysis to demonstrate their key limitations due to intense memory accesses and large on-chip memory requirements. To enable efficient CapsuleNet inference accelerators, we propose a specialized on-chip memory hierarchy which minimizes the off-chip memory accesses, while efficiently feeding the data to the accelerator. We analyze the on-chip memory requirements for each memory component of the architecture. By leveraging this analysis, we propose a methodology to explore…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques
