HiMA: A Fast and Scalable History-based Memory Access Engine for Differentiable Neural Computer
Yaoyu Tao, Zhengya Zhang

TL;DR
HiMA is a specialized hardware engine designed to efficiently support history-based memory access in differentiable neural computers, significantly improving speed and efficiency over existing accelerators.
Contribution
The paper introduces HiMA, a scalable, tiled memory access engine with distributed memories and novel techniques for efficient hardware implementation of history-based attention mechanisms.
Findings
HiMA achieves up to 39.1x higher speed over state-of-the-art accelerators.
HiMA demonstrates up to a 2,646x speedup compared to an Nvidia 3080Ti GPU.
HiMA improves area and energy efficiency by over 60x.
Abstract
Memory-augmented neural networks (MANNs) provide better inference performance in many tasks with the help of an external memory. The recently developed differentiable neural computer (DNC) is a MANN that has been shown to outperform in representing complicated data structures and learning long-term dependencies. DNC's higher performance is derived from new history-based attention mechanisms in addition to the previously used content-based attention mechanisms. History-based mechanisms require a variety of new compute primitives and state memories, which are not supported by existing neural network (NN) or MANN accelerators. We present HiMA, a tiled, history-based memory access engine with distributed memories in tiles. HiMA incorporates a multi-mode network-on-chip (NoC) to reduce the communication latency and improve scalability. An optimal submatrix-wise memory partition strategy is…
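To make the two attention families named in the abstract concrete, here is a minimal software sketch of content-based addressing (a softmax over scaled cosine similarities) and of one history-based primitive, the temporal link update, following the standard DNC formulation. It is not HiMA's hardware design or the authors' code. The function names, the pure-Python list representation, and the toy sizes are assumptions made for illustration only.

```python
import math

# Illustrative sketch of DNC-style addressing in plain Python.
# All names here are hypothetical; this is not HiMA's implementation.

def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def content_weighting(memory, key, beta):
    """Content-based attention: softmax over beta-scaled cosine similarities."""
    scores = [beta * cosine_similarity(row, key) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def update_linkage(link, precedence, write_w):
    """History-based state update: temporal link matrix and precedence vector.

    link[i][j] approximates "slot i was written right after slot j".
    """
    n = len(write_w)
    new_link = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # the diagonal is kept at zero
                new_link[i][j] = ((1.0 - write_w[i] - write_w[j]) * link[i][j]
                                  + write_w[i] * precedence[j])
    written = sum(write_w)
    new_prec = [(1.0 - written) * p + w for p, w in zip(precedence, write_w)]
    return new_link, new_prec

def forward_weighting(link, read_w):
    """Follow the links forward in write order from the last read location."""
    n = len(read_w)
    return [sum(link[i][j] * read_w[j] for j in range(n)) for i in range(n)]

# Toy example: 3 memory slots holding 2-element words (assumed sizes).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = content_weighting(memory, key=[1.0, 0.0], beta=5.0)

# Write to slot 0, then to slot 1, and read forward from slot 0.
link = [[0.0] * 3 for _ in range(3)]
precedence = [0.0, 0.0, 0.0]
link, precedence = update_linkage(link, precedence, [1.0, 0.0, 0.0])
link, precedence = update_linkage(link, precedence, [0.0, 1.0, 0.0])
forward = forward_weighting(link, [1.0, 0.0, 0.0])
```

In this toy run the content weighting peaks at the slot most similar to the key, and the forward weighting moves attention from slot 0 to slot 1, the slot written next. The link matrix grows quadratically with the number of memory slots, which suggests why such state memories and primitives put pressure on hardware.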
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices
Methods: Softmax · Content-based Attention
