ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks
Salma Afifi, Ishan Thakkar, Sudeep Pasricha

TL;DR
ARTEMIS is an in-DRAM accelerator that combines analog and stochastic computing to speed up transformer neural networks and cut their energy consumption, targeting the compute and memory bottlenecks that transformers face on conventional hardware.
Contribution
The paper introduces a mixed analog-stochastic in-DRAM architecture that supports transformer models with minimal modifications to DRAM, enabling efficient in-memory processing.
Findings
Achieves at least 3.0x speedup over GPUs and TPUs
Reduces energy consumption by 1.8x compared to state-of-the-art hardware
Offers 1.9x better energy efficiency than existing accelerators
Abstract
Transformers have emerged as a powerful tool for natural language processing (NLP) and computer vision. Through the attention mechanism, these models have exhibited remarkable performance gains when compared to conventional approaches like recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Nevertheless, transformers typically demand substantial execution time due to their extensive computations and large memory footprint. Processing in-memory (PIM) and near-memory computing (NMC) are promising solutions for accelerating transformers, as they offer high compute parallelism and memory bandwidth. However, designing PIM/NMC architectures to support the complex operations and massive amounts of data that need to be moved between layers in transformer neural networks remains a challenge. We propose ARTEMIS, a mixed analog-stochastic in-DRAM accelerator for transformer…
Taxonomy
Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Stochastic Gradient Optimization Techniques
Methods: Softmax · Attention Is All You Need
