IMSSA: Deploying modern state-space models on memristive in-memory compute hardware
Sebastian Siegel, Ming-Jay Yang, and John-Paul Strachan

TL;DR
This paper presents IMSSA, a method for deploying state-space models on memristive in-memory compute hardware that reduces computational demands and enables efficient long-sequence processing on edge devices.
Contribution
The paper introduces an approach for implementing S4 models on memristive hardware using quantization-aware training and in-memory compute techniques.
Findings
Successfully deployed S4 kernels on memristive hardware.
Quantized the model, including to ternary weights, for efficient deployment.
Demonstrated real-world application on edge hardware.
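To make the ternary-weight finding concrete, here is a minimal sketch of one common way to ternarize a weight matrix: threshold by magnitude, then keep a single shared scale. This summary does not state the paper's actual quantization scheme, so the function name `ternarize`, the `threshold_ratio` value of 0.7, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def ternarize(w, threshold_ratio=0.7):
    """Map weights to codes in {-1, 0, +1} plus one shared scale.

    threshold_ratio is a hypothetical heuristic, not a value from the paper.
    """
    w = np.asarray(w, dtype=np.float64)
    # Weights whose magnitude falls below delta are zeroed out.
    delta = threshold_ratio * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    codes = (np.sign(w) * mask).astype(np.int8)
    # Shared scale: mean magnitude of the weights that survived the threshold.
    scale = float(np.abs(w[mask]).mean()) if mask.any() else 0.0
    return codes, scale

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8))   # stand-in for a trained weight matrix
codes, scale = ternarize(weights)
approx = scale * codes              # dequantized approximation of weights
```

Each ternary code needs only three distinguishable levels, which is what makes such weights attractive for devices with a limited number of reliable conductance states. In quantization-aware training, a step like this would typically be applied in the forward pass while gradients update the full-precision weights.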
Abstract
Processing long temporal sequences is a key challenge in deep learning. In recent years, Transformers have become state-of-the-art for this task, but suffer from excessive memory requirements due to the need to explicitly store the sequences. To address this issue, structured state-space sequential (S4) models recently emerged, offering a fixed memory state while still enabling the processing of very long sequence contexts. The recurrent linear update of the state in these models makes them highly efficient on modern graphics processing units (GPUs) by unrolling the recurrence into a convolution. However, this approach demands significant memory and massively parallel computation, which is only available on the latest GPUs. In this work, we aim to bring the power of S4 models to edge hardware by significantly reducing the size and computational demand of an S4D model through…
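The abstract's point that a linear recurrent state update can be unrolled into a convolution can be checked numerically. The sketch below assumes a diagonal (S4D-style) state matrix that is already discretized; the names `a_bar`, `b_bar`, `c`, the state size, and the random parameter values are illustrative assumptions, not the paper's model or code.

```python
import numpy as np

def run_recurrent(a_bar, b_bar, c, u):
    """Step the state one input at a time: x_k = a_bar * x_{k-1} + b_bar * u_k."""
    x = np.zeros_like(a_bar)          # fixed-size state, independent of sequence length
    ys = []
    for u_k in u:
        x = a_bar * x + b_bar * u_k   # elementwise, because the state matrix is diagonal
        ys.append(np.sum(c * x).real)
    return np.array(ys)

def run_convolution(a_bar, b_bar, c, u):
    """Unroll the same recurrence into a causal convolution with a length-L kernel."""
    L = len(u)
    powers = a_bar[None, :] ** np.arange(L)[:, None]            # shape (L, N)
    kernel = (powers * (b_bar * c)[None, :]).sum(axis=1).real   # K[l] = Re sum_n c_n a_n^l b_n
    return np.convolve(u, kernel)[:L]

rng = np.random.default_rng(0)
N, L = 4, 32                                          # hypothetical state size and sequence length
a_bar = 0.9 * np.exp(1j * rng.uniform(0, np.pi, N))   # stable poles with |a| < 1
b_bar = rng.normal(size=N) + 0j
c = rng.normal(size=N) + 1j * rng.normal(size=N)
u = rng.normal(size=L)

y_rec = run_recurrent(a_bar, b_bar, c, u)
y_conv = run_convolution(a_bar, b_bar, c, u)
```

Both functions produce the same output sequence. The recurrent form keeps only N state values at any time, while the convolutional form materializes a kernel as long as the input, which illustrates the memory and parallelism trade-off the abstract describes.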
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · Neural Networks and Applications
