PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference
Aayush Ankit, Izzat El Hajj, Sai Rahul Chalamalasetti, Geoffrey Ndu, Martin Foltin, R. Stanley Williams, Paolo Faraboschi, Wen-mei Hwu, John Paul Strachan, Kaushik Roy, Dejan S Milojicic

TL;DR
PUMA is a novel memristor-based accelerator that combines in-memory analog computing with general-purpose execution units, enabling efficient and programmable machine learning inference across diverse applications.
Contribution
The paper introduces PUMA, a programmable memristor-based accelerator with a specialized ISA and compiler, expanding memristor crossbar capabilities to a wide range of ML workloads.
Findings
Achieves up to 2446x energy efficiency improvement over GPUs
Attains 66x latency reduction compared to GPUs
Maintains energy and latency similar to those of application-specific memristor accelerators
Abstract
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications, overcoming the fundamental energy efficiency limitations of digital logic. They have been shown to be effective in special-purpose accelerators for a limited set of neural network applications. We present the Programmable Ultra-efficient Memristor-based Accelerator (PUMA) which enhances memristor crossbars with general purpose execution units to enable the acceleration of a wide variety of Machine Learning (ML) inference workloads. PUMA's microarchitecture techniques exposed through a specialized Instruction Set Architecture (ISA) retain the efficiency of in-memory computing and analog circuitry, without compromising programmability. We also present the PUMA compiler which translates high-level code to PUMA ISA. The compiler partitions the computational graph and optimizes instruction…
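The analog matrix-vector multiplication mentioned at the start of the abstract can be illustrated with a minimal sketch of an idealized crossbar. It assumes each cell stores a weight as a conductance, each row is driven by an input voltage, and each column wire sums the resulting currents (Ohm's law per cell, Kirchhoff's current law per column). The function name `crossbar_mvm`, the values, and the arbitrary units are hypothetical illustrations, not taken from the paper, and the sketch ignores device noise, wire resistance, and data conversion.

```python
def crossbar_mvm(conductances, voltages):
    """Idealized crossbar model: column current j = sum over rows i of V[i] * G[i][j].

    conductances: list of rows, one conductance per (row, column) cell.
    voltages: one input voltage per row.
    This is an illustrative model, not PUMA's actual circuit.
    """
    rows = len(conductances)
    if rows != len(voltages):
        raise ValueError("one input voltage per crossbar row is required")
    cols = len(conductances[0]) if rows else 0
    currents = [0.0] * cols
    for i, row in enumerate(conductances):
        if len(row) != cols:
            raise ValueError("all crossbar rows must have the same number of columns")
        for j, g in enumerate(row):
            # Ohm's law gives the cell current; the column wire accumulates it.
            currents[j] += voltages[i] * g
    return currents


# Hypothetical 3x2 crossbar (arbitrary units): 3 inputs, 2 outputs.
G = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
V = [1.0, 0.5, 2.0]
I = crossbar_mvm(G, V)
print(I)  # [12.5, 16.0]
```

In this model every multiply-accumulate for a column happens in the same physical step, which is the property the abstract relies on when it describes crossbars as performing matrix-vector multiplication in the analog domain.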
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Machine Learning in Materials Science
