Proteus: Enabling High-Performance Processing-Using-DRAM with Dynamic Bit-Precision, Adaptive Data Representation, and Flexible Arithmetic
Geraldo F. Oliveira, Mayank Kabra, Yuxin Guo, Kangqi Chen, A. Giray Yağlıkçı, Melina Soysal, Mohammad Sadrosadati, Joaquin Olivares Bueno, Saugata Ghose, Juan Gómez-Luna, Onur Mutlu

TL;DR
Proteus introduces a dynamic, data-aware framework for processing-using-DRAM that reduces latency and energy consumption by adapting bit-precision, data representation, and execution strategies based on data characteristics.
Contribution
Proteus is the first hardware framework to dynamically optimize PUD operations by adjusting bit-precision and data representation, improving efficiency and performance.
Findings
Reduces PUD operation latency significantly.
Improves energy efficiency through dynamic bit-precision adjustment.
Enhances throughput through concurrent execution across multiple DRAM arrays.
Abstract
Processing-using-DRAM (PUD) is a paradigm where the analog operational properties of DRAM are used to perform bulk logic operations. While PUD promises high throughput at low energy and area cost, we uncover three limitations of existing PUD approaches that lead to significant inefficiencies: (i) static data representation, i.e., two's complement with fixed bit-precision, leading to unnecessary computation over useless (i.e., inconsequential) data; (ii) support for only throughput-oriented execution, where the high latency of individual PUD operations can only be hidden in the presence of bulk data-level parallelism; and (iii) high latency for high-precision (e.g., 32-bit) operations. To address these issues, we propose Proteus, the first hardware framework that addresses the high execution latency of bulk bitwise PUD operations by implementing a data-aware runtime engine for PUD.…
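The first limitation in the abstract (fixed bit-precision forcing computation over inconsequential bits) can be illustrated with a minimal Python sketch. It is an illustration only, not the Proteus runtime engine: the function names `min_twos_complement_bits` and `redundant_bit_fraction` are hypothetical, and the sketch assumes that operands are declared as 32-bit two's complement integers and that the cost of an operation grows with the number of bit positions processed.

```python
def min_twos_complement_bits(values):
    """Smallest two's complement width (in bits) that holds every value."""
    width = 1  # even an all-zero (or empty) input needs one bit
    for v in values:
        # Non-negative v needs v.bit_length() magnitude bits plus a sign bit.
        # Negative v needs (~v).bit_length() magnitude bits plus a sign bit,
        # because ~v == -v - 1 for Python integers.
        magnitude_bits = v.bit_length() if v >= 0 else (~v).bit_length()
        width = max(width, magnitude_bits + 1)
    return width


def redundant_bit_fraction(values, declared_bits=32):
    """Fraction of the declared width that carries no information (assumed model)."""
    used = min_twos_complement_bits(values)
    if used > declared_bits:
        raise ValueError("values do not fit in the declared width")
    return (declared_bits - used) / declared_bits


# Hypothetical operand vector declared as 32-bit integers.
data = [3, -7, 120, -128, 45]
needed = min_twos_complement_bits(data)
wasted = redundant_bit_fraction(data, declared_bits=32)
print(f"needed bits: {needed}, redundant fraction: {wasted:.2f}")
```

Under these assumptions, every value in `data` fits in 8 bits, so a fixed 32-bit operation would spend three quarters of its bit positions on sign-extension bits that do not affect the result.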
Taxonomy
Topics: Low-power high-performance VLSI design · Parallel Computing and Optimization Techniques · VLSI and FPGA Design Techniques
