Sorted Weight Sectioning for Energy-Efficient Unstructured Sparse DNNs on Compute-in-Memory Crossbars
Matheus Farias, H. T. Kung

TL;DR
This paper presents sorted weight sectioning (SWS), an algorithm that optimizes weight placement on compute-in-memory crossbars to significantly reduce ADC energy consumption in unstructured sparse DNNs, especially for models like BERT.
Contribution
The paper introduces SWS, a novel weight allocation algorithm that leverages weight distribution and sparsity to minimize ADC energy use in CIM crossbars for DNNs.
Findings
Reduces ADC energy by 89.5% on unstructured sparse BERT models.
Effectively exploits weight distribution and sparsity for energy savings.
Decreases ADC energy consumption without significantly degrading DNN accuracy.
Abstract
We introduce sorted weight sectioning (SWS): a weight allocation algorithm that places sorted deep neural network (DNN) weight sections on bit-sliced compute-in-memory (CIM) crossbars to reduce analog-to-digital converter (ADC) energy consumption. Data conversions are the most energy-intensive process in crossbar operation. SWS effectively reduces this cost by leveraging (1) small weights and (2) zero weights (weight sparsity). DNN weights follow bell-shaped distributions, with most weights near zero. Using SWS, we only need low-order crossbar columns for sections with low-magnitude weights. This reduces the quantity and resolution of ADCs used, exponentially decreasing ADC energy costs without significantly degrading DNN accuracy. Unstructured sparsification further sharpens the weight distribution with small accuracy loss. However, it presents challenges in hardware…
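The core idea in the abstract (sort weights, cut them into sections, and give low-magnitude sections only low-order bit-slice columns) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `sorted_weight_sections`, the section size, the unsigned 8-bit magnitude quantization, and the one-bit-per-column assumption are all hypothetical choices made for illustration, and weight signs are ignored.

```python
import numpy as np


def sorted_weight_sections(weights, section_size=64, n_bits=8):
    """Illustrative sketch only (hypothetical helper, not the paper's code).

    Quantizes weight magnitudes to n_bits, sorts them, splits the sorted
    list into sections of `section_size`, and reports how many one-bit
    crossbar columns each section would need to hold its largest value.
    """
    w = np.asarray(weights, dtype=np.float64).ravel()
    max_abs = np.max(np.abs(w)) if w.size else 0.0
    if max_abs == 0.0:
        q = np.zeros(w.size, dtype=np.int64)
    else:
        # Assumption: uniform quantization of magnitudes to [0, 2^n_bits - 1].
        levels = 2 ** n_bits - 1
        q = np.rint(np.abs(w) / max_abs * levels).astype(np.int64)

    # Sort by quantized magnitude so small weights are grouped together.
    order = np.argsort(q, kind="stable")

    columns_needed = []
    for start in range(0, q.size, section_size):
        section = q[order[start:start + section_size]]
        # A section whose largest value fits in k bits needs only the k
        # low-order columns; an all-zero section needs none.
        columns_needed.append(int(section.max()).bit_length())
    return order, columns_needed


# Toy example: eight already-quantized magnitudes, sections of four.
order, cols = sorted_weight_sections(
    np.array([255, 0, 1, 0, 0, 2, 3, 0]), section_size=4, n_bits=8
)
print(cols)  # [0, 8]
```

In this toy example, sectioning the weights in their original order would require 8 and 2 columns for the two sections, while sorting first leaves one section entirely zero, so only one section needs any columns (and hence any ADC conversions) under these assumptions.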
Taxonomy
Topics: Network Packet Processing and Optimization · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing
Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Dense Connections · WordPiece · Residual Connection · Linear Warmup With Linear Decay · Dropout
