Computing-In-Memory Dataflow for Minimal Buffer Traffic
Choongseok Song, Doo Seok Jeong

TL;DR
This paper introduces a CIM dataflow that cuts buffer traffic during depthwise convolution by 77.4-87.0% in the tested models, which in turn lowers data traffic energy and latency for edge AI devices.
Contribution
It presents a new CIM dataflow designed to minimize buffer traffic and enhance memory utilization during depthwise convolution, backed by solid theoretical analysis.
Findings
Reduces buffer traffic by 77.4-87.0% in tested models.
Decreases data traffic energy by 10.1-17.9%.
Lowers latency by 15.6-27.8%.
Abstract
Computing-In-Memory (CIM) offers a potential solution to the memory wall issue and can achieve high energy efficiency by minimizing data movement, making it a promising architecture for edge AI devices. Lightweight models like MobileNet and EfficientNet, which utilize depthwise convolution for feature extraction, have been developed for these devices. However, CIM macros often face challenges in accelerating depthwise convolution, including underutilization of CIM memory and heavy buffer traffic. The latter, in particular, has been overlooked despite its significant impact on latency and energy consumption. To address this, we introduce a novel CIM dataflow that significantly reduces buffer traffic by maximizing data reuse and improving memory utilization during depthwise convolution. The proposed dataflow is grounded in solid theoretical principles, fully demonstrated in this paper.…
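To make the buffer-traffic problem concrete, here is a minimal Python sketch of a depthwise convolution with two simple read counts. It is not the dataflow proposed in the paper. It only illustrates why overlapping sliding windows cause the same inputs to be fetched repeatedly. The function names, the stride-1 and no-padding setup, and the "fetch every window from the buffer" baseline are all assumptions made for illustration.

```python
import numpy as np

def depthwise_conv2d(x, w):
    """Depthwise convolution: each channel is filtered by its own K x K kernel.

    x: input of shape (C, H, W); w: kernels of shape (C, K, K).
    Assumes stride 1 and no padding (illustrative choice).
    """
    C, H, W = x.shape
    K = w.shape[1]
    out = np.zeros((C, H - K + 1, W - K + 1), dtype=x.dtype)
    for c in range(C):
        for i in range(H - K + 1):
            for j in range(W - K + 1):
                # One output uses only K*K weights, not C*K*K as in a
                # standard convolution, so there is little work per fetch.
                out[c, i, j] = np.sum(x[c, i:i + K, j:j + K] * w[c])
    return out

def naive_buffer_reads(C, H, W, K):
    """Hypothetical baseline: every window re-fetches all K*K of its inputs."""
    return C * (H - K + 1) * (W - K + 1) * K * K

def ideal_buffer_reads(C, H, W):
    """Lower bound with perfect reuse: each input element is fetched once."""
    return C * H * W

if __name__ == "__main__":
    C, H, W, K = 2, 5, 5, 3
    y = depthwise_conv2d(np.ones((C, H, W)), np.ones((C, K, K)))
    print(y.shape)
    print(naive_buffer_reads(C, H, W, K), ideal_buffer_reads(C, H, W))
```

For a 2-channel 5x5 input and 3x3 kernels, the naive baseline performs 162 input reads while perfect reuse needs only 50. The gap between those two counts is the kind of redundancy a reuse-oriented dataflow tries to remove. The actual savings reported in the paper depend on its specific dataflow and hardware, which this sketch does not model.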
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
