Versa: A Dataflow-Centric Multiprocessor with 36 Systolic ARM Cortex-M4F Cores and a Reconfigurable Crossbar-Memory Hierarchy in 28nm
Sung Kim, Morteza Fayazi, Alhad Daftardar, Kuan-Yu Chen, Jielun Tan, Subhankar Pal, Tutu Ajayi, Yan Xiong, Trevor Mudge, Chaitali Chakrabarti, David Blaauw, Ronald Dreslinski, Hun-Seok Kim

TL;DR
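Versa is an energy-efficient multiprocessor with 36 systolic ARM Cortex-M4F cores and a runtime-reconfigurable memory hierarchy, designed to optimize bandwidth, access latency, and data reuse. It achieves median energy-efficiency improvements of 11.6x and 37.2x over mobile CPU and GPU baselines, respectively.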
Contribution
This paper introduces Versa, a novel dataflow-centric multiprocessor architecture with a reconfigurable memory hierarchy tailored for energy efficiency.
Findings
Median energy-efficiency improvements of 11.6x over mobile CPU
Median energy-efficiency improvements of 37.2x over mobile GPU
Reconfiguring between Versa modes improves energy efficiency across kernels with diverse data access, control, and synchronization characteristics
Abstract
We present Versa, an energy-efficient processor with 36 systolic ARM Cortex-M4F cores and a runtime-reconfigurable memory hierarchy. Versa exploits algorithm-specific characteristics in order to optimize bandwidth, access latency, and data reuse. Measured on a set of kernels with diverse data access, control, and synchronization characteristics, reconfiguration between different Versa modes yields median energy-efficiency improvements of 11.6x and 37.2x over mobile CPU and GPU baselines, respectively.
