DICE: Enabling Efficient General-Purpose SIMT Execution with Statically Scheduled Coarse-Grained Reconfigurable Arrays

Jiayi Wang; Ang Da Lu; Zhichen Zeng; Ang Li

arXiv:2605.05496·cs.AR·May 8, 2026

TL;DR

DICE replaces the SIMD backend of a GPU with statically scheduled coarse-grained reconfigurable arrays (CGRAs), cutting register file traffic and energy consumption while keeping performance comparable to a conventional GPU.

Contribution

The paper proposes a CGRA-based GPU architecture that keeps scheduling static while still handling runtime variability, such as variable-latency memory loads and data-dependent control flow, to achieve high energy efficiency on GPU-like workloads.

Findings

01. Reduces register file accesses by 68% on average.

02. Achieves 1.77-1.90x energy efficiency improvements.

03. Maintains comparable performance to traditional GPU architectures.
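
As a rough reading of the 1.77-1.90x figure, the arithmetic below converts an efficiency gain into the fraction of baseline energy used. It assumes that "energy efficiency" means work done per unit of energy and that both designs perform the same work; the summary does not state the metric or the baseline, so treat this as an illustration only. The function name `energy_fraction` is made up for this sketch.

```python
# Illustrative arithmetic only. Assumption: energy efficiency = work / energy,
# with the same amount of work done on both designs.

def energy_fraction(efficiency_gain: float) -> float:
    """Fraction of baseline energy consumed for the same work."""
    return 1.0 / efficiency_gain

for gain in (1.77, 1.90):
    frac = energy_fraction(gain)
    print(f"{gain:.2f}x -> {frac:.1%} of baseline energy, {1 - frac:.1%} saved")
```

Under that assumption, the reported range corresponds to using a little over half of the baseline energy for the same work.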

Abstract

While GPUs dominate massively parallel computing through the single-instruction, multiple-thread (SIMT) programming model, their underlying single-instruction, multiple-data (SIMD) execution incurs substantial energy overhead from frequent register file (RF) accesses and complex control logic. We present DICE, a novel architecture that addresses these inefficiencies by replacing the SIMD backend with minimal-overhead, statically scheduled coarse-grained reconfigurable arrays (CGRAs). Unlike SIMD units that execute warps of threads in lockstep, DICE dispatches active threads in a pipelined manner onto the CGRA fabric, where data flow directly between processing elements (PEs), reducing RF accesses for intermediate values. To handle operations with runtime dynamism, such as variable-latency memory loads and data-dependent control flow, while preserving static scheduling, DICE compiles…
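
The claim that passing data directly between PEs reduces RF accesses for intermediate values can be illustrated with a toy count. This is a minimal sketch, not the paper's model or simulator: the kernel `out = (a*b + c) * d`, the names `OPS` and `LIVE_OUT`, and both counting functions are hypothetical. It assumes a register-based SIMD pipeline reads every source operand from the RF and writes every result back, while a dataflow fabric reads each live-in value once and writes only live-out values.

```python
# Toy model: per-thread register file (RF) accesses for one expression under
# two execution styles. All names and the kernel are hypothetical.

# Each op is (dest, opcode, src1, src2). Kernel: out = (a*b + c) * d
OPS = [
    ("t0", "mul", "a", "b"),
    ("t1", "add", "t0", "c"),
    ("out", "mul", "t1", "d"),
]
LIVE_OUT = {"out"}  # values that must be visible after the kernel


def rf_accesses_simd(ops):
    """Assumed SIMD model: each instruction reads its sources from the RF
    and writes its result back to the RF."""
    reads = sum(len(op[2:]) for op in ops)
    writes = len(ops)
    return reads + writes


def rf_accesses_dataflow(ops, live_out):
    """Assumed dataflow model: only live-in values are read from the RF
    (once each) and only live-out values are written; intermediates are
    forwarded from PE to PE."""
    produced = {op[0] for op in ops}
    live_in = {src for op in ops for src in op[2:] if src not in produced}
    return len(live_in) + len(live_out)


simd = rf_accesses_simd(OPS)
dataflow = rf_accesses_dataflow(OPS, LIVE_OUT)
reduction = 1 - dataflow / simd
print(f"SIMD: {simd}, dataflow: {dataflow}, reduction: {reduction:.1%}")
```

In this toy case the intermediates `t0` and `t1` never touch the RF in the dataflow model, so the count drops from 9 to 5. The number is specific to this made-up expression and is unrelated to the 68% average reported for DICE.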
