MINISA: Minimal Instruction Set Architecture for Next-gen Reconfigurable Inference Accelerator

Jianming Tong; Devansh Jain; Yujie Li; Charith Mendis; Tushar Krishna

arXiv:2603.20623 · cs.AR · March 24, 2026

Open Access

TL;DR

MINISA is a minimal instruction set for reconfigurable AI accelerators. It cuts off-chip instruction traffic by a factor of up to 4 × 10^5 and eliminates instruction-fetch stalls, yielding speedups of up to 31.6x across diverse workloads.

Contribution

The paper proposes MINISA, a minimal instruction set that programs reconfigurable accelerators at the granularity of Virtual Neurons (VNs), reducing control overhead while still supporting flexible data layouts.

Findings

01. Reduces off-chip instruction traffic by a factor of up to 4 × 10^5.

02. Eliminates instruction-fetch stalls, yielding speedups of up to 31.6x.

03. Supports diverse workloads, including AI, fully homomorphic encryption (FHE), and zero-knowledge proofs (ZKP).
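
To give a sense of scale for findings 01 and 02, the following is a small arithmetic sketch rather than anything taken from the paper. The factors are the best-case ("up to") values quoted above; the baseline size, the baseline runtime, and the function names are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only: the baseline values below are made-up numbers,
# and the factors are the best-case ("up to") values quoted in the findings.

MAX_TRAFFIC_REDUCTION = 4e5  # finding 01: up to 4 x 10^5 less off-chip instruction traffic
MAX_SPEEDUP = 31.6           # finding 02: up to 31.6x faster


def best_case_traffic(baseline_bytes: float) -> float:
    """Off-chip instruction bytes remaining under the best-case reduction."""
    return baseline_bytes / MAX_TRAFFIC_REDUCTION


def best_case_runtime(baseline_seconds: float) -> float:
    """Runtime under the best-case speedup."""
    return baseline_seconds / MAX_SPEEDUP


if __name__ == "__main__":
    # Assumed baseline: 1 GiB of fine-grained configuration bits (not from the paper).
    hypothetical_baseline_bytes = 1 << 30
    print(f"{best_case_traffic(hypothetical_baseline_bytes):.0f} bytes")
    # Assumed baseline: a 1-second run dominated by instruction-fetch stalls.
    print(f"{best_case_runtime(1.0) * 1e3:.1f} ms")
```

Under these assumed baselines, 1 GiB of control traffic would shrink to roughly 2.7 KB and a 1-second run to about 32 ms. Because both factors are upper bounds, typical workloads would be expected to see smaller gains.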

Abstract

Modern reconfigurable AI accelerators rely on rich mapping and data-layout flexibility to sustain high utilization across matrix multiplication, convolution, and emerging applications beyond AI. However, exposing this flexibility through fine-grained micro-control results in prohibitive control overhead of fetching configuration bits from off-chip memory. This paper presents MINISA, a minimal instruction set that programs a reconfigurable accelerator at the granularity of Virtual Neurons (VNs), the coarsest control granularity that retains flexibility of hardware and the finest granularity that avoids unnecessary control costs. First, we introduce FEATHER+, a modest refinement of FEATHER, that eliminates redundant on-chip replication needed for runtime dataflow/layout co-switching and supports dynamic cases where input and weight data are unavailable before execution for offline layout…

Taxonomy

Topics: Embedded Systems Design Techniques · Advanced Neural Network Applications · Parallel Computing and Optimization Techniques