FERMI-ML: A Flexible and Resource-Efficient Memory-In-Situ SRAM Macro for TinyML acceleration

Mukul Lokhande, Akash Sankhe, S. V. Jaya Chand, and Santosh Kumar Vishvakarma

arXiv:2511.12544 · cs.AR · February 12, 2026

TL;DR

FERMI-ML is a reconfigurable 4 KB SRAM macro for TinyML that performs in-situ MAC computation and CAM-based lookup within the same array, targeting low-power AIoT devices.

Contribution

The paper presents a 9T XNOR-based bit-cell (RX9T) that pairs a 5T storage cell with a 4T XNOR compute unit, so a single compact array supports both variable-precision MAC and CAM operations.

Findings

01 Achieves 1.93 TOPS throughput at 350 MHz and 0.9 V in 65 nm technology.

02 Offers 364 TOPS/W energy efficiency for TinyML workloads.

03 Maintains over 97.5% quality of results (QoR) on models such as InceptionV4 and ResNet-18.
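The throughput and efficiency figures above can be combined into two rough derived quantities: the implied power draw and the number of operations per clock cycle. The sketch below is only back-of-the-envelope arithmetic. It assumes that both figures were measured at the same operating point and precision, which this summary does not confirm, and the function names are hypothetical.

```python
# Figures quoted in the findings above.
THROUGHPUT_TOPS = 1.93          # tera-operations per second
EFFICIENCY_TOPS_PER_W = 364.0   # tera-operations per second per watt
CLOCK_MHZ = 350.0               # operating frequency

def implied_power_mw(tops: float, tops_per_w: float) -> float:
    """Power in milliwatts implied by throughput / efficiency.

    Assumption: both numbers refer to the same operating point.
    """
    return tops / tops_per_w * 1e3

def ops_per_cycle(tops: float, clock_mhz: float) -> float:
    """Average operations completed per clock cycle."""
    return (tops * 1e12) / (clock_mhz * 1e6)

if __name__ == "__main__":
    print(f"Implied power: {implied_power_mw(THROUGHPUT_TOPS, EFFICIENCY_TOPS_PER_W):.2f} mW")
    print(f"Ops per cycle: {ops_per_cycle(THROUGHPUT_TOPS, CLOCK_MHZ):.0f}")
```

Under that assumption the macro would draw on the order of a few milliwatts. If the two figures come from different precisions or voltages, the estimate does not hold.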

Abstract

The growing demand for low-power and area-efficient TinyML inference on AIoT devices necessitates memory architectures that minimise data movement while sustaining high computational efficiency. This paper presents FERMI-ML, a Flexible and Resource-Efficient Memory-In-Situ (MIS) SRAM macro designed for TinyML acceleration. The proposed 9T XNOR-based RX9T bit-cell integrates a 5T storage cell with a 4T XNOR compute unit, enabling variable-precision MAC and CAM operations within the same array. A 22-transistor (C22T) compressor-tree-based accumulator facilitates logarithmic 1-64-bit MAC computation with reduced delay and power compared to conventional adder trees. The 4 KB macro achieves dual functionality for in-situ computation and CAM-based lookup operations, supporting Posit-4 or FP-4 precision. Post-layout results at 65 nm show operation at 350 MHz with 0.9 V, delivering a throughput…
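To illustrate why one XNOR primitive can serve both MAC and CAM operations, here is a minimal software analogy. It is not a model of the RX9T cell or the C22T compressor tree. It shows the generic XNOR-and-popcount technique for 1-bit (binary ±1) dot products, and an exact-match lookup built on the same XNOR result. All function names and the bit encoding (1 for +1, 0 for -1) are assumptions made for this sketch.

```python
from typing import List

def xnor_bits(a: int, b: int, width: int) -> int:
    """Bitwise XNOR of two width-bit words (1 where the bits agree)."""
    mask = (1 << width) - 1
    return ~(a ^ b) & mask

def binary_mac(a: int, b: int, width: int) -> int:
    """Dot product of two +/-1 vectors packed as bits (1 -> +1, 0 -> -1).

    Each agreeing bit contributes +1 and each disagreeing bit -1,
    so the sum equals 2 * popcount(xnor) - width.
    """
    matches = bin(xnor_bits(a, b, width)).count("1")
    return 2 * matches - width

def cam_lookup(rows: List[int], key: int, width: int) -> List[int]:
    """Return indices of stored rows that match the key on every bit.

    A row matches when its XNOR with the key is all ones.
    """
    all_ones = (1 << width) - 1
    return [i for i, row in enumerate(rows)
            if xnor_bits(row, key, width) == all_ones]

if __name__ == "__main__":
    stored = [0b1011, 0b1001, 0b1011]
    print(binary_mac(0b1011, 0b1001, 4))   # accumulate over 4 one-bit products
    print(cam_lookup(stored, 0b1011, 4))   # rows equal to the search key
```

In hardware, the popcount step would be done by an accumulator such as the compressor tree mentioned in the abstract. The variable-precision and Posit-4/FP-4 modes are beyond what this 1-bit sketch covers.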
