Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge

M. Grailoo; J. Núñez-Yáñez

arXiv:2605.00536·cs.DC·May 5, 2026


TL;DR

Tempus is a scalable, resource-invariant GEMM streaming framework optimized for AMD Versal AI Edge SoCs, enabling efficient edge inference of large language models with high performance and low resource consumption.

Contribution

The paper introduces a temporal GEMM approach that scales to larger matrices without expanding hardware resources, and reports that it outperforms spatial scaling methods on resource-limited edge devices.
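To make the contrast between temporal and spatial scaling concrete, here is a minimal sketch of the general idea of temporal tiling: one fixed-size kernel is reused over many time steps instead of adding more kernels as the matrices grow. This is an illustration only and not the Tempus dataflow, which the truncated abstract does not describe; the names `tile_kernel` and `temporal_gemm` and the `tile` parameter are hypothetical.

```python
def tile_kernel(a_tile, b_tile, c_tile):
    """Fixed-size multiply-accumulate: c_tile += a_tile @ b_tile (in place)."""
    for i in range(len(a_tile)):
        for j in range(len(b_tile[0])):
            acc = c_tile[i][j]
            for k in range(len(b_tile)):
                acc += a_tile[i][k] * b_tile[k][j]
            c_tile[i][j] = acc


def temporal_gemm(a, b, tile=2):
    """Multiply a (m x k) by b (k x n) using a single tile-sized kernel.

    Hypothetical illustration: the kernel size stays constant, and larger
    matrices only increase the number of sequential kernel invocations.
    Returns the product and the number of invocations.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Assumption for brevity: every dimension is a multiple of the tile size.
    assert m % tile == 0 and k % tile == 0 and n % tile == 0

    c = [[0] * n for _ in range(m)]
    steps = 0
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # One output tile is accumulated locally across the k dimension.
            c_tile = [[0] * tile for _ in range(tile)]
            for k0 in range(0, k, tile):
                a_tile = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                tile_kernel(a_tile, b_tile, c_tile)
                steps += 1
            # Write the finished output tile back into the full result.
            for di in range(tile):
                c[i0 + di][j0:j0 + tile] = c_tile[di]
    return c, steps
```

In this sketch, doubling every matrix dimension multiplies the number of kernel invocations by eight while the kernel and its tile buffers stay the same size, which is the sense in which the compute resources are invariant to the problem size.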

Findings

01  Achieves 607 GOPS at 10.677 W on-chip power.

02  211.2x higher prominence factor than spatial SOTA (ARIES).

03  0.00% utilization of URAM/DSP, with 22.0x core frugality.
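The first finding implies an energy-efficiency figure that can be checked with simple arithmetic. The snippet below divides the reported throughput by the reported on-chip power; the resulting GOPS/W value is derived here and is not a number quoted on this page, and the variable names are illustrative.

```python
# Figures copied from the first finding.
throughput_gops = 607.0    # reported throughput in GOPS
on_chip_power_w = 10.677   # reported on-chip power in watts

# Throughput per watt, derived from the two reported figures.
efficiency_gops_per_w = throughput_gops / on_chip_power_w

print(f"{efficiency_gops_per_w:.2f} GOPS/W")
```

This works out to roughly 56.85 GOPS/W of on-chip power.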

Abstract

Scaling laws for Large Language Models (LLMs) establish that model quality improves with computational scale, yet edge deployment imposes strict constraints on compute, memory, and power. Since General Matrix Multiplication (GEMM) accounts for up to 90% of inference time, efficient GEMM acceleration is critical for edge AI. The Adaptive Intelligent Engines available in the AMD Versal adaptive SoCs are well suited for this task, but existing state-of-the-art (SOTA) frameworks maximize performance through spatial scaling, distributing workloads across hundreds of cores -- an approach that fails on resource-limited edge SoCs due to physical implementation failures, bandwidth saturation, and excessive resource consumption. We propose Tempus, a Resource-Invariant Temporal GEMM framework for the AMD Versal AI Edge SoC. Rather than expanding hardware resources with matrix size, Tempus employs…


Code & Models

Repositories

mgrailoo/TEMPUS (GitHub)
