TensorPool: A 3D-Stacked 8.4TFLOPS/4.3W Many-Core Domain-Specific Processor for AI-Native Radio Access Networks

Marco Bertuletti; Yichao Zhang; Diyou Shen; Alessandro Vanelli-Coralli; Frank K. Gürkaynak; Luca Benini

arXiv:2604.02291 · cs.AR · April 3, 2026

TL;DR

TensorPool is a specialized many-core processor for AI-native radio access networks. Its headline figures are 8.4 TFLOPS of tensor throughput at 4.3 W, targeting the strict power and latency constraints of 6G base stations.

Contribution

We introduce TensorPool, a domain-specific, 3D-stacked many-core processor with tensor acceleration, optimized for AI-based 6G RAN, demonstrating significant performance and efficiency improvements.

Findings

01

TensorPool achieves 3643 MACs/cycle with 89% tensor-unit utilization.

02

It provides 6× more tensor performance than a core-only cluster.

03

TensorPool improves GOPS/W/mm² efficiency by 9.1×.
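The headline numbers can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope reading of the figures quoted on this page: it assumes that utilization means achieved MACs/cycle divided by peak MACs/cycle, and that the 8.4 TFLOPS and 4.3 W in the title refer to the same operating point. Neither assumption is confirmed here, and the function names are illustrative.

```python
def implied_peak_macs_per_cycle(achieved_macs: float, utilization: float) -> float:
    """Peak MACs/cycle implied by an achieved rate and a utilization fraction.

    Assumes utilization = achieved / peak (an assumption, not stated on this page).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return achieved_macs / utilization


def tflops_per_watt(tflops: float, watts: float) -> float:
    """Energy efficiency as throughput divided by power.

    Assumes both figures describe the same operating point (an assumption).
    """
    if watts <= 0.0:
        raise ValueError("watts must be positive")
    return tflops / watts


# Figures quoted on this page: 3643 MACs/cycle at 89% utilization,
# and 8.4 TFLOPS / 4.3 W from the title.
peak = implied_peak_macs_per_cycle(3643, 0.89)
efficiency = tflops_per_watt(8.4, 4.3)

print(f"implied peak: {peak:.0f} MACs/cycle")
print(f"efficiency:   {efficiency:.2f} TFLOPS/W")
```

Under these assumptions the implied peak is roughly 4093 MACs/cycle, close to 4096 (the gap is consistent with rounding of the 89% figure), and the efficiency is about 1.95 TFLOPS/W.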

Abstract

The upcoming integration of AI in the physical layer (PHY) of 6G radio access networks (RAN) will enable a higher quality of service in challenging transmission scenarios. However, deeply optimized AI-Native PHY models impose higher computational complexity compared to conventional baseband, challenging deployment under the sub-msec real-time constraints typical of modern PHYs. Additionally, following the extension to terahertz carriers, the upcoming densification of 6G cell-sites further limits the power consumption of base stations, constraining the budget available for compute (≤ 100 W). The desired flexibility to ensure long term sustainability and the imperative energy-efficiency gains on the high-throughput tensor computations dominating AI-Native PHYs can be achieved by domain-specialization of many-core programmable baseband processors. Following the domain-specialization…
