TL;DR
TeraPool-SDR is a 1024-core RV32 cluster with shared L1 memory for next-generation software-defined radios. Implemented in 12nm FinFET technology, it reaches 93-125 GOPS/W on key 5G RAN kernels while consuming less than 10W.
Contribution
This paper introduces TeraPool-SDR, a novel 1024-core cluster architecture with shared L1 memory, optimized for SDR workloads, demonstrating high throughput and energy efficiency in a compact design.
Findings
Achieves 93-125 GOPS/W on key 5G RAN kernels.
Operates at up to 924MHz in 12nm FinFET technology.
Consumes less than 10W for all tested kernels.
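As a rough reading of the figures above, multiplying the reported efficiency range by the 10W power ceiling gives a loose upper bound on sustained throughput. This is only a sketch: pairing every efficiency value with the full 10W is an assumption, since this summary does not give per-kernel power, and the function and constant names are illustrative.

```python
def throughput_ceiling_gops(efficiency_gops_per_w, power_w):
    """Upper bound on throughput (GOPS) for a given efficiency and power budget."""
    return efficiency_gops_per_w * power_w

# Figures taken from the findings above; the 10 W value is a ceiling,
# not a measured per-kernel power (assumption for this estimate).
POWER_CEILING_W = 10.0
EFFICIENCY_RANGE_GOPS_PER_W = (93.0, 125.0)

bounds = [throughput_ceiling_gops(e, POWER_CEILING_W)
          for e in EFFICIENCY_RANGE_GOPS_PER_W]
print(bounds)
```

Because each kernel draws less than 10W, its actual throughput sits below the corresponding bound.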
Abstract
Radio Access Network (RAN) workloads are rapidly scaling up in data processing intensity and throughput as the 5G (and beyond) standards grow in number of antennas and sub-carriers. Offering flexible Processing Elements (PEs), efficient memory access, and a productive parallel programming model, many-core clusters are a well-matched architecture for next-generation software-defined RANs, but staggering performance requirements demand a high number of PEs coupled with extreme Power, Performance and Area (PPA) efficiency. We present the architecture, design, and full physical implementation of TeraPool-SDR, a cluster for Software Defined Radio (SDR) with 1024 latency-tolerant, compact RV32 PEs, sharing a global view of a 4MiB, 4096-banked, L1 memory. We report various feasible configurations of TeraPool-SDR featuring an ultra-high bandwidth PE-to-L1-memory interconnect, clocked at…
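The memory figures in the abstract imply a simple geometry, which the following sketch works out. It assumes the banks are equally sized and that a word is 4 bytes (a natural choice for RV32, but not stated in the abstract); the function name and dictionary keys are illustrative.

```python
MIB = 1024 * 1024

def l1_geometry(num_pes=1024, l1_bytes=4 * MIB, num_banks=4096, word_bytes=4):
    """Derive per-bank and per-PE figures for a banked shared L1.

    Assumes equally sized banks; word_bytes=4 is an assumption for RV32.
    """
    bank_bytes = l1_bytes // num_banks
    return {
        "bank_bytes": bank_bytes,                  # capacity of one bank
        "words_per_bank": bank_bytes // word_bytes,
        "banks_per_pe": num_banks // num_pes,      # banking factor
        "l1_bytes_per_pe": l1_bytes // num_pes,    # L1 share per PE
    }

geometry = l1_geometry()
print(geometry)
```

Under these assumptions each bank holds 1 KiB, and there are four banks for every PE.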
