MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration
Matheus Cavalcante, Anthony Agnesina, Samuel Riedel, Moritz Brunion, Alberto Garcia-Ortiz, Dragomir Milojevic, Francky Catthoor, Sung Kyu Lim, and Luca Benini

TL;DR
MemPool-3D applies 3D integration to the MemPool many-core cluster, partitioning its memory resources across two stacked dies to outperform the 2D design in matrix-multiplication performance and energy consumption.
Contribution
This paper introduces a 3D MemPool design that partitions memory resources across two layers to balance the size and utilization of the stacked dies, demonstrating performance and energy-efficiency gains over its 2D counterpart.
Findings
9.1% performance improvement in matrix multiplication
15% reduction in energy consumption for 4 MiB memory
3.7% energy savings with reduced memory capacity
Abstract
Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D integration by enhancing MemPool, an open-source many-core design with 256 cores and a shared pool of L1 scratchpad memory connected with a low-latency interconnect. MemPool's baseline 2D design is severely limited by routing congestion and wire propagation delay, making the design ideal for 3D integration. In architectural terms, we increase MemPool's scratchpad memory capacity beyond the sweet spot for 2D designs, improving performance in a common digital signal processing kernel. We propose a 3D MemPool design that leverages a smart partitioning of the memory resources across two layers to balance the size and utilization of the stacked…
Taxonomy
Topics: 3D IC and TSV technologies · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems
