GEN-Graph: Heterogeneous PIM Accelerator for General Computational Patterns in Graph-based Dynamic Programming
Yanru Chen, Runyang Tian, Zheyu Li, Mahbod Afarin, Weihong Xu, Tajana Rosing

TL;DR
GEN-Graph introduces a heterogeneous PIM chiplet with specialized tiles for matrix-centric and traversal-centric dynamic programming, significantly improving performance and energy efficiency in genomics and network analytics.
Contribution
The paper presents what it describes as the first scalable, exact heterogeneous PIM architecture tailored to the distinct computation patterns of graph-based DP, combining recursive partitioning with reconfigurable logic.
Findings
Matrix tile achieves 42.8x speedup over NVIDIA H100 for APSP.
Traversal tile reaches 2.56 million reads/sec on short reads.
Outperforms state-of-the-art accelerators by up to 2.56x in throughput.
Abstract
While graph-based dynamic programming (DP) is a cornerstone of genomics and network analytics, its efficiency is hampered by fundamentally conflicting computational patterns. Matrix-centric DP drives regular, compute-bound network analytics, while topology-centric DP handles irregular, memory-bound genomic traversals. These two categories of DP have substantially different computation patterns and dataflows, which makes it difficult for a single homogeneous processing-in-memory (PIM) architecture to efficiently support both. This work presents GEN-Graph, a novel heterogeneous PIM chiplet that integrates two types of specialized compute tiles within a 2.5D package: Matrix-tile, a processing-using-memory (PUM) tile optimized for matrix-centric workloads, such as all-pairs shortest path (APSP); and traversal-tile, a processing-near-memory (PNM) tile optimized for traversal-centric DP…
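To make the "matrix-centric" pattern concrete, the sketch below shows textbook Floyd-Warshall for APSP in plain Python. It is not the paper's kernel or its recursive partitioning scheme; the function name `floyd_warshall` and the example graph are illustrative assumptions. The point is only that every step applies the same min-plus update to a dense distance matrix, which is the regular, data-independent access pattern the abstract contrasts with irregular graph traversal.

```python
import math

INF = math.inf  # marks "no direct edge"


def floyd_warshall(weights):
    """Textbook Floyd-Warshall APSP on a dense adjacency matrix.

    Illustrative only: not the GEN-Graph kernel. Assumes no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not mutated
    for k in range(n):          # allow vertex k as an intermediate hop
        for i in range(n):
            dik = dist[i][k]
            for j in range(n):
                # Min-plus update: same operation for every (i, j) cell.
                cand = dik + dist[k][j]
                if cand < dist[i][j]:
                    dist[i][j] = cand
    return dist


# Hypothetical 4-vertex directed graph, used purely as an example.
example = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
result = floyd_warshall(example)
```

The loop bounds and memory accesses depend only on the matrix size, never on the graph's structure, which is why this class of DP maps naturally onto array-style in-memory compute.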
