On the random access performance of Cell Broadband Engine with graph analysis application
Mingyu Chen, David A. Bader

TL;DR
This paper evaluates the Cell Broadband Engine on applications dominated by random memory access, using the GUPS and SSCA#2 benchmarks, and finds that it outperforms two conventional multi-processor systems with the same core/thread count, despite certain limitations.
Contribution
It provides a detailed analysis of Cell/BE's suitability for irregular memory access applications, including optimization techniques and performance comparisons.
Findings
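GUPS is about 40-80% faster on Cell/BE than on two conventional multi-processor systems with the same core/thread count
SSCA#2 is about 17-30% faster on Cell/BE than on the same two systems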
Cell/BE has potential for irregular memory access applications
Abstract
The Cell Broadband Engine (Cell/BE) processor has a unique memory access architecture in addition to its powerful computing engines. Many compute-intensive applications have been ported to Cell/BE successfully, but memory-intensive applications have rarely been investigated, apart from several micro-benchmarks. Since Cell/BE has a powerful software-visible DMA engine, this paper studies whether Cell/BE is suited to applications with a large number of random memory accesses. Two benchmarks, GUPS and SSCA#2, are used. The latter is a rather complex one that is representative of real-world graph analysis applications. We find that both benchmarks perform well on the Cell/BE-based IBM QS20/22. Compared with two conventional multi-processor systems with the same core/thread count, GUPS is about 40-80% faster and SSCA#2 about 17-30% faster. The dynamic load balancing and software pipeline for optimizing SSCA#2 are…
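For readers unfamiliar with GUPS (giga updates per second), the access pattern it measures can be sketched as follows. This is a minimal illustration modeled on the publicly available HPCC RandomAccess kernel, not the paper's Cell/BE implementation; the function names, table size, and seed are assumptions chosen for the example, and it makes no attempt to show DMA or SPE code.

```python
# Sketch of a GUPS-style random update loop (assumed shape, after HPCC RandomAccess).
POLY = 0x7                 # feedback polynomial of the HPCC pseudo-random sequence
MASK64 = (1 << 64) - 1     # emulate 64-bit unsigned arithmetic in Python


def next_random(ran):
    """Advance the 64-bit shift-register sequence by one step."""
    high_bit_set = (ran >> 63) & 1
    return ((ran << 1) & MASK64) ^ (POLY if high_bit_set else 0)


def apply_updates(table, num_updates, seed=1):
    """XOR pseudo-random values into pseudo-random slots of `table`, in place.

    len(table) must be a power of two so the index can be taken with a mask.
    Each update touches an unpredictable location, which is what makes the
    benchmark a stress test for random memory access.
    """
    size = len(table)
    assert size & (size - 1) == 0, "table size must be a power of two"
    ran = seed
    for _ in range(num_updates):
        ran = next_random(ran)
        table[ran & (size - 1)] ^= ran
    return table


if __name__ == "__main__":
    table = list(range(1 << 10))        # hypothetical small table
    apply_updates(table, 4 * len(table))
    # XOR is its own inverse, so replaying the same sequence restores the table.
    apply_updates(table, 4 * len(table))
    print("verified:", table == list(range(1 << 10)))
```

Because every update depends on a pseudo-random index, caches and prefetchers help little; throughput is governed mostly by how many independent memory requests the hardware can keep in flight, which is the property the paper examines on Cell/BE.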
Taxonomy
Topics: Advanced MIMO Systems Optimization · Caching and Content Delivery · Complex Network Analysis Techniques
