Cache-aware Parallel Programming for Manycore Processors
Ashkan Tousimojarad, Wim Vanderbauwhede

TL;DR
This paper introduces a cache-aware programming technique for manycore processors that leverages distributed caches without relying on architecture-specific libraries, improving parallel efficiency.
Contribution
It presents a novel, architecture-agnostic programming approach for NUCA manycore systems that enhances parallelization efficiency.
Findings
Significant speed-up in parallel execution.
Effective utilization of distributed cache architecture.
Improved parallelization efficiency.
Abstract
With rapidly evolving technology, multicore and manycore processors have emerged as promising architectures to benefit from increasing transistor numbers. The transition towards these parallel architectures makes today an exciting time to investigate challenges in parallel computing. The TILEPro64 is a manycore accelerator, composed of 64 tiles interconnected via multiple 8x8 mesh networks. It contains per-tile caches and supports cache-coherent shared memory by default. In this paper we present a programming technique to take advantage of distributed caching facilities in manycore processors. However, unlike other work in this area, our approach does not use architecture-specific libraries. Instead, we provide the programmer with a novel technique for programming future Non-Uniform Cache Architecture (NUCA) manycore systems, bearing in mind their caching organisation. We show that…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Advanced Data Storage Technologies
