Tascade: Hardware Support for Atomic-free, Asynchronous and Efficient Reduction Trees
Marcelo Orenes-Vera, Esin Tureci, David Wentzlaff, Margaret Martonosi

TL;DR
Tascade is a hardware-software co-design that enables scalable, asynchronous reduction trees for large manycore servers, significantly reducing communication overhead and improving performance for irregular graph workloads.
Contribution
It introduces a novel execution model and hardware support for efficient, atomic-free reductions, scaling up to a million PUs in large-scale parallel systems.
Findings
Achieves over 7600 GTEPS in BFS on RMAT-26 with a million PUs
Reduces communication and power consumption compared to prior approaches
Scales efficiently for irregular workloads on large manycore architectures
Abstract
Graph search and sparse data-structure traversal workloads contain challenging irregular memory patterns on global data structures that need to be modified atomically. Distributed processing of these workloads has relied on server threads operating on their own data copies that are merged upon global synchronization. As parallelism increases within each server, the communication challenges that arose in distributed systems a decade ago are now being encountered within large manycore servers. Prior work has achieved scalability for sparse applications up to thousands of PUs on-chip, but does not scale further due to increasing communication distances and load imbalance across PUs. To address these challenges, we propose Tascade, a hardware-software co-design that offers support for storage-efficient data-private reductions as well as asynchronous and opportunistic reduction trees. Tascade…
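The abstract contrasts atomic updates to a global structure with workers that update private copies and merge them later. As a rough software analogy only (not Tascade's hardware mechanism, which the truncated abstract does not detail), the sketch below shows a privatized min-reduction for BFS distances followed by a pairwise tree merge. The function names (`local_updates`, `merge_min`, `tree_reduce`) and the dict-based layout are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

INF = float("inf")


def local_updates(edges: List[Tuple[int, int]],
                  dist: Dict[int, int]) -> Dict[int, int]:
    """One worker relaxes its edges into a private copy, so no atomics are needed."""
    private: Dict[int, int] = {}
    for src, dst in edges:
        candidate = dist[src] + 1
        if candidate < private.get(dst, INF):
            private[dst] = candidate
    return private


def merge_min(a: Dict[int, int], b: Dict[int, int]) -> Dict[int, int]:
    """Combine two private copies with the reduction operator (min)."""
    merged = dict(a)
    for vertex, value in b.items():
        if value < merged.get(vertex, INF):
            merged[vertex] = value
    return merged


def tree_reduce(copies: List[Dict[int, int]]) -> Dict[int, int]:
    """Merge private copies pairwise, level by level, like a reduction tree."""
    level = list(copies)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(merge_min(level[i], level[i + 1]))
            else:
                next_level.append(level[i])  # odd one out moves up unchanged
        level = next_level
    return level[0] if level else {}


# Hypothetical frontier: distances already known for vertices 0, 1, 2.
dist = {0: 0, 1: 1, 2: 1}
partitions = [[(0, 3), (1, 3)], [(2, 3), (1, 4)], [(0, 4)]]
result = tree_reduce([local_updates(p, dist) for p in partitions])
```

This sketch merges level by level with an implicit barrier between levels. The abstract describes Tascade's trees as asynchronous and opportunistic, which this synchronous version does not attempt to model.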
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Memory and Neural Computing
