Flexible Support for Fast Parallel Commutative Updates
Vignesh Balaji, Dhruva Tirumala, Brandon Lucia

TL;DR
CCache enables fast parallel commutative updates by dynamically privatizing data without increasing memory or cache usage, significantly improving performance in multithreaded applications.
Contribution
This work introduces CCache, a system for on-demand privatization of commutative data, reducing memory footprint and cache contention in parallel processing.
Findings
Achieves up to 3.2x speedup in various applications
Supports on-demand privatization without extra memory overhead
Reduces cache contention and memory footprint
Abstract
Privatizing data is a useful strategy for increasing parallelism in a shared memory multithreaded program. Independent cores can compute independently on duplicates of shared data, combining their results at the end of their computations. Conventional approaches to privatization, however, rely on explicit static or dynamic memory allocation for duplicated state, increasing memory footprint and contention for cache resources, especially in shared caches. In this work, we describe CCache, a system for on-demand privatization of data manipulated by commutative operations. CCache garners the benefits of privatization, without the increase in memory footprint or cache occupancy. Each core in CCache dynamically privatizes commutatively manipulated data, operating on a copy. Periodically or at the end of its computation, the core merges its value with the value resident in memory, and when all…
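The conventional software privatization that the abstract contrasts with CCache can be sketched as follows. This is a minimal illustration, not the paper's mechanism: CCache privatizes on demand without explicitly allocated copies, whereas this sketch allocates one private copy per thread, which is exactly the extra footprint the abstract describes. The function name `privatized_histogram` and the parameter `num_workers` are hypothetical, chosen only for this example.

```python
import threading
from collections import Counter


def privatized_histogram(items, num_workers=4):
    """Count items using per-thread private copies that are merged at the end.

    Illustrative only: explicit software privatization, not CCache itself.
    """
    shared = Counter()
    merge_lock = threading.Lock()

    def worker(chunk):
        private = Counter()  # explicitly allocated private copy
        for item in chunk:
            private[item] += 1  # commutative update: order does not matter
        with merge_lock:
            shared.update(private)  # merge step: add private counts into shared

    # Strided split so every worker gets a roughly equal share of the input.
    chunks = [items[i::num_workers] for i in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

Because addition is commutative and associative, the merged result is the same regardless of the order in which threads finish. The cost is one full `Counter` per thread, which grows with the number of workers.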
Taxonomy
Topics: Distributed systems and fault tolerance · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
