Garbage Collection for Multicore NUMA Machines

Sven Auhagen; Lars Bergstrom; Matthew Fluet; John Reppy

arXiv:1105.2554 · cs.PL · May 13, 2011

Open Access

TL;DR

This paper introduces a garbage collector designed for multicore NUMA architectures. It improves the scalability of parallel functional language implementations by accounting for the heterogeneous memory hierarchy, in which accesses to package-local memory yield higher aggregate bandwidth than accesses to remote memory.

Contribution

It presents a garbage collector integrated with Manticore, a strict, parallel functional language implementation, and shows that it scales effectively on high-core-count NUMA systems.

Findings

01 Scales effectively on a 48-core AMD Opteron machine

02 Achieves better memory bandwidth utilization

03 Demonstrates improved scalability over traditional methods

Abstract

Modern high-end machines feature multiple processor packages, each of which contains multiple independent cores and integrated memory controllers connected directly to dedicated physical RAM. These packages are connected via a shared bus, creating a system with a heterogeneous memory hierarchy. Since this shared bus has less bandwidth than the sum of the links to memory, aggregate memory bandwidth is higher when parallel threads all access memory local to their processor package than when they access memory attached to a remote package. This bandwidth limitation has traditionally limited the scalability of modern functional language implementations, which seldom scale well past 8 cores, even on small benchmarks. This work presents a garbage collector integrated with our strict, parallel functional language implementation, Manticore, and shows that it scales effectively on both a…
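The bandwidth argument in the abstract can be illustrated with a toy calculation. The sketch below is not from the paper: the function name `aggregate_bandwidth`, the simple "minimum of two limits" model, and all numbers are hypothetical and chosen only to show why remote accesses reduce aggregate throughput when the interconnect is slower than the sum of the local memory links.

```python
def aggregate_bandwidth(packages, local_bw, interconnect_bw, remote_fraction):
    """Toy model of achievable aggregate memory throughput (GB/s).

    Assumptions (illustrative, not taken from the paper):
      - each package's memory controller sustains `local_bw` GB/s,
      - all remote traffic shares one interconnect of `interconnect_bw` GB/s,
      - `remote_fraction` of all accesses target a remote package.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")

    # Upper bound set by the memory controllers themselves.
    controller_limit = packages * local_bw
    if remote_fraction == 0.0:
        return controller_limit

    # Remote traffic (remote_fraction * total) must fit on the interconnect.
    interconnect_limit = interconnect_bw / remote_fraction
    return min(controller_limit, interconnect_limit)


if __name__ == "__main__":
    # Hypothetical machine: 4 packages, 10 GB/s each, 8 GB/s shared interconnect.
    for fraction in (0.0, 0.5, 1.0):
        total = aggregate_bandwidth(4, 10.0, 8.0, fraction)
        print(f"remote fraction {fraction:.1f}: {total:.1f} GB/s")
```

Under these made-up numbers, fully local access reaches the sum of the memory links, while any substantial share of remote access is capped by the interconnect. This is the effect a NUMA-aware collector tries to avoid by keeping data close to the threads that use it.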

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Advanced Data Storage Technologies