The Impact of Memory Models on Software Reliability in Multiprocessors
Alexander Jaffe, Thomas Moscibroda, Laura Effinger-Dean, Luis Ceze, Karin Strauss

TL;DR
This paper investigates how different memory consistency models in multiprocessors affect software reliability, revealing that weaker models increase bug vulnerability with few threads but become less significant as thread count grows.
Contribution
It introduces a probabilistic model to analyze the impact of memory models on concurrency bug vulnerability, providing bounds and insights into their effects based on thread count.
Findings
Weaker memory models increase bug vulnerability with few threads.
As thread count increases, the impact of memory model strength diminishes.
These results have implications for the design of memory models in future multi-core systems.
Abstract
The memory consistency model is a fundamental system property characterizing a multiprocessor. The relative merits of strict versus relaxed memory models have been widely debated in terms of their impact on performance, hardware complexity and programmability. This paper adds a new dimension to this discussion: the impact of memory models on software reliability. By allowing some instructions to reorder, weak memory models may expand the window between critical memory operations. This can increase the chance of an undesirable thread-interleaving, thus allowing an otherwise-unlikely concurrency bug to manifest. To explore this phenomenon, we define and study a probabilistic model of shared-memory parallel programs that takes into account such reordering. We use this model to formally derive bounds on the vulnerability to concurrency bugs of different memory models. Our results…
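The intuition in the abstract, that reordering widens the window between critical memory operations and so makes a bad interleaving more likely, can be illustrated with a toy Monte Carlo simulation. This is a rough sketch and not the paper's probabilistic model. The function name `bug_probability`, the timeline length, and the window sizes (2 slots for a strict model, 6 for a relaxed one) are all assumptions chosen for illustration. Each thread places one critical window uniformly at random on a shared timeline, and a "bug" is counted whenever two windows overlap.

```python
import random


def bug_probability(n_threads, window, program_len=100, trials=5000, seed=0):
    """Estimate the chance that at least two critical windows overlap.

    Toy assumption (not from the paper): each thread has one critical
    window of `window` consecutive slots, placed uniformly at random on a
    timeline of `program_len` slots. A larger `window` stands in for a
    weaker memory model that lets critical operations drift apart.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = 0
    for _ in range(trials):
        starts = sorted(
            rng.randrange(program_len - window + 1) for _ in range(n_threads)
        )
        # Windows [a, a + window) and [b, b + window) overlap iff b - a < window.
        # After sorting, checking adjacent starts is sufficient.
        if any(b - a < window for a, b in zip(starts, starts[1:])):
            hits += 1
    return hits / trials


STRICT_WINDOW = 2   # hypothetical window size under a strict model
RELAXED_WINDOW = 6  # hypothetical, wider window under a relaxed model

for n in (2, 4, 8, 16):
    strict = bug_probability(n, STRICT_WINDOW)
    relaxed = bug_probability(n, RELAXED_WINDOW)
    print(f"threads={n:2d}  strict={strict:.3f}  relaxed={relaxed:.3f}")
```

In this toy setting the relaxed window is several times more likely to produce an overlap when only two threads run, while with many threads both estimates approach 1 and the gap shrinks. That matches the summarized finding only qualitatively; the saturation effect seen here may not be the mechanism behind the paper's formal bounds.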
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Radiation Effects in Electronics
