Louvre: Lightweight Ordering Using Versioning for Release Consistency
Pranith Kumar, Prasun Gera, Hyojong Kim, Hyesoon Kim

TL;DR
Louvre introduces a lightweight versioning mechanism for fences in weakly consistent multi-core processors, reducing overhead and improving performance by relaxing ordering constraints and minimizing stalls.
Contribution
The paper proposes a novel versioning-based approach to relax fence constraints, decreasing execution overhead and enhancing performance in weakly consistent memory models.
Findings
Reduces ordering instruction latency by 39.6%.
Improves overall program performance by 11%.
Effectively relaxes fence constraints to optimize execution.
Abstract
Fence instructions are fundamental primitives that ensure consistency in a weakly consistent shared memory multi-core processor. The execution cost of these instructions is significant and adds a non-trivial overhead to parallel programs. In a naive architecture implementation, we track the ordering constraints imposed by a fence by its entry in the reorder buffer and its execution overhead entails stalling the processor's pipeline until the store buffer is drained and also conservatively invalidating speculative loads. These actions create a cascading effect of increased overhead on the execution of the following instructions in the program. We find these actions to be overly restrictive and that they can be further relaxed thereby allowing aggressive optimizations. The current work proposes a lightweight mechanism in which we assign ordering tags, called versions, to load and store…
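The stall described in the abstract can be illustrated with a toy cycle-count model. This is a minimal sketch under stated assumptions, not the paper's simulator or its versioning mechanism: the function name `run_naive_fence`, the single-issue in-order core, and the fixed `drain_latency` per store are all hypothetical, and the invalidation of speculative loads is not modeled. It only shows how a naive fence that waits for the store buffer to drain adds stall cycles that the fence-free instruction stream does not pay.

```python
from collections import deque


def run_naive_fence(program, drain_latency=3):
    """Toy model of a naive fence (assumed, illustrative only).

    program: list of ops, each "st", "ld", or "fence".
    drain_latency: hypothetical cycles for the head store to reach memory.
    Returns (total_cycles, fence_stall_cycles).
    """
    store_buffer = deque()  # remaining drain cycles per buffered store
    cycles = 0
    fence_stalls = 0

    def tick():
        # Advance one cycle; the oldest buffered store makes progress.
        nonlocal cycles
        cycles += 1
        if store_buffer:
            store_buffer[0] -= 1
            if store_buffer[0] == 0:
                store_buffer.popleft()

    for op in program:
        if op == "fence":
            # Naive fence: stall the pipeline until the buffer is empty.
            while store_buffer:
                tick()
                fence_stalls += 1
        else:
            if op == "st":
                store_buffer.append(drain_latency)
            tick()  # loads and stores each issue in one cycle

    return cycles, fence_stalls


with_fence = run_naive_fence(["st", "st", "fence", "ld"])
without_fence = run_naive_fence(["st", "st", "ld"])
print("with fence:", with_fence)
print("without fence:", without_fence)
```

In this model the two stores retire into the buffer without delaying the core, and the whole cost of the fence shows up as cycles spent waiting for the buffer to empty before the following load can issue.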
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Interconnection Networks and Systems
