Warp-Level Parallelism: Enabling Multiple Replications In Parallel on GPU
Jonathan Passerat-Palmbach (ISIMA, LIMOS, UBP), Jonathan Caux (LIMOS, UBP, IBC, ISIMA), Pridi Siregar (IBC), Claude Mazel (LIMOS, UBP, ISIMA), David Hill (UBP, LIMOS, ISIMA)

TL;DR
This paper introduces Warp-Level Parallelism (WLP), a GPU-based method that enables multiple stochastic simulation replications to run simultaneously, significantly accelerating the process compared to traditional SIMT approaches.
Contribution
The paper presents WLP, a novel GPU parallelization technique that efficiently executes multiple replications in parallel, improving speed over existing SIMT-based methods.
Findings
WLP achieves up to six times faster computation of multiple replications.
Benchmark results show significant speedup over traditional SIMT approaches.
WLP effectively leverages GPU warp-level parallelism for stochastic simulations.
Abstract
Stochastic simulations need multiple replications in order to build confidence intervals for their results. Even when a large number of replications is not needed, it is good practice to reduce the overall simulation time using the Multiple Replications In Parallel (MRIP) approach. This approach usually assumes access to a parallel computer such as a symmetric multiprocessing machine (with many cores), a computing cluster or a computing grid. In this paper, we propose Warp-Level Parallelism (WLP), a solution for computing MRIP on GP-GPUs (General-Purpose Graphics Processing Units). These devices offer a great amount of parallel computational power at low cost, but are tuned to efficiently apply the same operation to several data items through different threads. This paradigm is called Single Instruction, Multiple Threads (SIMT). Our approach proposes to…
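The abstract links replications to confidence intervals. The Python sketch below illustrates that link on the CPU only; it is not the paper's WLP method or its benchmark models. The toy model (a Monte Carlo estimate of pi), the names run_replication and confidence_interval, the choice of 30 replications, and the use of consecutive integer seeds as a stand-in for properly independent random streams are all assumptions made for illustration.

```python
import math
import random
import statistics


def run_replication(seed, n_samples=10_000):
    """Hypothetical toy model: one Monte Carlo estimate of pi.

    Each replication owns its generator, so replications share no state
    and could be dispatched to separate workers.
    """
    rng = random.Random(seed)  # assumption: distinct seeds stand in for independent streams
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples


def confidence_interval(results, t_critical):
    """Two-sided confidence interval for the mean of the replication results."""
    mean = statistics.mean(results)
    half_width = t_critical * statistics.stdev(results) / math.sqrt(len(results))
    return mean - half_width, mean + half_width


N_REPLICATIONS = 30
# Student t quantile at 0.975 with 29 degrees of freedom (95% interval, 30 replications).
T_975_DF29 = 2.045

# Replications are independent, so this map is the part MRIP would run in parallel.
results = list(map(run_replication, range(N_REPLICATIONS)))
low, high = confidence_interval(results, T_975_DF29)
print(f"mean = {statistics.mean(results):.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

Because no replication depends on another, the wall-clock time of the map step is what a parallel scheme can shorten; the interval computation itself stays the same.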
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Distributed systems and fault tolerance
