Block-Parallel IDA* for GPUs (Extended Manuscript)

Satoru Horie; Alex Fukunaga

arXiv:1705.02843 · cs.AI · May 9, 2017 · 1 citation

Open Access

TL;DR

This paper introduces Block-Parallel IDA* (BPIDA*) for GPUs, which assigns the search of each subtree to a block of threads rather than to a single thread. On the 15-puzzle, BPIDA* on an NVIDIA GRID K520 runs 4.98 times faster than a highly optimized sequential IDA* on one Xeon E5-2670 core.

Contribution

The paper proposes BPIDA*, a GPU parallelization of IDA* that assigns the search of a subtree to a thread block instead of a single thread. It targets the warp divergence and load imbalance that make earlier thread-based schemes perform poorly.

Findings

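01 BPIDA* on an NVIDIA GRID K520 (1536 CUDA cores) achieves a speedup of 4.98 on the 15-puzzle over a highly optimized sequential IDA* on a Xeon E5-2670 core.

02 Assigning the search of a subtree to a block (a group of threads with access to fast shared memory) rather than to a single thread improves GPU search efficiency.

03 Straightforward thread-based parallelization techniques previously proposed for massively parallel SIMD processors perform poorly on GPUs because of warp divergence and load imbalance.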

Abstract

We investigate GPU-based parallelization of Iterative-Deepening A* (IDA*). We show that straightforward thread-based parallelization techniques that were previously proposed for massively parallel SIMD processors perform poorly due to warp divergence and load imbalance. We propose Block-Parallel IDA* (BPIDA*), which assigns the search of a subtree to a block (a group of threads with access to fast shared memory) rather than a thread. On the 15-puzzle, BPIDA* on an NVIDIA GRID K520 with 1536 CUDA cores achieves a speedup of 4.98 compared to a highly optimized sequential IDA* implementation on a Xeon E5-2670 core.
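For readers unfamiliar with the baseline, the following is a minimal sketch of sequential IDA* on a sliding-tile puzzle. It is not the authors' implementation and does not show the block-parallel mapping; it only illustrates the depth-first, cost-bounded search whose subtrees BPIDA* distributes. The Manhattan-distance heuristic, the goal layout (blank in the last cell), and the names `manhattan`, `neighbors`, `goal_state`, and `ida_star` are assumptions made for illustration.

```python
from math import inf
from typing import Iterator, List, Optional, Tuple

State = Tuple[int, ...]
FOUND = -1  # sentinel; real f-values are never negative


def goal_state(width: int) -> State:
    """Assumed goal layout: tiles 1..n in order, blank (0) in the last cell."""
    return tuple(range(1, width * width)) + (0,)


def manhattan(state: State, width: int) -> int:
    """Sum of Manhattan distances of every tile (not the blank) to its goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = tile - 1
        total += abs(idx // width - goal_idx // width)
        total += abs(idx % width - goal_idx % width)
    return total


def neighbors(state: State, width: int) -> Iterator[State]:
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < width and 0 <= c < width:
            swap = r * width + c
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)


def ida_star(start: State, width: int) -> Optional[List[State]]:
    """Return a shortest path from start to the goal.

    Assumes the instance is solvable; an unsolvable one would deepen forever.
    """
    goal = goal_state(width)
    path = [start]

    def search(g: int, limit: int):
        state = path[-1]
        f = g + manhattan(state, width)
        if f > limit:
            return f          # report the f-value that exceeded the bound
        if state == goal:
            return FOUND
        minimum = inf
        for nxt in neighbors(state, width):
            if len(path) >= 2 and nxt == path[-2]:
                continue      # do not undo the previous move
            path.append(nxt)
            result = search(g + 1, limit)
            if result == FOUND:
                return FOUND  # leave the solution on the path stack
            minimum = min(minimum, result)
            path.pop()
        return minimum

    limit = manhattan(start, width)
    while True:
        result = search(0, limit)
        if result == FOUND:
            return list(path)
        if result == inf:
            return None
        limit = result        # next iteration uses the smallest exceeded f-value
```

Calling `ida_star((1, 2, 3, 4, 5, 6, 0, 7, 8), 3)` returns a three-state path, that is, a two-move solution. Each iteration repeats a depth-first search with a larger cost bound, so memory use stays linear in the solution depth. The subtrees below different nodes can differ greatly in size, which is one way to see why handing one subtree to each thread can lead to the load imbalance the abstract mentions.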

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Algorithms and Data Compression