Efficient Batch Search Algorithm for B+ Tree Index Structures with Level-Wise Traversal on FPGAs
Max Tzschoppe, Martin Wilhelm, Sven Groppe, Thilo Pionteck

TL;DR
This paper presents an FPGA-optimized batch search algorithm for B+ tree indexes. It processes a batch of search keys level by level and reuses loaded tree nodes, which reduces global memory accesses and outperforms a CPU implementation.
Contribution
The authors developed a flexible, HLS-based FPGA search kernel for B+ trees that uses level-wise traversal and supports variable batch sizes, customizable node sizes, and arbitrary tree depths. It achieves substantial speedups over a CPU implementation.
Findings
4.9x speedup over the CPU baseline for the FPGA design with a batch size of 1000 search keys
2.1x performance improvement with four parallel kernels
Outperforms the CPU-based B+ tree search from the TLX library, run on an AMD EPYC 7542 processor, in the reported experiments
Abstract
This paper introduces a search algorithm for index structures based on a B+ tree, specifically optimized for execution on a field-programmable gate array (FPGA). Our implementation efficiently traverses and reuses tree nodes by processing a batch of search keys level by level. This approach reduces costly global memory accesses, improves reuse of loaded B+ tree nodes, and enables parallel search key comparisons directly on the FPGA. Using a high-level synthesis (HLS) approach, we developed a highly flexible and configurable search kernel design supporting variable batch sizes, customizable node sizes, and arbitrary tree depths. The final design was implemented on an AMD Alveo U250 Data Center Accelerator Card, and was evaluated against the B+ tree search algorithm from the TLX library running on an AMD EPYC 7542 processor (2.9 GHz). With a batch size of 1000 search keys, a B+ tree…
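The level-wise batch traversal described in the abstract can be illustrated with a small software sketch. This is not the authors' HLS kernel: the `Node` class, the `batch_search` and `group_by_node` functions, the dictionary standing in for global memory, and the example tree are all hypothetical names and data chosen for illustration. The binary search via `bisect` is only a stand-in for the parallel key comparisons performed on the FPGA. The sketch assumes that a key equal to a separator belongs to the right-hand child.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class Node:
    """Hypothetical node layout: inner nodes hold children, leaves hold values."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # sorted separator keys (inner) or entry keys (leaf)
        self.children = children  # child node ids for inner nodes, else None
        self.values = values      # stored values for leaves, else None


def group_by_node(current):
    """Map each node id to the indices of the search keys currently at that node."""
    groups = defaultdict(list)
    for key_index, node_id in enumerate(current):
        groups[node_id].append(key_index)
    return groups


def batch_search(nodes, root_id, depth, search_keys):
    """Search all keys level by level; return (results, number of node loads)."""
    current = [root_id] * len(search_keys)  # every key starts at the root
    loads = 0

    # Inner levels: load each distinct node once and route all of its keys.
    for _ in range(depth - 1):
        for node_id, key_indices in group_by_node(current).items():
            node = nodes[node_id]  # one simulated global-memory access
            loads += 1
            for i in key_indices:
                child = bisect_right(node.keys, search_keys[i])
                current[i] = node.children[child]

    # Leaf level: load each distinct leaf once and look up all of its keys.
    results = [None] * len(search_keys)
    for node_id, key_indices in group_by_node(current).items():
        leaf = nodes[node_id]
        loads += 1
        for i in key_indices:
            pos = bisect_left(leaf.keys, search_keys[i])
            if pos < len(leaf.keys) and leaf.keys[pos] == search_keys[i]:
                results[i] = leaf.values[pos]
    return results, loads


# Illustrative three-level tree; values are simply key * 10.
nodes = {
    "root": Node([9], children=["I0", "I1"]),
    "I0": Node([5], children=["L0", "L1"]),
    "I1": Node([13], children=["L2", "L3"]),
    "L0": Node([1, 3], values=[10, 30]),
    "L1": Node([5, 7], values=[50, 70]),
    "L2": Node([9, 11], values=[90, 110]),
    "L3": Node([13, 15], values=[130, 150]),
}

results, loads = batch_search(nodes, "root", 3, [1, 7, 9, 4, 15, 3])
print(results, loads)
```

For this batch of six keys, the sketch loads seven nodes in total (one root, two inner nodes, four leaves), whereas searching each key independently from the root would touch 6 x 3 = 18 nodes. The saving grows with the batch size, because more keys share each loaded node.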
