
TL;DR
This short note introduces the split block Bloom filter, a Bloom filter variant that uses modern SIMD instructions to increase speed by 30%-450% and is used by several major data systems.
Contribution
The paper presents the split block Bloom filter, leveraging SIMD instructions for substantial speed improvements over traditional Bloom filters.
Findings
Speed increased by 30%-450% using SIMD optimization.
Used by StarRocks, Apache Impala, Apache Kudu, Apache Parquet, Apache Arrow, Apache Drill, and Alibaba Cloud's Hologres.
Adoption in these production systems suggests the speedup is useful in practice.
Abstract
This short note describes a Bloom filter variant that takes advantage of modern SIMD instructions to increase speed by 30%-450%. This filter, the split block Bloom filter, is used by StarRocks, Apache Impala, Apache Kudu, Apache Parquet, Apache Arrow, Apache Drill, and Alibaba Cloud's Hologres.
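The abstract does not spell out the filter's layout, so the following is only a scalar sketch under stated assumptions. It assumes the block structure described in the Apache Parquet Bloom filter specification: the filter is an array of 256-bit blocks, each block is eight 32-bit words, and every key sets exactly one bit in each word of a single block. The salt constants are the ones listed in that specification as best recalled here and should be treated as illustrative. The hash function (`blake2b` truncated to 64 bits) is a stand-in chosen because it ships with Python, not the hash any of the listed systems use. The class and method names are hypothetical.

```python
import hashlib

# Odd 32-bit multipliers, one per word of a block. Assumed to match the
# salts in the Apache Parquet Bloom filter specification; illustrative only.
SALT = (0x47B6137B, 0x44974D91, 0x8824AD5B, 0xA2B7289D,
        0x705495C7, 0x2DF1424B, 0x9EFC4947, 0x5C6BFB31)


class SplitBlockBloomFilter:
    """Scalar sketch: each block is eight 32-bit words (256 bits)."""

    def __init__(self, num_blocks: int):
        if num_blocks < 1:
            raise ValueError("num_blocks must be at least 1")
        self.blocks = [[0] * 8 for _ in range(num_blocks)]

    @staticmethod
    def _hash64(key: bytes) -> int:
        # Stand-in 64-bit hash (assumption); any well-mixed 64-bit hash works.
        return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")

    def _locate(self, key: bytes):
        h = self._hash64(key)
        # High 32 bits choose the block via multiply-shift (no modulo needed).
        block_index = ((h >> 32) * len(self.blocks)) >> 32
        # Low 32 bits choose one bit position (0..31) in each of the 8 words.
        low = h & 0xFFFFFFFF
        mask = [1 << (((low * s) & 0xFFFFFFFF) >> 27) for s in SALT]
        return self.blocks[block_index], mask

    def add(self, key: bytes) -> None:
        block, mask = self._locate(key)
        for i in range(8):
            block[i] |= mask[i]

    def might_contain(self, key: bytes) -> bool:
        block, mask = self._locate(key)
        # True only if all eight selected bits are set; false positives are
        # possible, false negatives are not.
        return all(block[i] & mask[i] for i in range(8))


f = SplitBlockBloomFilter(num_blocks=1024)
f.add(b"apple")
print(f.might_contain(b"apple"))
```

In a SIMD implementation, the eight multiplications, shifts, and bit tests in `_locate`, `add`, and `might_contain` would each map to a single 256-bit vector operation instead of a Python loop, and every lookup touches one block only. This is presumably where the reported speedup comes from, though the abstract itself does not give that detail.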
Taxonomy
Topics: Caching and Content Delivery · Peer-to-Peer Network Technologies · Image and Video Quality Assessment
