Benchmarking Hashing Algorithms for Load Balancing in a Distributed Database Environment
Alexander Slesarev, Mikhail Mikhailov, George Chernishev

TL;DR
This paper evaluates various hashing algorithms for load balancing in distributed databases, focusing on their distribution uniformity, data movement during node changes, and computational efficiency through simulated and real experiments.
Contribution
It provides a comprehensive benchmark of multiple hashing algorithms using a new suite based on Unidata MDM, highlighting their performance differences.
Findings
Identified the algorithms that best balance distribution uniformity against data movement during node changes.
Provided a comparative assessment of hashing algorithms' speed and distribution quality.
Developed a benchmark suite for evaluating load balancing algorithms in MDM systems.
Abstract
Modern high-load applications store data using multiple database instances. Such an architecture requires data consistency, and it is important to ensure even distribution of data among nodes. Load balancing is used to achieve these goals. Hashing is the backbone of virtually all load balancing systems. Since the introduction of classic Consistent Hashing, many algorithms have been devised for this purpose. One of the purposes of the load balancer is to ensure storage cluster scalability. It is crucial for the performance of the whole system to transfer as few data records as possible during node addition or removal. The load balancer hashing algorithm has the greatest impact on this process. In this paper we experimentally evaluate several hashing algorithms used for load balancing, conducting both simulated and real system experiments. To evaluate algorithm performance, we have…
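To make the data-movement concern in the abstract concrete, here is a minimal Python sketch that compares plain modulo placement with a basic consistent hashing ring when a fifth node joins a four-node cluster. It is an illustration only, not the paper's benchmark suite: the `HashRing` class, the MD5-based `stable_hash` helper, the 100 virtual nodes per server, and the `record-*` and `node-*` names are all assumptions introduced for this example.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Stable 64-bit hash taken from MD5 (an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical minimal consistent hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._points = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [point for point, _ in self._points]

    def lookup(self, key: str) -> str:
        # A key belongs to the first ring point clockwise from its hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._points[idx][1]


def modulo_lookup(key: str, nodes) -> str:
    """Naive placement: hash modulo the current number of nodes."""
    return nodes[stable_hash(key) % len(nodes)]


def moved_fraction(before, after, keys) -> float:
    """Share of keys whose owning node differs between two placements."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"record-{i}" for i in range(10_000)]
old_nodes = [f"node-{i}" for i in range(4)]
new_nodes = old_nodes + ["node-4"]

ring_old, ring_new = HashRing(old_nodes), HashRing(new_nodes)

ring_moved = moved_fraction(ring_old.lookup, ring_new.lookup, keys)
mod_moved = moved_fraction(
    lambda k: modulo_lookup(k, old_nodes),
    lambda k: modulo_lookup(k, new_nodes),
    keys,
)

print(f"consistent hashing moved {ring_moved:.1%} of keys")
print(f"modulo hashing moved     {mod_moved:.1%} of keys")
```

Under these assumptions, the ring should relocate roughly one fifth of the keys, all of them to the new node, while modulo placement reshuffles most of the key space. The exact figures depend on the hash function and the number of virtual nodes, and such differences are the kind of thing a benchmark of this sort sets out to measure.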
Taxonomy
Topics: Algorithms and Data Compression · Caching and Content Delivery · Data Management and Algorithms
