NoSQL Database Tuning through Machine Learning

Florian Eppinger; Uta Störl

arXiv:2212.12301·cs.DB·December 26, 2022·1 cites

Open Access

TL;DR

This paper presents a machine learning-based approach that automatically tunes NoSQL database configurations using surrogate modeling and black-box optimization. Applied to Apache Cassandra, it improves throughput by up to 4% and reduces read and write latency by up to 43% and 39%, respectively.

Contribution

It introduces a novel method using Random Forest and Gradient Boosting models to tune NoSQL databases, addressing the complexity of configuration inter-dependencies.
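
The general idea of surrogate modeling combined with black-box optimization can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the knob names and ranges in `SPACE` are hypothetical, `benchmark` is a synthetic stand-in for a real workload run, and a simple nearest-neighbour average replaces the Random Forest and Gradient Boosting surrogates so that the sketch needs only the Python standard library.

```python
import random

# Hypothetical knobs and ranges, for illustration only (not real Cassandra settings).
SPACE = {"cache_mb": (64, 2048), "writer_threads": (1, 64)}


def benchmark(cfg):
    """Synthetic stand-in for a benchmark run; returns a made-up throughput."""
    return (1000.0
            - 0.0005 * (cfg["cache_mb"] - 1200) ** 2
            - 0.4 * (cfg["writer_threads"] - 32) ** 2)


def sample(rng):
    """Draw one random configuration from the search space."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SPACE.items()}


def normalise(cfg):
    """Scale every knob to [0, 1] so distances are comparable across knobs."""
    return [(cfg[k] - lo) / (hi - lo) for k, (lo, hi) in SPACE.items()]


def surrogate(history, cfg, k=3):
    """Predict throughput as the mean of the k nearest measured configurations.

    Assumption: this nearest-neighbour model stands in for a tree ensemble.
    """
    x = normalise(cfg)
    ranked = sorted(
        history,
        key=lambda h: sum((a - b) ** 2 for a, b in zip(normalise(h[0]), x)),
    )
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / len(nearest)


def tune(rounds=15, initial=8, candidates=200, seed=0):
    """Measure a few random configurations, then repeatedly benchmark the
    candidate that the surrogate predicts to be best."""
    rng = random.Random(seed)
    history = [(cfg, benchmark(cfg)) for cfg in (sample(rng) for _ in range(initial))]
    for _ in range(rounds):
        pool = [sample(rng) for _ in range(candidates)]
        promising = max(pool, key=lambda c: surrogate(history, c))
        history.append((promising, benchmark(promising)))
    return history


history = tune()
best_cfg, best_score = max(history, key=lambda h: h[1])
```

The expensive step is `benchmark`; the surrogate is cheap to query, so many candidate configurations can be screened for each real measurement. In a real setup, `benchmark` would apply the configuration to the database and run a workload against it.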

Findings

01. Up to 4% throughput improvement

02. Latency reductions of up to 43% (read) and 39% (write)

03. Feasibility demonstrated across various physical configurations

Abstract

NoSQL databases have become an important component of many big data and real-time web applications. Their distributed nature and scalability make them an ideal data storage repository for a variety of use cases. While NoSQL databases are delivered with a default "off-the-shelf" configuration, they offer configuration settings to adjust a database's behavior and performance to a specific use case and environment. The abundance and oftentimes imperceptible inter-dependencies of configuration settings make it difficult to optimize and performance-tune a NoSQL system. There is no one-size-fits-all configuration and therefore the workload, the physical design, and available resources need to be taken into account when optimizing the configuration of a NoSQL database. This work explores Machine Learning as a means to automatically tune a NoSQL database for optimal performance. Using Random…

Taxonomy

Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Water Quality Monitoring and Analysis