Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores
Álvaro García-Recuero, Sérgio Esteves, Luís Veiga

TL;DR
This paper introduces a flexible data semantics model integrated into HBase to optimize data consistency levels in geo-replicated cloud data stores, balancing latency, staleness, and application-specific requirements.
Contribution
It presents a novel three-dimensional vector-field model for selective data provisioning, enhancing existing consistency models with application-aware, on-demand data replication in NoSQL stores.
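As a rough illustration of how a three-dimensional bound could drive selective replication, the sketch below assumes the three dimensions are elapsed time, number of pending updates, and value divergence. The summary above does not name the dimensions, so these are assumptions, and every class and function name here is hypothetical rather than part of HBase or the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class QoDBound:
    """Hypothetical three-dimensional bound; the field names are assumptions."""
    max_seconds: float      # longest tolerated time since the last replication
    max_updates: int        # most pending, unreplicated updates tolerated
    max_divergence: float   # largest tolerated relative value change (0.10 = 10%)


@dataclass
class PendingState:
    """Hypothetical snapshot of what a replica has not yet shipped."""
    seconds_since_sync: float
    pending_updates: int
    divergence: float


def must_replicate(state: PendingState, bound: QoDBound) -> bool:
    """Return True as soon as any one dimension of the bound is exceeded."""
    return (
        state.seconds_since_sync > bound.max_seconds
        or state.pending_updates > bound.max_updates
        or state.divergence > bound.max_divergence
    )


bound = QoDBound(max_seconds=30.0, max_updates=100, max_divergence=0.10)
within_limits = PendingState(seconds_since_sync=5.0, pending_updates=12, divergence=0.02)
too_many_updates = PendingState(seconds_since_sync=5.0, pending_updates=150, divergence=0.02)

print(must_replicate(within_limits, bound))
print(must_replicate(too_many_updates, bound))
```

Under this assumed design, data tagged with a tight bound would be shipped eagerly, while data with a loose bound could be held back and batched, which is one way an application could trade staleness against network load.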
Findings
Improved data consistency control tailored to application needs.
Enhanced HBase with atomic batch updates and semantic tagging.
Better handling of network partitions and disconnection periods.
Abstract
Cloud computing has recently emerged as a key technology to provide individuals and companies with access to remote computing and storage infrastructures. In order to achieve highly available yet high-performing services, cloud data stores rely on data replication. However, replication brings with it the issue of consistency. Given that data are replicated in multiple geographically distributed data centers, and to meet the increasing requirements of distributed applications, many cloud data stores adopt eventual consistency and thereby allow data-intensive operations to run with low latency. This comes at the cost of data staleness. In this paper, we prioritize data replication based on a set of flexible data semantics that can best suit all types of Big Data applications, avoiding overloading both network and systems during long periods of disconnection or partitions in…
