Joint Data Purchasing and Data Placement in a Geo-Distributed Data Market
Xiaoqi Ren, Palma London, Juba Ziani, Adam Wierman

TL;DR
This paper studies how a geo-distributed cloud data market can jointly decide which data to purchase and where to place it. It gives a provably optimal algorithm for a single data center and a near-optimal, polynomial-time design, Datum, for the geo-distributed case, evaluated in a case study.
Contribution
The paper introduces Datum, a design that decomposes the joint data purchasing and placement problem into two subproblems, one for data purchasing and one for data placement, using a transformation of the underlying bandwidth costs.
Findings
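Datum is within 1.6% of optimal in the paper's case study.
The joint problem can be viewed as a facility location problem and is therefore NP-hard, but a near-optimal, polynomial-time algorithm exists for the geo-distributed case.
The single data center case admits a provably optimal algorithm, and its structure generalizes to multiple data centers.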
Abstract
This paper studies two design tasks faced by a geo-distributed cloud data market: which data to purchase (data purchasing) and where to place/replicate the data for delivery (data placement). We show that the joint problem of data purchasing and data placement within a cloud data market can be viewed as a facility location problem, and is thus NP-hard. However, we give a provably optimal algorithm for the case of a data market made up of a single data center, and then generalize the structure from the single data center setting in order to develop a near-optimal, polynomial-time algorithm for a geo-distributed data market. The resulting design, Datum, decomposes the joint purchasing and placement problem into two subproblems, one for data purchasing and one for data placement, using a transformation of the underlying bandwidth costs. We show, via a case study, that Datum is near-optimal…
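To make the facility location connection concrete, here is a minimal sketch of the classic uncapacitated facility location problem solved by brute force on a toy instance. It is not the paper's formulation or the Datum algorithm: the function name `solve_ufl`, the cost numbers, and the loose reading of "facilities" as data centers holding a data set and "clients" as query sources are all assumptions made for illustration.

```python
from itertools import combinations


def solve_ufl(open_cost, serve_cost):
    """Brute-force uncapacitated facility location (illustrative only).

    open_cost[i]     -- fixed cost of opening facility i
    serve_cost[i][j] -- cost of serving client j from facility i
    Returns (best_total_cost, tuple_of_open_facilities).
    """
    n_fac = len(open_cost)
    n_cli = len(serve_cost[0])
    best = (float("inf"), ())
    # Enumerate every non-empty subset of facilities: 2^n - 1 candidates.
    for k in range(1, n_fac + 1):
        for subset in combinations(range(n_fac), k):
            fixed = sum(open_cost[i] for i in subset)
            # Each client is served by its cheapest open facility.
            variable = sum(
                min(serve_cost[i][j] for i in subset) for j in range(n_cli)
            )
            total = fixed + variable
            if total < best[0]:
                best = (total, subset)
    return best


# Hypothetical toy instance: 3 candidate facilities, 4 clients.
open_cost = [4, 3, 6]
serve_cost = [
    [1, 5, 9, 4],
    [6, 2, 3, 7],
    [8, 7, 1, 1],
]
best_cost, best_sites = solve_ufl(open_cost, serve_cost)
print(best_cost, best_sites)  # 17 (0, 1)
```

The enumeration grows exponentially with the number of candidate facilities, which is why exhaustive search is only usable on toy instances and why a polynomial-time, near-optimal method matters in the geo-distributed setting.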
Taxonomy
Topics: Distributed systems and fault tolerance · Optimization and Search Problems · Cryptography and Data Security
