Try Before You Buy: A practical data purchasing algorithm for real-world data marketplaces
Santiago Andrés Azcoitia, Nikolaos Laoutaris

TL;DR
This paper introduces Try Before You Buy (TBYB), a practical algorithm for data marketplaces that lets buyers make near-optimal dataset purchasing decisions using limited performance information, reducing the complexity of the decision from exponential to linear in the number of datasets.
Contribution
The paper proposes TBYB, a novel algorithm that approximates full-information buying strategies using only dataset-specific performance measures, making data purchasing more feasible in real-world marketplaces.
Findings
TBYB achieves near-optimal purchasing performance on synthetic datasets.
TBYB performs effectively on real-world datasets, reducing information requirements.
The algorithm simplifies decision-making from exponential to linear complexity.
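To make the exponential-versus-linear gap concrete, here is a minimal Python sketch. It is not the authors' TBYB algorithm: the dataset names, values, prices, and the ranking rule (standalone value minus price) are all hypothetical assumptions chosen only to illustrate how a buyer might work from one performance measurement per dataset instead of one per subset.

```python
from itertools import combinations


def count_full_information_queries(n):
    """Count the non-empty subsets of n datasets a full-information model must evaluate."""
    return sum(1 for k in range(1, n + 1) for _ in combinations(range(n), k))


def rank_by_standalone_value(standalone_value, price):
    """Hypothetical rule: keep datasets whose standalone value exceeds their price,
    ordered by net value. This needs only one measurement per dataset."""
    net = {d: standalone_value[d] - price[d] for d in standalone_value}
    return sorted((d for d in net if net[d] > 0), key=lambda d: net[d], reverse=True)


# Illustrative (made-up) per-dataset measurements and asking prices.
standalone_value = {"A": 5.0, "B": 2.0, "C": 4.0}
price = {"A": 1.0, "B": 3.0, "C": 1.5}

full_queries = count_full_information_queries(len(price))  # 2^N - 1 subset evaluations
linear_queries = len(price)                                # N single-dataset evaluations
shortlist = rank_by_standalone_value(standalone_value, price)
```

With three datasets the full-information model already needs 7 subset evaluations against 3 single-dataset ones, and the gap doubles with every dataset added.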
Abstract
Data trading is becoming increasingly popular, as evidenced by the appearance of scores of Data Marketplaces (DMs) in the last few years. Pricing digital assets is particularly complex since, unlike physical assets, digital ones can be replicated at zero cost and stored and transmitted almost for free. In most DMs, data sellers are invited to indicate a price, together with a description of their datasets. For data buyers, however, deciding whether paying the requested price makes sense can only be done after having used the data with their AI/ML algorithms. Theoretical works have analysed the problem of which datasets to buy, and at what price, in the context of full-information models, in which the performance of algorithms over any of the O(2^N) possible subsets of N datasets is known a priori, together with the value functions of buyers. Such information is, however, difficult to…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Blockchain Technology Applications and Security · Auction Theory and Applications
