DAVED: Data Acquisition via Experimental Design for Data Markets
Charles Lu, Baihe Huang, Sai Praneeth Karimireddy, Praneeth Vepakomma, Michael Jordan, Ramesh Raskar

TL;DR
This paper introduces DAVED, a federated data acquisition method inspired by experimental design, which efficiently selects valuable data points for machine learning in decentralized data markets without needing labeled validation data.
Contribution
It proposes a novel federated data acquisition approach based on linear experimental design that improves prediction accuracy in data markets without centralized data access.
Findings
Achieves lower prediction error than baseline methods.
Operates efficiently in federated, decentralized settings.
Does not require labeled validation data for optimization.
Abstract
The acquisition of training data is crucial for machine learning applications. Data markets can increase the supply of data, particularly in data-scarce domains such as healthcare, by incentivizing potential data providers to join the market. A major challenge for a data buyer in such a market is choosing the most valuable data points from a data seller. Unlike prior work in data valuation, which assumes centralized data access, we propose a federated approach to the data acquisition problem that is inspired by linear experimental design. Our proposed data acquisition method achieves lower prediction error without requiring labeled validation data and can be optimized in a fast and federated procedure. The key insight of our work is that a method that directly estimates the benefit of acquiring data for test set prediction is particularly compatible with a decentralized market setting.
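To make the experimental-design idea in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a ridge-regularized linear model and scores each candidate seller point by how much it reduces the average predictive variance on the buyer's unlabeled test features, then selects points greedily. The function names (`mean_test_variance`, `greedy_design`), the `ridge` parameter, the greedy selection rule, and the synthetic data are all illustrative assumptions. The sketch runs centrally and does not model the federated procedure; it only illustrates that selection can proceed from features alone, without labels.

```python
import numpy as np


def mean_test_variance(X_test, A_inv):
    """Average of x^T A^{-1} x over the rows of X_test (up to a noise scale)."""
    return float(np.mean(np.einsum("ij,jk,ik->i", X_test, A_inv, X_test)))


def greedy_design(X_seller, X_test, budget, ridge=1.0):
    """Greedily pick seller rows that most reduce mean test variance.

    Illustrative assumption: a linear model with information matrix
    A = ridge * I + sum of x x^T over the selected points. No labels are used.
    """
    n, d = X_seller.shape
    A_inv = np.eye(d) / ridge
    chosen = []
    remaining = set(range(n))
    for _ in range(min(budget, n)):
        idx = np.array(sorted(remaining))
        cand = X_seller[idx]
        # Row i of Z is (A^{-1} x_i)^T because A_inv is symmetric.
        Z = cand @ A_inv
        # Sherman-Morrison: adding x lowers the mean test variance by
        # mean_j (x_j^T A^{-1} x)^2 / (1 + x^T A^{-1} x).
        num = np.mean((Z @ X_test.T) ** 2, axis=1)
        den = 1.0 + np.einsum("ij,ij->i", Z, cand)
        best = int(idx[np.argmax(num / den)])
        # Rank-one update of the inverse for the selected point.
        z = A_inv @ X_seller[best]
        A_inv = A_inv - np.outer(z, z) / (1.0 + X_seller[best] @ z)
        chosen.append(best)
        remaining.remove(best)
    return chosen, A_inv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_seller = rng.normal(size=(200, 5))   # hypothetical seller features
    X_test = rng.normal(size=(20, 5))      # hypothetical unlabeled test features
    chosen, A_inv = greedy_design(X_seller, X_test, budget=10)
    print("selected indices:", chosen)
    print("mean test variance:", mean_test_variance(X_test, A_inv))
```

The score depends only on seller features, test features, and the current inverse information matrix, which is the sense in which a criterion aimed directly at test-set prediction avoids the need for labeled validation data.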
Taxonomy
Topics: Digital Platforms and Economics · Auction Theory and Applications · Consumer Market Behavior and Pricing
Methods: Sparse Evolutionary Training
