Optimal Pricing for Data-Augmented AutoML Marketplaces
Minbiao Han, Jonathan Light, Steven Xia, Sainyam Galhotra, Raul Castro Fernandez, Haifeng Xu

TL;DR
This paper introduces a practical data-augmented AutoML marketplace that automatically enhances models with external data, using performance-based pricing to improve ML outcomes and create a sustainable data monetization framework.
Contribution
It proposes a novel marketplace design integrating external data augmentation with performance-based pricing, addressing strategic behavior and valuation diversity in AutoML environments.
Findings
Effective performance-based data pricing mechanism
Enhanced ML model quality through external data integration
Sustainable economic framework for data monetization
Abstract
Organizations often lack sufficient data to effectively train machine learning (ML) models, while others possess valuable data that remains underutilized. Data markets promise to unlock substantial value by matching data suppliers with demand from ML consumers. However, market design involves addressing intricate challenges, including data pricing, fairness, robustness, and strategic behavior. In this paper, we propose a pragmatic data-augmented AutoML market that seamlessly integrates with existing cloud-based AutoML platforms such as Google's Vertex AI and Amazon's SageMaker. Unlike standard AutoML solutions, our design automatically augments buyer-submitted training data with valuable external datasets and prices the resulting models based on their measurable performance improvements rather than on computational costs, as is the status quo. Our key innovation is a pricing mechanism grounded…
Taxonomy
Topics: Transportation and Mobility Innovations · Privacy-Preserving Technologies in Data · Blockchain Technology Applications and Security
