How to Purchase Labels? A Cost-Effective Approach Using Active Learning Markets
Xiwen Huang, Pierre Pinson

TL;DR
This paper proposes a novel market-based framework for purchasing labels through active learning, optimizing label acquisition under budget constraints to improve predictive models efficiently.
Contribution
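The paper formalizes the clearing of an active learning market as an optimization problem that integrates budget constraints and improvement thresholds, and it compares two active learning strategies (variance based and query-by-committee based) against benchmark baselines on real-world datasets.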
Findings
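The proposed strategies outperform the benchmark baselines (random sampling and a greedy knapsack heuristic) in label efficiency.
The market-based approach is robust across the datasets tested.
The approach achieves better model performance with fewer purchased labels.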
Abstract
We introduce and analyse active learning markets as a way to purchase labels, in situations where analysts aim to acquire additional data to improve model fitting, or to better train models for predictive analytics applications. This comes in contrast to the many proposals that already exist to purchase features and examples. By originally formalising the market clearing as an optimisation problem, we integrate budget constraints and improvement thresholds into the label acquisition process. We focus on a single-buyer-multiple-seller setup and propose the use of two active learning strategies (variance based and query-by-committee based), paired with distinct pricing mechanisms. They are compared to benchmark baselines including random sampling and a greedy knapsack heuristic. The proposed strategies are validated on real-world datasets from two critical application domains: real estate…
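To make the ingredients named in the abstract more concrete, the following is a minimal sketch of a budget-constrained label purchase. It is not the paper's market-clearing formulation or pricing mechanism, which this summary does not spell out. It assumes that each candidate label is scored by the variance of a committee's predictions (a common query-by-committee disagreement measure for regression), that each seller posts a fixed price, and that the buyer fills the budget greedily by score per unit price. The function names `committee_variance` and `greedy_budgeted_selection`, the prices, and the toy predictions are all hypothetical.

```python
import numpy as np


def committee_variance(predictions):
    """Disagreement score per candidate.

    predictions has shape (n_models, n_candidates); the score is the
    variance of the committee's predictions for each candidate.
    """
    return np.var(np.asarray(predictions, dtype=float), axis=0)


def greedy_budgeted_selection(scores, prices, budget):
    """Greedy knapsack-style heuristic (illustrative assumption only).

    Visits candidates in decreasing score-per-price order and buys each
    one that still fits in the remaining budget.
    """
    scores = np.asarray(scores, dtype=float)
    prices = np.asarray(prices, dtype=float)
    order = np.argsort(-scores / prices, kind="stable")
    chosen, spent = [], 0.0
    for i in order:
        if spent + prices[i] <= budget:
            chosen.append(int(i))
            spent += prices[i]
    return chosen, spent


# Toy data: 3 committee members predicting on 4 unlabeled candidates.
preds = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.5, 1.0, 4.0],
    [1.0, 1.5, 5.0, 4.0],
]
prices = [1.0, 1.0, 4.0, 1.0]  # hypothetical seller prices
budget = 5.0

scores = committee_variance(preds)
chosen, spent = greedy_budgeted_selection(scores, prices, budget)
print(chosen, spent)
```

In this toy run the committee disagrees most on candidate 2 and somewhat on candidate 1, so those two labels are bought and exhaust the budget; candidates 0 and 3, on which the committee agrees exactly, are left unpurchased. An improvement threshold, as mentioned in the abstract, could be approximated here by dropping candidates whose score falls below a chosen cutoff before the greedy pass, although that too is an assumption rather than the paper's definition.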
Taxonomy
Topics: Machine Learning and Algorithms · Data Stream Mining Techniques · Machine Learning and Data Classification
