Incentivizing User Data Contributions for LLM Improvement under Withdrawal Rights

Di Feng; Chenhao Zhang; Zhanzhan Zhao

arXiv:2605.07419·cs.GT·May 11, 2026

TL;DR

This paper designs incentive mechanisms that encourage users to contribute data for improving large language models. It uses subsidies and withdrawal rights to address privacy concerns, effort costs, and threshold effects.

Contribution

The paper introduces a theoretical framework in which subsidies and withdrawal rights are designed jointly to overcome coordination failure in data contribution when model improvement requires threshold-level inputs.

Findings

01. Decentralized responses may fall below the improvement threshold, causing subsidy waste.

02. Combining cost reporting with personalized assignment prevents subsidy leakage.

03. Sequential withdrawal protocols can incentivize more data provision and improve success probability.
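
The first finding can be illustrated with a toy model. The sketch below is not the paper's mechanism: it assumes each user contributes exactly one unit when the posted subsidy covers a private cost, and that the model improves only if the number of contributors reaches a threshold. The names `posted_subsidy_outcome`, `costs`, `subsidy`, and `threshold` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    contributors: int    # users whose private cost is covered by the subsidy
    subsidy_paid: float  # total payment to contributors
    improved: bool       # whether the improvement threshold was reached
    wasted: float        # payment made without any model improvement


def posted_subsidy_outcome(costs, subsidy, threshold):
    """Toy posted-subsidy benchmark (an illustrative assumption, not the paper's model).

    Each user decides independently: contribute one unit if and only if
    the posted subsidy is at least the user's private cost.
    """
    contributors = sum(1 for c in costs if c <= subsidy)
    paid = contributors * subsidy
    improved = contributors >= threshold
    # If the threshold is missed, every payment bought no improvement.
    wasted = 0.0 if improved else paid
    return Outcome(contributors, paid, improved, wasted)


if __name__ == "__main__":
    costs = [1.0, 2.0, 3.0, 4.0]  # hypothetical private costs
    print(posted_subsidy_outcome(costs, subsidy=2.0, threshold=3))
    print(posted_subsidy_outcome(costs, subsidy=3.0, threshold=3))
```

With these hypothetical costs, a subsidy of 2.0 attracts two contributors, misses the threshold of three, and the full payment counts as waste. Raising the subsidy to 3.0 reaches the threshold. The toy model omits withdrawal rights and cost reporting entirely.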

Abstract

The continued improvement of large language models (LLMs) increasingly depends on eliciting high-quality, user-generated data, yet such data are costly to provide and often withheld due to privacy and effort concerns. This creates a fundamental design challenge: how to incentivize data contribution when model improvements require coordinated, threshold-level inputs, while contributions remain privately costly and partially reversible. We develop and theoretically analyze incentive mechanisms for user data contribution that explicitly account for threshold effects and reversibility, focusing on how subsidies and withdrawal rights can be jointly designed to overcome coordination failure. As a natural benchmark, we first consider subsidy-based incentives, under which users respond to posted payments with privately optimal floor contributions. These decentralized responses may fall below…
