Incentivizing Truthfulness and Collaborative Fairness in Bayesian Learning
Rachael Hwee Ling Sim, Jue Fan, Xiao Tian, Xinyi Xu, Patrick Jaillet, Bryan Kian Hsiang Low

TL;DR
This paper introduces a mechanism for Bayesian collaborative learning that guarantees fairness and incentivizes truthful data sharing, addressing manipulation issues in data valuation.
Contribution
It proposes a novel mechanism combining semivalues and a validation-based data valuation function to ensure fairness and truthfulness at equilibrium.
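To make the semivalue ingredient concrete, here is a minimal sketch of an exact Shapley value computation over a handful of sources. It is not the paper's mechanism: the `coverage_value` function (the number of distinct data points a coalition holds) is a hypothetical stand-in for the validation-based data valuation function, and the names `shapley_values`, `coverage_value`, and `data` are illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(sources, value_fn):
    """Exact Shapley value of each source under value_fn (exponential in len(sources))."""
    n = len(sources)
    phi = {s: 0.0 for s in sources}
    for s in sources:
        others = [o for o in sources if o != s]
        for k in range(n):
            # Probability that s joins right after a given coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                c = frozenset(coalition)
                # Marginal contribution of s to coalition c.
                phi[s] += weight * (value_fn(c | {s}) - value_fn(c))
    return phi


# Hypothetical toy data: each source holds a set of data point ids.
data = {"A": {1, 2, 3}, "B": {3, 4}, "C": set()}


def coverage_value(coalition):
    """Toy valuation (an assumption, not the paper's DVF): distinct points held."""
    points = set()
    for s in coalition:
        points |= data[s]
    return len(points)


phi = shapley_values(list(data), coverage_value)
print(phi)
```

In this toy setting the values sum to the value of the full coalition, and source `C`, which contributes nothing, receives zero. These are two of the fairness properties that the Shapley value is known for.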
Findings
The mechanism provably ensures collaborative fairness and truthfulness.
Experiments on synthetic and real datasets empirically validate the mechanism.
Addresses manipulation in data sharing by incentivizing honest contributions.
Abstract
Collaborative machine learning involves training high-quality models using datasets from a number of sources. To incentivize sources to share data, existing data valuation methods fairly reward each source based on its data submitted as is. However, as these methods neither verify nor incentivize data truthfulness, the sources can manipulate their data (e.g., by submitting duplicated or noisy data) to artificially increase their valuations and rewards or prevent others from benefiting. This paper presents the first mechanism that provably ensures (F) collaborative fairness and incentivizes (T) truthfulness at equilibrium for Bayesian models. Our mechanism combines semivalues (e.g., Shapley value), which ensure fairness, and a truthful data valuation function (DVF) based on a validation set that is unknown to the sources. As semivalues are influenced by others' data, we introduce an…
