Replication-Robust Payoff-Allocation for Machine Learning Data Markets

Dongge Han; Michael Wooldridge; Alex Rogers; Olga Ohrimenko; Sebastian Tschiatschek

arXiv:2006.14583 · cs.LG · November 17, 2022 · 1 citation

Open Access

TL;DR

This paper investigates how to allocate payoffs fairly in machine learning data markets modelled with submodular functions, focusing on robustness against data replication and manipulation, and provides theoretical and empirical insights into solution stability.

Contribution

It introduces a systematic study of replication robustness in submodular games and characterizes the robustness of semivalue solution concepts, including the Shapley and Banzhaf values.

Findings

01. Theoretical conditions for robustness of semivalue solutions.

02. Replication manipulation can undermine payoff fairness.

03. Empirical validation on ML data markets confirms theoretical insights.
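
As a rough illustration of how replication can shift payoffs, the following Python sketch computes two semivalues (the Shapley and Banzhaf values) exactly on a small coverage game, before and after one player copies its data. The coverage game, the example holdings, the `#` naming of replicas, and every function name here are assumptions made for this illustration; none of them is taken from the paper.

```python
from itertools import combinations
from math import comb


def coverage_value(coalition, holdings):
    """v(S): number of distinct items held by the coalition (monotone, submodular)."""
    covered = set()
    for player in coalition:
        covered |= holdings[player]
    return len(covered)


def semivalue(holdings, weight):
    """Exact semivalue: weighted sum of a player's marginal gains over all coalitions."""
    players = list(holdings)
    n = len(players)
    payoff = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that player i joins
            for subset in combinations(others, k):
                gain = (coverage_value(subset + (i,), holdings)
                        - coverage_value(subset, holdings))
                total += weight(k, n) * gain
        payoff[i] = total
    return payoff


def shapley_weight(k, n):
    # k! (n-1-k)! / n!  written as  1 / (n * C(n-1, k))
    return 1.0 / (n * comb(n - 1, k))


def banzhaf_weight(k, n):
    # Every coalition of the other n-1 players is weighted equally.
    return 1.0 / 2 ** (n - 1)


def replicate(holdings, player, copies):
    """Hypothetical manipulation: `player` registers `copies` extra identities with the same data."""
    out = {name: set(items) for name, items in holdings.items()}
    for c in range(1, copies + 1):
        out[f"{player}#{c}"] = set(holdings[player])
    return out


def total_for(payoff, player):
    """Sum the payoff of a player and all of its replicas."""
    return sum(value for name, value in payoff.items()
               if name == player or name.startswith(player + "#"))


# Toy data market (assumed): A and B each hold two items and overlap on item 2.
holdings = {"A": {1, 2}, "B": {2, 3}}
replicated = replicate(holdings, "A", copies=1)

shapley_before = semivalue(holdings, shapley_weight)
shapley_after = semivalue(replicated, shapley_weight)
banzhaf_before = semivalue(holdings, banzhaf_weight)
banzhaf_after = semivalue(replicated, banzhaf_weight)

print("Shapley total for A:", total_for(shapley_before, "A"), "->", total_for(shapley_after, "A"))
print("Banzhaf total for A:", total_for(banzhaf_before, "A"), "->", total_for(banzhaf_after, "A"))
```

In this toy instance, player A's total Shapley payoff rises from 1.5 to 5/3 after replication, while its total Banzhaf payoff stays at 1.5. This is a single hand-picked example and should not be read as a statement of the paper's general results.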

Abstract

Submodular functions have been a powerful mathematical model for a wide range of real-world applications. Recently, submodular functions have become increasingly important in machine learning (ML) for modelling notions such as information and redundancy among entities such as data and features. Among these applications, a key question is payoff allocation, i.e., how to evaluate the importance of each entity towards the collective objective. To this end, classic solution concepts from cooperative game theory offer principled approaches to payoff allocation. However, despite the extensive body of game-theoretic literature, payoff allocation in submodular games is relatively under-researched. In particular, an important notion that arises in the emerging submodular applications is redundancy, which may occur from various sources such as abundant data or malicious manipulations where a…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Cryptography and Data Security