A Note on "Towards Efficient Data Valuation Based on the Shapley Value"

Jiachen T. Wang; Ruoxi Jia

arXiv:2302.11431·stat.ML·February 23, 2023

TL;DR

This technical note refines the analysis and design choices of the Group Testing-based Shapley value estimator of Jia et al. (2019) for data valuation, and points out that the estimator does not fully reuse the samples it collects.

Contribution

It offers a refined analysis of, and improved design choices for, the existing Group Testing-based SV estimator, identifies its incomplete sample reuse, and clarifies the challenges in developing efficient SV estimation algorithms for data valuation.

Findings

01 Identifies limitations in sample reuse of the current estimator

02 Provides improved analysis and design choices for SV estimation

03 Contributes insights into challenges of efficient data valuation

Abstract

The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) propose an advanced SV estimation algorithm called the "Group Testing-based SV estimator," which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.
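
To make the cost mentioned in the abstract concrete, the sketch below computes exact Shapley values by enumerating every subset of the other data points, which requires on the order of 2^n utility evaluations. It is a generic illustration of the standard Shapley value definition, not the Group Testing-based estimator discussed in the note. The names `exact_shapley`, `toy_utility`, and `WEIGHTS` are hypothetical, and the additive utility is an assumption chosen so the result is easy to check by hand (in data valuation the utility would typically be the performance of a model trained on the subset).

```python
from itertools import combinations
from math import comb

# Hypothetical per-point contributions for a toy additive utility.
WEIGHTS = [1.0, 2.0, 3.0]


def toy_utility(subset):
    """Assumed utility: the sum of the weights of the points in the subset."""
    return sum(WEIGHTS[j] for j in subset)


def exact_shapley(n, utility):
    """Exact Shapley values for n points by enumerating all subsets.

    For each point i, averages the marginal contribution
    utility(S | {i}) - utility(S) over subsets S of the other points,
    weighting each subset of size k by 1 / (n * C(n - 1, k)).
    """
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            weight = 1.0 / (n * comb(n - 1, size))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (utility(s | {i}) - utility(s))
        values.append(total)
    return values


if __name__ == "__main__":
    print(exact_shapley(len(WEIGHTS), toy_utility))
```

For an additive utility each point's Shapley value equals its own weight, which gives a quick sanity check. The inner loops visit 2^(n-1) subsets per point, which is why sampling-based estimators are of interest once n grows beyond a few dozen points.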

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · SARS-CoV-2 detection and testing