A Note on "Towards Efficient Data Valuation Based on the Shapley Value"
Jiachen T. Wang, Ruoxi Jia

TL;DR
This paper analyzes and improves the efficiency of the Group Testing-based Shapley value estimator for data valuation, highlighting its limitations and providing insights into more effective estimation strategies.
Contribution
It refines the analysis and design choices of the existing Group Testing-based SV estimator of Jia et al. (2019) and points out that this estimator does not fully reuse the samples it collects.
Findings
Shows that the Group Testing-based SV estimator does not fully reuse its collected samples
Presents improvements to the analysis and design choices of this estimator
Offers insights into the challenges of developing efficient SV estimation algorithms for data valuation
Abstract
The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) propose an advanced SV estimation algorithm called the "Group Testing-based SV estimator", which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.
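
To make the cost mentioned in the abstract concrete, here is a minimal sketch of exact Shapley value computation by brute-force subset enumeration. It is not the Group Testing-based estimator discussed in the note; it only illustrates why exact computation becomes expensive, since the number of subsets grows exponentially with the number of data points. The names `exact_shapley` and `toy_utility` are hypothetical, and the utility function is a made-up stand-in for the model performance score that a real data valuation pipeline would use.

```python
from itertools import combinations
from math import comb


def exact_shapley(n, utility):
    """Exact Shapley values for players 0..n-1 by enumerating all subsets.

    `utility` maps a frozenset of player indices to a number.
    The loop visits every subset of the other n-1 players for each player,
    so the work grows on the order of n * 2^(n-1) marginal contributions.
    """
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of a subset S that excludes player i
            # Standard Shapley weight for subsets of size k
            weight = 1.0 / (n * comb(n - 1, k))
            for subset in combinations(others, k):
                s = frozenset(subset)
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


def toy_utility(subset):
    # Hypothetical utility (an assumption for illustration only):
    # performance saturates once two data points are present.
    return float(min(len(subset), 2))


if __name__ == "__main__":
    print(exact_shapley(3, toy_utility))
```

In this toy setting all three points are interchangeable, so each receives an equal share of the full-dataset utility. Estimators such as the one analyzed in the note aim to approximate these values from far fewer utility evaluations than full enumeration requires.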
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · SARS-CoV-2 detection and testing
