Is Data Valuation Learnable and Interpretable?
Ou Wu, Weiyao Zhu, Mengyang Li

TL;DR
This paper investigates whether data valuation can be learned and interpreted, proposing new valuation models that have a fixed number of parameters and are interpretable, supported by experiments on benchmark datasets.
Contribution
It introduces two novel data valuation frameworks using MLP and regression trees, demonstrating learnability and interpretability of data valuation models.
Findings
Learned valuation models are feasible and effective.
Interpretable models can explain data sample importance.
Experimental results support the positive potential of the proposed methods.
Abstract
Measuring the value of individual samples is critical for many data-driven tasks, e.g., the training of a deep learning model. Recent literature has seen substantial efforts in developing data valuation methods. The primary data valuation methodology is based on the Shapley value from game theory, and various methods have been proposed along this path. Even though Shapley value-based valuation has a solid theoretical basis, it is entirely an experiment-based approach, and no valuation model has been constructed so far. In addition, current data valuation methods ignore the interpretability of the output values, even though an interpretable data valuation method would be of great help for applications such as data pricing. This study aims to answer an important question: is data valuation learnable and interpretable? A learned valuation model has several desirable merits, such as a fixed number of…
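To make the contrast in the abstract concrete, the sketch below first estimates Shapley values experimentally by sampling permutations, then fits a small model that maps a sample's features to its estimated value. It is an illustration under stated assumptions, not the authors' method: the nearest-centroid utility, the synthetic data, the choice of input features, and the helper names (`utility`, `mc_shapley`, `fit_stump`, `predict_stump`) are all hypothetical, and a depth-1 regression tree stands in for the MLP and regression-tree models the paper proposes.

```python
import numpy as np

def utility(X_tr, y_tr, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier (assumed utility)."""
    if len(y_tr) == 0:
        return 0.0  # convention: an empty training set has zero utility
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_val).mean())

def mc_shapley(X, y, X_val, y_val, n_perm=50, seed=0):
    """Permutation-sampling estimate of each training sample's Shapley value."""
    rng = np.random.default_rng(seed)
    n = len(y)
    values = np.zeros(n)
    for _ in range(n_perm):
        perm = rng.permutation(n)
        prev = utility(X[:0], y[:0], X_val, y_val)
        for k, i in enumerate(perm):
            idx = perm[: k + 1]
            cur = utility(X[idx], y[idx], X_val, y_val)
            values[i] += cur - prev  # marginal contribution of sample i
            prev = cur
    return values / n_perm

def fit_stump(F, v):
    """Depth-1 regression tree: the (feature, threshold) split with lowest squared error."""
    best = None
    for j in range(F.shape[1]):
        for t in np.unique(F[:, j])[:-1]:  # dropping the max keeps both sides non-empty
            left = F[:, j] <= t
            lm, rm = v[left].mean(), v[~left].mean()
            sse = ((v[left] - lm) ** 2).sum() + ((v[~left] - rm) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def predict_stump(stump, F):
    """Predict a value for each row of F from the fitted split."""
    j, t, left_val, right_val = stump
    return np.where(F[:, j] <= t, left_val, right_val)

# Synthetic two-class data (purely illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (10, 2)), rng.normal(1, 1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
X_val = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y_val = np.array([0] * 50 + [1] * 50)

# Step 1: experiment-based valuation (many retrain-and-evaluate runs).
values = mc_shapley(X, y, X_val, y_val)

# Step 2: a learned valuation model with a fixed, tiny parameter set.
# Using raw features plus label as inputs is an assumption of this sketch.
features = np.column_stack([X, y])
stump = fit_stump(features, values)
predicted = predict_stump(stump, features)
```

The fitted split can be read directly as a rule ("samples with feature `j` at most `t` are worth about `left_val`"), which is the kind of interpretability a tree-based valuation model is meant to offer, and scoring a new sample requires no further retraining runs.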
Taxonomy
Topics: Financial Reporting and Valuation Research
Methods: Balanced Selection
