Scalable Data Point Valuation in Decentralized Learning
Konstantin D. Pandl, Chun-Yin Huang, Ivan Beschastnikh, Xiaoxiao Li, Scott Thiebes, Ali Sunyaev

TL;DR
This paper introduces DDVal, a scalable decentralized data valuation method that accurately estimates individual data point contributions in federated and swarm learning, even with non-IID data distributions.
Contribution
Develops DDVal, a novel approach for decentralized data valuation that scales efficiently and accurately estimates contributions in federated and swarm learning scenarios.
Findings
DDVal achieves 99.969% cosine similarity in estimating Shapley values.
Its cost scales with the number of data points rather than the number of clients, with log-linear complexity.
DDVal is effective in scenarios with many small-data clients.
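The cosine-similarity figure above compares two vectors of valuations. A minimal sketch of that metric follows, assuming the comparison is between a vector of estimated values and a vector of reference Shapley values; the numbers and the function name are hypothetical and not taken from the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two value vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical valuations for three contributors (illustration only).
reference = [0.40, 0.35, 0.25]   # e.g. reference Shapley values
estimate = [0.41, 0.33, 0.26]    # e.g. values produced by an approximation

print(f"{100 * cosine_similarity(reference, estimate):.3f}%")
```

Cosine similarity ignores the overall scale of the two vectors, so it measures agreement on relative contributions rather than on absolute magnitudes.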
Abstract
Existing research on data valuation in federated and swarm learning focuses on valuing client contributions and works best when data across clients is independent and identically distributed (IID). In practice, data is rarely distributed IID. We develop an approach called DDVal for decentralized data valuation, capable of valuing individual data points in federated and swarm learning. DDVal is based on sharing deep features and approximating Shapley values through a k-nearest neighbor approximation method. This allows for novel applications, for example, to simultaneously reward institutions and individuals for providing data to a decentralized machine learning task. The valuation of data points through DDVal also allows drawing hierarchical conclusions on the contribution of institutions, and we empirically show that the accuracy of DDVal in estimating institutional contributions is…
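To illustrate the idea of approximating Shapley values with a k-nearest neighbor model, here is a minimal sketch using the known closed-form recursion for an unweighted KNN classifier. Whether DDVal uses exactly this recursion is an assumption; the abstract only states that a k-nearest neighbor approximation is applied to shared deep features. The function names, the toy 1-D "features", and the per-client summation (which relies on the additivity of Shapley values) are illustrative choices, not the authors' implementation.

```python
import numpy as np

def knn_shapley(train_feats, train_labels, test_feats, test_labels, k=5):
    """Per-training-point Shapley values for an unweighted KNN classifier."""
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    test_feats = np.asarray(test_feats, dtype=float)
    n = len(train_labels)
    values = np.zeros(n)
    for x, y in zip(test_feats, test_labels):
        # Sort training points by distance to the test point, nearest first.
        dists = np.linalg.norm(train_feats - x, axis=1)
        order = np.argsort(dists, kind="stable")
        match = (train_labels[order] == y).astype(float)
        # Recursion runs from the farthest point back to the nearest one.
        s = np.zeros(n)
        s[-1] = match[-1] / n
        for i in range(n - 2, -1, -1):
            rank = i + 1  # 1-indexed position in the sorted order
            s[i] = s[i + 1] + (match[i] - match[i + 1]) / k * min(k, rank) / rank
        values[order] += s
    return values / len(test_feats)

def client_values(point_values, client_ids):
    """Assumed aggregation: a client's value is the sum of its points' values."""
    return np.bincount(client_ids, weights=point_values)

# Toy data (hypothetical): four training points owned by two clients.
feats = np.array([[0.0], [1.0], [2.0], [3.0]])
labels = np.array([1, 0, 1, 0])
owners = np.array([0, 0, 1, 1])

vals = knn_shapley(feats, labels, np.array([[0.1]]), np.array([1]), k=2)
print(vals, client_values(vals, owners))
```

In this sketch, each test point costs one distance computation per training point, one sort, and one linear pass, so the sort is the only step that grows faster than linearly in the number of training points. Nothing in the computation depends on how many clients the points are spread across.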
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques
