# Proof-of-Contribution-Based Design for Collaborative Machine Learning on Blockchain

**Authors:** Baturalp Buyukates, Chaoyang He, Shanshan Han, Zhiyong Fang, Yupeng Zhang, Jieyi Long, Ali Farahanchi, and Salman Avestimehr

arXiv: 2302.14031 · 2023-02-28

## TL;DR

This paper presents a blockchain-based data marketplace for decentralized federated learning that ensures fair contribution-based rewards, privacy, robustness, verifiability, and efficiency.

## Contribution

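The paper proposes a blockchain-based marketplace design that combines proof-of-contribution reward allocation, privacy-preserving decentralized training, robustness against malicious trainers, and verifiability through zero-knowledge proofs. It also implements the building blocks of the design and evaluates them experimentally.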

## Key findings

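- The building blocks of the proposed data market are implemented, and experiments demonstrate their applicability in practical scenarios.
- Contribution assessment and outlier detection are performed by the aggregator, and their correctness is verifiable through zero-knowledge proofs.
- Rewards follow contributions: the smart contract ensures that the project owner cannot evade payment and that honest trainers are rewarded based on their contributions at the end of training.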

## Abstract

We consider a project (model) owner that would like to train a model by utilizing the local private data and compute power of interested data owners, i.e., trainers. Our goal is to design a data marketplace for such decentralized collaborative/federated learning applications that simultaneously provides i) proof-of-contribution based reward allocation so that the trainers are compensated based on their contributions to the trained model; ii) privacy-preserving decentralized model training by avoiding any data movement from data owners; iii) robustness against malicious parties (e.g., trainers aiming to poison the model); iv) verifiability in the sense that the integrity, i.e., correctness, of all computations in the data market protocol including contribution assessment and outlier detection are verifiable through zero-knowledge proofs; and v) efficient and universal design. We propose a blockchain-based marketplace design to achieve all five objectives mentioned above. In our design, we utilize a distributed storage infrastructure and an aggregator aside from the project owner and the trainers. The aggregator is a processing node that performs certain computations, including assessing trainer contributions, removing outliers, and updating hyper-parameters. We execute the proposed data market through a blockchain smart contract. The deployed smart contract ensures that the project owner cannot evade payment, and honest trainers are rewarded based on their contributions at the end of training. Finally, we implement the building blocks of the proposed data market and demonstrate their applicability in practical scenarios through extensive experiments.
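The reward step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's protocol: the function name `allocate_rewards`, the proportional payout rule, the trainer names, and the scores are all assumptions made for the example. The abstract does not specify the contribution metric, the outlier test, or the payout formula, and the zero-knowledge proofs and smart contract are not modeled here.

```python
from typing import Dict, Set


def allocate_rewards(
    scores: Dict[str, float],
    outliers: Set[str],
    budget: float,
) -> Dict[str, float]:
    """Split `budget` among trainers in proportion to contribution scores.

    Hypothetical rule for illustration only: trainers flagged as outliers
    and trainers with non-positive scores receive nothing.
    """
    # Keep only trainers that were not flagged and contributed positively.
    eligible = {
        trainer: score
        for trainer, score in scores.items()
        if trainer not in outliers and score > 0
    }
    total = sum(eligible.values())

    # Every trainer gets an entry, defaulting to zero.
    rewards = {trainer: 0.0 for trainer in scores}
    if total == 0:
        return rewards  # nobody is eligible, so nothing is paid out

    for trainer, score in eligible.items():
        rewards[trainer] = budget * score / total
    return rewards


# Assumed example data: "mallory" has been flagged as an outlier.
rewards = allocate_rewards(
    scores={"alice": 3.0, "bob": 1.0, "mallory": 5.0},
    outliers={"mallory"},
    budget=100.0,
)
print(rewards)  # {'alice': 75.0, 'bob': 25.0, 'mallory': 0.0}
```

In the design the abstract describes, the aggregator would compute the scores and the outlier set, and a smart contract would release the payments. The sketch covers only the arithmetic of a contribution-proportional split.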

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14031/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14031/full.md

## References

60 references — full list in the complete paper: https://tomesphere.com/paper/2302.14031/full.md

---
Source: https://tomesphere.com/paper/2302.14031