TDML -- A Trustworthy Distributed Machine Learning Framework
Zhen Wang, Qin Wang, Guangsheng Yu, Shiping Chen

TL;DR
This paper introduces TDML, a blockchain-based framework that enhances trustworthiness, security, and efficiency in distributed machine learning, addressing resource constraints and malicious threats in large-scale model training.
Contribution
The paper proposes a novel blockchain-enabled framework for trustworthy distributed machine learning, integrating workload validation and security measures to improve scalability and robustness.
Findings
TDML effectively detects malicious nodes.
It improves training efficiency on distributed resources.
The framework ensures data privacy and integrity.
Abstract
Recent years have witnessed a surge in deep learning research, marked by the introduction of expansive generative models like OpenAI's SORA and GPT, Meta AI's LLAMA series, and Google's FLAN, BART, and Gemini models. However, the rapid advancement of large models (LM) has intensified the demand for computing resources, particularly GPUs, which are crucial for their parallel processing capabilities. This demand is exacerbated by limited GPU availability due to supply chain delays and monopolistic acquisition by major tech firms. Distributed Machine Learning (DML) methods, such as Federated Learning (FL), mitigate these challenges by partitioning data and models across multiple servers, though implementing optimizations like tensor and pipeline parallelism remains complex. Blockchain technology emerges as a promising solution, ensuring data integrity, scalability, and trust in distributed…
Taxonomy
Topics: Scientific Computing and Data Management · Explainable Artificial Intelligence (XAI)
