Exploring the Robustness of Decentralized Training for Large Language Models

Lin Lu; Chenxi Dai; Wangcheng Tao; Binhang Yuan; Yanan Sun; Pan Zhou

arXiv:2312.00843 · cs.LG · December 5, 2023 · 2 cites

Open Access

TL;DR

This paper examines the security vulnerabilities and challenges in decentralized training of large language models, emphasizing the need for robust frameworks to ensure safe and effective deployment.

Contribution

The paper identifies key vulnerabilities, distinguishes decentralized training from federated learning, and discusses the essential components of secure decentralized training for large language models.

Findings

1. Decentralized training faces hardware, data, and model vulnerabilities.
2. Security techniques from federated learning are not directly applicable.
3. A case study models a concrete threat scenario.

Abstract

Decentralized training of large language models has emerged as an effective way to democratize this technology. However, the potential threats associated with this approach have not been carefully discussed, which would hinder the development of decentralized training infrastructures. This paper aims to initiate discussion towards this end by exploring the robustness of decentralized training from three main perspectives. First, we demonstrate the vulnerabilities inherent in decentralized training frameworks in terms of hardware, data, and models. Second, we highlight the fundamental difference between decentralized foundation model training and vanilla federated learning, where the security techniques employed in federated learning cannot be applied directly. Third, we discuss the essential components required for a robust and efficient decentralized training framework and present a…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Access Control and Trust · Adversarial Robustness in Machine Learning