Covenant-72B: Pre-Training a 72B LLM with Trustless Peers Over-the-Internet
Joel Lidin, Amir Sarfi, Erfan Miahi, Quentin Anthony, Shivam Chauhan, Evangelos Pappas, Benjamin Thérien, Eugene Belilovsky, and Samuel Dare

TL;DR
Covenant-72B is a large, 72-billion-parameter language model trained via a globally distributed, permissionless process supported by blockchain technology, demonstrating the feasibility of democratized participation at unprecedented scale.
Contribution
This work presents the first large-scale, permissionless, globally distributed pre-training run of a 72B LLM, supported by a live blockchain protocol that enables open participation without whitelisting.
Findings
The model performs competitively with centralized models trained on similar compute budgets.
Democratized, permissionless training at large scale is feasible.
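The SparseLoCo optimizer supports dynamic participation, with peers joining and leaving freely, while a live blockchain protocol supports open, permissionless participation.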
Abstract
Recently, there has been increased interest in globally distributed training, which has the promise to both reduce training costs and democratize participation in building large-scale foundation models. However, existing models trained in a globally distributed manner are relatively small in scale and have only been trained with whitelisted participants. Therefore, they do not yet realize the full promise of democratized participation. In this report, we describe Covenant-72B, an LLM produced by the largest collaborative globally distributed pre-training run (in terms of both compute and model scale), which simultaneously allowed open, permissionless participation supported by a live blockchain protocol. We utilized a state-of-the-art communication-efficient optimizer, SparseLoCo, supporting dynamic participation with peers joining and leaving freely. Our model, pre-trained on…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Advanced Graph Neural Networks · Blockchain Technology Applications and Security
