Learning Provably Correct Distributed Protocols Without Human Knowledge
Yujie Hui, Xiaoyi Lu, Andrew Perrault, Yang Wang

TL;DR
This paper introduces GGMS, a novel learning framework that combines Monte Carlo Tree Search, transformer encoders, and model checking to automatically discover provably correct distributed protocols, overcoming limitations of prior methods.
Contribution
The paper presents GGMS, a new approach that effectively learns correct distributed protocols by integrating advanced search, encoding, and verification techniques, with proven completeness guarantees.
Findings
GGMS successfully learns correct protocols in larger settings than previous methods.
Protocols generated by GGMS are verified correct via exhaustive model checking.
The approach guarantees eventual discovery of a correct protocol if one exists.
Abstract
Provably correct distributed protocols, which are a critical component of modern distributed systems, are highly challenging to design and have often required decades of human effort. These protocols allow multiple agents to coordinate to come to a common agreement in an environment with uncertainty and failures. We formulate protocol design as a search problem over strategies in a game with imperfect information, and the desired correctness conditions are specified in Satisfiability Modulo Theories (SMT). However, standard methods for solving multi-agent games fail to learn correct protocols in this setting, even when the number of agents is small. We propose a learning framework, GGMS, which integrates a specialized variant of Monte Carlo Tree Search with a transformer-based action encoder, a global depth-first search to break out of local minima, and repeated feedback from a model…
Peer Reviews
Decision·Submitted to ICLR 2026
Applying learning for knowledge discovery is an exciting goal.
I do not find the motivation compelling at all. Distributed algorithms is a fascinating area where very smart people have contributed highly sophisticated algorithms over decades. It's unclear how the methods described in this paper can advance the area. Below are a few representative gaps between the toy setting used in the paper and distributed protocols: (1) use of FSMs to describe individual processes of a protocol is common, but one needs some form of symbolic descriptions that uses variabl
1. The paper introduces a new integration of guided MCTS, DFS freezing, and model-checking feedback for protocol synthesis. It provides theoretical results ensuring that freezing does not exclude correct solutions. 2. The clarity is good. Assumptions are stated (synchronous, same state machine per process, last-round decisions). Pseudo-code and examples make the method understandable. The modeling of processes as deterministic state machines, the learning–verification loop, and counterexample-g
1. GGMS assumes synchronous communication, identical deterministic state machines, and decisions in the final round. These simplifications are standard in this line of research. Still, showing one partially relaxed setting (e.g., I guess early decision or message subset) would strengthen the message that GGMS generalizes beyond toy cases. --- 2. The empirical section is minimal. Only a few experiments are tested, with few figures and no summary tables for success rates, runtime, or verificatio
I believe the authors tackle a challenging problem, automatically learning provably correct distributed protocols without human knowledge. I found quite intuitive the formulation of protocol design as a search problem in an imperfect information game. However I am not familiar with these protocols and I cannot comment on whether this is the first work attempting that. Nevertheless, The GGMS framework introduces several innovations that go beyond traditional MCTS including the integration of glob
The paper makes several restrictive assumptions that significantly limit its applicability to real-world distributed protocols -although clearly highlighted by the authors-: 1) The method is restricted to synchronous networks 2) GGMS assumes that processes can only send identical messages to all other processes (rather than different messages to different recipients) 3) The usage of model-checking, which is computationally exhaustive, restricts the number of processes that can be checked. The e
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
