Fast Space Optimal Leader Election in Population Protocols
Leszek Gasieniec, Grzegorz Stachowiak

TL;DR
This paper introduces a new population protocol for leader election that is both fast, running in O(log^2 n) parallel time, and space optimal, using only O(log log n) states per agent, matching theoretical lower bounds.
Contribution
It presents a novel leader election protocol that achieves optimal space complexity while maintaining fast convergence, utilizing phase clocks for synchronization.
Findings
Protocol runs in O(log^2 n) parallel time
Each agent uses O(log log n) states
Solution is correct and fast with high probability
Abstract
The model of population protocols refers to the growing in popularity theoretical framework suitable for studying pairwise interactions within a large collection of simple indistinguishable entities, frequently called agents. In this paper the emphasis is on the space complexity in fast leader election via population protocols governed by the random scheduler, which uniformly at random selects pairwise interactions within the population of n agents. The main result of this paper is a new fast and space optimal leader election protocol. The new protocol utilises O(log^2 n) parallel time (which is equivalent to O(n log^2 n) sequential pairwise interactions), and each agent operates on O(log log n) states. This double logarithmic space usage matches asymptotically the lower bound 1/2 log log n on the minimal number of states required by agents in any leader election algorithm with the…
This work is sponsored in part by the University of Liverpool initiative Networks Sciences and Technologies (NeST) and by the Polish National Science Centre grant DEC-2012/06/M/ST6/00459.
Leszek Gąsieniec
University of Liverpool
Grzegorz Stachowiak
Uniwersytet Wrocławski
Abstract
The model of population protocols refers to the growing in popularity theoretical framework suitable for studying pairwise interactions within a large collection of simple indistinguishable entities, frequently called agents. In this paper the emphasis is on the space complexity in fast leader election via population protocols governed by the random scheduler, which uniformly at random selects pairwise interactions from the population of n agents.
The main result of this paper is a new fast and space optimal leader election protocol. The new protocol operates in parallel time O(log^2 n), equivalent to O(n log^2 n) sequential pairwise interactions, in which each agent utilises O(log log n) states. This double logarithmic space utilisation matches asymptotically the lower bound 1/2 log log n on the number of states utilised by agents in any leader election algorithm with the running time , see [7].
Our solution relies on the concept of phase clocks, a fundamental synchronisation and coordination tool in the field of Distributed Computing. We propose a new fast and robust population protocol for initialisation of phase clocks to be run simultaneously in multiple modes and intertwined with the leader election process. We also provide the reader with the relevant formal argumentation indicating that our solution is always correct and fast with high probability.
1 Introduction
The model of population protocols adopted in this paper was introduced in the seminal paper of Angluin et al. [3]. Their model provides a universal theoretical framework for studying pairwise interactions within a large collection of indistinguishable entities, very often referred to as agents, equipped with fairly limited communication and computation ability. The agents are modelled as finite state machines. When two agents engage in a direct interaction they mutually access the contents of their local states. On the conclusion of the encounter their states are modified according to the transition function that forms an integral part of the population protocol. In the probabilistic variant of population protocols, considered in [3] and adopted in this paper, in each step the random scheduler selects a pair of agents uniformly at random. In this variant, in addition to the space utilisation determined by the maximum number of distinct states used by each agent, one is also interested in the running time of considered algorithmic solutions. More recent studies on population protocols focus on performance in terms of parallel time defined as the total number of pairwise interactions leading to stabilisation divided by the size (in our case n) of the population. Please note that the parallel time can also be interpreted as the local time observed by agents.
A population protocol terminates with success if the whole population eventually stabilises, i.e., arrives at and stays indefinitely in the final configuration of states reflecting the desired property of the solution. For example, in protocols targeting majority in the population, the final configuration corresponds to each agent being in the unique state representing the colour of the majority, see, e.g., [4, 6, 32, 33, 41]. In leader election, however, in the final configuration a single agent is expected to conclude in a leader state and all other agents must stabilise in follower states. The leader election problem has received greater attention in recent years in the context of population protocols thanks to several important developments in closely related problems [21, 25]. In particular, the results from [21, 25] laid down the foundation for the proof that leader election cannot be solved in sublinear time with agents utilising a fixed number of states [27]. In further work [9], Alistarh and Gelashvili studied the relevant upper bound, where they proposed a new leader election protocol stabilising in time assuming states at each agent.
In a very recent work Alistarh et al. [7] consider a general trade-off between the number of states used by agents and the time complexity of the stabilisation process. In particular, the authors provide a separation argument distinguishing between slowly stabilising population protocols which utilise states and rapidly stabilising protocols requiring states at each agent. This result nicely coincides with another fundamental observation by Chatzigiannakis et al. [20] which states that population protocols utilising states can cope only with semilinear predicates while the use of states admits computation of symmetric predicates.
Our results.
In this paper we show that the space complexity lower bound in fast leader election proved in [7] is asymptotically tight. The lower bound states that any leader election algorithm with the time complexity requires states in each agent. In this paper we present a new fast leader election algorithm which stabilises in time in populations with agents utilising states, for a small positive constant .
In the most recent work on majority problem in population protocols [8], Alistarh et al. show a lower bound of states for any protocol which stabilises in time, for any constant They also match this bound from above by an algorithm which utilises states at each agent, and stabilises in time .
Our algorithm utilises a fast and small space reduction of potential leaders (candidates) in the population. The reduction process is intertwined with a robust initialisation and further utilisation of phase clocks, a core synchronisation tool developed and broadly explored in self-stabilising literature [37]. This includes the seminal work on clock synchronisation by Arora et al. [10], further extension by Dolev and Welch [24] to distributed systems prone to Byzantine faults, and related study on pulse synchronisation by Daliot et al. [26]. Our variant of the phase clock refers directly to the work of Angluin et al. [5] in which the authors propose efficient simulation of a virtual register machine supporting basic arithmetic operations. The simulation in [5] assumes availability of a single leader which coordinates the relevant exchange of information. In the same paper, the authors also provide some intuition behind the phase clock coordinated by a junta of leaders, for a small positive constant In this work we formally prove that the phase clock based on a junta of cardinality for any allows counting time units assuming a constant number of states at each agent. We also consider an extension of the phase clock allowing to compute time for any integer constant Our main result is based on rapid computation of a junta of leaders followed by fast election of a single leader, all in time and states available at each agent.
Related work.
Leader election is one of the fundamental problems in the field of Distributed Computing on par with other core problems including broadcasting, mutual exclusion, and consensus, see the excellent textbook by Attiya and Welch [13]. The problem was originally studied in networks with nodes having distinct labels [39], where an early work focuses on the ring topology in synchronous [29, 38] as well as in asynchronous models [18, 43]. Also, in networks populated by mobile agents the leader election was studied first in networks with labeled nodes [36]. However, very often leader election is used as a powerful symmetry breaking mechanism enabling feasibility and coordination of more complex protocols in systems based on uniform (indistinguishable) entities. There is a large volume of work [2, 11, 12, 16, 17, 44, 45] on leader election in anonymous networks. In [44, 45] we find a characterisation of message-passing networks in which leader election is feasible when the nodes are anonymous. In [44], the authors study the problem of leader election in general networks under the assumption that node labels are not unique. In [28], the authors study feasibility and message complexity of leader election in rings with possibly non-unique labels, while in [23] the authors provide solutions to a generalised leader election problem in rings with arbitrary labels. The work in [31] focuses on space requirements for leader election in unlabelled networks. In [30], the authors investigate the running time of leader election in anonymous networks where the time complexity is expressed in terms of multiple network parameters. In [22], the authors study feasibility of leader election for anonymous agents that navigate in a network asynchronously. Also, an interesting study on trade-offs between the time complexity and knowledge available in anonymous trees can be found in recent work of Glacet et al. [35].
Finally, a good example of recent extensive studies on the exact space complexity in related models refers to plurality consensus. In particular, in [15] Berenbrink et al. proposed a plurality consensus protocol for original opinions converging in synchronous rounds using only bits of local memory. They also show a slightly slower solution converging in rounds using only bits of local memory. This disproved the conjecture by Becchetti et al. [14] implying that any protocol with local memory has the worst-case running time In [34] Ghaffari and Parter propose an alternative algorithm converging in time in which messages and local memory utilise bits. In addition, some work on the application of the random walk in plurality consensus protocols can be found in [14, 32].
2 Preliminaries
We consider population protocols defined on the complete graph of interactions where the random scheduler picks uniformly at random pairs of agents drawn from the population of size n. The agents are anonymous, i.e., they do not have identifiers. The protocol assumes all agents start in the same initial state. Our protocol utilises the classical model of population protocols [3, 5] where the consecutive interactions refer to ordered pairs of agents . On the conclusion of each interaction the two participating agents change their states into according to a fixed deterministic transition function denoted by
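The scheduler model described above can be sketched in a few lines of Python. This is only an illustrative harness, not part of the paper: the names `run_protocol` and `delta` are hypothetical, and the transition function is passed in as an ordinary Python callable taking the initiator's and responder's states and returning their new states.

```python
import random

def run_protocol(states, delta, interactions, rng):
    """Apply `interactions` steps of the random scheduler in place.

    states       -- list holding one state per agent (assumed representation)
    delta        -- transition function: (initiator, responder) -> (new_i, new_r)
    interactions -- number of sequential pairwise interactions to simulate
    Returns the parallel time, i.e. interactions divided by population size.
    """
    n = len(states)
    for _ in range(interactions):
        # An ordered pair (initiator, responder) of two distinct agents,
        # chosen uniformly at random.
        i, r = rng.sample(range(n), 2)
        states[i], states[r] = delta(states[i], states[r])
    return interactions / n
```

For instance, 50 interactions in a population of 10 agents correspond to parallel time 5.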
Random coins.
For the simplicity of presentation, in this paper we dispense fair random coins whp by observing actions of the random scheduler. It has been shown, however, that agents can generate synthetic coins which become almost uniform after a constant number of interactions [7]. This method is based on concentration properties of random walks on the hypercube, see, e.g., [1]. A similar approach can be found in [19] where Cardelli et al. generate randomness in chemical reaction networks.
We focus here on two complexity measures: (1) the space complexity defined as the number of states required by each agent, and (2) the time complexity reflecting the number of interactions needed to stabilise the population protocol. Similarly to other recent work in the field, the emphasis in this paper is on parallel time of the solution defined as the total number of interactions divided by the size of the population. This time can be also seen as the local time observed by an agent, i.e., the number of pairwise interactions in which the agent is involved. In this work we aim at protocols formed of interactions equivalent to the parallel running time
Our leader election algorithm is always correct and it runs fast with high probability (whp) which we define as follows. Let be a universal constant referring to the quality of our protocols. We say that an event occurs with negligible probability if it occurs with probability at most , and an event occurs with high probability (whp) if it occurs with probability at least . This estimate is of asymptotic nature, i.e., we assume is large enough to validate the results. Similarly, we say that an algorithm succeeds with high probability if it succeeds with probability at least . When we refer to a probability of failure different from , we say explicitly whp .
2.1 One-way epidemics
In our solution we adopt the notion of one-way epidemic introduced in [5]. One-way epidemic refers to the population protocol with the state space {0, 1} and the transition rule (x, y) → (x, max{x, y}). One interprets 0's as susceptible agents and 1's as infected ones. This protocol corresponds to a simple epidemic in which transmission of the infection occurs if and only if the initiator is infected and the responder is susceptible. We will use the following theorem introduced in [5].
Theorem 1 ([5])
In order to conclude one-way epidemic (infect all agents) one needs pairwise interactions with high probability.
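A small simulation can make the one-way epidemic concrete. The sketch below is illustrative only; the function name `one_way_epidemic` and the choice of agent 0 as the initially infected agent are assumptions. It counts the sequential interactions needed until every agent is infected, which for one-way epidemics is known to grow on the order of n log n.

```python
import math
import random

def one_way_epidemic(n, rng):
    """Count interactions until all n agents are infected.

    Agent 0 starts infected (state 1); all others are susceptible (state 0).
    Infection spreads only from an infected initiator to a susceptible
    responder.
    """
    infected = [False] * n
    infected[0] = True
    count, steps = 1, 0
    while count < n:
        i, r = rng.sample(range(n), 2)  # ordered pair (initiator, responder)
        steps += 1
        if infected[i] and not infected[r]:
            infected[r] = True
            count += 1
    return steps
```

Dividing the returned count by `n * math.log(n)` for several population sizes gives a rough empirical view of how the constant hidden in the bound behaves.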
3 Phase clock revisited
In [5] Angluin et al. defined and further analysed the concept of phase clocks capable of counting parallel time approximately, in which each agent participating in the population protocol utilises a constant number of states. The phase clocks studied in [5] work under the assumption of having already determined a unique leader in the population. In the same paper, the authors argue without giving a formal proof that phase clocks should also work when the unique leader is replaced by a junta of leaders, for some unspecified constant Further on, the authors suggest also that once the phase clock is in motion the leadership team can be reduced to a single leader with the help of coin tossing combined with propagation via one-way epidemic. This would allow election of a single leader in expected interactions if a junta phase clock is established. In this paper we adopt a similar mechanism to determine a single leader, however here the junta team has to be computed first. We implement this process in two loops, one nested inside the other. The internal loop operates in (parallel) time equivalent to interactions allowing the distribution of 1's via one-way epidemic. This internal loop in principle mimics actions of a finite state phase clock. The external loop is used to count executions of the internal loop. The external loop is controlled by a finite state phase clock too. However this time the agents execute clock operations less frequently, i.e., only when they act as responders for the first time after each full execution of the internal loop. This way a single execution of the external phase clock refers to time counted by the internal loop, which is in turn equivalent to the total time
In this section we propose and analyse a modified version of phase clocks capable of counting approximately time under the assumption that each agent utilises a constant number of states and the junta of leaders is of cardinality , for any constant . Without loss of generality and for some technical reasons we assume for a positive integer .
The states of agents controlling the phase clock protocol are structured in pairs . The entry has value for leaders in the junta and for all other agents. The entry represents a phase denoted by a number of an agent drawn from the set , for some integer constant . The phases can be interpreted as hours on the dial of an analogue clock. The increment of clock phases is periodic and computed using the arithmetic modulo denoted by . We also define the maximum of two phases in set as:
[TABLE]
Finally we define the circular order (which is not partial) on as iff
Now we are ready to formally define the transition function in our version of phase clocks as
[TABLE]
and
[TABLE]
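The displayed definitions are not reproduced here, so the following Python sketch only illustrates one plausible reading of them, in the style of the phase clock of [5]. All names (`max_m`, `clock_step`) and the exact form of the rules are assumptions: the circular maximum returns the larger phase when the two phases are at most half a dial apart and the smaller one otherwise; a follower acting as responder adopts the circular maximum of its own phase and the initiator's phase; a leader acting as responder does the same against the initiator's phase incremented modulo m.

```python
def max_m(x, y, m):
    """Circular maximum of two phases on a dial with m hours (assumed form)."""
    return max(x, y) if abs(x - y) <= m // 2 else min(x, y)

def clock_step(initiator, responder, m):
    """One phase clock interaction under the assumed rules.

    Each agent is a pair (is_leader, phase). Only the responder changes;
    the function returns the responder's new (is_leader, phase).
    """
    _, y = initiator
    is_leader, x = responder
    # Leaders may move one phase ahead of the initiator; followers only catch up.
    target = (y + 1) % m if is_leader else y
    return is_leader, max_m(x, target, m)
```

Under these assumed rules a leader in phase 11 on a 12-hour dial that responds to an initiator in phase 11 moves to phase 0, which is a pass through 0 in the terminology of this section, while a follower never moves backwards on the dial.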
In this paper we study phase clocks which operate in two nested modes: the ordinary mode (analogue of the internal loop) and the external mode (analogue of the external loop). In each mode, we say the phase clock passes through 0 whenever its current phase is reduced in absolute terms (e.g., changes from phase 5 to phase 3). As hinted earlier, the two modes differ in which pairwise interactions trigger the relevant phase clock actions. In particular:
- In the ordinary mode all interactions triggered by the random scheduler prompt actions of the phase clock. Once the ordinary mode clock passes through zero, a meaningful interaction of the external mode occurs. A meaningful interaction refers to the first interaction of an agent after its ordinary phase clock passes through 0 in which the agent also acts as the responder.
- In the external mode, however, interactions used to propel actions of the phase clock form series of interactions in which every agent acts as the responder exactly once. In each subsequent series the initiators are chosen at random (by the random scheduler) and the order in which agents appear as responders is irrelevant. In our algorithms such series are series of meaningful interactions following passes through zero of the ordinary mode clock.
Before we proceed with the full proof of Theorem 2, i.e., the main result of this section, we share with the reader several useful lemmas. In the proofs referring to the ordinary mode we utilise Theorem 1 showing that one-way epidemic protocol concludes after interactions whp. And for the external mode we need an analogue of this theorem.
Lemma 1
One-way epidemic applied in the external mode requires interactions whp to stabilise.
Proof: Let be the first infected agent. By the Chernoff bound, for any constant the number of interactions agent needs to infect directly agents is bounded by whp . Thus the number of infected agents after interactions is at least whp . Also by the Chernoff bound, there exists a constant s.t., if the number of infected agents before a series of interactions is where , then on the conclusion of the series the number of infected agents is at least whp . Thus if we take , thanks to the exponential growth, the number of infected agents after interactions is at least whp . Furthermore, by taking an extra pairwise interactions each uninfected (yet) agent interacts times as the responder. One can choose a constant s.t., the probability of not getting infected during these interactions is at most for a fixed uninfected agent. Finally, by the Union bound the probability of failure in any of these steps is at most .
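The setting of Lemma 1 can be explored with a short simulation. The sketch below is an illustration under stated assumptions rather than the authors' construction: the names `external_series` and `epidemic_in_external_mode` are hypothetical, responders are visited in a random order (the text notes that the order is irrelevant), and each responder is paired with a uniformly random distinct initiator.

```python
import random

def external_series(n, rng):
    """Yield one external-mode series of (initiator, responder) pairs.

    Every agent acts as the responder exactly once; its initiator is a
    uniformly random agent different from the responder.
    """
    responders = list(range(n))
    rng.shuffle(responders)
    for r in responders:
        i = rng.randrange(n - 1)
        if i >= r:          # skip over r so that initiator != responder
            i += 1
        yield i, r

def epidemic_in_external_mode(n, rng):
    """Return the number of series needed to infect all agents from agent 0."""
    infected = [False] * n
    infected[0] = True
    series = 0
    while not all(infected):
        series += 1
        for i, r in external_series(n, rng):
            if infected[i]:
                infected[r] = True
    return series
```

Each series consists of exactly n interactions, so multiplying the returned value by n gives the interaction count that the lemma bounds.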
For the simplicity of presentation we assume in the next few lemmas that the agents start in phase 0. The main purpose of these lemmas is to bound from above the sizes of sets of agents in phases on the conclusion of interactions. There are separate collections of lemmas for the ordinary and the external modes. Also here we assume and . In what follows we state two lemmas with similar proofs for phase clocks in the ordinary and the external modes.
Lemma 2
Assume and interactions of the phase clock are performed in the ordinary mode. Assume also that at some point the number of agents in phase is at most for all and some value . Then after interactions the number of agents in phase is at most for all and whp .
Proof: We prove this lemma by induction on . For the thesis holds since the number of agents in phase is at most with probability . Assume now, the thesis is true for and we prove it for . By the inductive assumption after the series of interactions the number of agents in phase is bounded from above by for all whp . Two types of agents can enter phase during these interactions.
The first type refers to the leaders. A leader can enter phase if it acts as the responder in the interaction with some initiator in phase Assume the number of the potential initiators is at most , which happens according to the inductive hypothesis whp. Thus the probability that a new leader enters phase during any of interactions is at most . We attribute a binary 0-1 sequence of length to these interactions. Initially is empty and during each interaction we pad with one bit as follows. If a new leader in phase appears, we add to . If no new leader in phase is selected, is inserted to but only with probability and 0 otherwise. This way all entries of are independently equal to with probability . If the number of s in is smaller than or equal to , the number of new leaders in phase is not larger than . The expected number of s in is . By the Chernoff bound, the probability this number is larger than is negligible and smaller than for sufficiently large . Thus the number of new leaders in phase is not larger than whp .
The second type of new agents in phase refers to followers. A follower enters phase , if it is a responder to an initiator in phase . Also here we attribute a 0-1 sequence of length to the relevant interactions. Prior to these interactions is empty. During each interaction gets extended by a single bit. Let be the probability of getting a new follower in phase in a subsequent interaction . If , then is inserted to with probability and 0 otherwise. If and a new follower in phase appears, is added to . If and no new follower in phase appears, then is added to with probability and 0 otherwise. Note that until more than new followers occur in phase , . If the number of s in is smaller than or equal to , the number of new followers in phase is not larger than . The expected number of s in is . By the Chernoff bound the probability that this number is larger than is negligible and smaller than for sufficiently large . Thus the number of new followers in phase is not larger than whp at least . This concludes the proof that the number of agents in phase is at most for all whp .
We formulate now the analogous lemma for the external mode.
Lemma 3
Assume and interactions of the phase clock are performed in the external mode. Assume also that at some point the number of agents in phase is at most for all and some value . Consider a series of at most interactions in which at most leaders act as responders. After this series of interactions, the number of agents in phase is at most for all and whp at least .
Proof: We prove the lemma by induction on . For the thesis holds since the number of agents in phase is at most with probability . Assume now, the thesis is true for and we prove it for . By the inductive assumption after the series of interactions from the Lemma’s thesis the number of agents in phase is bounded from above by for all whp . Two types of agents can enter phase during these interactions.
The first type refers to the leaders. A leader can enter phase if it acts as the responder in the interaction with some initiator in phase Assume the number of the potential initiators is at most , which happens according to the inductive hypothesis whp. There are at most interactions in the series in which a leader is the responder. The probability that such a leader enters phase during interaction is at most . We attribute a binary 0-1 sequence of length to these interactions. Initially is empty and during each interaction we pad with one bit as follows. If a new leader in phase appears, we add to . If no new leader in phase is selected, is inserted to but only with probability and 0 otherwise. This way all entries of are independently equal to with probability . If has less than entries we add lacking entries by independent coin tosses each time obtaining 1 with probability and 0 with the remaining probability. If the number of s in is smaller than or equal to , the number of new leaders in phase is not larger than . The expected number of s in is . By the Chernoff bound, the probability this number is larger than is negligible and smaller than for sufficiently large . Thus the number of new leaders in phase is not larger than whp .
The second type of new agents in phase refers to followers. A follower enters phase , if it is a responder to an initiator in phase . Also here we attribute a 0-1 sequence of length to the relevant interactions. Prior to these at most interactions is empty. During each interaction gets extended by a single bit. Let be the probability of getting a new follower in phase in a subsequent interaction . If , then is inserted to with probability and 0 otherwise. If and a new follower in phase appears, is added to . If and no new follower in phase appears, then is added to with probability and 0 otherwise. Note that until more than new followers occur in phase , . If the number of s in is smaller than or equal to , the number of new followers in phase is not larger than . The expected number of s in is . By the Chernoff bound the probability that this number is larger than is negligible and smaller than for sufficiently large . Thus the number of new followers in phase is not larger than whp at least . This concludes the proof that the number of agents in phase is at most for all whp at least .
All lemmas below apply to both (the ordinary and the external) modes of the phase clock.
Lemma 4
Assume all agents start in the clock phase 0. The probability that after interactions (in either of the phase clock modes) there are at least agents in phases is at most .
Proof: In the beginning the number of agents in phase 0 is and there are no agents in any other phase. So the number of agents in phases is at most for all and . To conclude the proof we apply Lemma 2 (or Lemma 3, depending on the mode) times for the series of subsequent interactions. For the ordinary mode we get the thesis by applying Lemma 2 to subsequent series of interactions.
For the external mode we can split each series of interactions into eight subseries. In each subseries there are at most interactions and in at most interactions leaders act as responders. We can apply Lemma 3 to these subseries to obtain the thesis.
Namely in these applications is equal , and by Lemmas 2 and 3 the number of agents in phase after all interactions exceeds with probability at most .
Lemma 5
Assume all agents start in the clock phase 0. The probability that on the conclusion of interactions (in either of the phase clock modes) there are some agents in clock phase is .
Proof: The (clock) phase can be entered only by a leader which acts as the responder in the interaction with an agent in clock phase . Since the number of agents in clock phase is at most , the probability of having such interaction in each series of interactions is at most . By the Union bound the probability of having such interaction during subsequent interactions is .
Lemma 6
Assume all agents start in clock phase and is a positive constant. Then there exists an integer constant s.t., the first agent enters phase before interaction with negligible probability for sufficiently large
Proof: Assume . We can divide all phases into consecutive chunks having phases each. Let for all be the first interaction in which an agent enters phase where . Note that just before interaction all agents are in phases . Thus after each subsequent interaction all agent phases are not larger () as if they all started from phase just before interaction . By Lemma 5 the probability that is smaller than for some constant . The probability, that for at least different values we have is by Union bound smaller than
[TABLE]
Now take and . Thus for sufficiently large we obtain with probability at most .
Lemma 7
For any constant there is another constant s.t., if and after interaction there is an agent in phase and all other agents are in phases , then whp
- the first interaction when an agent enters phase satisfies , and
- during interaction all agents are in phases s.t., .
Proof: By Theorem 1 and Lemma 1 there exists a positive constant s.t., one-way epidemic succeeds within interactions whp. On the other hand by Lemma 6, for a constant there is s.t., all agents starting in phase move to phase smaller or equal after interactions whp. It is easy to see that the same holds if all agents start in phases . Thus whp. Since one way epidemic initiated by an agent in phase during interaction succeeds whp, all agents after interaction are in phase whp.
Consider now the interactions in which phase clocks in different agents pass through 0. We say that passes through 0 of two agents are equivalent if they occur during a period in which all agents are in phases . This notion defines a relation which is reflexive and symmetric. For big enough by Lemma 7 with high probability this relation is transitive and any two agents have equivalent passes through 0. Thus passes of agents through 0 form equivalence classes. This allows us to use argumentation similar to the one proposed in [5], however this time for the junta of leaders rather than for a single leader.
Theorem 2
Assume all agents start the phase clock protocol from the initial phase 0 when at most leaders are already selected. For any fixed , there exists a constant s.t., the finite-state phase clock with parameter completes passes through 0 s.t., the following conditions are satisfied with high probability for sufficiently large .
- First passes through 0 of all agents form equivalence classes in which each agent contributes exactly once and the number of interactions between closest passes through 0 in different equivalence classes is at least .
- The number of interactions between two subsequent passes through 0 in any agent is whp.
Proof: By Lemma 7 there exists s.t., for the thesis of this Lemma holds for the same as in the Theorem. We consider ten subsets of defined as . By Lemma 7 phases of all agents progress whp from to (modulo 10) in at least interactions whp. This implies that agents' passes through 0 form equivalence classes whp and the number of interactions between closest passes through 0 in different equivalence classes is at least whp. Since one-way epidemic operates in interactions whp each agent increments its phase in interactions. Thus the number of interactions between two subsequent passes through 0 in any agent is whp.
In conclusion, we formulate two useful facts related to phase clocks. Fact 1 states that if some leaders become followers during the phase clock protocol, then the phase clock can only slow down, but the upper bound on the number of interactions remains . Fact 2 states that any unsuccessful interactions can only slow down the phase clock.
Fact 1
The reduction of the number of leaders during execution of the phase clock protocol can only slow down phase progression of agents on their clocks. If at least one agent remains a leader, the number of interactions between two subsequent passes through 0 in any agent is still whp.
Fact 2
If some interactions of the phase clock are faulty, i.e., they do not result in progression, then the phases of all agents do not become larger compared to the protocol without faults.
4 Forming a junta
In this section we describe the Forming_junta protocol. The purpose of this protocol is to rapidly elect from agents a junta of leaders assuming each agent utilises states. This junta of leaders will be used to support phase clocks and the eventual election of a unique leader.
The states of agents are represented as pairs where . The value is a non-negative integer which we refer to as level. During execution of the protocol agents with do not update their states. However, any agent with value increments its level by or changes its value to 0 during all interactions it participates in. The protocol stabilises when all agents conclude with . The transition function is defined, s.t., on the conclusion of this protocol there are agents equipped with the highest computed value whp. These agents form the desired junta of leaders.
All agents start in the same state . As agents in states do not get updated, we only need to specify how to update agents in states during pairwise interactions. The transition function at level differs from levels . When an agent in state interacts with any agent in state , the final state of the initiator is and of the responder, i.e.,
[TABLE]
When an agent in state interacts with any agent in state for levels or with an agent in state , the resulting state of is . If for any an agent in state participates in an interaction, its state changes only if acts as the responder. If the initiator is in state such that , the responder’s state becomes . If the initiator is in state such that , the responder’s state becomes .
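The transition rules above can be tried out in a small simulation. The following is only a sketch under an assumed reading of the rules: states are taken to be pairs (level, active) with every agent starting in (0, 1), a level-0 active initiator meeting a level-0 active responder moves to (1, 1) while the responder becomes (0, 0), and any other interaction of a level-0 active agent deactivates it. For higher levels an active responder climbs one level when the initiator's level is at least its own and is deactivated otherwise. The names `junta_step` and `run_forming_junta` and the concrete state encoding are hypothetical and serve illustration only.

```python
import random

def junta_step(initiator, responder):
    """One interaction under the assumed rules; states are (level, active) pairs."""
    (li, ai), (lr, ar) = initiator, responder
    new_i, new_r = initiator, responder
    # Level 0: two active agents split into one climber and one inactive agent.
    if ai == 1 and li == 0:
        if lr == 0 and ar == 1:
            return (1, 1), (0, 0)
        new_i = (0, 0)  # any other partner deactivates a level-0 agent
    # An active agent changes its state as the responder.
    if ar == 1:
        if lr == 0:
            new_r = (0, 0)
        elif li >= lr:
            new_r = (lr + 1, 1)  # initiator on the same or a higher level
        else:
            new_r = (lr, 0)      # initiator on a lower level
    return new_i, new_r

def run_forming_junta(n, seed=0, max_steps=10_000_000):
    """Run random interactions until no agent is active (or max_steps is hit)."""
    rng = random.Random(seed)
    states = [(0, 1)] * n
    reached = {}   # level -> number of agents that ever reached it
    active = n
    steps = 0
    while active > 0 and steps < max_steps:
        i, r = rng.sample(range(n), 2)  # ordered pair of distinct agents
        old = (states[i], states[r])
        new = junta_step(*old)
        for agent, before, after in ((i, old[0], new[0]), (r, old[1], new[1])):
            if after[0] > before[0]:
                reached[after[0]] = reached.get(after[0], 0) + 1
            active -= before[1] - after[1]
            states[agent] = after
        steps += 1
    return states, reached, steps

if __name__ == "__main__":
    states, reached, steps = run_forming_junta(1000, seed=1)
    top = max(level for level, _ in states)
    junta = sum(1 for level, _ in states if level == top)
    print(f"steps={steps} top level={top} junta size={junta} reached={reached}")
```

Running the script prints how many agents reached each level, which gives a feel for how quickly the counts shrink from one level to the next.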
Let be the number of agents which reach level during execution of Forming_junta. The value of depends on the execution thread of the protocol. We first prove an upper bound on .
Lemma 8
For large enough .
Proof: During an interaction of two agents in states exactly half of the participating agents increase their level to . The remaining half ends up in state which becomes their final state. During any other interaction in which an agent in state participates, changes its state to . So at least half of the agents end up in state . Finally, since the first interaction of the protocol is between two agents in states , at least one agent ends up in a state with .
Due to the reduction property of the protocol we have , and in turn there exists the last for which value . We prove that and in turn We obtain this result by limiting values of for all .
Lemma 9
Assume and , then whp .
Proof: An agent contributing to value ends up in state as soon as it gets to level during the relevant interaction . Consider the first interaction following in which acts as the responder. During this interaction the initiator is on level with probability . Thus moves to level with probability at most , as otherwise the responder would end up in state and would not contribute to . Consider now the sequence of all interactions , in which agents in state act as responders. We can attribute to this sequence a binary 0-1 sequence of length , s.t., if during interaction an agent ends up in state the respective entry in becomes 1. Otherwise, this entry becomes 1 with probability and 0 with probability . Thus the probability of each entry being is independently equal to and the number of s in is at least . The expected number of these s is . By the Chernoff bound with probability at most .
Lemma 10
If we get with probability at most .
Proof: If the probability for any agent on level to get to level is at most . Thus by the Union bound the probability of some agent getting to level is at most .
Lemma 11
There exists a constant for which if the probability of is negligible.
Proof: Consider a group of agents which move to level after this level is already reached by other agents. Any agent in this group moves to level with probability at least Since all these agents advance to level independently, the probability that is at most
[TABLE]
This last value is smaller than for large enough.
Theorem 3
In protocol the largest level for which satisfies for some constant and whp.
Proof: By Lemma 8 we have . By Lemma 9 we conclude whp . Furthermore, whp . And in general whp . Thus for we get whp. Further on by Lemma 10 the value where , equals for some constant whp, and we have . By Lemma 11 on the last level for which we have whp. Thus both conditions hold whp.
The last lemma bounds from above the running time of protocol Forming_junta.
Lemma 12
The protocol stabilises in interactions whp.
Proof: Recall from Lemma 8 that and the number of agents with the final state is at least . Each agent in this group ends up in this state during its first interaction. Since every agent interacts at least once during the first interactions of the protocol whp, all agents ending up in state do so during this time whp. One can show that an agent does not experience an interaction during the first interactions, for a constant with probability
[TABLE]
Thus there exists a positive constant for which after interactions each agent experiences its first interaction whp. Any agent that interacts as the responder with an agent in state sets its value to 0, which concludes the transition process. After at least agents are in state , the probability that the current interaction is one of such interactions w.r.t. a particular responder is at least . Thus the probability that a given agent does not have after interactions is
[TABLE]
and for the constant big enough . Thus the number of interactions needed to obtain in all agents is whp.
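The first step of the proof above, that every agent takes part in some interaction early on, rests on a standard calculation which can be spelled out as follows. This is a sketch under the assumption that the scheduler picks an ordered pair of distinct agents uniformly at random, so a fixed agent takes part in a given interaction with probability 2/n. The constant c and the use of the natural logarithm are illustrative choices, not necessarily those of the original bound.

```latex
\Pr\bigl[\text{a fixed agent takes part in none of the first } c\,n\ln n \text{ interactions}\bigr]
  = \left(1-\frac{2}{n}\right)^{c\,n\ln n}
  \le e^{-2c\ln n}
  = n^{-2c},
\qquad\text{hence}\qquad
\Pr\bigl[\text{some agent takes part in none of them}\bigr] \le n\cdot n^{-2c} = n^{1-2c}.
```

The second inequality is the union bound over all n agents, so choosing c large enough makes the failure probability an arbitrarily small inverse polynomial in n.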
Finally we prove a corollary stating that “spoiling” (defined below) the Forming_junta protocol does not affect the validity of the statements of Theorem 3 and Lemma 12. Using the notion of a spoiled protocol instead of the flawless one is needed to bound the total number of states in the leader election protocol to O(log log n). Let a spoiled protocol be any protocol obtained by spontaneously changing some states from to , where is not the highest level reached so far in the population. We denote the total numbers of agents that reach level in this spoiled protocol by and the highest level for which by . Observe that in the spoiled protocol all agents at level never go through state .
Corollary 1
Level satisfies the condition and whp. Also spoiled protocol stabilises after interactions whp.
Proof: The numbers of agents reaching level in the spoiled protocol are not larger respectively than numbers from the flawless protocol, thus . Also Lemma 11 still bounds from above by whp. Thus the running time of the spoiled protocol is not larger than the flawless one.
5 Leader election
In this section we describe how to combine the protocols described in the two previous sections to obtain a new fast population protocol for leader election. This new leader election protocol operates in O(log^2 n) (parallel) time on populations with agents equipped with O(log log n) states.
The new leader election protocol assumes that at the beginning there is a non-empty subset (possibly the whole population) of agents which are candidates for leaders, and this subset is gradually reduced to a singleton. The protocol consists of repetitions of the external loop, each formed of interactions controlled by the ordinary mode of the phase clock. This emulates a leader election scheme starting from a set of leader candidates in which during each repetition every candidate picks independently at random either bit 0 or 1 by tossing a fair coin. In real terms, the coin tossing process relies on the initiator versus responder selection performed by the random scheduler. The candidates which pick 1 broadcast message "1" to all other agents. When a candidate which chose 0 receives message "1", it stops being a candidate for the leader.
Theorem 4
The scheme proposed above selects a unique leader during repetitions whp.
Proof: If the number of candidates is at least 2, the probability that in the relevant repetition of the external loop at least half of the candidates draw 0 is at least . Consider a series of consecutive repetitions and form a binary 0-1 sequence of length in which the entries correspond to these repetitions. If prior to a repetition only one candidate remains, the entry in is chosen uniformly at random by a single coin toss. If there are more candidates drawing 1s than 0s, then the relevant entry becomes . If there is more than one candidate and at least half of them draw 0, an extra random selection is triggered, s.t., the probability of choosing 0 is exactly . Note that if the sequence has at least s, then exactly one leader remains. By the Chernoff bound the probability that contains fewer than s is smaller than and in turn smaller than for a constant large enough.
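The idealised scheme analysed in Theorem 4 is easy to simulate directly. The sketch below assumes perfectly synchronised rounds and an instantaneous broadcast, so it ignores phase clocks and epidemics entirely. The function names `elimination_round` and `elect` are hypothetical.

```python
import random

def elimination_round(candidates, rng):
    """Every candidate tosses a fair coin; if anyone drew 1, those who drew 0 drop out."""
    bits = {c: rng.randint(0, 1) for c in candidates}
    if any(bits.values()):
        return [c for c in candidates if bits[c] == 1]
    return list(candidates)  # nobody drew 1, so nobody is eliminated

def elect(num_candidates, rounds, seed=0):
    """Repeat the elimination round and record the number of candidates after each one."""
    rng = random.Random(seed)
    candidates = list(range(num_candidates))
    history = [len(candidates)]
    for _ in range(rounds):
        candidates = elimination_round(candidates, rng)
        history.append(len(candidates))
    return candidates, history

if __name__ == "__main__":
    winners, history = elect(64, 40, seed=3)
    print("survivors:", winners)
    print("candidates per round:", history)
```

Since a round only removes candidates when at least one candidate drew 1, the candidate set never becomes empty, which is the property that keeps at least one leader alive.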
The main problem with utilisation of the protocol described above is the need to implement a counter of repetitions utilising a very small memory. We also need to implement multi-broadcast of 1s which requires interactions whp. The multi-broadcast can be implemented via one-way epidemic described in Section 2. The two processes can be controlled by the phase clock run in both the external and the ordinary modes respectively, using a constant number of states. This is conditioned by forming a junta of at most leaders. In Section 4 we described the relevant protocol which reduces the number of leaders to and which utilises states at each agent. Our leader election protocol starts with a single execution of protocol Forming_junta, which is followed by the leader reduction mechanism that reduces the original junta to a single leader.
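To get a feel for the cost of the multi-broadcast, a one-way epidemic can be simulated in a few lines. This is a minimal sketch assuming the common formulation in which an infected initiator infects the responder and nothing happens otherwise. The function name `one_way_epidemic` and its parameters are hypothetical.

```python
import random

def one_way_epidemic(n, sources, seed=0):
    """Return the number of interactions until all n agents are infected."""
    rng = random.Random(seed)
    infected = [False] * n
    for s in set(sources):
        infected[s] = True
    count = sum(infected)
    interactions = 0
    while count < n:
        i, r = rng.sample(range(n), 2)  # ordered pair: (initiator, responder)
        interactions += 1
        if infected[i] and not infected[r]:
            infected[r] = True
            count += 1
    return interactions

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, one_way_epidemic(n, [0], seed=2))
```

Starting from several sources at once, as in a multi-broadcast by all leaders that drew 1, only changes the `sources` argument.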
All agents enter the leader election protocol in the same state. The current state of an agent is represented by a vector where all entries, with the exception of , have constant size descriptions. A non-negative integer refers to the number of levels bounded by . Other positions contain small integer constants , , which refer to the leadership status, and are utilised for the phase clock’s ordinary and external modes respectively. The remaining state overheads imposed by our protocol are encoded in which is limited to a constant number of values used to steer the protocol of leader elimination. Here means participation in either drawing 0,1 by leaders or spreading value 1 if 1 was drawn. Since assumes values and all other variables can have only a constant number of values, the total number of states in the protocol is . This number of states can be more precisely estimated since can be upper bounded by , where is a constant depending on . Indeed a more careful estimation based on Theorem 3 gives the number of levels needed to reduce the number of agents to whp. Complete elimination of agents would require a constant number of extra levels whp. Taking into account the number of possible values of variables we get the total number of states for depending on .
Spoiled Forming_junta protocol.
All agents start the leader election protocol with and they run Forming_junta protocol in state for as long as . As soon as gets value which is irreversible, according to Forming_junta protocol the state of the relevant agent becomes . This happens only when is not at the highest level in the population, thus the protocol Forming_junta gets spoiled this way only occasionally. The relevant detail will be described in the next paragraph. According to Corollary 1 each agent should conclude spoiled Forming_junta protocol in the first interactions whp.
Phase clocks on different levels.
Once value becomes 0, the agent starts its phase clocks on level as the leader with parameters . When an agent with phase clock on level interacts with an agent with the phase clocks on a higher level , its state is rewritten . This way the agent aligns its phase clocks in phase 0 on level and ends up in state in the spoiled variant of Forming_junta protocol. The level of the phase clock can be incremented this way many times until it attains the maximum level ever reached by the population. Thus eventually all agents run together the phase clock on level . All agents which advance to level in spoiled Forming_junta protocol are the leaders of the phase clocks and others act as followers. Note that agents that end Forming_junta protocol on level are never notified that they computed the highest level. Only agents that end Forming_junta on a lower level are forced at some point to join the protocol with level as followers. We run the phase clock in the ordinary mode and in the external mode simultaneously to implement the two loops described in the beginning of Section 3. The phase clock in the ordinary mode is driven by all interactions in which the responder has value . If the responder interacts with an initiator on a higher level it advances its clock level as described above. If the responder has the same clock level as the initiator, they both perform one interaction in the ordinary mode. If the responder interacts with the initiator on a lower level or having , then this interaction is void in the ordinary mode. The phase clock operates in the ordinary mode until it passes through 0 for the first time. And it counts for each agent the first interactions by Fact 2.
Random coin tosses.
Each remaining leader picks randomly 0 or 1 during the first interaction with a non-leader after the phase of (in the ordinary mode of the clock) passes through 0. If the non-leader is the initiator, the leader chooses 1; otherwise it picks 0. This gives a truly random value to each leader, and since there are leaders, this process is completed whp during a constant number of interactions.
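That the scheduler yields an unbiased bit can be checked by brute force on a small population. The sketch below assumes the scheduler selects an ordered pair (initiator, responder) of distinct agents uniformly at random and that the leader draws 1 exactly when the non-leader initiates. The helper names `leader_bit` and `bit_counts` are hypothetical.

```python
from itertools import permutations

def leader_bit(initiator, responder, leader):
    """Bit drawn by `leader` in this interaction, or None if it does not take part."""
    if responder == leader:
        return 1  # the partner initiated the interaction
    if initiator == leader:
        return 0  # the leader initiated the interaction
    return None

def bit_counts(n, leader, leaders):
    """Count, over all ordered pairs, how often `leader` draws 0 and 1 against non-leaders."""
    counts = {0: 0, 1: 0}
    for initiator, responder in permutations(range(n), 2):
        if leader not in (initiator, responder):
            continue
        other = responder if initiator == leader else initiator
        if other in leaders:
            continue  # only interactions with non-leaders produce a coin toss
        counts[leader_bit(initiator, responder, leader)] += 1
    return counts

if __name__ == "__main__":
    print(bit_counts(10, leader=0, leaders={0, 1, 2}))
```

Each non-leader contributes exactly one ordered pair in which it initiates and one in which it responds, so the two counts always coincide and the bit is fair.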
Leader candidate elimination on the highest level.
After choosing a value 0 or 1 at random, the leaders multi-broadcast 1s to the whole population via one-way epidemic. The required interactions are counted with the help of the phase clock in the ordinary mode. In order to obtain a unique leader whp, this process is iterated times by the external loop and controlled by the phase clock in the external mode. The protocol concludes at each agent, when its external clock attains phase . The following theorem holds.
Theorem 5
The protocol described above finds a unique leader in O(n log^2 n) interactions whp.
Now we formulate a Las Vegas variant of our algorithm to more accurately match the existing lower bound on the number of states in fast leader election [7].
Theorem 6
For agents equipped with states, there exists a leader election protocol which always gives the correct answer and works in parallel time whp.
Proof: In the Las Vegas variant of our protocol the external clock utilises the set of transitions defined as before. However, we replace by the standard maximum as we assume that the external clock stops after reaching phase . We also need to impose here the level limit in Forming_junta protocol. If this level is achieved, which occurs with a negligible probability, the agent’s level is no longer incremented as it plays the role of . In a very unlikely event an agent may interact with any other agent with a distant ordinary phase clock value, i.e., the relevant phase clock values and satisfy . In such a case we make agent utilise all subsequent interactions in its external clock as meaningful. In addition, it also infects with this setting all other agents via one-way epidemic. By Theorem 5 we can construct a fast leader election protocol with the clock phases drawn from , s.t., a single leader is elected and the external phase clocks in all agents conclude in phase during the first interactions whp. Thanks to Lemma 7 used in the proof of correctness of the relevant clock construction we can derive an extra observation that no two agents can have distant ordinary phase clock values during execution of the protocol whp.
If a leader enters external phase in the fast protocol we have just described, it can no longer be eliminated by this protocol. Independently, all agents run from the beginning a slow two-state based leader election protocol with the expected number of interactions [27]. In this slow protocol, whenever two leader candidates interact directly the initiator eliminates the responder. If a leader candidate of this slow protocol reaches phase in the external mode, it stops being a candidate for the leader, unless it is still a leader in the fast protocol. The leaders reaching external phase in the external clock eliminate other leaders in the fast protocol in direct pairwise interactions according to the slow protocol principle.
Note that all agents complete Forming_junta protocol with expected interactions. Assume this part of leader election is already completed. Let be the expected number of interactions in the leader election algorithm. We have
[TABLE]
In this formula and are the expected numbers of interactions if we start from the worst case configurations respectively not containing () and containing () distant ordinary clock phases. If we start from the configuration not containing distant ordinary clock phases, the external phase clock reaches phase in all agents or all leaders disappear during interactions whp unless an interaction between agents with distant ordinary clock phases occurs at some point. This can be proved using Lemma 7 and argument analogous to the proof of Theorem 5. In the latter case the external clock reaches phase whp in interactions (after a distant interaction takes place) unless all leaders in the fast protocol disappear. When the fast leader election protocol fails, i.e., it either produces multiple leaders or all candidates for leaders disappear, the leader election process is completed during interactions of the slow leader election protocol. Thus
[TABLE]
If we get from this inequality. When we start in the worst case configuration in which there are two agents with distant ordinary phase clock values, they meet during the first interaction of the protocol with probability at least . And when this happens, the external clock reaches phase in interactions whp and also in this case the unique leader is selected whp during interactions of the slow protocol. Thus
[TABLE]
If , we get from this inequality. And since we conclude .
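The slow two-state protocol used as the fallback in the proof above is simple enough to simulate. The sketch below assumes every agent starts as a candidate and that the scheduler picks an ordered pair of distinct agents uniformly at random; when two candidates meet, the initiator eliminates the responder. The function name `slow_leader_election` is hypothetical.

```python
import random

def slow_leader_election(n, seed=0):
    """Run the two-state protocol until one candidate remains; return flags and step count."""
    rng = random.Random(seed)
    candidate = [True] * n
    remaining = n
    interactions = 0
    while remaining > 1:
        i, r = rng.sample(range(n), 2)  # ordered pair: (initiator, responder)
        interactions += 1
        if candidate[i] and candidate[r]:
            candidate[r] = False  # the initiator eliminates the responder
            remaining -= 1
    return candidate, interactions

if __name__ == "__main__":
    for n in (50, 100, 200):
        flags, steps = slow_leader_election(n, seed=5)
        print(n, steps, flags.count(True))
```

A candidate is removed only by another candidate that stays in the race, so the protocol can never lose all candidates. The price is speed: the last two candidates alone need on the order of n^2 interactions in expectation to meet.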
6 Conclusion
We studied in this paper fast and space efficient leader election in population protocols. Our new protocol stabilises in O(log^2 n) (parallel) time when each agent is equipped with O(log log n) states. This double logarithmic space utilisation matches asymptotically the lower bound 1/2 log log n on the minimal number of states required by agents in any leader election algorithm with the running time , see [7]. For the convenience of the reader we provide the logical structure of the full argument in the form of a diagram, see Figure 1.
We also share with the reader a diagram illustrating transitions between states during leader election protocol, see Figure 2.
Further extensions.
In this paper we propose a Las Vegas type algorithm which achieves stabilisation in the sense that eventually a single (unique) agent arrives in a leader state and all other agents arrive in follower states. More precisely, in some (unlikely) scenarios the chosen leader can hover between different leader states and similarly all other agents can switch between different follower states indefinitely. The stabilisation process can also be considered in a stronger sense where the states of all agents finally freeze, as pointed out to us by Dominik Kaaser. In what follows we explain briefly how our algorithm can be modified to meet this stronger requirement.
In the enhanced variant of our algorithm on the conclusion of spoiled Forming_junta protocol we run three different phase clocks including: (1) the ordinary clock run by all leaders computed by Forming_junta protocol; (2) an external clock run by all leaders computed by Forming_junta protocol; and (3) an external clock run by all leaders that remain active.
As in the Las Vegas protocol is replaced by in external clock protocols. Similarly to the protocols proposed in this paper the leaders are eliminated using coin tossing. The outcome of coin tosses is communicated via one-way epidemic protocols controlled by the ordinary clock (1), and the number of the relevant loop repetitions is controlled by clock (3). Once a leader makes a losing coin toss (picks value 0), it is not instantly eliminated. Instead, the leader becomes inactive and awaits any motion of clock (3). When clock (3) eventually moves the leader gets eliminated. When clock (2) reaches the last phase, all agents reach the final state associated with the clock level. This process guarantees that at least one leader survives. Finally if there is more than one leader in the final state, the remaining leaders elect a unique leader using the slow leader election protocol.
This process finishes in expected time assuming flawless performance of clock (1). A problem may occur in an unlikely event when there are agents for which phases on clocks of type (1) are distant. And indeed if during an interaction of two agents their respective phase clocks of type (1) are in distant phases, these two agents inform all other agents (including leaders) about broken (out of phase) clocks via one-way epidemic. Thus all agents get to the final state associated with their clock level. As in the previous protocol if an agent on a given level (even in the final state) interacts with an agent on a higher level, it switches to this level resetting its phase clocks.
Open problems.
There are some interesting unanswered questions left for further consideration. For example, whether one can select whp a unique leader in time with states available at each agent. Another question refers to the exact space complexity of majority as well as plurality consensus in deterministic population protocols considered recently, see, e.g., [33].
Acknowledgements.
We would like to thank our research collaborators Tomasz Jurdziński, Aris Pagourtzis, Tomasz Radzik, Michał Różański, and Paul Spirakis for their suggestions in early stages of this work, as well as Dave Doty, Rati Gelashvili and Dominik Kaaser for their helpful comments on earlier versions of this paper. Special thanks go to all anonymous reviewers whose exhaustive comments helped us to improve the quality of the final version.
- [1] A. Andoni and I.P. Razenshteyn, Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing, Proc. Symposium on Computational Geometry 2016, paper 9:1-9:11.
- [2] D. Angluin, Local and global properties in networks of processors, Proc. 12th Annual ACM Symposium on Theory of Computing, STOC 1980, 82–93.
- [3] D. Angluin, J. Aspnes, Z. Diamadi, M.J. Fischer, and R. Peralta, Computation in networks of passively mobile finite-state sensors, Proc. 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC 2004, 290–299.
- [4] D. Angluin, J. Aspnes, Z. Diamadi, M.J. Fischer, and R. Peralta, Computation in networks of passively mobile finite-state sensors, Distributed Computing, 18(4), 2006, 235–253.
- [5] D. Angluin, J. Aspnes, and D. Eisenstat, Fast computation by population protocols with a leader, Distributed Computing, 21(3), 2008, 183–199.
- [6] D. Angluin, J. Aspnes, and D. Eisenstat, A simple population protocol for fast robust approximate majority, Distributed Computing, 21(2), 2008, 87–102.
- [7] D. Alistarh, J. Aspnes, D. Eisenstat, R. Gelashvili, and R.L. Rivest, Time-Space Trade-offs in Population Protocols, Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, 2560–2579.
- [8] D. Alistarh, J. Aspnes, and R. Gelashvili, Space-Optimal Majority in Population Protocols, Proc. 29th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, also arXiv:1704.04947 [cs.DC].
