The Power of Distributed Verifiers in Interactive Proofs
Moni Naor, Merav Parter, Eylon Yogev

TL;DR
This paper introduces a framework for distributed interactive proofs that significantly reduces proof size and rounds for verifying complex computations and graph properties in distributed settings.
Contribution
It develops a general method to convert standard interactive protocols into distributed ones with small proof size, improving efficiency for various computational and graph problems.
Findings
Distributed protocols for $O(n)$ time computations with $O(\log n)$ proof size.
Protocols for small space and NC computations with $O(1)$ rounds and $O(\log n)$ proof size.
Improved protocol for Graph Non-Isomorphism with 4 rounds and $O(\log n)$ proof size.
The Power of Distributed Verifiers in Interactive Proofs
Moni Naor Department of Computer Science and Applied Mathematics, Weizmann Institute of Science Israel. Supported in part by a grant from the Israel Science Foundation (no. 950/16). Incumbent of the Judith Kleeman Professorial Chair.
Merav Parter Department of Computer Science and Applied Mathematics, Weizmann Institute of Science Israel. Supported in part by grants from the Israel Science Foundation (no. 2084/18)
Eylon Yogev Department of Computer Science, Technion, Haifa, Israel. Supported by the European Union’s Horizon 2020 research and innovation program under grant agreement no. 742754.
Abstract
We explore the power of interactive proofs with a distributed verifier. In this setting, the verifier consists of $n$ nodes and a graph $G$ that defines their communication pattern. The prover is a single entity that communicates with all nodes by short messages. The goal is to verify that the graph $G$ belongs to some language in a small number of rounds, and with a small communication bound, i.e., the proof size.
This interactive model was introduced by Kol, Oshman and Saxena (PODC 2018) as a generalization of non-interactive distributed proofs. They demonstrated the power of interaction in this setting by constructing protocols for problems such as Graph Symmetry and Graph Non-Isomorphism, both of which require proofs of $\Omega(n^2)$ bits without interaction.
In this work, we provide a new general framework for distributed interactive proofs that allows one to translate standard interactive protocols (i.e., with a centralized verifier) to ones where the verifier is distributed with a proof size that depends on the computational complexity of the verification algorithm run by the centralized verifier. We show the following:
- Every (centralized) computation that can be performed in time $O(n)$ can be translated into a three-round distributed interactive protocol with $O(\log n)$ proof size. This implies that many graph problems for sparse graphs have succinct proofs (e.g., testing planarity).
- Every (centralized) computation implemented by either a small-space algorithm or a uniform NC circuit can be translated into a distributed protocol with $O(1)$ rounds and $O(\log n)$ bits of proof size for the low space case, and with $\mathrm{polylog}(n)$ many rounds and proof size for NC.
- We also demonstrate the power of our compilers for problems not captured by the above families. We show that for Graph Non-Isomorphism, one of the striking demonstrations of the power of interaction, there is a 4-round protocol with $O(\log n)$ proof size, improving upon the $O(n \log n)$ proof size of Kol et al.
- For many problems we show how to reduce the proof size below the seemingly natural barrier of $\log n$. By employing our RAM compiler, we get 5-round protocols with proof size $O(\log \log n)$ for a family of problems including Fixed Automorphism, Clique and Leader Election (for the latter two problems we actually get $O(1)$ proof size).
- Finally, we discuss how to make these proofs non-interactive arguments via random oracles.
Our compilers capture many natural problems and demonstrate the difficulty of showing lower bounds in these regimes.
1 Introduction
That rug really tied the room together.
Big Lebowski, Coen Brothers, 1991
Interactive proofs are an extension of non-determinism and have proven to be a fundamental tool in complexity theory and cryptography. Their development has led us, among others, to the exciting notions of zero knowledge proofs [GMR89, GMW91] and probabilistically checkable proofs (PCPs).
An interactive proof is a protocol between a randomized verifier and a powerful but untrusted prover. The goal of the prover is to convince the verifier regarding the validity of a statement, usually stated as membership of an instance $x$ in a language $L$. The two main requirements of the protocol are completeness and soundness. Completeness: if the statement is true and the prover is honest, the verifier should accept with high probability (or with probability one if we want perfect completeness). Soundness: if the statement is false, then for any dishonest (unbounded) prover participating in the protocol the verifier should reject with high probability (over its internal random coins). In the classical case, the prover is computationally all-powerful and the verifier runs in polynomial time. In a celebrated result, interactive proofs were shown to be very powerful, allowing efficient verification of any language in PSPACE with a polynomial-time verifier [LFKN92, Sha92]. Another striking result was the [GMW91] protocol for Graph Non-Isomorphism (GNI).
Interactive proofs are largely concerned with verifiers that are computationally bounded, but are relevant for verifiers with any sort of limitation (e.g., finite automata [DS92, Con92]). They have been studied in other settings such as communication complexity [BFS86, GPW18] and their connection to circuit complexity [KW90, AW09, Wil16] and property testing [RVW13, GR18]. Of particular interest to us are interactive proofs for graph problems in P with a presumably weaker verifier (e.g. in NC) [GKR15, RRR16] and a polynomial prover (i.e., prover restricted to polynomial computation). Our results however also capture problems that go beyond P.
One schism in interactive proofs is whether the verifier has private coins that the prover does not get to see (as in the original [GMR89]) or whether all coins are public (as in [BM88]), a setting usually denoted AM for Arthur-Merlin. Goldwasser and Sipser [GS89] gave a compiler for converting private-coin protocols into public-coin ones that is relevant for polynomial-time verifiers. When applied to the protocol of [GMW91] for Graph Non-Isomorphism it yields a two-round public-coin (AM) protocol for showing that two graphs are not isomorphic.
In this work we study interactive proofs where the verifier is a distributed system: a network of nodes that interact with a single untrusted prover. The prover sees the entire network graph while each node in the network has only a local view (i.e., sees only its immediate neighbors) in the graph. The goal of the prover is to convince the nodes of a global statement regarding the network. The two main complexity measures of the protocol (which we aim to minimize) are the number of rounds, and the size of the proof (i.e., communication bound between the network and the prover). In this context, we ask:
What is the power of interactive proofs with a distributed verifier?
The notion of interactive proofs with a distributed verifier was introduced recently by Kol, Oshman and Saxena [KOS18] as a generalization of its non-interactive version known as “distributed NP” proofs (in its various versions, e.g., [KKP10, GS16, FKP11]). The prover interacts with the nodes of the network in rounds. In each round, every node sends the prover a random challenge, and the prover responds by sending each node its response. Nodes can exchange their proofs only with their immediate neighbors in the network in order to decide whether to accept. The proof is accepted only if all nodes accept; a single rejecting node suffices to reject it.
A simple example of a “distributed NP” proof is 3-coloring of a graph: the prover gives each node in the graph its color, and nodes exchange colors with their neighbors to verify the validity of the coloring. In such a case, we say that the proof size is constant (each color can be described using two bits). Korman et al. [KKP10] introduced this notion as a “proof labeling scheme” and showed that there is a long list of problems for which a short distributed proof exists. Other problems (see also e.g. [GS16]) require proofs with $\Omega(n^2)$ bits, and thus cannot be distributed in any non-trivial manner.
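The 3-coloring example can be sketched in a few lines of Python. This is only an illustration of the local verification idea: the names `node_accepts` and `network_accepts`, the adjacency-list representation, and the sample graph are all assumptions made here, not notation from the paper.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency lists of the communication graph


def node_accepts(v: int, graph: Graph, color: Dict[int, int]) -> bool:
    """Local check of node v: it sees only its own label and its neighbours' labels."""
    if color[v] not in (0, 1, 2):  # a colour fits in two bits
        return False
    return all(color[u] != color[v] for u in graph[v])


def network_accepts(graph: Graph, color: Dict[int, int]) -> bool:
    """The proof is accepted only if every single node accepts."""
    return all(node_accepts(v, graph, color) for v in graph)


# Hypothetical example: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
honest_proof = {0: 0, 1: 1, 2: 2, 3: 0}
bad_proof = {0: 0, 1: 1, 2: 1, 3: 0}  # adjacent nodes 1 and 2 share a colour

print(network_accepts(graph, honest_proof))
print(network_accepts(graph, bad_proof))
```

Note that with the bad labels, node 0 still accepts locally; the rejection comes from nodes 1 and 2, which is enough under the "all nodes must accept" rule.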
There is a long line of research on the power of distributed proofs focusing on different notions of “proof”. For example, Göös and Suomela [GS16] studied distributed proofs that can be verified with a constant-round verification algorithm. Baruch et al. [BFP15] studied the power of a randomized verifier in distributed proofs and Fraigniaud et al. [FHK12] studied the effect of such proofs when nodes are anonymous. Feuilloley et al. [FFH16] considered the first interactive proof system which consists of three players: a centralized prover, disprover and a distributed network verifier. We further discuss these works in Section 1.3.
Kol et al. [KOS18] took an important step towards understanding the power of interaction in distributed proofs. As an analog to the class $\mathsf{AM}$ (Arthur-Merlin), they defined the class $\mathsf{dAM}[k]$ to contain all $n$-vertex graph problems that admit a two-message protocol where the communication between the prover and each node in the network is bounded by $k$. As in AM, the protocols in this class must be “public-coin”, that is, the nodes' messages to the prover are simply independent random bits (no other randomness is allowed). The class $\mathsf{dMAM}[k]$ is defined similarly for three-message protocols (and so forth), and in general $\mathsf{dIP}[r, k]$ denotes protocols with $r$ rounds and communication complexity bounded by $k$.
Their main positive results are for two problems, Sym and GNI, which have an $\Omega(n^2)$-bit lower bound in the non-interactive setting [GS16]. In the Sym problem, the network should decide whether the network graph has a non-trivial automorphism. In the GNI problem, the goal is to decide whether the network graph is not isomorphic to an additional input graph. Specifically, they show that $\mathsf{Sym} \in \mathsf{dMAM}[O(\log n)]$ and that $\mathsf{GNI} \in \mathsf{dAMAM}[O(n \log n)]$. This is a huge improvement over the $\Omega(n^2)$ lower bound for the non-interactive version of these problems. On the hardness/impossibility side they show an (unconditional) lower bound for the Sym problem for two-message protocols: if $\mathsf{Sym} \in \mathsf{dAM}[k]$ then $k = \Omega(\log \log n)$. (Footnote 1: The authors of [KOS18] also reported an improved lower bound; see [Osh18].)
1.1 Our Results
The Model. We follow the distributed interactive proof model of Kol et al. [KOS18]: the protocol proceeds in rounds in which nodes exchange (short) messages with the prover as well as with their neighbors in the graph. The messages that the nodes send to each other are essentially the proofs they received from the prover. Thus in the model of [KOS18] the nodes are assumed to get from the prover their own proof as well as the proofs of their neighbors. Note that in the distributed interactive setting, a proof size of $O(n^2)$ bits is a trivial upper bound for all graph problems, since the nodes are computationally unbounded (i.e., only their information on the network is bounded). Our key results demonstrate that many natural graph problems on sparse graphs admit logarithmic-size proofs. In addition, it is also possible to go below the $\log n$ regime (even for dense graphs), and obtain $O(\log \log n)$-size proofs for a wide class of problems.
General Compiler for RAM-Verifiers.
One of our key contributions is in presenting general methods for converting “standard” interactive proofs (i.e., proofs where the verifier is a centralized algorithm) to protocols with a distributed verifier. The cost of this transformation in terms of the proof size depends on the computational complexity of the centralized verification algorithm. Our first result concerns a RAM-verifier (i.e., where the verification algorithm runs on a RAM machine). We show a general compiler that takes any public-coin protocol with a RAM-verifier and transforms it into a distributed interactive protocol whose proof size is governed by the verification complexity, that is, the number of operations over $O(\log n)$-bit words performed by the verification algorithm. Specifically, for a verifier that runs in time $O(n)$ we obtain a three-round distributed protocol with $O(\log n)$ proof size.
Theorem 1.
Let be an -round public-coin protocol for languages of -vertex graphs where the verifier is a RAM program with running time , then . In particular, if and the verifier runs in time then .
The benefit of this compiler is its generality: the transformation works for any problem while paying only in the running time of the verifier. This is particularly useful when the graph is sparse. For instance, it is possible to verify whether a graph is planar with $O(\log n)$ proof size using the linear-time algorithm for planarity [HT74]. Any other linear-time algorithm on sparse graphs can be applied as well. As we will see next, this compiler is used as a basic building block in many of our protocols, even for those that concern dense graphs, and even for those that go below the $\log n$ regime. One of the most notable examples of the usefulness of this compiler is for the problems of graph non-isomorphism and related variants.
Graph Isomorphism and Asymmetry with $O(\log n)$-bit Proofs.
We combine our linear-RAM compiler with the well-known Goldwasser-Sipser GNI protocol [GS89, GMW91]. Note that the GNI problem involves two graphs, and its definition in the distributed setting might be interpreted in two ways (either the second graph is also a communication graph, or not). For this reason, we start by considering an almost equivalent problem of “graph asymmetry” (Asym) where the prover wishes to prove that the communication graph has no (non-trivial) automorphism. The protocol for GNI can be naturally adapted to this problem as well. Since the running time of the verifier in the (centralized) protocol is linear in the size of the graph, applying our compiler immediately yields protocols with $O(n \log n)$ proof size for Asym and for GNI (for any definition of GNI), which matches the result of [KOS18] for the same problem.
To achieve the desired bound of $O(\log n)$ proof size, we will not use the compiler as a black box. Instead our strategy is based on first reducing the problem to one that is verifiable in linear time (in the number of vertices) using $O(\log n)$ bits of proof. Then in the second phase, we apply the RAM compiler on this reduced problem, again using proofs of size $O(\log n)$ bits. Our end result is a protocol with $O(\log n)$ proof size for Asym, an exponential improvement over the protocol of Kol et al. This also applies to the GNI problem where both graphs are part of the communication network (the case where they do not correspond to the communication graph is discussed below). In contrast, for proof labeling schemes there is an $\Omega(n^2)$ lower bound [GS16].
Theorem 2.
$\mathsf{Asym} \in \mathsf{dAMAM}[O(\log n)]$, and $\mathsf{GNI} \in \mathsf{dAMAM}[O(\log n)]$.
One of the tools used for the compiler is a protocol for the permutation problem Permutation. Here, each node has a value and we need to verify that these values form a permutation of $\{1, \dots, n\}$. We give a dAM protocol for this problem using proofs of size $O(\log n)$. This was posed as an open problem by the authors of [KOS18]. (Footnote 2: The problem was posed in the Interactive Complexity workshop at the Simons Institute [Osh18].)
Theorem 3.
$\mathsf{Permutation} \in \mathsf{dAM}[O(\log n)]$.
Compilers for small space and low depth verifiers.
If we allow even more rounds of communication, then we can capture a richer class of languages. Specifically, we show how to leverage the RAM compiler to transform the protocols of Goldwasser, Kalai and Rothblum [GKR15] and Reingold, Rothblum and Rothblum [RRR16] into distributed protocols. The result is that any low-space (and poly-time) computation can be compiled to a constant-round distributed protocol with $O(\log n)$ proof size, and any “uniform NC” computation (circuits of polylog depth, polynomial size and unbounded fan-in) can be compiled into a distributed protocol with $\mathrm{polylog}(n)$ rounds and proof size. The main work performed by the verifier in both of these protocols is interpreting the input as a function and evaluating its low degree extension at a random point. We show how to implement this using a distributed verifier; see Section 6 for details. When the computation being verified is performed by a low-depth (uniform) circuit, the number of rounds we need is proportional to the depth of the circuit.
Theorem 4.
Let $L$ be a language.
1. There exists a constant such that if $L$ can be decided in time and space then .
2. If $L$ is in uniform NC then .
This can be used in turn for the GNI problem to obtain a protocol even for the case where one of the graphs does not correspond to the communication graph. Another example is verifying that a tree is a minimum spanning tree (MST). One can verify that a tree is an MST by a centralized algorithm with small space. Thus, we get a constant-round protocol for MST with $O(\log n)$ proof size. Without interaction, there are matching upper and lower bounds of $\Theta(\log n \log W)$, where $W$ is an upper bound on the weights [KK07].
1.2 Below the $\log n$-Regime
At this point, there is still a gap between our above-mentioned results and the lower bound of [KOS18]. One reason for this gap is that constructing protocols with $o(\log n)$ proofs seems quite hard. The prover is somewhat limited, as basic operations such as pointing to a neighboring node, counting, or specifying a node ID all require $\log n$ bits.
Perhaps surprisingly, we show that using our RAM compiler with additional rounds of interaction can lead to an exponential improvement in the proof size for a large family of graph problems. Obtaining these improved protocols calls for developing a totally new infrastructure that can replace the basic $O(\log n)$-primitives (i.e., with a logarithmic proof size) with equivalent $O(\log \log n)$-primitives (e.g., verifying a spanning tree). While these do not yield a full RAM compiler, they are quite general and can be easily adapted to classical graph problems. Two notable examples are DSym and problems that can be verified by computing an aggregate function of the vertices.
The DSym problem is similar to the Sym problem except that the automorphism is fixed and given to all nodes. This problem was studied by [KOS18], who showed that $\mathsf{DSym} \in \mathsf{dAM}[O(\log n)]$ but that any distributed NP proof for DSym requires a proof of size $\Omega(n^2)$. We show that using a five-message protocol we can reduce the proof size to $O(\log \log n)$:
Theorem 5.
.
Depending on the problem, our techniques can be used to get even smaller proofs. In particular, if the aggregate function is over constant-size elements then the proof can be of constant size. For example, we show that the Clique problem can be solved using a proof of size $O(1)$ in only three rounds. In contrast, without interaction, there is an $\Omega(\log n)$ lower bound [KKP10].
Corollary 1.
.
For instance, we show an protocol for proving that the graph is not two-colorable. This is in contrast to the non-interactive setting [GPW18], which requires bits for this problem. Another interesting example is the “leader election” problem, where it is required to verify that exactly one node in the network is marked as a leader. As this problem can also be cast as an aggregate function of constant-size elements, we get:
Corollary 2.
.
Argument Labeling Schemes.
Can the interaction be eliminated? The simple answer for that is no! We have observed by now several examples where few rounds of interactions break the non-interactive lower bounds (e.g. for Symmetry and Asymmetry). However, this does not seem to be the end of the story. In the centralized setting there are various techniques for eliminating interaction from protocols, especially public-coins ones. A “standard” such technique is the Fiat-Shamir transformation or heuristic that converts a public-coins interaction to one without an interaction. Here, we assume that parties have access to a random oracle, and that the prover is computationally limited: it can only perform a bounded number of queries to the random oracle. In such a case, we end up with an “argument” system rather than with a “proof” system. In an argument system proofs of false statements exist but it is computationally hard to find them. Therefore, such protocols do not contradict the lower bounds for proof labeling schemes. We call such a protocol an “argument labeling scheme”. These systems can have significant savings in distributed verification systems. More details are in Section 8.
1.3 Related Work
The concept of distributed-NP is quite broad and contains (at least) three frameworks. This area was first introduced by Korman-Kutten-Peleg [KKP10] that formalized the model of proof-labeling schemes (PLS). In their setting, communications are restricted to happen exactly once between neighbors. A more relaxed variant is locally checkable proofs (LCP) [GS16] introduced by Göös and Suomela which allows several rounds of verification in which nodes can also exchange their inputs. The third notion which is also the weakest is non-deterministic local decision (NLD) introduced by Fraigniaud-Korman-Peleg [FKP11]. In NLD the prover cannot use the identities of the nodes in its proofs, that is the proofs given to the nodes are oblivious to their identity assignment.
We note that when allowing prover–verifier interaction some of the differences between these models disappear. At least in the -proof regime, using more rounds of interactions allows the nodes to send their IDs to the prover, and the prover can use these IDs in its proofs. Our protocols with -bit proofs are not based on the actual identity assignment, but rather only on their port ordering.
Prior to the distributed interactive model of [KOS18], Feuilloley, Fraigniaud and Hirvonen [FFH16] considered the first interactive proof system which consists of three players: a centralized prover, a decentralized disprover and a distributed verifier (the network). This model gives considerably more power to the verifier as it can get some help from the strong disprover. [FFH16] showed that such interaction between a prover and a disprover can considerably reduce the proof size. The most dramatic effect is for the nontrivial automorphism problem which requires bits with no interaction, but can be verified with bits with two prover–disprover rounds.
Very recently, Feuilloley et al. [FFH+18] considered another generalization of [KOS18] where instead of allowing several rounds of interaction between the prover and the verifier, they allow several verification rounds. That is, the prover gives each node a proof in the first round, it then disappears and the nodes continue to communicate for many rounds. They showed that for several “simple” graph families such as trees, grids, etc., every proof labeling with $k$ bits can be made an $O(k/t)$-bit proof when allowing $t$ verification rounds. Note that our distributed protocols can simulate such a scheme, but since our protocols use a small number of interactive rounds, the reduction in the proof size that we get from the framework of [FFH+18] is negligible.
2 Our Techniques
2.1 The RAM Program Compiler
Many non-interactive distributed proofs (known as “proof labels”) [KKP10] are based on the basic primitive of verifying that a given marked subgraph is a spanning tree [AKY97]. In particular, in most of these applications, the subgraph itself is given as part of the proof to the nodes (i.e., each vertex gets its parent in the tree). That is, the prover computes a spanning tree for the vertices to facilitate the verification of the problem at hand (e.g., cliques, leader election, etc.). Indeed, throughout we will use the prover to help the network perform various computations that facilitate the verification of the problem at hand. We start by briefly explaining the proof labeling of spanning trees, which becomes useful in our compiler as well.
A Spanning Tree.
The proof consists of several fields, which will be explained one by one along with their roles. The first field in the proof given to a node $v$ is its parent $p(v)$ in the tree. This can be indicated by sending the port number that points to its parent. Let $T$ be the graph defined by the parent pointers. We must verify that $T$ is indeed a tree (i.e., contains no cycles) and that it spans $G$. To verify that there are no cycles, the second field in the proof of $v$ contains its distance to the root in the tree, $d(v)$. The root should be given distance 0, and each node $v$ verifies with its parent in the tree that $d(v) = d(p(v)) + 1$. If there is a cycle in $T$, then no assignment of values $d(\cdot)$ can satisfy this requirement for all nodes on the cycle. Finally, to be able to verify that $T$ spans $G$, the third field in the proof is the ID of the root. Nodes verify with their neighbors that they have the same root ID. If $T$ does not span $G$ then there must be two trees with two different roots. Since the graph is connected there must be an edge from one tree to the other, which will expose the inconsistency of the root IDs.
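The three fields and their local checks can be sketched as follows. This is a minimal illustration under assumptions made here: nodes are identified by integers that double as their IDs, the parent is given as a node identifier rather than a port number, and the function names are hypothetical.

```python
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]  # adjacency lists of a connected graph


def node_accepts(v: int, graph: Graph, parent: Dict[int, Optional[int]],
                 dist: Dict[int, int], root_id: Dict[int, int]) -> bool:
    """Local check of node v using only its own fields and those of its neighbours."""
    # Third field: every neighbour must name the same root.
    if any(root_id[u] != root_id[v] for u in graph[v]):
        return False
    if parent[v] is None:
        # A node claiming to be the root must have distance 0 and be the named root.
        return dist[v] == 0 and root_id[v] == v
    p = parent[v]
    # First and second fields: the parent is a real neighbour, one step closer to the root.
    return p in graph[v] and dist[v] == dist[p] + 1


def tree_accepted(graph, parent, dist, root_id) -> bool:
    return all(node_accepts(v, graph, parent, dist, root_id) for v in graph)


# Hypothetical example: a 4-cycle 0-1-2-3-0.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

honest = dict(parent={0: None, 1: 0, 2: 1, 3: 0},
              dist={0: 0, 1: 1, 2: 2, 3: 1},
              root_id={0: 0, 1: 0, 2: 0, 3: 0})
# Parent pointers that form a cycle: no distance labels can be consistent.
cyclic = dict(parent={0: 1, 1: 2, 2: 3, 3: 0},
              dist={0: 0, 1: 1, 2: 2, 3: 3},
              root_id={0: 0, 1: 0, 2: 0, 3: 0})
# Two disjoint trees with different roots: caught on an edge between them.
two_roots = dict(parent={0: None, 1: 0, 2: None, 3: 2},
                 dist={0: 0, 1: 1, 2: 0, 3: 1},
                 root_id={0: 0, 1: 0, 2: 2, 3: 2})

print(tree_accepted(cycle4, **honest))
print(tree_accepted(cycle4, **cyclic))
print(tree_accepted(cycle4, **two_roots))
```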
The tree is used as a basic component in many protocols as it allows summing values held by each node (or computing other aggregate functions). For example, suppose we want to compute $\sum_{v} x_v$ where $x_v$ is some number that is known to node $v$. Let $T_v$ be the subtree of $T$ rooted at $v$. We can use the prover to help us in this computation. Since the prover is untrusted, we will also need to verify this computation. This is done as follows. The prover sends $v$ the value $s_v$, the sum of the values in the subtree $T_v$. Then, $v$ verifies that $s_v$ is consistent with the values given to its children in the tree. That is, $s_v = x_v + \sum_{i} s_{u_i}$ where $u_1, u_2, \dots$ are its children (the leaves have no children). If all values are consistent then we know that the root of the tree has the desired value $\sum_{v} x_v$. We call such a procedure “summing up the tree”, as it will be useful later on in different contexts.
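A short sketch of this aggregation check is given below. It assumes the spanning tree has already been verified and is given as a parent map; the names `children_of`, `node_accepts_sum`, and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def children_of(v: int, parent: Dict[int, Optional[int]]) -> List[int]:
    """Children of v in the (already verified) spanning tree."""
    return [u for u, p in parent.items() if p == v]


def node_accepts_sum(v: int, parent, x: Dict[int, int], s: Dict[int, int]) -> bool:
    """Node v checks its claimed subtree sum against its own value and its children's claims."""
    return s[v] == x[v] + sum(s[u] for u in children_of(v, parent))


def sum_accepted(parent, x, s) -> bool:
    return all(node_accepts_sum(v, parent, x, s) for v in parent)


# Hypothetical tree rooted at 0: 0 -> {1, 3}, 1 -> {2}.
parent = {0: None, 1: 0, 2: 1, 3: 0}
x = {0: 5, 1: 1, 2: 2, 3: 7}

honest_s = {2: 2, 1: 3, 3: 7, 0: 15}   # true subtree sums
forged_root = {2: 2, 1: 3, 3: 7, 0: 16}  # the root's claim is inflated
forged_leaf = {2: 3, 1: 4, 3: 7, 0: 16}  # inflated consistently above a lying leaf

print(sum_accepted(parent, x, honest_s), honest_s[0] == sum(x.values()))
print(sum_accepted(parent, x, forged_root))
print(sum_accepted(parent, x, forged_leaf))
```

In the last case every internal node is consistent with its children, and the forgery is caught only at the leaf, which compares the claim with its own input.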
A Reduction to Set Equality.
Our main observation is that obtaining a general RAM compiler reduces to a specific problem, Set Equality. Let $\pi$ be a standard interactive protocol (with a centralized verifier). We construct a distributed protocol as follows. First, we let the prover compute a spanning tree of the graph as described above, and assign IDs in the range $1$ to $n$ to the nodes in the graph. The correctness of the spanning tree is verified as in the labeling scheme described above. We later describe how to also verify the correctness of the consecutive IDs, which we also solve by a reduction to set equality.
The high-level idea is to use the fact that the protocol is public-coin, which allows the prover to run the centralized verifier on its own. We now need the prover to convince the network that it simulated the verification algorithm correctly. For that purpose the verification of the RAM computation made by the prover is distributed among the nodes. Since the centralized RAM program consists of $O(n)$ steps, each vertex can be in charge of locally verifying a constant number of steps in this program. To verify that the computation is correct globally, we will reduce the problem to Set Equality.
We now explain this in more detail. Let $v_1, \dots, v_n$ be the nodes ordered by their assigned IDs. Given this ordering, we can split the communication between the prover and verifier into $n$ equal parts, where node $v_i$ is responsible for communicating and storing the responses of the $i$-th chunk of the messages. Since $\pi$ is a public-coin protocol, the messages to the prover from each node are simply random coins. Finally, we need to simulate the verifier of $\pi$ by a distributed protocol. We assume that the verifier is implemented by a RAM program.
Consider a RAM program $M$. An execution of $M$ can be described as a sequence of read and write instructions to a memory, where each operation consists of a short local state and a triplet $(\mathit{val}, \mathit{addr}, t)$, where $\mathit{val}$ is the value (read from memory or written to memory), $\mathit{addr}$ is the address in the memory, and $t$ is the timestamp of the memory cell (i.e., when it was last changed). We set the size of a cell to be $O(\log n)$ bits so that the state and the triplet of each operation can be represented by $O(\log n)$ bits.
Let $R$ be the set of all read triplets and let $W$ be the set of all write triplets. Note that in general it might be that $R \neq W$, e.g., if a cell is written once but read multiple times. Following [BEG+94] in the context of memory checking, we can transform any program into a canonical form in which an honest execution satisfies $R = W$, while paying only a constant factor in the running time. We assume from here on that $M$ is given in this canonical form. Thus, $R$ and $W$ describe an honest execution of the program if and only if $R = W$.
With this in mind, we can design the final step. Let $\tau$ be the running time of the verifier. The prover runs the verifier and writes down the list of triplets and local states of the program $M$. Each node is responsible for $\tau / n$ steps of the program, and the prover distributes the triplets and states of each instruction to the nodes accordingly. Each node that is responsible for step $i$ verifies that the states and triplets are consistent with the instructions of step $i$ in the program $M$. What the node cannot verify locally is that the values read from the memory are consistent with the program. That is, we are left to verify that the two sets $R$ and $W$ defined by the triplets are equal (as multisets). In other words, we need a protocol for the problem SetEquality.
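The read-set and write-set idea can be illustrated with a small simulation. The sketch below is an assumption-laden illustration in the spirit of offline memory checking, not the paper's construction: it uses one possible canonical form (an initial write to every cell, a write-back after every access, and a final read of every cell), and the function `trace` and its `lie_at` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple


def trace(ops: List[tuple], num_cells: int,
          lie_at: Optional[int] = None) -> Tuple[Counter, Counter]:
    """Return the multisets of read and write triplets (value, address, timestamp).

    ops contains ("r", addr) and ("w", addr, value). If lie_at is set, the value
    reported for the read at that step is corrupted, modelling a dishonest transcript.
    """
    memory = {}
    reads, writes = Counter(), Counter()
    clock = 0
    for addr in range(num_cells):  # initial write of every cell
        memory[addr] = (0, clock)
        writes[(0, addr, clock)] += 1
    for step, op in enumerate(ops):
        clock += 1
        kind, addr = op[0], op[1]
        value, stamp = memory[addr]
        if step == lie_at:
            value += 1  # the reported value does not match what was written
        reads[(value, addr, stamp)] += 1
        new_value = op[2] if kind == "w" else value
        memory[addr] = (new_value, clock)  # every access writes back with a fresh timestamp
        writes[(new_value, addr, clock)] += 1
    for addr in range(num_cells):  # final read of every cell
        value, stamp = memory[addr]
        reads[(value, addr, stamp)] += 1
    return reads, writes


ops = [("w", 0, 5), ("r", 0), ("w", 1, 9), ("r", 1)]
honest_reads, honest_writes = trace(ops, num_cells=3)
bad_reads, bad_writes = trace(ops, num_cells=3, lie_at=1)

print(honest_reads == honest_writes)
print(bad_reads == bad_writes)
```

In this form each written triplet is consumed by exactly one later read, so an honest run yields equal multisets, while a corrupted read produces a triplet that was never written.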
A Protocol for Set Equality.
As we have shown, a protocol for Set Equality is the basis for the compiler. In fact, this protocol is used for other problems as well, so we describe it in its generality. Assume each node $v$ has inputs $a_v$ and $b_v$, which are bit strings, and let $A = \{a_v\}_{v}$ and similarly $B = \{b_v\}_{v}$. We want to verify that $A = B$ as multisets. We describe here a three-message protocol for this problem, which captures the main ideas. In Section 4.1 we show how this protocol can be compressed to two messages, and we also show how to support each node holding two lists of several elements (instead of single elements $a_v$ and $b_v$).
In the first message, we let the prover compute a spanning tree $T$ of the graph along with a proof as described above. Then, to check that $A = B$ we define polynomials $P_A$ and $P_B$ over a field $\mathbb{F}$ whose size is much larger than $n$ as follows:
$$P_A(x) = \prod_{v} (x - a_v), \qquad P_B(x) = \prod_{v} (x - b_v).$$
As we show in the analysis, it holds that $A = B$ if and only if $P_A \equiv P_B$. Moreover, since the polynomials have low degree compared to the field size, in order to check whether they are equal it suffices to compare them on a random field element (if the two polynomials are different they can agree on at most $n$ elements of the field).
We let the root of the tree sample a random field element $s$ and send it to the prover. The prover sends $s$ to all nodes of the graph. Nodes compare $s$ with their neighbors to verify that everyone has the same element. Then, we are left with evaluating the two polynomials $P_A(s)$ and $P_B(s)$. To compute these values we use the spanning tree $T$, and compute them “up the tree”. We let the prover give each node $v$ the evaluation of the polynomials on the subtree $T_v$, that is, $P_A^{v}(s) = \prod_{u \in T_v} (s - a_u)$ and $P_B^{v}(s) = \prod_{u \in T_v} (s - b_u)$. Nodes check consistency with their children in the tree to ensure that all partial evaluations are correct. That is, they check that
$$P_A^{v}(s) = (s - a_v) \cdot \prod_{i} P_A^{u_i}(s), \qquad P_B^{v}(s) = (s - b_v) \cdot \prod_{i} P_B^{u_i}(s),$$
where $u_1, u_2, \dots$ are the children of $v$ in the tree. Finally, the root of the tree holds the two complete evaluations $P_A(s)$ and $P_B(s)$ and verifies that $P_A(s) = P_B(s)$. If all verifications pass then we know that with high probability $A = B$.
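The polynomial check can be sketched as follows. This is an illustration under assumptions chosen here: the field is the integers modulo the prime $2^{61} - 1$ (the field size is an arbitrary choice for the sketch), the spanning tree is given as a children map and assumed already verified, and all function names are hypothetical.

```python
import secrets
from typing import Dict, List

P = 2**61 - 1  # prime modulus; the concrete field is an assumption of this sketch


def subtree_evals(root: int, children: Dict[int, List[int]],
                  vals: Dict[int, int], s: int) -> Dict[int, int]:
    """Honest prover: label[v] is the product of (s - vals[u]) over u in the subtree of v."""
    label: Dict[int, int] = {}

    def visit(v: int) -> int:
        acc = (s - vals[v]) % P
        for u in children[v]:
            acc = acc * visit(u) % P
        label[v] = acc
        return acc

    visit(root)
    return label


def node_accepts(v, children, vals, label, s) -> bool:
    """Node v recomputes its label from its own element and its children's labels."""
    expected = (s - vals[v]) % P
    for u in children[v]:
        expected = expected * label[u] % P
    return label[v] == expected


def verify_set_equality(root, children, a, b, label_a, label_b, s) -> bool:
    local_ok = all(node_accepts(v, children, a, label_a, s)
                   and node_accepts(v, children, b, label_b, s) for v in children)
    return local_ok and label_a[root] == label_b[root]  # the root compares the two totals


# Hypothetical tree rooted at 0: 0 -> {1, 2}, 1 -> {3}.
children = {0: [1, 2], 1: [3], 2: [], 3: []}
a = {0: 4, 1: 7, 2: 7, 3: 1}
b_same = {0: 7, 1: 1, 2: 4, 3: 7}   # same multiset as a, held by different nodes
b_diff = {0: 7, 1: 1, 2: 4, 3: 4}   # a different multiset

s = secrets.randbelow(P)  # the root's random challenge
print(verify_set_equality(0, children, a, b_same,
                          subtree_evals(0, children, a, s),
                          subtree_evals(0, children, b_same, s), s))
```

When the multisets differ, an honest evaluation makes the root's comparison fail unless $s$ happens to be a root of $P_A - P_B$, and a prover that forges a label to hide the difference is caught by the local consistency check at that node.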
Assigning IDs.
In the description above we assumed that unique IDs in the range $1$ to $n$ are honestly generated. We show that this assumption is without loss of generality. Let the ID of node $v$ be $\mathrm{id}_v$. Each node verifies that $1 \le \mathrm{id}_v \le n$ ($n$ is known to all nodes). We want to verify that the IDs are all distinct. That is, we want to verify that the $\mathrm{id}_v$'s are a permutation of $\{1, \dots, n\}$. This is also called the Permutation problem.
This is solved by reducing it to the Set Equality problem. Each node sets . Let and . Our key observation here is that the ’s are distinct if and only if . Thus, we run the set equality protocol on and . Note that this can be performed in parallel to the compiler’s protocol, and thus does not add to the round complexity.
2.2 Asymmetry and Graph Non-Isomorphism
We first give a short description of a standard (centralized) interactive protocol for Asym which is a simple adaptation of the public-coin protocol for graph non-isomorphism [GS89, GMW91] (see also [BM88]). From here on we denote this protocol by the “GNI protocol”. Then we show how to transform it into a distributed protocol.
Let be the set of all graphs that are isomorphic to . That is, . The main observation of the GNI protocol which follows here directly is that if has no (non-trivial) automorphism then while if does have an automorphism then . Thus, the focus of the protocol is on estimating the size of .
The verifier samples a hash function , where is roughly and sends it to the prover. The prover seeks a graph such that . The main observation is that the probability that such a graph exists is higher when is larger, which allows the verifier to distinguish between the cases of . That is, the verifier will accept if .
Let us begin with an immediate solution for sparse graphs. Suppose that the graph is sparse (has edges) and thus can be represented by bits. One can observe that in this case the total communication of the “GNI protocol” is linear in the input size, that is, and thus can be distributed among the nodes such that each node gets bits. Finally, the verifier is required to compute the hash function . We need a very fast (linear-time) pairwise hash function for this. Luckily, Ishai et al. [IKOS08] (see Corollary 3) constructed such a hash function that can be computed in operations over words of size . Thus, applying our RAM compiler with this hash function gives a protocol for the problem: the first message is sending and messages 2-3 are sending and verifying that .
The protocol above of course works only for sparse graphs as they have a small representation. While graphs in general have a representation of size roughly , since the size of the set is at most , any graph in can be indexed to have size . Thus, we want to hash the set using a hash function to a set such that and each element in is represented using bits. While this approach is simple, it has a major caveat: computing is exactly the task we wished to avoid! However, there is an important difference: the function had exactly bits of output whereas has for a large constant . This slackness in the constant lets us compose a special hash function that can be computed locally. Then, we will apply to the smaller elements of and compute it using the RAM compiler as before. Together, we will verify that .
In more detail, our hash function will be composed of hash functions. Each node chooses a seed for an -almost-pairwise hash function
[TABLE]
where . The seed length of is bits. Let be the chosen hash function ordered by the index of the nodes. Let where is the indicator vector for the neighbors of node in . Then, we define a hash function as
[TABLE]
Using we can define the set . It is easy to see that . The fact that is locally computable means that it has a very bad collision probability. If two inputs differ only on a single bit then the probability that they collide depends only on a single , which is rather small compared to the total range of . To show that there will be no collisions under we exploit the specific properties of . The key point is that contains only graphs that are all isomorphic to each other and hence there are not many isomorphic graphs that differ only on a small part. This lets us bound the collision probability of two graphs as a function of their Hamming distance and union bound over the number of isomorphic graphs of distance . We show that with high probability we have that and thus we can apply the protocol for instead of .
Graph Non-Isomorphism.
The end result is a protocol for Asym in . In Section 5.1 we show how to adapt this protocol for GNI, where we assume that in the GNI problem formulation nodes can communicate on both graphs and . We note that while this improves upon the of [KOS18], our protocol works only when the GNI problem is defined such that nodes can communicate on both graphs and . The protocol of [KOS18] works also for the definition of GNI where only is the communication graph and is given as input to the nodes. That is, each node is given a list of its neighbors in but cannot communicate with them directly. This is not an issue when the proof complexity is as the prover can send each node its neighbors in the graph . However, when restricting the communication size to this raises many difficulties, which seem hard to overcome.
2.3 A Compiler for Small Space and Low Depth
We describe how to get a compiler for small space computation (Item 1 in Theorem 4). The main tool behind the construction is the interactive protocol of Reingold, Rothblum and Rothblum [RRR16]. They show that for every statement that can be evaluated in polynomial time and bounded-polynomial space there exists a constant-round (public-coin) interactive protocol with an (almost) linear verifier. This is an excellent starting point for us, as our RAM compiler is most efficient for linear verifiers.
There is a subtle point here, however. Linear time in [RRR16] is with respect to the size of the graph, i.e., , whereas linear time for our RAM compiler is with respect to the number of vertices . To handle this, we first reduce the running time of the centralized verifier to before applying our RAM compiler. Indeed, as already observed in [RRR16], the running time of the verifier can be made sublinear (e.g., for some small constant ) if the verifier is given oracle access to a low degree extension of the input (the input is the graph and possibly additional individual inputs held by each node). Our protocol will run the RAM compiler on this sublinear version of the verifier while providing it this query access. Luckily, evaluating a point of a low degree extension of the input is a task that is well suited for a distributed system, as it is a linear function of the input and hence can be computed “up the tree” using the prover. Thus, the [RRR16] protocol can be compiled to a distributed one with a constant number of rounds and proof size.
A protocol with the same properties is given by Goldwasser, Kalai and Rothblum [GKR15] in the context of low depth circuits (as opposed to small space). Let the class “uniform NC” be the class of all languages computable by a family of -space uniform circuits of size and depth . They showed that for any language computable by “uniform NC” there is a public-coin interactive protocol where the verifier runs in time given oracle access to a low degree extension of the input and the communication complexity is . Using the same approach as we did for the [RRR16] protocol, we can also compile this protocol to a distributed one with a polylogarithmic number of rounds and proof size.
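To see why a point of a low degree extension is a linear function of the input, consider the multilinear extension, which is one common choice of low degree extension (an assumption here; the parameters used in [RRR16] may differ). In the sketch below each input position contributes one term, so the sum can be accumulated over subtrees. The modulus `P` and the function names are illustrative.

```python
P = (1 << 61) - 1  # prime modulus standing in for the field (assumption)

def lagrange_weight(index, point):
    """Product over coordinates j of point[j] if bit j of index is 1, else 1 - point[j]."""
    w = 1
    for j, z in enumerate(point):
        bit = (index >> j) & 1
        w = w * (z if bit else (1 - z)) % P
    return w

def node_contribution(index, value, point):
    """The single term that the holder of input position `index` adds to the sum."""
    return value * lagrange_weight(index, point) % P

def extension_at(values, point):
    """Multilinear extension of `values` (length 2**len(point)) evaluated at `point`."""
    return sum(node_contribution(i, v, point) for i, v in enumerate(values)) % P
```

Because the evaluation is a plain sum of per-position terms, partial sums over disjoint sets of positions add up to the full value, which is what allows the tree-based aggregation described above.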
2.4 Below the Barrier
To construct protocols with proofs, we need to re-develop the basic “distributed NP” primitives only with a proof size in the required regime. Similar to the generality of the basic tree construction in distributed NP proofs, these tools are useful for many problems.
Constructing a Spanning Tree.
We begin by showing how to compute a spanning tree in the graph using only bits. We let the prover compute a BFS tree in the graph. However, the prover cannot even give a node its parent in the graph, let alone prove its validity.
We take a different approach, using the specific properties of a BFS tree. If a node $u$ is in level $i$ in the BFS tree, then its neighbors are all in level $i-1$, $i$, or $i+1$. Thus, we let the prover give each node its distance from the root modulo 3. This gives each node sufficient information to divide its neighbors into three groups: neighbors in the same level as $u$, neighbors that are one level closer to the root, and neighbors that are one level below. The node defines its parent to be its neighbor in level $i-1$ with the minimal port number (all neighbors of each node are ordered by an arbitrary port numbering that is known to the prover). This way, each node has a defined parent in the graph, unless it has no neighbors in level $i-1$, which means that it is the root.
Let be the graph defined by . As in the standard proof labeling scheme for verifying a spanning tree, we first verify that is a tree (has no cycles), and then verify that it is also spanning.
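The parent selection from distances modulo 3 can be sketched as follows. The adjacency lists are assumed to be given in port order, so the position of a neighbor in the list plays the role of its port number; `bfs_levels` and `parents_from_mod3` are hypothetical helper names.

```python
from collections import deque

def bfs_levels(adj, root):
    """Prover side: BFS distance of every node from the root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def parents_from_mod3(adj, label):
    """Node side: pick, among neighbors labeled (own label - 1) mod 3,
    the one with the smallest port number (earliest in the adjacency list)."""
    parent = {}
    for u, neighbors in adj.items():
        up = [w for w in neighbors if label[w] == (label[u] - 1) % 3]
        parent[u] = up[0] if up else None  # no upward neighbor: u acts as the root
    return parent
```

Each node receives a label that takes only three possible values, yet with honest labels the selected parent always lies exactly one BFS level closer to the root.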
First, we verify that there are no cycles in . Towards this end, we let each node sample a uniform bit and send it to the prover. Let be the path in the tree that the prover computed from to the root. The prover sends each node the number , that is the sum of the ’s on the path from to the root modulo 2. Nodes exchange this value with their parent in the tree. Each node verifies that . In the analysis, we show that if contains a cycle, then with probability the nodes will reject (this happens when the sum of the values on a cycle is odd).
By now, we know that contains no cycles. However, it might still be the case that is a forest. In such a case it will contain more than one root node. To eliminate this, we have the prover broadcast the value where is the root of the tree it computed. If there is more than one root in , then with probability their values will be different and thus nodes will detect this inconsistency. This ensures that has no cycles and a single root, and thus it must be a spanning tree of . Of course, the soundness can be amplified by standard (parallel) repetition.
A corollary of constructing such a tree is that the root of the tree is a uniquely chosen node in the network. Thus, this protocol also solves the “Leader Election” problem (LeaderElection) with a constant size proof in 3-rounds.
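The parity test for cycles can be checked by brute force on tiny instances. The sketch below assumes a particular form of the local check (a node's label equals its own bit XOR its parent's label, and a root's label equals its own bit); the function names are illustrative.

```python
from itertools import product

def honest_parities(parent, bits):
    """Prover side: XOR of the sampled bits on the path from each node to the root."""
    labels = {}

    def walk(u):
        if u not in labels:
            p = parent[u]
            labels[u] = bits[u] if p is None else bits[u] ^ walk(p)
        return labels[u]

    for u in parent:
        walk(u)
    return labels

def all_nodes_accept(parent, bits, labels):
    """Verifier side: every node compares its label with its parent's label."""
    for u, p in parent.items():
        expected = bits[u] if p is None else bits[u] ^ labels[p]
        if labels[u] != expected:
            return False
    return True

def some_labeling_passes(parent, bits):
    """Try every possible prover answer (feasible only for tiny inputs)."""
    nodes = list(parent)
    return any(all_nodes_accept(parent, bits, dict(zip(nodes, guess)))
               for guess in product((0, 1), repeat=len(nodes)))
```

On a cycle of parent pointers, XORing all local constraints shows that a consistent labeling exists only when the bits on the cycle XOR to zero, which happens with probability 1/2 over the nodes' coins.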
Super Protocols.
Our next step is to show how to run what we call “super protocols”. A super protocol simulates a protocol with proof size using only bits, by making computation on a super graph that contains super-nodes. The super graph is defined by decomposing the graph into blocks of size roughly such that each block will simulate a single node in the protocol. The benefit of this approach is that a block has a proof capacity of by having each node get only a single bit. In other words, a super-node (that corresponds to the block of nodes) can be given a proof of size in a distributed manner: giving a single bit proof for each of node in that block.
This brings along several challenges as no node knows the proof, but rather it is distributed among several nodes. To be able to work with these “fragmented proofs” we will need to come up with protocols that work on the super graph. Suppose a node in the super graph represents a block . To simulate a local verification of in the super graph , we need all nodes to cooperate to perform this verification. Towards this end, we will use the RAM compiler on a program that performs the verification, but we run the compiler only on the block , as if it was the entire graph. Since the size of the block is the cost of this compiler is only ! Furthermore, the node performs consistency checks with its neighbors in . Here again we use the RAM compiler, but on a graph that contains and a child of . The graph of these two blocks is connected, and of size . This is carefully performed in parallel for all children .
This was a very high level overview, and we proceed with formally explaining how to define the blocks and the corresponding super graph. The spanning tree (whose construction was described before) is partitioned into edge-disjoint subtrees , which we call blocks. The precise protocol for this decomposition is given in Section 7.3. The main point here is that at the end of the protocol, each node knows its neighbors within the block.
Using the block decomposition, we show how to reduce the proof size in the protocol for SetEquality to , albeit at the expense of more rounds. The prover orders the nodes inside each block and sends each node its index inside the block. Since the blocks are of size the index requires only bits. To verify that the indexes are indeed a permutation, we apply the permutation protocol described above. However, we run it on each block separately as if the block was the whole graph. Since each block is of size the final cost of this protocol within each block is only !
We wish to run this protocol in parallel for all blocks in the graph. This works if the blocks are vertex disjoint; however, the blocks we have are only edge disjoint. Nodes that participate in several blocks will get a proof for each block, which blows up the proof size. Instead, we show how such nodes can divide their proofs among the blocks. At the end, we are able to run the protocols in parallel without paying an additional cost for these nodes.
The next step of the SetEquality protocol is to have the root choose a field element described by bits. Let be the root of the tree and let be the block containing . We let the block choose distributively, where each node picks a single bit. The prover reconstructs and can continue with the protocol. The main challenge now is that no individual node knows , only the prover.
After has been chosen and sent to the prover, the next step of the protocol is to compute the products and and verify that they are equal. First, we compute each product within a block. Let be a block rooted at , then we want the block to compute . Thus, we let the prover compute and send it to the block . To verify this, we can run the RAM compiler on the block for a program that reconstructs , computes and finally compares it to (and similarly for the ’s). Again, this is performed for all blocks in parallel and has a cost of bits.
Each node in the super graph now has the value , and we verified that is indeed the product of all elements inside this block. Now, the prover computes the values where is the subtree of rooted at , and sends to the block (and similarly for ). Now, node needs to verify this value by computing the product of for all its children .
We note that the block of and its children blocks are connected. Assume for simplicity that has only a constant number of children blocks. Let be the graph that contains all these blocks. Then, we have that consists of vertices. Thus, we run the RAM compiler on this graph, for a program that on input all the values of the nodes, collects the bits of and reconstructs it, then reconstructs and for all the children blocks, and verifies . The size of the graph is and thus again running this will cost bits.
This worked since we assumed that there are only a few child blocks; however, the number of such blocks in general might be large. In such a case, we compute by computing them in pairs , such that for each pair the graph is always of size . This takes some delicate care of details. While this process is sequential and will take many iterations (as many as the number of children), we show how to parallelize it using the prover.
There are many technical challenges to make this plan go through and we refer the reader to Section 7 for the full details. The result is a five message protocol: first the prover sends the tree (and it is verified in messages 2-3), then the network chooses and then we run the RAM compiler in messages 3-5.
Once we have a protocol for SetEquality using bits of proof, we immediately get a protocol for DSym. In this problem, the nodes know a permutation and need to verify that it is an automorphism. We simply run the SetEquality protocol on the two sets of edges for and .
A Protocol for Clique.
We describe a protocol for the clique problem, where the goal is to prove that the graph contains a clique of size where is known to all. The prover marks a clique of size and selects one of the nodes in the clique to be a leader. We run the leader protocol described above to verify that indeed a single leader is selected. Finally, each marked node verifies that indeed of its neighbors are marked and that one of them is the leader. This assures that there are exactly marked nodes and that they form a clique.
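The local checks of the marked nodes can be sketched as below. The sketch assumes that a clique of size `k` is claimed, that each marked node expects exactly `k - 1` marked neighbors, and that uniqueness of the leader has already been verified by the leader protocol; `clique_checks_pass` is a hypothetical helper.

```python
def clique_checks_pass(adj, marked, leader, k):
    """Run the local check of every marked node.

    adj maps a node to the set of its neighbors; leader uniqueness is assumed
    to have been verified separately.
    """
    for u in marked:
        marked_neighbors = [w for w in adj[u] if w in marked]
        sees_leader = (u == leader) or (leader in adj[u])
        if len(marked_neighbors) != k - 1 or not sees_leader:
            return False
    return leader in marked
```

Since every marked node must be adjacent to the single leader, and the leader has exactly `k - 1` marked neighbors, there are exactly `k` marked nodes, each adjacent to all the others.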
3 Definitions
3.1 Interactive Proofs with a Distributed Verifier
Our definition follows the definition in [KOS18]. An interactive proof is a protocol between a verifier and a powerful prover, where the goal of the prover is to convince the verifier that for some common instance and language . Usually, the verifier and prover are Turing machines with different computational power. Here, we consider the case where the verifier is distributed.
Our model consists of a network of computation units that communicate in synchronous rounds. The communication pattern between the units is defined by an -vertex graph . In addition, each node may hold an additional input . Let be the set of all inputs. Then, the graph and the inputs define an instance , and the goal of the network is to determine if for some language , where is a family of vertex graphs and is a set of inputs where is the input of node .
The network is equipped with one extra entity, , which we call the prover. This prover is connected to all the vertices in , and knows the entire input instance . Roughly speaking, the goal of this powerful prover is to convince the network that , where if we ask that the network will not be convinced no matter what the prover does. The prover knows the entire graph: it knows the ordering of the neighbors for each node in the graph.
The Complexity Measures.
Our primary goal in this paper is to minimize the bandwidth, that is, the size of messages sent in each round (within the network and also between the nodes and the prover). The total amount of messages sent is called the proof size (or proof complexity) of the protocol.
The class :
Let be a language of graphs and inputs and let be two parameters. For a verifier and a prover we let denote the protocol between them and we let be the final output of the vertex in the protocol. We say that if there exists an -round protocol (i.e., messages) with verifier with the following properties:
1. Completeness: For every , there exists a prover such that for it holds that .
2. Soundness: For every and every prover , for it holds that .
The probabilities are taken over the random coins of the nodes of the distributed verifier in the protocol between verifier and the prover .
When and the prover goes first, this is the standard notion of distributed proofs (or proof labeling schemes). When the verifier sends the first message this is the analog of the class AM and is denoted as . Similarly, we define for three rounds and for four messages and so on.
3.2 Limited Independence
A family of functions mapping domain to range is -almost pairwise independent if for every , , we have
[TABLE]
Theorem 6.
There exists a family of -almost pairwise independent functions from to such that choosing a random function from requires bits.
Circuits.
Some of our results use the notion of circuits. In this work, we consider circuits of constant fan-in and fan-out. The term “linear size” circuits refers to circuits whose size is linear in the sum of their input size and output size.
Linear Hash Functions.
Ishai et al. [IKOS08] showed how to construct a pairwise independent hash function that can be computed by a linear-sized circuit. Specifically:
Corollary 3.
[IKOS08, Follows from Theorem 3.3]
Let be a field of size . There exists a family of pairwise independent hash functions from to such that choosing a random function from requires field elements and evaluating any can be performed by an -sized circuit with gates that operate over .
Definition 1 (Aggregate Function).
We say that a function is an aggregate function if there exists a function such that where for and , and is computable in by a RAM program with operations over words of length .
3.3 Graph Definitions
We usually denote the graph by where is the set of vertices and is the set of edges. We let denote the neighborhood of in . We also call the vertices in nodes.
Definition 2 (Isomorphism).
We say that two graphs and are isomorphic if there exists a bijection between and such that for any two nodes it holds that if and only if . We denote this by .
Definition 3 (Automorphism).
A graph has an automorphism if there exists a non-trivial permutation such that for every it holds that if and only if (we call such a graph Symmetric).
4 A RAM Program Compiler
In this section we show our RAM program compiler. We take standard interactive protocols over -vertex graphs and transform them into distributed protocols. The cost of the distributed protocol depends on the running time of the verifier in the protocol when implemented as a RAM program.
A construction of a spanning tree in the graph is a basic tool in distributed proofs in general [KKP10] and in our context as well. Here, we let the prover compute a spanning tree rooted at an arbitrary node and send each node its parent in the tree (the parent of the root is ). Note that once each node knows its parent in the tree, it also knows its children in the tree.
Then, to prove that this is indeed a tree, the prover additionally gives each node its distance from the root, in the tree . Each node verifies consistency with its parent, i.e., (the root verifies that ). One can observe that verifying the distances from the root assures that there are no cycles in as otherwise there must be a node and its parent with inconsistent distances. Finally, to prove that the tree is spanning the prover gives each node the ID of the root where nodes verify consistency of the ID with their neighbors.
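The local checks of this proof labeling scheme can be sketched as follows. The sketch assumes that nodes are identified by their IDs and that the root additionally checks that the announced root ID is its own; `tree_labels_accepted` is a hypothetical helper.

```python
def tree_labels_accepted(adj, parent, dist, root_id):
    """Run every node's local check.

    parent[u] is the claimed parent (None for a root), dist[u] the claimed
    distance to the root, root_id[u] the root ID announced to node u.
    """
    for u in adj:
        p = parent[u]
        if p is None:
            # A root must be at distance 0 and must be the announced root.
            if dist[u] != 0 or root_id[u] != u:
                return False
        elif p not in adj[u] or dist[u] != dist[p] + 1:
            return False
        # All neighbors must have been told the same root ID.
        if any(root_id[w] != root_id[u] for w in adj[u]):
            return False
    return True
```

The distance check rules out cycles of parent pointers, and the root ID check rules out a forest with several roots in a connected graph.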
Using this tree, we develop an interactive protocol for a new problem we call SetEquality (defined next). This protocol will be used several times in our compiler (and later on) and in particular is used in a protocol for the Distinctness problem and the Permutation problem (also defined next). Next, we describe the SetEquality problem.
4.1 SetEquality
The SetEquality problem asks to check the equality of two (multi)sets and is formally defined as follows.
Definition 4 (SetEquality).
In this problem each node holds two lists of elements and where for all it holds that for some constant and . Let and be two multisets. The goal of the SetEquality problem is to prove that as multisets.
Let be an -vertex graph and let be a field of size . We interpret the elements of and as elements in the field . To check that (as multisets) we define a polynomial and according to the elements of and respectively. That is, we define
[TABLE]
Note that and are polynomials of degree at most . We show that if and only if . Since the polynomials have low degree (compared to the field size), in order to check if they are equal it suffices to compare them on a random field element. For clarity of presentation, let us assume that nodes have shared randomness. At the end, we show how to sample this shared randomness using the prover.
Thus, let be a random field element defined from the shared randomness. Then, we are left with evaluating the two polynomials and . To compute these polynomials we use a spanning tree construction, as described above. We let the prover compute a spanning tree and prove its validity. We use the tree to compute the two polynomials on . Towards this end, the prover sends each node the evaluation of the polynomials on the subtree : and . Nodes check consistency with their children in the tree to ensure that all partial evaluations are correct. That is, they check that
[TABLE]
where are the children of in the tree. Finally, the root of the tree holds the two complete evaluations of polynomials and and verifies that .
This completes the description of the protocol assuming the element is shared randomness. To construct such shared randomness we do the following. We let each node sample at random, along with a random number . The node with the minimal “wins” in the sense that we set and (observe that we cannot have the prover decide who wins, as otherwise could be biased). The prover will announce to everyone the winning and . Nodes verify the consistency of and with their neighbors, and thus ensure that all nodes in the graph have the exact same elements and . We are left to verify that indeed is the minimal value.
To verify this, each node will check that indeed where we expect exactly a single node to have equality. We count the number of such nodes by having the prover send each node the number of nodes that have equality in its subtree. That is, the prover sends node the value where if and 0 otherwise. The nodes check consistency of the with their children in the tree and finally the root verifies that . This assumes a common random string . The formal protocol is given in Figure 1.
We show correctness and soundness of the protocol.
Correctness.
The protocol succeeds as long as the is uniquely the minimal value. However, it is easy to see that . Thus, we continue the analysis as if all the ’s are distinct. Assume that as multisets. Then for any it holds that . For any tree with root it holds that and also that . Thus, the root will output 1, and in addition all intermediate nodes will output 1 after their local verification.
Soundness.
Assume that as multisets. Suppose that . In order for the prover to cheat, it must give the root values such that either or , since otherwise the node will output 0. However, since the node performs the local check with its neighbors in the tree, it holds that the prover must give wrong values to one of its children as well. This continues until the prover gives a wrong value to a leaf, where the leaf can verify locally and output 0 indicating that it received a wrong proof.
Thus, we bound the probability that the two products collide (notice that the sets are fixed before the choice of ). Consider the polynomial , which is of degree at most over the field .
Claim 1.
The polynomial defined above is not the zero polynomial.
Proof.
We know that . Suppose that there exists an element . Then, we get that and , therefore and thus is not the zero polynomial. A similar argument holds if . Since and are multisets there is a third possibility that the multisets share the same elements only with different multiplicities. Let be the multiset of their intersection . Define
[TABLE]
It suffices to show that is not the zero function. Define and . For these subsets we know that there must be an element that is in one set and not in the other. Assume without loss of generality that there must exist an element . Then, since we get that and therefore is not the zero polynomial. ∎
The polynomial has at most roots and since the field is of size we get that
[TABLE]
Communication Complexity.
Computing the tree and its proof takes proof size, as shown in [KKP10]. Elements in the field are represented using and each node is given a constant number of elements (). We have that and thus also has a short representation. The are a constant number of bits. Altogether, each node sends and receives bits.
4.2 Distinctness
In the Distinctness problem each node has a single value and the goal is to verify that all values are distinct. That is, the output of the protocol is 1 if and only if it holds that for all such that we have that .
We show that this problem can be actually reduced to the SetEquality problem. Assume that the values are sorted such that . The prover sends node the value . Denote by the actual value received by a node . Then, node sets a bit to be 1 if and only if .
Let be all the original values and let be the set of all values given by the prover. Then, we run the protocol for SetEquality to verify that . Moreover, we run a sum protocol to verify that .
Completeness.
If all values are distinct then the honest prover will set . Thus, we will have that for all and and therefore . Moreover, we have that the values of are exactly the values of shifted by 1. That is, as sets we have that and thus the SetEquality protocol will pass as well.
Soundness.
To show soundness we define an -vertex directed graph with nodes being . Since the SetEquality protocol has passed successfully, we know that as multisets. It therefore holds that for any there exists a such that . We then add the directed edge to the graph.
By the construction, we get that the in-degree and the out-degree of each node in this graph are exactly the node’s multiplicity in , which is at least 1. Thus, by an Euler argument, the graph can be decomposed into edge-disjoint cycles. However, since the sum check passed, we know that all but one edge are strictly increasing in value. Thus, the decomposition can contain only a single cycle, and thus the in-degree and out-degree are exactly 1, which means that the values of are all distinct.
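The reduction can be sanity-checked by brute force on small inputs. The sketch below assumes a specific instantiation: the honest prover hands each node the cyclic successor of its value in sorted order, and a node sets its bit to 1 exactly when the received value is not larger than its own. The exact multiset comparison stands in for the (probabilistic) SetEquality protocol, and all function names are illustrative.

```python
from collections import Counter
from itertools import permutations

def honest_successors(values):
    """Prover side: give each node the next value in sorted cyclic order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    b = [None] * len(values)
    for pos, i in enumerate(order):
        b[i] = values[order[(pos + 1) % len(order)]]
    return b

def distinctness_accepts(values, b):
    """Verifier side: multiset equality plus exactly one non-increasing edge."""
    wraps = sum(1 for a_i, b_i in zip(values, b) if b_i <= a_i)
    return Counter(values) == Counter(b) and wraps == 1

def some_prover_succeeds(values):
    """Try every assignment that passes the multiset check (tiny inputs only)."""
    return any(distinctness_accepts(values, list(b))
               for b in set(permutations(values)))
```

When a value repeats, every rearrangement of the values produces at least two non-increasing edges, so no prover answer passes both checks, matching the cycle decomposition argument above.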
The Permutation Problem.
A specific instance of the Distinctness problem is when for we have that . In such a case, we are actually checking if the sequence forms a permutation. We denote this problem by Permutation.
We observe that in this problem, the first message from the prover is redundant: node can compute by itself the value . Thus, we only need to run messages 2-3 of the Distinctness protocol which yields an protocol with the same proof complexity. That is, we have that . The formal protocol is given in Figure 3.
4.3 The Compiler
We present a general compiler that takes standard interactive protocols and transforms them into distributed interactive protocols. Let be an interactive protocol.
We show how to construct the distributed version of for an -vertex graph with input . First (message 1), we let the prover give unique IDs in the range of to to the nodes. This is verified using the Distinctness protocol (in parallel to messages 2-3). This lets us order the nodes according to their assigned IDs.
Once the nodes are ordered, we can split the communication between the verifier and the prover into small parts for each node. Suppose that in the protocol the verifier sends a message to the prover. Since the protocol is public-coin, we know that is simply a random string. Thus, in the distributed version each node will send the prover a small random string, , where . The prover collects all and composes the random string according to the order of the nodes . Then, in the protocol the prover responds with a message . In the distributed version the prover distributes the string among the nodes. Each node gets where and . This continues for all rounds of the protocol . If the total communication complexity of is then the communication per node so far in the protocol is .
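Splitting a prover message among the ordered nodes, and reassembling a verifier message from per-node pieces, can be sketched as follows. Representing messages as Python bit strings and cutting them into equal contiguous chunks are illustrative assumptions.

```python
def split_among_nodes(message, n):
    """Cut a bit string into n contiguous chunks; chunk i goes to the i-th node in ID order."""
    size = -(-len(message) // n)  # ceiling of len(message) / n
    return [message[i * size:(i + 1) * size] for i in range(n)]

def assemble(chunks):
    """Concatenate the per-node pieces in ID order to recover the full message."""
    return "".join(chunks)
```

Each node handles only about a 1/n fraction of every message, which is where the per-node communication bound stated above comes from.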
Let be the verifier in the protocol . At the end of the protocol, the verifier has pairs for each round of the protocol distributed among the nodes. Let be the collection of all the ’s and let be the collection of all the ’s. According to in order to decide whether to accept we need to compute . However, computing this in a distributed manner is challenging as each node has a different part of the input to the ’s program.
Here we let the prover help us in computing by a three message protocol which we describe next. If the running time of is then the communication complexity (per node) of the final protocol will be . In general, our compiler takes as input a description of any -round IP protocol where the computation of the verifier can be done in time by a RAM program, and transforms it to a distributed -round protocol with proof complexity of . In particular, for time programs the proof size is .
A Canonical Form for RAM Programs.
A RAM program is modeled as a CPU that has a small state (e.g., containing the context of the registers) and performs a sequence of instructions on an external memory (the input to the program is assumed to be stored in the memory). Each instruction operates (i.e., can read and write) on the local state and on a single cell in the external memory. The instructions are numbered and we say that instruction number happens at time . Without loss of generality, we have each memory cell contain a timestamp of the last time it was updated. That is, if at time the program writes to memory address then the timestamp in cell will be updated to .
Observe that using this formulation, the set of (value, address, time) triples that are written is equal to the set of (value, address, time) triples that are read. However, and might differ as multisets, for example if the program writes a value once and then reads it multiple times. We follow in the footsteps of Blum et al. [BEG+94] and show that any program can be easily transformed into a canonical form where and are equal as multisets as well, while paying only a constant factor in the running time. In short, we make the program read any location after writing it, and vice versa. Formally, we replace the read and write operations with the following.
Write of value to address at time with state :
1. read value and time stored at address .
2. write value and time to address .
3. update state .
Read address at time with state :
1. read value and time stored at address .
2. write value and time to address .
3. update state .
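The canonical form can be sketched as a memory wrapper in which every access is a read followed by a write, with both logged as (value, address, time) triples. The initial write of every cell and the final read of every cell are assumptions borrowed from standard offline memory checking; the class and method names are hypothetical.

```python
from collections import Counter

class CanonicalMemory:
    """Memory in which every access reads a cell and then writes it back with a new timestamp."""

    def __init__(self, size):
        self.cells = [(0, 0)] * size          # each cell holds (value, timestamp)
        self.time = 0
        self.reads = Counter()                # multiset of triples that were read
        self.writes = Counter()               # multiset of triples that were written
        for addr in range(size):              # assumption: count initialization as a write
            self.writes[(0, addr, 0)] += 1

    def _touch(self, addr, new_value=None):
        self.time += 1
        value, stamp = self.cells[addr]
        self.reads[(value, addr, stamp)] += 1
        if new_value is None:                 # a read writes the same value back
            new_value = value
        self.cells[addr] = (new_value, self.time)
        self.writes[(new_value, addr, self.time)] += 1
        return value

    def read(self, addr):
        return self._touch(addr)

    def write(self, addr, value):
        self._touch(addr, value)

    def finish(self):
        """Assumption: one final read of every cell; call once. True iff reads == writes as multisets."""
        for addr, (value, stamp) in enumerate(self.cells):
            self.reads[(value, addr, stamp)] += 1
        return self.reads == self.writes
```

In an honest run `finish()` returns `True` even when a value is read many times, whereas a cell whose content was altered between accesses leaves an unmatched triple in one of the two multisets.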
In our setting, the state and timestamps are each of size and the memory is of size . Thus, a tuple (corresponding to the state, value to read/write, memory address and timestamp) can be described using bits, and the execution of a program can be described by a list of such tuples.
The Compiler.
We let the prover run the computation of the program to get the description of its execution, i.e., a list of tuples . We divide the steps among the nodes, such that each node is assigned arbitrary steps of the execution. If the output of the program is a Boolean value, then an arbitrary node that is assigned the last instruction of the program will have the final output . Denote by the set of step numbers that are assigned to node . Then the prover sends node the tuples .
Our goal now is to verify that this is an honest execution of the program. Define to be the set of all for all the read operations and let be defined similarly for the write operations. Each node holds a list of tuples where each tuple is either in or in . Thus, to verify that the sets are equal we run the SetEquality protocol as described in Section 4.1.
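Section 4.1 is not reproduced here, so the following is only a sketch of one standard way to test multiset equality when the items are spread over many nodes: a polynomial fingerprint evaluated at a random point of a prime field. The prime, the encoding of triples as field elements, and the function names are all assumptions rather than the paper's construction.

```python
import secrets

P = (1 << 61) - 1  # a Mersenne prime; assumed large enough for the sketch

def encode(triple, base=1 << 20):
    """Pack (value, address, time) into one integer; assumes each component is below base."""
    value, addr, time = triple
    return (value * base + addr) * base + time

def local_fingerprint(triples, x):
    """Product of (x - item) over the triples held by a single node, modulo P."""
    acc = 1
    for t in triples:
        acc = acc * ((x - encode(t)) % P) % P
    return acc

def multisets_look_equal(reads_by_node, writes_by_node, x=None):
    """Compare the two global products; per-node products could be multiplied up a tree."""
    if x is None:
        x = secrets.randbelow(P)
    fr = fw = 1
    for triples in reads_by_node:
        fr = fr * local_fingerprint(triples, x) % P
    for triples in writes_by_node:
        fw = fw * local_fingerprint(triples, x) % P
    return fr == fw
```

Equal multisets always pass. Unequal multisets of total size d pass for at most d of the P possible evaluation points, since two distinct degree-d polynomials agree on at most d points.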
Then, we check that the addresses and write values correspond to the instructions of the program. Let be the tuple corresponding to instruction of the program given to node , and suppose it is a write instruction. Then runs the instruction of the program on state to get address , value and new state , that is, . Then checks that indeed and (for a read instruction we check only ). Denote by the set of all pairs given to the nodes and denote by the set of all pairs where is the state computed above (if is the state of the last instruction then we let be the state of the first instruction). Then, we need to verify that , and we again use the SetEquality protocol for this. The protocol is given in Figure 4.
Completeness.
Completeness follows directly from the construction. An honest prover will provide true IDs for the nodes, follow the computation of the RAM program, and provide each node with the correct memory values read by the computation. Then, we will have that and thus the nodes will accept.
Soundness.
If the prover did not provide unique IDs in the range of to the nodes, then this will be detected by the Distinctness protocol. If the prover provided quadruplets such that , then this will be detected by the SetEquality protocol. Thus, it must be the case that but the RAM program outputs 0. This means that the output that receives is 1, and thus there must be an inconsistency in the computation. However, by the canonical form of the computation, we know that any inconsistency implies that .
Number of Rounds.
The protocol will have some additional rounds compared to . In the first round, the prover sends the IDs of the nodes. If begins with a message from the prover to the verifier, then the IDs can be sent in parallel with this message. Otherwise, a simple solution is to send them as a first message, before the message from the verifier. This would increase the round complexity by 1.
The last message of the protocol is an message. At this point, we run a protocol to simulate the computation of . This is an protocol. Thus, we can send its first message in parallel with the last message of the . Then, we have an additional 2 messages.
Overall, if is an message protocol then will be either message, if the prover goes first in , or an message protocol otherwise.
We observe that we can avoid the additional first message in the case where starts with the verifier. In the first message, the nodes choose random values and send them to the prover. Then, we force the prover to send the IDs (in the second message) according to the ordering of the values. Let be the nodes ordered according to the values and let be the ID given to node . The prover additionally sends each node the value . Each node sets and . The nodes run a SetEquality protocol for and where and . We observe that the prover is honest if and only if (the reasoning is very similar to the soundness argument in the Distinctness protocol).
5 Graph Asymmetry and GNI
The graph Asymmetry language consists of all graphs that do not have a non-trivial automorphism, i.e. every non-identity permutation of its vertices yields a different graph.
Definition 5 (Graph Asymmetry).
The language Asym contains all graphs that do not have a non-trivial automorphism (all asymmetric graphs).
We show a public-coin protocol for graph asymmetry that uses our RAM program compiler. The protocol consists of 4 rounds with communication complexity. Formally, we show that
Theorem 7.
.
We begin with a short description of a standard (centralized) interactive protocol for Asym, which is a simple adaptation of the public-coin protocol for graph non-isomorphism [GS89, GMW91] (see also [BM88]). From here on we denote this protocol by the “GNI protocol”.
Let be the set of all graphs that are isomorphic to . That is, . The main observation of the GNI protocol which follows here directly is that if has no (non-trivial) automorphism then while if does have an automorphism then . Thus, the focus of the protocol is on estimating the size of .
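The size gap between the two cases can be checked by brute force on small graphs. The sketch below enumerates all relabelings of a graph and counts the distinct results; by the orbit-stabilizer count this is n! for an asymmetric graph on n vertices and at most n!/2 otherwise. The two example graphs and the function name are illustrative choices, not taken from the paper.

```python
from itertools import permutations

def isomorphic_copies(n, edges):
    """Set of all graphs on vertices 0..n-1 obtained by relabeling the input graph."""
    copies = set()
    for pi in permutations(range(n)):
        copies.add(frozenset(frozenset((pi[u], pi[v])) for u, v in edges))
    return copies

# A path on 3 vertices has one non-trivial automorphism (swap the endpoints).
path3 = [(0, 1), (1, 2)]

# A 6-vertex graph with no non-trivial automorphism: a path 0-1-2-3-4
# plus a vertex 5 attached to vertices 1 and 2.
asym6 = [(0, 1), (1, 2), (2, 3), (3, 4), (5, 1), (5, 2)]

print(len(isomorphic_copies(3, path3)))   # 3  = 3!/2
print(len(isomorphic_copies(6, asym6)))   # 720 = 6!
```

This enumeration is exponential in n and is meant only to make the counting argument concrete; the protocol itself estimates the set size through hashing.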
The verifier samples a pair-wise hash function , where is the smallest number such that , and sends it to the prover. The prover seeks a graph such that . The main observation is that the probability that such a graph exists is higher when is larger, which allows us to distinguish between the cases of . As observed by [KOS18], using an almost-pairwise hash function suffices for these purposes, and a seed to such a function takes bits. Note that to send it is enough to send a permutation such that , which can also be represented with bits. These facts are important: if we wish to get an distributed protocol per node, the underlying centralized protocol must communicate a total of bits.
Our goal is to simulate this protocol with a distributed verifier. The nodes have unique IDs (strings of length ) and let be the ordering of the nodes sorted by their ID. Let be the index of node in this ordering, and let be the ID of node . Note that a node does not know in advance, but the prover knows the entire ordering. (Footnote 3: This is without loss of generality, as the nodes can pick unique IDs in the range to (uniqueness holds w.h.p.) and send them to the prover.)
As our first step, we hash the set to a set of the same size (w.h.p.), and where elements in have a small representation. In particular, while the graphs in are represented using roughly bits, the hashed values in will have only bits of representation. More importantly, the hash used to compute will be local and can be easily computed by the nodes of the network by considering only their own neighborhood.
In more detail, our hash function will be composed of hash functions. Each node chooses a seed for an -almost-pairwise hash function (according to Theorem 6):
[TABLE]
where . The seed length is of size bits. Let be the chosen hash function ordered by the index of the nodes (i.e., by ). Let where is the indicator vector for the neighbors of node in . Then, we define a hash function as
[TABLE]
Using we can define the set . It is easy to see that . We note that the fact that is locally computable means that the probability of any two graphs colliding under is not as small as we would like (say in a pair-wise independent function) and we need finer analysis to show that is close to . The key point is that contains graphs that are all isomorphic to each other and hence we are able to show that with high probability it holds that (this is shown in Claim 2).
Assume that indeed . Then, we can continue to simulate the centralized protocol where we replace the set with the set . That is, to sample the pairwise hash function which has a seed length of we let each node sample a chunk of the seed, , of length . Then, we let the seed of be where again the ordering of the nodes is according to their IDs. The prover knows this ordering and can construct accordingly.
The family of functions that we pick for the task is that of Corollary 3: it has a succinct description and can be evaluated by a linear sized circuit over the field (or a linear-time RAM). It is crucial to use a hash function that can be computed in linear time since we are going to apply our RAM compiler on this computation and want a minimal overhead.
The prover sends the graph by sending the permutation in the following way: it sends the node its new ID in . Each node learns the IDs of its neighbors in , denoted by . Moreover, the prover sends each node its index . The validity of these indices will be checked next. Our goal now is to verify that , and to check the validity of the indices given by the prover. Let . Notice that can be computed locally, since is a local hash function, that is, each node computes . Then, we claim that the rest of the computation, i.e., computing , can be performed by a linear-time RAM program, and therefore applying our RAM program compiler of Section 4.3 to this linear-time program will finish the last part of the protocol with a proof of size .
That is, we write a RAM program as follows. The input to is a list of tuples . The program composes the seeds according to the ordering to get the seed . Then, it composes , computes and verifies that . Finally, the program verifies the indices given by the prover. It creates an array of length and sets to be the ID of the node with index . Then, it traverses and verifies that , which guarantees that the IDs were given in the right order. This completes the description of the protocol; see Figure 5 for more details.
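The final index check of the RAM program can be sketched as follows. The sketch assumes indices run from 1 to n and that the stripped comparison is that consecutive array entries are strictly increasing; the function name and input format are hypothetical.

```python
def indices_consistent(pairs):
    """pairs: list of (node_id, claimed_index). True iff the indices sort the IDs."""
    n = len(pairs)
    slots = [None] * (n + 1)                 # slots[i] = ID of the node claiming index i
    for node_id, idx in pairs:
        if not 1 <= idx <= n or slots[idx] is not None:
            return False                     # index out of range or used twice
        slots[idx] = node_id
    # One linear pass: IDs must be strictly increasing along the array.
    return all(slots[i] < slots[i + 1] for i in range(1, n))
```

Both loops are single passes over the input, which is what keeps this step within the linear-time budget of the compiler.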
Completeness.
To show completeness, we need to show that with high probability there is a graph such that . We show that the set (with high probability) will have the same size as .
Claim 2.
.
Proof.
Let and be two graphs, and define if they differ on neighborhoods. That is, if and then
[TABLE]
Let be the set of indices on which and differ. Then,
[TABLE]
Thus, the collision probability decreases as grows. We want to take a union bound over all pairs to show that there are no collisions. One concern here is the high collision probability for small values of . However, what we show is that there are only a few graphs in that have small distance.
We bound the number of pairs of graphs in that have distance . There are possible locations for which a pair of distance can differ on. Fix a specific set . Since , the locations in each graph are a permutation of either . Thus, there are possibilities for each graph, and possibilities for the pair. All together, we have
[TABLE]
Thus, we can bound the probability that a colliding pair exists:
[TABLE]
∎
Therefore, we condition on the event that . Recall that is the smallest number such that . Following the analysis of the GNI protocol we show that
[TABLE]
The upper bound follows by a simple union bound. For the lower bound we observe that
[TABLE]
Let where . If has no automorphism then and thus
[TABLE]
On the other hand, if has an automorphism then and thus
[TABLE]
Thus, we have completeness and soundness . Since we can perform parallel repetition, it suffices to show that is bounded by a constant. Since it holds that
[TABLE]
Thus, repeating a constant number of times, we can push these parameters to and for any constant while paying only a constant factor in the communication complexity. This completes the analysis of the protocol.
5.1 Graph Non-Isomorphism
The protocol above for Asym can be easily adapted to solve the GNI problem with the same complexity. In GNI the input is two graphs and ; however, since there is only one network graph, there are several interpretations of the distributed analog of this problem. This is the reason we focused on the Asym problem, where there is no ambiguity.
Definition 6 (Graph Non-Isomorphism).
The language GNI consists of all pairs of graphs where is not isomorphic to the graph .
Here we assume that the communication graph is the union of and where node gets a trinary input for each incident edge indicating whether the edge is contained in , or both.
When adapting the Asym protocol for GNI this results in a GNI problem where nodes can communicate on both graphs and . That is, there is one set of vertices for the network and both and are defined over . The edge set of the network is the union of the edges of and . The edges are marked if they belong to or or both.
The protocol we presented for Asym was an adaptation of the protocol for GNI. In the GNI protocol, we define the set
[TABLE]
The key point is that if then while if then . Thus, the goal is to estimate the size of just as in the Asym protocol. One difference is that here we need to additionally verify that is an automorphism. We employ the protocol of [KOS18] for this. The result is the following:
Corollary 4.
.
We note that while this improves upon the of [KOS18], our protocol works only when the GNI problem is defined such that nodes can communicate on both graphs and , whereas the protocol of [KOS18] also works under the definition of GNI where only is the communication graph and is given as input to the nodes. In Section 6 we show a protocol for GNI under this harder definition as well, which has a constant number of rounds and bits.
6 Compilers for Small Space and Low Depth Verifiers
In this section we present compilers that transform any centralized prover-verifier interaction on the graph, where the verifier either uses small space or has low depth, into a distributed constant-round interactive protocol.
We start with verifiers that require only small space and show that they can be turned into a distributed constant-round interactive protocol with a small proof size. Formally, we show the following:
Theorem 8.
There exists a constant such that if is a language that can be decided in time and space then .
The main tool behind this theorem is the interactive protocol of Reingold, Rothblum and Rothblum [RRR16]. They show that for every language that can be evaluated in polynomial time and bounded-polynomial space there exists a constant-round interactive proof such that the verifier runs in (almost) linear time. This is an excellent starting point for us, as our RAM compiler works great for compiling verifiers that run in linear time.
However, linear time here is with respect to the size of the graph, i.e., , and we wish to reduce the running time to before we apply the compiler. As already observed in [RRR16], the running time of the verifier can be made sublinear (e.g., for some small constant ) if the verifier is given oracle access to a low degree extension of the input (the input is the graph and possibly additional individual inputs held by each node). Luckily, computing a point in a low degree extension of the input is a task that is well suited for a distributed system, as it is a linear function of the input and hence can be computed “up the tree”.
The following theorem is a simple adaptation of the result of Reingold et al. [RRR16]. Here we state the theorem with respect to inputs of length (the size of the graph), so as not to confuse it with the parameter , which in our context denotes the number of nodes in the graph.
Theorem 9 (Follows from [RRR16]; see Footnote 4).
Let be a language that can be decided in time and space , and be an arbitrary (fixed) constant. There is a public-coin interactive proof for with perfect completeness and soundness error . The number of rounds is . The communication complexity is . The (honest) prover runs in time , and the verifier runs in time , given a single query access to a low-degree extension of the input.
Footnote 4: In the original theorem of [RRR16] the number of queries to the low degree extension of the input is bounded only by . However, for low degree extensions, this can be reduced to a single query. The high level idea is to consider a low degree curve that agrees with all the queried points. The prover specifies the values for the points on the curve and the verifier checks the answer on a random point on the curve. See [KR08, Section 6] for further details.
Let be a language that can be decided in time and space , where is a small enough constant such that the communication complexity and verifier running-time from Theorem 9 are . Thus, we can distribute the communication of the protocol between the nodes such that each node gets a single bit.
Then, we need to simulate the computation of the verifier on the transcript. Since the running time of the verifier is (given oracle access to a low degree extension of the input), we use the RAM program compiler of Theorem 1 to simulate this part. The number of rounds will grow only by 2 and the proof size is .
Finally, we need to implement oracle access to the low degree extension of the input. We explain exactly what this means and how to compute it. We give a description of low degree extensions.
Low Degree Extensions.
Let be a finite field and let be an extension field of , such that . Fix an integer , and let be a function. It is well known that there exists a unique extension of into a function which agrees with on such that is an -variate polynomial of individual degree at most . The function is called the low degree extension of with respect to , and .
Furthermore, there exists a collection of functions such that each is an -variate polynomial of individual degree , and for every function it holds that
[TABLE]
Oracle Access to the Low Degree Extension.
Let be the input of the verifier. That is, contains the graph itself and additional inputs that each node has (e.g., randomness and arbitrary other inputs). We interpret as describing the truth table of a function . Then, the oracle of a low degree extension of the input is a query to the function .
Let be the query performed by the verifier, where , and let be the expected point. That is, the task of the verifier is to check that . The point and is defined in the transcript of the protocol and thus each node has a single bit of . We let the prover broadcast and to all the nodes which in turn verify consistency with their local bit. Now, we need to compute
[TABLE]
where all the nodes know and each node knows a part of the truth table of . In the [RRR16] protocol, it was shown how to compute this in (almost) linear time by a centralized prover. Here, we show how to compute this by a distributed verifier (where local computation is free).
Let be all the elements such that the node knows . This includes all edges incident to and all bits of ’s additional input. Then, can locally compute where
[TABLE]
Using this notation we have that
[TABLE]
Finally, the nodes compute the sum “up the tree”. That is, the prover sends a tree with root (along with a proof), and to each node it sends the sum ; the nodes check consistency with their children to verify these values. The root has value and verifies that indeed . This completes the description of the compiled protocol.
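The linearity that makes the low degree extension computable up the tree can be seen in a small sketch. It makes several simplifying assumptions that are not in the paper: the prime field of size 97 stands in for the extension field, the subset is {0, 1, 2}, every truth-table entry is assigned to exactly one node, and the tree is given as a map from a node to its children. All names are hypothetical.

```python
from itertools import product

P = 97          # assumed prime field, standing in for the larger field
H = [0, 1, 2]   # assumed base subset, embedded in the field

def all_points(m):
    """All points of H^m, i.e. the domain of the truth table."""
    return list(product(H, repeat=m))

def lagrange(a, t):
    """Univariate Lagrange basis polynomial for a in H, evaluated at t (mod P)."""
    num = den = 1
    for b in H:
        if b != a:
            num = num * (t - b) % P
            den = den * (a - b) % P
    return num * pow(den, -1, P) % P   # modular inverse, Python 3.8+

def beta(x, z):
    """Multivariate basis polynomial of the point x in H^m, evaluated at z."""
    out = 1
    for a, t in zip(x, z):
        out = out * lagrange(a, t) % P
    return out

def local_share(known, z):
    """Partial sum over the truth-table entries (point -> value) held by one node."""
    return sum(beta(x, z) * fx for x, fx in known.items()) % P

def sum_up_tree(children, shares, node):
    """Each node adds its own share to the subtree sums reported by its children."""
    total = shares[node]
    for c in children.get(node, []):
        total += sum_up_tree(children, shares, c)
    return total % P
```

Because the extension is a linear combination of the truth-table entries, the value obtained at the root equals the evaluation of the extension on the full table, regardless of how the entries are split among the nodes.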
Communication Complexity.
In the [RRR16] protocol, the field is set to be of size and the field is of size . The parameter is set such that . Thus, we get that the point can be written using bits. The variable can be written using bits. Altogether, the total communication received by a node is bounded by . Finally, by repeating the protocol a constant number of times (in parallel) we can improve the soundness arbitrarily while increasing the proof size by only a constant.
Using the compiler for GNI.
In Section 5.1 we have seen a protocol for the GNI problem. However, the protocol works only for the setting in which both graphs and are communication graphs. A more difficult formulation of the problem is where is the communication graph and is given as input to the nodes. That is, each node gets as input its neighbors in but it cannot communicate with them directly.
We observe that our compiler for small space can be used to get a protocol for the GNI problem in this stricter formulation using a constant number of rounds and a proof of size . For this, we need to show a standard (centralized) interactive protocol (with public coins) where the verifier uses small space at the end to verify the interaction.
We observe that the standard Goldwasser-Sipser interactive protocol for graph non-isomorphism, as discussed in Section 5, can be implemented in small space by choosing the right hash function. We need a hash function that has a small collision probability, and the new requirement is that verifying that can be done in small space. First, as in Section 5 we let be the concatenation of hash functions such that . The ’s are chosen independently by the nodes and sent to the prover. Each is chosen from a family of almost pair-wise hash functions that can be evaluated in small space (i.e., ), and the seed length is . One example is to define
[TABLE]
where and . The probability of collision under is bounded by . One can observe that this family has all the required properties. Thus, has seed length and can be computed in space. Using our compiler for small space computations we get the required protocol.
6.1 A Compiler for All NC Computation
We have shown how to compile the [RRR16] protocol into a distributed verification protocol that has constant rounds and a proof of size . The crux of this solution is based on the fact that given oracle access to a low degree extension of the input, the verifier can be made very efficient. This allowed us to use the RAM compiler while supplying the low degree extension via a computation on a spanning tree.
The protocol of Goldwasser, Kalai and Rothblum [GKR15] shares similar properties with the RRR protocol, which allows us to compile this protocol as well. The protocol of [GKR15] considers verifiers whose computation can be implemented by low depth circuits (as opposed to small space). Let the class “uniform NC” be the class of all languages computable by a family of -space uniform circuits of size and depth . They showed that for any language computable by a uniform NC circuit there is a public-coin interactive protocol where the verifier runs in time given oracle access to a low degree extension of the input and the communication complexity is .
Similarly to our compilation of the [RRR16] protocol, we can also compile the GKR protocol with a slightly larger cost. The number of rounds and proof size will be , compared to the -round and proof size in the case of [RRR16]. We therefore have:
Theorem 10.
For any language in uniform NC, it holds that .
Actually, the formal statement is more general. If can be decided by circuits of depth and size then which is non-trivial even for circuits of depth .
One type of problem for which the GKR-based protocol may be more useful than the RRR-based one is problems based on shortest paths, for instance, proving that the diameter of the graph is a given value. This problem is in NC and hence our protocol is applicable to it.
7 Below the Barrier
Our protocols so far used sized proofs, which appears to be a very natural barrier, since tasks as simple as counting and even pointing to a neighbor seem to require bits. Nevertheless, in this section we show how to “push” the protocols above to use only at the price of more interaction rounds. Our main result is a 5-message protocol for the Decisional Symmetry problem, which is the same as Sym except that the permutation is fixed and the task is to decide whether it is an automorphism. Kol et al. [KOS18] showed that .
We show that by adding more rounds of interaction we can further reduce the proof size to .
Theorem 11.
.
We begin by presenting a simple protocol, then we show how to simulate this protocol using bits. Let be the communication graph and let be the graph after applying the fixed permutation. Each node knows its neighbors in and its neighbors in where . To check that the two graphs are equal, we need to verify that the set of edges in and in are the same. Thus, we run the SetEquality protocol on the two sets of edges.
We re-develop the basic “distributed NP” tools but pushed down to the regime. Similar to the generality of the basic tree construction in distributed NP proofs, these tools are basic and can be used for many other problems as well. We show how to compute a tree in the graph using only bits.
7.1 Tool 1: Constructing a Rooted Tree
The basic tool used in all our protocols is a spanning tree of a graph. Moreover, a useful property of a tree is that it defines a unique node in the graph, the root, which plays an important role in the protocols above. While constructing this tree is simple using messages of length bits, it is a challenging task using a proof of only bits.
In the first message of the protocol, the prover computes a BFS tree in rooted at an arbitrary node . Here it is crucial that the tree is a BFS tree. Then, it sends each node its distance from the root modulo 3, denoted by . Nodes exchange the value , to learn the distances of their neighbors. Recall that in a BFS tree, the neighbors of a node can be either in the same level of the tree, one level higher, or one level lower. The values enable each node to partition its neighbors into these three groups: a neighbor such that is in the same level as , if then is one level higher than and if then is one level lower than . Each node sets its parent in the tree to be its neighbor with with the minimal port number. If no such neighbor exists, then this node is a root.
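The local parent selection can be sketched as below. The sketch assumes that "one level higher" means one step closer to the root, so the candidate parents of a node are the neighbors whose value equals the node's own value minus 1 modulo 3, and that adjacency lists are given in port order. The function name and input format are hypothetical.

```python
def choose_parents(adj, dist_mod3):
    """adj[u]: neighbors of u in port order; dist_mod3[u]: value received from the prover.

    Returns a map from each node to its chosen parent, or None if it considers itself a root.
    """
    parent = {}
    for u, nbrs in adj.items():
        closer = [v for v in nbrs if dist_mod3[v] == (dist_mod3[u] - 1) % 3]
        parent[u] = closer[0] if closer else None   # first in port order
    return parent

# Example: BFS distances from node 0 are 0, 1, 1, 2, 3 for nodes 0..4.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
dist_mod3 = {0: 0, 1: 1, 2: 1, 3: 2, 4: 0}
print(choose_parents(adj, dist_mod3))
```

In the example, node 3 has two neighbors one level closer to the root and picks node 1 by port order, so the resulting tree may differ from the prover's BFS tree while still spanning the graph.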
Let be the resulting graph. That is, is defined on the vertex set and has edges . Note that if the prover is honest, then the graph is indeed a tree, however, it might be different than the BFS tree computed by the prover. In any case, the only property we require from is that it be a spanning tree of . If the prover is dishonest, then might not be a tree at all, and in particular might contain cycles.
To combat such cycles, each node samples a uniform bit and sends it to the prover. For each node let be the path from the node to the root in the (alleged) tree . The prover sends each node the value , that is the sum of the ’s on the path from to the root modulo 2. Nodes exchange these values with their parent in the tree. Each node verifies that . If contains a cycle, then we claim that with probability at least the nodes will reject. Indeed, let be a cycle in . If (which happens with probability ), then the values on this cycle must be inconsistent and thus there will be at least one node that rejects.
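The parity argument can be verified exhaustively on a small cycle. The sketch assumes the stripped check is that a node's value equals its parent's value plus its own bit modulo 2, and that a root's value is its own bit; roots are not checked here. All function names are hypothetical.

```python
from itertools import product

def checks_pass(parent, bits, claimed):
    """Every non-root node u verifies claimed[u] == claimed[parent[u]] + bits[u] (mod 2)."""
    return all(
        claimed[u] == (claimed[p] + bits[u]) % 2
        for u, p in parent.items() if p is not None
    )

def some_assignment_passes(parent, bits):
    """Brute force over every value the prover could send to the nodes."""
    nodes = sorted(parent)
    return any(
        checks_pass(parent, bits, dict(zip(nodes, values)))
        for values in product((0, 1), repeat=len(nodes))
    )

def honest_values(parent, bits):
    """Parity of the bits on the path from each node to its root (valid only on forests)."""
    def value(u):
        if parent[u] is None:
            return bits[u]
        return (bits[u] + value(parent[u])) % 2
    return {u: value(u) for u in parent}

cycle = {0: 1, 1: 2, 2: 0}                                   # parent pointers forming a 3-cycle
print(some_assignment_passes(cycle, {0: 1, 1: 0, 2: 0}))     # False: odd parity on the cycle
print(some_assignment_passes(cycle, {0: 1, 1: 1, 2: 0}))     # True: even parity goes undetected
```

Summing the checks around a cycle cancels the claimed values and leaves the parity of the sampled bits, so a cycle survives only when that parity is even, which happens with probability one half.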
So we know that contains no cycles or the cheating prover is caught with a reasonable probability. However, it might still be the case that is a forest. In such a case it will contain more than one root node. To eliminate this, we have the prover broadcast the value to all nodes in the network, which in turn check for consistency. If there is more than one root in , then with probability their values will be different and thus nodes will detect this inconsistency. This ensures that has no cycles and a single root, and thus it must be a spanning tree of . Of course, the soundness can be amplified by standard (parallel) repetition.
The result of computing a tree is that there is a single root , a unique chosen node among the nodes in the network. In particular, the result is a protocol for “leader election” that uses a small proof.
Corollary 5.
.
The formal protocol is given in Figure 6.
Completeness.
Completeness follows directly from the construction. The honest prover picks exactly one root , computes a BFS tree rooted at , and gives the correct distances modulo 3 and the correct values . Thus, all the consistency checks of the nodes will pass.
Soundness.
First, observe that regardless of the values sent by the prover, each node (except the roots) will identify a single neighbor as its parent in the graph .
Case 1: The graph has no root.
If has no root then it must contain a cycle. Let be such a cycle. With probability half it holds that . Recall that each node verifies that . Thus, we have that . However, since there is no set of values that will satisfy these conditions.
Case 2: The graph has more than one root.
Let be two roots. Then, with probability half it holds that . In such a case, the prover cannot broadcast value 0 or 1 and there must be a node that will reject this consistency check of the broadcast.
7.2 Tool 2: Proofs that Grow with the Degree
We have constructed a tree in the graph . The degree of a node in the tree is the number of children has in , denoted by . In the rest of the protocol, it would be very helpful if each node could get a proof of size . However, some nodes might have large and we cannot send such a proof directly.
Instead, we let each node get the proof of its parent . The leaves of the tree get their own proof in addition to the proof of their parents. Then, each node sends its proof to its parent. Since each node has at most one parent in the tree, it gets a single proof of size (the only exception is the leaves, which get two such proofs). If a node has degree then it gets proofs from its children, each of size . Thus, we can simulate each node receiving a proof of size within our budget. From here on, we will describe the protocol with such a proof size, and at the end such a transformation is applied.
7.3 Tool 3: Decomposition into Blocks
We have constructed a tree in the graph and have also increased the proof capacity of nodes with high degree. We aim to further increase the proof capacity. The high level idea is to decompose the tree into connected components of size roughly . Then, each block can act as a super-node that has capacity even if each real node gets only a single bit. Then, we need to simulate what the single node would have computed in a distributed manner in the block.
We devise a protocol to decompose the tree into edge-disjoint subtrees , which we call blocks. Let be a graph with vertices, where node corresponds to a block . There is an edge if where is a root of and not of . We require the following from the decomposition protocol:
1. it holds that .
2. Blocks intersect at roots: If then , and if then is a root.
3. The graph is a tree, and if is the parent of in and is the root of then .
4. Each node knows its neighbors inside each block it belongs to.
Computing such a decomposition by a centralized algorithm is simple and can be done “from bottom up” on the tree, while greedily packing nodes into blocks. Thus, we first let the prover compute this decomposition which we describe next. Then, we describe how the prover sends the result back and proves that he indeed computed the decomposition correctly. The whole decomposition is performed in the first round of the protocol.
The Centralized Algorithm.
The decomposition is computed greedily starting from the leaves and working level by level up to the root. In level , if there is a node such that the size of the subtree is in then declare as a block and remove all nodes in except for .
If there is a node such that the size of the subtree is greater than then we traverse its children according to the port ordering and we greedily pack them into blocks each of sizes in where each block has as its root (this can be done since for all ). For each block, we remove all nodes of the block except the root .
We continue this for all levels of the tree . At the top level, there might be at most remaining nodes, and we add them to the last declared block. This is the only block that might have size more than (but at most ). This completes the description of the algorithm.
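The greedy bottom-up packing can be sketched as follows. This is a simplified illustration rather than the exact algorithm above: the size thresholds did not survive in this copy of the text, so the sketch uses a single hypothetical parameter `k` (assumed at least 2) and closes a block as soon as it holds at least `k` nodes including its root. The names `decompose` and `edges_inside` are illustrative.

```python
def decompose(children, root, k):
    """Greedy bottom-up block decomposition (illustrative; assumes k >= 2).

    children maps each node to its list of children in port order.
    Returns a list of blocks, each given as a set of nodes.
    """
    order, stack = [], [root]
    while stack:                       # pre-order; reversed, it is children-first
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    residual = {}                      # nodes of v's subtree that are not yet removed
    blocks = []
    for v in reversed(order):
        pending = []
        for c in children.get(v, []):  # traverse children in port order
            pending.extend(residual.pop(c))
            if len(pending) + 1 >= k:  # +1 counts the root v
                blocks.append({v, *pending})
                pending = []           # packed nodes are removed, the root v stays
        residual[v] = [v] + pending

    leftover = set(residual[root])
    if not blocks:
        blocks.append(leftover)        # tiny tree: everything is one block
    elif len(leftover) > 1:
        blocks[-1] |= leftover         # fewer than k nodes remain at the top
    return blocks


def edges_inside(children, block):
    """Tree edges (parent, child) with both endpoints in the block."""
    return [(p, c) for p in block for c in children.get(p, []) if c in block]
```

Under these assumptions every block except possibly the last has between `k` and `2k - 2` nodes, the last one has fewer than `3k`, and every tree edge lies in exactly one block.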
One can easily verify that this decomposition satisfies properties 1-3 (where the last property is described next). The blocks are by definition of size at least and at most . The only intersection between blocks is the roots that are not removed when a block is declared. The edges in are always between a root to a block that is at a higher level in the tree, and thus is a tree.
How to Distribute the Output.
We distinguish between three types of nodes: (1) those that are not a root of any block, (2) those that are a root of exactly one block, and (3) those that are roots of more than one block. Thus, we let the prover send a ternary value to each node indicating its type.
For nodes who are not roots, the output is very simple. These nodes must be a member of only one block and their neighbors in this block are their neighbors in . Thus, no additional information is required from the prover.
Nodes that are a root in a single block are a leaf in the other block. Thus, their children in are their neighbors in the first block, and their parent is their only neighbor in the second block. Again, no additional information is needed from the prover for these nodes to know their neighbors in the block.
The third type is a bit more complicated. A node that is a root in many blocks means that its children have been greedily grouped in several blocks. However, since the algorithm groups according to the port ordering of the root node, it suffices to know the first node in each block in order for the root to divide its children according to the blocks. Thus, we have the prover mark the first node of each block, and these nodes notify their parent, i.e., the root, that they are first in their block. This way, the root knows exactly which children are in each block.
Soundness.
First, observe that no matter what a dishonest prover sends, by the definition of the output of the protocol, the nodes will be decomposed into edge-disjoint blocks that satisfy properties 2-4.
The only thing we need to verify is that the size of each block is indeed in the desired range of . The prover sends each block a proof of the number of nodes in . That is, each node in gets the size of its subtree inside the block . Nodes check consistency with their neighbors in . Since blocks are supposed to be of size , the partial sums can be described using bits.
We note that in this protocol, a root will have a proof of size that is proportional to the number of blocks that it participates in (which is bounded by its degree). However, as described before, this can be reduced using the transformation described above (see Section 7.2).
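The local consistency check on subtree sizes can be sketched as below. This is a minimal illustration, assuming the block is given as a rooted tree and that the allowed size range is passed in as hypothetical bounds `lo` and `hi` (the concrete bounds are not legible in this copy of the text).

```python
def sizes_consistent(children, claimed, root, lo, hi):
    """Return True iff all local checks on the claimed subtree sizes pass.

    children: maps a node to its children inside the block.
    claimed:  maps every node of the block to the size the prover sent it.
    """
    for v, size in claimed.items():
        # Each node compares its own value with those of its children only.
        expected = 1 + sum(claimed[c] for c in children.get(v, []))
        if size != expected:
            return False
    # The root additionally checks that the total is in the allowed range.
    return lo <= claimed[root] <= hi
```

If every local check passes, induction from the leaves upward shows that each claimed value equals the true subtree size, so the root's value is the true block size.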
7.4 Tool 4: Set Equality via Super Protocols
Our protocol has computed thus far a spanning tree in the graph , a decomposition of into blocks of size and a super tree between the blocks. The advantage of having such blocks is that their total proof capacity is . That is, if we consider the “super” graph , we can send each node in a proof of size by sending each node inside the block bits. Then we can run protocols on the graph with very small proof size. These are called “super protocols”, and they let us simulate the SetEquality protocol from Section 4.1 using only bits.
In the SetEquality protocol, the root of the tree chooses a random field element which it sends to the prover. The prover broadcasts to all nodes and they verify that they all got the same value. The element is bits long. Thus, we simulate this by having the whole block of the root choose together. Then, in the original SetEquality protocol, the prover sends each node the value (and similarly for ). The prover here will send for each node of to the corresponding block.
Now, our goal is to verify that the prover gave correct proofs. That is, every two neighboring nodes in need to verify that they have received the same element , and every node needs to compute with its children in . We show how to simulate “simple” one-round protocols in with proof size using only bits in the tree . Here the term “simple” refers to protocols where the local computation performed by each node is an aggregate function (see Definition 1). Aggregate functions are functions that work on many inputs but can be computed by applying the same operation on pairs of inputs. The main examples we use are “Equality” (verifying that all neighbors have the same value in the proof) and “Field Op” (e.g., computing the sum/product of all neighbors' values over a field). If is an aggregate function then there exists a function such that in order to compute it suffices to compute for all where (if the function is Equality, then would simply be the equality of its inputs, and in the case of a “Field Op”, would be an addition/multiplication field operation).
To simulate the protocol we distribute the proof for a node among the nodes in the corresponding block, such that each node gets bits of the proof, and a -long index indicating the location of these bits in the proof. Let be the proof of block . Then, we need to simulate the local computation, , of a node in with its children in the tree . We let the prover compute and since each is bits long the prover distributes to block . What is left is to verify that the values given by the prover are indeed correct. That is, for each we need to verify that , where block has and and block has and blocks and have a shared parent in .
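The two kinds of aggregate functions named above can be illustrated with a short sketch. It is not Definition 1 itself: it only shows that Equality reduces to checks on adjacent pairs and that a field operation reduces to a chain of pairwise operations. The modulus `P` and the helper names are assumptions made for the example.

```python
from functools import reduce

P = 2**61 - 1  # assumed prime modulus standing in for the field


def all_equal(values):
    """Equality: holds iff every adjacent pair of values is equal."""
    return all(a == b for a, b in zip(values, values[1:]))


def field_fold(values, g):
    """Field Op: combine the values by repeatedly applying the pairwise g."""
    return reduce(g, values)


def add(a, b):
    return (a + b) % P


def mul(a, b):
    return (a * b) % P
```

For instance, `field_fold([2, 3, 4], mul)` chains two pairwise multiplications, which is the shape of computation that the prover later has to justify step by step.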
Consider the block and its children . Algorithm consists of sub-algorithms that will run in parallel. The role of Algorithm is to verify that . The main idea is to define a graph which contains , and a path that connects them. Then, since this graph has nodes, we can run our RAM compiler on to compute . Since the centralized algorithm can be computed in time , the cost of such a protocol would be .
The idea above works nicely if and share a root. Then, all nodes know their neighbors in and can run the compiler protocol. Moreover, we can run all such protocols in parallel. Each block participates in at most two such protocols (one with the previous block, and one with the successor block). A root of a block might participate in such protocols, which means that it will get proofs. Again, using Section 7.2 this can be reduced to bits. After this phase, we are left with all such that and do not share a common root.
However, since the parent block is of size and the remaining children are vertex disjoint, then there can be at most such trees. Thus, let be the graph containing the parent tree and all the left children. Then this graph has size at most and we can again run the RAM program compiler on it to compute all the values . The cost of this will be a proof of size .
The final result is a SetEquality protocol. Notice that the operations in the original SetEquality protocol are actually aggregate functions. We use a “super protocol” to verify that all blocks received the same element . Then, we use another super protocol to compute the product over the children of each block and compare it with the value . Finally, the root block will hold and and we can run a RAM compiler inside the block to verify their equality.
Rounds.
The tree is computed in message 1 and verified in messages 2-3. In message 2 the root block chooses and sends it to the prover. In message 3 the prover responds with and with and for each block. Then, we run the different super protocols, which take 3 rounds since they run the RAM compiler; these are sent as messages 3-5 (message 3 is used for and for the first message of the super protocols). In total this is a protocol that uses bits.
Corollary 6.
.
7.5 DSym, Clique and More
The tools described above are quite powerful in the sense that they allow us to solve several different problems, besides SetEquality, using a proof of size . One particular example is the problem DSym, which is similar to the Sym problem except that the automorphism is fixed. That is, a permutation is given to all nodes, and the goal is to decide if is an automorphism of the graph. This problem was studied by [KOS18], where they showed that but any distributed NP proof for requires a proof of size .
We show that using more interaction, we can reduce the proof size to . Each node knows its neighbors in the graph , and applies to learn its neighbors in . Now, we need to verify that the set of edges in is the same as in , which is simply solved by running the SetEquality protocol described above. Thus, we get the following corollary:
Corollary 7.
.
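The reduction from DSym to a set-equality test can be sketched centrally as follows. The sketch uses a standard polynomial fingerprint (a product of linear terms evaluated at a random point), which is assumed here to match the spirit of the SetEquality protocol of Section 4.1; the edge encoding, the modulus `P` and all function names are illustrative and not taken from the paper.

```python
P = 2**61 - 1  # assumed prime modulus standing in for the field


def encode(u, v, n):
    """Map an undirected edge {u, v} on n nodes to a single integer."""
    a, b = min(u, v), max(u, v)
    return a * n + b


def fingerprint(codes, s):
    """Evaluate the product of (s - c) over all codes c, modulo P."""
    acc = 1
    for c in codes:
        acc = acc * (s - c) % P
    return acc


def dsym_accepts(edges, pi, n, s):
    """Compare the fingerprints of the edge set and of its image under pi."""
    original = [encode(u, v, n) for u, v in edges]
    permuted = [encode(pi[u], pi[v], n) for u, v in edges]
    return fingerprint(original, s) == fingerprint(permuted, s)
```

When `pi` is an automorphism the two products agree for every `s`. Otherwise they are distinct polynomials in `s` of degree at most the number of edges, so a random `s` makes them agree only with small probability.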
Summing Up the Tree.
Another application of the super protocols described above is that we can “sum up the tree” within the capacity. Suppose each node has a value and we wish to verify that where is known to all.
For every root of block the prover computes the sum of values in the subtree rooted at . Note that is bits long and the prover cannot send it to . Instead, the prover distributes among the nodes of the block .
Then, we run a super protocol for the “addition” operation. That is, here . This ensures that if the root of is in the block , then this block has the correct value . Finally, to check if this value equals we run the RAM compiler inside the block . The total proof size of this protocol is . Moreover, this can be done in three rounds: the tree and block decomposition are given in the first message and verified in messages 2-3. The values are given also in the first message. Then, we run the RAM compiler, which is three messages that can be performed in parallel to messages 1-3.
Using this we get several different problems in the . For example, it was shown by [KKP10] that the problem Clique of proving that a graph contains a clique of size (where is a fixed parameter) can be done by a distributed NP proof of size . Plugging in our addition operation we get a 3-round protocol with proof size.
For the particular problem of Clique we can actually modify the protocol to depend only on the leader election protocol and get a constant size proof. The prover marks a clique of size and selects one of the nodes in the clique to be a leader. We run the leader protocol described above to verify that indeed a single leader is selected. Finally, each marked node verifies that indeed of its neighbors are marked and that one of them is the leader. If each node has neighbors then we know that there are at least nodes marked as the clique. If there are more, then there will be a node that is not the neighbor of the selected leader. Thus, this assures that there are exactly marked nodes and that they form a clique. Formally, we get that
Corollary 8.
.
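The local checks in the Clique protocol can be sketched as below. The sketch assumes that uniqueness of the leader has already been established by the leader election protocol, and it reads the check as "exactly `k - 1` marked neighbors", where `k` is a hypothetical name for the clique size; both are interpretations, since the parameters are not legible in this copy of the text.

```python
def clique_checks_pass(adj, marked, leader, k):
    """Return True iff every marked node accepts.

    adj:    maps each node to the set of its neighbors.
    marked: set of nodes the prover marked as the clique.
    leader: the single marked node selected as leader (assumed verified).
    """
    if leader not in marked:
        return False
    for v in marked:
        marked_neighbors = adj[v] & marked
        if len(marked_neighbors) != k - 1:
            return False               # wrong number of marked neighbors
        if v != leader and leader not in marked_neighbors:
            return False               # v is not adjacent to the leader
    return True
```

Since every marked node other than the leader is adjacent to the leader, and the leader has exactly `k - 1` marked neighbors, exactly `k` nodes are marked, and each of them is adjacent to the other `k - 1`.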
8 Extension and Open Problems
We have shown that a distributed verifier interacting with a prover in a randomized manner is very powerful. To a large extent our results show that it will be hard to prove lower bounds in this model, especially super-polylogarithmic lower bounds.
8.1 Argument Labeling Systems
Can the interaction be eliminated? As discussed in Section 1.1 this is not possible without changing the model. A common approach for eliminating interaction is the Fiat-Shamir transformation or heuristic (first used in [FS86]) that converts a public-coins interaction into one without interaction. In the Fiat-Shamir setting the parties have access to a random oracle, and the prover is computationally limited: it can only perform a (polynomially) bounded number of queries to the random oracle. This results in an argument system rather than a proof system. In such a system, proofs of false statements exist, but it is computationally hard to find them. Therefore, such protocols do not contradict the lower bounds for proof labeling schemes. We call such a system an “argument labeling scheme”.
Applying such a transformation should be done with some care in general, and even more so in our setting. First, the error probability should be small, say, where is what is known as the “security parameter”. That is, the cheating prover has limited running time, but gets (that is, in unary representation) as input. We need the running time of the prover as a function of to be significantly less than . There are general statements regarding the type of protocol for which the Fiat-Shamir transformation preserves soundness (see [CCH+18] and references therein). These include constant round protocols and the GKR protocol, so the protocols considered in this work are covered.
To use the Fiat-Shamir transformation in the distributed setting, we need to apply the random oracle to the entire input, in our case, the graph. While each node has access to the random oracle, they still do not know the entire graph and thus cannot compute . Instead, we let each node apply to its local neighborhood. Then, we combine all the results using a spanning tree. That is, for a node , let be its children in the tree. Suppose that compresses strings of arbitrary length into strings of length . We define the values computed by node . The computation works from the leaf nodes up to the root. The leaf nodes set the value (where are the neighbors of ). Then, an inner node whose children’s values are computes . Finally, the value of the oracle is where is the root vertex (this is also known as “Merkle Tree computation”).
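The bottom-up hashing along the spanning tree can be sketched as follows. SHA-256 stands in for the random oracle, and the way a node serializes its neighborhood and concatenates its children's values is an assumption made for the example; the exact formulas are not legible in this copy of the text.

```python
import hashlib


def oracle(data: bytes) -> bytes:
    """Stand-in for the random oracle (an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def encode_neighborhood(v, neighbors) -> bytes:
    """Hypothetical serialization of a node's local view."""
    return repr((v, sorted(neighbors))).encode()


def tree_hash(adj, children, root) -> bytes:
    """Combine the local hashes from the leaves up to the root."""
    order, stack = [], [root]
    while stack:                       # pre-order; reversed, it is children-first
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    value = {}
    for v in reversed(order):
        # A leaf hashes only its neighborhood; an inner node also
        # appends the values already computed by its children.
        child_values = b"".join(value[c] for c in children.get(v, []))
        value[v] = oracle(encode_neighborhood(v, adj[v]) + child_values)
    return value[root]
```

Each node only ever hashes its own neighborhood together with fixed-length values received from its children, so no node needs to know the whole graph.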
Note that since the tree is chosen by the prover, it introduces another source of cheating for a dishonest prover. For each tree used by the prover, it gets a single value of the oracle function. It can be proven that every change in the tree requires at least one new call to the oracle (w.h.p.). Since the prover has limited running time, it can try only a bounded number of trees, in fact, polynomially many, which can be shown to have only a negligible effect on the success probability.
With this approach, we can obtain an argument labeling system with bits for all the problems discussed in this work in a setting where the prover is polynomial given the witness. This includes “permutation” and “Symmetry” as well as all problems solvable in small space and in NC.
However, once we settled for computational soundness we can get even more general results: we can apply it to the setting of verification of computation in the style of Kilian [Kil92] and Micali [Mic00] (see also Barak and Goldreich [BG08]) where the correctness of a computation inside P is proven using a PCP proof. The proof itself is not communicated to the verifier but rather committed using a collision resistant hash function (a cryptographic primitive which exists in the random oracle world, as well as from collision resistant hashing).
The main challenge in incorporating these techniques in our setting is that the input is assumed to be encoded by a linear code, as in [BFLS91]. In our case, the input is the graph and we cannot encode the graph using a distributed verifier since if it is dense it is too large. Instead, we have the prover encode the graph and broadcast a short commitment of the encoding to the nodes. Then, the verifier needs to check a small number of locations in the PCP proof (say, locations) and a similar amount in the encoded input. The network needs to verify the PCP proof of [BFLS91]. Some of the work can simply be done by a chosen leader, but the sensitive part that requires “the whole village” is verifying the correctness of these locations in the encoded graph. This can be done by a distributed verifier: If is the generating matrix of the code ( is a fixed matrix known to all), then computing a single location in the encoding corresponds to the inner product of and where is a long vector that represents the graph, is the index in the encoded word and is the row of . Each node can locally compute, for its neighborhood, the inner product with the corresponding locations in , and the full inner product can be easily computed “up the tree” in the graph as we have seen. So the result is that any problem in P has an argument labeling scheme of length in the random oracle model.
An intriguing question is whether the recent exciting results on using the Fiat-Shamir method without random oracles are relevant in the distributed setting (e.g. [KPY18, CCH+18]). A reasonable modeling assumption in this setting is the common random string or common reference string.
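The distributed inner-product computation can be sketched as below. The sketch assumes the graph is represented by its flattened adjacency matrix, that node `v` is responsible for the entries of row `v`, and that arithmetic is over a prime field with modulus `P`; these choices and the function names are illustrative.

```python
P = 2**61 - 1  # assumed prime modulus standing in for the field


def local_term(v, adj, row, n):
    """Contribution of node v: entries (v, u) of the adjacency matrix
    that equal 1, multiplied by the matching positions of the code row."""
    return sum(row[v * n + u] for u in adj[v]) % P


def inner_product_up_tree(adj, children, root, row, n):
    """Sum the local contributions from the leaves up to the root."""
    order, stack = [], [root]
    while stack:                       # pre-order; reversed, it is children-first
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    partial = {}
    for v in reversed(order):
        below = sum(partial[c] for c in children.get(v, []))
        partial[v] = (local_term(v, adj, row, n) + below) % P
    return partial[root]
```

The value held by the root equals the inner product of the code row with the flattened adjacency matrix, since every matrix entry is counted by exactly one node.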
8.2 Open Questions
There are many interesting questions arising from this work (see also the questions in the body of the paper). Can we trade off interaction for communication? We have seen an example where going from dAM to dMAM (Symmetry) reduces the communication exponentially, so at the very least we expect to pay a significant cost. This is particularly useful when the communication is . Is there a general reduction from private coins to public coins with a distributed verifier? Does having shared private randomness help?
Finally, a natural property to consider is distributed protocols that are zero-knowledge. For public-coins protocols we note that if our compiler (any one of them) gets a zero-knowledge protocol as input then the output protocol will also be zero-knowledge. One question is whether we can do more in this model.
Acknowledgments
We are grateful to Guy N. Rothblum for in-depth discussions about interactive protocols and in particular for the work on Theorem 10, which is joint with him. We also thank Ron D. Rothblum for explaining [RRR16] and the parameters in it.
- [AKY97] Yehuda Afek, Shay Kutten, and Moti Yung. The local detection paradigm and its application to self-stabilization. Theor. Comput. Sci., 186(1-2):199–229, 1997.
- [AW09] Scott Aaronson and Avi Wigderson. Algebrization: A new barrier in complexity theory. TOCT, 1(1):2:1–2:54, 2009.
- [BEG+94] Manuel Blum, William S. Evans, Peter Gemmell, Sampath Kannan, and Moni Naor. Checking the correctness of memories. Algorithmica, 12(2/3):225–244, 1994.
- [BFLS91] László Babai, Lance Fortnow, Leonid A. Levin, and Mario Szegedy. Checking computations in polylogarithmic time. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, May 5-8, 1991, New Orleans, Louisiana, USA, pages 21–31, 1991.
- [BFP15] Mor Baruch, Pierre Fraigniaud, and Boaz Patt-Shamir. Randomized proof-labeling schemes. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, Donostia-San Sebastián, Spain, July 21 - 23, 2015, pages 315–324, 2015.
- [BFS86] László Babai, Peter Frankl, and Janos Simon. Complexity classes in communication complexity theory (preliminary version). In 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 27-29 October 1986, pages 337–347, 1986.
- [BG08] Boaz Barak and Oded Goldreich. Universal arguments and their applications. SIAM J. Comput., 38(5):1661–1694, 2008.
- [BM88] László Babai and Shlomo Moran. Arthur-Merlin games: A randomized proof system, and a hierarchy of complexity classes. J. Comput. Syst. Sci., 36(2):254–276, 1988.
