An exponential lower bound for Individualization-Refinement algorithms for Graph Isomorphism
Daniel Neuen, Pascal Schweitzer

TL;DR
This paper proves that certain graph isomorphism algorithms based on individualization-refinement require exponential time in the worst case, establishing a fundamental complexity lower bound.
Contribution
It constructs a family of graphs that necessitate exponential runtime for all variants of individualization-refinement algorithms, including those with heuristics and Weisfeiler-Leman enhancements.
Findings
Constructed graphs require exponential time for individualization-refinement algorithms.
Lower bounds hold even with heuristics, invariants, and Weisfeiler-Leman enhancements.
Results apply when automorphism groups are provided to the algorithms.
An exponential lower bound for Individualization-Refinement algorithms for Graph Isomorphism
Daniel Neuen and Pascal Schweitzer
RWTH Aachen University
{neuen,schweitzer}@informatik.rwth-aachen.de
Abstract
The individualization-refinement paradigm provides a strong toolbox for testing isomorphism of two graphs and indeed, the currently fastest implementations of isomorphism solvers all follow this approach. While these solvers are fast in practice, from a theoretical point of view, no general lower bounds concerning the worst case complexity of these tools are known. In fact, it is an open question whether individualization-refinement algorithms can achieve upper bounds on the running time similar to the more theoretical techniques based on a group theoretic approach.
In this work we give a negative answer to this question and construct a family of graphs on which algorithms based on the individualization-refinement paradigm require exponential time. In contrast to a previous construction of Miyazaki, which applies only to a specific implementation within the individualization-refinement framework, our construction is immune to changing the cell selector or adding various heuristic invariants to the algorithm. Furthermore, our graphs also provide exponential lower bounds when the $k$-dimensional Weisfeiler-Leman algorithm is used to replace the standard color refinement operator, and the arguments even work when the entire automorphism group of the inputs is initially provided to the algorithm.
1 Introduction
The individualization-refinement paradigm provides a strong toolbox for testing isomorphism of two graphs. To date, algorithms that implement the individualization-refinement paradigm constitute the fastest practical algorithms for the graph isomorphism problem and for the task of canonically labeling combinatorial objects.
Originally exploited by McKay's software package nauty [14] as early as 1981, the basic principle is, in a nutshell, to classify vertices using a refinement operator according to an isomorphism-invariant property. In a basic form one usually uses the so-called color refinement operator, also called the 1-dimensional Weisfeiler-Leman algorithm, for this purpose. Whenever the refinement is not sufficient, vertices within a selected color class (usually called a cell) are individualized one by one in a backtracking manner so as to artificially distinguish them from other vertices. This yields a backtracking tree that is traversed to explore the structure of the input graphs. Additional pruning with the use of invariants and the exploitation of automorphisms of the graphs makes the approach viable in practice, leading to the fastest isomorphism solvers currently available. The use of invariants also allows us to define a smallest leaf, which can be used to canonically label the graph, i.e., to rearrange the vertices in a canonical fashion so as to obtain a standard copy of the graph.
There are several highly efficient isomorphism software packages implementing the paradigm. Among them are nauty/traces [15], bliss [11], conauto [13] and saucy [6]. While they all follow the basic individualization-refinement paradigm, these algorithms differ drastically in design principles and algorithmic realization. In particular, they differ in the way the search tree is traversed, they use different low level subroutines, have diverse ways to perform tasks such as automorphism detection, and they use different cell selection strategies as well as vertex invariants and refinement operators.
With Babai's [2] recent quasi-polynomial time algorithm for the graph isomorphism problem, the theoretical worst case complexity of algorithms for the graph isomorphism problem was drastically improved from a previous best $e^{O(\sqrt{n \log n})}$ (see [3]) to $O(n^{\log^{c} n})$ for some constant $c$. As an open question, Babai asks [2] for the worst case complexity of algorithms based on individualization-refinement techniques. About this worst case complexity, very little had been known.
In 1995 Miyazaki [16] constructed a family of graphs on which the then current implementation of nauty has exponential running time. For this purpose these graphs are designed to specifically fool the cell selection process into exponential behavior. However, as Miyazaki also argues, with a different cell selection strategy the examples can be solved in polynomial time within the individualization-refinement paradigm.
In this paper we provide general lower bounds for individualization-refinement algorithms with arbitrary combinations of cell selection, refinement operators, invariants and even given perfect automorphism pruning. More precisely, the graphs we provide yield an exponential size search tree (i.e., $2^{\Omega(n)}$ nodes) for any combination of refinement operator, invariants, and cell selector which are not stronger than the $k$-dimensional Weisfeiler-Leman algorithm for some fixed dimension $k$. The natural class of algorithms for which we thus obtain lower bounds encompasses all software packages mentioned above, even with the various combinations of switches that can be turned on and off to tune the algorithms towards specific input graphs. Our graphs are asymmetric, i.e., have no non-trivial automorphisms, and thus no strategy for automorphism detection can help the algorithm to circumvent the exponential lower bound.
Our construction makes use of a construction of Cai-Fürer-Immerman [4] and the multipede construction of Gurevich and Shelah [9] that yields for every dimension $k$ non-isomorphic finite rigid structures that are not distinguishable by the $k$-dimensional Weisfeiler-Leman algorithm. In more detail, our construction starts with a bipartite base graph that is obtained by a simple random process. With high probability such a graph has strong expansion properties ensuring a variant of the meagerness property of [9] suitable for our purposes. Additionally, with high probability the graph has an almost-disjointness property for neighborhoods of vertices from one bipartition class. To the base graph we apply a bipartite variant of the construction of [4]. By individualizing a small fraction of vertices, we can guarantee that the final graphs are rigid (have no non-trivial automorphisms). For our theoretical analysis, we define a closure operator that gives us control over the effect of the Weisfeiler-Leman algorithm on the graphs. Due to the disjointness property this effect is limited. Exploiting automorphisms of subgraphs of the input, we then proceed to argue that there is an exponential number of colorings of the graph that cannot be distinguished. These statements can be combined to show that the search tree of every algorithm within the individualization-refinement framework has exponential size.
Some of the packages above have a mechanism called component recursion (see [12]). We show that even this strategy cannot yield improvements for our examples. We should point out that component recursion was used in Goldberg's result [8] which shows that with the right cell selection strategy and the use of component recursion (in that paper called sections) individualization-refinement algorithms have exponential upper bounds, matching our lower bounds.
We also should remark that, seen as colored graphs, our graphs have bounded color class size and as such isomorphism of the graphs can be decided in polynomial time using simple group theoretic techniques (see [1, 7]).
Since the software packages that follow the individualization-refinement paradigm are designed for practical purposes rather than to obtain theoretical worst case guarantees, the question lies at hand how meaningful the lower bounds provided in the paper are. However, in separate work [18], we investigate practical benchmark graphs. It turns out that constructions related to the ones discussed in this paper in fact yield graphs which, experimentally, pose by far the most challenging graph isomorphism instances available to date.
2 Preliminaries
2.1 Graphs
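A graph is a pair $G = (V, E)$ with vertex set $V = V(G)$ and edge relation $E = E(G)$. In this paper all graphs are finite, simple, undirected graphs. The neighborhood of $v \in V(G)$ is denoted $N(v)$. For a set $X \subseteq V(G)$ let $N(X) := \bigcup_{v \in X} N(v)$.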
An isomorphism from a graph $G$ to another graph $H$ is a bijective mapping $\varphi\colon V(G) \to V(H)$ which preserves the edge relation, that is, $\{v,w\} \in E(G)$ if and only if $\{\varphi(v),\varphi(w)\} \in E(H)$ for all $v,w \in V(G)$. Two graphs $G$ and $H$ are isomorphic ($G \cong H$) if there is an isomorphism from $G$ to $H$. We write $\varphi\colon G \cong H$ to indicate that $\varphi$ is an isomorphism from $G$ to $H$. The isomorphism type of a graph $G$ is the class of graphs isomorphic to $G$. An automorphism of a graph $G$ is an isomorphism from $G$ to itself. By $\mathrm{Aut}(G)$ we denote the group of automorphisms of $G$. A graph $G$ is rigid (or asymmetric) if its automorphism group $\mathrm{Aut}(G)$ is trivial, that is, the only automorphism of $G$ is the identity map.
A vertex coloring of a graph $G$ is a map $\chi\colon V(G) \to C$ into some set of colors $C$. Mostly, we will use vertex colorings into the natural numbers $\mathbb{N}$.
Isomorphisms between two colored graphs $(G,\chi)$ and $(G',\chi')$ are required to preserve vertex colors. Slightly abusing notation we will sometimes not differentiate between $G$ and $(G,\chi)$, if the coloring is apparent from context.
We will also consider vertex colored graphs with a distinguished sequence of not necessarily distinct vertices $(G,\chi,\bar v)$, where $\bar v = (v_1,\dots,v_\ell) \in V(G)^\ell$ for some $\ell \in \mathbb{N}$. For a tuple $\bar v = (v_1,\dots,v_\ell)$ we let $|\bar v| := \ell$ be the length of the tuple. Two such graphs with distinguished sequences $(G,\chi,(v_1,\dots,v_\ell))$ and $(G',\chi',(v'_1,\dots,v'_{\ell'}))$ are isomorphic if $\ell = \ell'$ and there is an isomorphism $\varphi$ from $G$ to $G'$ preserving vertex colors and satisfying $\varphi(v_i) = v'_i$ for all $i \in \{1,\dots,\ell\}$. In analogy to the definition above we write $\varphi\colon (G,\chi,\bar v) \cong (G',\chi',\bar v')$.
2.2 The Weisfeiler-Leman algorithm
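The $k$-dimensional Weisfeiler-Leman algorithm is a procedure that, given a graph $G$ and a coloring of the $k$-tuples of the vertices, computes an isomorphism-invariant refinement of the coloring. Let $\chi_1, \chi_2\colon V^k \to C$ be colorings of the $k$-tuples of vertices of $G$, where $C$ is some set of colors. We say $\chi_1$ refines $\chi_2$ ($\chi_1 \preceq \chi_2$) if for all $\bar v, \bar w \in V^k$ we have
$$\chi_1(\bar v) = \chi_1(\bar w) \;\Rightarrow\; \chi_2(\bar v) = \chi_2(\bar w).$$
Let $(G,\chi)$ be a colored graph (where $\chi$ is a coloring of the vertices) and let $k \geq 1$ be some integer. We set $\chi^k_0$ to be the coloring where each $k$-tuple is colored by the isomorphism type of its underlying induced ordered subgraph. More precisely, we define $\chi^k_0$ in such a way that $\chi^k_0(v_1,\dots,v_k) = \chi^k_0(w_1,\dots,w_k)$ if and only if for all $i \in \{1,\dots,k\}$ it holds that $\chi(v_i) = \chi(w_i)$ and for all $i,j \in \{1,\dots,k\}$ we have $v_i = v_j \Leftrightarrow w_i = w_j$ and $\{v_i,v_j\} \in E(G) \Leftrightarrow \{w_i,w_j\} \in E(G)$. For $k > 1$ we recursively define for $i \geq 0$ the coloring $\chi^k_{i+1}$ by setting $\chi^k_{i+1}(v_1,\dots,v_k) := (\chi^k_i(v_1,\dots,v_k); \mathcal{M})$, where $\mathcal{M}$ is the multiset defined as
$$\mathcal{M} := \{\!\{ (\chi^k_i(w,v_2,\dots,v_k), \chi^k_i(v_1,w,v_3,\dots,v_k), \dots, \chi^k_i(v_1,\dots,v_{k-1},w)) \mid w \in V \}\!\}.$$
For $k = 1$ the definition is analogous but the multiset is defined as $\mathcal{M} := \{\!\{ \chi^1_i(w) \mid w \in N(v_1) \}\!\}$, i.e., iterating only over neighbors of $v_1$.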
By definition, every coloring $\chi^k_{i+1}$ induces a refinement of the partition of the $k$-tuples of the graph $G$ with coloring $\chi^k_i$. Thus, there is some minimal $i$ such that the partition induced by the coloring $\chi^k_{i+1}$ is not strictly finer than the one induced by the coloring $\chi^k_i$ on $G$. For this minimal $i$, we call the coloring $\chi^k_i$ the stable coloring of $G$ and denote it by $\chi^k_{(\infty)}$.
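The one-dimensional case of this iteration (color refinement) can be sketched in a few lines of Python. This is a minimal illustration under assumptions of our own, not code from any of the solvers discussed here: the graph is an adjacency dictionary, initial colors are integers, and the name `color_refinement` is hypothetical. Each round recolors a vertex by its old color together with the multiset of its neighbors' colors, and the loop stops as soon as a round does not split any color class.

```python
from collections import Counter

def color_refinement(adj, coloring):
    """Stable coloring of 1-dimensional Weisfeiler-Leman (color refinement).

    adj: dict mapping each vertex to the set of its neighbors.
    coloring: dict mapping each vertex to an integer color.
    """
    colors = dict(coloring)
    while True:
        # New color = (old color, multiset of neighbor colors), stored as a sortable tuple.
        signature = {
            v: (colors[v], tuple(sorted(Counter(colors[w] for w in adj[v]).items())))
            for v in adj
        }
        # Rename signatures to small integers; sorting keeps the renaming canonical.
        palette = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        # A round that creates no new color class means the partition is stable.
        if len(palette) == len(set(colors.values())):
            return refined
        colors = refined
```

On a path with four vertices and a uniform initial coloring, the two endpoints end up in one class and the two inner vertices in another. Giving one endpoint its own initial color makes the stable coloring discrete.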
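For $k \in \mathbb{N}$, the $k$-dimensional Weisfeiler-Leman algorithm takes as input a colored graph $(G,\chi)$ and returns the colored graph $(G,\chi^k_{(\infty)})$. For two colored graphs $(G,\chi)$ and $(G',\chi')$, we say that the $k$-dimensional Weisfeiler-Leman algorithm distinguishes $G$ and $G'$ with respect to the initial colorings $\chi$ and $\chi'$ if there is some color $c$ such that the sets $\{\bar v \in V(G)^k \mid \chi^k_{(\infty)}(\bar v) = c\}$ and $\{\bar w \in V(G')^k \mid (\chi')^k_{(\infty)}(\bar w) = c\}$ have different cardinalities. We write $G \simeq_k G'$ if the $k$-dimensional Weisfeiler-Leman algorithm does not distinguish between $G$ and $G'$.
We extend the definition to vertex-colored graphs with distinguished vertices. Let $G$ and $G'$ be graphs with vertex colorings $\chi$ and $\chi'$ and let $\bar v = (v_1,\dots,v_\ell)$ and $\bar v' = (v'_1,\dots,v'_\ell)$ be sequences of vertices. Define $\chi_{\bar v}$ as the coloring given by $\chi_{\bar v}(v) := (\chi(v), \{i \mid v_i = v\})$. We call this the coloring obtained from $\chi$ by individualizing $\bar v$. Similarly we define $\chi'_{\bar v'}$ as $\chi'_{\bar v'}(v') := (\chi'(v'), \{i \mid v'_i = v'\})$. Then we say that $(G,\chi,\bar v)$ is not distinguished from $(G',\chi',\bar v')$, in symbols $(G,\chi,\bar v) \simeq_k (G',\chi',\bar v')$, if $G$ and $G'$ are not distinguished by the $k$-dimensional Weisfeiler-Leman algorithm with respect to the initial colorings $\chi_{\bar v}$ and $\chi'_{\bar v'}$. We denote by $\mathrm{WL}_k(G,\chi,\bar v)$ the vertex coloring that is induced by the stable coloring with respect to the initial coloring $\chi_{\bar v}$, that is, $\mathrm{WL}_k(G,\chi,\bar v)(v) := \chi^k_{(\infty)}(v,\dots,v)$ where $\chi^k_{(\infty)}$ is the stable coloring of the $k$-tuples with respect to the initial coloring $\chi_{\bar v}$.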
There is a close connection between the Weisfeiler-Leman algorithm and fixed-point logic with counting. In fact the stable coloring computed by the $k$-dimensional Weisfeiler-Leman algorithm comprehensively captures the information that can be obtained in fixed-point logic with counting using at most $k+1$ variables. We refer to [4, 10] for more details.
Pebble Games.
We will not require details about the information computed by the Weisfeiler-Leman algorithm and rather use the following pebble game that is known to capture the same information. Let $k \in \mathbb{N}$ be a fixed number. For graphs $G, H$ on the same number of vertices and with vertex colorings $\chi$ and $\chi'$, respectively, we define the bijective $k$-pebble game $\mathrm{BP}_k(G,H)$ on $G$ and $H$ as follows:
- The game has two players called Spoiler and Duplicator.
- The game proceeds in rounds. Each round is associated with a pair of positions $(\bar v, \bar w)$ with $\bar v \in (V(G) \cup \{\bot\})^k$ and $\bar w \in (V(H) \cup \{\bot\})^k$.
- The initial position of the game is $((\bot,\dots,\bot),(\bot,\dots,\bot))$.
- Each round consists of the following steps. Suppose the current position of the game is $(\bar v, \bar w) = ((v_1,\dots,v_k),(w_1,\dots,w_k))$.
  (S) Spoiler chooses some $i \in \{1,\dots,k\}$.
  (D) Duplicator picks a bijection $f\colon V(G) \to V(H)$.
  (S) Spoiler chooses $v \in V(G)$ and sets $w := f(v)$.
  The new position is then the pair consisting of $(v_1,\dots,v_{i-1},v,v_{i+1},\dots,v_k)$ and $(w_1,\dots,w_{i-1},w,w_{i+1},\dots,w_k)$.
- Spoiler wins the game if for the current position $(\bar v, \bar w)$ the induced graphs are not isomorphic. More precisely, Spoiler wins if there is an $i \in \{1,\dots,k\}$ such that $v_i = \bot \not\Leftrightarrow w_i = \bot$ or $\chi(v_i) \neq \chi'(w_i)$, or there are $i,j \in \{1,\dots,k\}$ such that $v_i = v_j \not\Leftrightarrow w_i = w_j$ or $\{v_i,v_j\} \in E(G) \not\Leftrightarrow \{w_i,w_j\} \in E(H)$. If the play never ends Duplicator wins.
We say that Spoiler (respectively Duplicator) wins the bijective $k$-pebble game $\mathrm{BP}_k(G,H)$ if Spoiler (respectively Duplicator) has a winning strategy for the game.
Theorem 2.1 (cf. [4, 10]).
Let $G, H$ be two graphs. Then $G \simeq_k H$ if and only if Duplicator wins the pebble game $\mathrm{BP}_{k+1}(G,H)$.
3 Individualization-refinement algorithms
An extensive description of the paradigm of individualization-refinement algorithms is given in [15]. These algorithms capture information about the structure of a graph by coloring the vertices. An initially uniform coloring is first refined in an isomorphism-invariant manner as follows.
A refinement operator is an isomorphism-invariant function $\mathrm{Ref}$ that takes a graph $G$, a coloring $\chi$ and a sequence $\bar v = (v_1,\dots,v_\ell) \in V(G)^\ell$ and outputs a coloring $\mathrm{Ref}(G,\chi,\bar v)$ such that $v_i$ has a unique color for every $i \in \{1,\dots,\ell\}$. In this context isomorphism-invariant means that $\varphi\colon (G,\chi,\bar v) \cong (G',\chi',\bar v')$ implies $\varphi\colon (G,\mathrm{Ref}(G,\chi,\bar v)) \cong (G',\mathrm{Ref}(G',\chi',\bar v'))$. A typical choice for such a refinement would be the 1-dimensional Weisfeiler-Leman algorithm described above, where the vertices in $\bar v$ are artificially given special colors.
A vertex with a unique color is called a singleton and a coloring is called discrete if all vertices are singletons. Due to the isomorphism-invariance, every isomorphism must preserve the refined colors. Thus, in case the refinement operator produces a discrete coloring on a graph $G$, it is trivial to check whether this graph is isomorphic to another graph $H$. Indeed, the refinement of $H$ must also be discrete and there is at most one color preserving bijection between the vertex sets, which can be trivially checked for being an isomorphism. However, if the coloring of $G$ is not discrete we need to do more work. In this case we select a color class, usually called a cell, and then individualize a single vertex from the class. Here individualization means to refine the coloring by making the vertex a singleton. Since such an operation is not necessarily isomorphism-invariant, we branch over all choices of this vertex within the chosen cell. To the coloring with the newly individualized vertex, we apply the refinement operator again and proceed in a recursive fashion. To explain this in more detail we first need to clarify how the cell is chosen.
Let $G$ be a graph and $\chi$ be a coloring of the vertices. A cell selector is an isomorphism-invariant function $\mathrm{Sel}$ which takes as input a graph $G$ and a coloring $\chi$ and either outputs a color $c$ with $|\chi^{-1}(c)| \geq 2$ if such a color exists or $\bot$ otherwise. In this context isomorphism-invariant means that $(G,\chi) \cong (G',\chi')$ implies $\mathrm{Sel}(G,\chi) = \mathrm{Sel}(G',\chi')$. The performance of an individualization-refinement algorithm can drastically depend on the cell selection strategy. A typical strategy would be to take the first class of smallest size.
Let $G$ be a graph with an initial coloring $\chi$. Let $\mathrm{Sel}$ be a cell selector and $\mathrm{Ref}$ a refinement operator. Inductively define the search tree $T^{\mathrm{Ref}}_{\mathrm{Sel}}(G,\chi)$ as follows. The root of the tree is labeled with the empty sequence $()$. Let $\bar v = (v_1,\dots,v_\ell)$ be a node of the search tree. Let $\chi' := \mathrm{Ref}(G,\chi,\bar v)$ be the coloring computed by the refinement operator for the current sequence and let $c := \mathrm{Sel}(G,\chi')$ be the color selected by the cell selector. If $c = \bot$ then $\bar v$ is a leaf of the search tree and the coloring $\chi'$ is discrete. Otherwise, for each $v \in (\chi')^{-1}(c)$, there is a child node labeled with $(v_1,\dots,v_\ell,v)$. The vertices of the search tree are referred to as nodes and we identify them with the sequence of vertices they are labeled with.
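The inductive definition of the search tree translates directly into a recursive enumeration. The following Python sketch is only an illustration under stated assumptions: it uses color refinement as the refinement operator and "smallest non-singleton class, ties broken by color" as the cell selector, and the names `refine`, `first_smallest_cell` and `search_tree` are hypothetical. It performs no invariant or automorphism pruning, so it lists every node of the unpruned tree.

```python
from collections import Counter

def refine(adj, colors):
    """Color refinement (1-dimensional Weisfeiler-Leman) to a stable coloring."""
    colors = dict(colors)
    while True:
        sig = {v: (colors[v], tuple(sorted(Counter(colors[w] for w in adj[v]).items())))
               for v in adj}
        names = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: names[sig[v]] for v in adj}
        if len(names) == len(set(colors.values())):
            return new
        colors = new

def first_smallest_cell(colors):
    """Cell selector: smallest non-singleton color class, or None if discrete."""
    cells = {}
    for v, c in colors.items():
        cells.setdefault(c, []).append(v)
    candidates = [(len(vs), c) for c, vs in cells.items() if len(vs) > 1]
    return sorted(cells[min(candidates)[1]]) if candidates else None

def search_tree(adj, colors):
    """Return all nodes of the search tree as tuples of individualized vertices."""
    nodes = []

    def expand(seq):
        nodes.append(seq)
        # Individualize: the i-th vertex of seq gets the fresh color (1, i),
        # every other vertex keeps its original color as (0, color).
        start = {v: (0, colors[v]) for v in adj}
        for i, v in enumerate(seq):
            start[v] = (1, i)
        cell = first_smallest_cell(refine(adj, start))
        if cell is not None:  # coloring not discrete: branch over the selected cell
            for v in cell:
                expand(seq + (v,))

    expand(())
    return nodes
```

For the 4-cycle with a uniform initial coloring this enumeration produces 13 nodes: the root, four nodes of length one, and eight leaves of length two, one leaf for each of the eight automorphisms of the cycle.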
Pruning with invariants.
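Together a cell selector and a refinement operator are sufficient to build a correct isomorphism test. Indeed, two graphs are isomorphic if and only if they have isomorphic leaves in their search trees. For these leaves, due to having a discrete coloring, isomorphism is trivial to check. However, there are two further ingredients that are crucial for the efficiency of practical individualization-refinement algorithms. These are the use of node invariants and the exploitation of automorphisms. Let $\mathcal{I}$ be a totally ordered set. A node invariant is an isomorphism-invariant function $\mathrm{Inv}$ that takes a graph $G$, a coloring $\chi$ and a sequence $\bar v$ and outputs an element $\mathrm{Inv}(G,\chi,\bar v) \in \mathcal{I}$ such that for all vertex sequences $\bar v = (v_1,\dots,v_\ell)$ and $\bar v' = (v'_1,\dots,v'_\ell)$ of equal length
- (i) if $\mathrm{Inv}(G,\chi,\bar v) < \mathrm{Inv}(G,\chi,\bar v')$ then it also holds for all $w, w' \in V(G)$ that $\mathrm{Inv}(G,\chi,(v_1,\dots,v_\ell,w)) < \mathrm{Inv}(G,\chi,(v'_1,\dots,v'_\ell,w'))$ and
- (ii) if $\mathrm{Ref}(G,\chi,\bar v)$ and $\mathrm{Ref}(G,\chi,\bar v')$ are discrete and $\mathrm{Inv}(G,\chi,\bar v) = \mathrm{Inv}(G,\chi,\bar v')$ then $(G,\mathrm{Ref}(G,\chi,\bar v)) \cong (G,\mathrm{Ref}(G,\chi,\bar v'))$.
Here isomorphism-invariant means that $(G,\chi,\bar v) \cong (G',\chi',\bar v')$ implies $\mathrm{Inv}(G,\chi,\bar v) = \mathrm{Inv}(G',\chi',\bar v')$.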
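Let $\mathrm{Inv}$ be a node invariant, write $T := T^{\mathrm{Ref}}_{\mathrm{Sel}}(G,\chi)$ for the search tree without invariant pruning, and define
$$V^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G,\chi) := \{\bar v \in V(T) \mid \mathrm{Inv}(G,\chi,\bar v) \leq \mathrm{Inv}(G,\chi,\bar v') \text{ for all } \bar v' \in V(T) \text{ with } |\bar v'| = |\bar v|\}.$$
Finally, define the search tree $T^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G,\chi)$ as the subtree induced by the node set $V^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G,\chi)$. Observe that Property (i) implies that $T^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G,\chi)$ is indeed a tree. By using the invariant we thus cut off the parts of the search tree that do not contain nodes that are minimal among all nodes on their level. However, due to isomorphism invariance, the property that two graphs are isomorphic if and only if they have isomorphic leaves remains.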
The use of an invariant also makes it easy to define a canonical labeling using a leaf for which the invariant is smallest. For the purpose of obtaining our lower bounds we will not require detailed information on the concept of a canonical labeling and rather refer to [15].
Pruning with automorphisms.
The second essential ingredient needed for the practicality of individualization-refinement algorithms is the exploitation of automorphisms. Indeed, if for two nodes in the search tree, labeled with $\bar v$ and $\bar v'$, respectively, we have $(G,\chi,\bar v) \cong (G,\chi,\bar v')$ then it is sufficient to explore only one of the subtrees corresponding to the two nodes (we refer to [15] for correctness arguments). Thus automorphisms that are detected by the algorithm can be used to cut off further parts of the search tree. An efficient strategy for the detection of automorphisms is an essential part of individualization-refinement algorithms and here the various packages differ drastically (see [15]). To make our lower bounds only stronger, in this paper we take the following standpoint. We will assume that all automorphisms of the input graph are provided to the algorithm in the beginning at no cost. In fact the following lower bound will be sufficient for our purposes.
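Proposition 3.1.
The running time of an individualization-refinement algorithm with cell selector $\mathrm{Sel}$, refinement operator $\mathrm{Ref}$ and invariant $\mathrm{Inv}$ on a graph $G$ is bounded from below by $|T^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G)| / |\mathrm{Aut}(G)|$.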
The program nauty has an extensive selection of refinement operators/invariants that can be activated via switches. To name a couple, there are various options to count for each vertex the number of vertices reachable by paths with vertex colors of a certain type, options to count substructures such as triangles, quadrangles, cliques up to size 10, independent sets up to size 10, and even options to count the number of Fano planes (the projective plane with 7 points and 7 lines). Finally, the user can implement their own invariant via a provided interface.
Our goal is to make a comprehensive statement about individualization-refinement algorithms independent of the choices for $\mathrm{Sel}$, $\mathrm{Ref}$, and $\mathrm{Inv}$. However, there is an intrinsic limitation here. For example, a complete invariant that can distinguish any two non-isomorphic graphs would yield a polynomial-size search tree. So would a refinement operator that refines every coloring into the orbit partition under the automorphism group. However, we do not know how to compute these two examples efficiently. In fact computing either of these is at least as hard as the isomorphism problem itself.
Of course it is nonsensical to allow that an individualization-refinement algorithm uses a subroutine that already solves the graph isomorphism problem. It becomes apparent that there is a limitation to the operators we can allow. However, within this limitation we strive to be as general as possible. With this in mind, throughout this paper we require that the information computed by refinement operators, invariants and cell selectors can be captured by a fixed dimension of the Weisfeiler-Leman algorithm. This is the case for all available choices in all the practical algorithms. In what follows, we describe the requirement more formally.
We say a cell selector $\mathrm{Sel}$ is $k$-realizable if $\mathrm{Sel}(G,\chi) = \mathrm{Sel}(G',\chi')$ whenever $(G,\chi) \simeq_k (G',\chi')$. Similarly a node invariant $\mathrm{Inv}$ is $k$-realizable if $\mathrm{Inv}(G,\chi,\bar v) = \mathrm{Inv}(G',\chi',\bar v')$ whenever $(G,\chi,\bar v) \simeq_k (G',\chi',\bar v')$. Intuitively this means that whenever the $k$-dimensional Weisfeiler-Leman algorithm cannot distinguish between the graphs associated with two nodes of the refinement tree then the cell selector and the node invariant have to behave in the same way on both nodes. Finally a refinement operator $\mathrm{Ref}$ is $k$-realizable if $\mathrm{WL}_k(G,\chi,\bar v) \preceq \mathrm{Ref}(G,\chi,\bar v)$ holds for all triples $(G,\chi,\bar v)$.
We want to stress the fact that all the operators used in all practical implementations (e.g. nauty/traces, bliss, conauto, etc.) are $k$-realizable for some small constant $k$. In fact, from a theoretical point of view it would always be better to directly use the $k$-dimensional Weisfeiler-Leman algorithm as a refinement operator, since it is polynomial-time computable. Let us remark that the only reason why the individualization-refinement algorithms do not do this is the excessive running time and space consumption.
Based on these definitions we can now formulate our main result, which implies an exponential lower bound for individualization-refinement algorithms within the framework.
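Theorem 3.2.
For every constant $k \in \mathbb{N}$ there is a family of rigid graphs $(G_n)_{n \in \mathbb{N}}$ with $|V(G_n)| \in O(n)$ such that for every $k$-realizable cell selector $\mathrm{Sel}$, every $k$-realizable refinement operator $\mathrm{Ref}$, and every $k$-realizable node invariant $\mathrm{Inv}$ it holds that
$$|T^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G_n)| \in 2^{\Omega(n)}.$$
Together with Proposition 3.1 this implies exponential lower bounds on the running time of individualization-refinement algorithms.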
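For the theorem we construct graphs which have large search trees. To prove that these search trees are large, we use the following lemma stating that for every two tuples $\bar v, \bar v'$ for which $(G,\chi,\bar v) \simeq_k (G,\chi,\bar v')$ holds, either both tuples are nodes in the search tree or neither of them is.
Lemma 3.3.
Suppose $k \in \mathbb{N}$ and let $\mathrm{Sel}$ be a $k$-realizable cell selector, $\mathrm{Inv}$ a $k$-realizable node invariant and $\mathrm{Ref}$ a $k$-realizable refinement operator. Furthermore, let $(G,\chi)$ be a colored graph and suppose $\bar v$ is a node of the search tree $T := T^{\mathrm{Ref}}_{\mathrm{Sel},\mathrm{Inv}}(G,\chi)$. Let $\ell := |\bar v|$. Then $\bar v' \in V(T)$ for every $\bar v' \in V(G)^\ell$ with $(G,\chi,\bar v) \simeq_k (G,\chi,\bar v')$.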
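Proof.
We prove the statement by induction on $\ell$. For $\ell = 0$ the statement trivially holds since $\bar v = \bar v' = ()$. So suppose $\ell > 0$. Let $\bar u$ be the tuple obtained from $\bar v$ by deleting the last entry and similarly define $\bar u'$. Clearly, $\bar u \in V(T)$ and $(G,\chi,\bar u) \simeq_k (G,\chi,\bar u')$. So by induction hypothesis it follows that $\bar u' \in V(T)$. Let $\chi_1 := \mathrm{Ref}(G,\chi,\bar u)$ and $\chi_2 := \mathrm{Ref}(G,\chi,\bar u')$. Because $\mathrm{Sel}$ and $\mathrm{Ref}$ are $k$-realizable we get that $\mathrm{Sel}(G,\chi_1) = \mathrm{Sel}(G,\chi_2)$. So $\chi_1(v_\ell) = \mathrm{Sel}(G,\chi_1)$ and also $\chi_2(v'_\ell) = \mathrm{Sel}(G,\chi_2)$ since $(G,\chi,\bar v) \simeq_k (G,\chi,\bar v')$. Thus, $\bar v'$ is a node of the search tree without invariant pruning. Furthermore, $\mathrm{Inv}(G,\chi,\bar v) = \mathrm{Inv}(G,\chi,\bar v')$ which implies that $\bar v' \in V(T)$. ∎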
In the light of the lemma, to prove our lower bound it suffices to construct a graph $G$ whose search tree has a node $\bar v$ with an exponential number of equivalent tuples. To argue the existence of these we roughly proceed in two steps. First, we show that to obtain a discrete partition we have to individualize a linear number of vertices; in other words, we show that the search tree is of linear height. Thus, there is a node $\bar v$ in the corresponding search tree such that $|\bar v|$ is linear in the number of vertices of $G$. Then, in a second step, we show that if $|\bar v|$ is sufficiently large, there are exponentially many equivalent tuples. To find such equivalent tuples we prove a limitation of the effect of the $k$-dimensional Weisfeiler-Leman algorithm after individualizing the vertices from $\bar v$. Intuitively we identify a subgraph containing $\bar v$ which encapsulates the effect of the Weisfeiler-Leman algorithm. We then exploit the existence of many automorphisms of this subgraph to demonstrate the existence of many equivalent tuples.
4 The multipede construction
We describe a construction of a graph $R(G)$ from a bipartite base graph $G = (V,W,E)$. The construction is a combination of the Cai-Fürer-Immerman construction [4] giving graphs with large Weisfeiler-Leman dimension and the construction of Gurevich and Shelah of multipedes [9] that yields rigid structures with such properties.
The Cai-Fürer-Immerman gadget.
For a non-empty finite set $S$ we define the CFI gadget $X_S$ to be the following graph. For each $w \in S$ there are vertices $a(w)$ and $b(w)$ and for every $A \subseteq S$ with $|A|$ even there is a vertex $m_A$. For every $A \subseteq S$ with $|A|$ even there are edges $\{m_A, a(w)\}$ for all $w \in A$ and $\{m_A, b(w)\}$ for all $w \in S \setminus A$. As an example, a CFI gadget is depicted in Figure 4.1. The graph is colored so that $\{a(w), b(w)\}$ forms a color class for each $w \in S$ and so that $\{m_A \mid A \subseteq S,\ |A| \text{ even}\}$ forms a color class.
Let $A \subseteq S$ and $\gamma \in \mathrm{Aut}(X_S)$. We say that $\gamma$ swaps exactly the pairs of $A$ if $\gamma(a(w)) = b(w)$ for $w \in A$ and $\gamma(a(w)) = a(w)$ for $w \in S \setminus A$.
Lemma 4.1 ([4]).
Let $A \subseteq S$. Then there is an automorphism $\gamma \in \mathrm{Aut}(X_S)$ swapping exactly the pairs of $A$ if and only if $|A|$ is even. Additionally, if such an automorphism exists, it is unique.
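For small sets the statement of the lemma can be checked by brute force. The Python sketch below is an illustration only, with a vertex encoding of our own choosing: outer vertices are the pairs `('a', w)` and `('b', w)`, and there is one middle vertex `('m', B)` for every even-size subset `B`, joined to `('a', w)` for `w` in `B` and to `('b', w)` otherwise. The function names are hypothetical. For a prescribed swap set `A`, the code fixes the action on the outer vertices and tries every permutation of the middle vertices.

```python
from itertools import combinations, permutations

def cfi_gadget(S):
    """Return (even subsets, edge set) of the CFI gadget on the finite set S."""
    S = sorted(S)
    evens = [frozenset(B) for r in range(0, len(S) + 1, 2) for B in combinations(S, r)]
    edges = {frozenset({('m', B), ('a' if w in B else 'b', w)}) for B in evens for w in S}
    return evens, edges

def swapping_automorphisms(S, A):
    """All color-preserving automorphisms swapping a(w), b(w) exactly for w in A."""
    evens, edges = cfi_gadget(S)
    A = frozenset(A)
    # Prescribed action on the outer vertices: swap the pair of w iff w is in A.
    outer = {(x, w): (('b' if x == 'a' else 'a') if w in A else x, w)
             for w in S for x in 'ab'}
    found = []
    for image in permutations(evens):
        gamma = dict(outer)
        gamma.update({('m', B): ('m', C) for B, C in zip(evens, image)})
        # gamma is a bijection, so it is an automorphism iff it maps the edge set to itself.
        if {frozenset(gamma[u] for u in e) for e in edges} == edges:
            found.append(gamma)
    return found
```

For a three-element set this finds exactly one automorphism for each swap set of even size and none for swap sets of odd size. The brute force over permutations is only feasible for very small sets.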
The multipede graphs.
Let $G = (V,W,E)$ be a bipartite graph. We define the multipede graph construction as follows. We replace every $v \in V$ by two vertices $a(v)$ and $b(v)$. For each $v \in V$ let $F(v) := \{a(v), b(v)\}$ and for $X \subseteq V$ define $F(X) := \bigcup_{v \in X} F(v)$. We then replace every $w \in W$ by the CFI gadget $X_{N(w)}$ and identify the outer vertices $a(v)$ and $b(v)$ of the gadget with the vertices $a(v)$ and $b(v)$ introduced above for all $v \in N(w)$. The middle vertices will be denoted by $m_A(w)$ for $A \subseteq N(w)$ with $|A|$ even and the set of all middle vertices of $w$ is denoted by $M(w)$. For $Y \subseteq W$ we define $M(Y) := \bigcup_{w \in Y} M(w)$. We also define a vertex coloring for the graphs; however, we only specify the color classes, since the actual names of the colors will be irrelevant to us. In this manner, we require that for each $v \in V$, the pair of vertices $F(v)$ forms a color class, and for each $w \in W$ the set of corresponding middle vertices $M(w)$ forms a color class. The resulting (vertex colored) graph will be denoted by $R(G)$. An example of this construction is shown in Figure 4.2.
For $X \subseteq V$ we further define the graph $R_X(G)$ similar to $R(G)$ but refine the coloring so that for each $v \in X$ both $\{a(v)\}$ and $\{b(v)\}$ form a color class. Hence, $R_{\emptyset}(G) = R(G)$.
Since we are aiming to construct rigid graphs we start by identifying properties of that correspond to having few automorphisms.
Definition 4.2.
Let $G = (V,W,E)$ be a bipartite graph. We say $G$ is odd if for every $\emptyset \neq X \subseteq V$ there exists some $w \in W$ such that $|X \cap N(w)|$ is odd.
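Lemma 4.3.
Let $G = (V,W,E)$ be an odd bipartite graph. Then $R_X(G)$ is rigid for every $X \subseteq V$.
Proof.
Let $\gamma \in \mathrm{Aut}(R_X(G))$ be an automorphism. Due to the coloring of the vertices, we know $\gamma$ stabilizes every set $F(v)$ for $v \in V$ and every set $M(w)$ for $w \in W$.
Let $Y := \{v \in V \mid \gamma(a(v)) = b(v)\}$. Suppose towards a contradiction that $Y \neq \emptyset$. Since $G$ is odd there is some $w \in W$ such that $|Y \cap N(w)|$ is odd. Then $\gamma$ restricts to an automorphism of the gadget $X_{N(w)}$ swapping an odd number of the outer pairs. This contradicts the properties of CFI gadgets (cf. Lemma 4.1).
So $Y = \emptyset$ and thus $\gamma(a(v)) = a(v)$ for all $v \in V$. From this it easily follows that $\gamma$ is the identity mapping. ∎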
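For a bipartite graph $G = (V,W,E)$ let $A_G \in \mathbb{F}_2^{V \times W}$ be the matrix with $(A_G)_{v,w} = 1$ if and only if $\{v,w\} \in E$. We denote by $B^T$ the transpose and by $\mathrm{rk}(B)$ the $\mathbb{F}_2$-rank of a matrix $B$.
Lemma 4.4.
Let $G = (V,W,E)$ be a bipartite graph. Then $|\mathrm{Aut}(R(G))| = 2^{|V| - \mathrm{rk}(A_G)}$.
Proof.
For a matrix $B \in \mathbb{F}_2^{m \times n}$ we define the set $\mathrm{Sol}(B) := \{x \in \mathbb{F}_2^n \mid Bx = 0\}$. To show the lemma it suffices to argue that $|\mathrm{Aut}(R(G))| = |\mathrm{Sol}(A_G^T)|$.
For $\gamma \in \mathrm{Aut}(R(G))$ we define the vector $x_\gamma \in \mathbb{F}_2^V$ by setting $(x_\gamma)_v = 1$ if and only if $\gamma(a(v)) = b(v)$. Observe that the mapping $\gamma \mapsto x_\gamma$ is injective. Furthermore $x_\gamma \in \mathrm{Sol}(A_G^T)$ since for each $w \in W$ the automorphism $\gamma$ swaps an even number of neighbors of $w$.
For the backward direction let $x \in \mathrm{Sol}(A_G^T)$. Then, for each $w \in W$, the set $\{v \in N(w) \mid x_v = 1\}$ has even cardinality. Thus, by the properties of the CFI-gadgets (cf. Lemma 4.1), there is a unique automorphism $\gamma \in \mathrm{Aut}(R(G))$ that swaps exactly those pairs $a(v), b(v)$ for which $x_v = 1$. ∎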
The arguments show that a bipartite graph $G = (V,W,E)$ is odd if and only if $\mathrm{rk}(A_G) = |V|$.
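The counting argument can be checked numerically on small examples. In the proof above, automorphisms correspond to 0/1 vectors indexed by one side of the bipartite graph that have an even number of ones on the neighborhood of every vertex of the other side. By rank-nullity over the two-element field, the number of such vectors is 2 to the power (number of coordinates minus rank). The Python sketch below compares both quantities. The input format (a list of neighborhoods) and the function names are illustrative assumptions.

```python
from itertools import product

def rank_gf2(rows):
    """Rank over F_2 of a matrix given as a list of non-empty 0/1 rows."""
    rows = [int(''.join(map(str, r)), 2) for r in rows]  # pack each row into an int
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit serves as the pivot column
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def even_parity_vectors(neighborhoods, V):
    """Count x in F_2^V with an even number of ones on every neighborhood."""
    return sum(
        all(sum(x[v] for v in N) % 2 == 0 for N in neighborhoods)
        for bits in product((0, 1), repeat=len(V))
        for x in [dict(zip(V, bits))]
    )

def incidence_rows(neighborhoods, V):
    """One 0/1 row per neighborhood, with columns indexed by V."""
    return [[1 if v in N else 0 for v in V] for N in neighborhoods]
```

With `V = [0, 1, 2]` and neighborhoods `{0, 1}`, `{1, 2}`, `{0, 2}` the rank is 2 and there are two even-parity vectors (all zeros and all ones), so this base graph is not odd. Replacing the last neighborhood by `{0, 1, 2}` gives rank 3 and only the zero vector.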
Corollary 4.5.
Let $G = (V,W,E)$ be an odd bipartite graph. Then there is some $W' \subseteq W$ with $|W'| \leq |V|$ such that the induced subgraph $G[V \cup W']$ is odd.
We can also use $\mathrm{rk}(A_G)$ to quantify how many vertices must be individualized to make $R(G)$ rigid.
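Lemma 4.6.
Let $G = (V,W,E)$ be a bipartite graph. Then there is $X \subseteq V$ with $|X| \leq |V| - \mathrm{rk}(A_G)$ such that $R_X(G)$ is rigid.
Proof.
Let $(e_v)_{v \in V}$ be the standard basis for $\mathbb{F}_2^V$ (that is, $(e_v)_{v'} = 1$ if and only if $v = v'$). Furthermore, for $w \in W$, let $a_w$ be the $w$-th row of $A_G^T$ and let $B$ be a minimal subset of $\{e_v \mid v \in V\}$ such that $B \cup \{a_w \mid w \in W\}$ spans the entire space $\mathbb{F}_2^V$. Finally, let $X := \{v \in V \mid e_v \in B\}$. Clearly, $|X| = |V| - \mathrm{rk}(A_G)$.
We argue that $R_X(G)$ is rigid. Let $\gamma \in \mathrm{Aut}(R_X(G))$ and let $x \in \mathbb{F}_2^V$ be the vector obtained by setting $x_v = 1$ if and only if $\gamma(a(v)) = b(v)$. Then $e_v^T x = 0$ for all $v \in X$. Furthermore $a_w^T x = 0$ for all $w \in W$ by the same argument as in the proof of Lemma 4.4. Since $B \cup \{a_w \mid w \in W\}$ spans the entire space it follows by standard linear algebra arguments that $x = 0$. Thus $\gamma$ is the identity. ∎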
5 The Weisfeiler-Leman refinement and the $t$-closure
We wish to understand the effect of the Weisfeiler-Leman refinement on graphs obtained with the construction from the previous section. To this end we define the $t$-closure.
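Definition 5.1.
Let $G = (V,W,E)$ be a bipartite graph and $t \in \mathbb{N}$. For $X \subseteq V$ define the $t$-attractor of $X$ as
$$\mathrm{attr}^t(X) := X \cup \bigcup_{w \in W \colon |N(w) \setminus X| \leq t} N(w).$$
A set $X \subseteq V$ is $t$-closed if $X = \mathrm{attr}^t(X)$. The $t$-closure of $X$ is the unique minimal superset which is $t$-closed, that is
$$\mathrm{cl}^t(X) := \bigcup_{i \geq 0} (\mathrm{attr}^t)^i(X).$$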
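The closure can be computed by iterating the attractor until nothing changes. The Python sketch below rests on an assumption about the displayed formula, which did not survive in this copy of the text: we take the attractor of a set to add the whole neighborhood of every vertex on the other side that has at most `t` neighbors outside the set. The dictionary input format and the names `attractor` and `closure` are hypothetical.

```python
def attractor(neighborhoods, X, t):
    """Assumed t-attractor: X plus every neighborhood with at most t vertices outside X.

    neighborhoods: dict mapping each vertex w of one side to its neighbor set N(w).
    """
    X = set(X)
    pulled = [N for N in neighborhoods.values() if len(N - X) <= t]
    return X.union(*pulled)

def closure(neighborhoods, X, t):
    """Smallest t-closed superset of X, obtained by iterating the attractor."""
    X = set(X)
    while True:
        Y = attractor(neighborhoods, X, t)
        if Y == X:  # X is t-closed
            return X
        X = Y
```

For neighborhoods `{1, 2, 3}`, `{3, 4, 5}` and `{5, 6, 7}` with `t = 1`, the closure of `{1, 2}` is `{1, 2, 3}`, because the second neighborhood still has two vertices outside the set. Starting from `{1, 2, 4}` the closure grows to `{1, 2, 3, 4, 5}`.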
As observed in [9] the 1-closure describes exactly the information the 1-dimensional Weisfeiler-Leman captures.
Lemma 5.2.
Let be a bipartite graph and suppose . Then
[TABLE]
for all .
Proof sketch.
The backward direction follows by an inductive argument from the properties of the CFI gadgets. For the forward direction it is easy to check that the corresponding partition on directly extends to a stable partition for the graph . ∎
Let be a bipartite graph. Slightly abusing notation, for a set define . For define (for a graph and a set the graph is the induced subgraph of with vertex set ).
In this work we essentially argue that the previous lemma can be used to show that for a 1-closed set and a sequence of vertices from , for every automorphism we have . The 1-closure thus gives us a method to find tuples that cannot be distinguished by the 1-dimensional Weisfeiler-Leman algorithm. However, we require such a statement also for higher dimensions. Obtaining a similar statement characterizing the effect of the $k$-dimensional Weisfeiler-Leman algorithm seems to be much more intricate and it is easy to see that the -closure does not achieve this. However, under some additional assumptions, we show that the forward direction of the previous lemma still holds and thus the -closure gives us a tool to control the effect of the $k$-dimensional Weisfeiler-Leman algorithm which is sufficient for our purposes.
Lemma 5.3**.**
Let and suppose . Let be a bipartite graph and be a -closed set. Furthermore suppose that for distinct we haveĀ . Let be a sequence of vertices with and let . Then .
Proof.
We prove that Duplicator has a winning strategy in the bijective -pebble game played on and . Towards this end we say a vertex (respectively ) is pebbled if there exists (respectively ) which is pebbled. Furthermore we say that a vertex is fixed if there is some pebbled with . For a tuple of length at most of pebbled vertices let
[TABLE]
Now let , and suppose there is an isomorphism from to mapping to and to . Observe that extends and for we can choose .
Claim 1.
For every unpebbled with there is some which is neither pebbled nor fixed.
Proof. Consider . Since is -closed we conclude that . By the assumption of the lemma there are at most elements in that are fixed. Thus, contains at least elements which are not fixed. Furthermore, there are at most vertices in that are pebbled. Thus there is at least one element that is neither pebbled nor fixed.
For each unpebbled with choose a that is neither pebbled nor fixed. Furthermore let . For every we define
[TABLE]
where is the set with and denotes the symmetric difference. We can now define the bijection
[TABLE]
It is easy to check that for every there is an isomorphism from to mapping to and to . ∎
6 Meager graphs
Searching for graphs in which we have control over the size of the -closure of a set we generalize the notion of an -meager graph [9].
Definition 6.1.
Let be a bipartite graph and let . The graph is -meager if for every with it holds that .
Meager graphs have two properties that are advantageous for our purposes. The first property is that for sufficiently small the graph has many automorphisms.
Lemma 6.2.
Let be -meager and with . Then
[TABLE]
Proof.
By Lemma 4.4 for with we have that
[TABLE]
The second property that is advantageous for us is that in a meager graph the size of the -closure of a set is larger than itself only by a constant factor.
Lemma 6.3.
Suppose and . Let be -meager and suppose with . Then .
Proof.
Let be a sequence of sets such that and for some with and such that is -closed. Clearly, for every it holds that . Suppose that and set . Then . Hence the meagerness is applicable to . We conclude implying . But this contradicts the definition of . So and thus, . ∎
We now concern ourselves with the existence of meager graphs. However, we require several additional properties. Indeed, in the light of Lemma 5.3 we also want certain neighborhoods to be almost disjoint. Moreover we want the graph to only have few automorphisms, which by Lemma 4.6 translates into the matrix having large rank.
Theorem 6.4.
There exists such that for every with and every sufficiently large there is an -meager graph with
1. ,
2. for all ,
3. for all distinct and
4. .
Our proof of the theorem, which covers the rest of this section, makes use of the fact that bipartite expander graphs are meager.
Definition 6.5.
Let be a bipartite graph with . We call a -expander if for every with it holds that .
One method to obtain bipartite expanders is by considering the following random process. Let be a fixed number. Given vertex sets and with and we obtain a bipartite graph by choosing independently and uniformly at random, for every a set of distinct neighbors in . We refer to [19, Section 4] and [17, Chapter 5.3] for more information on expanders, including variants of the following lemma.
Lemma 6.6.
For sufficiently large, $\Pr\left(G \text{ is a } \left(\frac{1}{10r},\frac{r}{2}\right)\text{-expander}\right)\geq\frac{8}{9}$.
Proof.
Let . For and let denote the probability that . Then
[TABLE]
Furthermore let and . Let be the probability that is not a -expander. Then, using the inequality , we get
[TABLE]
Now let . If is sufficiently large then . It follows that
[TABLE]
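The random process and the expansion property can be explored empirically on small instances. The following is a minimal sketch, not the construction used in the proof: it assumes that each vertex on the choosing side picks r distinct neighbors on the other side, and that an (alpha, beta)-expander is a graph in which every set S of at most alpha times m choosing-side vertices (m being the size of that side) has at least beta times |S| neighbors. Both readings are assumptions, since the corresponding formulas are not reproduced here, and the names `random_bipartite` and `is_expander` are hypothetical. The check is brute force and therefore only feasible for very small graphs.

```python
import itertools
import random


def random_bipartite(num_choosers, num_targets, r, rng):
    """Each chooser independently picks r distinct neighbors uniformly at random."""
    return [frozenset(rng.sample(range(num_targets), r))
            for _ in range(num_choosers)]


def is_expander(neighborhoods, alpha, beta):
    """Brute-force test of the assumed (alpha, beta)-expansion condition.

    neighborhoods[i] is the neighbor set of the i-th chooser.
    Every set S of choosers with |S| <= floor(alpha * m) must
    satisfy |N(S)| >= beta * |S|.
    """
    m = len(neighborhoods)
    max_size = int(alpha * m)
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(range(m), size):
            joint = set().union(*(neighborhoods[i] for i in subset))
            if len(joint) < beta * size:
                return False
    return True
```

A call such as `is_expander(random_bipartite(12, 30, 3, random.Random(0)), 0.25, 1.5)` samples one graph and tests it; repeating this over many seeds gives a rough empirical estimate of the success probability for the chosen parameters.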
Theorem 6.7 (cf. [5, Theorem 1.1]).
For let . Furthermore let be a random matrix where the rows are drawn uniformly and independently from . There is a such that for every fixed it holds that
[TABLE]
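Rank statements of this kind are easy to probe numerically. The following is a minimal sketch, assuming (in line with the title of [5]) that the rows are binary vectors of a fixed weight k and that the rank is taken over the two-element field; the displayed statement of the theorem is not reproduced here, so both points are assumptions. Rows are stored as Python integers used as bitmasks, and the names `gf2_rank` and `random_weight_k_rows` are hypothetical.

```python
import random


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as int bitmasks."""
    basis = {}  # leading bit position -> basis vector with that leading bit
    rank = 0
    for row in rows:
        v = row
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                rank += 1
                break
            # Eliminate the leading bit and continue reducing.
            v ^= basis[lead]
    return rank


def random_weight_k_rows(n, k, m, rng):
    """Draw m rows independently and uniformly among length-n vectors of weight k."""
    rows = []
    for _ in range(m):
        support = rng.sample(range(n), k)
        rows.append(sum(1 << j for j in support))
    return rows
```

Comparing `gf2_rank(random_weight_k_rows(n, 3, m, random.Random(0)))` with m for growing n and a fixed ratio m/n gives an informal picture of when such random matrices tend to have full row rank.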
Lemma 6.8.
Asymptotically almost surely there are no distinct vertices such that .
Proof.
Let be the probability that there are distinct with . Furthermore pick such that for all . Then, for sufficiently large,
[TABLE]
for some constant . ∎
Proof of Theorem 6.4.
Let be sufficiently large and let be a random bipartite graph as described above with and . By Lemma 6.8, Condition 3 is satisfied asymptotically almost surely and Theorem 6.7 implies that Condition 4 is satisfied asymptotically almost surely. So it remains to show that is -meager with positive probability. By Lemma 6.6, with probability at least , the graph is a -expander. Suppose there was an with for which . Then we could choose with (assuming ). But then which is a contradiction. ∎
7 Lower bounds for individualization-refinement algorithms
In the previous section we have proven the existence of meager graphs with various additional properties. We show now that applying the multipede construction to such graphs yields examples where the search tree of individualization-refinement algorithms is large.
Lemma 7.1.
Let and suppose and . Let be -meager and let be a subset of cardinality . Furthermore suppose that for all distinct we have . Let be a sequence with . Then
[TABLE]
Proof.
Let be the -closure of . By Lemma 6.3 we get that . Suppose and let be an extension of with . By Lemmas 5.3 and 6.2 we conclude that
[TABLE]
Let and for let be the projection onto the first components. Clearly, for , it holds that . So
[TABLE]
Since we conclude that
[TABLE]
∎
Theorem 7.2.
Let , and . Suppose is a bipartite graph with such that
1. is -meager,
2. for all ,
3. for all distinct and
4. .
Then there is a subset with such that
1. is rigid and
2. for every -realizable cell selector , every -realizable node invariant and every -realizable refinement operator it holds that
[TABLE]
Proof.
Set . By Lemma 4.6 there is a set of size such that is rigid. Suppose .
Claim 1.
For distinct we have .
Proof. We have .
We choose an arbitrary linear order on the set of vertices in . For a vertex define the projection to be the sequence of the vertices in ordered according to the linear order of (observe that for each either or occurs in the sequence). We extend to by defining for that and .
For a sequence of vertices of we define as the concatenation of the sequences . We let be the subsequence of in which all duplicates are removed starting from the right. (I.e., for a sequence we inductively define the sequence by setting and letting be equal to , if for some , and equal to the concatenation of and , otherwise. Then .)
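The inductive duplicate removal amounts to keeping the first occurrence of every vertex and dropping all later repetitions. A minimal sketch, with the hypothetical name `remove_duplicates` and vertices represented by arbitrary hashable values:

```python
def remove_duplicates(sequence):
    """Keep the first occurrence of each element, preserving the order."""
    seen = set()
    result = []
    for v in sequence:
        # An element that already occurred earlier is skipped,
        # mirroring the case distinction in the inductive definition.
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result
```

For instance, `remove_duplicates(["a", "b", "a", "c", "b"])` yields `["a", "b", "c"]`.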
Claim 2.
For every sequence of vertices of it holds that
[TABLE]
Proof. Observe first that the second inequality holds since there are only color preserving permutations of .
For the first inequality suppose that for we have . Then we can find a lift with so that and have the same color for all . For this lift we have . Since lifts of distinct sequences must be distinct we conclude the claim.
Now let and let .
By Lemma 7.1 and Claim 2 we conclude that for every vertex sequence with we have
[TABLE]
(Here the extra in the exponent comes from the fact that in Lemma 7.1 for each only one can be chosen but here can contain both vertices from .)
Claim 3.
There is a sequence in with .
Proof. Let be a leaf of and let be the length of . Observe that for every . Define as . Note that since . It thus suffices to show that . Assume otherwise. We show that is not discrete and thus is not a leaf. Let be a sequence of which is a prefix that satisfies (possibly ). It suffices now to show that is not discrete. Indeed, by Equation (7.2) we have . Since is rigid, this implies that is not discrete.
Applying the sequence from Claim 3 to Equation (7.2) we obtain that . By Claim 2 this means that . We conclude the proof by an application of Lemma 3.3. ∎
Having already shown in the previous section the existence of graphs that satisfy the requirements of the theorem we just proved, we can now prove the main theorem.
Proof of Theorem 3.2.
Theorem 3.2 now follows by combining Theorems 6.4 and 7.2. ∎
While the graphs we have constructed are colored graphs, we should remark that it is not difficult to turn them into uncolored graphs while preserving the exponential size of the search tree. Indeed, we form the disjoint union of the graph with a path of length where is the number of colors in . We then order the colors and connect the -th vertex of the path with all vertices of color . Finally we add a vertex adjacent to all but the last vertex of the path to obtain a graph . In the resulting uncolored graph , the last vertex of the path is the only vertex with degree 1. Moreover if we apply color-refinement then all newly added vertices are singletons and the partition induced by color classes on in is the same as the one in . This shows that the graph is still rigid. It also implies that each search tree of corresponds to a search tree of of the same size.
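The conversion to an uncolored graph can be sketched in a few lines. The sketch below makes an assumption about a quantity that is not reproduced here: we take the path to have one more vertex than there are colors, so that its last vertex is attached to nothing but the path, which matches the remark that this vertex has degree 1. The function name `uncolor`, the adjacency-dictionary representation and the tuple names of the new vertices are hypothetical, and the original vertex names are assumed not to collide with them.

```python
def uncolor(adj, color):
    """Attach a path and a hub vertex that encode the vertex colors.

    adj:   dict mapping each vertex to the set of its neighbors.
    color: dict mapping each vertex to its color (colors must be sortable).
    Returns the new adjacency dict and the last path vertex.
    """
    colors = sorted(set(color.values()))  # fixes an order on the colors
    ell = len(colors)
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    # Assumption: the path has ell + 1 vertices, one more than colors.
    path = [("path", i) for i in range(ell + 1)]
    hub = ("hub",)
    for p in path:
        new_adj[p] = set()
    new_adj[hub] = set()

    def add_edge(u, v):
        new_adj[u].add(v)
        new_adj[v].add(u)

    for i in range(ell):
        add_edge(path[i], path[i + 1])  # consecutive path vertices
        add_edge(hub, path[i])          # hub sees all but the last path vertex
        for v, c in color.items():
            if c == colors[i]:
                add_edge(path[i], v)    # i-th path vertex sees the i-th color class
    return new_adj, path[-1]
```

On a graph without isolated vertices every original vertex gains one neighbor and thus ends up with degree at least 2, so the returned last path vertex is the unique vertex of degree 1 in the result.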
8 Component Recursion
Component recursion is a mechanism that can be used in addition to an individualization-refinement algorithm to improve its performance. With the right cell selection strategy, using component recursion, individualization-refinement algorithms have exponential upper bounds [8].
Some form of component recursion is, for example, implemented in the newest version of bliss and was demonstrated to yield significant improvements in practice [12]. In this section we argue that for our examples it is not possible to beat the exponential lower bounds even with the use of component recursion, which we explain first.
For a graph we say that two disjoint vertex sets are uniformly joined if or . By definition the sets are uniformly joined if or .
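The notion can be made concrete with a small check. The following minimal sketch assumes the usual reading of the definition, namely that two disjoint vertex sets are uniformly joined if either all or none of the possible edges between them are present; the formulas themselves are not reproduced here, so this reading is an assumption. The name `uniformly_joined` and the adjacency-dictionary representation are hypothetical.

```python
def uniformly_joined(adj, a, b):
    """Check whether the disjoint vertex sets a and b are uniformly joined.

    adj: dict mapping each vertex to the set of its neighbors.
    Assumed meaning: all pairs between a and b are edges, or none are.
    """
    pairs = [(u, v) for u in a for v in b]
    present = sum(1 for u, v in pairs if v in adj[u])
    return present == 0 or present == len(pairs)
```

Under this reading the condition holds trivially whenever one of the two sets is empty, since there are then no pairs to inspect.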
Definition 8.1.
Let be a colored graph and let be a set of vertices. We say that is a color-component of if for all colors the sets and are uniformly joined.
We will not go into great detail on how color-components can be exploited by individualization-refinement algorithms and rather refer to [12]. Intuitively, components can be used to treat parts of the input graph independently. For us it will be sufficient to note the following. If a color-component is a union of color classes (of the stable coloring under 1-dimensional Weisfeiler-Leman) then a suitable cell selector can ensure that the component is explored entirely before the search progresses into vertices outside of the color-component (see [12]). We will show now that for the graphs that appear as nodes in the search tree of our graphs there are no color-components besides those that are unions of color classes.
Definition 8.2.
Let be a colored graph and let be a set of vertices. We say that respects color if either or .
Let be a colored graph and let be a second coloring of the vertices. We call an equitable coloring of if and for all and all it holds that
[TABLE]
Observe that a coloring is equitable if and only if it is not further refined by -dimensional Weisfeiler-Leman, that is, is stable with respect to the -dimensional Weisfeiler-Leman algorithm.
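The neighbor-count condition behind equitability can be tested directly. The sketch below checks the standard condition that any two vertices of the same color have the same number of neighbors in every color class; it does not check the additional relation to the original coloring that the definition also imposes, and the displayed formula is not reproduced here, so the exact form is an assumption. The name `is_equitable` is hypothetical.

```python
from collections import Counter


def is_equitable(adj, coloring):
    """Check that same-colored vertices see every color class equally often.

    adj:      dict mapping each vertex to the set of its neighbors.
    coloring: dict mapping each vertex to its color.
    """
    profiles = {}  # color -> neighbor-color profile of the first vertex seen
    for v, nbrs in adj.items():
        profile = frozenset(Counter(coloring[u] for u in nbrs).items())
        # All vertices of one color must share the same profile.
        if profiles.setdefault(coloring[v], profile) != profile:
            return False
    return True
```

On the path with vertices 0, 1, 2 the coloring that separates the middle vertex from the two ends passes the check, whereas the coloring with a single color fails because the degrees differ.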
Lemma 8.3.
Let be a bipartite graph and let be an equitable coloring of . Suppose is a color-component of . Then is a union of color classes of or the graph has a non-trivial automorphism.
Proof.
Suppose that is not a union of color classes. Then there is some color such that and . Let be the (original) coloring of . Since , there either is a vertex with or there is some with .
Claim 1.
respects all colors for which .
Proof. Suppose otherwise that does not respect and choose such that . Furthermore let be distinct subsets of even size such that for all , and . Choose and . First observe that and form color classes with respect to , because is equitable. Furthermore, does not respect the color classes and .
Now assume without loss of generality that . We distinguish two cases. First suppose that . If then has to respect the color class . Otherwise and has to respect the color class . In both cases we obtain a contradiction.
So . Let . If then has to respect the color class . But is only connected to one of the vertices which again leads to a contradiction. So . But then, looking at and , has to respect the color class . Once again, this is a contradiction.
Now let M=\{i\in[n]\mid\text{Si}\}. For each we have where and . We define
[TABLE]
Claim 2.
.
Proof. Let where for some and for some . First suppose . Then and does not respect the color class . Without loss of generality assume and thus, . Since is equitable there is some with . But then because is a color-component. Hence, does not respect color and by Claim 1. So and thus, .
Similarly, it also follows that if . So preserves the edge relation of and thus, .
∎
Corollary 8.4.
Let be a bipartite graph. If is rigid then every color-component of of an equitable coloring is a union of color classes.
As we mentioned before, color-components that are unions of color classes can be handled with a suitable cell selector. We conclude that the lower bound of Theorem 3.2 also applies to individualization-refinement algorithms that apply component recursion.
9 Discussion
We presented a construction resulting in graphs for which the search tree of individualization-refinement algorithms has exponential size. As a consequence algorithms based on this paradigm have exponential worst case complexity. In particular, this includes the fastest practical isomorphism solvers to date, such as nauty, traces, bliss, saucy and conauto.
While the analysis in this paper is theoretical, constructions related to the ones presented in this paper produce graphs that constitute the practically most difficult instances to date [18].
Our construction gives, for each constant , a family of graphs that lead to exponential size search trees when the -dimensional Weisfeiler-Leman algorithm is used as a refinement operator. However, it is not clear whether we can obtain similar statements if the Weisfeiler-Leman dimension may depend on the number of vertices of the input graph. The recent quasi-polynomial time algorithm due to Babai [2] for example repeatedly benefits from performing the -dimensional Weisfeiler-Leman algorithm where is the number of vertices of the input graphs. In particular, it is an interesting question whether we still obtain exponential size search trees if we allow the Weisfeiler-Leman dimension to be linear in the number of vertices. A related question asks for the maximum Weisfeiler-Leman dimension of rigid graphs.
Another interesting question concerns the complexity of individualization-refinement algorithms for other types of structures. Here, we are particularly interested in the group isomorphism problem (for groups given by multiplication table). What is the running time of the individualization-refinement paradigm for groups? In fact, it is not even known whether groups have bounded Weisfeiler-Leman dimension, that is, whether there is a for which the -dimensional Weisfeiler-Leman algorithm solves group isomorphism.
Finally, all known constructions leading to graphs with large Weisfeiler-Leman dimension are in some way based on the CFI-construction, but it would be desirable to have constructions that are conceptually different. Let us remark again that isomorphism for the graphs we constructed can be decided in polynomial time using techniques from algorithmic group theory. It would be interesting to find explicit constructions that are difficult for both group theoretic techniques and methods based on the Weisfeiler-Leman algorithm.
- [1] L. Babai. Monte Carlo algorithms in graph isomorphism testing. Technical Report 79-10, Université de Montréal, 1979.
- [2] L. Babai. Graph isomorphism in quasipolynomial time [extended abstract]. In D. Wichs and Y. Mansour, editors, Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 684–697. ACM, 2016.
- [3] L. Babai and E. M. Luks. Canonical labeling of graphs. In D. S. Johnson, R. Fagin, M. L. Fredman, D. Harel, R. M. Karp, N. A. Lynch, C. H. Papadimitriou, R. L. Rivest, W. L. Ruzzo, and J. I. Seiferas, editors, Proceedings of the 15th Annual ACM Symposium on Theory of Computing, 25-27 April, 1983, Boston, Massachusetts, USA, pages 171–183. ACM, 1983.
- [4] J. Cai, M. Fürer, and N. Immerman. An optimal lower bound on the number of variables for graph identifications. Combinatorica, 12(4):389–410, 1992.
- [5] N. J. Calkin. Dependent sets of constant weight binary vectors. Combinatorics, Probability & Computing, 6(3):263–271, 1997.
- [6] P. Codenotti, H. Katebi, K. A. Sakallah, and I. L. Markov. Conflict analysis and branching heuristics in the search for graph automorphisms. In 2013 IEEE 25th International Conference on Tools with Artificial Intelligence, Herndon, VA, USA, November 4-6, 2013, pages 907–914. IEEE Computer Society, 2013.
- [7] M. L. Furst, J. E. Hopcroft, and E. M. Luks. Polynomial-time algorithms for permutation groups. In 21st Annual Symposium on Foundations of Computer Science, Syracuse, New York, USA, 13-15 October 1980, pages 36–41. IEEE Computer Society, 1980.
- [8] M. Goldberg. A nonfactorial algorithm for testing isomorphism of two graphs. Discrete Applied Mathematics, 6(3):229–236, 1983.
