Inference and Sampling of $K_{33}$-free Ising Models
Valerii Likhosherstov, Yury Maximov, Michael Chertkov

TL;DR
This paper introduces polynomial-time algorithms for inference and sampling in a broad class of Ising models, including those with $K_{33}$-free topologies, extending beyond planar graphs.
Contribution
It extends tractable inference and sampling algorithms to $K_{33}$-free Ising models, generalizing planar cases to models with complex topologies.
Findings
Polynomial-time algorithms for $K_{33}$-free Ising models.
Extension of tractability from planar to $K_{33}$-free topologies.
Efficient sampling and inference in models with unbounded genus.
Inference and Sampling of $K_{33}$-free Ising Models
Valerii Likhosherstov (1), Yury Maximov (1,2) and Michael Chertkov (1,2,3)
(1) Skolkovo Institute of Science and Technology, Moscow, Russia
(2) Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
(3) Graduate Program in Applied Mathematics, University of Arizona, Tucson, AZ, USA
Abstract
We call an Ising model tractable when it is possible to compute its partition function value (statistical inference) in polynomial time. The tractability also implies an ability to sample configurations of this model in polynomial time. The notion of tractability extends the basic case of planar zero-field Ising models. Our starting point is to describe algorithms for the basic case, computing partition function and sampling efficiently. To derive the algorithms, we use an equivalent linear transition to perfect matching counting and sampling on an expanded dual graph. Then, we extend our tractable inference and sampling algorithms to models whose triconnected components are either planar or graphs of $O(1)$ size. In particular, this results in polynomial-time inference and sampling algorithms for $K_{33}$ (minor)-free topologies of zero-field Ising models, a generalization of planar graphs with a potentially unbounded genus. (Footnote 1: The paper is to appear in the Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. An implementation of the algorithms is available at https://github.com/ValeryTyumen/planar_ising.)
1 Introduction
Computing the partition function of the Ising model is generally intractable; even an approximate solution in the special anti-ferromagnetic case of arbitrary topology would have colossal consequences in complexity theory [Jerrum and Sinclair, 1993]. Therefore, rather than addressing the general case, a question of interest is to look for tractable families of Ising models. In the following, we briefly review tractability related to planar graphs and graphs embedded in surfaces of small genus.
Related work. Onsager [Onsager, 1944] gave a closed-form solution for the partition function in the case of a homogeneous interaction Ising model over an infinite two-dimensional square grid without a magnetic field. This result opened an exciting era of phase transition discoveries and is arguably one of the most significant contributions to theoretical and mathematical physics of the 20th century. Then, Kac and Ward [Kac and Ward, 1952] showed that, in the case of a finite square lattice, the partition function computation is reducible to a determinant. Kasteleyn [Kasteleyn, 1963] generalized the results to the case of an arbitrary inhomogeneous interaction Ising model over an arbitrary planar graph. Kasteleyn's construction was based on mapping the Ising model to a perfect matching (PM) model with specially defined weights over a modified graph, and on the so-called Pfaffian orientation, which allows counting of PMs by finding a single Pfaffian (or determinant) of a matrix. Fisher [Fisher, 1966] simplified Kasteleyn's construction such that the modified graph remained planar. Transition to PM is fruitful because it extends planar zero-field Ising model inference to models embedded on a torus [Kasteleyn, 1963] and, in fact, on any surface of small (orientable) genus $g$, but at the price of an additional multiplicative factor, $4^g$, exponential in genus, in the algorithm's run time [Gallucio and Loebl, 1999].
A parallel way of reducing the planar zero-field Ising model to a PM problem consists of constructing a so-called expanded dual graph [Bieche et al., 1980; Barahona, 1982; Schraudolph and Kamenetsky, 2009]. This approach is more natural and interpretable because there is a one-to-one correspondence between spin configurations and PMs on the expanded dual graph. An extra advantage of this approach is that the reduction allows one to develop exact efficient sampling. Based on linear algebra and planar separator theory [Lipton and Tarjan, 1979], Wilson introduced an algorithm [Wilson, 1997] that allows one to sample PMs over planar graphs in $O(N^{3/2})$ time. The algorithms were implemented in [Thomas and Middleton, 2009; Thomas and Middleton, 2013] for Ising model sampling; however, the implementation was limited to the special case of a square lattice. In [Thomas and Middleton, 2009] a simple extension of Wilson's algorithm to the case of bounded genus graphs was also suggested, again with the $4^g$ factor in complexity. Notice that imposing the zero-field condition is critical, as otherwise the Ising model over a planar graph is NP-hard [Barahona, 1982]. On the other hand, even in the case of zero magnetic field, Ising models over general graphs are difficult [Barahona, 1982].
Contribution. In this manuscript, we discuss tractability related to the Ising model with zero magnetic fields over graphs more general than planar. Our construction is related to graphs characterized in terms of their excluded minor property. Planar graphs are characterized by excluded $K_5$ minor and $K_{33}$ minor (Wagner's theorem [Diestel, 2006], Chapter 4.4). Therefore, instead of attempting to generalize from planar to graphs embedded into surfaces of higher genus, it is natural to consider generalizations associated with a family of graphs excluding $K_{33}$ minor or $K_5$ minor.
In this manuscript, we show that $K_{33}$-free zero-field Ising models are tractable in terms of inference and sampling and give a tight asymptotic bound, $O(N^{3/2})$, for both operations. For that purpose, we use graph decomposition into triconnected components, the result of recursive splitting by pairs of vertices that disconnect the graph. Indeed, the $K_{33}$-free graphs are simple to work with because their triconnected components are either planar or $K_5$ graphs [Hall, 1943]. Therefore, the essence of our construction is to decompose the inference task in Ising over a $K_{33}$-free graph into a sequential dynamic programming evaluation over planar or $K_5$ graphs in the spirit of [Straub et al., 2014]. Notice that the triconnected classification of the tractable zero-field Ising models is complementary to the aforementioned small genus classification. We illustrate the difference between the two classifications with an explicit example of a tractable problem over a graph with genus growing linearly with graph size.
Structure. The manuscript is organized as follows. Sections 2 and 3, respectively, establish notations and pose problems of inference and sampling. Section 4 presents the transition from the zero-field Ising model to an equivalent tractable perfect matching (PM) model. This provides a description of an $O(N^{3/2})$ inference and sampling method in planar models, which is new (to the best of our knowledge), and it sets the stage for what follows. Section 5 discusses a scheme for polynomial inference and sampling in zero-field models over graphs with triconnected components that are either planar or of $O(1)$ size. Section 6 applies this scheme to $K_{33}$-free zero-field Ising models, resulting in tight asymptotic bounds, which appear to be equivalent to those in the planar case. Section 7 describes benchmarks justifying correctness and efficiency of our algorithm. Technical proofs of statements given throughout the manuscript can be found in the supplementary material.
2 Definitions and Notations
Let $V$ be a finite set of vertices and $E$ a multiset of edges, where each edge is a subset $e \subseteq V$ with $|e| = 2$; then we call $G = (V, E)$ a graph. We call $G$ normal if $E$ is a set (i.e., there are no multiple edges in $G$).
A tree is a connected graph without cycles. For $V' \subseteq V$, let $G(V')$ denote the graph $(V', \{e \in E : e \subseteq V'\})$. Let $G' = (V', E')$ be a graph. Then $G'$ is a subgraph of $G$ if $V' \subseteq V$ and $E' \subseteq E$. Vertex $v \in V$ is an articulation point of $G$ if $G(V \setminus \{v\})$ is disconnected. $G$ is biconnected if there are no articulation points in $G$. A biconnected component is a maximal subgraph of $G$ without an articulation point.
The graph $G$ is planar if it can be drawn on a plane without edge intersections. The corresponding drawing is referred to as a planar embedding of $G$. When no ambiguity arises, we do not distinguish a planar graph $G$ from its embedding.
A set $E' \subseteq E$ is called a perfect matching (PM) of $G$ if the edges of $E'$ are disjoint and their union equals $V$. $\mathrm{PM}(G)$ denotes the set of all PMs of $G$. $K_p$ denotes a complete (normal) graph on $p$ vertices, and $K_{33}$ denotes a utility graph. A triple bond is a graph of two vertices and three edges between them. A multiple bond is a graph of two vertices and at least three edges between them.
3 Problem Setup
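Let $G = (V, E)$ be a normal graph, $V = \{v_1, \dots, v_N\}$. For each $v \in V$, define a random binary variable (a spin) $s_v \in \{-1, +1\}$, $S = (s_{v_1}, \dots, s_{v_N})$. Subscript $i$ will be used as shorthand for $v_i$, for brevity, thus $S = (s_1, \dots, s_N)$. For each $e \in E$, define a pairwise interaction $J_e \in \mathbb{R}$. We associate assignment $X = (x_1, \dots, x_N) \in \{-1, +1\}^N$ to vector $S$ with probability as follows:
$$P(S = X) = \frac{1}{Z} \exp\Big( \sum_{e = \{v, w\} \in E} J_e x_v x_w \Big), \quad (1)$$
where
$$Z = \sum_{X \in \{-1, +1\}^N} \exp\Big( \sum_{e = \{v, w\} \in E} J_e x_v x_w \Big).$$
The probability distribution (1) defines the so-called zero-field (or pairwise) Ising model, and $Z$ is called the partition function (PF) of the zero-field Ising (ZFI) model. Notice that $P(S = X) = P(S = -X)$.
Given a ZFI model, our goal is to find $Z$ (inference) and draw samples from the model efficiently.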
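As a point of reference for the efficient algorithms that follow, the definitions above can be checked by exhaustive enumeration on a toy instance. The sketch below is only an illustration under assumptions: the triangle graph and its interaction values are hypothetical, and the exponential-time enumeration is not the method of this paper.

```python
import itertools
import math

def partition_function(n, J):
    """Brute-force Z of a zero-field Ising model on n spins.

    J maps an edge (v, w) to its pairwise interaction J_e.
    The loop visits all 2**n spin assignments, so it is only for tiny n.
    """
    total = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        total += math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items()))
    return total

def probability(x, n, J):
    """P(S = x) according to distribution (1)."""
    energy = sum(j * x[v] * x[w] for (v, w), j in J.items())
    return math.exp(energy) / partition_function(n, J)

# Hypothetical toy model: a triangle with arbitrary interactions.
J = {(0, 1): 0.5, (1, 2): -0.3, (0, 2): 1.2}
x = (1, -1, 1)
flipped = tuple(-s for s in x)
```

Here `probability(x, 3, J)` and `probability(flipped, 3, J)` agree, which is the global spin-flip symmetry $P(S = X) = P(S = -X)$ noted above.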
4 Reducing Planar ZFI Model to PM Model
In this section, we consider a special case of planar graph $G$ and introduce a transition from the ZFI model to the perfect matching (PM) model on a different planar graph.
We assume that the planar embedding of $G$ is given (and if not, it can be found in $O(N)$ time [Boyer and Myrvold, 2004]). We follow [Schraudolph and Kamenetsky, 2009] in constructions discussed in this section.
4.1 Expanded Dual Graph
First, triangulate $G$ by adding new edges $e$ to $E$ such that $J_e = 0$. (The triangulation does not change probabilities of the spin assignments.) The resulting graph (we keep the notation $G$ of the original graph for convenience) is biconnected, and every face, including the one lying on the boundary, is a triangle. Complexity of the triangulation procedure is $O(N)$, see [Schraudolph and Kamenetsky, 2009] for an example.
Second, construct a new graph, $G_F = (V_F, E_F)$, where each vertex $f$ of $V_F$ is a face of $G$, and there is an edge $\{f_1, f_2\}$ in $E_F$ if and only if $f_1$ and $f_2$ share an edge in $G$. By construction, $G_F$ is planar, and it is embedded in the same plane as $G$, so that each new edge $\{f_1, f_2\} \in E_F$ intersects the respective old edge. Call $G_F$ a dual graph of $G$. Since $G$ is triangulated, each $f \in V_F$ has degree 3 in $G_F$.
Third, obtain a planar graph $G^* = (V^*, E^*)$ and its embedding from $G_F$ by substituting each $f \in V_F$ by a $K_3$ triangle so that each vertex of the triangle is incident to one edge going outside the triangle (see Figure 1 for an illustration). Call $G^*$ an expanded dual graph of $G$.
Newly introduced triangles of $G^*$, substituting $G_F$'s vertices, are called Fisher cities [Fisher, 1966]. We refer to edges outside triangles as intercity edges and denote their set as $E^*_I$. The set $E^* \setminus E^*_I$ of Fisher city edges is denoted as $E^*_C$. Notice that each $e^* \in E^*_I$ intersects exactly one $e \in E$ and vice versa, which defines a bijection between $E^*_I$ and $E$; denote it by $g : E^*_I \to E$. Observe also that $|E^*_I| = |E| \leq 3N - 6$, where $N$ is the size of $G$. Moreover, $E^*_I$ is a PM of $G^*$, and thus $|V^*| = 2 |E^*_I| = O(N)$. Since $G^*$ is planar, one also finds that $|E^*| = O(N)$. Constructing $G^*$ takes $O(N)$ steps.
4.2 Perfect Matching (PM) Model
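For $X \in \{-1, +1\}^N$, let $I(X)$ be the set $\{e^* \in E^*_I : g(e^*) = \{v, w\},\ x_v = x_w\}$. Each Fisher city is incident to an odd number of edges in $I(X)$. Thus, $I(X)$ can be uniquely completed to a PM by edges from $E^*_C$. Denote the resulting PM by $M(X) \in \mathrm{PM}(G^*)$ (see Figure 1 for an illustration). Let $C_+ = \{+1\} \times \{-1, +1\}^{N-1}$.
Lemma 1.
$M$ is a bijection between $C_+$ and $\mathrm{PM}(G^*)$.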
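Define weights $c_{e^*}$ on $G^*$ according to
$$c_{e^*} = \begin{cases} \exp(2 J_{g(e^*)}), & e^* \in E^*_I, \\ 1, & e^* \in E^*_C. \end{cases}$$
Lemma 2.
For $E' \in \mathrm{PM}(G^*)$ holds
$$P(M(S) = E') = \frac{1}{Z^*} \prod_{e^* \in E'} c_{e^*}, \quad (2)$$
where
$$Z^* = \sum_{E' \in \mathrm{PM}(G^*)} \prod_{e^* \in E'} c_{e^*} = \frac{1}{2} Z \exp\Big( \sum_{e \in E} J_e \Big) \quad (3)$$
is the PF of the PM distribution (PM model) defined by (2).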
The second transition of (3) reduces the computation of $Z$ to solving for $Z^*$. Furthermore, only two equiprobable spin configurations $X'$ and $-X'$ (one of which is in $C_+$) correspond to $E'$, and they can be recovered from $E'$ in $O(N)$ steps; hence one can sample from (1) whenever one can sample from (2).
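Lemmas 1 and 2 can be sanity-checked on the smallest triangulated planar graph. The sketch below is an illustration under assumptions: it hard-codes $K_4$ (whose four triangles are exactly its faces), uses arbitrary hypothetical interaction values, and enumerates perfect matchings by brute force instead of the Pfaffian machinery used in the paper. Each vertex of the expanded dual is represented as a pair (face index, primal edge).

```python
import itertools
import math

# Hypothetical toy instance: K4 drawn in the plane is already triangulated,
# and its four faces are exactly its four triangles.
N = 4
J = {(0, 1): 0.4, (0, 2): -0.7, (0, 3): 0.2,
     (1, 2): 0.9, (1, 3): -0.1, (2, 3): 0.3}
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

weights = {}     # edge of the expanded dual -> weight c
crossing = {}    # primal edge -> the two city vertices joined across it
for f, face in enumerate(faces):
    # One city vertex per edge of the face; the three of them form a Fisher city.
    corners = [(f, e) for e in itertools.combinations(face, 2)]
    for pair in itertools.combinations(corners, 2):
        weights[pair] = 1.0                      # Fisher city edge
    for corner in corners:
        crossing.setdefault(corner[1], []).append(corner)
for e, (a, b) in crossing.items():
    weights[(a, b)] = math.exp(2 * J[e])         # intercity edge

def matching_weights(vertices):
    """Yield the weight of every perfect matching on `vertices` (brute force)."""
    if not vertices:
        yield 1.0
        return
    first, rest = vertices[0], vertices[1:]
    for other in rest:
        key = (first, other) if (first, other) in weights else (other, first)
        if key in weights:
            remaining = [v for v in rest if v != other]
            for sub in matching_weights(remaining):
                yield weights[key] * sub

dual_vertices = sorted({v for pair in weights for v in pair})
pm_weights = list(matching_weights(dual_vertices))
Z_star = sum(pm_weights)
Z = sum(math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items()))
        for x in itertools.product((-1, 1), repeat=N))
```

On this instance `len(pm_weights)` equals $2^{N-1}$, the size of $C_+$, and `Z_star` matches $\frac{1}{2} Z \exp(\sum_e J_e)$ as stated in (3).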
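The PM model can be defined for an arbitrary graph $\hat{G} = (\hat{V}, \hat{E})$, $|\hat{V}| = \hat{N}$, with positive weights $c_e$, $e \in \hat{E}$, as a probability distribution over $M \in \mathrm{PM}(\hat{G})$: $P(M) \propto \prod_{e \in M} c_e$.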
Our subsequent derivations are based on the following:
Theorem 1.
Given the PM model defined on a planar graph $\hat{G}$ of size $\hat{N}$ with positive edge weights $\{c_e\}$, one can find its partition function and sample from it in $O(\hat{N}^{3/2})$ time.
Algorithms constructively proving the theorem are directly inferred from [Wilson, 1997; Thomas and Middleton, 2009], with minor changes/generalizations. Hence, we outline them in the supplementary material.
Corollary.
Inference and sampling of the PM model on $G^*$ (and, hence, the ZFI model on $G$) take $O(N^{3/2})$ time.
5 Dynamic Programming within Triconnected Components
Starting with this section, we present new results. We describe a general algorithm that allows us to perform inference and sampling from the ZFI model in the case where the triconnected components of the underlying graph are either planar or of $O(1)$ size.
5.1 Decomposition into Biconnected Components
Consider a ZFI model (1) over a normal graph $G = (V, E)$, $|V| = N$. If $G$ is disconnected, then distribution (1) is decomposed into a product of terms associated with independent ZFI models over the connected components of $G$. Hence, we assume below, without loss of generality, that $G$ is connected.
Let $G_1, \dots, G_h$ be biconnected components of $G$. They form a tree if an edge is drawn between $G_i$ and $G_j$ whenever $G_i$ and $G_j$ share an articulation point. A simple reduction (see supplementary material) shows that inference and sampling on $G$ are reduced to a series of inference and sampling on ZFI models induced by subgraphs $G_1, \dots, G_h$.
Lemma 3.
Let $Z_1, \dots, Z_h$ be partition functions of ZFI models induced by $G_1, \dots, G_h$. Then,
$$Z = 2^{1-h} \prod_{i=1}^{h} Z_i.$$
Sampling from $G$ is reduced to a series of sampling on $G_1, \dots, G_h$ and post-processing.
Observe also that all the articulation points and the biconnected components of $G$ can be found in $O(N + |E|)$ steps [Hopcroft and Tarjan, 1973a]. Therefore, later on, we assume without loss of generality that $G$ is biconnected.
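The product formula of Lemma 3 follows from the spin-flip symmetry: fixing the spin at a shared articulation point halves the partition function of each component. A minimal numerical check is sketched below, assuming a hypothetical graph made of two triangles and one bridge, with the biconnected components listed by hand rather than found by the Hopcroft-Tarjan algorithm.

```python
import itertools
import math

def partition_function(vertices, J):
    """Brute-force PF of the ZFI model on `vertices` with interactions J."""
    vertices = sorted(vertices)
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(vertices)):
        x = dict(zip(vertices, signs))
        total += math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items()))
    return total

# Hypothetical graph: triangle {0,1,2}, triangle {2,3,4}, bridge {4,5}.
# Vertices 2 and 4 are articulation points.
components = [
    ({0, 1, 2}, {(0, 1): 0.4, (1, 2): -0.2, (0, 2): 0.7}),
    ({2, 3, 4}, {(2, 3): 1.1, (3, 4): 0.3, (2, 4): -0.5}),
    ({4, 5}, {(4, 5): 0.9}),
]
full_J = {}
for _, J in components:
    full_J.update(J)

h = len(components)
Z_product = 2 ** (1 - h) * math.prod(partition_function(V, J) for V, J in components)
Z_brute = partition_function(range(6), full_J)
```

The two values `Z_product` and `Z_brute` coincide up to floating-point error.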
5.2 Biconnected Graph as a Tree of Triconnected Components
In this subsection we follow [Hopcroft and Tarjan, 1973b; Gutwenger and Mutzel, 2001], see also [Mader, 2008], to define the tree of triconnected components. Following the discussion of the previous subsection, one considers here a biconnected $G$.
Let $a, b \in V$. Divide $E$ into equivalence classes $E_1, \dots, E_k$ so that $e_1, e_2$ are in the same class if they lie on a common simple path that has $a, b$ as endpoints. $E_1, \dots, E_k$ are referred to as separation classes. If $k \geq 2$, then $\{a, b\}$ is a separation pair of $G$, unless (a) $k = 2$ and one of the classes is a single edge or (b) $k = 3$ and each class is a single edge. Graph $G$ is called triconnected if it has no separation pairs.
Let $\{a, b\}$ be a separation pair in $G$ with equivalence classes $E_1, \dots, E_k$. Let $E' = \bigcup_{i=1}^{l} E_i$, $E'' = \bigcup_{i=l+1}^{k} E_i$ be such that $|E'| \geq 2$, $|E''| \geq 2$. Then, graphs $G_1 = (\bigcup_{e \in E'} e, E' \cup \{e_V\})$, $G_2 = (\bigcup_{e \in E''} e, E'' \cup \{e_V\})$ are called split graphs of $G$ with respect to $\{a, b\}$, and $e_V$ is a virtual edge, which is a new edge between $a$ and $b$, identifying the split operation. Due to the addition of $e_V$, $G_1$ and $G_2$ are not normal in general.
Split $G$ into $G_1$ and $G_2$. Continue splitting $G_1$, $G_2$, and so on, recursively, until no further split operation is possible. The resulting graphs are split components of $G$. They can either be $K_3$ (triangles), triple bonds, or triconnected normal graphs.
Let $e_V$ be a virtual edge. There are exactly two split components containing $e_V$: $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. Replacing $G_1$ and $G_2$ with $G' = (V_1 \cup V_2, (E_1 \cup E_2) \setminus \{e_V\})$ is called merging $G_1$ and $G_2$. Do all possible mergings of the cycle graphs (starting from triangles), and then do all possible mergings of multiple bonds starting from triple bonds. Components of the resulting set are referred to as the triconnected components of $G$. We emphasize again that some graphs (i.e., cycles and bonds) in the set of triconnected components are not necessarily triconnected.
Lemma 4.
[Hopcroft and Tarjan, 1973b] Triconnected components are unique for $G$. Total number of edges within the triconnected components is at most $3|E| - 6$.
Consider a graph $T$, where vertices (further referred to as nodes for disambiguation) are triconnected components, and there is an edge between $t$ and $t'$ in $T$ when $t$ and $t'$ share a (copied) virtual edge.
Lemma 5.
[Hopcroft and Tarjan, 1973b] $T$ is a tree.
Example.
Figure 2 illustrates triconnected decomposition of a biconnected graph and intermediate steps towards it.
All triconnected components, and thus $T$, can be found in $O(N + |E|)$ steps [Hopcroft and Tarjan, 1973b; Gutwenger and Mutzel, 2001; Vo, 1983]. Merging of two triconnected components is equivalent to contracting an edge in $T$ (VI on Figure 2). After all possible mergings, $G$ is recovered.
5.3 Inference via Dynamic Programming
Assume that there is a (small) number $c$ bounding the size of each nonplanar triconnected component. In the following, we present a polynomial time algorithm that computes $Z$ for a given (fixed) $c$.
First, one finds triconnected components of $G$ and $T$ in $O(N + |E|)$ steps. Choose a root node $r$ in $T$. For any node $t \neq r$ in $T$, let the next node on the unique path from $t$ to $r$ be the parent of $t$, and $t$ be a child of that node. Nodes which do not have any children are called leaves. For node $t$, let a subtree $T_t$ denote the subgraph of $T$ constructed from $t$, its children, grandchildren, and so on.
Our algorithm processes each node once. A node is only processed when all its children have already been processed, so a leaf is processed first and the root is processed last. Let $t$ be the currently processed node, with vertex set $V(t)$ and edge set $E(t)$. Let $G_t$ be the graph obtained by merging all nodes in $T_t$, with vertex set $V(G_t)$ and edge set $E(G_t)$. If $t$ is the root, then $G_t = G$. Since the root is processed last, it outputs the desired PF, $Z$. Figure 3 provides a visualization of the node processing routine explained below.
If $t$ is not a root, let $e = \{v, w\}$ be the virtual edge shared between $t$ and its parent. The only virtual edge in $G_t$ is $e$, and $G_t$ without $e$ is a subgraph of $G$. Hence, pairwise interactions $J_{e'}$ are defined for $e' \in E(G_t) \setminus \{e\}$. The result of node $t$'s processing is the quantity
$$Z_t^{x', x''} = \sum_{X \in \{-1, +1\}^{V(G_t)} :\ x_v = x',\ x_w = x''} \exp\Big( \sum_{e' = \{a, b\} \in E(G_t) \setminus \{e\}} J_{e'} x_a x_b \Big),$$
where $x', x'' = \pm 1$. Notice that $Z_t^{+1,+1} = Z_t^{-1,-1}$, $Z_t^{+1,-1} = Z_t^{-1,+1}$, and hence it suffices to compute $Z_t^{+1,+1}$ and $Z_t^{+1,-1}$.
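Processing nodes one by one, we notice that the following cases are possible:
1. $t$ is a leaf. Therefore, there is nothing to merge, and $G_t = t$. Let $e = \{v, w\}$ be the virtual edge shared with the parent and $Z_t^{x', x''}$ the sum of Ising weights over configurations of $G_t$ with $x_v = x'$, $x_w = x''$. If $t$ is nonplanar, find $Z_t^{x', x''}$ by brute force enumeration, completed in $O(1)$ steps for fixed $c$. If $t$ is a multiple bond, $Z_t^{x', x''}$ is found in $O(|E(t)|)$ steps.
Assume now that node $t$ is (or corresponds to) a planar, normal graph. Define $J_e = 0$ and consider a ZFI model with the probability $P_t$ defined over graph $t$ with $\{J_{e'}\}_{e' \in E(t)}$ as pairwise interactions. Let $Z'_t$ be the PF of this ZFI model. In the remaining part of this case we will only work with this induced ZFI model, so that one can assume that vertices of $t$ are ordered, $V(t) = \{v_1, v_2, \dots\}$, such that $v = v_1$, $w = v_2$. Then, one utilizes the notations $S = (s_1, s_2, \dots)$ and $X = (x_1, x_2, \dots)$ and derives
$$Z_t^{x', x''} = Z'_t \, P_t(s_1 = x', s_2 = x'') = \frac{1}{2} Z'_t \, P_t(s_1 s_2 = x' x''). \quad (5)$$
Next, one triangulates $t$ by adding enough edges with zero pairwise interactions, similar to how it is done in Subsection 4.1. Assume that $t$ is triangulated, and observe that the right-hand side of Eq. (5) is not affected. Construct $G^*$, which is an expanded dual graph of $t$, with $E^*_I$, $E^*_C$ and $g$ defined as in Subsection 4.1. Then, define mapping $M$, weights $c_{e'}$, and the PF $Z^*$ as in Subsection 4.2. Denote $e^* = g^{-1}(e)$.
According to the definition of $M$,
$$P_t(s_1 = s_2) = P_t(e^* \in M(S)) = \frac{1}{Z^*} \sum_{M' \in \mathrm{PM}(G^*) :\ e^* \in M'} \ \prod_{e' \in M'} c_{e'}. \quad (6)$$
Denote by $G^*_{\mathrm{in}} = G^*(V^* \setminus e^*)$ the graph $G^*$ with both endpoints of $e^*$ removed. We continue the chain of relations/equalities (6) observing that
$$\sum_{M' \in \mathrm{PM}(G^*) :\ e^* \in M'} \ \prod_{e' \in M'} c_{e'} = c_{e^*} \sum_{M'' \in \mathrm{PM}(G^*_{\mathrm{in}})} \ \prod_{e' \in M''} c_{e'} = c_{e^*} Z^*_{\mathrm{in}}.$$
Then one arrives at
$$P_t(s_1 = s_2) = \frac{c_{e^*} Z^*_{\mathrm{in}}}{Z^*},$$
where $Z^*_{\mathrm{in}}$ is the PF of the PM model over $G^*_{\mathrm{in}}$. Compute $Z^*$ and $Z'_t$ in $O(|V(t)|^{3/2})$ steps, as described in Section 4. Since $G^*_{\mathrm{in}}$ is planar of size $O(|V(t)|)$, $Z^*_{\mathrm{in}}$ can also be computed in $O(|V(t)|^{3/2})$ steps, as Theorem 1 states. The following relations finalize computation of $Z_t^{x', x''}$ in $O(|V(t)|^{3/2})$ steps:
$$Z_t^{+1,+1} = Z_t^{-1,-1} = \frac{1}{2} Z'_t \, P_t(s_1 = s_2), \qquad Z_t^{+1,-1} = Z_t^{-1,+1} = \frac{1}{2} Z'_t \, \big(1 - P_t(s_1 = s_2)\big).$$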
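2. $t$ is not a leaf, not a root. Let $t_1, \dots, t_m$ be $t$'s children, and $e_i = \{a_i, b_i\}$ be the virtual edge shared between $t$ and $t_i$, $i = 1, \dots, m$. At this point, we already computed all $Z_{t_i}^{x', x''}$. Each $\{a_i, b_i\}$ is a separation pair in $G_t$ that splits it into $G_{t_i}$ and the rest of $G_t$, containing all $G_{t_j}$, $j \neq i$. Denote all virtual edges in $t$ as $E_{\mathrm{virt}} = \{e, e_1, \dots, e_m\}$, where $e = \{v, w\}$ is the virtual edge shared with the parent, and then the following relation holds:
$$Z_t^{x', x''} = \sum_{X \in \{-1, +1\}^{V(t)} :\ x_v = x',\ x_w = x''} \exp\Big( \sum_{e' = \{a, b\} \in E(t) \setminus E_{\mathrm{virt}}} J_{e'} x_a x_b \Big) \prod_{i=1}^{m} Z_{t_i}^{x_{a_i}, x_{b_i}}. \quad (7)$$
If $t$ is (or corresponds to) a multiple bond, (7) is computed trivially in $O(|E(t)|)$ steps. Hence, one assumes next that $t$ is a normal graph.
Each $Z_{t_i}^{x', x''}$ is positive, and it essentially only depends on the product $x' x''$, that is, there exist $A_i > 0$ and $B_i$ such that $Z_{t_i}^{x', x''} = A_i \exp(B_i x' x'')$. Using this relation, one rewrites (7) as
$$Z_t^{x', x''} = \Big( \prod_{i=1}^{m} A_i \Big) \sum_{X \in \{-1, +1\}^{V(t)} :\ x_v = x',\ x_w = x''} \exp\Big( \sum_{e' = \{a, b\} \in E(t) \setminus E_{\mathrm{virt}}} J_{e'} x_a x_b + \sum_{i=1}^{m} B_i x_{a_i} x_{b_i} \Big). \quad (8)$$
Denote $J_{e_i} = B_i$, for each $i = 1, \dots, m$. Then rewrite (8) as
$$Z_t^{x', x''} = \Big( \prod_{i=1}^{m} A_i \Big) \sum_{X \in \{-1, +1\}^{V(t)} :\ x_v = x',\ x_w = x''} \exp\Big( \sum_{e' = \{a, b\} \in E(t) \setminus \{e\}} J_{e'} x_a x_b \Big). \quad (9)$$
We compute (9) by brute force in $O(1)$ steps for fixed $c$, if $t$ is nonplanar. If $t$ is normal planar, we once again set $J_e = 0$ and consider a ZFI model with the probability $P_t$, defined over $t$, where the pairwise weights are $\{J_{e'}\}_{e' \in E(t)}$, and $Z'_t$ is the respective PF. Then applying machinery from Case 1, one derives
$$Z_t^{x', x''} = \Big( \prod_{i=1}^{m} A_i \Big) Z'_t \, P_t(s_v = x', s_w = x'')$$
in $O(|V(t)|^{3/2})$ steps.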
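3. $t$ is a root. Once again, let $t_1, \dots, t_m$ be children of $t$, $e_i = \{a_i, b_i\}$ be the virtual edge shared between $t$ and $t_i$, $i = 1, \dots, m$, and $E_{\mathrm{virt}} = \{e_1, \dots, e_m\}$ be the set of virtual edges in $t$ (which it shares only with its children). Let $Z_{t_i}^{x', x''}$ denote the already computed sum of Ising weights over configurations of $G_{t_i}$ with the spins at $a_i$, $b_i$ fixed to $x'$, $x''$. Using considerations similar to those described while deriving Eq. (7), one arrives at
$$Z = \sum_{X \in \{-1, +1\}^{V(t)}} \exp\Big( \sum_{e' = \{a, b\} \in E(t) \setminus E_{\mathrm{virt}}} J_{e'} x_a x_b \Big) \prod_{i=1}^{m} Z_{t_i}^{x_{a_i}, x_{b_i}}.$$
Finally, one computes $Z$ similarly to how the values $Z_t^{x', x''}$ were derived in Case 2. It takes $O(|E(t)|)$ steps if $t$ is a multiple bond. Otherwise, one constructs a ZFI model and finds the PF over the respective graphs in either $O(1)$ steps for fixed $c$, if the graph is nonplanar, or in $O(|V(t)|^{3/2})$ steps, if $t$ is normal planar.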
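The core of the dynamic programming step is that a processed child can be summarized by two numbers, $A$ and $B$, and then acts on its parent as an ordinary edge with interaction $B$ placed on the shared virtual edge. The sketch below illustrates this on a hypothetical 6-vertex graph whose separation pair is $\{0, 1\}$; the interaction values are arbitrary, and brute-force enumeration stands in for the planar routine of Section 4.

```python
import itertools
import math

def restricted_sum(vertices, J, fixed):
    """Sum of exp(sum_e J_e x_v x_w) over spins on `vertices`, some of them fixed."""
    free = [v for v in vertices if v not in fixed]
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, signs))
        total += math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items()))
    return total

# Hypothetical example: two K4-like components glued along the virtual edge {0, 1}.
child_J = {(0, 2): 0.3, (1, 2): -0.5, (0, 3): 0.8, (1, 3): 0.1, (2, 3): -0.4}
root_J = {(0, 4): 0.6, (1, 4): 0.2, (0, 5): -0.9, (1, 5): 0.7, (4, 5): 0.5}

# Child (leaf) processing: the two values Z^{+1,+1} and Z^{+1,-1}.
z_same = restricted_sum([0, 1, 2, 3], child_J, {0: 1, 1: 1})
z_diff = restricted_sum([0, 1, 2, 3], child_J, {0: 1, 1: -1})

# Fit Z^{x',x''} = A * exp(B * x' * x'').
A = math.sqrt(z_same * z_diff)
B = 0.5 * math.log(z_same / z_diff)

# Root processing: the virtual edge becomes a real edge with interaction B.
root_with_effective_edge = dict(root_J)
root_with_effective_edge[(0, 1)] = B
Z_dp = A * restricted_sum([0, 1, 4, 5], root_with_effective_edge, {})

# Reference value: brute force over the whole graph.
Z_brute = restricted_sum(range(6), {**child_J, **root_J}, {})
```

In this sketch `Z_dp` and `Z_brute` agree up to rounding, while the root computation only ever touches four spins.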
5.4 Sampling via Dynamic Programming
The sampling algorithm, detailed below, follows naturally from the inference routine. Compute triconnected components of $G$ in $O(N + |E|)$ steps. If all the triconnected components of $G$ are multiple bonds, $G$ should be a multiple bond itself, but $G$ is normal. Therefore, there exists a component that is not a multiple bond; choose it as a root of $T$.
Use the inference routine (described in the previous subsection) to compute $Z$. Now, do a backward pass through the tree, processing the root first, and then processing a node only when its parent has already been processed (Figure 4 visualizes the sampling algorithm).
Suppose $t$ is a root and it is being processed. Since $t$ is not a multiple bond, the inference stage has produced a ZFI model, $P_t$, over $t$. Draw a spin configuration from this model. It will take $O(1)$ steps for fixed $c$ if $t$ is nonplanar or $O(|V(t)|^{3/2})$ steps if $t$ is planar.
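Suppose $t$ is not a root. If $t$ is a multiple bond, spin values were already assigned to its vertices (contained within the parent node/graph). Otherwise, there exists a ZFI model $P_t$ already constructed at the inference stage. Following the notation of Subsection 5.3, where $e = \{v, w\}$ is the virtual edge shared with the parent and $e^* = g^{-1}(e)$, one has to sample from $P_t(S = X \mid s_v = x_v, s_w = x_w)$, since spins $s_v$ and $s_w$ are shared with the parent model and have already been drawn as $x_v$ and $x_w$, respectively. If $x_v = x_w$, all valid $X$ are such that $e^* \in M(X)$, and the task is reduced to sampling PMs on $G^*(V^* \setminus e^*)$, the graph $G^*$ with both endpoints of $e^*$ removed. Otherwise, all valid $X$ are such that $e^* \notin M(X)$. Denote $G^*_{\mathrm{out}} = (V^*, E^* \setminus \{e^*\})$ and notice that
$$P_t\big(M(S) = M' \mid e^* \notin M(S)\big) = \frac{1}{Z^*_{\mathrm{out}}} \prod_{e' \in M'} c_{e'}, \qquad M' \in \mathrm{PM}(G^*_{\mathrm{out}}),$$
where $Z^*_{\mathrm{out}}$ is the PF of the PM model over $G^*_{\mathrm{out}}$.
Therefore, the task is reduced to sampling PM over $G^*_{\mathrm{out}}$.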
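The backward pass can be illustrated on a two-node tree: draw the root spins from the root model (with the effective interaction on the virtual edge), then draw the remaining child spins conditioned on the two shared spins. The sketch below is an assumption-laden illustration: the 6-vertex graph and its interactions are hypothetical, and conditional draws are done by brute-force enumeration instead of PM sampling. The function `two_stage_probability` computes the probability that the two-stage procedure outputs a given configuration, so exactness can be checked without any randomness.

```python
import itertools
import math
import random

def weight(x, J):
    """Unnormalized Ising weight of configuration x (a dict vertex -> spin)."""
    return math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items()))

def configurations(free, fixed):
    """All assignments of the `free` vertices, merged with the `fixed` spins."""
    for signs in itertools.product((-1, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, signs))
        yield x

def sample(free, fixed, J, rng):
    """Draw spins on `free` given `fixed` by enumerating all candidates."""
    configs = list(configurations(free, fixed))
    return rng.choices(configs, weights=[weight(x, J) for x in configs])[0]

# Hypothetical example: separation pair {0, 1}, child on {0,1,2,3}, root on {0,1,4,5}.
child_J = {(0, 2): 0.3, (1, 2): -0.5, (0, 3): 0.8, (1, 3): 0.1, (2, 3): -0.4}
parent_J = {(0, 4): 0.6, (1, 4): 0.2, (0, 5): -0.9, (1, 5): 0.7, (4, 5): 0.5}

# Inference stage: child sums indexed by the product of the shared spins.
z = {s: sum(weight(x, child_J) for x in configurations([2, 3], {0: 1, 1: s}))
     for s in (1, -1)}
B = 0.5 * math.log(z[1] / z[-1])
root_J = {**parent_J, (0, 1): B}
Z_root = sum(weight(x, root_J) for x in configurations([0, 1, 4, 5], {}))

def two_stage_sample(rng):
    root = sample([0, 1, 4, 5], {}, root_J, rng)                     # root first
    child = sample([2, 3], {0: root[0], 1: root[1]}, child_J, rng)   # then the child
    return {**root, **child}

def two_stage_probability(x):
    """Probability that two_stage_sample returns the full configuration x."""
    p_root = weight(x, root_J) / Z_root
    p_child = weight(x, child_J) / z[x[0] * x[1]]
    return p_root * p_child

full_J = {**child_J, **parent_J}
all_configs = list(configurations(list(range(6)), {}))
Z = sum(weight(x, full_J) for x in all_configs)
```

For every configuration, `two_stage_probability(x)` equals `weight(x, full_J) / Z`, i.e. the two-stage draw follows distribution (1) on the whole graph.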
6 $K_{33}$-free Topology
6.1 ZFI Model over $K_{33}$-free Graphs
Consider the ZFI model (1) over a normal connected graph $G$. Let $H$ be some graph. Then, $H$ is a minor of $G$ if it is isomorphic to a subgraph of $G$ in which some edges are contracted. (See [Diestel, 2006], Chapter 1.7, for a formal definition.)
$G$ is $K_{33}$-free if $K_{33}$ is not a minor of $G$, that is, it cannot be derived from a subgraph of $G$ by contraction of some edges.
Let a biconnected $G$ be decomposed into the tree of triconnected components. Then, the following lemma holds:
Lemma 6.
[Hall, 1943] Graph $G$ is $K_{33}$-free if and only if its nonplanar triconnected components are exactly $K_5$.
Therefore, if $G$ is $K_{33}$-free, it satisfies all the conditions needed for efficient inference and sampling, described in Section 5. According to the lemma, the graph in Fig. 2 is $K_{33}$-free. The next statement expresses the main contribution of this manuscript.
Theorem 2.
If $G$ is $K_{33}$-free, inference or sampling of (1) takes $O(N^{3/2})$ steps.
We point out that the family of models for which the algorithm from Section 5 applies is broader than just $K_{33}$-free models. However, we focus on $K_{33}$-free graphs because they have a fortunate characterization in terms of a missing minor.
6.2 Discussion: Genus of $K_{33}$-free Graphs
A remarkable feature of $K_{33}$-free models is related to considerations addressing the graph's genus. The genus of a graph is the minimal genus (number of handles) of an orientable surface that the graph can be embedded into. Kasteleyn [Kasteleyn, 1963] conjectured that the complexity of evaluating the PF of a ZFI model embedded in a graph of genus $g$ is exponential in $g$. The result was proven and detailed in [Regge and Zecchina, 2000; Gallucio and Loebl, 1999; Cimasoni and Reshetikhin, 2007; Cimasoni and Reshetikhin, 2008]. One naturally asks what the genera of graphs are over which the ZFI models are tractable. The following statement relates biconnectivity and graph topology (genus):
Theorem 3.
[Battle et al., 1962] A graph's genus is the sum of the genera of its biconnected components.
If a graph is not biconnected, its genus can be arbitrarily large, while inference and sampling may still be tractable thanks to the decomposition technique discussed in Subsection 5.1. Therefore, it becomes principally interesting to construct tractable biconnected models with large genus.
Lemma 7.
A biconnected $K_{33}$-free graph of size $N$ can be of genus as big as $\Theta(N)$.
From this we conclude that $K_{33}$-free graphs cannot be tackled via the bounded-genus approach of [Regge and Zecchina, 2000; Gallucio and Loebl, 1999; Cimasoni and Reshetikhin, 2007; Cimasoni and Reshetikhin, 2008]. This justifies the novelty of our contribution.
7 Implementation and Tests
To test the correctness of inference, we generate random $K_{33}$-free models of a given size and then compare the value of PF computed in a brute force way (tractable for sufficiently small graphs) and by our algorithm. We simulate samples of sizes from ( samples per size) and verify that respective expressions coincide.
When testing the sampling implementation, we take for granted that the produced samples do not correlate, given that the sampling procedure (Section 5.4) accepts the Ising model as input and uses independent random number generation inside. The construction does not have any memory; therefore, it generates statistically independent samples. To test that the empirical distribution is approaching the theoretical one (in the limit of an infinite number of samples), we draw different numbers, , of samples from a model of size . Then we find the Kullback-Leibler divergence between the probability distribution of the model (here we use our inference algorithm to compute the normalization, $Z$) and the empirical probability, obtained from samples. Fig. 5 shows that the KL-divergence converges to zero as the sample size increases. Zero KL-divergence corresponds to equal distributions.
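The KL-divergence test can be mimicked on a model small enough to enumerate. The sketch below is an illustration only: the 4-spin model is hypothetical, exact sampling is done by enumeration rather than by the algorithm of Section 5.4, and the divergence is taken as KL(empirical || model), which is an assumption about the direction (this direction stays finite when some states are never sampled).

```python
import collections
import itertools
import math
import random

# Hypothetical 4-spin model with arbitrary interactions.
J = {(0, 1): 0.5, (1, 2): -0.3, (2, 3): 0.8, (0, 3): 0.2, (0, 2): -0.6}
states = list(itertools.product((-1, 1), repeat=4))
weights = [math.exp(sum(j * x[v] * x[w] for (v, w), j in J.items())) for x in states]
Z = sum(weights)
exact = {x: w / Z for x, w in zip(states, weights)}

def empirical_kl(num_samples, rng):
    """KL(empirical || exact) for `num_samples` independent draws from the model."""
    draws = rng.choices(states, weights=weights, k=num_samples)
    counts = collections.Counter(draws)
    return sum((c / num_samples) * math.log((c / num_samples) / exact[x])
               for x, c in counts.items())

rng = random.Random(0)
kl_by_size = {m: empirical_kl(m, rng) for m in (100, 1000, 10000, 100000)}
```

The entries of `kl_by_size` are nonnegative and shrink toward zero as the number of samples grows, which is the qualitative behaviour the test above looks for.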
Finally, we simulate inference and sampling for random models of different size $N$ and observe that the computational time (efforts) scales as $O(N^{3/2})$ (Fig. 6). An implementation of the algorithms is available at https://github.com/ValeryTyumen/planar_ising.
8 Conclusion
In this manuscript, we compiled results that were scattered over the literature on sampling and inference in the Ising model over planar graphs. To the best of our knowledge, we are the first to present a complete and mathematically accurate description of the tight asymptotic bounds.
We generalized the planar results to a new class of zero-field Ising models over graphs not containing $K_{33}$ as a minor. In this case, which is strictly more general than the planar case, we have shown that the complexity bounds for sampling and inference are the same as in the planar case. Along with the formal proof, we provided evidence of our algorithm's correctness and complexity through simulations.
Acknowledgements
This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory as part of LDRD and the DOE Grid Modernization Laboratory Consortium (GMLC). Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218CNA000001).
Appendix A Technical Proofs
Lemma 1 proof. Let $E' \in \mathrm{PM}(G^*)$. Call $e \in E$ saturated if it intersects an edge from $E' \cap E^*_I$. Each Fisher city is incident to an odd number of edges in $E' \cap E^*_I$. Thus, each face of $G$ has an even number of unsaturated edges. This property is preserved when two faces/cycles are merged into one by evaluating the respective symmetric difference. Therefore, one gets that any cycle in $G$ has an even number of unsaturated edges.
For each $i \in \{1, \dots, N\}$ define $x_i := (-1)^{r_i}$, where $r_i$ is the number of unsaturated edges on a path connecting $v_1$ and $v_i$. The definition is consistent due to the aforementioned cycle property. Now for each $e = \{v, w\} \in E$, $x_v = x_w$ if and only if $e$ is saturated. To conclude, we constructed $X \in C_+$ such that $E' = M(X)$. Such $X$ is unique, because the parity of unsaturated edges on a path between $v_1$ and $v_i$ uniquely determines the relationship between $x_1$ and $x_i$, and $x_1$ is always $+1$. ∎
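The construction in this proof is also how a spin configuration is recovered from a matching in linear time: walk the graph from $v_1$ and flip the sign each time an unsaturated edge is crossed. A minimal sketch follows, assuming the set of unsaturated primal edges has already been read off the matching; the function name and the $K_4$ test graph are hypothetical.

```python
import collections
import itertools

def recover_spins(n, edges, unsaturated):
    """Return X with x_0 = +1 such that exactly the `unsaturated` edges join unequal spins.

    Assumes the graph is connected and that every cycle contains an even
    number of unsaturated edges (the cycle property from the proof).
    """
    adjacency = collections.defaultdict(list)
    for v, w in edges:
        flip = (v, w) in unsaturated
        adjacency[v].append((w, flip))
        adjacency[w].append((v, flip))
    spins = {0: 1}
    queue = collections.deque([0])
    while queue:                      # breadth-first search, O(|V| + |E|)
        v = queue.popleft()
        for w, flip in adjacency[v]:
            if w not in spins:
                spins[w] = -spins[v] if flip else spins[v]
                queue.append(w)
    return tuple(spins[v] for v in range(n))

# Hypothetical test graph: K4.
edges = list(itertools.combinations(range(4), 2))
```

Running `recover_spins` on the disagreement set of any configuration with $x_1 = +1$ returns that configuration, matching the uniqueness claim of the lemma.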
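Lemma 2 proof. Let $E' \in \mathrm{PM}(G^*)$, $X' = M^{-1}(E') \in C_+$. The statement is justified by the following chain of transitions:
$$P(M(S) = E') = P(S = X') + P(S = -X') = 2 P(S = X') = \frac{2}{Z} \exp\Big( \sum_{e = \{v, w\} \in E} J_e x'_v x'_w \Big) = \frac{2}{Z} \exp\Big( -\sum_{e \in E} J_e \Big) \prod_{e^* \in E' \cap E^*_I} \exp\big( 2 J_{g(e^*)} \big) = \frac{2}{Z} \exp\Big( -\sum_{e \in E} J_e \Big) \prod_{e^* \in E'} c_{e^*}.$$
Summing over all $E' \in \mathrm{PM}(G^*)$ gives $Z^* = \frac{1}{2} Z \exp\big( \sum_{e \in E} J_e \big)$. ∎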
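Lemma 3 proof. Algorithm 1 reduces sampling on $G$ to a series of samplings on $G_1, \dots, G_h$.
Given the algorithm and the inference formula in Lemma 3, the statement is obvious for $h = 1$. Let $h = 2$. Let $v$ be the articulation point shared by $G_1$ and $G_2$. Denote $G_1 = (V_1, E_1)$, $G_2 = (V_2, E_2)$, and let $X_1$, $X_2$ be the restrictions of a configuration $X$ to $V_1$ and $V_2$, so that both contain $x_v$. Then one derives:
$$Z = \sum_{X} \exp\Big( \sum_{\{a, b\} \in E_1} J_{\{a,b\}} x_a x_b \Big) \exp\Big( \sum_{\{a, b\} \in E_2} J_{\{a,b\}} x_a x_b \Big) = \sum_{x_v = \pm 1} \frac{Z_1}{2} \cdot \frac{Z_2}{2} = \frac{Z_1 Z_2}{2},$$
where $Z_i$ is the PF of the ZFI model induced by $G_i$, and the sum over the spins of $G_i$ with $x_v$ fixed equals $Z_i / 2$ by the spin-flip symmetry. As far as sampling is concerned, denote by $P_i$ the probability distribution induced by the $i$-th ZFI model. Then, since $P_2(s_v = x_v) = \frac{1}{2}$:
$$P(S = X) = \frac{2}{Z_1 Z_2} \exp\Big( \sum_{\{a, b\} \in E_1} J_{\{a,b\}} x_a x_b \Big) \exp\Big( \sum_{\{a, b\} \in E_2} J_{\{a,b\}} x_a x_b \Big) = P_1(X_1) \cdot 2 P_2(X_2) = P_1(X_1) \, P_2(X_2 \mid s_v = x_v).$$
Assume that a method for sampling from $P_1$ and $P_2$ is available. Then, draw $X_1$ by sampling from $P_1$. To sample $X_2$ conditional on $x_v$ from $P_2(\cdot \mid s_v = x_v)$, draw $X'_2$ from $P_2$. If $x'_v = x_v$, then $X_2 = X'_2$, otherwise $X_2 = -X'_2$. This is consistent with Algorithm 1.
For graphs with $h > 2$ the statement of the lemma follows naturally by induction.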
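Theorem 2 proof. Since $G$ is normal and $K_{33}$ minor-free, it holds that $|E| = O(N)$ [Thomason, 2001]. Find all biconnected components and for each construct a triconnected component tree in $O(N + |E|) = O(N)$.
As described above, the time (number of steps) of inference or sampling is a sum of inference or sampling times of each triconnected component of $G$. Let the set of all of $G$'s triconnected components (that is, a union over all biconnected components) consist of $a$ planar triconnected components of size $N_1, \dots, N_a$ with $M_1, \dots, M_a$ edges respectively, $b$ multiple bonds of $m_1, \dots, m_b$ edges and $n$ $K_5$ graphs. Then the complexity of inference or sampling is $O\big( \sum_{i=1}^{a} N_i^{3/2} + \sum_{j=1}^{b} m_j + n \big)$.
The edges of $G$ are partitioned among biconnected components. Inside each biconnected component apply the second part of Lemma 4 to obtain that $\sum_{i=1}^{a} M_i + \sum_{j=1}^{b} m_j + 10 n \leq 3 |E| = O(N)$. This gives that $\sum_{i} M_i + \sum_{j} m_j = O(N)$ and $n = O(N)$. Since triconnected components are connected graphs, we get that $N_i \leq M_i + 1$ for all $i$ and hence $\sum_{i} N_i = O(N)$. From convexity of $x^{3/2}$ it follows that $\sum_{i} N_i^{3/2} \leq \big( \sum_{i} N_i \big)^{3/2} = O(N^{3/2})$ and finally that $O\big( \sum_{i} N_i^{3/2} + \sum_{j} m_j + n \big) = O(N^{3/2})$. ∎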
Lemma 7 proof. A simple example illustrates that the genus of a biconnected $K_{33}$-free graph can grow linearly with its size. First, notice that $K_5$ is a nonplanar graph, but it can be embedded in a torus (Fig. 7), so its genus is one. Consider a cycle of length , and enumerate its edges in the order of cycle traversal from to . Attach a $K_5$ graph to each odd edge of the cycle (see Fig. 7). The resulting graph is of size ; it is biconnected and $K_{33}$-free (see Fig. 7). Remove an arbitrary even edge from the cycle. This results in a graph whose biconnected components are $K_5$ graphs and edges, so its genus is . Since edge removal cannot increase the genus, we conclude that ’s genus is at least .
Appendix B Counting PMs of Planar in time
This section addresses the inference part of Theorem 1.
B.1 Pfaffian Orientation
Let be an oriented graph. Its cycle of even length (built on an even number of vertices) is said to be odd-oriented, if, when all edges along the cycle are traversed in any direction, an odd number of edges are directed along the traversal. An orientation of is called Pfaffian, if all cycles , such that , are odd-oriented.
We will need to admit a Pfaffian orientation; moreover, such an orientation is easy to construct.
Theorem 4.
A Pfaffian orientation of can be constructed in .
Proof.
This theorem is proven constructively; see, e.g., [Wilson 1997, Vazirani 1989] or [Schraudolph and Kamenetsky 2009], where the latter construction is based on the specifics of the expanded dual graph. ∎
Construct a skew-symmetric sparse matrix ( denotes orientation of edges):
[TABLE]
The next result allows one to compute the PF of the PM model on in polynomial time.
Theorem 5.
, .
Proof.
See, e.g., [Wilson 1997] or [Kasteleyn 1963]. ∎
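The counting idea behind this subsection can be checked on a toy instance. The sketch below is an illustration only, not the paper's implementation: the 2x3 grid graph, the edge weights 1..7, the alternating orientation of vertical edges, and all helper names (`det`, `idx`, `brute_force_pf`) are assumptions introduced here. It builds the skew-symmetric matrix of a Pfaffian-oriented planar graph and compares its determinant with the square of the brute-force PF of the PM model, using the classical identity that the determinant of a skew-symmetric matrix is the square of its Pfaffian.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant via Gaussian elimination with row pivoting."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if a[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            sign = -sign
        result *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return sign * result

ROWS, COLS = 2, 3  # hypothetical toy graph: a 2x3 grid
N = ROWS * COLS

def idx(r, c):
    return r * COLS + c

# Oriented edges (tail, head) -> positive weight. Horizontal edges point right;
# vertical edges alternate direction by column, so that every square face has
# an odd number of edges along either traversal direction.
oriented = {}
weight = 1
for r in range(ROWS):
    for c in range(COLS - 1):
        oriented[(idx(r, c), idx(r, c + 1))] = Fraction(weight)
        weight += 1
for c in range(COLS):
    top, bottom = idx(0, c), idx(1, c)
    edge = (top, bottom) if c % 2 == 0 else (bottom, top)
    oriented[edge] = Fraction(weight)
    weight += 1

# Skew-symmetric matrix: +w along the orientation, -w against it.
K = [[Fraction(0)] * N for _ in range(N)]
for (tail, head), w in oriented.items():
    K[tail][head] = w
    K[head][tail] = -w

def brute_force_pf(free):
    """Sum over perfect matchings of the product of edge weights."""
    if not free:
        return Fraction(1)
    u = min(free)
    total = Fraction(0)
    for v in free:
        w = abs(K[u][v])
        if w:
            total += w * brute_force_pf(free - {u, v})
    return total

Z = brute_force_pf(frozenset(range(N)))
print(Z, det(K))  # prints: 271 73441
```

The brute-force sum is exponential in general and serves only as a reference value; the determinant is the quantity that the rest of this appendix computes efficiently.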
B.2 Computing
An LU-decomposition of a matrix , found via Gaussian elimination, where is a lower-triangular matrix with unit diagonal and is an upper-triangular matrix, would be a standard way of computing , which is then equal to the product of the diagonal elements of . However, this standard construction of the LU-decomposition applies only if all ’s leading principal submatrices are nonsingular (see, e.g., [Horn and Johnson 2012], Section 3.5, for a detailed discussion). And already the first, , leading principal submatrix of is zero and hence singular.
Luckily, this difficulty can be resolved through the following construction. Take ’s arbitrary perfect matching . In the case of a general planar graph can be found via, e.g., Blum’s algorithm [Blum 1990] in time, while for graphs and appearing in this paper can be found in from a spin configuration using the mapping (e.g. ). Modify the ordering of vertices, , so that . Build according to the definition (10). Obtain from by swapping column with column , with , and so on. This results in , where the new is properly conditioned.
Lemma 8.
’s leading principal submatrices are nonsingular.
Proof.
The proof, presented in [Wilson 1997] for the case of unit weights , generalizes to arbitrary positive . ∎
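A minimal sketch of the column-swap construction on a toy example follows. The Pfaffian-oriented 4-cycle, the weights `a, b, c, d`, and the helper names `lu_no_pivot` and `swap_matched_columns` are assumptions made for illustration. The vertices are already ordered so that the perfect matching pairs vertices (0, 1) and (2, 3). LU-decomposition without pivoting fails on the skew-symmetric matrix because its first pivot is zero, and succeeds once the columns of matched vertices are swapped.

```python
from fractions import Fraction

def lu_no_pivot(matrix):
    """Doolittle LU without pivoting; raises ZeroDivisionError on a zero pivot."""
    n = len(matrix)
    u = [[Fraction(x) for x in row] for row in matrix]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if u[k][k] == 0:
            raise ZeroDivisionError(f"zero pivot at step {k}")
        for r in range(k + 1, n):
            l[r][k] = u[r][k] / u[k][k]
            for c in range(k, n):
                u[r][c] -= l[r][k] * u[k][c]
    return l, u

def swap_matched_columns(matrix):
    """Swap columns (0, 1), (2, 3), ...: the columns of matched vertex pairs."""
    swapped = [row[:] for row in matrix]
    for row in swapped:
        for j in range(0, len(row), 2):
            row[j], row[j + 1] = row[j + 1], row[j]
    return swapped

# Hypothetical 4-cycle 0-1-2-3-0 oriented 0->1, 1->2, 2->3, 0->3.
a, b, c, d = (Fraction(x) for x in (1, 2, 3, 4))
K = [[0, a, 0, d],
     [-a, 0, b, 0],
     [0, -b, 0, c],
     [-d, 0, -c, 0]]

try:
    lu_no_pivot(K)
    plain_lu_works = True
except ZeroDivisionError:
    plain_lu_works = False

K_hat = swap_matched_columns(K)
L, U = lu_no_pivot(K_hat)
det_from_lu = Fraction(1)
for i in range(len(U)):
    det_from_lu *= U[i][i]

print(plain_lu_works, det_from_lu)  # prints: False 121
```

Here 121 is the square of a*c + b*d = 11, the total weight of the two perfect matchings of the cycle. In general the column swaps change the determinant only by a sign, so its absolute value is unaffected.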
Notice that in the general case (of a matrix represented in terms of a general graph) the complexity of the LU-decomposition is cubic in the size of the matrix. Fortunately, the nested dissection technique, discussed in the following subsection, reduces the complexity of computing to .
B.3 Nested Dissection
The partition of the set is a separation of , if for any it holds that . We refer to as the parts, and to as the separator.
Lipton and Tarjan (LT) [Lipton and Tarjan 1979] gave an algorithm that finds a separation such that and . The LT algorithm can be used to construct the so-called nested dissection ordering of . The ordering is built recursively, by first placing vertices of , then and , and finally permuting indices of and recursively according to the ordering of and (see [Lipton et al. 1979] for an accurate description of details, definitions and analysis of the nested dissection ordering). As shown in [Lipton et al. 1979], the complexity of finding the nested dissection ordering is .
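To make the recursive structure of a nested dissection ordering concrete, here is a small sketch for a rectangular grid graph. It does not implement the LT separator algorithm: as a stand-in, the separator of each block is simply its middle row or column, which separates a grid. The function name and the convention of numbering the two parts first and the separator last (the usual convention for nested dissection) are assumptions of this sketch.

```python
def grid_nested_dissection(r0, r1, c0, c1):
    """Order the vertices of the grid block [r0, r1) x [c0, c1).

    The two parts are ordered recursively first; the separator comes last.
    """
    height, width = r1 - r0, c1 - c0
    if height <= 0 or width <= 0:
        return []
    if height * width <= 2:  # tiny block: any order will do
        return [(r, c) for r in range(r0, r1) for c in range(c0, c1)]
    if width >= height:  # separate by the middle column
        m = (c0 + c1) // 2
        separator = [(r, m) for r in range(r0, r1)]
        return (grid_nested_dissection(r0, r1, c0, m)
                + grid_nested_dissection(r0, r1, m + 1, c1)
                + separator)
    m = (r0 + r1) // 2  # otherwise separate by the middle row
    separator = [(m, c) for c in range(c0, c1)]
    return (grid_nested_dissection(r0, m, c0, c1)
            + grid_nested_dissection(m + 1, r1, c0, c1)
            + separator)

order = grid_nested_dissection(0, 5, 0, 5)
print(order[-5:])  # the top-level separator: the middle column of the 5x5 grid
```

Placing separators last confines the fill-in of Gaussian elimination to blocks indexed by separators, which is the source of the complexity reduction.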
Let be a matrix with a sparsity pattern of . That is, can be nonzero only if or .
Theorem 6.
[Lipton et al. 1979] If is ordered according to the nested dissection and ’s leading principal submatrices are nonsingular, computing the LU-decomposition of becomes a problem of the complexity.
Notice, however, that we cannot directly apply Theorem 6 to , because the sparsity pattern of is asymmetric and does not correspond, in general, to any graph.
Let be a planar graph, obtained from , by contracting each edge in , . Find and fix a nested dissection ordering over (it takes steps) and let the enumeration of correspond to this ordering. Split into cells and consider the sparsity pattern of the nonzero cells. One observes that the resulting sparsity pattern coincides with the sparsity patterns of and . Since LU-decomposition can be stated in the block elimination form, its complexity is reduced down to .
This concludes the construction of an efficient inference (counting) algorithm for the planar PM model.
Appendix C Sampling PMs of Planar in time (Wilson’s Algorithm)
This section addresses the sampling part of Theorem 1. In this section we assume that degrees of ’s vertices are upper-bounded by . This is true for , and , the only PM models appearing in the paper. Any other constant substituting would not affect the complexity analysis. Moreover, Wilson [Wilson 1997] shows that any PM model on a planar graph can be reduced to a bounded-degree planar model without affecting complexity.
C.1 Structure of the Algorithm
Denote a sampled PM as , . Wilson’s algorithm first applies the LT algorithm of [Lipton and Tarjan 1979] to find a separation of (, ). Then it iterates over and for each it draws an edge of , saturating . It then turns out that, given this intermediate result, drawing the remaining edges of may be split into two independent drawings over and , respectively, and then the process is repeated recursively.
It takes steps to sample edges attached to at the first step of the recursion, therefore the overall complexity of Wilson’s algorithm is also .
Subsection C.2 introduces probabilities required to draw the aforementioned PM samples. Subsections C.3 and C.4 describe how to sample edges attached to the separator, while Subsection C.5 focuses on describing the recursion.
C.2 Drawing Perfect Matchings
For some consider the probability of getting as a subset of :
[TABLE]
Let and . Then the set coincides with . This yields the following expression
[TABLE]
where
[TABLE]
is a PF of the PM model on induced by the edge weights .
For a square matrix let denote the matrix obtained by deleting rows and columns from . Let be obtained by leaving only rows and columns of and placing them in this order.
Now let . A simple check demonstrates that deleting a vertex from a graph preserves the Pfaffian orientation. By induction this holds for any number of vertices deleted. From that it follows that is a Kasteleyn matrix for and then
[TABLE]
resulting in
[TABLE]
Linear algebra transformations, described in [Wilson 1997], suggest that if is non-singular, then
[TABLE]
This observation allows us to express probability (11) as
[TABLE]
Now we are in a position to describe the first step of Wilson’s recursion.
C.3 Step 1: Computing Lower-Right Submatrix of
Find a separation of . The goal is to sample an edge from every .
Let be a set of vertices from and their neighbors, then because each vertex in is of degree at most . Let be a set of the contracted edges (recall definition from Subsection B.3), containing at least one vertex from , . Then is a separator of such that
[TABLE]
where one uses that, . Find a nested dissection ordering (Subsection B.3) of with as a top-level separator. This is a correct nested dissection due to Eq. (12).
Utilizing this ordering, construct . Compute and - LU-decomposition of ( time). Let and let be a shorthand notation for . Using and , find , which is a lower-right ’s submatrix of size .
It is straightforward to observe that the -th column of , , satisfies
[TABLE]
where is a zero vector with unity at the -th position. Therefore constructing is reduced to solving triangular systems, each of size , resulting in required steps.
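The reduction to triangular systems can be sketched as follows. The code is illustrative only: the 4x4 tridiagonal test matrix and all helper names are assumptions. It relies on a standard linear-algebra fact: when the right-hand side is a unit vector supported on the last s coordinates, forward substitution leaves all earlier coordinates zero, so the lower-right s-by-s block of the inverse depends only on the trailing s-by-s blocks of the two triangular factors.

```python
from fractions import Fraction

def lu_no_pivot(matrix):
    """Doolittle LU without pivoting (assumes nonsingular leading minors)."""
    n = len(matrix)
    u = [[Fraction(x) for x in row] for row in matrix]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        for r in range(k + 1, n):
            l[r][k] = u[r][k] / u[k][k]
            for c in range(k, n):
                u[r][c] -= l[r][k] * u[k][c]
    return l, u

def solve_lower_unit(l, b):
    """Forward substitution for a unit lower-triangular system l y = b."""
    y = []
    for i in range(len(b)):
        y.append(b[i] - sum(l[i][j] * y[j] for j in range(i)))
    return y

def solve_upper(u, y):
    """Back substitution for an upper-triangular system u x = y."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        tail = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / u[i][i]
    return x

def lower_right_inverse(l, u, s):
    """Lower-right s x s block of (l u)^{-1}, using only trailing blocks."""
    t = len(l) - s
    l22 = [row[t:] for row in l[t:]]
    u22 = [row[t:] for row in u[t:]]
    columns = []
    for i in range(s):
        e_i = [Fraction(int(j == i)) for j in range(s)]
        columns.append(solve_upper(u22, solve_lower_unit(l22, e_i)))
    return [[columns[j][i] for j in range(s)] for i in range(s)]

A = [[2, 1, 0, 0],
     [1, 3, 1, 0],
     [0, 1, 4, 1],
     [0, 0, 1, 5]]
L, U = lu_no_pivot(A)
full = lower_right_inverse(L, U, 4)   # s = n gives the whole inverse
block = lower_right_inverse(L, U, 2)  # only the trailing 2 x 2 systems
print(block == [row[2:] for row in full[2:]])  # prints: True
```

Each column costs one forward and one backward substitution on systems of the block size, which matches the count of triangular solves in the text above.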
C.4 Step 2: Sampling Edges in the Separator
Now, progressing iteratively, one finds which is not yet paired and draws an edge emanating from it. Suppose that the edges, , are already sampled. We assume that by this point we have also computed the LU-decomposition and that we will update it to when the new edge is drawn. Then
[TABLE]
Next we choose so that is not saturated yet. We iterate over ’s neighbors considered as candidates for becoming . Let be the next candidate and denote . For let if is odd and if is even. Then the identity
[TABLE]
follows from the definition of . One deduces from Eq. (14)
[TABLE]
Constructing one has . It means that is a submatrix of with permuted rows and columns, hence is known.
We further observe that
[TABLE]
Therefore, to update and , one just solves the triangular systems of equations and , where are of size (this is done in steps), then computes , which is of size , and then sets .
The probability to pair and is
[TABLE]
Therefore maintaining allows us to compute the required probability and draw a new edge from . By construction of , has only neighbors, therefore the complexity of this step is because .
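The principle of drawing one edge at a time from conditional probabilities can be illustrated with a deliberately naive sketch. It replaces the LU-based probability computation of this appendix by brute-force partition functions, so it is exponential and only suitable for tiny graphs. The 2x3 grid, its weights, and the function names are assumptions of the sketch. An unsaturated vertex u is paired with a neighbor v with probability equal to the edge weight times the PF of the graph without u and v, divided by the PF of the current graph.

```python
import random
from fractions import Fraction

def partition_function(free, weights):
    """Brute-force PF of the PM model on the subgraph induced by `free`."""
    if not free:
        return Fraction(1)
    u = min(free)
    total = Fraction(0)
    for v in free:
        w = weights.get(frozenset((u, v)))
        if w is not None:
            total += w * partition_function(free - {u, v}, weights)
    return total

def edge_probabilities(u, free, weights):
    """Conditional probability that u is paired with each remaining neighbor."""
    z = partition_function(free, weights)
    return {
        v: weights[frozenset((u, v))]
           * partition_function(free - {u, v}, weights) / z
        for v in free if frozenset((u, v)) in weights
    }

def sample_matching(n, weights, rng):
    """Draw a perfect matching edge by edge (assumes one exists)."""
    free, matching = frozenset(range(n)), []
    while free:
        u = min(free)
        probs = edge_probabilities(u, free, weights)
        candidates = [v for v, p in probs.items() if p > 0]
        v = rng.choices(candidates,
                        weights=[float(probs[x]) for x in candidates])[0]
        matching.append((u, v))
        free = free - {u, v}
    return matching

# Hypothetical 2x3 grid, vertices 0..5 numbered row by row.
weights = {frozenset(e): Fraction(w) for e, w in [
    ((0, 1), 1), ((1, 2), 2), ((3, 4), 3), ((4, 5), 4),
    ((0, 3), 5), ((1, 4), 6), ((2, 5), 7)]}

probs = edge_probabilities(0, frozenset(range(6)), weights)
matching = sample_matching(6, weights, random.Random(0))
print(probs[1], probs[3])  # prints: 21/271 250/271
```

The product of the conditional probabilities along any run equals the weight of the drawn matching divided by the PF, so the sketch samples from the PM model exactly; the appendix obtains the same probabilities from maintained LU factors instead of recomputation.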
C.5 Step 3: Recursion
Let be a set of edges drawn on the previous step, and be a set of vertices saturated by , . Given , the task of sampling such that is reduced to sampling perfect matchings and over and , respectively. Then becomes the result of the perfect matching drawn from (2).
Even though only the first step of Wilson’s recursion was discussed so far, any further step in the recursion is done in exactly the same way with the only exception that vertex degrees may become less than , while in they are exactly . Obviously, this does not change the iterative procedure and it also does not affect the complexity analysis.
Appendix D Random Graph Generation
As our derivations cover the most general case of planar and $K_{33}$-free graphs, we want to test them on graphs that are as general as possible. Based on Lemma 6 (notice that it provides necessary and sufficient conditions for a graph to be $K_{33}$-free), we implement a randomized construction of $K_{33}$-free graphs, which is assumed to cover the most general $K_{33}$-free topologies.
Namely, one generates a set of $K_5$’s and random planar graphs, attaching them by edges to a tree-like structure. For simplicity, we slightly relax the condition that random planar components should be triconnected (because it is not clear how to generate such graphs efficiently) and simply require the components to be biconnected. This can be interpreted as constructing , where some neighboring planar nodes are merged (merging planar graphs results in another planar graph). We refer to such a non-unique decomposition as partially merged. The inference and sampling algorithm suggested in Section 5 is applied with no changes to the partially merged decomposition. Our generation process consists of the following two steps.
1. Planar graph generation. This step accepts as an input and generates a normal biconnected planar graph of size along with its embedding on a plane. The details of the construction are as follows.
First, a random embedded tree is drawn iteratively. We start with a single vertex; on each iteration we choose a random vertex of the already “grown” tree and add a new vertex connected only to the chosen vertex. Items I-V in Fig. 8 illustrate this step.
Then we triangulate this tree by adding edges until the graph becomes biconnected and all faces are triangles, as in Subsection 4.1 (VI in Fig. 8). Next, to get a normal graph, we remove multiple edges possibly produced by triangulation (VII in Fig. 8). At this point the generation process is complete.
2. $K_{33}$-free graph generation. Here we take as the input and generate a normal biconnected $K_{33}$-free graph in the form of its partially merged decomposition . Namely, we generate a tree of graphs where each node is either a normal biconnected planar graph or $K_5$, and every two adjacent graphs share a virtual edge.
The construction is greedy and is essentially the tree generation process from Step 1. We start with root and then iteratively create and attach new nodes. Let be the size of the already generated graph, at first. Notice that when a node of size is generated, it contributes new vertices to .
An elementary step of the iteration is as follows. If , a coin is flipped and the type of the new node is chosen: $K_5$ or planar. If , $K_5$ cannot be added, so the planar type is chosen. If a planar node is added, its size is drawn uniformly in the range between and and then the graph itself is drawn as described in Step 1. Then we attach the new node to a randomly chosen free edge of a randomly chosen node of . We repeat this process until is of the desired size . Fig. 9 illustrates the algorithm.
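The tree-growing loop shared by both steps can be sketched in a few lines. This is a simplified illustration: it assumes the attachment vertex is chosen uniformly at random, it ignores the planar embedding that Step 1 maintains, and the function name `grow_random_tree` is hypothetical.

```python
import random

def grow_random_tree(size, rng):
    """Return the edge list of a tree on vertices 0..size-1.

    Vertex 0 is the initial single vertex; every later vertex is attached
    to one uniformly chosen vertex of the already grown tree.
    """
    edges = []
    for new_vertex in range(1, size):
        parent = rng.randrange(new_vertex)  # any already grown vertex
        edges.append((parent, new_vertex))
    return edges

edges = grow_random_tree(10, random.Random(1))
print(len(edges))  # prints: 9
```

In Step 2 the same loop would attach whole graphs rather than single vertices, each new node sharing a virtual edge with the node it is attached to.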
To obtain an Ising model from , we sample pairwise interactions for each edge of independently from .
Notice that the tractable Ising model generation procedure is designed in this section solely for the convenience of testing and it is not claimed to be sampling models of any particular practical interest (e.g. in statistical physics or computer science).
Appendix E Future Work
We conclude by discussing some future research directions:
- The class of models considered in the manuscript can be extended even further towards -free generalizations of (a) the so-called outerplanar graphs, which can then be used for approximate inference and efficient learning in the spirit of [Globerson and Jaakkola 2007] and [Johnson et al. 2016], respectively; and (b) graphs embedded in the surfaces of genus [Regge and Zecchina 2000, Gallucio and Loebl 1999, Cimasoni and Reshetikhin 2007, Cimasoni and Reshetikhin 2008].
- This manuscript was motivated by a larger task of using efficient inference and learning over the most general -graphs for constructing more general (and thus, hopefully, more powerful) alternatives to traditional Neural Networks for efficient learning.
References
- Barahona, F. (1982). On the computational complexity of Ising spin glass models. Journal of Physics A: Mathematical and General 15 (10), 3241.
- Battle, J., F. Harary, and Y. Kodama (1962, 11). Additivity of the genus of a graph. Bull. Amer. Math. Soc. 68 (6), 565–568.
- Bieche, L., J. P. Uhry, R. Maynard, and R. Rammal (1980). On the ground states of the frustration model of a spin glass by a matching method of graph theory. Journal of Physics A: Mathematical and General 13 (8), 2553.
- Blum, N. (1990). A new approach to maximum matching in general graphs. In M. S. Paterson (Ed.), Automata, Languages and Programming, Berlin, Heidelberg, pp. 586–597. Springer Berlin Heidelberg.
- Boyer, J. M. and W. J. Myrvold (2004). On the cutting edge: Simplified O(n) planarity by edge addition. J. Graph Algorithms Appl. 8 (2), 241–273.
- Cimasoni, D. and N. Reshetikhin (2007, Oct). Dimers on surface graphs and spin structures. I. Communications in Mathematical Physics 275 (1), 187–208.
- Cimasoni, D. and N. Reshetikhin (2008, Apr). Dimers on surface graphs and spin structures. II. Communications in Mathematical Physics 281 (2), 445.
- Diestel, R. (2006). Graph Theory. Electronic library of mathematics. Springer.
