A theory of non-equilibrium local search on random satisfaction problems

Erik Aurell; Eduardo Domínguez; David Machado; R. Mulet

arXiv:1903.01510·cond-mat.dis-nn·December 11, 2019


TL;DR

This paper develops a new theoretical framework using cavity master equations to predict the dynamics of non-equilibrium local search algorithms solving random k-satisfiability problems, outperforming traditional equilibrium-based models.

Contribution

It introduces a systematic non-equilibrium theory for local search algorithms on satisfiability problems, validated on random 3-satisfiability instances.

Findings

01 Accurately predicts the solution process away from phase boundaries.
02 Qualitatively captures the location of the algorithm phase boundary.
03 Outperforms equilibrium Gibbs state predictions in non-equilibrium regimes.



Equations

$$\dot{p}(\sigma_{a\setminus i}\mid\sigma_{i}) = -\sum_{j\in a\setminus i}\;\sum_{\{\sigma_{b\setminus j}\}_{b\in\partial j\setminus a}} r_{j}(+)\Big[\prod_{b\in\partial j\setminus a} p(\sigma_{b\setminus j}\mid\sigma_{j})\Big]\,p(\sigma_{a\setminus i}\mid\sigma_{i}) + \sum_{j\in a\setminus i}\;\sum_{\{\sigma_{b\setminus j}\}_{b\in\partial j\setminus a}} r_{j}(-)\Big[\prod_{b\in\partial j\setminus a} p(\sigma_{b\setminus j}\mid-\sigma_{j})\Big]\,p(F_{j}[\sigma_{a\setminus i}]\mid\sigma_{i})$$

$$r_{i}=\frac{E_{i}(\sigma_{i},\sigma_{\partial i})}{KE}\,\min\left[e^{-\beta\Delta E(\sigma_{i},\sigma_{\partial i})},1\right]$$

$$E_{a}=\frac{1}{2^{K}}\prod_{i\in a}\left(1-l_{i}^{a}\sigma_{i}\right)$$


Full text

A theory of non-equilibrium local search on random satisfaction problems

Erik Aurell

Department of Computational Science and Technology, AlbaNova University Center, SE-106 91 Stockholm, Sweden


Eduardo Domínguez

Group of Complex Systems and Statistical Physics. Department of Theoretical Physics, Physics Faculty, University of Havana, Cuba

David Machado

Group of Complex Systems and Statistical Physics. Department of Theoretical Physics, Physics Faculty, University of Havana, Cuba

Roberto Mulet

Group of Complex Systems and Statistical Physics. Department of Theoretical Physics, Physics Faculty, University of Havana, Cuba


Abstract

We study local search algorithms to solve instances of the random $k$-satisfiability problem, equivalent to finding (if they exist) zero-energy ground states of statistical models with disorder on random hypergraphs. It is well known that the best such algorithms are akin to non-equilibrium processes in a high-dimensional space. In particular, algorithms known as focused, and which do not obey detailed balance, outperform simulated annealing and related methods in the task of finding the solution to a complex satisfiability problem, that is to find (exactly or approximately) the minimum in a complex energy landscape. A physical question of interest is if the dynamics of these processes can be well predicted by the well-developed theory of equilibrium Gibbs states. While it has been known empirically for some time that this is not the case, an alternative systematic theory that does so has been lacking. In this paper we introduce such a theory based on the recently developed technique of cavity master equations and test it on the paradigmatic random $3$-satisfiability problem. Our theory predicts the solution process very accurately away from the algorithm phase boundary and also predicts the qualitative form of this boundary.

I Introduction

Combinatorial optimization problems are of great importance in many industrial and engineering fields, and are also central to computational complexity Garey and Johnson (1979). They are equivalent to the physical problem of finding ground states in statistical mechanics models with disorder, an analogy which has generated a large literature Mézard and Montanari (2009). Constraint satisfaction is the subset of such problems where the energy function is non-negative, and the problem is to find a zero-energy ground state (if any exists). Many problems in combinatorial optimization are known to be worst-case computationally intractable, given “$P\neq NP$”.

The typical or average-case behavior is however qualitatively different. It was found a quarter of a century ago that the empirical run-time on random instances of one of the most famous NP-complete problems, the Boolean 3-satisfiability problem (3-SAT), varies greatly Mitchell et al. (1992); Monasson et al. (1999). Very under-constrained problems are in practice easy for almost any solution procedure. This does not mean that the problem is not worst-case hard; it only means that hard instances are hard to find among problems that are overall under-constrained. The most difficult region is for problems that are on the verge of being unsatisfiable, which for $3$-satisfiability means a ratio of clauses to variables ($M/N$) of about $4.27$. Very over-constrained problems are again easy for complete algorithms, but this aspect will not be discussed further here.

In the run-up from under-constrained to critical $3$-SAT problems, different algorithms can be characterized, rigorously or empirically, by where they fail to work. We here take “work” to mean “find a solution in time scaling polynomially in system size”, but keep it unspecified whether this has to happen always or with high probability, and leave aside rigorous considerations, for which we refer to Achlioptas and Coja-Oghlan (2008); Coja-Oghlan and Panagiotou (2016), and references cited therein. According to this criterion the best algorithm for random $K$-SAT is “survey propagation” Mézard et al. (2002), which in its most recent version is able to find solutions extremely close to the SAT/UNSAT threshold Marino et al. (2016). Survey propagation is however a quite complex algorithm tailored to random constraint satisfaction problems, and is not competitive on most real-world problems Kautz and Selman (2007). It is therefore of interest to step back and consider other simpler and more general solution procedures, of which the first example is “simulated annealing” Kirkpatrick et al. (1983), a work-horse of scientific computing. The performance of simulated annealing at slow enough cooling rate can be analyzed by spin glass techniques Krzakala et al. (2007) and is known to fail at some distance from the SAT/UNSAT threshold. This can be taken to reinforce the (equilibrium) statistical mechanics view of random satisfiability problems.

The best local search algorithms that have been invented for satisfiability are however not processes in detailed balance, and hence fall outside the paradigm of equilibrium statistical physics. They all rely on “focusing”, meaning that only variables that participate in some unsatisfied clause are considered for update. A focused algorithm hence obeys the dictum “if it works, don’t fix it”. It is obvious that focusing breaks detailed balance as it leaves the set of solutions (zero-energy states) invariant. In other words, if the problem has a solution, then the focused algorithm has an absorbing set. For constraint satisfiability, the most well-known algorithm in this class is “walksat” Selman et al. (1996), which is competitive on many real-world problems Kautz and Selman (2007); Barthel et al. (2003). Moreover, with parameter tuning it works on random $3$-SAT up to clause density about $4.2$ Aurell et al. (2005). Several other local search procedures have been shown to work up to a similar threshold Schöning (1999); Aurell et al. (2005); Seitz et al. (2005); Ardelius and Aurell (2006); Alava et al. (2008); Kroc et al. (2010); Lemoy et al. (2015). We will here consider Focused Metropolis Search (FMS) Seitz et al. (2005); Alava et al. (2008). This algorithm can be described very simply as first making a focusing step and then a standard Metropolis step, as in simulated annealing at one temperature. For the best choice of temperature FMS has been empirically shown to work up to clause density about $4.23$ Seitz et al. (2005).

However, the understanding of non-equilibrium local search has been hampered by the absence of theory. While it has been empirically clear that predictions of equilibrium-derived theories do not apply, it has been unclear what to use in their stead. The goal of this paper is to provide such a theory. Previous attempts rest on average rate equations Barthel et al. (2003); Semerjian and Weigt (2004) that must be built case by case. Our theory gives quantitatively excellent results on the development of FMS away from the SAT/UNSAT boundary, and qualitatively correct predictions on how that boundary depends on clause density and algorithm parameters. The crucial ingredient of this theory is the newly developed cavity method for continuous-time processes Aurell et al. (2017).

II Cavity Master Equation applied to random 3-SAT

The Cavity Master Equation (CME) is a closure of the dynamic cavity equations. Dynamic cavity starts from the joint probability distribution of all histories of a set of dynamic variables interacting in a locally tree-like (locally loop-free) graph. It is then possible to write a self-consistent equation for the probabilities of the history of single variables when the history of one of their neighboring variables is held fixed; one says that the first variable is in the cavity of the second variable. These dynamic cavity equations are formally Belief Propagation updates. As they stand, they are however of little practical value since the variable (the history of one dynamic variable) is very high-dimensional. For dynamics in discrete time with synchronous updates, closure assumptions have been explored for some time Neri and Bollé (2009); Kanoria et al. (2011); Del Ferraro and Aurell (2015).

The Cavity Master Equation is appropriate for dynamics of discrete variables in continuous time. In satisfiability problems these variables naturally take values $1$ (true) or $-1$ (false), which we here call spins. The Cavity Master Equation takes as input the jump rates $r_{i}$ (for spin $i$) defining the dynamics, and is, for spins interacting in groups labeled by $a,b,c,\ldots$ (constraints, clauses), formulated in terms of quantities $p_{a\to i}(\sigma_{a\setminus i}|\sigma_{i})$, where $\sigma_{a\setminus i}$ are the current values in group $a$ except $i$ Aurell et al. (2018). These quantities should be considered closures imposed on the corresponding full cavity quantities $\mu_{a\to i}(X_{a\setminus i}|X_{i})$, where $X_{a\setminus i}$ is the whole history of all the spins in group $a$ except $i$ in the cavity of $i$, and $X_{i}$ the cavity history. In practice, to describe FMS on random $K$-SAT we then have to solve the following set of coupled differential equations:

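$$\dot{p}(\sigma_{a\setminus i}\mid\sigma_{i}) = -\sum_{j\in a\setminus i}\;\sum_{\{\sigma_{b\setminus j}\}_{b\in\partial j\setminus a}} r_{j}(+)\Big[\prod_{b\in\partial j\setminus a} p(\sigma_{b\setminus j}\mid\sigma_{j})\Big]\,p(\sigma_{a\setminus i}\mid\sigma_{i}) + \sum_{j\in a\setminus i}\;\sum_{\{\sigma_{b\setminus j}\}_{b\in\partial j\setminus a}} r_{j}(-)\Big[\prod_{b\in\partial j\setminus a} p(\sigma_{b\setminus j}\mid-\sigma_{j})\Big]\,p(F_{j}[\sigma_{a\setminus i}]\mid\sigma_{i}) \tag{1}$$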

$F_{j}$ in (1) is the standard flip operator acting on spin $j$, while the combination of several terms of the type $p(\sigma_{a\setminus i}\!\mid\!\sigma_{i})$ is characteristic of the cavity master equation closure, and structurally analogous to the earlier described case of the (ferromagnetic) $p$-spin model Aurell et al. (2018). The term $r_{j}(\pm)$ in (1) is, on the other hand, the jump rate of spin $j$ when it takes value $\pm 1$. This quantity depends on the instantaneous value of spin $j$ and on the instantaneous values of all the spins interacting with $j$, through all the clauses in which spin $j$ appears. To describe the dynamics of the FMS algorithm one takes

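$$r_{i}=\frac{E_{i}(\sigma_{i},\sigma_{\partial i})}{KE}\,\min\left[e^{-\beta\Delta E(\sigma_{i},\sigma_{\partial i})},1\right] \tag{2}$$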

where $E_{i}(\sigma_{i},\sigma_{\partial i})$ is the number of unsatisfied constraints of which spin $i$ is a member. Each of these constraints can be written

$$E_{a}=\frac{1}{2^{K}}\prod_{i\in a}\left(1-l_{i}^{a}\sigma_{i}\right) \tag{3}$$

$K$-satisfiability is thus a mixture of $p$-spin problems, where $p$ ranges from 1 to $K$. FMS is based on focusing and a Metropolis step. In the focusing step of FMS an unsatisfied clause is picked uniformly at random, and thereafter one variable in that clause is again selected uniformly at random. This is the same as picking a variable partaking in unsatisfied clauses with probability proportional to $E_{i}(\sigma_{i},\sigma_{\partial i})$, which explains this factor in (2). The term $\min\left[e^{-\beta\Delta E(\sigma_{i},\sigma_{\partial i})},1\right]$ is on the other hand the standard Metropolis factor.
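The focusing and Metropolis steps described here can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the clause representation (lists of `(variable, sign)` pairs), the function names, and the sign convention that $l_{i}^{a}=+1$ denotes an unnegated variable are all assumptions made for the example. The acceptance probability is written as $\eta^{\Delta E}$ with $\eta=e^{-\beta}$, which equals the Metropolis factor whenever $\Delta E>0$.

```python
import random

# Assumed representation: a clause is a list of (variable index, sign) pairs,
# with sign +1 for an unnegated variable and -1 for a negated one.
# Spins take values +1 (true) or -1 (false).

def clause_energy(clause, spins):
    """E_a = 2^{-K} * prod_i (1 - l_i * sigma_i): 1 if unsatisfied, else 0."""
    prod = 1
    for var, sign in clause:
        prod *= 1 - sign * spins[var]  # each factor is 0 or 2
    return prod // 2 ** len(clause)

def total_energy(clauses, spins):
    """Total number of unsatisfied clauses."""
    return sum(clause_energy(c, spins) for c in clauses)

def fms_step(clauses, spins, eta, rng):
    """One focused Metropolis move; returns True if a flip was accepted."""
    unsat = [c for c in clauses if clause_energy(c, spins) == 1]
    if not unsat:
        return False  # a solution is an absorbing state
    clause = rng.choice(unsat)   # focusing: uniform over unsatisfied clauses
    var, _ = rng.choice(clause)  # then uniform over the clause's variables
    before = total_energy(clauses, spins)
    spins[var] = -spins[var]     # tentative flip
    delta = total_energy(clauses, spins) - before
    if delta <= 0 or rng.random() < eta ** delta:
        return True              # Metropolis acceptance
    spins[var] = -spins[var]     # rejected: undo the flip
    return False
```

Recomputing the total energy at every step costs time proportional to the number of clauses; a practical solver would update only the clauses containing the flipped variable, but the slower version keeps the correspondence with equations (2) and (3) easy to read.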

To model the dynamics of FMS in overall algorithmic time (wall-clock time), we have to further take into account that the number of unsatisfied clauses changes. When this number becomes smaller, the rate per unit time at which a given unsatisfied clause is picked goes up. This is reflected by the denominator $KE$ in (2), where $K$ is the number of variables per clause ($3$ for 3-SAT) and $E$ is the total number of unsatisfied clauses. This factor kicks in strongly when there are only a few unsatisfied clauses left, so that the variables in these clauses are probed more often. It can be eliminated by letting the FMS algorithm mark time inversely proportionally to $E$, and is hence a kind of globally defined time reparametrization. By a more efficient coding one can bring down the number of sums in (1) from $2^{(K-1)c}$ to $2^{c}$ ($c$ is the number of clauses per variable). This coding is described in the Supplemental Material.
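As an illustration of how the rate in (2) is assembled, the following Python sketch evaluates $r_{i}$ for a single spin in a given configuration. The clause representation, the function names, and the convention of returning a zero rate once $E=0$ are assumptions made for this example; they are not taken from the paper.

```python
import math

# Assumed representation: a clause is a list of (variable index, sign) pairs,
# sign +1 for an unnegated variable, -1 for a negated one; spins are +1 or -1.

def clause_energy(clause, spins):
    """1 if the clause is unsatisfied, 0 otherwise (equation (3))."""
    prod = 1
    for var, sign in clause:
        prod *= 1 - sign * spins[var]
    return prod // 2 ** len(clause)

def fms_rate(i, clauses, spins, beta, k=3):
    """r_i = E_i / (K E) * min(exp(-beta * dE), 1), as in equation (2)."""
    touching = [c for c in clauses if any(var == i for var, _ in c)]
    e_i = sum(clause_energy(c, spins) for c in touching)   # unsat clauses of i
    e_total = sum(clause_energy(c, spins) for c in clauses)
    if e_total == 0:
        return 0.0  # assumed convention: no moves once a solution is reached
    flipped = list(spins)
    flipped[i] = -flipped[i]
    # Only clauses containing i can change when spin i is flipped.
    delta = sum(clause_energy(c, flipped) for c in touching) - e_i
    return e_i / (k * e_total) * min(math.exp(-beta * delta), 1.0)
```

Multiplying the returned value by the total energy $E$ removes the $1/E$ factor, which corresponds to the time reparametrization mentioned in the text.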

III Results

The problem is defined by the ratio $\alpha=M/N$ between the number of clauses ($M$) and the number of variables ($N$) of a given instance of 3-SAT, and by the noise parameter $\eta=e^{-\beta}$ that enters the rates of equation (2). In order to understand the behavior of FMS we need to study its dependence on these two parameters.
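To make the two control parameters concrete, the sketch below builds a random instance at clause density $\alpha$ and converts an inverse temperature $\beta$ into the noise $\eta$. The ensemble used here (each clause draws $K$ distinct variables uniformly and negates each with probability $1/2$) is the usual random $K$-SAT construction and is an assumption, since the text does not spell it out; the function names are likewise hypothetical.

```python
import math
import random

def random_ksat(n, alpha, k=3, seed=0):
    """Random K-SAT instance with N = n variables and M = round(alpha * n) clauses.

    Each clause is a list of (variable index, sign) pairs, where sign +1
    means the variable is unnegated and -1 means it is negated (assumed
    convention).
    """
    rng = random.Random(seed)
    m = round(alpha * n)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(n), k)  # k distinct variables
        clauses.append([(v, rng.choice((-1, 1))) for v in variables])
    return clauses

def noise_from_beta(beta):
    """Noise parameter eta = exp(-beta) entering the Metropolis factor."""
    return math.exp(-beta)
```

For example, `random_ksat(1000, 3.6)` returns 3600 clauses over 1000 variables, a density below the value $\alpha\approx 3.7$ quoted for $\eta=0.45$ in the text.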

For a given noise $\eta$, FMS has been empirically shown to have a zone, for $\alpha$ lower than some $\alpha_{c}(\eta)$, where it solves 3-SAT instances in times linear in system size $N$. For $\alpha\geq\alpha_{c}$ solutions are found in times that grow exponentially with $N$, or solutions do not exist. This is shown in figure (1). As can be seen in the top panel of this figure, for $\eta=0.45$, FMS is able to solve instances that have $\alpha\leq 3.7$, and seems to fail otherwise. In the bottom panel, size effects are represented. For $\alpha=3.6$ FMS results seem to be almost independent of $N$.

Then, by numerically integrating equations (1), one can obtain the CME behavior for the same values of the parameter $\eta$. Results can be seen in figure (2, top). Although the transition value of $\alpha$ is not identical to that of figure (1), the results of CME are qualitatively very similar. The difference is that CME, as is natural for the solution of a set of ordinary differential equations, either converges to zero fairly rapidly or does not converge to zero. The zone where FMS solves the problem by fluctuations is hence not well described by CME. The predicted threshold of CME ($\alpha_{c}$ for given $\eta$) is thus generally slightly smaller than the empirically determined threshold of FMS.

In figure (2, bottom) a comparison is made between CME and FMS, for $\eta=0.65$ and several values of $\alpha$. Below the transition line of CME the agreement is very good.

In summary, a comparison between the corresponding phase diagrams of FMS and CME is shown in figure (3). As one sees, there is high qualitative similarity between them: the transition line in CME is pushed to slightly smaller values of $\alpha$, but the two curves follow each other quite closely as the parameter $\eta$ is varied.

IV Discussion

The qualitative and quantitative description of the energy landscapes in combinatorial optimization problems is one of the most important results of statistical physics of disordered systems, with applications in many areas of science Mézard et al. (1987); Mézard and Montanari (2009). The quantitative prediction of the exact threshold between a SAT and an UNSAT phase in random satisfiability problems by a one-step replica symmetry breaking (1RSB) technique was a breakthrough Mézard et al. (2002), which has been extended to many other paradigmatic problems in computer science such as graph coloring Mulet et al. (2002), vertex covering Zhou and Zhou (2009), and the stochastic block model Decelle et al. (2011).

Yet, these advances a priori describe statics, and not dynamics. A long line of empirical investigations surveyed in the introduction has shown that the phase diagram of non-equilibrium local search appears unrelated to bounds derived from the complexity of (equilibrium) free energy landscapes. A further and more recent discussion that non-equilibrium may be “unreasonably effective” was given in Baldassi et al. (2016) and similarly in Budzynski et al. (2018). For combinatorial optimization it may hence be possible to achieve what has sometimes been posited to be impossible from equilibrium considerations. A full realization (and exploitation) of these results has however been hampered by a lack of systematic theory. This is what we have furnished here, by adapting recent advances in the description of dynamics on locally tree-like graphs.

Our theory for how the local search proceeds in time is very accurate away from the (algorithm-dependent) phase boundary. The discrepancies found close to the phase boundary are very likely due to the build-up of correlations in time which are not captured by the closure approximation that leads to the Cavity Master Equation. We note that in the simpler case of synchronously updated spin systems (parallel updates) it was possible to improve on an analogous Markov approximation presented in Del Ferraro and Aurell (2015) by using the matrix product approximation of quantum many-body theory Barthel et al. (2018). We believe it is likely that efficient and more accurate higher-order closure schemes can also be found for continuous-time dynamics. For Focused Metropolis Search we find that our theory captures well the form of the phase boundary: for given $\eta$ (Metropolis parameter) the predicted boundary is basically shifted to a somewhat smaller value of $\alpha$ (clause density).

Acknowledgments

We acknowledge support from the European Union Horizon 2020 research and innovation programme MSCA-RISE-2016 under grant agreement No. 734439 INFERNET and from an Erasmus+ International Credit Mobility to KTH (EU).

Bibliography

1. Garey and Johnson (1979): M. R. Garey and D. S. Johnson, Computers and intractability (Freeman, 1979).
2. Mézard and Montanari (2009): M. Mézard and A. Montanari, Information, Physics and Computation (Oxford University Press, 2009).
3. Mitchell et al. (1992): D. Mitchell, B. Selman, and H. Levesque, in Proceedings of the tenth national conference on Artificial intelligence (AAAI Press, 1992) pp. 459–465.
4. Monasson et al. (1999): R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, and L. Troyansky, Nature 400, 133 (1999).
5. Achlioptas and Coja-Oghlan (2008): D. Achlioptas and A. Coja-Oghlan, in Foundations of Computer Science, FOCS’08 (IEEE, 2008) pp. 793–802.
6. Coja-Oghlan and Panagiotou (2016): A. Coja-Oghlan and K. Panagiotou, Advances in Mathematics 288, 985 (2016).
7. Mézard et al. (2002): M. Mézard, G. Parisi, and R. Zecchina, Science 297, 812 (2002).
8. Marino et al. (2016): R. Marino, G. Parisi, and F. Ricci-Tersenghi, Nature Communications 7, 12996 (2016).