Biased random k-SAT
Joel Larsson, Klas Markstr\"om

TL;DR
This paper investigates how introducing a bias towards positive literals in random k-SAT affects the satisfiability threshold, providing asymptotic results as the bias approaches 0 or 1/2, confirming earlier heuristic predictions.
Contribution
It analyzes the impact of variable occurrence bias on the satisfiability threshold in random k-SAT, deriving asymptotics for extreme bias values and validating previous heuristic predictions.
Findings
Asymptotic threshold behavior as bias approaches 0
Asymptotic threshold behavior as bias approaches 1/2
Confirmation of earlier heuristic predictions
Abstract
The basic random -SAT problem is: Given a set of Boolean variables, and clauses of size picked uniformly at random from the set of all such clauses on our variables, is the conjunction of these clauses satisfiable? Here we consider a variation of this problem where there is a bias towards variables occurring positive -- i.e. variables occur negated w.p. and positive otherwise -- and study how the satisfiability threshold depends on . For this model breaks many of the symmetries of the original random -SAT problem, e.g. the distribution of satisfying assignments in the Boolean cube is no longer uniform. For any fixed , we find the asymptotics of the threshold as approaches or . The former confirms earlier predictions based on numerical studies and heuristic methods from statistical physics.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsConstraint Satisfaction and Optimization · Auction Theory and Applications · Game Theory and Voting Systems
spacing=nonfrench
Biased random -SAT
Joel Larsson111Mathematics Institute, University of Warwick
Klas Markström222Department of Mathematics and Mathematical Statistics, Umeå Universitet
Research supported by a grant from the Swedish Research Council (Vetenskapsrådet)
klas.markströ[email protected]
Abstract
The basic random -SAT problem is: Given a set of Boolean variables, and clauses of size picked uniformly at random from the set of all such clauses on our variables, is the conjunction of these clauses satisfiable?
Here we consider a variation of this problem where there is a bias towards variables occurring positive – i.e. variables occur negated w.p. and positive otherwise – and study how the satisfiability threshold depends on . For this model breaks many of the symmetries of the original random -SAT problem, e.g. the distribution of satisfying assignments in the Boolean cube is no longer uniform.
For any fixed , we find the asymptotics of the threshold as approaches [math] or . The former confirms earlier predictions based on numerical studies and heuristic methods from statistical physics.
Keywords— Random -SAT, random constraint satisfaction problem, phase transition, combinatorial probability
1 Introduction
Random -SAT formulas, and their set of satisfying assignments, have become one of the most studied intersection points of combinatorics, computer science and physics. The basic problem is as follows. Let be a set of Boolean variables. A -clause is a Boolean formula of the form , where each is either or for some . A -SAT formula is a Boolean formula of the form , where each is a -clause. The random -SAT problem asks: if we take a -SAT formula with clauses on variable uniformly at random from all such formulas, for which is satisfiable w.h.p.? For which is unsatisfiable w.h.p.? Is there a sharp threshold in between? It turns out that the crucial parameter is the linear density of .
One of the earliest appearances of this problem is in [6], where it was shown that for certain large a random CNF of the type just described is not satisfiable, and that this is hard to show using the resolution proof system. Soon thereafter [7] demonstrated that for there is a threshold from solvable CNFs to unsolvable ones at the critical density , and they asked if a similar threshold could be identified for all fixed . Around the same time [19, 23, 28] a series of extensive – at least for the computational resources of the time – simulation studies of this problem was begun and evidence for a threshold constant was presented for many small values of . This in turn sparked the interest of the statistical physics community, seeing similarities between random -SAT and problems in spin-glass theory. A number of heuristic calculations based on such methods [24, 25] provided conjectures both for the values of and the structure of the set of solution to a satisfiable random CNF. Parts of these calculations could also be made mathematically rigorous [31], but not the most crucial ones.
The existence of a sharp threshold for random -SAT was proven by Friedgut in [14], however neither the location of the this threshold, nor that does not asymptotically depend on , follows from his very general threshold results. On the other hand an even more detailed description [4] of the threshold for has been obtained and the asymptotic behaviour of as has been found [8]. The existence of such a constant for large enough has been established [10], and for the exact location of the satisfiability threshold has been determined [15, 21]. It is generally believed that exists for all fixed but recently the exact value given by the cavity-method [24, 25] has been questioned [22] for low .
The aim of this paper is to study a variation on the random -SAT problem where, instead of taking each clause uniformly at random, we introduce a bias parameter which determines the probability of a variable being negated or non-negated in each clause. Variables in a clause independently occur non-negated with probability and negated with probability , independently for each variable. For we thus get the usual random -SAT problem, and for smaller we get a higher proportion of non-negated variables in our clauses. For this -SAT distribution has been studied [3] in connection with the hardness of approximating the maximum number of clauses which can be satisfied in random CNF, and surprisingly evidence was found for the balanced case not being the hardest one. The case is also covered by the results in [9], where the threshold behaviour of a more general family of -SAT distributions was identified. For the threshold has been studied numerically, albeit to quite low precision [29]. Our main focus will be on determining the satisfiability threshold as a function of both and , and where possible confirm behaviour conjectured in the older literature. We will also give some results on the distribution of satisfying assignments in the hypercube. In the unbiased version this distribution is uniform and this in part responsible for the inefficiency of some probabilistic tools in analysing the model, as noted in e.g. [2].
We find the exact threshold for any when , and for very close to when , similarly to what is known for the balanced case . For fixed , we show that the threshold is approximately quadratic in near and scales like as , the latter confirming a prediction based on replica symmetry heuristics [26]. While the proof for the case is mostly an adaption of known methods, the proof for the case is novel, and may be of independent interest. The reader interested in the latter proof can skip directly to section 5, which is largely self-contained.
1.1 Biased k-SAT
Let be a real number. The biased random -SAT problem is a random SAT problem, where the clauses are picked according to the following distribution: Start with a set of Boolean variables and pick from it a -set uniformly at random. Then, independently for each , pick the literal with probability and the literal otherwise.333As the problem is symmetric under , we will assume throughout that . The literals thus chosen form a clause , which we call a -biased clause. Let be the conjunction of i.i.d. -biased clauses.
Let , and let if the limit exists. In a slight abuse of notation, we will write (for instance) ‘’ even if we do not whether exists, but this should be understood as being short for ‘’. We will study how the satisfiability threshold behaves as a function of . Regions of particular interest are and . In the latter case, we will often work with the parametrization for some small positive .
1.2 Structured coupon collection
It is worth noting that random SAT problems are examples of structured coupon collector problems [13]. Each assignment of true/false to the Boolean variables is an -word in a two-symbol alphabet, and as such can be identified with the vertices of the hyper-cube . (Here, denotes false and denotes true.) Each -clause forbids a subset of those vertices, for instance the clause forbids any vertex of the form . The set of vertices forbidden by forms a --sub-cube of .
A sub-cube can be represented as an -word in the alphabet in the following way: iff for all , iff for all , otherwise. In other words, denotes the ‘free’ coordinates of (variables that do not occur in the clause), and the dimension of is the number of ’s in . The set of solutions forbidden by the clause above is the sub-cube . Since there is a bijection between clauses and sub-cubes in this way, we will henceforth identify a clause with its corresponding sub-cube of forbidden vertices.
We will usually think of random -SAT as a process where we add clauses at integer times until the formula is no longer satisfiable. A variation on this is to add clauses at times given by a Poisson process of intensity . At time , the discrete-time model will have clauses, while the continuous-time model will have clauses. These two models are closely related444Cf. the two random graph models and , and the satisfiability threshold of either of them is within a multiplicative factor of the other. We will therefore work with whichever model best suits our needs at any given time, but take care to specify when we switch between them.
1.3 Structure of paper
Each section of the paper is largely self-contained. We begin by adapting known techniques to the biased version of the random -SAT problem in sections 2, 3 and 4.
In section 2, we give the exact threshold for the case . The remainder of the paper assumes . In section 3, we find a lower bound on for fixed by analyzing the unit clause propagation algorithm. In section 4 we use the method of moments to bound . In section 4.1, we estimate the first two moments of the number of solutions to a biased -SAT formula. This leads to sharp bounds on for close to and for sufficiently large. We study a variation on the first moment method in section 4.2, and find an slightly sharper upper bound on which for fixed is within a constant factor of the lower bound from section 3. These results together establish the asymptotic behavior as .
Finally, in section 5, we investigate the asymptotics of as , by studying how the satisfiability of a formula is affected by changing the occurrences of a single variable. We use a novel combination of tools, including Russo’s formula and the Kruskal-Katona theorem, to show that the satisfiability threshold is approximately a parabola near , i.e. .
2 Satisfiability threshold for biased random 2-SAT
In the special case , we find the exact value of the threshold. For the classical (unbiased) -SAT problem, the threshold value of was established by Chvátal & Reed [7] by exploiting some of the structure specific to -SAT.
Later Cooper, Frieze & Sorkin [9] worked with -SAT formulas of prescribed literal degrees and gave a criterion for satisfiability. Before we state their theorem, we need the following notation: For a -SAT formula , let be the number of occurrences of in , and similarly be number of occurrences of . For any sequence , let and .
Theorem 1 (CFS):
Let be constant and . Let be any literal-degree sequence over variables with and even, and let be chosen uniformly at random from all -SAT formulas with degree sequence .
- (i)
If , then 2. (ii)
If , then
Both limits are uniform in (independent of ).
Theorem 2:
The biased -SAT problem with bias has a sharp satisfiability threshold at .
Sumedha, Krishnamurthy & Sahoo [30] first sketched this, and we will give a full proof.
Proof. We prove this for the continuous-time case, the discrete-time case follows. Pick an . For a -biased -SAT formula with clauses, the quantities and are random variables. , while
[TABLE]
Both and are sharply concentrated around their means. Letting , we find that with high probability, and
[TABLE]
with high probability, so that (i) from Theorem 1 is satisfied w.h.p. Similarly, letting gives that (ii) is satisfied w.h.p. The theorem follows. ∎
3 Algorithmic lower bound on satisfiability threshold
In this section we will show how to adapt the work of Chao and Franco [5] and Achlioptas [1] to biased -SAT. We will work in discrete time.
We will show that the algorithm ‘unit clause propagation’ (1) succeeds in finding a satisfying truth assignment with positive probability when is not too large. This algorithm is non-backtracking, and straight-forward to analyze. While better lower bounds are known for the non-biased case, this gives a lower bound that (for any fixed ) scales correctly with .
1 is given a SAT formula as input, in the form of a set of subsets of . It repeatedly tries to satisfy a unit clause (i.e. a clause on a single variable), and if no such clause exists it sets a random variable to a random value. Satisfied clauses and unsatisfied literals are then removed. The algorithm succeeds iff no empty clauses are ever generated.
It might seem strange to let with probability in the first ‘else’ of the algorithm, rather than with probability (since this would maximize the expected number of clauses being satisfied). The reason for this choice is to simplify the analysis of the algorithm: it ensures that the dynamics of the number of -clauses, , is independent from the number of -clauses.
Theorem 3:
There exists a such that if is at most , then 1 finds a satisfying assignment to with probability at least .
Furthermore, if a satisfying assignment is found, the number of variables set to ‘FALSE’ follows a binomial distribution with parameters and .
The intuition behind this theorem is as follows: An -clause is turned into a -clause with probability , and else removed. So of the -clauses, only a fraction of approximately survives long enough to be reduced to -clauses. If the rate at which -clauses are being created is strictly less than the rate at which they are dealt with (which is ), queuing theory suggests that the queue of -clauses will remain of bounded size. If the queue is of bounded size, the probability of there existing contradicting -clauses at any given time is of order , which suggests that the probability of there ever existing such clauses should be of constant order.
Proof. To a large extent the proof is essentially mutatis mutandis from [1], so we will focus on the necessary modifications and how the results follows from a few key lemmas.
For every and , let be the number of -clauses at time , and . Our aim is to understand the trajectory of the (time-inhomogeneous) Markov chain .
Looking at the expected difference , conditioned on , we see that
[TABLE]
because any -clause contains of the free variables, and each of them is equally likely to be locked at time .
Similarly, for any ,
[TABLE]
where the first term is as before, while the second term counts the expected number of -clauses being shrunk to -clauses. (When a variable occurring in an -clause is locked, the clause is shrunk to an -clause with probability , and else removed.) Together these difference equations describe the Markov chain .
The main proof idea is to look at the scaling limit of the expected trajectory of this Markov chain (the so-called liquid model). However, there are two problems that arise.
First, for , is of order and the dynamics of is not too sensitive to deviations in of order . But for or , and the dynamics is sensitive to small deviations. We deal with this problem by looking at the scaling limit of only for , and then making sure that is never so large that the influx of -clauses exceeds the rate at which they can be removed. The following lemma (which we state without proof) is a slight modification of lemma 4 in [1].
Lemma 4:
For any , if is such that and
[TABLE]
then with probability at least .
This lemma says that as long the density of the -clauses stays below times the satisfiability threshold for -SAT at time , there is a positive probability that there are no [math]- or -clauses at . So, next we need to bound the expected value of , and show that stays close to it.
To see the second problem, let’s look at the system of differential equations describing the scaling limit. If we let the functions be defined by for , they will satisfy the following system (which is the scaling limit of the system of difference equations describing ):
[TABLE]
The system has the following unique solution:
[TABLE]
We want to show that the trajectory of unlikely to deviate much from the trajectory of . There is a theorem by Wormald [32] that lets us do precisely that, and in order for it to apply we need the Markov chain to satisfy these two properties:
- (i)
The system of differential equations describing can be written on the form for some Lipschitz continuous function (for some appropriate domain and time interval ). 2. (ii)
Conditioned on the history of up to time the probability that is at most .
These properties are largely unaffected by the value of . The first holds for any time interval regardless of . For the second one, the increments follow an approximate Skellam distribution, which have exponential tails.
Since the first condition doesn’t hold all the way until time , we will only analyze the algorithm on the time interval , and then show that the formula remaining at time is sparse enough to be satisfied easily. Applying Wormald’s theorem, we get the following lemma.
Lemma 5:
For any , there exists such that with probability at least ,
[TABLE]
In order for Lemma 4 to apply, we need that is sufficiently small for all , and now Lemma 5 tells us that will stay close to for all such . That is, for any , holds w.h.p. So,
[TABLE]
For this to be at most the upper bound for in Lemma 4, we need the expression within brackets to be bounded away from . We accomplish that by choosing and for some . Thus, for as above and , with probability at least we have that there are no clauses smaller than at time .
Let be the formula remaining at time , conditional on there being no clause of size less than , and let be the number of clauses it consists of. To show that is satisfiable w.h.p., we construct a new formula from by uniformly at random throwing away literals from each clause with more than literals. Any assignment satisfying also satisfies . The formula is a -SAT formula, and it follows the distribution . By Theorem 2, is satisfiable w.h.p. if .
is close to , so it follows that (with high probability). Thus
[TABLE]
If we let and , the expression within brackets is at most . The pre-factor is at most , so the entire expression is bounded away from and thus the condition of Theorem 2 is satisfied. So is satisfiable w.h.p., and any assignment that satisfies also satisfies .
For the ‘furthermore’ part of the theorem, note that the signs assigned to the variable are i.i.d. Bernoulli r.v.’s., so that the number of variables set to ‘FALSE’ is a binomial random variable with tries and success probability of . ∎
4 Method of moments bounds on satisfiability threshold
The earliest proven upper bound () was found by applying the first moment method to the number of satisfying assignments.555In this section, we work in discrete time. This upper bound has been improved many times, often by using variations on the same method.
4.1 Vanilla first and second moment methods
In this section we will estimate the expected number of satisfying assigment in the biased -SAT model, which leads to an upper bound on . We will also employ the second moment method, but this only gives a non-trivial lower bound when is logarithmic in .
While the classical random -SAT problem is vertex transitive on the set of solutions, introducing a bias breaks that symmetry. However, the biased version is vertex transitive on any fixed weight ‘layer’ of , i.e. subset where the number of coordinates equal to is equal to some constant . The number of such layers is relatively small (, compared to vertices in total), so dealing with each layer separately and then applying a union bound only generates small error terms. First, we will need some notation.
- (i.)
For any integers , define to be the :th layer of , and take such that (i.e. the Hamming distance between and is ). 2. (ii.)
Let be a -biased random sub-cube, and let and . Note that . 3. (iii.)
Let be the (random) number of non-covered vertices in after clauses (or cubes) have been drawn, and let . 4. (iv.)
Finally, let and .
Applying the first moment method to the random variable , we see that . We will begin by estimating .
Claim 1:
For , \big{(}\frac{i-j+1}{k-j+1}\big{)}^{j}\leq{\binom{i}{j}}\big{/}{\binom{k}{j}}\leq\big{(}\frac{i}{k}\big{)}^{j}.
Proof. Expand the binomial coefficients and repeatedly apply the inequality (valid for and ). ∎
Claim 2:
Define the binary entropy . Then, for , and .
Proof. The second part of the lemma follows from the function being concave and monotonely decreasing. For the first part, note that is concave and , so . ∎
Claim 3:
For any , c_{p,x}=-{\frac{1}{n}\log\binom{n}{xn}}/{\log\big{(}1-Q(xn,n)\big{)}}
Proof. Let . There are vertices in , each of which fails to be covered by -biased random sub-cube with probability , so . This equals precisely when
[TABLE]
∎
Claim 4:
*For any , let . Then
*
Proof. Let . For a fixed , and a -set , there is a unique cube containing and whose set of locked variables is precisely . The number of cubes containing and with precisely variables locked to is then . Each such cube appears with probability , so that
[TABLE]
We approximate the binomials using claim 1:
[TABLE]
where the constants implicit in the big- notation do not depend on . Summing over all , we get that
[TABLE]
The claim follows. ∎
Claim 5:
Q^{r}(i,n)\leq\Big{(}1-\frac{r}{n-k+1}\Big{)}^{k-1}Q(i,n)**
Proof. First, note that . But , so
[TABLE]
Together with claim 1 this proves the claim. ∎
Now that we have some good estimates for the probabilities and , we can proceed to use the first moment method to get an upper bound on the satisfiability threshold, and the second moment method to get a lower bound.
Proposition 6 (Bounds on first moment threshold):
For any integer and (with and/or possibly depending on ), let and . Then
[TABLE]
In particular,
- (1.)
If for some , then
[TABLE] 2. (2.)
If and is fixed, then for all sufficiently large
[TABLE]
Proof. For the lower bound, note that by definition. So, in particular, , and it suffices to estimate . For the upper bound, we shall find an small interval on which the supremum is attained. Recall that
[TABLE]
where . For the sake of simplicity we will work with rather than directly with .
First, note that is a strictly concave continuous function on , whence there exists a unique which maximizes . Since whereas for any , we must have . Second, note that has a continuous derivative on , so if are such that , then . Now, for any ,
[TABLE]
The pre-factor is always positive, so the sign of will be the same as the sign of . Expanding out the definition of , we can rewrite as
[TABLE]
Let and consider . Then the term (1) equals [math] (since at least one of the factors in that term equals [math]), while the term (2) is negative for any . So , and thus .
Next, we let (which is at most ) and consider . Then the term 1 equals
[TABLE]
which is decreasing in and hence at least \log\big{(}\frac{1-\frac{1}{5}}{\frac{1}{5}}\big{)}\cdot\frac{3}{5}p>0.8p. The term 2, on the other hand, equals
[TABLE]
which is also decreasing in and hence at least , which for is at least . Together this gives us that . It follows that , and together with we have that .
Now that we have an interval which we know contains , we can estimate . To bound the first factor from above, note that is a strictly increasing function on , and . Thus . Similarly, to bound the second factor from above, note that is decreasing on , and , whence . Together these two inequalities gives us that
[TABLE]
This proves the upper bound part of the proposition. For the ‘in particular’-statements, apply the definition of and the inequality . ∎
Next, we will show that for growing sufficiently fast and bias sufficiently small, the first moment bound is tight (i.e. ).
Theorem 7:
Assume for some and let be fixed.
- (i.)
For any , we have that
[TABLE]
where is defined as the smaller of the two roots of the following equation:
[TABLE]
In other words, is satisfiable w.h.p. for . 2. (ii.)
If , then . In other words,
[TABLE]
These results are similar to those known for the unbiased -SAT problem. Setting (and hence ) in our theorem we recover the lower bound which is known for that case [15, 21]. We also note that the result remains valid for which depend on .
Proof. Let . We will use the second moment method to prove that for any , there exists a solution in with high probability if . From this the two parts of the theorem follow by noting that (i.) , and (ii.) for sufficiently close to , the that maximizes lies in .
[TABLE]
In order for the event to occur, a Poisson process of intensity must have had no event on the the time interval . The probability of this happening is . Thus the denominator equals .
Similarly, in order for the event to occur, no clause covering or can have occurred. By inclusion-exclusion, the total intensity of clauses covering at least one of and is , where , so this event has probability .
How many pairs have Hamming distance ? Starting from , such a is uniquely determined by which ’s we flip to [math]’s, and which [math]’s we flip to ’s. So for any there are vertices at distance from . It follows that
[TABLE]
Define to be the function
[TABLE]
so that .
Claim 6:
The function is concave on .
Proof. The second derivative of is , which is at most . Similarly, the second derivative of is at most .
Using that is at least , we see that the second derivative of is at most
[TABLE]
Hence . ∎
We will estimate for different ranges of :
- Case I:
The factor is at most , and . On the other hand, , which (by assumption) is at most . Together this gives us that . It follows that
[TABLE] 2. Case II:
For convenience, let . Using that , it follows that
[TABLE]
where the last inequality comes from noting that is an increasing function on , and thus at most . So is at most \exp\big{(}H(x)\cdot o(n^{2\delta})\big{)}, which is if we set . It follows that for in this interval, and summing over we get
[TABLE] 3. Case III:
Since is concave on this interval, its graph lies beneath any of its tangent lines. In particular, this is true for the tangent line at . In other words, for any we have that . Since , we need to lower bound and upper bound .
[TABLE]
Since , (4) is at most
[TABLE]
and by the previous case. The fraction in the right-hand side of (3) is at least . Thus
[TABLE]
To bound from above, recall from the previous case that it is at most . The expression is maximized for , where it is . Hence
[TABLE]
and thus
[TABLE]
Together these three cases give us that , which implies that . Chebyshev’s inequality then gives us that with probability . ∎
4.2 Improved first moment method
When the unit-clause algorithm from section 3 succeeds in finding a solution, the number of variables set to ‘false’ is concentrated around . So we might suspect that is the dominating term in the expected number of solutions. Recall that is the probability that a -biased -clause covers an arbitrary vertex in , and that for we have
[TABLE]
Furthermore, recall that is the number of uncovered vertices in , and that
[TABLE]
For small , in order for the right hand side to be negative we need to be of order . This only differs from the lower bound by a log-factor, and a small improvement on the vanilla first moment method suffices to correct this: the single-flip method (due to Dubois & Boufkhad [12]). We will adapt this method to the biased random -SAT model.
Proposition 8:
For small enough , .
Proof. We will apply the first moment method to a subset of solutions, which is guaranteed to be non-empty if the set of solutions is non-empty.
Let be the sub-cubes drawn, and let be the set of covered vertices at time . The hyper-cube can be given a lattice ordering as follows: if for every coordinate . This is isomorphic to the lattice ordering on induced by inclusion. Let be the number of solutions (uncovered vertices) in that are locally minimal w.r.t. this order.666We might equally well count the number of locally maximal solutions. Note that is non-zero if and only if the formula is satisfiable. We will bound from above.
Pick any , and let be the vertices in adjacent to . The probability that is a locally minimal solution can then be written as
[TABLE]
The event is the union over all of the events . For any fixed , the events and are mutually exclusive conditional on , because any cube covering both and for some must also cover (by convexity of ). We can therefore consider this as a balls-and-bins problem, where the balls are clauses and the bins are vertices . Let , i.e. the number of ‘balls’ in ‘bin’ number . Dubhashi & Ranjan [11] studied negative dependence of balls-and-bins problems, and in particular showed that the vector of the number of balls in each bin satisfies the negative association property (theorem 13 of that paper).
So is a negatively associated vector, and proposition 4 from the same paper gives the following inequality:
[TABLE]
The left hand side is precisely , while each factor on the right hand side is , whence
[TABLE]
Claim 7:
The probability of a -biased random cube covering but not is
Proof. Assume WLOG that , (and thus for ). For a cube to cover but not , it must have a in the first position. This happens with probability . The other positions can be seen as a -co-dimensional sub-cube of , which must cover the vertex . This happens with probability
[TABLE]
The claim follows. ∎
Using claim 7, we can calculate the conditional probability of not being covered.
[TABLE]
from which it follows that is a locally minimal solution with probability at most
[TABLE]
Recall that is the number of locally minimal solutions (in layer ) to a -biased random -SAT instance with clauses. We can estimate the expected number of minimal solutions to be at most
[TABLE]
Let . We will use the following bound (valid for sufficiently small and ).
[TABLE]
Then . Let . This function is minimized for , giving the inequality . In particular, for we have that
[TABLE]
which in turn gives the following bound on :
[TABLE]
The right hand side is less than for . Hence , which for is at most . But for smaller , it suffices to note that , and
[TABLE]
Thus, by Markov’s inequality,
[TABLE]
In other words, for small enough. ∎
5 Satisfiability threshold for bias near 0
In this section we will prove the main theorem of this paper777From now on, we will work with the continuous-time version of the biased -SAT problem. , which describes the shape of the threshold near . We will work with the parametrization for some small .
Theorem 9:
For any , there exists a constant such that for all sufficiently small ,
[TABLE]
The upper bound on can be improved slightly to by optimizing the choice of parameters in the proof of Lemma 12 due to Chvátal-Szemeredi [6], but in order to achieve a sub-exponential upper bound Lemma 19 would need to be significantly sharpened.
As an aside, what should we expect the correct quadratic coefficient to be? While the lower bound matches our exact result of for , the upper and lower bounds for larger bias suggest a larger quadratic coefficient for . The threshold curve for near [math] or scales like , and if it follows that curve for near too, we get the Taylor expansion near . One might therefore guess that is the correct coefficient. This also matches the result for .
Before we can continue with the proof of Theorem 9, we will need some background on spine variables.
Definition 10:
Let be a satisfiable formula and a variable in it. We say that is a spine variable in if has the same value in any assignment satisfying . If such an always has value ‘TRUE’, we say that it is a positive spine variable and that it is locked to TRUE. (Similarly for negative.)
For an unsatisfiable formula, we say that is a spine variable in if there exists a satisfiable formula such that is a spine variable in .
We will use the following definition and lemma from Chvátal-Szemerédi [6].
Definition 11:
Let . A -uniform hypergraph with vertices is (,)-sparse if every set of vertices contains at most edges.
Lemma 12 (Chvátal-Szemerédi):
Let and . Then w.h.p. the random -uniform hypergraph with vertices and edges is -sparse, where
[TABLE]
Using this lemma, Boetcher, Istrate & Percus proved that the number of spine variables is either or at least for some (part 1 of theorem 3 in [17]). But their proof actually works even when replacing with [math], leading to the following theorem.
Theorem 13 (Boetcher-Istrate-Percus888A special case of their Theorem 3, using our notation):
For , there is a constant such that the following holds with high probability: for every unsatisfiable with , the number of spine variables of is either [math] or at least .
The following lemma is a consequence of the proofs of Lemmas 12 and 13, and we state it without proof.
Lemma 14:
For a fixed , is a decreasing function of . Furthermore, for a fixed , .
Next, we will need a lemma that gives a correspondence between spine variables and clauses that turn a satisfiable formula into an unsatisfiable one.
Lemma 15:
Let be a satisfiable -CNF with a set of positive spine variables and a set of negative spine variables. Then is unsatisfiable if and only if can be written as
[TABLE]
for some .
In other words, is unsatisfiable iff every variable that occurs in is a spine variable in , and the sign that it has in is incompatible with the truth value it is locked to in .
Proof. For the ‘if’ part, assume is of the above form. If is a solution to , it has for , so . Similarly, . But then , so is not a solution to .
On the other hand, if is not a solution to it is not a solutions to . Hence is unsatisfiable, proving the ‘if’ part.
For the ‘only if’ part, assume instead is not of that form. Then either
There exists an such that occur in with negative sign. In that case, any that satisfies will also satisfy , because such an will have , and will have a term . Hence is satisfied by . 2. 2.
There exists an such that occur in with positive sign. Analogously to the previous case, any that satisfies will also satisfy . 3. 3.
There exists an such that occur in . In that case, there exist with that satisfy (otherwise would have been a spine variable!). Either or will satisfy , so is satisfiable.
So in any case, is satisfiable. ∎
Overview of proof idea
In the remainder of this section, we will assume that for some small positive , and work with rather than .
Let be the probability that (a -biased -CNF) is satisfiable. By studying the partial derivatives of and estimating the ratio between them, we derive a pair of differential inequalities for the implicit function given by . Solving these inequalities then gives us an upper and a lower bound on the satisfiability threshold.
The -derivative is given by the probability of making the formula unsatisfiable by adding one more clause. Lemma 15 gives us a complete description, in terms of spine variables, of when adding a new clause can turn a satisfiable formula into an unsatisfiable one.
In order to calculate the -derivative, we employ Margulis-Russo’s formula from percolation theory (Proposition 17). This will result in something very similar to the -derivative, only depending on the signs of variables slightly differently. To bridge that gap, we study the effects of re-randomizing the signs that spine variables occur with in clauses, conditional on certain other spine variables not being affected (Lemmas 19, 21 and 20).
Margulis-Russo’s formula was proven by Russo [27] specifically for indicator random variables of increasing events, but the event we are interested in (satisfiability) is not monotone with respect to changing signs. However, in Grimmett’s textbook on percolation theory [16] there is a generalization of Russo’s formula (thm 2.32) to any real-valued random variable. Here is a version of that theorem, restated with our notation and for finite-dimensional product spaces.
Theorem 16 (Russo’s formula, finite case):
Let be a finite set, let the probability space be equipped with the product measure where for any , and let be a real-valued random variable on . For any , let be but with the -coordinate set to . Furthermore, let the pivotal be defined by
[TABLE]
Then, for any ,
[TABLE]
This theorem allows us to calculate the rate of change of by studying the expected effect of ‘local’ changes to . We will apply it with .
Proof of Theorem 9
Proposition 17:
Let be a -biased random -clause, and let be but with the sign of the first variable changed to . Furthermore, let \rho_{\pm}:={\mathbb{P}\big{(}\Phi\wedge C_{\pm}\notin\operatorname{SAT},\Phi\in\operatorname{SAT}\big{)}} and \rho:=\mathbb{P}\big{(}\Phi\wedge C\notin\operatorname{SAT},\Phi\in\operatorname{SAT}\big{)}. Then
[TABLE]
Proof. The -derivative is trivial; the prefactor comes from scaling time by . For the -derivative, let and apply Russo’s formula. Let be the set of clauses in .
[TABLE]
where is the pivot of the variable in the clause . Now, pick an arbitrary . Let be the first variable in . The signed pivotal is if is unsatisfiable and is satisfiable, if the reverse holds, and [math] otherwise. Thus
[TABLE]
Since every term in the sum in eq. 5 above have same expected value, we can apply Wald’s equation to arrive at
[TABLE]
Noting that , the proposition follows. ∎
Lemma 18:
For , and defined as in Proposition 17 and as in Theorem 13,
[TABLE]
In order to prove Lemma 18, we will need the estimates in Lemmas 19, 20 and 21.
Lemma 19:
For conditioned on being satisfiable and having a spine variable , the probability that is locked to ‘TRUE’ is at most .
Lemma 20:
Let be the event that is satisfiable and that the variables are spine variables in , of which are locked to signs (for some ). Then, conditional on , the probability that is locked to ‘TRUE’ is between and .
Furthermore, if we let with probability , then (conditional on ) the probability that is locked to is between and
Lemma 21:
For conditioned on being satisfiable and having a spine variable , the probability that is locked to ‘TRUE’ is at least .
Proof of Lemma 18. Assume (wlog) that the variables in are and they occur with signs . Let the events ,, and be defined in the following way:
:
is unsatisfiable and the variables in clause are spine variables in .
:
These variables all occur with the opposite sign (in ) to the sign they are locked to in .
:
These variables all occur with the opposite sign (in ) to the sign they are locked to in .
First, note that the event happens if and only if the events and happen, and similarly for , and . Thus and . But , and similarly for . It follows that
[TABLE]
We now want to estimate the probabilities . Starting with , we see that
[TABLE]
We will use Lemmas 19 and 21 to estimate the first probability.
[TABLE]
For all the subsequent probabilities, we use Lemma 20.
[TABLE]
For sufficiently small, the product of these probabilities is at most
[TABLE]
and similarly at least . Together these estimates yields
[TABLE]
Similarly,
[TABLE]
Using these bounds, we see that
[TABLE]
Finally, to estimate , simply note that it is a weighted mean of and , both of which are . Thus, cancelling the prefactors of , we get that
[TABLE]
The lemma follows. ∎
Proof of Lemma 19. Let be the event . We start by conditioning on and on having degree in the hypergraph of . Let be the clauses containing , and let be sign occurs with in . Let be the function such that iff the formula obtained by replacing in with for every is satisfiable (otherwise ). Note that is non-decreasing: satisfying more clauses can only make the rest of the formula easier to satisfy.
When trying to find a satisfying assignment to , we can either satisfy all clauses containing or all clauses containing . If , then the formula is unsatisfiable regardless of the value we assign to , and similarly if it is always satisfiable. But if then is satisfiable and is a spine variable locked to .
We therefore let . The function is both non-decreasing and odd.
Now, pick a -biased random conditional on . Construct a new formula from by replacing the signs that occur with in with . Because of the conditioning, is a spine variable in too, and is satisfiable. (Note that is not necessarily equal to , we only know that belongs to both.) What is the expected value of ?
Define to be the probability of , i.e. where is the number of ’s in . Furthermore, define as expected value of , conditional on , i.e.
[TABLE]
We will upper bound , and thus get an upper bound for . Cancelling common factors of and , we see that
[TABLE]
But for any integer ,
[TABLE]
Noting that is at most , we see that the above expression is at most . Hence the expression is at most , and it follows that . So the expected sign of is at most , or in other words
[TABLE]
We don’t know the average degree of spine variables, but we do know the average degree of all variables, and that there are at least spine variables. This gives us the upper bound
[TABLE]
But , and by Theorem 13. Thus
[TABLE]
∎
Proof of Lemma 20. We cannot simply re-randomize the signs that appear with, conditional on it remaining a spine variable, because that could change whether or not (say) is a spine variable.
What we can do, however, is re-randomize the signs of conditional on and being unchanged. Consider of all sign vectors such that replacing the original signs of with will not change or . This set is symmetric, , so its elements comes in anti-podal pairs.
Now the proof from Lemma 19 carries through as before, giving the upper bound of the corollary. For the lower bound, replace with throughout. ∎
Before we continue with the proof of Lemma 21, we will need the following definition and theorem concerning the possible sizes of simplicial complexes. The theorem was proven independently in [20, 18].
Definition 22:
For positive integers and , the -cascade of is defined 999Our indices here are slightly non-standard; usually one works with . as the unique non-increasing sequence of positive integers such that
[TABLE]
Given such , we define .
Theorem 23 (Kruskal-Katona):
For an integral vector , there exists a -dimensional simplicial complex such that if and only if for every .
Proof of Lemma 21. Recall that is the expected sign of a variable, conditional on .
Now, is a simplicial complex (equal to or a subcomplex of it), and this correspondance is a bijection, so is determined by . Not only that, only depends on the number of ’s of at each level of the hypercube , so is determined solely by the -vector of (the vector whose :th coordinate is the number of -faces of ). So henceforth we consider to be a function of .
We want to minimize over the set of ’s that can be written as for some . The Kruskal-Katona theorem gives sufficient and necessary conditions for the existence of a simplicial complex with a given -vector.
Claim 8:
If is the -vector of some -dimensional simplicial complex and minimizes , then for every . In other words, there exists a non-increasing sequence such that for every .
Proof. Let be the set of all integral -vectors such that for some simplicial complex . We want to find .
Note that for every such that , increasing by will decrease slightly, but if doing so will increase it slightly. (For , changing has no effect on .) So we want to increase for big ’s and decrease for small ’s, whenever possible.
We therefore let , and consider an arbitrary fixed integer such that . Let be the set of all such that .
We now aim to find the that achieves . Let be the -cascade of .
By Kruskal-Katona, for any we have that and that this bound is tight. So we can assume wlog that this holds with equality for all such . Similarly, for any we have that wlog .
So for any that minimizes over , we have that is determined by the -cascade of . It follows that for any non-increasing sequence of non-negative integers , we can define a simplicial vector by letting, for each ,
[TABLE]
This will be the unique minimizer (in ) of . ∎
Since satisfies Kruskal-Katona (by design), there exists a simplicial complex (not necessarily unique) with simplicial vector . The size of is given by
[TABLE]
where the last inequality comes from noting that . But cannot be larger than , because and are disjoint. (By definition, , and by symmetry .) It follows that either and , or . In the former case, , and in the latter case regardless of the values of and .
Claim 9:
For a non-increasing sequence ,
[TABLE]
and is a strictly decreasing function of both of its arguments.
Proof. Consider a subcube of whose lowest corner is at level and whose highest corner is at level . For any , this cube will have vertices at level , exactly matching the -term in the -cascade of . One such cube is
[TABLE]
The probability of a -biased random string following the pattern above is precisely , so its -weight will be
[TABLE]
The claim follows. ∎
Claim 10:
The minimum of is either achieved by or for some integer .
Proof. We will use that . If we must have equal to the -term sequence , because is already as large as the sum can be. So assume instead that .
First, assume is not a constant sequence, seeking a contradiction. Then there exists an with . Let be but with replaced by . (I.e. .) The sequence is still non-increasing, and by claim 9, .
Next, assume for some , again seeking a contradiction. Then, if , decrease the length of by , which decreases by . If instead , increase the length of by , which decreases by . ∎
We are now left with only the following candidates for : , for some , or for some . It is easy to check that . Furthermore, is decreasing in , so the minimum is achieved for . It follows that
[TABLE]
∎
We now have all the ingredients necessary to prove Theorem 9.
Proof of Theorem 9. Let be the implicit function defined by . The derivative of is given by
[TABLE]
By Proposition 17, and . So . But by Lemma 18, , which leads to the following pair of differential inequalities:
[TABLE]
We will relax these inequalities slightly. First, we see that , so . By Lemma 14, is a decreasing function, and we may replace the function with the constant in eq. 6.
This leads to the differential inequality , which has the solution \psi(b)\leq{\psi(0)}\big{(}1-\frac{2k^{3}\psi(0)}{\delta_{0}}b^{2}\big{)}^{-1}. Noting that , we see that for small enough . We may therefore also replace with in the right-hand side of eq. 6, leading to
[TABLE]
and this new pair of of differential inequalities has the solution
[TABLE]
Using the lower bound again, the theorem follows. ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] D. Achlioptas. Lower bounds for random 3-SAT via differential equations. Theoretical Computer Science , 265(1-2):159–185, 2001.
- 2[2] D. Achlioptas and C. Moore. Random $k$-sat: Two moments suffice to cross a sharp threshold. SIAM J. Comput. , 36(3):740–762, Sept. 2006.
- 3[3] P. Austrin. Balanced max \max 2 2 2 -sat might not be the hardest. In Proceedings of the Thirty-ninth Annual ACM Symposium on Theory of Computing , STOC ’07, pages 189–197, 2007.
- 4[4] B. Bollobás, C. Borgs, J. T. Chayes, J. H. Kim, and D. B. Wilson. The scaling window of the 2 2 2 -SAT transition. Random Structures and Algorithms , 18(3):201–256, 2001.
- 5[5] M.-T. Chao and J. Franco. Probabilistic analysis of two heuristics for the 3 3 3 -satisfiability problem. SIAM Journal on Computing , 15(4):1106–1118, 1986.
- 6[6] V. Chvátal and E. Szemerédi. Many hard examples for resolution. Journal of the Association for Computing Machinery , 35(4):759–768, 1988.
- 7[7] V. Chvátal and B. Reed. Mick gets some (the odds are on his side). In Foundations of Computer Science, 1992. Proceedings., 33rd Annual Symposium on , pages 620–627, 1992.
- 8[8] A. Coja-Oghlan. The asymptotic k-SAT threshold. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing , STOC ’14, pages 804–813, 2014.
