Biased random k-SAT

Joel Larsson; Klas Markström

arXiv:1906.05127·math.CO·June 13, 2019

TL;DR

This paper investigates how introducing a bias towards positive literals in random k-SAT affects the satisfiability threshold, providing asymptotic results as the bias approaches 0 or 1/2, confirming earlier heuristic predictions.

Contribution

It analyzes the impact of variable occurrence bias on the satisfiability threshold in random k-SAT, deriving asymptotics for extreme bias values and validating previous heuristic predictions.

Findings

01

Asymptotic threshold behavior as bias approaches 0

02

Asymptotic threshold behavior as bias approaches 1/2

03

Confirmation of earlier heuristic predictions

Abstract

The basic random $k$ -SAT problem is: Given a set of $n$ Boolean variables, and $m$ clauses of size $k$ picked uniformly at random from the set of all such clauses on our variables, is the conjunction of these clauses satisfiable? Here we consider a variation of this problem where there is a bias towards variables occurring positive -- i.e. variables occur negated w.p. $0 < p < \frac{1}{2}$ and positive otherwise -- and study how the satisfiability threshold depends on $p$ . For $p < \frac{1}{2}$ this model breaks many of the symmetries of the original random $k$ -SAT problem, e.g. the distribution of satisfying assignments in the Boolean cube is no longer uniform. For any fixed $k$ , we find the asymptotics of the threshold as $p$ approaches $0$ or $\frac{1}{2}$ . The former confirms earlier predictions based on numerical studies and heuristic methods from statistical physics.

Equations

$$\mathbb{E}D_{2}=n\cdot\frac{2m}{n}(1-p)\cdot\frac{2m}{n}p=4p(1-p)\frac{m^{2}}{n}$$

$$2D_{2}<(1+\varepsilon)\cdot 2\mathbb{E}D_{2}=(1+\varepsilon)(1-4\varepsilon)\cdot 2m<(1-2\varepsilon)\mathbb{E}D_{1}<(1-\varepsilon)D_{1}$$

$$\mathbb{E}[S_{k}(j+1)-S_{k}(j)\mid Y(j)]=-\frac{k}{n-j}S_{k}(j)$$

$$\mathbb{E}[S_{i}(j+1)-S_{i}(j)\mid Y(j)]=-\frac{i}{n-j}S_{i}(j)+2p(1-p)\frac{i+1}{n-j}S_{i+1}(j)$$

$$\textrm{w.h.p. }S_{2}(tn)<\frac{(1-\delta)(1-t)}{4p(1-p)}\cdot n,\textrm{ for all }0\leq t\leq t^{*}$$

$$\begin{aligned}\frac{dc_{k}}{dt}&=-\frac{k}{1-t}c_{k}\\&\;\;\vdots\\\frac{dc_{i}}{dt}&=-\frac{i}{1-t}c_{i}+2p(1-p)\frac{i+1}{1-t}c_{i+1}\\&\;\;\vdots\\\frac{dc_{2}}{dt}&=-\frac{2}{1-t}c_{2}+2p(1-p)\frac{3}{1-t}c_{3}\end{aligned}$$

$$\forall i\in\{2,\dots,k\},\quad c_{i}(t)=c\cdot\binom{k}{i}\big(2p(1-p)t\big)^{k-i}(1-t)^{i}$$

$$\big|S_{i}(tn)-c_{i}(t)n\big|<\delta n\textrm{, for all }t<(1-\varepsilon)$$

$$c_{2}(t)+\delta=c\binom{k}{2}\big(2p(1-p)t\big)^{k-2}(1-t)^{2}+\delta\leq\Big(ck^{2}\big(2p(1-p)\big)^{k-1}(1-t)+\frac{\delta}{\varepsilon}\Big)\cdot\frac{1-t}{4p(1-p)}$$

$$\begin{aligned}4p(1-p)\frac{m_{t}}{\varepsilon n}&<\frac{4p(1-p)}{\varepsilon}\Big(\delta+c\cdot\sum_{i=0}^{k}\binom{k}{i}\big(2p(1-p)\big)^{k-i}\varepsilon^{i}\Big)\\&=\frac{4p(1-p)}{\varepsilon}\Big(\delta+c\cdot\big(2p(1-p)+\varepsilon\big)^{k}\Big)\\&\leq\frac{4p(1-p)}{\varepsilon}\Big(\delta+2p(1-p)k^{-2}\cdot\Big(1+\frac{\varepsilon}{2p(1-p)}\Big)^{k}\Big)\\&=4p(1-p)\Big(\frac{\delta}{\varepsilon}+\frac{2p(1-p)}{k^{2}\varepsilon}\Big(1+\frac{\varepsilon}{2p(1-p)}\Big)^{k}\Big)\end{aligned}$$

$$m=-\log\binom{n}{i}\Big/\log\big(1-Q(i,n)\big)$$

$$Q(i,n)=\sum_{j=0}^{k}\frac{\binom{i}{j}\binom{n-i}{k-j}}{\binom{n}{k}}p^{j}(1-p)^{k-j}$$

$$\frac{\binom{i}{j}\binom{n-i}{k-j}}{\binom{n}{k}}=\Big(1+O\Big(\frac{k^{2}}{n}\Big)-O\Big(\frac{k^{2}}{i}\Big)\Big)\cdot\binom{k}{j}\cdot\Big(1-\frac{i}{n}\Big)^{k-j}\cdot\Big(\frac{i}{n}\Big)^{j}$$

$$Q(i,n)=\Big(1+O\Big(\frac{k^{2}}{n}\Big)-O\Big(\frac{k^{2}}{i}\Big)\Big)\cdot\big(x(1-p)+p(1-x)\big)^{k}$$

$$Q^{r}(i,n)/Q(i,n)=\frac{n\binom{n-r}{k}}{(n-r)\binom{n}{k}}=\frac{\binom{n-r-1}{k-1}}{\binom{n-1}{k-1}}$$

$$H(x_{+})\eta_{p}(x_{+})^{-k}\leq c_{p}<H(x_{+})\eta_{p}(x_{-})^{-k}$$

$$\log(2)\cdot 2^{k}\leq c_{p}<\log(2)\cdot 2^{k}e^{\frac{3}{5}\beta}$$

$$p^{1-k}\frac{\log k}{k}\cdot e^{-1}\leq c_{p}<p^{1-k}\frac{\log k}{k}\cdot e^{-\frac{1}{5}}$$

$$c_{p,x}=\frac{\log\binom{n}{xn}}{nQ(xn,n)}=(1-o(1))\cdot H(x)\eta(x)^{-k}$$

$$f^{\prime}(x)=\eta(x)^{-k-1}\cdot\underbrace{\big(\eta(x)H^{\prime}(x)-kH(x)(1-2p)\big)}_{=:\Delta(x)}$$

$$\begin{aligned}\Delta(x)&=\eta(x)\big(\log(1-x)-\log x\big)+k(1-2p)\big(x\log x+(1-x)\log(1-x)\big)\\&=\big(\log(1-x)-\log x\big)\big(\eta(x)-(1-2p)kx\big)+\log(1-x)(1-2p)k\end{aligned}$$

$$\big(\log(1-x_{-})-\log x_{-}\big)\cdot\frac{3}{5}p$$

$$\log(1-x_{-})\cdot(1-2p)k=\frac{1}{x_{-}}\log(1-x_{-})\cdot\frac{2kp}{5(k-1)}$$

Taxonomy

Topics: Constraint Satisfaction and Optimization · Auction Theory and Applications · Game Theory and Voting Systems

Full text

Biased random $k$ -SAT

Joel Larsson
Mathematics Institute, University of Warwick

Klas Markström
Department of Mathematics and Mathematical Statistics, Umeå Universitet
Research supported by a grant from the Swedish Research Council (Vetenskapsrådet)

Abstract

The basic random $k$ -SAT problem is: Given a set of $n$ Boolean variables, and $m$ clauses of size $k$ picked uniformly at random from the set of all such clauses on our variables, is the conjunction of these clauses satisfiable?

Here we consider a variation of this problem where there is a bias towards variables occurring positive – i.e. variables occur negated w.p. $0<p<\frac{1}{2}$ and positive otherwise – and study how the satisfiability threshold depends on $p$ . For $p<\frac{1}{2}$ this model breaks many of the symmetries of the original random $k$ -SAT problem, e.g. the distribution of satisfying assignments in the Boolean cube is no longer uniform.

For any fixed $k$, we find the asymptotics of the threshold as $p$ approaches $0$ or $\frac{1}{2}$. The former confirms earlier predictions based on numerical studies and heuristic methods from statistical physics.

Keywords: Random $k$-SAT, random constraint satisfaction problem, phase transition, combinatorial probability

1 Introduction

Random $k$-SAT formulas, and their sets of satisfying assignments, have become one of the most studied intersection points of combinatorics, computer science and physics. The basic problem is as follows. Let $x_{1},\ldots,x_{n}$ be a set of Boolean variables. A $k$-clause is a Boolean formula of the form $z_{1}\vee\ldots\vee z_{k}$, where each $z_{i}$ is either $x_{j}$ or $\neg x_{j}$ for some $j$. A $k$-SAT formula is a Boolean formula of the form $C_{1}\wedge\ldots\wedge C_{m}$, where each $C_{i}$ is a $k$-clause. The random $k$-SAT problem asks: if we take a $k$-SAT formula $\Phi$ with $m$ clauses on $n$ variables uniformly at random from all such formulas, for which $m,n$ is $\Phi$ satisfiable w.h.p.? For which $m,n$ is $\Phi$ unsatisfiable w.h.p.? Is there a sharp threshold in between? It turns out that the crucial parameter is the linear density $\alpha:=m/n$ of $\Phi$.

One of the earliest appearances of this problem is in [6], where it was shown that for certain large $m$ a random CNF of the type just described is not satisfiable, and that this is hard to show using the resolution proof system. Soon thereafter [7] demonstrated that for $k=2$ there is a threshold from solvable CNFs to unsolvable ones at the critical density $\alpha_{2}:=\frac{m}{n}=1$, and they asked if a similar threshold $\alpha_{k}$ could be identified for all fixed $k$. Around the same time [19, 23, 28] a series of extensive – at least for the computational resources of the time – simulation studies of this problem was begun, and evidence for a threshold constant $\alpha_{k}$ was presented for many small values of $k$. This in turn sparked the interest of the statistical physics community, which saw similarities between random $k$-SAT and problems in spin-glass theory. A number of heuristic calculations based on such methods [24, 25] provided conjectures both for the values of $\alpha_{k}$ and for the structure of the set of solutions to a satisfiable random CNF. Parts of these calculations could also be made mathematically rigorous [31], but not the most crucial ones.

The existence of a sharp threshold for random $k$-SAT was proven by Friedgut in [14]; however, neither the location of this threshold, nor the fact that $\alpha_{k}$ does not asymptotically depend on $n$, follows from his very general threshold results. On the other hand, an even more detailed description [4] of the threshold for $k=2$ has been obtained, and the asymptotic behaviour of $\alpha_{k}$ as $k\to\infty$ has been found [8]. The existence of such a constant $\alpha_{k}$ for large enough $k$ has been established [10], and for $k\approx\log n$ the exact location of the satisfiability threshold has been determined [15, 21]. It is generally believed that $\alpha_{k}$ exists for all fixed $k$, but recently the exact value given by the cavity method [24, 25] has been questioned [22] for low $k$.

The aim of this paper is to study a variation on the random $k$-SAT problem where, instead of taking each clause uniformly at random, we introduce a bias parameter $p$ which determines the probability of a variable being negated or non-negated in each clause. Each variable in a clause occurs negated with probability $p$ and non-negated with probability $1-p$, independently of the other variables. For $p=\frac{1}{2}$ we thus get the usual random $k$-SAT problem, and for smaller $p$ we get a higher proportion of non-negated variables in our clauses. For $k=2$ this $2$-SAT distribution has been studied [3] in connection with the hardness of approximating the maximum number of clauses which can be satisfied in a random CNF, and surprisingly evidence was found for the balanced case not being the hardest one. The case $k=2$ is also covered by the results in [9], where the threshold behaviour of a more general family of $2$-SAT distributions was identified. For $k=3$ the threshold has been studied numerically, albeit to quite low precision [29]. Our main focus will be on determining the satisfiability threshold as a function of both $k$ and $p$, and where possible confirming behaviour conjectured in the older literature. We will also give some results on the distribution of satisfying assignments in the hypercube. In the unbiased version this distribution is uniform, and this is in part responsible for the inefficiency of some probabilistic tools in analysing the model, as noted in e.g. [2].

We find the exact threshold for any $p$ when $k=2$, and for $p$ very close to $\frac{1}{2}$ when $k\geq\log n$, similarly to what is known for the balanced case $p=\frac{1}{2}$. For fixed $k\geq 3$, we show that the threshold is approximately quadratic in $p$ near $\frac{1}{2}$ and scales like $p^{1-k}$ as $p\to 0$, the latter confirming a prediction based on replica symmetry heuristics [26]. While the proof for the case $p\to 0$ is mostly an adaptation of known methods, the proof for the case $p\to\frac{1}{2}$ is novel, and may be of independent interest. The reader interested in the latter proof can skip directly to section 5, which is largely self-contained.

1.1 Biased k-SAT

Let $p\in(0,1)$ be a real number. The biased random $k$-SAT problem is a random SAT problem, where the clauses are picked according to the following distribution: Start with a set of $n$ Boolean variables and pick from it a $k$-set $K$ uniformly at random. Then, independently for each $x\in K$, pick the literal $\neg x$ with probability $p$ and the literal $x$ otherwise. (As the problem is symmetric under $p\mapsto 1-p$, we will assume throughout that $p\leq\frac{1}{2}$.) The $k$ literals $z_{1},\ldots,z_{k}$ thus chosen form a clause $C:=z_{1}\vee z_{2}\vee\ldots\vee z_{k}$, which we call a $p$-biased clause. Let $\Phi^{p}_{k}(m,n)$ be the conjunction of $m$ i.i.d. $p$-biased clauses.

Let $\alpha_{k}(p,n):=\inf\{\frac{m}{n}:\mathbb{P}(\Phi^{p}_{k}(m,n)\textrm{ is satisfiable})\leq\frac{1}{2}\}$, and let $\alpha_{k}(p):=\lim_{n}\alpha_{k}(p,n)$ if the limit exists. In a slight abuse of notation, we will write (for instance) ‘$\alpha_{3}(\frac{1}{2})\geq 3$’ even if we do not know whether $\alpha_{3}(\frac{1}{2})$ exists, but this should be understood as being short for ‘$\liminf_{n}\alpha_{3}(\frac{1}{2},n)\geq 3$’. We will study how the satisfiability threshold $\alpha_{k}(p)$ behaves as a function of $p$. Regions of particular interest are $p\approx 0$ and $p\approx\frac{1}{2}$. In the latter case, we will often work with the parametrization $p=\frac{1}{2}-b$ for some small positive $b$.
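The clause distribution can be sketched in a few lines of Python. This is only an illustration: the function names and the signed-integer encoding of literals (`+v` for $x_{v}$, `-v` for $\neg x_{v}$) are assumptions made here, and the sketch uses the abstract's convention that each variable occurs negated with probability $p$.

```python
import random


def biased_clause(n, k, p, rng):
    """Sample one p-biased k-clause over variables 1..n.

    Literals are signed integers: +v stands for x_v, -v for its negation.
    Assumed convention (as in the abstract): each variable is negated w.p. p.
    """
    variables = rng.sample(range(1, n + 1), k)  # uniform random k-set of variables
    return tuple(-v if rng.random() < p else v for v in variables)


def biased_formula(n, m, k, p, seed=0):
    """Conjunction of m i.i.d. p-biased k-clauses, returned as a list of clauses."""
    rng = random.Random(seed)
    return [biased_clause(n, k, p, rng) for _ in range(m)]


def negated_fraction(formula):
    """Fraction of literal occurrences in the formula that are negated."""
    literals = [lit for clause in formula for lit in clause]
    return sum(lit < 0 for lit in literals) / len(literals)
```

For large $m$ the value of `negated_fraction` should concentrate around $p$, which gives a quick sanity check on the sampler.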

1.2 Structured coupon collection

It is worth noting that random SAT problems are examples of structured coupon collector problems [13]. Each assignment of true/false to the $n$ Boolean variables is an $n$-word $(\varepsilon_{1},\varepsilon_{2},\ldots,\varepsilon_{n})$ in a two-symbol alphabet, and as such can be identified with a vertex of the hyper-cube $\{-1,1\}^{n}=:\Sigma_{n}$. (Here, $-1$ denotes false and $1$ denotes true.) Each $k$-clause $C$ forbids a subset of those vertices, for instance the clause $C:=x_{1}\vee x_{2}\vee\neg x_{3}$ forbids any vertex of the form $(-1,-1,1,\varepsilon_{4},\varepsilon_{5},\ldots,\varepsilon_{n})$. The set of vertices forbidden by $C$ forms an $(n-k)$-sub-cube of $\Sigma_{n}$.

A sub-cube $C^{\prime}\subseteq\Sigma_{n}$ can be represented as an $n$ -word $(\delta_{1},\delta_{2},\ldots\delta_{n})$ in the alphabet $\{-1,1,\star\}$ in the following way: $\delta_{i}=-1$ iff $x_{i}=-1$ for all $x\in C^{\prime}$ , $\delta_{i}=1$ iff $x_{i}=1$ for all $x\in C^{\prime}$ , $\delta_{i}=\star$ otherwise. In other words, $\star$ denotes the ‘free’ coordinates of $C^{\prime}$ (variables that do not occur in the clause), and the dimension of $C^{\prime}$ is the number of $\star$ ’s in $C^{\prime}$ . The set of solutions forbidden by the clause $C$ above is the sub-cube $C^{\prime}=(-1,-1,1,\star,\ldots\star)$ . Since there is a bijection between clauses and sub-cubes in this way, we will henceforth identify a clause with its corresponding sub-cube of forbidden vertices.
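The identification of clauses with sub-cubes can be made concrete as follows. This is a small sketch under the assumption that literals are encoded as signed integers (`+v` for $x_{v}$, `-v` for $\neg x_{v}$) and that `'*'` plays the role of $\star$; the helper names are illustrative.

```python
import itertools


def forbidden_subcube(clause, n):
    """Return the n-word over {-1, 1, '*'} describing the vertices forbidden by `clause`.

    A vertex is forbidden iff it makes every literal false, so a positive
    literal locks its coordinate to -1 and a negated literal locks it to 1.
    """
    word = ['*'] * n
    for lit in clause:
        word[abs(lit) - 1] = -1 if lit > 0 else 1
    return tuple(word)


def forbidden_vertices(clause, n):
    """Brute-force list of all vertices of {-1, 1}^n that violate `clause`."""
    def violates(vertex):
        return all(vertex[abs(lit) - 1] == (-1 if lit > 0 else 1) for lit in clause)
    return [v for v in itertools.product((-1, 1), repeat=n) if violates(v)]
```

For the clause $x_{1}\vee x_{2}\vee\neg x_{3}$ on $n=5$ variables, `forbidden_subcube((1, 2, -3), 5)` returns `(-1, -1, 1, '*', '*')`, and the brute-force enumeration finds the $2^{n-k}=4$ vertices of that sub-cube.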

We will usually think of random $k$-SAT as a process where we add clauses at integer times until the formula is no longer satisfiable. A variation on this is to add clauses at times given by a Poisson process of intensity $1$. At time $m$, the discrete-time model will have $m$ clauses, while the continuous-time model will have $\operatorname{Poi}(m)$ clauses. These two models are closely related (cf. the two random graph models $\mathcal{G}_{n,p}$ and $\mathcal{G}_{n,m}$), and the satisfiability threshold of either of them is within a multiplicative factor $1\pm o(1)$ of the other. We will therefore work with whichever model best suits our needs at any given time, but take care to specify when we switch between them.

1.3 Structure of paper

Each section of the paper is largely self-contained. We begin by adapting known techniques to the biased version of the random $k$ -SAT problem in sections 2, 3 and 4.

In section 2, we give the exact threshold for the case $k=2$. The remainder of the paper assumes $k\geq 3$. In section 3, we find a lower bound on $\alpha_{k}(p)$ for fixed $k$ by analyzing the unit clause propagation algorithm. In section 4 we use the method of moments to bound $\alpha_{k}(p)$. In section 4.1, we estimate the first two moments of the number of solutions to a biased $k$-SAT formula. This leads to sharp bounds on $\alpha_{k}(p)$ for $p$ close to $\frac{1}{2}$ and $k\geq K\log n$ for $K$ sufficiently large. We study a variation on the first moment method in section 4.2, and find a slightly sharper upper bound on $\alpha_{k}(p)$, which for fixed $k$ is within a constant factor of the lower bound from section 3. These results together establish the asymptotic behavior $\alpha_{k}(p)\sim p^{1-k}$ as $p\to 0$.

Finally, in section 5, we investigate the asymptotics of $\alpha_{k}(p)$ as $p\to\frac{1}{2}$ , by studying how the satisfiability of a formula is affected by changing the occurrences of a single variable. We use a novel combination of tools, including Russo’s formula and the Kruskal-Katona theorem, to show that the satisfiability threshold is approximately a parabola near $p=\frac{1}{2}$ , i.e. $\alpha_{k}(p)=\alpha_{k}+\Theta((p-\frac{1}{2})^{2})$ .

2 Satisfiability threshold for biased random 2-SAT

In the special case $k=2$ , we find the exact value of the threshold. For the classical (unbiased) $2$ -SAT problem, the threshold value of $\alpha_{2}=1$ was established by Chvátal & Reed [7] by exploiting some of the structure specific to $2$ -SAT.

Later Cooper, Frieze & Sorkin [9] worked with $2$-SAT formulas of prescribed literal degrees and gave a criterion for satisfiability. Before we state their theorem, we need the following notation: For a $2$-SAT formula $F$, let $d_{i}^{+}$ be the number of occurrences of $x_{i}$ in $F$, and similarly let $d_{i}^{-}$ be the number of occurrences of $\neg x_{i}$. For any sequence $\mathbf{d}=(d_{1}^{+},d_{1}^{-},\ldots,d_{n}^{+},d_{n}^{-})$, let $D_{1}:=\sum_{i}(d_{i}^{+}+d_{i}^{-})$ and $D_{2}:=\sum_{i}d_{i}^{+}d_{i}^{-}$.

Theorem 1 (CFS):

Let $0<\varepsilon<1$ be constant and $n\to\infty$ . Let $\mathbf{d}$ be any literal-degree sequence over $n$ variables with $\max_{i}d_{i}^{\pm}\leq n^{1/11}$ and $D_{1}$ even, and let $F$ be chosen uniformly at random from all $2$ -SAT formulas with degree sequence $\mathbf{d}$ .

(i) If $2D_{2}<(1-\varepsilon)D_{1}$, then $\mathbb{P}(F\textrm{ is satisfiable})\to 1$.

(ii) If $2D_{2}>(1+\varepsilon)D_{1}$, then $\mathbb{P}(F\textrm{ is satisfiable})\to 0$.

Both limits are uniform in $n$ (independent of $d$ ).

Theorem 2:

The biased $2$ -SAT problem with bias $p$ has a sharp satisfiability threshold at $\alpha_{2}(p)=\frac{1}{4p(1-p)}=\frac{1}{1-4b^{2}}$ .

Sumedha, Krishnamurthy & Sahoo [30] first sketched this, and we will give a full proof.

Proof. We prove this for the continuous-time case; the discrete-time case follows. Pick an $\varepsilon>0$. For a $p$-biased $2$-SAT formula with $\operatorname{Poi}(m)$ clauses, the quantities $D_{1}$ and $D_{2}$ are random variables. $\mathbb{E}D_{1}=2m$, while

$$\mathbb{E}D_{2}=n\cdot\frac{2m}{n}(1-p)\cdot\frac{2m}{n}p=4p(1-p)\frac{m^{2}}{n}.$$

Both $D_{1}$ and $D_{2}$ are sharply concentrated around their means. Letting $m={(1-4\varepsilon)\frac{n}{4p(1-p)}}$, we find that $\|\mathbf{d}\|_{\infty}=O(\log n)$ with high probability, and

$$2D_{2}<(1+\varepsilon)\cdot 2\mathbb{E}D_{2}=(1+\varepsilon)(1-4\varepsilon)\cdot 2m<(1-2\varepsilon)\mathbb{E}D_{1}<(1-\varepsilon)D_{1},$$

with high probability, so that (i) from Theorem 1 is satisfied w.h.p. Similarly, letting $m=(1+4\varepsilon)\frac{n}{4p(1-p)}$ gives that (ii) is satisfied w.h.p. The theorem follows. ∎
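The quantities in this proof are easy to examine numerically. The sketch below samples a biased $2$-SAT formula with a fixed number $m$ of clauses (the discrete-time model, used here for simplicity), computes $D_{1}$ and $D_{2}$, and compares $D_{2}$ with $4p(1-p)m^{2}/n$. The function names and the signed-integer literal encoding are assumptions of this illustration.

```python
import random


def alpha2(p):
    """Threshold density 1 / (4 p (1 - p)) from Theorem 2."""
    return 1.0 / (4.0 * p * (1.0 - p))


def random_biased_2sat(n, m, p, seed=0):
    """m i.i.d. p-biased 2-clauses; +v is x_v, -v is its negation (negated w.p. p)."""
    rng = random.Random(seed)
    formula = []
    for _ in range(m):
        u, v = rng.sample(range(1, n + 1), 2)
        formula.append(tuple(-w if rng.random() < p else w for w in (u, v)))
    return formula


def degree_sums(formula, n):
    """Return (D1, D2): D1 = sum_i (d_i^+ + d_i^-), D2 = sum_i d_i^+ * d_i^-."""
    pos = [0] * (n + 1)  # pos[i] = d_i^+
    neg = [0] * (n + 1)  # neg[i] = d_i^-
    for clause in formula:
        for lit in clause:
            if lit > 0:
                pos[lit] += 1
            else:
                neg[-lit] += 1
    d1 = sum(pos) + sum(neg)
    d2 = sum(a * b for a, b in zip(pos, neg))
    return d1, d2
```

Below the threshold, for example at $m=0.6\,\alpha_{2}(p)\,n$, one expects $2D_{2}<D_{1}$ for large $n$, in line with condition (i) of Theorem 1.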

3 Algorithmic lower bound on satisfiability threshold

In this section we will show how to adapt the work of Chao and Franco [5] and Achlioptas [1] to biased $k$ -SAT. We will work in discrete time.

We will show that the algorithm ‘unit clause propagation’ (Algorithm 1) succeeds in finding a satisfying truth assignment with positive probability when $m$ is not too large. This algorithm is non-backtracking, and straightforward to analyze. While better lower bounds are known for the non-biased case, this gives a lower bound that (for any fixed $k$) scales correctly with $p$.

Algorithm 1 is given a SAT formula $\Phi$ as input, in the form of a set of subsets of $\{x_{1},\neg x_{1},x_{2},\neg x_{2},\ldots,\neg x_{n}\}$. It repeatedly tries to satisfy a unit clause (i.e. a clause on a single variable), and if no such clause exists it sets a random variable to a random value. Satisfied clauses and unsatisfied literals are then removed. The algorithm succeeds iff no empty clauses are ever generated.

It might seem strange to let $\ell=v$ with probability $1-p$ in the first ‘else’ of the algorithm, rather than with probability $1$ (since this would maximize the expected number of clauses being satisfied). The reason for this choice is to simplify the analysis of the algorithm: it ensures that the dynamics of the number of $i$ -clauses, $i>1$ , is independent from the number of $1$ -clauses.
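The description above can be turned into a short Python sketch. It is a reconstruction from the prose only, not the paper's own listing: the function names, the signed-integer literal encoding (`+v` for $x_{v}$, `-v` for $\neg x_{v}$), and the way a free variable is chosen (first unassigned entry of a random permutation) are assumptions made here.

```python
import random


def unit_clause_propagation(formula, n, p, seed=0):
    """Non-backtracking unit clause propagation on variables 1..n.

    Returns a dict {variable: bool} on success, or None if an empty clause
    is generated. A free variable v is set to TRUE (literal v satisfied)
    with probability 1 - p, as described in the text.
    """
    rng = random.Random(seed)
    clauses = [frozenset(c) for c in formula]
    order = list(range(1, n + 1))
    rng.shuffle(order)  # random order in which free variables are tried
    assignment = {}
    for _ in range(n):
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is not None:
            lit = next(iter(unit))  # forced step: satisfy a unit clause
        else:
            v = next(u for u in order if u not in assignment)
            lit = v if rng.random() < 1 - p else -v  # free step
        assignment[abs(lit)] = lit > 0
        remaining = []
        for c in clauses:
            if lit in c:
                continue  # clause is satisfied, remove it
            c = c - {-lit}  # drop the falsified literal, if present
            if not c:
                return None  # empty clause: the algorithm fails
            remaining.append(c)
        clauses = remaining
    return assignment


def satisfies(assignment, formula):
    """Check that every clause contains at least one true literal."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in formula)
```

The scan for a unit clause is linear in the number of clauses, which keeps the sketch short; an efficient implementation would maintain a queue of unit clauses instead.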

Theorem 3:

There exists a $\delta=\delta(k,p)$ such that if $m$ is at most ${\frac{n}{k^{2}}(2p(1-p))^{1-k}}$, then Algorithm 1 finds a satisfying assignment to $\Phi_{k}^{p}(m,n)$ with probability at least $\delta$.

Furthermore, if a satisfying assignment is found, the number of variables set to ‘FALSE’ follows a binomial distribution with parameters $n$ and $p$ .

The intuition behind this theorem is as follows: An $i$ -clause is turned into a $(i\!-\!1)$ -clause with probability $2p(1-p)$ , and else removed. So of the $k$ -clauses, only a fraction of approximately $(2p(1-p))^{k-1}$ survives long enough to be reduced to $1$ -clauses. If the rate at which $1$ -clauses are being created is strictly less than the rate at which they are dealt with (which is $1$ ), queuing theory suggests that the queue of $1$ -clauses will remain of bounded size. If the queue is of bounded size, the probability of there existing contradicting $1$ -clauses at any given time is of order $n^{-1}$ , which suggests that the probability of there ever existing such clauses should be of constant order.

Proof. To a large extent the proof follows [1] mutatis mutandis, so we will focus on the necessary modifications and on how the result follows from a few key lemmas.

For every $i\in\{0,1,\ldots k\}$ and $j\in\{0,1\ldots,n\}$ , let $S_{i}(j)$ be the number of $i$ -clauses at time $j$ , and $Y(j):=(S_{0}(j),S_{1}(j),\ldots S_{k}(j))$ . Our aim is to understand the trajectory of the (time-inhomogeneous) Markov chain $Y$ .

Looking at the expected difference $S_{k}(j+1)-S_{k}(j)$ , conditioned on $Y(j)$ , we see that

$$\mathbb{E}[S_{k}(j+1)-S_{k}(j)\mid Y(j)]=-\frac{k}{n-j}S_{k}(j)$$

because any $k$ -clause contains $k$ of the $n-j$ free variables, and each of them is equally likely to be locked at time $j+1$ .

Similarly, for any $0\leq i<k$ ,

$$\mathbb{E}[S_{i}(j+1)-S_{i}(j)\mid Y(j)]=-\frac{i}{n-j}S_{i}(j)+2p(1-p)\frac{i+1}{n-j}S_{i+1}(j),$$

where the first term is as before, while the second term counts the expected number of $(i\!+\!1)$ -clauses being shrunk to $i$ -clauses. (When a variable occurring in an $(i\!+\!1)$ -clause is locked, the clause is shrunk to an $i$ -clause with probability $2p(1-p)$ , and else removed.) Together these difference equations describe the Markov chain $Y$ .

The main proof idea is to look at the scaling limit of the expected trajectory of this Markov chain (the so-called liquid model). However, there are two problems that arise.

First, for $i\geq 2$ , $\mathbb{E}S_{i}$ is of order $n$ and the dynamics of $Y$ is not too sensitive to deviations in $S_{i}$ of order $o(n)$ . But for $i=0$ or $1$ , $\mathbb{E}S_{i}=O(1)$ and the dynamics is sensitive to small deviations. We deal with this problem by looking at the scaling limit of $S_{i}$ only for $i\geq 2$ , and then making sure that $S_{2}$ is never so large that the influx of $1$ -clauses exceeds the rate at which they can be removed. The following lemma (which we state without proof) is a slight modification of lemma 4 in [1].

Lemma 4:

For any $\delta,\varepsilon>0$ , if $t^{*}\in(0,1)$ is such that $t^{*}\leq(1-\varepsilon)$ and

$$\textrm{w.h.p. }S_{2}(tn)<\frac{(1-\delta)(1-t)}{4p(1-p)}\cdot n,\textrm{ for all }0\leq t\leq t^{*},$$

then $S_{0}(t^{*})=S_{1}(t^{*})=0$ with probability at least $\rho=\rho(\varepsilon,\delta)$ .

This lemma says that as long as the density of the $2$-clauses stays below $1-\delta$ times the satisfiability threshold for $2$-SAT at every time $t\leq t^{*}$, there is a positive probability that there are no $0$- or $1$-clauses at $t=t^{*}$. So, next we need to bound the expected value of $S_{2}$, and show that $S_{2}$ stays close to it.

To see the second problem, let’s look at the system of differential equations describing the scaling limit. If we let the functions $c_{i}$ be defined by $c_{i}(t):=\lim_{n\to\infty}\frac{1}{n}\mathbb{E}S_{i}(tn)$ for $i\geq 2$ , they will satisfy the following system (which is the scaling limit of the system of difference equations describing $S_{i}$ ):

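$$\begin{aligned}\frac{dc_{k}}{dt}&=-\frac{k}{1-t}c_{k}\\&\;\;\vdots\\\frac{dc_{i}}{dt}&=-\frac{i}{1-t}c_{i}+2p(1-p)\frac{i+1}{1-t}c_{i+1}\\&\;\;\vdots\\\frac{dc_{2}}{dt}&=-\frac{2}{1-t}c_{2}+2p(1-p)\frac{3}{1-t}c_{3}.\end{aligned}$$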

The system has the following unique solution:

$$\forall i\in\{2,\dots,k\},\quad c_{i}(t)=c\cdot\binom{k}{i}\big(2p(1-p)t\big)^{k-i}(1-t)^{i}.$$
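As a sanity check, one can verify numerically that the closed form $c_{i}(t)=c\binom{k}{i}(2p(1-p)t)^{k-i}(1-t)^{i}$ satisfies the scaling limit of the difference equations above, namely $c_{i}^{\prime}=-\frac{i}{1-t}c_{i}+2p(1-p)\frac{i+1}{1-t}c_{i+1}$ (with the second term absent for $i=k$). The sketch below compares a central finite difference with this right-hand side; the function names and the sample points are choices made for this illustration.

```python
from math import comb


def c_i(i, t, k, p, c):
    """Closed-form candidate c_i(t) = c * C(k,i) * (2p(1-p)t)^(k-i) * (1-t)^i."""
    return c * comb(k, i) * (2 * p * (1 - p) * t) ** (k - i) * (1 - t) ** i


def rhs(i, t, k, p, c):
    """Right-hand side of the limiting ODE for c_i, evaluated on the closed form."""
    out = -i / (1 - t) * c_i(i, t, k, p, c)
    if i < k:  # inflow from (i+1)-clauses that shrink to i-clauses
        out += 2 * p * (1 - p) * (i + 1) / (1 - t) * c_i(i + 1, t, k, p, c)
    return out


def max_residual(k, p, c, h=1e-6):
    """Largest |finite-difference derivative - rhs| over i = 2..k and a few times t."""
    worst = 0.0
    for i in range(2, k + 1):
        for t in (0.1, 0.3, 0.5, 0.7, 0.9):
            deriv = (c_i(i, t + h, k, p, c) - c_i(i, t - h, k, p, c)) / (2 * h)
            worst = max(worst, abs(deriv - rhs(i, t, k, p, c)))
    return worst
```

The initial conditions $c_{k}(0)=c$ and $c_{i}(0)=0$ for $i<k$ can be read off directly from `c_i`.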

We want to show that the trajectory of $S_{i}$ is unlikely to deviate much from the trajectory of $c_{i}n$. There is a theorem by Wormald [32] that lets us do precisely that, and in order for it to apply we need the Markov chain to satisfy these two properties:

(i) The system of differential equations describing $c=(c_{2},\ldots,c_{k})$ can be written in the form $c^{\prime}(t)=f(c(t),t)$ for some Lipschitz continuous function ${f:D\times I\to\mathbb{R}^{k-1}}$ (for some appropriate domain $D\subseteq\mathbb{R}^{k-1}$ and time interval $I$).

(ii) Conditioned on the history of $S_{2},\ldots,S_{k}$ up to time $j$, the probability that ${|S_{i}(j+1)-S_{i}(j)|>n^{1/5}}$ is at most $o(n^{-3})$.

These properties are largely unaffected by the value of $p$. The first holds for any time interval $I=[0,1-\varepsilon]$ regardless of $p$. For the second one, the increments follow an approximate Skellam distribution, which has exponential tails.

Since the first condition doesn’t hold all the way until time $1$ , we will only analyze the algorithm on the time interval $[0,1-\varepsilon]$ , and then show that the formula remaining at time $1-\varepsilon$ is sparse enough to be satisfied easily. Applying Wormald’s theorem, we get the following lemma.

Lemma 5:

For any $\delta,\varepsilon>0$ , there exists $\eta=o(1)$ such that with probability at least $1-\eta$ ,

$$\big|S_{i}(tn)-c_{i}(t)n\big|<\delta n\textrm{, for all }t<(1-\varepsilon).$$

In order for Lemma 4 to apply, we need that $S_{2}(tn)$ is sufficiently small for all $t\leq(1-\varepsilon)$ , and now Lemma 5 tells us that $S_{2}(tn)$ will stay close to $c_{2}(t)n$ for all such $t$ . That is, for any $\delta>0$ , $S_{2}(tn)/n\leq c_{2}(t)+\delta$ holds w.h.p. So,

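$$c_{2}(t)+\delta=c\binom{k}{2}\big(2p(1-p)t\big)^{k-2}(1-t)^{2}+\delta\leq\Big(ck^{2}\big(2p(1-p)\big)^{k-1}(1-t)+\frac{\delta}{\varepsilon}\Big)\cdot\frac{1-t}{4p(1-p)}.$$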

For this to be at most the upper bound for $S_{2}$ in Lemma 4, we need the expression within brackets to be bounded away from $1$ . We accomplish that by choosing $c\leq(1-\varepsilon^{\prime})k^{-2}(2p(1-p))^{1-k}$ and $\delta\leq\varepsilon\varepsilon^{\prime}/2$ for some $\varepsilon^{\prime}>0$ . Thus, for $c$ as above and $t=1-\varepsilon$ , with probability at least $\rho(\varepsilon,\varepsilon^{\prime}/2)$ we have that there are no clauses smaller than $2$ at time $t$ .

Let $F$ be the formula remaining at time $t=1-\varepsilon$, conditional on there being no clause of size less than $2$, and let $m_{t}$ be the number of clauses it consists of. To show that $F$ is satisfiable w.h.p., we construct a new formula $\tilde{F}$ from $F$ by uniformly at random throwing away literals from each clause with more than $2$ literals. Any assignment satisfying $\tilde{F}$ also satisfies $F$. The formula $\tilde{F}$ is a $2$-SAT formula with $m_{t}$ clauses on $(1-t)n$ variables, and it follows the distribution $\Phi_{2}^{p}(m_{t},(1-t)n)$. By Theorem 2, $\tilde{F}$ is satisfiable w.h.p. if $\frac{4p(1-p)m_{t}}{(1-t)n}<1$.

$S_{i}$ is close to $c_{i}n$ , so it follows that $m_{t}<n(\delta+\sum_{i=2}^{k}c_{i}(t))$ (with high probability). Thus

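$$\begin{aligned}4p(1-p)\frac{m_{t}}{\varepsilon n}&<\frac{4p(1-p)}{\varepsilon}\Big(\delta+c\cdot\sum_{i=0}^{k}\binom{k}{i}\big(2p(1-p)\big)^{k-i}\varepsilon^{i}\Big)\\&=\frac{4p(1-p)}{\varepsilon}\Big(\delta+c\cdot\big(2p(1-p)+\varepsilon\big)^{k}\Big)\\&\leq\frac{4p(1-p)}{\varepsilon}\Big(\delta+2p(1-p)k^{-2}\cdot\Big(1+\frac{\varepsilon}{2p(1-p)}\Big)^{k}\Big)\\&=4p(1-p)\Big(\frac{\delta}{\varepsilon}+\frac{2p(1-p)}{k^{2}\varepsilon}\Big(1+\frac{\varepsilon}{2p(1-p)}\Big)^{k}\Big).\end{aligned}$$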

If we let $\varepsilon=2p(1-p)/k$ and $\delta=\varepsilon/10$ , the expression within brackets is at most $\frac{1}{10}+\frac{1}{k}(1+\frac{1}{k})^{k}<\frac{9}{10}$ . The pre-factor is at most $1$ , so the entire expression is bounded away from $1$ and thus the condition of Theorem 2 is satisfied. So $\tilde{F}$ is satisfiable w.h.p., and any assignment that satisfies $\tilde{F}$ also satisfies $F$ .
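The arithmetic in this last step can be checked directly. With $\varepsilon=2p(1-p)/k$ and $\delta=\varepsilon/10$, the bracketed quantity $\frac{\delta}{\varepsilon}+\frac{2p(1-p)}{k^{2}\varepsilon}\big(1+\frac{\varepsilon}{2p(1-p)}\big)^{k}$ reduces to $\frac{1}{10}+\frac{1}{k}(1+\frac{1}{k})^{k}$, independently of $p$. The short sketch below (function name chosen here for illustration) evaluates it in floating point.

```python
def bracket(k, p):
    """Evaluate delta/eps + q/(k^2 eps) * (1 + eps/q)^k with q = 2p(1-p),
    eps = q/k and delta = eps/10, as chosen in the text."""
    q = 2 * p * (1 - p)
    eps = q / k
    delta = eps / 10
    return delta / eps + q / (k ** 2 * eps) * (1 + eps / q) ** k
```

For $k=3$ the value is $\frac{1}{10}+\frac{64}{81}\approx 0.890$, the largest over all $k\geq 3$, so the bound $<\frac{9}{10}$ is tight only for the smallest admissible $k$.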

For the ‘furthermore’ part of the theorem, note that the signs assigned to the variables are i.i.d. Bernoulli r.v.’s, so that the number of variables set to ‘FALSE’ is a binomial random variable with $n$ trials and success probability $p$. ∎

4 Method of moments bounds on satisfiability threshold

The earliest proven upper bound ($\alpha_{k}\leq 2^{k}\log 2$) was found by applying the first moment method to the number of satisfying assignments. (In this section, we work in discrete time.) This upper bound has been improved many times, often by using variations on the same method.

4.1 Vanilla first and second moment methods

In this section we will estimate the expected number of satisfying assignments in the biased $k$-SAT model, which leads to an upper bound on $\alpha_{k}(p)$. We will also employ the second moment method, but this only gives a non-trivial lower bound when $k$ is logarithmic in $n$.

While the classical random $k$ -SAT problem is vertex transitive on the set $\Sigma_{n}:=\{-1,1\}^{n}$ of solutions, introducing a bias breaks that symmetry. However, the biased version is vertex transitive on any fixed weight ‘layer’ of $\Sigma_{n}$ , i.e. subset where the number of coordinates equal to $1$ is equal to some constant $i$ . The number of such layers is relatively small ( $n+1$ , compared to $2^{n}$ vertices in total), so dealing with each layer separately and then applying a union bound only generates small error terms. First, we will need some notation.

(i.) For any integers $i,r\in[n]$, define $L_{i}:=\{x\in\Sigma_{n}:|\{j:x_{j}=1\}|=i\}$ to be the $i$:th layer of $\Sigma_{n}$, and take $x,y\in L_{i}$ such that $\|x-y\|_{1}=2r$ (i.e. the Hamming distance between $x$ and $y$ is $r$).

(ii.) Let $C\in\mathcal{S}_{k}$ be a $p$-biased random sub-cube, and let $Q(i,n):=\mathbb{P}(x\in C)$ and $Q^{r}(i,n):=\mathbb{P}({x\in C}\textrm{ and }{y\in C})$. Note that $Q^{0}=Q$.

(iii.) Let $Z_{m,i}$ be the (random) number of non-covered vertices in $L_{i}$ after $m$ clauses (or cubes) have been drawn, and let $Z_{m}:=\sum_{i}Z_{m,i}$.

(iv.) Finally, let $c_{p,x}:=\inf\{c:\mathbb{E}Z_{cn,xn}\leq 1\}$ and $c_{p}:=\sup_{x}c_{p,x}$.

Applying the first moment method to the random variable $Z_{m}$ , we see that $\alpha_{k}(p)\leq(1+o(1))c_{p}$ . We will begin by estimating $c_{p}$ .

Claim 1:

For $i\leq k$ , $\big{(}\frac{i-j+1}{k-j+1}\big{)}^{j}\leq{\binom{i}{j}}\big{/}{\binom{k}{j}}\leq\big{(}\frac{i}{k}\big{)}^{j}$ .

Proof. Expand the binomial coefficients and repeatedly apply the inequality $\frac{a}{b}<\frac{a+c}{b+c}$ (valid for $0<a<b$ and $c>0$ ). ∎
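Claim 1 can be checked exhaustively for small parameters using exact rational arithmetic. The sketch below (function name chosen for this illustration) tests the two-sided bound for all $0\leq j\leq i\leq k$ with $i\geq 1$.

```python
from fractions import Fraction
from math import comb


def claim1_holds(i, j, k):
    """Check ((i-j+1)/(k-j+1))^j <= C(i,j)/C(k,j) <= (i/k)^j exactly."""
    ratio = Fraction(comb(i, j), comb(k, j))
    lower = Fraction(i - j + 1, k - j + 1) ** j
    upper = Fraction(i, k) ** j
    return lower <= ratio <= upper
```

The ratio $\binom{i}{j}/\binom{k}{j}$ equals $\prod_{l=0}^{j-1}\frac{i-l}{k-l}$, and each factor lies between $\frac{i-j+1}{k-j+1}$ and $\frac{i}{k}$, which is what the exhaustive check reflects.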

Claim 2:

Define the binary entropy $H(x):=-x\log(x)-(1-x)\log(1-x)$. Then, for $x,y\in(0,1)$, $H(xy)>xH(y)$ and $x\log(1-y)<\log(1-xy)<-xy$.

Proof. The second part of the claim follows from the function $z\mapsto\log(1-z)$ being concave and vanishing at $z=0$, which gives $\log(1-xy)>x\log(1-y)$, together with the inequality $\log(1-z)<-z$ for $z\in(0,1)$. For the first part, note that $H$ is strictly concave and $H(0)=0$, so $H(xy)>xH(y)+(1-x)H(0)=xH(y)$. ∎
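The inequalities behind Claim 2 are easy to probe on a grid. The sketch below checks $H(xy)>xH(y)$ and the chain $x\log(1-y)<\log(1-xy)<-xy$, which is the ordering that concavity of $z\mapsto\log(1-z)$ yields; the function names are chosen for this illustration.

```python
from math import log


def H(x):
    """Binary entropy with natural logarithms, for 0 < x < 1."""
    return -x * log(x) - (1 - x) * log(1 - x)


def claim2_holds(x, y):
    """Check H(xy) > x H(y) and x log(1-y) < log(1-xy) < -xy for x, y in (0,1)."""
    entropy_ok = H(x * y) > x * H(y)
    log_ok = x * log(1 - y) < log(1 - x * y) < -x * y
    return entropy_ok and log_ok
```

For instance, at $x=y=\frac{1}{2}$ the chain reads $-0.347<-0.288<-0.25$.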

Claim 3:

For any $p,x\in(0,1)$ , $c_{p,x}=-{\frac{1}{n}\log\binom{n}{xn}}/{\log\big{(}1-Q(xn,n)\big{)}}$

Proof. Let $i:=xn$ . There are $\binom{n}{i}$ vertices in $L_{i}$ , each of which fails to be covered by $m$ $p$ -biased random sub-cubes with probability $(1-Q(i,n))^{m}$ , so $\mathbb{E}Z_{m,i}=\binom{n}{i}(1-Q(i,n))^{m}$ . This equals $1$ precisely when

$$m=-\frac{\log\binom{n}{i}}{\log\big(1-Q(i,n)\big)}.$$

∎

Claim 4:

For any $p,x\in(0,1)$ , let $\eta_{p}(x):=x(1-p)+(1-x)p$ . Then

$Q(xn,n)=(1-o(1))\cdot\eta_{p}(x)^{k}$ .

Proof. Let $i:=xn$ . For a fixed $v\in L_{i}$ , and a $k$ -set $I\subset[n]$ , there is a unique cube $C$ containing $v$ and whose set of locked variables is precisely $I$ . The number of cubes containing $v$ and with precisely $j$ variables locked to $1$ is then $\binom{i}{j}\binom{n-i}{k-j}$ . Each such cube appears with probability $p^{j}(1-p)^{k-j}\binom{n}{k}^{-1}$ , so that

[TABLE]

We approximate the binomials using claim 1:

[TABLE]

where the constants implicit in the big- $O$ notation do not depend on $j$ . Summing over all $j\leq k$ , we get that

[TABLE]

The claim follows. ∎
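Claims 3 and 4 together suggest a simple numerical approximation of the first moment constant. The sketch below is an illustration under stated assumptions, not the exact finite- $n$ quantity: it replaces $\frac{1}{n}\log\binom{n}{xn}$ by its limit $H(x)$ and $Q(xn,n)$ by $\eta_{p}(x)^{k}$ , and it approximates the supremum over $x$ by a maximum over a grid. The function names and the grid size are hypothetical choices.

```python
import math


def entropy(x: float) -> float:
    """Binary entropy H(x) with natural logarithms, for 0 < x < 1."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)


def eta(p: float, x: float) -> float:
    """eta_p(x) = x(1-p) + (1-x)p, as in Claim 4."""
    return x * (1 - p) + (1 - x) * p


def c_layer(p: float, x: float, k: int) -> float:
    """Large-n approximation of c_{p,x}: H(x) / (-log(1 - eta_p(x)^k))."""
    return entropy(x) / -math.log(1 - eta(p, x) ** k)


def c_first_moment(p: float, k: int, grid: int = 2000) -> float:
    """Grid approximation of c_p = sup_x c_{p,x} (assumed resolution 1/grid)."""
    return max(c_layer(p, j / grid, k) for j in range(1, grid))


print(c_first_moment(0.5, 3), c_first_moment(0.3, 3))
```

For $p=\frac{1}{2}$ the maximum sits at $x=\frac{1}{2}$ and the expression reduces to $\log 2/(-\log(1-2^{-k}))$ , the familiar first moment bound for unbiased random $k$ -SAT.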

Claim 5:

$Q^{r}(i,n)\leq\Big{(}1-\frac{r}{n-k+1}\Big{)}^{k-1}Q(i,n)$ .

Proof. First, note that $\binom{n}{k}Q^{r}(i,n)=\binom{n-r}{k}Q(i-\frac{r}{2},n-r)$ . But ${Q(i-\frac{r}{2},n-r)}={\frac{n}{n-r}Q(i,n)}$ , so

[TABLE]

Together with claim 1 this proves the claim. ∎

Now that we have some good estimates for the probabilities $Q$ and $Q^{r}$ , we can proceed to use the first moment method to get an upper bound on the satisfiability threshold, and the second moment method to get a lower bound.

Proposition 6 (Bounds on first moment threshold):

For any integer $k\geq 3$ and $p\in(0,\frac{1}{2}]$ (with $k$ and/or $p$ possibly depending on $n$ ), let $x_{+}:=\min(\frac{1}{2},\frac{p}{(k-1)(1-2p)})$ and $x_{-}:=\frac{2}{5}x_{+}$ . Then

[TABLE]

In particular,

(1.) If $p=\frac{1}{2}-\frac{\beta}{2k}$ for some $\beta\leq 1$ , then

[TABLE]

(2.) If $k=\omega(1)$ and $p\in(0,1)$ is fixed, then for all sufficiently large $n$

[TABLE]

Proof. For the lower bound, note that $c_{p}\geq c_{p,x}$ by definition. So, in particular, $c_{p}\geq c_{p,x_{+}}$ , and it suffices to estimate $c_{p,x_{+}}$ . For the upper bound, we shall find a small interval on which the supremum is attained. Recall that

[TABLE]

where $\eta(x):=p(1-x)+x(1-p)$ . For the sake of simplicity we will work with $f(x):=H(x)\eta(x)^{-k}$ rather than directly with $c_{p,x}$ .

First, note that $f$ is a strictly concave continuous function on $[0,1]$ , whence there exists a unique $x_{0}$ which maximizes $f$ . Since $f(0)=f(1)=0$ whereas $f(x)>0$ for any $x\neq 0,1$ , we must have $x_{0}\in(0,1)$ . Second, note that $f$ has a continuous derivative on $(0,1)$ , so if $x_{-}<x_{+}$ are such that $f^{\prime}(x_{-})>0>f^{\prime}(x_{+})$ , then $x_{-}<x_{0}<x_{+}$ . Now, for any $x\in(0,1)$ ,

[TABLE]

The pre-factor $\eta^{-k-1}$ is always positive, so the sign of $f^{\prime}$ will be the same as the sign of $\Delta$ . Expanding out the definition of $H$ , we can rewrite $\Delta$ as

[TABLE]

Let $x_{+}:=\min(\frac{1}{2},\frac{p}{(k-1)(1-2p)})$ and consider $\Delta(x_{+})$ . Then the term (1) equals $0$ (since at least one of the factors in that term equals $0$ ), while the term (2) is negative for any $x\in(0,1)$ . So $\Delta(x_{+})<0$ , and thus $f^{\prime}(x_{+})<0$ .

Next, we let $x_{-}:=\frac{2}{5}x_{+}$ (which is at most $\frac{1}{5}$ ) and consider $\Delta(x_{-})$ . Then the term 1 equals

[TABLE]

which is decreasing in $x_{-}$ and hence at least $\log\big{(}\frac{1-\frac{1}{5}}{\frac{1}{5}}\big{)}\cdot\frac{3}{5}p>0.8p$ . The term 2, on the other hand, equals

[TABLE]

which is also decreasing in $x_{-}$ and hence at least ${5\log\left(1-\frac{1}{5}\right)\frac{2k}{5(k-1)}p}$ , which for $k\geq 3$ is at least $-0.7p$ . Together this gives us that $\Delta(x_{-})>0.8p-0.7p>0$ . It follows that $f^{\prime}(x_{-})>0$ , and together with $f^{\prime}(x_{+})<0$ we have that $x_{-}<x_{0}<x_{+}$ .

Now that we have an interval $(x_{-},x_{+})$ which we know contains $x_{0}$ , we can estimate $f(x_{0})=H(x_{0})\eta(x_{0})^{-k}$ . To bound the first factor from above, note that $H$ is a strictly increasing function on $[0,\frac{1}{2}]$ , and $x_{0}<x_{+}\leq\frac{1}{2}$ . Thus $H(x_{0})<H(x_{+})$ . Similarly, to bound the second factor from above, note that $\eta^{-k}$ is decreasing on $[0,1]$ , and $x_{-}<x_{0}$ , whence $\eta(x_{0})^{-k}\leq\eta(x_{-})^{-k}$ . Together these two inequalities give us that

[TABLE]

This proves the upper bound part of the proposition. For the ‘in particular’-statements, apply the definition of $H$ and the inequality $H(x)\leq x(1-\log x)$ . ∎
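The bracketing $x_{-}<x_{0}<x_{+}$ used in the proof can be checked numerically for concrete parameters. The sketch below locates the maximizer of $f(x)=H(x)\eta(x)^{-k}$ on a grid and compares it with the interval from Proposition 6. It assumes $p<\frac{1}{2}$ (so that $x_{+}$ is well defined); the function names, the grid resolution and the two parameter pairs are illustrative choices.

```python
import math


def f(x: float, p: float, k: int) -> float:
    """f(x) = H(x) * eta(x)^(-k), with eta(x) = p(1-x) + x(1-p)."""
    h = -x * math.log(x) - (1 - x) * math.log(1 - x)
    eta = p * (1 - x) + x * (1 - p)
    return h * eta ** (-k)


def bracket(p: float, k: int) -> tuple[float, float]:
    """Return (x_minus, x_plus) as defined in Proposition 6; requires p < 1/2."""
    x_plus = min(0.5, p / ((k - 1) * (1 - 2 * p)))
    return 0.4 * x_plus, x_plus


def argmax_on_grid(p: float, k: int, grid: int = 20_000) -> float:
    """Grid point in (0,1) at which f is largest (approximates x_0)."""
    return max((j / grid for j in range(1, grid)), key=lambda x: f(x, p, k))


for p, k in [(0.3, 3), (0.1, 4)]:
    lo, hi = bracket(p, k)
    print(p, k, lo, argmax_on_grid(p, k), hi)
```

A grid search only approximates $x_{0}$ , so this illustrates the statement rather than proving it.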

Next, we will show that for $k$ growing sufficiently fast and bias sufficiently small, the first moment bound is tight (i.e. $\alpha_{k}(p)=(1+o(1))c_{p}$ ).

Theorem 7:

Assume $k\geq K\cdot(\log_{2}n+\omega(1))$ for some $K\geq 1$ and let $\varepsilon>0$ be fixed.

(i.) For any $p$ , we have that

[TABLE]

where $x_{*}$ is defined as the smaller of the two roots of the following equation:

[TABLE]

In other words, $\Phi^{\frac{1}{2}-b}_{k}(m,n)$ is satisfiable w.h.p. for ${m<(1-\varepsilon)c_{p,x_{*}}n}$ .

(ii.) If $|p-\frac{1}{2}|\leq\frac{1-o_{K}(1)}{\log n}$ , then $\alpha_{k}(p)=(1+o(1))c_{p}$ . In other words,

[TABLE]

These results are similar to those known for the unbiased $k$ -SAT problem. Setting $K=1$ (and hence $x_{*}=1/2$ ) in our theorem we recover the lower bound $k\geq\log_{2}n+\omega(1)$ which is known for that case [15, 21]. We also note that the result remains valid for $K$ which depends on $n$ .

Proof. Let $I_{K}:=\{x:2x^{2}+2x+2^{-1/K}-1\geq 0\}$ . We will use the second moment method to prove that for any $x\in I_{K}$ , there exists a solution in $L_{xn}$ with high probability if $m<(1-\varepsilon)c_{p,x}n$ . From this the two parts of the theorem follow by noting that (i.) $x_{*}\in I_{K}$ , and (ii.) for $p$ sufficiently close to $\frac{1}{2}$ , the $x$ that maximizes $c_{p,x}$ lies in $I_{K}$ .

[TABLE]

In order for the event $\{u\textrm{ not covered by time }m\}$ to occur, a Poisson process of intensity $Q(i,n)$ must have had no event on the time interval $[0,m]$ . The probability of this happening is $\exp(-Q(i,n)m)$ . Thus the denominator equals $\binom{n}{i}^{2}\exp(-2Q(i,n)m)$ .

Similarly, in order for the event $\{u,v\textrm{ not covered at time }m\}$ to occur, no clause covering $u$ or $v$ can have occurred. By inclusion-exclusion, the total intensity of clauses covering at least one of $u$ and $v$ is $2Q(i,n)-Q^{r}(i,n)$ , where $r=\frac{1}{2}d(u,v)$ , so this event has probability $\exp((-2Q(i,n)+Q^{r}(i,n))m)$ .

How many pairs $u,v$ have Hamming distance $2r$ ? Starting from $u$ , such a $v$ is uniquely determined by which $r$ $1$ ’s we flip to [math]’s, and which $r$ [math]’s we flip to $1$ ’s. So for any $u\in L_{i}$ there are $\binom{i}{r}\binom{n-i}{r}$ vertices $v$ at distance $2r$ from $u$ . It follows that

[TABLE]
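The counting step above (for $u\in L_{i}$ there are $\binom{i}{r}\binom{n-i}{r}$ vertices of $L_{i}$ at Hamming distance $2r$ ) can be confirmed by brute force for small $n$ . The sketch below enumerates one layer directly; the helper names and the choice $n=7$ , $i=3$ are illustrative.

```python
from itertools import combinations
from math import comb


def layer(n: int, i: int):
    """Yield all vertices of {-1,1}^n with exactly i coordinates equal to 1."""
    for ones in combinations(range(n), i):
        yield tuple(1 if j in ones else -1 for j in range(n))


def same_layer_at_distance(u: tuple, dist: int) -> list:
    """Vertices in the layer of u that differ from u in exactly `dist` coordinates."""
    n, i = len(u), sum(1 for c in u if c == 1)
    return [v for v in layer(n, i) if sum(a != b for a, b in zip(u, v)) == dist]


n, i = 7, 3
u = next(layer(n, i))
counts = {r: len(same_layer_at_distance(u, 2 * r)) for r in range(i + 1)}
print(counts)
```

By vertex transitivity within a layer, checking a single $u$ suffices.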

Define $g$ to be the function

[TABLE]

so that $\frac{\mathbb{E}[Z_{m,i}^{2}]}{\mathbb{E}[Z_{m,i}]^{2}}\leq\sum_{r=0}^{i}e^{g(r/n)}$ .

Claim 6:

The function $g$ is concave on $[\frac{\varepsilon}{4}x,x]$ .

Proof. The second derivative of $\log\binom{i}{zn}$ is $-(1+o(1))(\frac{n}{z}+\frac{n}{x-z})$ , which is at most $-\frac{n}{x}$ . Similarly, the second derivative of $\log\binom{n-i}{zn}$ is at most $-\frac{n}{1-x}$ .

Using that $k-1$ is at least $\frac{\log n}{-\log(1-2x(1-x))}$ , we see that the second derivative of $(-1+(1-\varepsilon)(1-2z)^{k-1})\log\binom{n}{i}$ is at most

[TABLE]

Hence $\frac{d^{2}}{dz^{2}}\log g(z)\leq n\left(-\frac{1}{x}-\frac{1}{1-x}+4H(x)k^{2}n^{-\frac{\varepsilon}{4}}\right)\leq n(-1+o(1))$ . ∎

We will estimate $g$ for different ranges of $z$ :

Case I:

$z<\frac{\varepsilon}{4}x$

The factor $-1+(1-\varepsilon)(1-\frac{r}{n})^{k-1}$ is at most $-\varepsilon$ , and ${\binom{n}{i}^{-\varepsilon}\leq\binom{n}{\varepsilon i/2}^{-2}}$ . On the other hand, $\binom{i}{zn}\binom{n-i}{zn}\leq\binom{n}{2zn}$ , which (by assumption) is at most $\binom{n}{\varepsilon i/2}$ . Together this gives us that $e^{g(z)}\leq\binom{n}{\varepsilon i/2}^{-1}$ . It follows that

[TABLE]

Case II:

$z\geq z_{0}:=x(1-x)(1-\delta)$

For convenience, let $y:=2x(1-x)$ . Using that $k\geq\frac{\log n+\omega(1)}{-\log(1-y)}$ , it follows that

[TABLE]

where the last inequality comes from noting that $y\mapsto\frac{y}{-(1-y)\log(1-y)}$ is an increasing function on $[0,\frac{1}{2}]$ , and thus at most $1/\log 2$ . So $\binom{n}{i}^{(1-2z)^{k-1}}$ is at most $\exp\big{(}H(x)\cdot o(n^{2\delta})\big{)}$ , which is $1+o(1)$ if we set $\delta:=\frac{1}{\log n}$ . It follows that $e^{g(z)}<(1+o(1))\cdot\binom{i}{zn}\binom{n-i}{zn}\binom{n}{i}^{-1}$ for $z$ in this interval, and summing over $r$ we get

[TABLE]

Case III:

$\frac{\varepsilon}{4}x\leq z<z_{0}$

Since $g$ is concave on this interval, its graph lies beneath any of its tangent lines. In particular, this is true for the tangent line at $z=z_{0}$ . In other words, for any $z$ we have that ${g(z)\leq g(z_{0})+(z-z_{0})g^{\prime}(z_{0})}$ . Since $z-z_{0}<0$ , we need to lower bound $g^{\prime}(z_{0})$ and upper bound $g(z_{0})$ .

[TABLE]

Since $1-2z_{0}>1-y\geq\frac{1}{2}$ , (4) is at most

[TABLE]

and $(1-2z_{0})^{k-1}=o(n^{-1})$ by the previous case. The fraction in the right-hand side of (3) is at least $\frac{1}{1-\delta}$ . Thus

[TABLE]

To bound $e^{g(z_{0})}$ from above, recall from the previous case that it is at most $(1+o(1))\binom{i}{z_{0}n}\binom{n-i}{z_{0}n}\binom{n}{i}^{-1}$ . The expression $\binom{i}{zn}\binom{n-i}{zn}\binom{n}{i}^{-1}$ is maximized for $z=x(1-x)$ , where it is $O(n^{-\frac{1}{2}})$ . Hence

[TABLE]

and thus

[TABLE]

Together these three cases give us that $\sum_{r=0}^{z_{0}i}e^{g(r/n)}\leq 1+o(1)$ , which implies that $\frac{\mathbb{E}[Z_{m,i}^{2}]}{\mathbb{E}[Z_{m,i}]^{2}}=1+o(1)$ . Chebyshev’s inequality then gives us that $Z_{m,i}>0$ with probability $1-o(1)$ . ∎

4.2 Improved first moment method

When the unit-clause algorithm from section 3 succeeds in finding a solution, the number of variables set to ‘false’ is concentrated around $pn$ . So we might suspect that $i=pn$ is the dominating term in the expected number of solutions. Recall that $Q(i,n,k,p)$ is the probability that a $p$ -biased $k$ -clause covers an arbitrary vertex in $L_{i}$ , and that for $q(x):=Q(xn,n,k,p)$ we have

[TABLE]

Furthermore, recall that $Z_{i,n}$ is the number of uncovered vertices in $L_{i}$ , and that

[TABLE]

For small $p$ , in order for the right hand side to be negative we need $c$ to be of order $\log\frac{1}{p}\cdot p^{1-k}$ . This only differs from the lower bound by a log-factor, and a small improvement on the vanilla first moment method suffices to correct this: the single-flip method (due to Dubois & Boufkhad [12]). We will adapt this method to the biased random $k$ -SAT model.

Proposition 8:

For small enough $p$ , $\alpha_{k}(p)\leq 2p^{1-k}\alpha_{k}({\frac{1}{2}})$ .

Proof. We will apply the first moment method to a subset of solutions, which is guaranteed to be non-empty if the set of solutions is non-empty.

Let $C_{1},C_{2},\ldots$ be the sub-cubes drawn, and let $K_{m}:=\bigcup_{j=1}^{m}C_{j}$ be the set of covered vertices at time $m$ . The hyper-cube can be given a lattice ordering as follows: $u\leq v$ if $u_{i}\leq v_{i}$ for every coordinate $i$ . This is isomorphic to the lattice ordering on $2^{[n]}$ induced by inclusion. Let $M_{m,i}$ be the number of solutions (uncovered vertices) in $L_{i}$ that are locally minimal w.r.t. this order. (We might equally well count the number of locally maximal solutions.) Note that $\sum_{i}M_{m,i}$ is non-zero if and only if the formula $C_{1}\wedge\ldots\wedge C_{m}$ is satisfiable. We will bound $\mathbb{E}M_{m,i}$ from above.

Pick any $u\in L_{i}$ , and let $u^{1},\ldots,u^{n-i}$ be the vertices in $L_{i-1}$ adjacent to $u$ . The probability that $u$ is a locally minimal solution can then be written as

[TABLE]

The event $\{u^{j}\in K_{m}\}$ is the union over all $t\in[m]$ of the events $\{u^{j}\in C_{t}\}$ . For any fixed $t$ , the events $\{u^{j}\in C_{t}\}$ and $\{u^{j^{\prime}}\in C_{t}\}$ are mutually exclusive conditional on $\{u\notin K_{m}\}$ , because any cube $C_{t}$ covering both $u^{j}$ and $u^{j^{\prime}}$ for some $j\neq j^{\prime}$ must also cover $u$ (by convexity of $C_{t}$ ). We can therefore consider this as a balls-and-bins problem, where the balls are clauses and the bins are vertices $u^{j}$ . Let $X_{j}:=|\{t\in[m]:u^{j}\in C_{t}\}|$ , i.e. the number of ‘balls’ in ‘bin’ number $j$ . Dubhashi & Ranjan [11] studied negative dependence of balls-and-bins problems, and in particular showed that the vector of the number of balls in each bin satisfies the negative association property (theorem 13 of that paper).
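The convexity fact used here (a sub-cube containing two distinct lower neighbours of $u$ must contain $u$ itself) is easy to confirm exhaustively for small dimensions. In the sketch below a sub-cube is represented as a dictionary from locked positions to locked values; this representation and the function names are illustrative assumptions.

```python
from itertools import combinations, product


def in_cube(v: tuple, cube: dict) -> bool:
    """v lies in the sub-cube iff it agrees with every locked coordinate."""
    return all(v[pos] == val for pos, val in cube.items())


def lower_neighbours(u: tuple) -> list:
    """Vertices obtained from u by changing a single +1 coordinate to -1."""
    return [u[:j] + (-1,) + u[j + 1:] for j, c in enumerate(u) if c == 1]


def convexity_holds(n: int, k: int) -> bool:
    """Check all vertices u of {-1,1}^n against all sub-cubes of co-dimension k."""
    for u in product((-1, 1), repeat=n):
        nbrs = lower_neighbours(u)
        for positions in combinations(range(n), k):
            for values in product((-1, 1), repeat=k):
                cube = dict(zip(positions, values))
                covered = sum(in_cube(v, cube) for v in nbrs)
                if covered >= 2 and not in_cube(u, cube):
                    return False
    return True


print(convexity_holds(5, 2), convexity_holds(5, 3))
```

The reason is that two distinct neighbours of $u$ differ from each other in two coordinates, which must then both be free in the cube, while on all other coordinates they agree with $u$ .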

So $X=(X_{1},X_{2},\ldots X_{n-i})$ is a negatively associated vector, and proposition 4 from the same paper gives the following inequality:

[TABLE]

The left hand side is precisely ${\mathbb{P}(u^{1},u^{2},\ldots u^{n-i}\in K_{m}|u\notin K_{m})}$ , while each factor on the right hand side is ${\mathbb{P}(u^{j}\in K_{m}|u\notin K_{m})}$ , whence

[TABLE]

Claim 7:

The probability of a $p$ -biased random cube covering $u^{1}$ but not $u$ is $\frac{(1+o(1))p}{\eta_{p}(x)}\cdot\frac{k}{n}q(x)$

Proof. Assume WLOG that $u^{1}_{1}=1$ , $u_{1}=-1$ (and thus $u^{1}_{j}=u_{j}$ for $j>1$ ). For a cube to cover $u^{1}$ but not $u$ , it must have a $1$ in the first position. This happens with probability $p\frac{k}{n}$ . The other $n-1$ positions can be seen as a $(k-1)$ -co-dimensional sub-cube of $\Sigma_{n-1}$ , which must cover the vertex $(u_{2},u_{3},\ldots,u_{n})\in\Sigma_{n-1}$ . This happens with probability

[TABLE]

The claim follows. ∎

Using claim 7, we can calculate the conditional probability of $u^{1}$ not being covered.

[TABLE]

from which it follows that $u$ is a locally minimal solution with probability at most

[TABLE]

Recall that $M_{c,x}=M_{c,x}(n,p)$ is the number of locally minimal solutions (in layer $xn$ ) to a $p$ -biased random $k$ -SAT instance with $cn$ clauses. We can estimate the expected number of minimal solutions to be at most

[TABLE]

Let $c=2p^{1-k}$ . We will use the following bound (valid for sufficiently small $p$ and $n=\omega_{p}(1)$ ).

[TABLE]

Then $\frac{c}{2x}(x+p)^{k}>\frac{1}{2}\cdot\frac{p}{x}(\frac{x}{p}+1)^{k}$ . Let $\varphi_{k}(t):={(t+1)^{k}}/{t}$ . This function is minimized for $t={1}/{(k-1)}$ , giving the inequality $\varphi_{k}(t)\geq{(1+\frac{1}{k-1})^{k}}/{\frac{1}{k-1}}>(k-1)e$ . In particular, for $t=x/p$ we have that

[TABLE]

which in turn gives the following bound on $U$ :

[TABLE]

The right hand side is less than $-x/2$ for $k\geq 3$ . Hence ${\mathbb{E}M_{c,x}\leq e^{-nx/2}}$ , which for $x\geq 1/\sqrt{n}$ is at most $e^{-\sqrt{n}/2}$ . But for smaller $x$ , it suffices to note that $M_{c,x}\leq Z_{c,x}$ , and

[TABLE]

Thus, by Markov’s inequality,

[TABLE]

In other words, $\alpha_{k}(p)\leq c=2p^{1-k}$ for $p$ small enough. ∎
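The elementary inequality for $\varphi_{k}(t)=(t+1)^{k}/t$ used in the proof above can be checked numerically. The following sketch evaluates $\varphi_{k}$ at the claimed minimizer $t=1/(k-1)$ and on a grid of other values of $t$ ; the grid and the range of $k$ are arbitrary illustrative choices.

```python
import math


def phi(t: float, k: int) -> float:
    """phi_k(t) = (t + 1)^k / t for t > 0."""
    return (t + 1) ** k / t


def phi_min(k: int) -> float:
    """Value of phi_k at t = 1/(k-1), the minimizer named in the proof."""
    return phi(1 / (k - 1), k)


for k in range(3, 8):
    print(k, phi_min(k), (k - 1) * math.e)
```

The lower bound $(k-1)e$ follows from $(1+\frac{1}{k-1})^{k}>e$ .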

5 Satisfiability threshold for bias near 0

In this section we will prove the main theorem of this paper, which describes the shape of the threshold near $p=\frac{1}{2}$ . (From now on, we will work with the continuous-time version of the biased $k$ -SAT problem.) We will work with the parametrization $p=\frac{1}{2}-b$ for some small $b$ .

Theorem 9:

For any $k\geq 3$ , there exists a constant ${K_{k}<2^{8k}}$ such that for all sufficiently small $b$ ,

[TABLE]

The upper bound on $K_{k}$ can be improved slightly to $2^{6k-4}$ by optimizing the choice of parameters in the proof of Lemma 12 due to Chvátal-Szemerédi [6], but in order to achieve a sub-exponential upper bound Lemma 19 would need to be significantly sharpened.

As an aside, what should we expect the correct quadratic coefficient to be? While the lower bound $1+2kb^{2}$ matches our exact result of $1+4b^{2}+O(b^{4})$ for $k=2$ , the upper and lower bounds for larger bias suggest a larger quadratic coefficient for $k\geq 3$ . The threshold curve for $p$ near $0$ or $1$ scales like $(4p(1-p))^{1-k}$ , and if it follows that curve for $p$ near $1/2$ too, we get the Taylor expansion $1+4(k-1)b^{2}+O(b^{4})$ near $b=0$ . One might therefore guess that $4(k-1)$ is the correct coefficient. This also matches the result for $k=2$ .
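The Taylor expansion quoted in this aside follows from $4p(1-p)=1-4b^{2}$ at $p=\frac{1}{2}-b$ , so that $(1-4b^{2})^{1-k}=1+4(k-1)b^{2}+O(b^{4})$ . A short numerical sketch of the quadratic coefficient, with hypothetical function names and an arbitrarily chosen small $b$ :

```python
def heuristic_curve(b: float, k: int) -> float:
    """(4p(1-p))^(1-k) evaluated at p = 1/2 - b, i.e. (1 - 4 b^2)^(1-k)."""
    p = 0.5 - b
    return (4 * p * (1 - p)) ** (1 - k)


def quadratic_coefficient(k: int, b: float = 1e-3) -> float:
    """Finite-b estimate of the coefficient of b^2 in the expansion around b = 0."""
    return (heuristic_curve(b, k) - 1) / b ** 2


for k in range(2, 7):
    print(k, quadratic_coefficient(k), 4 * (k - 1))
```

This only concerns the heuristic curve; it says nothing about whether the true threshold follows it.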

Before we can continue with the proof of Theorem 9, we will need some background on spine variables.

Definition 10:

Let $\Phi$ be a satisfiable formula and $x$ a variable in it. We say that $x$ is a spine variable in $\Phi$ if $x$ has the same value in any assignment satisfying $\Phi$ . If such an $x$ always has value ‘TRUE’, we say that it is a positive spine variable and that it is locked to TRUE. (Similarly for negative.)

For $\Phi$ an unsatisfiable formula, we say that $x$ is a spine variable in $\Phi$ if there exists a satisfiable formula $F\subset\Phi$ such that $x$ is a spine variable in $F$ .

We will use the following definition and lemma from Chvátal-Szemerédi [6].

Definition 11:

Let $x,y>0$ . A $k$ -uniform hypergraph with $n$ vertices is ( $x$ , $y$ )-sparse if every set of $s\leq xn$ vertices contains at most $ys$ edges.

Lemma 12 (Chvátal-Szemerédi):

Let $k,t>0$ and $y>1/(k-1)$ . Then w.h.p. the random $k$ -uniform hypergraph with $n$ vertices and $tn$ edges is $(x,y)$ -sparse, where

[TABLE]

Using this lemma, Boetcher, Istrate & Percus proved that the number of spine variables is either $o(n)$ or at least $\delta n$ for some $\delta>0$ (part 1 of theorem 3 in [17]). But their proof actually works even when replacing $o(n)$ with $0$ , leading to the following theorem.

Theorem 13 (Boetcher-Istrate-Percus; a special case of their Theorem 3, using our notation):

For $k\geq 3$ , there is a constant $\delta=\delta(t,k)>0$ such that the following holds with high probability: for every unsatisfiable $\Phi=\Phi_{k}^{\frac{1}{2}-b}(t^{\prime}n,n)$ with $t^{\prime}<t$ , the number of spine variables of $\Phi$ is either $0$ or at least $\delta n$ .

The following lemma is a consequence of the proofs of Lemma 12 and Theorem 13, and we state it without proof.

Lemma 14:

For a fixed $k$ , $\delta=\delta(t,k)$ is a decreasing function of $t$ . Furthermore, for a fixed $t<2^{k}$ , $\delta=\exp(-O(k))$ .

Next, we will need a lemma that gives a correspondence between spine variables and clauses that turn a satisfiable formula into an unsatisfiable one.

Lemma 15:

Let $\Phi$ be a satisfiable $k$ -CNF with a set $S_{+}\subseteq[n]$ of positive spine variables and a set $S_{-}\subseteq[n]$ of negative spine variables. Then $\Phi\wedge C$ is unsatisfiable if and only if $C$ can be written as

[TABLE]

for some $K_{\pm}\subseteq S_{\pm}$ .

In other words, $\Phi\wedge C$ is unsatisfiable iff every variable that occurs in $C$ is a spine variable in $\Phi$ , and the sign that it has in $C$ is incompatible with the truth value it is locked to in $\Phi$ .

Proof. For the ‘if’ part, assume $C$ is of the above form. If $x\in\{\pm 1\}^{n}$ is a solution to $\Phi$ , it has $x_{i}=-1$ for $i\in S_{-}$ , so ${\bigvee_{i\in K_{-}}x_{i}=-1}$ . Similarly, ${\bigvee_{i\in K_{+}}-x_{i}=-1}$ . But then $C(x)=-1$ , so $x$ is not a solution to $\Phi\wedge C$ .

On the other hand, if $x$ is not a solution to $\Phi$ it is not a solution to $\Phi\wedge C$ . Hence $\Phi\wedge C$ is unsatisfiable, proving the ‘if’ part.

For the ‘only if’ part, assume instead $C$ is not of that form. Then either

1. There exists an $i\in S_{-}$ such that $x_{i}$ occurs in $C$ with negative sign. In that case, any $x$ that satisfies $\Phi$ will also satisfy $C$ , because such an $x$ will have $x_{i}=-1$ , and $C$ will have a term $-x_{i}$ . Hence $\Phi\wedge C$ is satisfied by $x$ .

2. There exists an $i\in S_{+}$ such that $x_{i}$ occurs in $C$ with positive sign. Analogously to the previous case, any $x$ that satisfies $\Phi$ will also satisfy $C$ .

3. There exists an $i\notin S_{+}\cup S_{-}$ such that $x_{i}$ occurs in $C$ . In that case, there exist $x,x^{\prime}$ with $x_{i}\neq x^{\prime}_{i}$ that satisfy $\Phi$ (otherwise $x_{i}$ would have been a spine variable!). Either $x$ or $x^{\prime}$ will satisfy $C$ , so $\Phi\wedge C$ is satisfiable.

So in any case, $\Phi\wedge C$ is satisfiable. ∎
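Lemma 15 can be illustrated by brute force on a tiny formula. The sketch below computes the positive and negative spine variables of a satisfiable CNF by enumerating all assignments, and then checks, for every clause $C$ on $k$ distinct variables, that $\Phi\wedge C$ is unsatisfiable exactly when each variable of $C$ is a spine variable occurring with the sign opposite to its locked value. Clauses are encoded as tuples of non-zero signed integers (a DIMACS-style convention chosen for this example, not taken from the paper), and the example formula is hypothetical.

```python
from itertools import combinations, product


def solutions(clauses: list, n: int) -> list:
    """All assignments (tuples of bools) satisfying every clause; literal l>0 means x_l."""
    return [
        bits
        for bits in product([False, True], repeat=n)
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
    ]


def spine(clauses: list, n: int) -> tuple:
    """Return (S_plus, S_minus) for a satisfiable formula."""
    sols = solutions(clauses, n)
    pos = {v for v in range(1, n + 1) if all(s[v - 1] for s in sols)}
    neg = {v for v in range(1, n + 1) if not any(s[v - 1] for s in sols)}
    return pos, neg


def lemma15_holds(clauses: list, n: int, k: int) -> bool:
    """Compare brute-force unsatisfiability of Phi ∧ C with the spine criterion."""
    assert solutions(clauses, n), "Lemma 15 assumes a satisfiable formula"
    pos, neg = spine(clauses, n)
    for variables in combinations(range(1, n + 1), k):
        for signs in product((1, -1), repeat=k):
            c = tuple(s * v for s, v in zip(signs, variables))
            unsat = not solutions(clauses + [c], n)
            # Negative spine variables must occur positively, positive ones negated.
            predicted = all((l > 0 and l in neg) or (l < 0 and -l in pos) for l in c)
            if unsat != predicted:
                return False
    return True


# x1 is locked to TRUE, x3 is locked to FALSE, x2 and x4 are free.
formula = [(1, 2), (1, -2), (-3, 4), (-3, -4)]
print(spine(formula, 4), lemma15_holds(formula, 4, 2))
```

Exhaustive enumeration is exponential in $n$ , so this is meant only for very small instances.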

Overview of proof idea

In the remainder of this section, we will assume that $p=\frac{1}{2}-b$ for some small positive $b$ , and work with $b$ rather than $p$ .

Let $P(t,b)$ be the probability that $\Phi_{k}^{\frac{1}{2}-b}(tn,n)$ (a $(\frac{1}{2}-b)$ -biased $k$ -CNF) is satisfiable. By studying the partial derivatives of $P$ and estimating the ratio between them, we derive a pair of differential inequalities for the implicit function given by $P(t,b)=\frac{1}{2}$ . Solving these inequalities then gives us an upper and a lower bound on the satisfiability threshold.

The $t$ -derivative is given by the probability of making the formula unsatisfiable by adding one more clause. Lemma 15 gives us a complete description, in terms of spine variables, of when adding a new clause can turn a satisfiable formula into an unsatisfiable one.

In order to calculate the $b$ -derivative, we employ Margulis-Russo’s formula from percolation theory (Proposition 17). This will result in something very similar to the $t$ -derivative, only depending on the signs of variables slightly differently. To bridge that gap, we study the effects of re-randomizing the signs that spine variables occur with in clauses, conditional on certain other spine variables not being affected (Lemmas 19, 21 and 20).

Margulis-Russo’s formula was proven by Russo [27] specifically for indicator random variables of increasing events, but the event we are interested in (satisfiability) is not monotone with respect to changing signs. However, in Grimmett’s textbook on percolation theory [16] there is a generalization of Russo’s formula (thm 2.32) to any real-valued random variable. Here is a version of that theorem, restated with our notation and for finite-dimensional product spaces.

Theorem 16 (Russo’s formula, finite case):

Let $I$ be a finite set, let the probability space $\mathcal{S}=\{-1,1\}^{I}$ be equipped with the product measure where $\mathbb{P}(s_{i}=-1)=p$ for any $i\in I$ , and let $X$ be a real-valued random variable on $\mathcal{S}$ . For any $s\in\mathcal{S}$ , let $s^{\pm i}$ be $s$ but with the $i$ -coordinate set to $\pm 1$ . Furthermore, let the pivotal $\delta^{i}X$ be defined by

[TABLE]

Then, for any $0<p<1$ ,

[TABLE]

This theorem allows us to calculate the rate of change of $\mathbb{E}_{p}[X]$ by studying the expected effect of ‘local’ changes to $s$ . We will apply it with $X=\mathbf{1}_{\Phi\in\textrm{SAT}}$ .
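To make the statement concrete, the sketch below compares a finite-difference derivative of $\mathbb{E}_{p}[X]$ with the sum of expected pivotals on a small product space. It assumes the sign convention $\delta^{i}X(s)=X(s^{-i})-X(s^{+i})$ , which is the natural one when $p=\mathbb{P}(s_{i}=-1)$ ; the test function `X`, the step size `h` and all names are illustrative choices.

```python
from itertools import product


def expectation(X, n: int, p: float) -> float:
    """E_p[X] on {-1,1}^n under the product measure with P(s_i = -1) = p."""
    total = 0.0
    for s in product((-1, 1), repeat=n):
        weight = 1.0
        for c in s:
            weight *= p if c == -1 else 1 - p
        total += weight * X(s)
    return total


def russo_sum(X, n: int, p: float) -> float:
    """Sum over coordinates i of E_p[X(s^{-i}) - X(s^{+i})] (assumed convention)."""
    def pivotal(i):
        return lambda s: X(s[:i] + (-1,) + s[i + 1:]) - X(s[:i] + (1,) + s[i + 1:])
    return sum(expectation(pivotal(i), n, p) for i in range(n))


def X(s: tuple) -> float:
    """An arbitrary, non-monotone real-valued test function."""
    return float(s[0] * s[1] > 0) + 0.5 * s[2] * s[3]


n, p, h = 4, 0.3, 1e-6
numeric = (expectation(X, n, p + h) - expectation(X, n, p - h)) / (2 * h)
print(numeric, russo_sum(X, n, p))
```

Since $\mathbb{E}_{p}[X]$ is a polynomial in $p$ , the central difference agrees with the pivotal sum up to rounding error.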

Proof of Theorem 9

Proposition 17:

Let $C$ be a $b$ -biased random $k$ -clause, and let $C_{\pm}$ be $C$ but with the sign of the first variable changed to $\pm$ . Furthermore, let $\rho_{\pm}:={\mathbb{P}\big{(}\Phi\wedge C_{\pm}\notin\operatorname{SAT},\Phi\in\operatorname{SAT}\big{)}}$ and $\rho:=\mathbb{P}\big{(}\Phi\wedge C\notin\operatorname{SAT},\Phi\in\operatorname{SAT}\big{)}$ . Then

[TABLE]

Proof. The $t$ -derivative is trivial; the prefactor comes from scaling time by $n$ . For the $b$ -derivative, let $X=\mathbf{1}_{\Phi\in\textrm{SAT}}$ and apply Russo’s formula. Let $M$ be the set of clauses in $\Phi(t,b)$ .

[TABLE]

where $\delta^{x,C}$ is the pivot of the variable $x$ in the clause $C$ . Now, pick an arbitrary $C\in M$ . Let $x$ be the first variable in $C$ . The signed pivotal $\delta^{x,C}X$ is $+1$ if $\Phi\wedge C_{+}$ is unsatisfiable and $\Phi\wedge C_{-}$ is satisfiable, $-1$ if the reverse holds, and $0$ otherwise. Thus

[TABLE]

Since every term in the sum in eq. 5 above has the same expected value, we can apply Wald’s equation to arrive at

[TABLE]

Noting that $\mathbb{E}|M|=nt$ , the proposition follows. ∎

Lemma 18:

For $\rho$ , $\rho_{+}$ and $\rho_{-}$ defined as in Proposition 17 and $\delta$ as in Theorem 13,

[TABLE]

In order to prove Lemma 18, we will need the estimates in Lemmas 19, 20 and 21.

Lemma 19:

For $\Phi$ conditioned on being satisfiable and having a spine variable $x$ , the probability that $x$ is locked to ‘TRUE’ is at most $\frac{1}{2}+\frac{tk}{\delta}b$ .

Lemma 20:

Let $\Gamma$ be the event that $\Phi$ is satisfiable and that the variables $x_{1},\ldots x_{s}$ are spine variables in $\Phi$ , of which $x_{2},x_{3},\ldots x_{s}$ are locked to signs $\sigma_{2},\sigma_{3},\ldots\sigma_{s}$ (for some $\sigma_{i}=\pm 1$ ). Then, conditional on $\Gamma$ , the probability that $x_{1}$ is locked to ‘TRUE’ is between $\frac{1}{2}-\frac{tk}{\delta}b$ and $\frac{1}{2}+\frac{tk}{\delta}b$ .

Furthermore, if we let $\sigma=\pm 1$ with probability $\frac{1}{2}\pm b$ , then (conditional on $\Gamma$ ) the probability that $x_{1}$ is locked to $\sigma$ is between $\frac{1}{2}-\frac{tk}{\delta}b^{2}$ and $\frac{1}{2}+\frac{tk}{\delta}b^{2}$

Lemma 21:

For $\Phi$ conditioned on being satisfiable and having a spine variable $x$ , the probability that $x$ is locked to ‘TRUE’ is at least $\frac{1}{2}+b$ .

Proof of Lemma 18. Assume (wlog) that the variables in $C$ are $x_{1},\ldots x_{k}$ and they occur with signs $\sigma_{1},\ldots\sigma_{k}$ . Let the events $E_{1}$ , $E_{2}$ , $E_{2}^{-}$ and $E_{2}^{+}$ be defined in the following way:

$E_{1}$ :

$\Phi$ is unsatisfiable and the $k$ variables in clause $C_{\pm}$ are spine variables in $\Phi$ .

$E_{2}$ :

These $k$ variables all occur with the opposite sign (in $C$ ) to the sign they are locked to in $\Phi-C$ .

$E^{\pm}_{2}$ :

These $k$ variables all occur with the opposite sign (in $C_{\pm}$ ) to the sign they are locked to in $\Phi-C_{\pm}$ .

First, note that the event $\{\Phi\notin\operatorname{SAT},\Phi-C\in\operatorname{SAT}\}$ happens if and only if the events $E_{1}$ and $E_{2}$ happen, and similarly for $C_{\pm}$ , $E_{1}$ and $E_{2}^{\pm}$ . Thus $\rho_{\pm}=\mathbb{P}(E_{1},E^{\pm}_{2})$ and $\rho=\mathbb{P}(E_{1},E_{2})$ . But $\mathbb{P}(E_{1},E_{2})=\mathbb{P}(E_{2}|E_{1})\cdot\mathbb{P}(E_{1})$ , and similarly for $\pm$ . It follows that

[TABLE]

We now want to estimate the probabilities $\mathbb{P}(E^{\pm}_{2}|E_{1})$ . Starting with $E^{+}_{2}$ , we see that

[TABLE]

We will use Lemmas 19 and 21 to estimate the first probability.

[TABLE]

For all the subsequent probabilities, we use Lemma 20.

[TABLE]

For $b$ sufficiently small, the product of these $k-1$ probabilities is at most

[TABLE]

and similarly at least ${2^{1-k}(1-\frac{2tk^{2}}{\delta}b^{2})}$ . Together these estimates yield

[TABLE]

Similarly,

[TABLE]

Using these bounds, we see that

[TABLE]

Finally, to estimate $\mathbb{P}(E_{2}|E_{1})$ , simply note that it is a weighted mean of $\mathbb{P}(E^{+}_{2}|E_{1})$ and $\mathbb{P}(E^{-}_{2}|E_{1})$ , both of which are $2^{-k}(1\pm O(b))$ . Thus, cancelling the prefactors of $2^{-k}$ , we get that

[TABLE]

The lemma follows. ∎

Proof of Lemma 19. Let $\Gamma$ be the event ${\{\Phi\in\operatorname{SAT},s(\Phi)\neq 0\}}$ . We start by conditioning on $\Gamma$ and on $x$ having degree $d$ in the hypergraph of $\Phi$ . Let $C_{1},\ldots,C_{d}$ be the clauses containing $x$ , and let $s_{i}$ be the sign that $x$ occurs with in $C_{i}$ . Let $f$ be the function $f:\Sigma_{d}\to\Sigma_{1}$ such that $f(\sigma)=1$ iff the formula obtained by replacing $x$ in $C_{i}$ with $\sigma_{i}$ for every $i$ is satisfiable (otherwise $f(\sigma)=-1$ ). Note that $f$ is non-decreasing: satisfying more clauses can only make the rest of the formula easier to satisfy.

When trying to find a satisfying assignment to $\Phi$ , we can either satisfy all clauses containing $x$ or all clauses containing $-x$ . If $f(\mathbf{s})=f(-\mathbf{s})=-1$ , then the formula $\Phi$ is unsatisfiable regardless of the value we assign to $x$ , and similarly if $f(\mathbf{s})=f(-\mathbf{s})=1$ it is always satisfiable. But if $f(\mathbf{s})\neq f(-\mathbf{s})$ then $\Phi$ is satisfiable and $x$ is a spine variable locked to $\frac{1}{2}(f(\mathbf{s})-f(-\mathbf{s}))$ .

We therefore let $g(\sigma):=\frac{1}{2}(f(\sigma)-f(-\sigma))$ . The function $g$ is both non-decreasing and odd.

Now, pick a $b$ -biased random $\mathbf{b}\in\Sigma_{d}$ conditional on $g(\mathbf{b})\neq 0$ . Construct a new formula $\Phi^{\prime}$ from $\Phi$ by replacing the signs $\mathbf{s}$ that $x$ occurs with in $\Phi$ with $\mathbf{b}$ . Because of the conditioning, $x$ is a spine variable in $\Phi^{\prime}$ too, and $\Phi^{\prime}$ is satisfiable. (Note that $S(\Phi)$ is not necessarily equal to $S(\Phi^{\prime})$ ; we only know that $x$ belongs to both.) What is the expected value of $g(\mathbf{b})$ ?

Define $W(\sigma)$ to be the probability of $\mathbf{b}=\sigma$ , i.e. $W(\sigma)=\left(\frac{1}{2}+b\right)^{h}\left(\frac{1}{2}-b\right)^{d-h}$ where $h$ is the number of $+1$ ’s in $\sigma$ . Furthermore, define $w(g)$ as the expected value of $g(\mathbf{b})$ , conditional on $g(\mathbf{b})\neq 0$ , i.e.

[TABLE]

We will upper bound $\frac{W(\sigma)-W(-\sigma)}{W(\sigma)+W(-\sigma)}$ , and thus get an upper bound for $w(g)$ . Cancelling common factors of $W(\sigma)$ and $W(-\sigma)$ , we see that

[TABLE]

But for any integer $a$ ,

[TABLE]

Noting that $\binom{a}{i-1}/\binom{a}{i}$ is at most $a$ , we see that the above expression is at most $2ab$ . Hence the expression $\frac{|W(\sigma)-W(-\sigma)|}{W(\sigma)+W(-\sigma)}$ is at most ${2b\cdot|d-2h|\leq 2db}$ , and it follows that $w(g)\leq 2db$ . So the expected sign of $x$ is at most $2db$ , or in other words

[TABLE]
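The bound $\frac{|W(\sigma)-W(-\sigma)|}{W(\sigma)+W(-\sigma)}\leq 2b\cdot|d-2h|$ derived above can be checked numerically over a range of parameters. In the sketch below, `W` takes the number $h$ of $+1$ ’s instead of the sign vector itself (which is all that $W$ depends on); the function names, the tested values of $b$ and the range of $d$ are illustrative.

```python
def W(h: int, d: int, b: float) -> float:
    """Probability of a fixed sign vector with h entries +1 and d-h entries -1."""
    return (0.5 + b) ** h * (0.5 - b) ** (d - h)


def relative_gap(h: int, d: int, b: float) -> float:
    """|W(sigma) - W(-sigma)| / (W(sigma) + W(-sigma)); -sigma has d-h entries +1."""
    w_pos, w_neg = W(h, d, b), W(d - h, d, b)
    return abs(w_pos - w_neg) / (w_pos + w_neg)


worst = max(
    relative_gap(h, d, b) - 2 * b * abs(d - 2 * h)
    for b in (0.001, 0.01, 0.1, 0.3)
    for d in range(1, 13)
    for h in range(d + 1)
)
print(worst)
```

A non-positive value of `worst` (up to rounding) means that the bound held on every tested triple $(b,d,h)$ .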

We don’t know the average degree of spine variables, but we do know the average degree of all variables, and that there are at least $\delta n$ spine variables. This gives us the upper bound

[TABLE]

But $\mathbb{E}[\deg(x)]=tk$ , and $\mathbb{P}(x\in S|\Gamma)\geq\delta>0$ by Theorem 13. Thus

[TABLE]

∎

Proof of Lemma 20. We cannot simply re-randomize the signs that $x_{1}$ appears with, conditional on it remaining a spine variable, because that could change whether or not $x_{2}$ (say) is a spine variable.

What we can do, however, is re-randomize the signs of $x_{1}$ conditional on $S_{+}$ and $S_{-}$ being unchanged. Consider the set $D\subseteq\Sigma_{d}$ of all sign vectors $\sigma=(\sigma_{1},\ldots\sigma_{d})$ such that replacing the original signs of $x_{1}$ with $\sigma$ will not change $S_{+}$ or $S_{-}$ . This set $D$ is symmetric, $D=-D$ , so its elements come in antipodal pairs.

Now the proof from Lemma 19 carries through as before, giving the upper bound of the lemma. For the lower bound, replace $b$ with $-b$ throughout. ∎

Before we continue with the proof of Lemma 21, we will need the following definition and theorem concerning the possible sizes of simplicial complexes. The theorem was proven independently in [20, 18].

Definition 22:

For positive integers $N$ and $r$ , the $r$ -cascade of $N$ is defined (our indices here are slightly non-standard; usually one works with $n_{i+1}:=a_{i}-i$ ) as the unique non-increasing sequence of positive integers $a_{i}$ such that

[TABLE]

Given such $a_{i}$ , we define $N^{(r)}:=\binom{a_{0}}{r+1}+\binom{a_{1}-1}{r}+\ldots+\binom{a_{j}-j}{r-j+1}$ .

Theorem 23 (Kruskal-Katona):

For an integral vector $f$ , there exists a $d$ -dimensional simplicial complex $\Delta$ such that $f=f_{\Delta}$ if and only if $0\leq f_{r}\leq f_{r-1}^{(r)}$ for every $0\leq r\leq d$ .

Proof of Lemma 21. Recall that $w(g)$ is the expected sign of a variable, conditional on $g(\mathbf{b})\neq 0$ .

Now, $\Delta=g^{-1}(1)$ is a simplicial complex (equal to $f^{-1}(1)$ or a subcomplex of it), and this correspondence is a bijection, so $w$ is determined by $\Delta$ . Not only that, $w$ only depends on the number of $1$ ’s of $g$ at each level of the hypercube $\Sigma_{d}$ , so $w$ is determined solely by the $f$ -vector $f_{\Delta}$ of $\Delta$ (the vector whose $i$ :th coordinate is the number of $i$ -faces of $\Delta$ ). So henceforth we consider $w$ to be a function of $f_{\Delta}$ .

We want to minimize $w(f)$ over the set of $f$’s that can be written as $f=f_{\Delta}$ for some $\Delta$. The Kruskal-Katona theorem gives necessary and sufficient conditions for the existence of a simplicial complex with a given $f$-vector.

Claim 8:

If $f$ is the $f$ -vector of some $d$ -dimensional simplicial complex $\Delta$ and $f$ minimizes $w(f)$ , then $f_{r}=f^{(r)}_{r-1}$ for every $r$ . In other words, there exists a non-increasing sequence $\mathbf{a}=(a_{i})_{i=0}^{j}$ such that $f_{r}=\sum_{i=0}^{j}\binom{a_{i}-i}{r+1}$ for every $r$ .

Proof. Let $F$ be the set of all integral $(d+1)$ -vectors $f$ such that $f=f_{\Delta}$ for some simplicial complex $\Delta$ . We want to find $\min_{f\in F}w(f)$ .

Note that for every $r$ such that $r>\frac{d}{2}$ , increasing $f_{r}$ by $1$ will decrease $w(f)$ slightly, but if $r<\frac{d}{2}$ doing so will increase it slightly. (For $r=\frac{d}{2}$ , changing $f_{r}$ has no effect on $w(f)$ .) So we want to increase $f_{r}$ for big $r$ ’s and decrease $f_{r}$ for small $r$ ’s, whenever possible.

We therefore let $r_{*}:=\lceil\frac{d}{2}\rceil$ , and consider an arbitrary fixed integer $\ell$ such that $1\leq\ell\leq\binom{d}{r_{*}}$ . Let $F_{\ell}$ be the set of all $f\in F$ such that $f_{r_{*}}=\ell$ .

We now aim to find the $f$ that achieves $\min_{f\in F_{\ell}}w(f)$ . Let $\mathbf{a}$ be the $r_{*}$ -cascade of $\ell$ .

By Kruskal-Katona, for any $r>r_{*}$ we have that $f_{r}\leq\binom{a_{0}}{r}+\ldots+\binom{a_{j}-j}{r-j}$ and that this bound is tight. So we can assume wlog that this holds with equality for all such $r$ . Similarly, for any $r<r_{*}$ we have that wlog $f_{r}=\binom{a_{0}}{r}+\ldots+\binom{a_{j}-j}{r-j}$ .

So for any $f$ that minimizes $w(f)$ over $F_{\ell}$ , we have that $w(f)$ is determined by the $r_{*}$ -cascade of $f_{r_{*}}$ . It follows that for any non-increasing sequence of non-negative integers $\mathbf{a}$ , we can define a simplicial vector $f(\mathbf{a})$ by letting, for each $r$ ,

[TABLE]

This will be the unique minimizer (in $F_{\ell}$ ) of $w$ . ∎

Since $f(\mathbf{a})$ satisfies Kruskal-Katona (by design), there exists a simplicial complex $\Delta$ (not necessarily unique) with simplicial vector $f(\mathbf{a})$ . The size of $\Delta$ is given by

[TABLE]

where the last inequality comes from noting that $2^{a_{i}-i}\leq 2^{a_{0}-i}$ . But $|\Delta|$ cannot be larger than $2^{d-1}$ , because $\Delta$ and $-\Delta$ are disjoint. (By definition, $\Delta=g^{-1}(1)$ , and by symmetry $-\Delta:=g^{-1}(-1)$ .) It follows that either $a_{0}=d-1$ and $j=0$ , or $a_{0}\leq d-2$ . In the former case, $|\Delta|=2^{d-1}$ , and in the latter case $|\Delta|<2^{d-1}$ regardless of the values of $a_{1},a_{2}\ldots a_{j}$ and $j$ .

Claim 9:

For a non-increasing sequence $\mathbf{a}=(a_{0},\ldots,a_{j})$,

[TABLE]

and $S$ is a strictly decreasing function of both of its arguments.

Proof. Consider a subcube of $\Sigma_{d}$ whose lowest corner is at level $i$ and whose highest corner is at level $a_{i}$. For any $r$, this cube will have $\binom{a_{i}-i}{r-i}$ vertices at level $r$, exactly matching the $i$th term in the $r$-cascade of $f_{r}$. One such cube is

[TABLE]

The probability of a $b$ -biased random string following the pattern above is precisely ${(\frac{1}{2}-b)^{i}(\frac{1}{2}+b)^{d-a_{i}}}$ , so its $w$ -weight will be

[TABLE]

The claim follows. ∎
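The counting step in the proof of Claim 9 can be checked by brute force. The sketch below is illustrative only and rests on assumptions not fixed by the text above: vertices are encoded as 0/1 strings, the level of a vertex is its number of ones, the subcube fixes the first $i$ coordinates to 1 and the last $d-a_{i}$ to 0, and a coordinate equals 1 with probability $\frac{1}{2}-b$. The helper names are hypothetical.

```python
from itertools import product
from math import comb, isclose

def subcube(d, i, a_i):
    """Vertices of {0,1}^d with the first i coordinates fixed to 1,
    the last d - a_i fixed to 0, and the middle a_i - i coordinates free."""
    return [(1,) * i + mid + (0,) * (d - a_i)
            for mid in product((0, 1), repeat=a_i - i)]

def level_counts(vertices, d):
    """Number of vertices at each level 0..d (level = number of ones)."""
    counts = [0] * (d + 1)
    for v in vertices:
        counts[sum(v)] += 1
    return counts

def prob(v, b):
    """Probability of the string v when each coordinate is 1 w.p. 1/2 - b (assumption)."""
    p = 1.0
    for x in v:
        p *= (0.5 - b) if x == 1 else (0.5 + b)
    return p

d, i, a_i, b = 6, 2, 5, 0.1
cube = subcube(d, i, a_i)
counts = level_counts(cube, d)
# Level r holds C(a_i - i, r - i) vertices for i <= r <= a_i and none elsewhere.
expected = [comb(a_i - i, r - i) if i <= r <= a_i else 0 for r in range(d + 1)]
print(counts == expected)
# Total mass of the subcube equals (1/2 - b)^i (1/2 + b)^(d - a_i).
print(isclose(sum(prob(v, b) for v in cube), (0.5 - b) ** i * (0.5 + b) ** (d - a_i)))
```

The free coordinates contribute a factor of 1 to the total probability, which is why only the $i$ fixed ones and the $d-a_{i}$ fixed zeros appear in the product.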

Claim 10:

The minimum of $w(f(\mathbf{a}))$ is achieved either by $\mathbf{a}=(a)$ or by $\mathbf{a}=\underbrace{(a,\ldots,a)}_{d+1}$ for some integer $a$.

Proof. We will use that $\sum_{i=0}^{j}2^{a_{i}-i}\leq 2^{d-1}$ . If $a_{0}=d-1$ we must have $\mathbf{a}$ equal to the $1$ -term sequence $(d-1)$ , because $2^{a_{0}}=2^{d-1}$ is already as large as the sum can be. So assume instead that $a_{0}\leq d-2$ .

First, assume $\mathbf{a}$ is not a constant sequence, seeking a contradiction. Then there exists an $i$ with $a_{i-1}>a_{i}$. Let $\mathbf{a}^{\prime}$ be $\mathbf{a}$ but with $a_{i}$ replaced by $a_{i}+1$. (I.e. $\mathbf{a}^{\prime}:=a_{0},\ldots,a_{i-1},a_{i}+1,a_{i+1},\ldots,a_{j}$.) The sequence $\mathbf{a}^{\prime}$ is still non-increasing, and by Claim 9, $w(f(\mathbf{a}^{\prime}))<w(f(\mathbf{a}))$.

Next, assume $a=a_{1}=a_{2}=\ldots=a_{j}$ for some $0<j<d$ , again seeking a contradiction. Then, if $S(j,a)>0$ , decrease the length of $\mathbf{a}$ by $1$ , which decreases $w$ by $S(j,a)$ . If instead $S(j,a)\leq 0$ , increase the length of $\mathbf{a}$ by $1$ , which decreases $w$ by $-S(j+1,a)>-S(j,a)\geq 0$ . ∎

We are now left with only the following candidates for $\mathbf{a}$ : $(a)$ , for some $a\leq d-1$ , or $(a,\ldots,a)$ for some $a\leq d-2$ . It is easy to check that $w(f(a))<w(f(a-1,a-1,\ldots,a-1))$ . Furthermore, $w(f(a))=(\frac{1}{2}+b)^{d-a}-(\frac{1}{2}-b)^{d-a}$ is decreasing in $a$ , so the minimum is achieved for $a=d-1$ . It follows that

[TABLE]

∎

We now have all the ingredients necessary to prove Theorem 9.

Proof of Theorem 9. Let $t=\psi(b)$ be the implicit function defined by $P(t,b)=\frac{1}{2}$ . The derivative of $\psi$ is given by

[TABLE]

By Proposition 17, $\partial P/\partial b=nkt(\rho_{+}-\rho_{-})$ and $\partial P/\partial t=n\rho$ . So $\psi^{\prime}=\frac{(\rho_{+}-\rho_{-})}{\rho}k\psi$ . But by Lemma 18, $4b\leq\frac{\rho_{+}-\rho_{-}}{\rho}\leq\frac{4tk^{2}}{\delta}b$ , which leads to the following pair of differential inequalities:

$4kb\,\psi(b)\leq\psi^{\prime}(b)\leq\frac{4k^{3}}{\delta}\,b\,\psi(b)^{2}.\qquad(6)$

We will relax these inequalities slightly. First, we see that $\psi^{\prime}>0$ , so $\psi(b)\geq\psi(0)$ . By Lemma 14, $\delta=\delta(t)$ is a decreasing function, and we may replace the function $\delta$ with the constant $\delta_{0}:=\delta(\psi(0))\geq\exp(-O(k))$ in eq. 6.

This leads to the differential inequality $\psi^{\prime}(b)\leq\frac{4k^{3}}{\delta_{0}}\psi^{2}(b)b$ , which has the solution $\psi(b)\leq{\psi(0)}\big{(}1-\frac{2k^{3}\psi(0)}{\delta_{0}}b^{2}\big{)}^{-1}$ . Noting that $\psi(0)\leq 2^{k}\log 2$ , we see that $\psi(b)<2^{k}$ for small enough $b$ . We may therefore also replace $\psi^{2}$ with $2^{k}\psi$ in the right-hand side of eq. 6, leading to

$4kb\,\psi(b)\leq\psi^{\prime}(b)\leq\frac{2^{k+2}k^{3}}{\delta_{0}}\,b\,\psi(b),$

and this new pair of differential inequalities has the solution

$\psi(0)\exp\big(2kb^{2}\big)\leq\psi(b)\leq\psi(0)\exp\Big(\frac{2^{k+1}k^{3}}{\delta_{0}}b^{2}\Big).$

Using the lower bound $\delta_{0}\geq\exp(-O(k))$ again, the theorem follows. ∎
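The closed forms used in the proof of Theorem 9 can be sanity-checked numerically. The sketch below integrates the two extremal equations $\psi^{\prime}=c\,b\,\psi^{2}$ and $\psi^{\prime}=c\,b\,\psi$ with a hand-written Runge-Kutta step and compares against $\psi(0)\big(1-\tfrac{c}{2}\psi(0)b^{2}\big)^{-1}$ and $\psi(0)\exp\big(\tfrac{c}{2}b^{2}\big)$; with $c=4k^{3}/\delta_{0}$ the first expression is the bound quoted in the proof. The numerical values of `C`, `PSI0` and `B` are hypothetical stand-ins chosen for readability, not values from the paper.

```python
import math

def rk4(rhs, y0, b_end, steps=2000):
    """Integrate y'(b) = rhs(b, y) from b = 0 to b_end with classical RK4."""
    h = b_end / steps
    b, y = 0.0, y0
    for _ in range(steps):
        k1 = rhs(b, y)
        k2 = rhs(b + h / 2, y + h * k1 / 2)
        k3 = rhs(b + h / 2, y + h * k2 / 2)
        k4 = rhs(b + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        b += h
    return y

C = 3.0      # hypothetical stand-in for the rate constant, e.g. 4 k^3 / delta_0
PSI0 = 1.5   # hypothetical stand-in for psi(0)
B = 0.2      # hypothetical bias, small enough that 1 - C * PSI0 * B^2 / 2 > 0

# Quadratic right-hand side: psi' = C b psi^2, solved by separation of variables.
quadratic_numeric = rk4(lambda b, y: C * b * y * y, PSI0, B)
quadratic_closed = PSI0 / (1 - 0.5 * C * PSI0 * B ** 2)

# Linear right-hand side: psi' = C b psi, solved by integrating psi'/psi.
linear_numeric = rk4(lambda b, y: C * b * y, PSI0, B)
linear_closed = PSI0 * math.exp(0.5 * C * B ** 2)

print(abs(quadratic_numeric - quadratic_closed), abs(linear_numeric - linear_closed))
```

Both closed forms come from dividing by the appropriate power of $\psi$ and integrating in $b$, which is where the factor $b^{2}/2$ originates.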

Bibliography

[1] D. Achlioptas. Lower bounds for random 3-SAT via differential equations. Theoretical Computer Science, 265(1-2):159–185, 2001.
[2] D. Achlioptas and C. Moore. Random $k$-sat: Two moments suffice to cross a sharp threshold. SIAM J. Comput., 36(3):740–762, Sept. 2006.
[3] P. Austrin. Balanced max 2-sat might not be the hardest. In Proceedings of the Thirty-ninth Annual ACM Symposium on Theory of Computing, STOC ’07, pages 189–197, 2007.
[4] B. Bollobás, C. Borgs, J. T. Chayes, J. H. Kim, and D. B. Wilson. The scaling window of the 2-SAT transition. Random Structures and Algorithms, 18(3):201–256, 2001.
[5] M.-T. Chao and J. Franco. Probabilistic analysis of two heuristics for the 3-satisfiability problem. SIAM Journal on Computing, 15(4):1106–1118, 1986.
[6] V. Chvátal and E. Szemerédi. Many hard examples for resolution. Journal of the Association for Computing Machinery, 35(4):759–768, 1988.
[7] V. Chvátal and B. Reed. Mick gets some (the odds are on his side). In Foundations of Computer Science, 1992. Proceedings., 33rd Annual Symposium on, pages 620–627, 1992.
[8] A. Coja-Oghlan. The asymptotic k-SAT threshold. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC ’14, pages 804–813, 2014.
