Sharp thresholds in inference of planted subgraphs
Elchanan Mossel, Jonathan Niles-Weed, Youngtak Sohn, Nike Sun, Ilias Zadik

TL;DR
This paper investigates the sharp threshold phenomena in the inference of planted subgraphs within Erdős–Rényi graphs, establishing a connection between all-or-nothing inference transitions and generalized expectation thresholds.
Contribution
It introduces a unified framework linking sharp threshold behavior in graph inference to first moment and expectation thresholds, extending understanding of the AoN phenomenon.
Findings
AoN occurs iff the generalized expectation threshold is roughly constant in q
Characterization of AoN via first moment and expectation thresholds
Bridging random graph theory and Bayesian inference techniques
Elchanan Mossel⋆∘, Jonathan Niles-Weed†, Youngtak Sohn⋆, Nike Sun⋆, and Ilias Zadik⋆
Abstract.
We connect the study of phase transitions in high-dimensional statistical inference to the study of threshold phenomena in random graphs.
A major question in the study of the Erdős–Rényi random graph G(n,p) is to understand the probability, as a function of p, that G(n,p) contains a given subgraph H = H_n. This was studied for many specific examples of H, starting with classical work of Erdős and Rényi (1960). More recent work studies this question for general H, both in building a general theory of sharp versus coarse transitions (Friedgut and Bourgain 1999; Hatami, 2012) and in results on the location of the transition (Kahn and Kalai, 2007; Talagrand, 2010; Frankston, Kahn, Narayanan, Park, 2019; Park and Pham, 2022).
In inference problems, one often studies the optimal accuracy of inference as a function of the amount of noise. In a variety of sparse recovery problems, an “all-or-nothing (AoN) phenomenon” has been observed: Informally, as the amount of noise is gradually increased, at some critical threshold the inference problem undergoes a sharp jump from near-perfect recovery to near-zero accuracy (Gamarnik and Zadik, 2017; Reeves, Xu, Zadik, 2021). We can regard AoN as the natural inference analogue of the sharp threshold phenomenon in random graphs. In contrast with the general theory developed for sharp thresholds of random graph properties, the AoN phenomenon has only been studied so far in specific inference settings, and a general theory behind its appearance remains elusive.
In this paper we study the general problem of inferring a graph H = H_n planted in an Erdős–Rényi random graph, thus naturally connecting the two lines of research mentioned above. We show that questions of AoN are closely connected to first moment thresholds, and to a generalization of the so-called Kahn–Kalai expectation threshold that scans over subgraphs of H of edge density at least q. In a variety of settings we characterize AoN, by showing that AoN occurs if and only if this “generalized expectation threshold” is roughly constant in q. Our proofs combine techniques from random graph theory and Bayesian inference.
⋆Department of Mathematics, MIT; ∘MIT Institute for Data, Systems, and Society; †Center for Data Science & Courant Institute of Mathematical Sciences, NYU. Email: {elmos,youngtak,nsun,izadik}@mit.edu; [email protected]
Contents
- 2.2 Statement of AoN characterization for sufficiently dense graphs
- 3 AoN in a general Bernoulli model: definitions and key sufficient conditions
- 3.3 Proofs for “all” regime by truncated first moment in planted model
- 3.4 Proofs for “nothing” regime by truncated second moment in null model
- 3.5 Connections to spread: second Kahn-Kalai conjecture in inference
- 4 Characterization of AoN at linear scale for sufficiently dense graphs
- 4.1 Theorem 2.5 forward direction: AoN implies almost-balanced
- 4.2 Theorem 2.5 reverse direction: almost-balanced implies AoN
- 5.3 Theorem 5.3 forward direction: AoN implies first-moment-flat
- 5.4 Theorem 5.3 reverse direction: first-moment-flat implies AoN
1. Introduction
We consider the statistical model of a graph planted uniformly at random in an Erdős–Rényi random graph . That is to say, the observation is the union of a uniformly random copy of in the complete graph (the signal) together with a sample from the Erdős–Rényi measure (the noise). Given the observation, the goal is to approximately recover the hidden signal , where recovery is measured in terms of the fraction of correctly recovered edges (see (2.1)). The model is formally specified in Definition 2.1 below. This paper is concerned with the characterization of sharp information-theoretic thresholds in this inference problem.
Perhaps the most canonical such setting in the literature is the planted clique model, where is a clique on vertices and [Jer92]. It is a folklore result that exact recovery of is possible when , and impossible when . However, to the best of our knowledge, the natural question of whether one can recover a constant fraction of when has not been previously considered, although strong impossibility results have been established in this regime for the slightly different detection framework [ACV14]. We note that other specific choices of subgraphs have been studied in this literature, including the case where is a tree [MST19], or a Hamiltonian cycle [BDT+20].
Obtaining a more refined understanding of statistical recovery guarantees in such models is further motivated by a growing body of recent work, initiated by [GZ22] and [RXZ21], which reveals that several high-dimensional Bayesian estimation models exhibit a sharp “all-or-nothing” (AoN) phase transition: a very slight change in the signal-to-noise ratio separates a regime where one can recover almost all of the hidden signal (the “all” phase) from a regime where recovering even a constant fraction of the signal is impossible (the “nothing” phase). This is contrary to prior intuition derived from high-dimensional models where the transition is much smoother, e.g., compressed sensing or generalized linear models in the “proportional regime” [RP19, BKM+19].
The underlying fundamental reasons why some inference models exhibit AoN, while others do not, remain largely unknown, to the best of our knowledge. This motivates us to study the general planted subgraph setting and ask:
*Which choices of hidden graphs lead to a sharp AoN transition? How do the graph-theoretic properties of the hidden graph relate to sharp statistical phenomena?*
Notably, the study of sharp thresholds for the occurrence of specific subgraphs in the “null” (no hidden subgraph) Erdős–Rényi random graph model has a long and celebrated history dating back to [ER60]. This literature has recently led to the striking resolution of the Kahn–Kalai conjecture [KK07, Conjecture 1], which approximately locates the critical threshold for general monotone properties [FKNP21, PP22]. The community’s notable understanding of transitions in the “null” model raises the possibility of better understanding information-theoretic transitions in the “planted” (hidden signal) models:
*How do the well-studied threshold phenomena in the “null” Erdős–Rényi model relate to statistical threshold phenomena in the “planted” model?*
Our work is largely driven by this question.
1.1. An overview
In this paper, we aim to characterize the graph sequences for which the sharp all-or-nothing phenomenon occurs in the associated inference problem. While the planted clique model is commonly formulated with fixed and varying, for general we adopt the more suitable perspective that we fix and vary . Thus, given , we ask whether there exists a critical value such that when it is (information-theoretically) possible to recover an -fraction of edges of the planted subgraph (the “all” phase), while when it is impossible to recover any nontrivial fraction of the planted subgraph (the “nothing” phase). In other words, there is no intermediate “something” phase where one can recover a non-trivial fraction of the edges, but a non-trivial fraction must also be missed.
In this work we are able to characterize the occurrence of AoN in large families of planted subgraph models via a connection with a generalization of the expectation threshold of [KK07]. For a given , the expectation threshold is intended to approximate the critical threshold at which becomes likely to contain a copy of . To make this more precise, for any graph we define the first moment threshold to be the minimum such that contains at least one copy of in expectation. The expectation threshold is the maximum first moment threshold among all subgraphs , and the “second Kahn–Kalai conjecture” [KK07, Conjecture 2] posits that this is within a logarithmic factor of the threshold of interest . The second Kahn–Kalai conjecture has been proved for bounded graphs , see e.g. [Ruc87, Theorem 4], but remains open in general. (It is not implied by the Kahn–Kalai conjecture for monotone properties that was mentioned above, see also [MNWSZ22a]).
In this work, we define for the generalized expectation threshold to be the maximum first moment threshold among all subgraphs such that contains at least a fraction of the edges of (Definition 2.4). Then is exactly the Kahn–Kalai expectation threshold. Our main result is a characterization of AoN for a large class of hidden graphs based on structural properties of . We now state our main finding informally as follows:
Theorem 1.1 (Main result, informally stated).
For various families of graph sequences , the model of planted in exhibits AoN if and only if is asymptotically constant as a function of .
See Theorem 2.5 for the formal statement. As a corollary, we deduce for example that the model of a -clique planted in exhibits AoN if and only if is diverging with (see Corollary 3.5 in the Appendix). Our main result as stated above may not appear readily intuitive. For this reason, while stating our theorems in the following sections, we present several illustrative examples. Moreover, in Section 2.4 we give general intuition by explaining how established results from random graph theory, along with the planting trick from the theory of random constraint satisfaction problems, suggest such a connection.
It should be finally noted that a related, but incomparable, general investigation was initiated by [Hul22] for the information-theoretic limits of inferring a hidden induced subgraph in . By contrast, the observation in our setting is the union of with a hidden copy of , i.e., the hidden copy need not be an induced subgraph of . This difference makes our results incomparable; see also [Hul22, Section B.1] for more discussion on differences between these two models.
1.2. Further motivations
As indicated above, the problem of recovering a hidden graph is directly connected to two major lines of research:
Threshold phenomena in random graphs. Starting from the work of [ER60], a major research goal in the field of random graphs has been to understand, for any given graph , for which values of is likely to appear in an Erdős–Rényi graph . Note that may be a fixed graph, such as a triangle [ER60], but it can also be a graph whose size and structure depends on , such as a perfect matching [ER60] or Hamilton cycle [Pós76, Kor76]. Recent results on this question mainly go in one of two directions:
First, results in discrete Fourier analysis [Fri98, Fri99, Hat12] characterize general settings in which the transitions are sharp or coarse. A sharp transition means that the probability for to contain a copy of is near zero for , and near one for . A coarse transition means that the probability stays bounded away from zero and one for a non-trivial range of values . The known characterizations of coarse thresholds can be interpreted as “low complexity” conditions, while sharp thresholds correspond to properties that do not have witnesses of low complexity. There have been a number of conjectures relating sharp thresholds in graphs to computational complexity, see e.g. [KS06].
A more recent line of research aims to roughly identify the location of the threshold in terms of (variants of) the Kahn–Kalai expectation threshold [KK07, Tal10]. This will be further discussed below. We note that the Fourier analysis literature mostly does not address the location of , while the expectation threshold literature mostly does not address the sharpness of the transition. The optimal results on the Kahn–Kalai conjectures are tight only up to logarithmic factors [FKNP21, PP22].
Threshold behavior of inference problems (all-or-nothing phenomena). AoN was first identified in the context of sparse linear regression [GZ22, RXZ21], and has since been established for numerous other models, including sparse tensor PCA [NWZ20], Bernoulli group testing [TAS20, NWZ21, COGHK+22], and random graph matching [WXS22]. A striking observation is that AoN arises for several models which are conjectured to exhibit a statistical-computational gap, although no rigorous connections currently exist.
The setting studied by [MR+20, NWZ20] consists of Bayesian inference problems where one observes a rank-one spike corrupted by Gaussian noise. In this setting, [NWZ20] give general sufficient conditions on the prior distribution under which AoN occurs, as a function of the noise variance . These conditions amount to a quantitative “anti-concentration” requirement that independent draws from the prior are unlikely to be highly correlated.
The models considered by [RXZ21, TAS20, LMB22, NWZ21, COGHK+22] are generalized linear models. As in the Gaussian setting, when the prior satisfies a suitable anti-concentration condition, AoN occurs (here, as a function of the number of observations). AoN for a Bernoulli model with added noise was established by [WXS22], who consider the problem of recovering the correspondence between a pair of correlated random graphs with randomly permuted vertex labels. In this last case, AoN arises as a function of the random graph density.
The similarities between sharp thresholds and AoN phenomena are quite clear. However, the settings are very different. In the random graph setting we are looking for a graph that appears at random, while in inference, there is a true signal that is corrupted by random noise. In this paper we connect the two by studying the inference problem in the random graph setting. A major theme of this paper is that all-or-nothing phenomena in inference models are closely connected with the behavior of (variants of) the expectation threshold in the corresponding null models; see below for more discussion.
From a technical standpoint, the problem studied in this paper also naturally brings together two communities: one studying properties of random graphs and the other studying inference in high dimensions. Interestingly, the proofs in the paper combine ideas and techniques from both communities: some of our conditions are stated in terms of the expectation thresholds of subgraphs of varying sizes, thus refining definitions from [KK07]. As is customary in the study of expectation thresholds, our proofs use variants of the second moment method but also recent ideas in the community, such as the one used for the “spread lemma” [ALWZ21, FKNP21]. However, we also heavily use the Bayesian perspective, in particular the planting trick that we borrow from the study of planted constraint satisfaction problems, see also [ACO08, COGHK+22]. Other concepts from high-dimensional inference such as the I-MMSE relation from information theory [GWSV11] and the Nishimori identity from statistical physics [Nis01] also play a role in some of the proofs.
We finally note a related line of work in coding theory. The goal of coding theory is to recover codewords sent over a noisy channel. The analogy to the model we study in the paper is quite clear. Our “codewords” are the planted graphs, and the “channel” is the operation of taking a union with a sample from the Erdős–Rényi measure. Many results in coding theory can be viewed as AoN statements (but for probability error metrics, not partial recovery metrics) as they show that the decoding error probability jumps sharply from zero to one as a function of the channel noise. A recent striking example was established in [KKM+16] for the Reed–Muller code and the erasure channel. An interesting aspect of their example is the use of variants of the KKL theorem [KKL88] from discrete Fourier analysis to prove a sharp threshold corresponding to AoN, which in turn allows one to prove that these codes achieve the capacity of the channel.
2. Main results
In this section we state our first main result, Theorem 2.5, which characterizes the occurrence of AoN (at linear scale) in sufficiently dense graphs under mild technical assumptions, and describe in high level the rest of our results. This section is organized as follows:
- In §2.1 we formalize the planted subgraph model, and define the generalized expectation thresholds which were informally introduced above.
- In §2.2 we give the statement of Theorem 2.5, along with several motivating examples.
- In §2.3 we give an overview of some of our results beyond the setting of AoN at linear scale for sufficiently dense graphs. We also present several relevant examples.
- In §2.4 we offer some intuition behind our main result, based on known results in the random graphs literature.
2.1. Generalized expectation thresholds
We begin by formalizing the planted subgraph model discussed above:
Definition 2.1 (planted subgraph model).
Let be a given graph. We will abbreviate for the number of vertices of , and for the number of edges of . We always assume . Let be the set of (isomorphic) copies of in the complete graph , and let be the uniform probability measure over . We work in the so-called planted model , in which the observation is where and is an independent sample from . The goal is to recover from . For comparison, we also introduce the null model where there is no hidden , and we observe simply .
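To make Definition 2.1 concrete, here is a minimal Python sketch of sampling from the null and planted models. The function names, the edge-list input format, and the symbols `n` (number of vertices) and `p` (Erdős–Rényi edge probability) are assumptions made for illustration, not notation taken from the paper.

```python
import itertools
import random


def sample_noise(n, p, rng):
    """Null model: each of the C(n, 2) possible edges appears independently with probability p."""
    return {e for e in itertools.combinations(range(n), 2) if rng.random() < p}


def sample_planted(n, h_edges, p, rng):
    """Planted model: union of a uniformly random copy of H in K_n with independent noise.

    h_edges lists the edges of H on vertices 0..v-1 (an assumed input format).
    Returns (hidden_copy, observation), both as sets of sorted vertex pairs.
    """
    v = 1 + max(max(e) for e in h_edges)
    # A uniformly random injection V(H) -> {0..n-1} induces a uniformly random copy of H,
    # since every copy arises from exactly |Aut(H)| injections.
    image = rng.sample(range(n), v)
    hidden = {tuple(sorted((image[a], image[b]))) for a, b in h_edges}
    return hidden, hidden | sample_noise(n, p, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    triangle = [(0, 1), (1, 2), (0, 2)]
    hidden, observed = sample_planted(8, triangle, 0.3, rng)
    print(sorted(hidden), len(observed))
```

The hidden copy is always contained in the observation, but it is in general not an induced subgraph of it, because noise edges may land between its vertices.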
Throughout the paper we identify all graphs on vertices, e.g., the instances of , with their naturally corresponding binary vectors in . Our measure of “recovery” in the planted subgraph model is the fraction of correctly recovered edges, which naturally corresponds to the minimum mean squared error (MMSE):
[TABLE]
Since almost surely, the MMSE must lie in , so we always normalize it by in what follows. It is well known that the MMSE is nondecreasing in , and we review the short proof in Lemma 3.2 below. In light of this, it is natural to ask in what situations the MMSE has a sharp transition. To this end, following the AoN literature, we make the following definition:
Definition 2.2 (all-or-nothing).
We say that the model from Definition 2.1 exhibits an all-or-nothing (AoN) transition at critical probability if
[TABLE]
for any constant .
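For very small instances the normalized MMSE can be estimated by brute force, which may help build intuition for the two extremes in Definition 2.2. The sketch below assumes that the MMSE is the expected squared distance between the hidden edge-indicator vector and its posterior mean, and that it is normalized by the number of edges of the planted graph. It relies on the fact that, given the observation, every copy of the planted graph contained in it has the same likelihood, so the posterior is uniform over those copies. All names are hypothetical.

```python
import itertools
import random


def all_copies(n, h_edges):
    """Every distinct copy of H in K_n, as a sorted tuple of sorted vertex pairs."""
    v = 1 + max(max(e) for e in h_edges)
    copies = set()
    for image in itertools.permutations(range(n), v):
        copies.add(tuple(sorted(tuple(sorted((image[a], image[b]))) for a, b in h_edges)))
    return sorted(copies)


def normalized_mmse(n, h_edges, p, trials, rng):
    """Monte Carlo estimate of E||H* - E[H* | G]||^2 / e(H) by exhaustive posterior enumeration."""
    copies = [frozenset(c) for c in all_copies(n, h_edges)]
    all_edges = list(itertools.combinations(range(n), 2))
    total = 0.0
    for _ in range(trials):
        hidden = rng.choice(copies)
        observed = hidden | {e for e in all_edges if rng.random() < p}
        # Posterior given G: uniform over the copies of H that are contained in G.
        consistent = [c for c in copies if c <= observed]
        for e in all_edges:
            posterior_mean = sum(e in c for c in consistent) / len(consistent)
            total += ((e in hidden) - posterior_mean) ** 2
    return total / (trials * len(h_edges))


if __name__ == "__main__":
    rng = random.Random(1)
    triangle = [(0, 1), (1, 2), (0, 2)]
    for p in (0.0, 0.2, 0.5, 0.8, 1.0):
        print(p, normalized_mmse(6, triangle, p, 200, rng))
```

Enumeration is only feasible for tiny `n`, so this illustrates the definition rather than the asymptotic transition.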
As mentioned above, a theme of this paper is that the location of is closely connected to first moment thresholds of the subgraphs of in the null model . For a given subgraph of the first moment threshold is the smallest value of such that the expected number of copies of in a sample from the null model is at least one. More formally, for let denote the number of copies of that are contained in . We define the first moment threshold to be the value of that satisfies
[TABLE]
that is, , where , the number of copies of in .
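The first moment threshold can be computed exactly for small graphs. The sketch below uses the standard count of copies of a graph with v vertices in the complete graph on n vertices, namely n(n-1)...(n-v+1) divided by the number of automorphisms, and solves "expected number of copies equals one" for the edge probability. Function names and the edge-list format (vertices labelled 0..v-1, no isolated vertices) are assumptions for illustration.

```python
import itertools
import math


def automorphism_count(edges):
    """Brute-force |Aut(J)| for a small graph J given by edges on vertices 0..v-1."""
    v = 1 + max(max(e) for e in edges)
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in itertools.permutations(range(v)):
        if {frozenset((perm[a], perm[b])) for a, b in edges} == edge_set:
            count += 1
    return count


def copies_in_complete_graph(n, edges):
    """Number of copies of J in K_n: falling factorial (n)_v divided by |Aut(J)|."""
    v = 1 + max(max(e) for e in edges)
    return math.perm(n, v) // automorphism_count(edges)


def first_moment_threshold(n, edges):
    """The p solving (number of copies) * p^(number of edges) = 1."""
    return copies_in_complete_graph(n, edges) ** (-1.0 / len(edges))


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(copies_in_complete_graph(10, triangle), first_moment_threshold(10, triangle))
```

The brute-force automorphism count is factorial in the number of vertices, so this is only meant for graphs with a handful of vertices.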
In Theorem 2.5 below, we characterize the occurrence of AoN for a class of planted subgraphs that are sufficiently dense, meaning more precisely that and
[TABLE]
in the limit (Definition 4.1).
We start our quest with identifying the AoN threshold for this class of planted subgraphs. In view of the classical random graph theory, the first natural attempt would be to ask whether coincides with the first moment threshold (2.2) of the subgraph . The answer is no, as we illustrate with Example 2.3 below. In fact, the possibility of is closely related to ideas underlying the Kahn–Kalai expectation threshold [KK07] and the threshold found by Theorem 2.5 turns out to indeed be a variant of the Kahn–Kalai expectation threshold.
Let us take a moment to discuss the first moment threshold (2.2) in more detail. Note that is the number of copies of in :
[TABLE]
where denotes the automorphism group of . It follows that
[TABLE]
for any ; and the sharpness of this trivial lower bound depends on the size of the automorphism group of . However, for graphs that are sufficiently dense (in the sense of (2.3)), the factor becomes negligible when raised to the power , so for such graphs we obtain the simplification
[TABLE]
This is formalized in Lemma 4.2 below.
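As a numerical illustration of this simplification, consider a clique on k vertices, which has k! automorphisms and C(k,2) edges. Its exact first moment threshold is C(n,k)^(-1/C(k,2)), while the trivial lower bound is n^(-k/C(k,2)). The sketch below, with hypothetical function names and working in logarithms to avoid overflow, computes the ratio of the two; the choice of cliques and of the values of n and k is ours, purely for illustration.

```python
import math


def log_clique_first_moment_threshold(n, k):
    """log of C(n, k)^(-1 / C(k, 2)), the first moment threshold of a k-clique."""
    log_copies = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return -log_copies / math.comb(k, 2)


def log_trivial_lower_bound(n, k):
    """log of n^(-v/e) with v = k vertices and e = C(k, 2) edges."""
    return -k * math.log(n) / math.comb(k, 2)


def threshold_ratio(n, k):
    """Exact first moment threshold divided by the trivial lower bound (always >= 1)."""
    return math.exp(log_clique_first_moment_threshold(n, k) - log_trivial_lower_bound(n, k))


if __name__ == "__main__":
    n = 10 ** 6
    for k in (3, 10, 100, 1000):
        print(k, threshold_ratio(n, k))
```

The ratio is noticeably larger than one for a triangle and approaches one as k grows, reflecting that the automorphism factor becomes negligible once it is raised to the power of one over the number of edges.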
Example 2.3 (AoN and first moment thresholds can differ).
Let be a clique on vertices . We let be obtained from as follows: take additional vertices , and form an edge between vertex and vertex for each . Thus , , and . We consider the model of planted in (Definition 2.1). Assume , and note that contains most of the edges of . Since we are interested in edge recovery we expect that the inference problem for exhibits AoN at (cf. Corollary 3.5); and indeed we prove this in Theorem 2.5 below (see also Example 2.6). The first moment threshold for is much smaller than that of , so it does not match the AoN transition:
[TABLE]
(The threshold can be obtained by direct calculation, or by appealing to (2.6) or Lemma 4.2.)
The problem illustrated by Example 2.3 is closely related to the ideas underlying the Kahn–Kalai conjectures. Recall from above that the basic assertion of these conjectures is that while may be far from , there must be a subgraph for which is not too far from . Clearly, this is highly analogous to Example 2.3, where the AoN transition is driven by the clique . That is to say, the transition can be estimated by the expectation threshold
[TABLE]
In particular, the “second Kahn–Kalai conjecture” [KK07, Conjecture 2] posits that
[TABLE]
where the lower bound is trivial, and the logarithmic factor is known to be necessary. We discuss this further in Example 2.9 below.
Analogously to the Kahn–Kalai conjectures, it is natural to ask whether is related to . In this paper we study the question of locating up to factors, as opposed to logarithmic factors. At this level of precision, it turns out that does not necessarily coincide with , as illustrated by Example 2.8 below. One reason is that, in the context of AoN, since we are interested in recovery of almost all or almost none of the edges, we expect that only linear-sized subgraphs of should be relevant to the transition. For this reason, we define a slight generalization of the expectation threshold which turns out to be more relevant to the AoN question:
Definition 2.4 (generalized expectation threshold).
For and a given graph , define the -constrained expectation threshold to be the largest first moment threshold among subgraphs of with at least fraction of the edges. That is,
[TABLE]
In particular, is the same as the Kahn–Kalai expectation threshold .
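Definition 2.4 can be evaluated by exhaustive search when the graph has only a few edges. The following sketch enumerates every nonempty edge subset containing at least a fraction `q` of the edges and returns the largest first moment threshold. The names, the symbol `q` for the edge fraction, and the toy graph (a 4-clique with one pendant edge, in the spirit of Example 2.3 but not its actual parameters) are assumptions for illustration.

```python
import itertools
import math


def first_moment_threshold(n, edges):
    """p solving (copies of J in K_n) * p^e(J) = 1, for J spanned by the given edges."""
    verts = sorted({x for e in edges for x in e})
    relabel = {x: i for i, x in enumerate(verts)}
    es = [(relabel[a], relabel[b]) for a, b in edges]
    edge_set = {frozenset(e) for e in es}
    # Brute-force automorphism count; fine for a handful of vertices only.
    aut = sum(
        1
        for perm in itertools.permutations(range(len(verts)))
        if {frozenset((perm[a], perm[b])) for a, b in es} == edge_set
    )
    copies = math.perm(n, len(verts)) // aut
    return copies ** (-1.0 / len(es))


def generalized_expectation_threshold(n, h_edges, q):
    """Max first moment threshold over subgraphs with at least a q fraction of H's edges."""
    e_h = len(h_edges)
    min_edges = max(1, math.ceil(q * e_h))  # q = 0 means all nonempty subgraphs
    best = 0.0
    for size in range(min_edges, e_h + 1):
        for sub in itertools.combinations(h_edges, size):
            best = max(best, first_moment_threshold(n, sub))
    return best


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    h = k4 + [(0, 4)]  # 4-clique plus one pendant edge (toy example)
    for q in (0.0, 0.5, 1.0):
        print(q, generalized_expectation_threshold(50, h, q))
```

Subgraphs with isolated vertices are skipped because adding isolated vertices only increases the number of copies and hence lowers the first moment threshold. In the toy example the maximum is attained by the clique for every `q` small enough to admit it, and drops only when the pendant edge is forced in.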
2.2. Statement of AoN characterization for sufficiently dense graphs
The following theorem characterizes AoN for sufficiently dense graphs , subject to the additional technical requirement that must be “delocalized,” meaning roughly that does not contain a sublinear sized subgraph that is particularly dense. More precisely, we require that must contain a subgraph with , such that has nearly maximal density among all subgraphs of :
[TABLE]
where and are constants not depending on . If is dense and satisfies for positive constants and , then is delocalized; see Definition 4.3 for details. For dense delocalized graphs, we have the following result, which greatly generalizes Example 2.3:
Theorem 2.5 (characterization of AoN for sufficiently dense graphs).
Suppose is sufficiently dense ((2.3) or Definition 4.1) and delocalized ((2.10) or Definition 4.3). Then the model of planted in exhibits AoN if and only if
[TABLE]
Moreover, in this case for any .
We say that is almost balanced if it satisfies condition (2.11) (this derives from the terminology of balanced graphs; see Example 2.7 below). We illustrate Theorem 2.5 with the following:
Example 2.6 (generalization of Example 2.3).
Suppose is sufficiently dense (as in (2.3)), and that there is a subgraph such that (i) contains most of the edges of , , and (ii) the first moment threshold of captures the expectation threshold of ,
[TABLE]
Then, for any , it follows from Definition 2.4 that This implies that is delocalized (see Definition 4.3), and satisfies condition (2.11). Therefore, Theorem 2.5 implies that the model of planted in exhibits AoN at . This generalizes Example 2.3.
Example 2.7 (dense balanced graphs).
Suppose is sufficiently dense (in the sense of (2.3)), and balanced in the sense that has maximal edge density among all its subgraphs [ER60, Bol81]:
[TABLE]
This implies that is delocalized ((2.10) or Definition 4.3). It follows from (2.6) or Lemma 4.2 that we have for all , so the almost-balanced condition (2.11) is satisfied. Therefore, it follows by Theorem 2.5 that the model of planted in exhibits AoN at . This implies the AoN result for the planted clique model (also established by another argument in Corollary 3.5 in the Appendix).
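For small graphs, balancedness can be checked exhaustively. The sketch below assumes the classical definition, in which edge density means the number of edges divided by the number of vertices and a graph is balanced when no subgraph has strictly larger density. The function names and the edge-list format are hypothetical.

```python
import itertools
from fractions import Fraction


def density(edges):
    """Edges per vertex of the subgraph spanned by the given edges."""
    verts = {x for e in edges for x in e}
    return Fraction(len(edges), len(verts))


def is_balanced(h_edges):
    """True if no nonempty subgraph of H has strictly larger density than H itself."""
    target = density(h_edges)
    return all(
        density(sub) <= target
        for size in range(1, len(h_edges) + 1)
        for sub in itertools.combinations(h_edges, size)
    )


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(is_balanced(k4), is_balanced(k4 + [(0, 4)]))
```

Exact rational arithmetic avoids spurious failures from floating-point comparison of equal densities. A clique is balanced, whereas attaching a pendant edge to it lowers the overall density below that of the clique and so breaks balancedness.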
Given Examples 2.6 and 2.7, one might ask if it is true that always equals the expectation threshold. We next present a simple example where the AoN and expectation thresholds differ:
Example 2.8 (AoN and expectation thresholds can differ).
Let be the disjoint union of where is a -clique while is a -clique for each , with . Then is dense, since , while
[TABLE]
Note that is the densest subgraph of , with . Using (2.6) or Lemma 4.2, we have the lower bound
[TABLE]
On the other hand, accounts for only a negligible fraction of the edges of . We will argue that if is lower bounded by any positive constant, then the density of cannot be much larger than . To this end, let us decompose where is the number of vertices in . Then
[TABLE]
We also trivially have , so in order for we must have . It follows that for all , the right-hand side above is roughly , and therefore
[TABLE]
Moreover, the bound is clearly asymptotically achieved by taking most of the to be either zero or . It follows that for we have
[TABLE]
It is straightforward to check that is delocalized ((2.10) or Definition 4.3), so it follows from Theorem 2.5 that this model exhibits AoN at , which is smaller than the expectation threshold .
2.3. Results beyond dense graphs
One reason that (sufficiently) dense graphs are easier to analyze is that in this case, the first moment threshold (2.2) can be approximated by the much simpler expression (2.6). We do not have a similarly strong characterization for general graphs, where we expect the order of the automorphism group to play a role. Indeed, in Example 2.11 below, we show that the dense assumption is necessary in Theorem 2.5. However, in this work we also present results that apply beyond the dense regime, under the following assumptions:
A general “nothing” phase. In §3.5 we give a general “nothing” result that applies to all planted subgraphs . In Theorem 3.15 we prove that when for all , where is the generalized expectation threshold of Definition 2.4, “nothing” holds for general graphs (i.e., ). This can be interpreted as an approximate variant of the second Kahn–Kalai conjecture, since for “nothing” to appear the “noise” must have created an approximate copy of that is nearly disjoint from the signal. See Example 2.9 below for further discussion and §3.5 for more relevant references.
AoN for small sparse graphs. In §4.3 we prove a result for the sparse regime. In Theorem 4.9 we prove AoN for models where the planted subgraph is small (of size ), sparse, and strongly balanced (Definition 4.8). The latter is a slight generalization of the notion introduced by [RV86], and is more restrictive than the balanced condition appearing in Example 2.7. It follows from Theorem 4.9 that if is a small tree or cycle (more precisely, if satisfies the bound (4.7)) then the model of planted in exhibits AoN at (see Example 4.10).
AoN at the exponential scale. In §3.2 and §5 we consider AoN phenomena at exponential scale (Definition 3.6) rather than linear scale (Definition 2.2), meaning the transition is in terms of rather than itself (when “nothing” holds, but when “all” holds). This relaxation of the AoN phenomenon allows us to establish a characterization without any density assumption, one that therefore takes into account the automorphism group of . In Theorem 5.3, we show that under a mild technical condition of being “first-moment-stable” (Definition 5.1), AoN occurs at the exponential scale if and only if is “first-moment-flat,” meaning that
[TABLE]
for all (see Definition 5.2, and compare with the almost-balanced condition (2.11)). We illustrate this with Example 2.10 below. We also prove a location result, Theorem 5.8, which says that the AoN and first moment thresholds must always coincide at the exponential scale.
Example 2.9 (perfect matchings).
Let be a perfect matching on vertices, so the number of edges is . Recovering a matching planted in an Erdős–Rényi graph was proposed in the physics literature as a toy model for particle tracking [CKK+10, SSZ20]. Subsequent work has rigorously analyzed planted matching recovery in closely related models, with remarkably precise results [MMX21, DWXY21]. In particular, AoN generally does not occur in planted matching models. We note here that our general theorems, which are not tailored to the matching problem, nevertheless indicate weaker results of a similar flavor. To see this, let be any subgraph of with . Then
[TABLE]
It follows by Stirling’s formula that
[TABLE]
The exponent is an increasing function on , so we conclude for all . One can then check that the conditions of Theorem 5.3 are satisfied, so we can conclude AoN at the exponential scale at . Theorem 3.15 allows us to say something slightly more in one direction, namely that is in the “nothing” regime for this model. However, for all , the total number of edges in the observed graph will be of order , so even a random subset of of the observed edges will have non-trivial overlap with the hidden matching. For this reason we expect that the model has an “all” phase for , a “something” phase around , and a “nothing” phase for .
Let us also note that, in the context of the second Kahn–Kalai conjecture, it is known that (as confirmed by the above calculation), but . Indeed, in order to contain a perfect matching the graph must have minimum degree at least one, and the coupon collector effect is responsible for the factor. Moreover, this is the reason for the logarithmic factor in the second Kahn–Kalai conjecture (2.8) (see the discussion of [KK07, §2]).
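The monotonicity used in Example 2.9 can be checked numerically. A matching with m edges has n!/((n-2m)! m! 2^m) copies in the complete graph on n vertices, so its first moment threshold is that count raised to the power -1/m. The sketch below (hypothetical names; `n` and `m` are our symbols) evaluates this in logarithms and confirms that the threshold grows with m, reaching roughly e/n for a perfect matching.

```python
import math


def log_matching_copies(n, m):
    """log of the number of m-edge matchings in K_n: n! / ((n - 2m)! * m! * 2^m)."""
    return (
        math.lgamma(n + 1)
        - math.lgamma(n - 2 * m + 1)
        - math.lgamma(m + 1)
        - m * math.log(2)
    )


def matching_first_moment_threshold(n, m):
    """p solving (number of m-edge matchings) * p^m = 1."""
    return math.exp(-log_matching_copies(n, m) / m)


if __name__ == "__main__":
    n = 1000
    for m in (1, 10, 100, 250, 500):
        print(m, n * matching_first_moment_threshold(n, m))
```

Since the threshold is increasing in m, the largest first moment threshold among sub-matchings is attained by the perfect matching itself, which is consistent with the first moment and expectation thresholds agreeing up to constants in this example.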
Example 2.10 (small balanced graphs).
We saw in Example 2.7 that dense balanced graphs exhibit AoN at the first moment threshold. If is balanced and sufficiently small, , but not necessarily dense, we can obtain the weaker result that AoN occurs at the exponential scale. Indeed, for , similar considerations as (2.6) give
[TABLE]
If the graph is balanced, then it follows that
[TABLE]
for all . This implies that the conditions of Theorem 5.3 are satisfied, so we have AoN at the exponential scale at .
Example 2.11 (cycle with out-edges).
This example shows the necessity of the “dense” assumption in Theorem 2.5. Let be formed by a cycle on vertices, together with one extra outgoing edge for each vertex of the cycle, so that in total . Assume , so that the model of planted in has AoN at by Example 4.10. It is not too difficult to verify that for every fixed (see Example 5.4 for the details of calculating for the -cycle, which is similar). Therefore is delocalized (see Definition 4.3) and almost-balanced (condition (2.11)). However, we claim that the model of planted in does not exhibit AoN at the linear scale (Definition 2.2): for , we are below , so we expect to be able to recover most of . However, for , a linear fraction of the vertices of will typically have more than one outgoing edge in , meaning we will fail to recover a constant fraction of the edges in . It follows that for we have neither “all” nor “nothing,” so AoN at linear scale (Definition 2.2) does not occur. However, the conditions of Theorem 5.3 are satisfied (indeed, this is a special case of Example 2.10), so we do have AoN at the exponential scale.
2.4. Intuition from random graph theory and proof outline
We close by offering some intuition behind our main result connecting AoN with the generalized expectation thresholds (see the informal statement of Theorem 1.1, or the formal statements of Theorems 2.5 and 5.3).
We start with the intuition from the theory of random graphs. Recall the “second Kahn–Kalai conjecture” (2.8), which estimates in terms of the expectation threshold . This corresponds to the natural idea that the existence of a copy of in is driven by the existence threshold of its “least likely” subgraph, measured in terms of the largest first moment threshold. Indeed, for a graph to appear in , clearly all its subgraphs need to appear as well. Moreover, motivated by well-established results for bounded (e.g. [Ruc87, Theorem 5]) we expect also a “clustering” picture to emerge in : whenever a copy of the “least likely” subgraph of appears in , we expect multiple copies of to appear as distinct extensions of the same subgraph as a core (leading to a “sunflower” structure). For the interested reader, we remark that the sunflower picture corresponds to a “condensation phase” in the language of random constraint satisfaction problems [KMRT+07]. We note however that the second Kahn–Kalai conjecture (and the suggested “sunflower” structure) remains open for general , although several variants of it have been proved [FKNP21, PP22, MNWSZ22b].
We now explain how the above picture suggests our main result Theorem 1.1. Roughly speaking, the generalized threshold (Definition 2.4) fails to be constant over if and only if the “least likely” subgraph has . Moreover, suppose for simplicity that . From the “clustering” intuition mentioned above, we expect that whenever the null graph has copies of , all of them should appear as extensions of much fewer copies of the “less likely” . Now consider : since , the null contains no copies of , hence none of . However, in the planted model there should be a plethora of copies of , all intersecting with the planted copy on its “least likely” subgraph — this is heuristically justified by the assumption that , and the “least likely” is already planted. For such , we expect that it will be possible to recover (so we are not in a “nothing” phase), but it will be impossible to distinguish the true among all the overlapping copies (so we are not in an “all” phase). Our theorem establishes that this intuition is indeed valid, in fact as an equivalence statement, and that in many settings AoN can occur if and only if is roughly constant over .
Most of the above discussion is based on a heuristic picture and state-of-the-art conjectures in random graph theory. It gives the guiding intuition for this work, but we emphasize that our proof proceeds in a quite different manner, with more direct characterizations of the “all” and “nothing” phases. Specifically, we first establish, via a combination of the planting trick, the Nishimori identity and a second moment method argument, a number of different results linking the MMSE of the planted model with the subgraph structure of (see e.g. Lemmas 3.8 and 3.11). These intermediate results allow us to apply the above intuition and obtain the AoN characterization for dense graphs (Section 4), as well as the general “nothing” result via the spread condition (Theorem 3.15). For our characterization of AoN at the exponential scale, we first prove a variant of the I-MMSE relation in our setting (Lemma 5.9), which allows us to locate the AoN threshold at the exponential scale (Theorem 5.8). Then the planting trick, alongside a second moment method argument, allows us to argue again using the intuition of the previous paragraph and conclude the AoN characterization at the exponential scale (Theorem 5.3).
3. AoN in a general Bernoulli model: definitions and key sufficient conditions
In this section we derive general tools in a natural abstraction of the planted subgraph model, which we term the Bernoulli inference model (Definition 3.1), where the graphs are replaced by more general binary vectors.
- In §3.1 we formally define the Bernoulli inference model, and prove Theorem 3.4 which gives a sufficient condition for AoN in this model.
- In §3.2 we state Theorem 3.7, which gives an analogue of Theorem 3.4 at exponential scale. The exponential scale will be investigated further for the planted subgraph model in Section 5.
- In §3.3 we prove the “all” results of Theorems 3.4 and 3.7 by a truncated first moment calculation in the planted model.
- In §3.4 we prove the “nothing” results of Theorems 3.4 and 3.7 by a truncated second moment calculation in the null model.
- In §3.5 we prove Theorem 3.14, which gives a “nothing” regime for the Bernoulli model in terms of the spread property. As a consequence we deduce Theorem 3.15, which was mentioned in the introduction as an inference version of the second Kahn–Kalai conjecture.
3.1. General Bernoulli inference model
We begin this subsection by formally defining the general Bernoulli inference model. The main result of this subsection is Theorem 3.4, which gives a sufficient condition for AoN (at linear scale) in this model. As an application, at the end of this subsection we prove Corollary 3.5, characterizing the occurrence of AoN in the planted clique model.
Definition 3.1 (Bernoulli inference model).
Let and . Assume a uniform prior on a certain family of -subsets of , that is,
[TABLE]
and for all . We assume the model is marginally symmetric, meaning that for all . Denote ; with a minor abuse of notation we also write for the set of vectors (for ). In the Bernoulli inference model, we first sample the (hidden) signal , and denote . We then let be the random subset which contains each element of independently with probability , so that . We observe , equivalently, . The goal is to recover from . We let the planted model denote the joint law of . For comparison, we let denote the null model where there is no hidden signal , so and .
The Bernoulli model defined above is clearly an abstraction of the planted subgraph model (Definition 2.1), with corresponding to the set of all available edges in , and corresponding to the (edges of the) hidden subgraph. Note that the general Bernoulli model need not have the geometric structure of edges connected by vertices.
Generalizing (2.1), our measure of “recovery” in the Bernoulli inference model (and as a consequence also for the planted subgraph model) is the minimum mean squared error (MMSE):
[TABLE]
Since , the MMSE must lie in , so we always normalize it by in what follows. Recall that in the planted subgraph model, corresponds to the set of edges in the hidden subgraph; thus MMSE is a measure of edge recovery rather than vertex recovery.
It is well known that is a non-decreasing function of . This fact was already mentioned (and used) in the introduction, and we review the short proof here:
Lemma 3.2 (monotonicity of MMSE).
* is a non-decreasing function of .*
Proof.
For , consider a coupling where , is marginally distributed according to , is marginally distributed according to , and (coordinatewise). Then, under this coupling, with and , we have
[TABLE]
where the second identity is justified because if we already know , then knowing gives no additional information on . ∎
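For a concrete check of the monotonicity, the MMSE can be computed exactly on a toy instance by enumerating all observations. The sketch below takes the family to be the four triangles in the complete graph on four vertices (so there are six coordinates and each signal has three elements); this instance and all function names are illustrative assumptions. It uses two facts that follow from the model: since the prior is uniform and the likelihood depends only on the size of the observation, the posterior is uniform over the elements of the family contained in the observation; and the MMSE equals the signal size minus the expected squared norm of the posterior mean.

```python
from itertools import combinations

def triangles_in_k4():
    """Toy family: the 4 triangles of K_4, each encoded as a set of 3 edge indices."""
    edges = list(combinations(range(4), 2))
    index = {e: i for i, e in enumerate(edges)}
    family = [frozenset(index[e] for e in combinations(tri, 2))
              for tri in combinations(range(4), 3)]
    return family, len(edges)

def exact_mmse(family, N, p):
    """Exact MMSE by summing over all observations y (subsets of N coordinates)."""
    M, k = len(family), len(family[0])
    total = 0.0
    for r in range(N + 1):
        for y in combinations(range(N), r):
            y_set = frozenset(y)
            contained = [S for S in family if S <= y_set]
            if not contained:
                continue  # such y has probability zero under the planted model
            # P(Y = y): each contained S contributes p^(|y|-k) (1-p)^(N-|y|) / M
            prob = len(contained) / M * p ** (r - k) * (1 - p) ** (N - r)
            # posterior mean of coordinate i = fraction of contained sets using i
            norm_sq = sum((sum(i in S for S in contained) / len(contained)) ** 2
                          for i in range(N))
            total += prob * (k - norm_sq)
    return total

family, N = triangles_in_k4()
grid = [j / 20 for j in range(21)]
values = [exact_mmse(family, N, p) for p in grid]
print(all(a <= b + 1e-12 for a, b in zip(values, values[1:])))
```

On this instance the MMSE vanishes with no noise and increases to k - k^2/N when every coordinate is observed, since the posterior then coincides with the prior.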
Given Lemma 3.2, it is natural to ask in what situations the MMSE has a sharp transition. We therefore make the following definition, generalizing Definition 2.2 for the planted subgraph model:
Definition 3.3 (all-or-nothing).
We say that the model from Definition 3.1 exhibits an all-or-nothing (AoN) transition at critical probability if
[TABLE]
for any constant . We will sometimes refer to this as “AoN at linear scale,” to distinguish it from the scaling of Definition 3.6 below.
One of the goals of this paper is to characterize the conditions under which the all-or-nothing phenomenon holds and, when it does, to locate the threshold . As mentioned above, a theme of this paper is that the location of is closely connected to first moment thresholds in the null model . For a given family of sets (on which the prior is uniform) the first moment threshold is the smallest value of such that the expected number of elements of in a sample from the null model is at least one. More formally, for let denote the number of elements of that are contained in ,
[TABLE]
We define the first moment threshold to be the value of that satisfies
[TABLE]
that is, . We use this value of to specify the following growth condition on the prior, which controls from above the probability that two independent draws of the prior overlap in a specific number of elements:
[TABLE]
where and are independent draws from (the uniform measure over ). We highlight that in the language of [NWZ20], the growth condition (3.4) is a bound on the overlap rate function of the prior ; that paper showed that a similar condition implies the existence of an all-or-nothing threshold for sparse estimation problems with Gaussian noise.
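The definition of the first moment threshold can be made concrete with a brute-force check. In the sketch below, M denotes the number of sets in the family and k their common size, so that the expected count in the null model is M p^k and the threshold is M^{-1/k}. The toy family (the four triangles of the complete graph on four vertices) and the function names are illustrative assumptions.

```python
from itertools import combinations

def triangles_in_k4():
    """Toy family: the 4 triangles of K_4, each encoded as a set of 3 edge indices."""
    edges = list(combinations(range(4), 2))
    index = {e: i for i, e in enumerate(edges)}
    family = [frozenset(index[e] for e in combinations(tri, 2))
              for tri in combinations(range(4), 3)]
    return family, len(edges)

def first_moment_threshold(M, k):
    """Value of p solving M * p^k = 1."""
    return M ** (-1.0 / k)

def expected_count_null(family, N, p):
    """E[Z] in the null model, summing over all 2^N outcomes of a p-biased subset."""
    total = 0.0
    for r in range(N + 1):
        for y in combinations(range(N), r):
            y_set = frozenset(y)
            z = sum(1 for S in family if S <= y_set)  # elements of the family inside y
            total += z * p ** r * (1 - p) ** (N - r)
    return total

family, N = triangles_in_k4()
p_star = first_moment_threshold(len(family), len(family[0]))
print(p_star, expected_count_null(family, N, p_star))
```

The enumeration agrees with the closed form M p^k by linearity of expectation, so the expected count at the computed threshold is one up to rounding.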
Theorem 3.4 (AoN for Bernoulli inference model under growth condition).
Suppose the prior satisfies condition (3.4), and moreover that . Then the associated Bernoulli inference model (Definition 3.1) has AoN at as defined by (3.3): that is,
[TABLE]
for any constant .
From Theorem 3.4 it is straightforward to deduce the following characterization of AoN in the planted clique model:
Corollary 3.5.
Consider the model of a clique on vertices planted in .
- (a) If then the model exhibits AoN at
[TABLE]
- (b) If then the model does not exhibit AoN at any value of .
For the proof of Corollary 3.5 we review the so-called “Nishimori identity,” which refers to the fact that the pair (the original signal, together with one sample from the posterior distribution) is equidistributed as the pair (two independent samples from the posterior distribution). This is a basic consequence of Bayes’s rule:
[TABLE]
and the pair is clearly exchangeable.
Proof of Corollary 3.5.
For part (a), by Theorem 3.4 it suffices to check the growth condition (3.4). The quantities of Definition 3.1 in the context of the planted clique model are
[TABLE]
Let be a sample from the prior , that is, is uniformly random among all -cliques contained in the complete graph . If is an independent copy of , then
[TABLE]
This verifies condition (3.4), and the claim follows.
For part (b) we focus on the regime with . It suffices to show that for any such , the normalized MMSE does not tend to zero or one. In light of (3.5), it suffices to show that if are two samples from the posterior distribution given , then the expected edge-overlap between and , normalized by , is not tending to zero or one. To this end, recall that denotes the total number of -cliques contained in . We have
[TABLE]
where is the number of vertices shared with the planted clique. Recalling , we have
[TABLE]
This shows that is a stochastically bounded random variable. For AoN to hold, the only possibility is that with high probability. However, for such values of it follows from standard results that the graph contains a clique with nonnegligible probability, which implies with nonnegligible probability. This shows that the normalized MMSE does not converge to zero or one in this regime of , so AoN does not occur. ∎
3.2. Results for exponential scale
In this subsection we state another result concerning AoN in the Bernoulli model, Theorem 3.7 below. It is very similar to Theorem 3.4, but with AoN happening at a larger scale (Definition 3.6), and under a weaker condition. The proofs for Theorems 3.4 and 3.7 are very similar, and we present both in this section.
Definition 3.6 (AoN at exponential scale).
We say that the model from Definition 3.1 exhibits an all-or-nothing transition at the exponential scale with critical probability if
[TABLE]
for any constant , where we further require to stay bounded away from one for the definition to be meaningful.
We note that if is bounded away from one then
[TABLE]
so AoN at the linear scale at a threshold bounded away from one implies AoN at the exponential scale at that same threshold. The converse is clearly false. The following is a variant of the growth condition (3.4):
[TABLE]
We assume throughout that , which ensures that is bounded away from one. Under this assumption, condition (3.4) implies condition (3.6).
Theorem 3.7 (AoN at exponential scale under (3.6)).
Suppose that the prior satisfies (3.6), and moreover that . Then the associated model has AoN at the exponential scale at as defined by (3.3): that is,
[TABLE]
for any constant .
3.3. Proofs for “all” regime by truncated first moment in planted model
In this subsection we prove the positive (“all”) results of Theorems 3.4 and 3.7, showing that in both cases, if is sufficiently below , then the (normalized) MMSE tends to zero.
Lemma 3.8 (MMSE upper bound).
For any , if the prior satisfies
[TABLE]
then .
Proof.
Recall that under the planted model , the observation is where contains each element of independently with probability . We will argue that, with high probability, any element satisfying must have overlap at least with the planted set . Since the Bayes estimator is the average of over all such , it follows by linearity that with high probability, and therefore we have
[TABLE]
as claimed. Thus, using Markov’s inequality, the result will follow once we show
[TABLE]
where denotes the number of subsets that are contained in and have overlap with the hidden signal . Then note that for any we have
[TABLE]
In the above we have used that if we condition on and consider with , then the probability is contained in is the same as the probability that is contained in , which is . It follows that (3.7) implies (3.8), which proves the claim. ∎
Proof of Theorem 3.4 “all” result.
By Lemma 3.8, it suffices to check that the condition (3.7) holds for any , and for any . We first use the assumption on to bound
[TABLE]
It follows by combining with the growth condition (3.4) that for all ,
[TABLE]
Since , it follows by summing over that condition (3.7) holds. The claim follows. ∎
Proof of Theorem 3.7 “all” result.
Again, by Lemma 3.8, it suffices to check that condition (3.7) holds for any , and for any . Using the assumption on , instead of (3.10) we have
[TABLE]
Combining with the growth condition (3.6), instead of (3.11) we have
[TABLE]
for each . Since we assumed and , we have . It follows by summing over that condition (3.7) holds, and this proves the claim.∎
3.4. Proofs for “nothing” regime by truncated second moment in null model
In this subsection we prove the negative (“nothing”) results of Theorems 3.4 and 3.7, showing that in both cases, if is sufficiently above , then the scaled MMSE tends to one.
Recall that in the planted model , there is a hidden signal and we observe where is an independent -biased random set. We now compare this with the null model , where there is no hidden signal and we observe . Note that
[TABLE]
with as in (3.2). It follows that the marginal law of under the planted model is given by
[TABLE]
That is, the Radon–Nikodym derivative between and is given by the ratio of to . An immediate consequence is that this ratio is unlikely to be small under :
Lemma 3.9.
For any we have .
Proof.
It follows from (3.13) that
[TABLE]
as claimed. ∎
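The change of measure behind this lemma can be checked by enumeration on a toy instance. The sketch below builds the null law of the observation, reweights it by the ratio of the count Z to its null expectation to obtain the planted law, and then tests the bound in the form "the planted probability that Z is at most delta times its null mean is at most delta". That exact form is an assumption here, since the displayed statement is not reproduced above; the toy family and the function names are likewise illustrative.

```python
from itertools import combinations

def triangles_in_k4():
    """Toy family: the 4 triangles of K_4, each encoded as a set of 3 edge indices."""
    edges = list(combinations(range(4), 2))
    index = {e: i for i, e in enumerate(edges)}
    family = [frozenset(index[e] for e in combinations(tri, 2))
              for tri in combinations(range(4), 3)]
    return family, len(edges)

def null_and_planted_laws(family, N, p):
    """Return E_null[Z] and a list of (null prob, Z, planted prob) over all outcomes."""
    M, k = len(family), len(family[0])
    mean_z = M * p ** k  # null expectation of Z by linearity
    rows = []
    for r in range(N + 1):
        for y in combinations(range(N), r):
            y_set = frozenset(y)
            q = p ** r * (1 - p) ** (N - r)           # null probability of y
            z = sum(1 for S in family if S <= y_set)  # copies contained in y
            rows.append((q, z, q * z / mean_z))       # planted = null * Z / E[Z]
    return mean_z, rows

family, N = triangles_in_k4()
mean_z, rows = null_and_planted_laws(family, N, 0.9)
for delta in (0.1, 0.5, 0.9):
    tail = sum(pl for _, z, pl in rows if z <= delta * mean_z)
    print(delta, tail)
```

The planted probabilities sum to one, which is the statement that the ratio of Z to its null mean is a valid density, and each reported tail stays below the corresponding delta.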
Recall from above that denotes the number of subsets that are contained in and have overlap with the hidden signal . We also let denote the number of pairs with and , . The following is a consequence of (3.12):
Lemma 3.10.
For any , we have
[TABLE]
Proof.
It follows from (3.12) that
[TABLE]
as claimed. ∎
The next result gives a counterpart to Lemma 3.8:
Lemma 3.11 (MMSE lower bound).
If the prior satisfies
[TABLE]
then .
Proof.
Let denote a sample from the posterior distribution of given . We will show that, for , we have the bound
[TABLE]
Since the Bayes estimator is the expectation of given , it follows that
[TABLE]
Now note that we can apply Lemmas 3.9 and 3.10 to bound, for any ,
[TABLE]
Since is arbitrary, it suffices to show . Then note that
[TABLE]
Summing over gives the left-hand side of (3.14), and proves the claim. ∎
Proof of Theorem 3.4 “nothing” result.
By Lemma 3.11, it suffices to check condition (3.14) for and any . We combine the assumption on with condition (3.4) to bound, for ,
[TABLE]
Since , we can sum over to conclude that , which concludes the proof. ∎
Proof of Theorem 3.7 “nothing” result.
Again, by Lemma 3.11, it suffices to check condition (3.14) for and any . We combine the assumption on with condition (3.6) to bound, for ,
[TABLE]
Since and , we can sum over to conclude that , which concludes the proof. ∎
3.5. Connections to spread: second Kahn–Kalai conjecture in inference
The main result of this subsection is Theorem 3.15, which can be thought of as a version of the second Kahn–Kalai conjecture in the inference setting.
Definition 3.12 (spread condition).
Let be a probability measure over a family of subsets of . For , we say that is -spread if for any subset we have
[TABLE]
where is a random sample from . Note that it suffices to check the condition for all .
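To illustrate the definition on a small case, the sketch below computes the smallest spread parameter of a uniform measure by brute force. It assumes the convention that the measure is q-spread when the probability that a random sample contains a fixed set A is at most q^{|A|}; this convention, the toy family, and the function names are assumptions, since the displayed condition is not reproduced above. Following the remark in the definition, only sets A contained in some element of the family are checked.

```python
from itertools import combinations

def triangles_in_k4():
    """Toy family: the 4 triangles of K_4, each encoded as a set of 3 edge indices."""
    edges = list(combinations(range(4), 2))
    index = {e: i for i, e in enumerate(edges)}
    return [frozenset(index[e] for e in combinations(tri, 2))
            for tri in combinations(range(4), 3)]

def smallest_spread_parameter(family):
    """Smallest q with P(A subset of sample) <= q^|A| for every nonempty A,
    where the sample is uniform over the family."""
    M = len(family)
    best = 0.0
    seen = set()
    for S in family:
        for r in range(1, len(S) + 1):
            for A in map(frozenset, combinations(sorted(S), r)):
                if A in seen:
                    continue
                seen.add(A)
                frac = sum(1 for T in family if A <= T) / M
                best = max(best, frac ** (1.0 / r))
    return best

family = triangles_in_k4()
print(smallest_spread_parameter(family))
```

In this toy family the maximum is attained when A is a full triangle, so the smallest spread parameter equals M^{-1/k} with M = 4 and k = 3, which is the first moment threshold of the family.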
Intuitively, the growth condition (3.4) asks for the prior to satisfy an approximate version of the spread condition, a celebrated condition in probabilistic combinatorics [ALWZ21, FKNP21, Rao20, Tao20, BCW21, Hu21, Sto22, MNWSZ22b]. One can wonder if the well-studied spread condition by itself also implies a version of the all-or-nothing phenomenon. In this work, we establish that any prior satisfying the following relaxation of the spread condition does satisfy “nothing” for a wide range of values of .
Definition 3.13 (generalized spread condition).
Let be a probability measure over a family of -subsets of . For , we say that is -spread if we have
[TABLE]
for any satisfying .
We have the following theorem which establishes a “nothing” regime for the Bernoulli inference model in terms of the generalized spread condition:
Theorem 3.14 (“nothing” under the generalized spread condition).
In the Bernoulli inference model (Definition 3.1), suppose for every that is -spread for some . We then have
[TABLE]
for any satisfying for all .
Proof.
By Lemma 3.11, it suffices to check condition (3.14) for any constant . Since is supported on -subsets, for all we have
[TABLE]
Therefore condition (3.14) holds for . This completes the proof. ∎
The following gives an inference version of the second Kahn–Kalai conjecture, which was mentioned in the discussion of §2.3:
Theorem 3.15 (“nothing” for arbitrary graphs).
Let be an arbitrary graph. The model of planted in is in the “nothing” regime when satisfies
[TABLE]
for as defined by (2.9).
Proof.
Given Theorem 3.14, it suffices to check that , the uniform measure over copies of in , satisfies the generalized spread condition with the appropriate parameters. For any nonempty subgraph , let denote the uniform measure over copies of in , and let denote a sample from . Let be arbitrary fixed copies of in . Let denote the number of copies of in . It follows by symmetry that
[TABLE]
If and then we obtain
[TABLE]
which is the generalized spread condition. The claim follows.∎
4. Characterization of AoN at linear scale for sufficiently dense graphs
The main purpose of the current section is to prove Theorem 2.5, which characterizes AoN at linear scale for dense graphs under mild conditions. The main message of Theorem 2.5 is that for dense graphs , AoN at linear scale can occur if and only if the graph is almost balanced, i.e., it is almost the case that is the densest subgraph of the full graph. With this in mind, the current section is organized as follows:
- In §4.1 we prove the forward direction of Theorem 2.5 by showing that for dense graphs that are delocalized (Definition 4.3), AoN implies the almost-balanced condition (2.11).
- In §4.2 we prove the reverse direction of Theorem 2.5 by showing that for delocalized dense graphs, condition (2.11) implies AoN.
- In §4.3 we prove Theorem 4.9 which gives sufficient conditions for AoN at linear scale for graphs without a density restriction, but only provided the number of vertices and edges is very small. The result illustrates that we have limited understanding of AoN for sparse planted subgraphs, and we leave this as an interesting direction for future research.
We begin with some preliminaries. Throughout the remainder of this paper, we specialize from the abstract Bernoulli inference model (Definition 3.1) to the planted subgraph model (Definition 2.1). Recall that the explicit correspondence between Definitions 2.1 and 3.1 is as follows: is the set of all edges in the complete graph , is the set of all copies of inside , and is the total number of distinct copies of in :
[TABLE]
where denotes the automorphism group of , and denotes the number of distinct copies of inside the complete graph on vertices. We have the trivial bounds
[TABLE]
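The count of distinct copies via the automorphism group can be verified by brute force on small graphs. The sketch below assumes the standard formula n!/((n-v)! |Aut(H)|) for a graph H on v vertices with no isolated vertices, where a copy is identified with its edge set; the function names are illustrative.

```python
from itertools import permutations
from math import factorial

def count_copies_brute_force(h_edges, v, n):
    """Count distinct edge sets obtained by injectively mapping V(H) = {0..v-1} into {0..n-1}."""
    copies = set()
    for image in permutations(range(n), v):
        copies.add(frozenset(frozenset((image[a], image[b])) for a, b in h_edges))
    return len(copies)

def automorphism_count(h_edges, v):
    """Number of vertex permutations of H that preserve its edge set."""
    edge_set = {frozenset(e) for e in h_edges}
    return sum(1 for perm in permutations(range(v))
               if {frozenset((perm[a], perm[b])) for a, b in h_edges} == edge_set)

def count_copies_formula(h_edges, v, n):
    """n! / ((n - v)! * |Aut(H)|), valid when H has no isolated vertices."""
    return factorial(n) // (factorial(n - v) * automorphism_count(h_edges, v))

path = [(0, 1), (1, 2)]                       # path on 3 vertices
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_copies_brute_force(path, 3, 4), count_copies_formula(path, 3, 4))
print(count_copies_brute_force(four_cycle, 4, 5), count_copies_formula(four_cycle, 4, 5))
```

Each copy is hit by exactly |Aut(H)| injective maps, which is why dividing the number of injections by the automorphism count recovers the number of distinct copies.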
Recall that denotes the uniform probability measure over . Under the planted model we observe where and is an independent sample from . The first moment threshold is defined by (2.5). We now proceed with some of the definitions that were informally given in the introduction:
Definition 4.1 (sufficiently dense).
We say that a graph sequence is sufficiently dense, or simply dense, if
[TABLE]
where denotes the number of vertices in , and denotes the number of edges in .
For dense graphs, and are easily computable by the following lemma, the proof of which was already sketched in the introduction (see (2.6)):
Lemma 4.2.
Let be dense (cf. Definition 4.1). Then, we have
[TABLE]
Thus, if we let
[TABLE]
then for any dense graph , we have for . Moreover, for an arbitrary graph , we have the lower bound for .
Proof.
It follows from the definitions that
[TABLE]
We then note that the dense assumption implies
[TABLE]
Similarly, it also implies
[TABLE]
For the final claim regarding the one-sided bound of , note that for any graph ,
[TABLE]
The claim follows. ∎
The following condition was introduced in (2.10) and used in the statement of Theorem 2.5:
Definition 4.3 (delocalized).
Let be a sequence of graphs that are dense. We say the sequence is delocalized if there exist and , which are independent of , such that
[TABLE]
We remark that if is dense and satisfies for some and , then is delocalized: by Lemma 4.2, we have
[TABLE]
and rearranging gives condition (4.4). This fact was used in some of the examples presented in Section 2.
4.1. Theorem 2.5 forward direction: AoN implies almost-balanced
We now turn to the proof of Theorem 2.5.
Lemma 4.4.
Let be the uniform measure on copies of in , and let be the uniform measure on copies of in . Suppose . If denotes a sample from , then
[TABLE]
where denotes the minimum number of vertices in a graph of edges that can arise as an intersection of a copy of with a copy of .
Proof.
It follows by a direct counting argument that
[TABLE]
We can simplify the above bound as
[TABLE]
as claimed.∎
The next two lemmas give sufficient conditions for bounding the normalized MMSE from below (Lemma 4.5), and from above (Lemma 4.6).
Lemma 4.5.
Suppose is dense in the sense of Definition 4.1. Fix . We then have
[TABLE]
as long as for as defined by (4.3).
Proof.
By Lemma 3.11, it suffices to check condition (3.14) with . By Lemma 4.4,
[TABLE]
where . It follows from the definition (4.3) that
[TABLE]
Consequently, as long as , we have
[TABLE]
Summing over and recalling the dense assumption gives (3.14). ∎
Lemma 4.6.
Suppose is dense in the sense of Definition 4.1. With the notation (4.3), suppose that
[TABLE]
for some , where is a finite constant. Then for small we have
[TABLE]
as long as .
Proof.
Suppose for some we have
[TABLE]
for some , where may depend on but stays bounded by in the limit . Let be a subgraph with and . Let denote the set of all copies of in . As in the proof of Lemma 3.8, it suffices to show that, with high probability, any element satisfying has overlap at least with the planted subgraph . To this end, let denote the set of all copies of in ; it then suffices to show that any element satisfying has overlap at least with . Similarly as in Lemma 3.8, let denote the number of elements that are contained in and have overlap with ; we want to show that with high probability this quantity is zero for all . Similarly to (3.9), we have
[TABLE]
where is the uniform measure over , and . Applying Lemma 4.4 gives
[TABLE]
It follows from the definition (4.3) that , and . Combining with the assumption on gives
[TABLE]
Let be an arbitrary constant for now, and set
[TABLE]
where the last estimate holds for small enough. For , we have
[TABLE]
where the last bound uses the dense assumption. Summing over proves
[TABLE]
which implies the desired bound on MMSE. The claim follows by taking . ∎
Proof of Theorem 2.5 forward direction.
Let as in (4.3). Lemma 4.5 tells us that “all” cannot hold for
[TABLE]
for any ; since we assume AoN, it means that “nothing” holds. On the other hand, using the delocalized assumption (Definition 4.3), Lemma 4.6 tells us that “nothing” cannot hold for
[TABLE]
for sufficiently small ; again, since we assume AoN, it means that “all” holds. If then . We therefore obtain a contradiction unless
[TABLE]
for all , and for all . ∎
4.2. Theorem 2.5 reverse direction: almost-balanced implies AoN
The following is a strengthening of Lemma 4.6 under the assumption (2.11).
Lemma 4.7.
Suppose is delocalized in the sense of Definition 4.3. If condition (2.11) holds, then
[TABLE]
for all , for any .
Proof.
Fix small enough . Write , and let be the subgraph of which achieves maximum density among all the subgraphs of having at least edges, and write . Let denote the set of all copies of in . Similarly to the proof of Lemma 4.6, it suffices to show that, with high probability, any element satisfying has overlap at least with the planted subgraph . For this it suffices to show
[TABLE]
where, as in the proof of Lemma 4.6, we have
[TABLE]
We then divide the bound into two regimes:
- (a) For , we have by assumption, while
[TABLE]
where the intermediate bound follows from the delocalized assumption (Definition 4.3) for small enough , and the last equality holds for any by (2.11). Thus, for and , we have
[TABLE]
It follows by taking and summing over (using the dense assumption) that the contribution to (4.5) from all such is .
- (b) For , we have
[TABLE]
It follows that
[TABLE]
having used the dense assumption. It follows by summing over that the contribution to (4.5) from all such is .
Combining the above bounds proves (4.5), and hence the claim. ∎
Proof of Theorem 2.5 reverse direction.
Let for any fixed , and let be any positive constant. Lemma 4.7 tells us that “all” occurs for . On the other hand, Lemma 4.5 tells us that “nothing” occurs for all . This proves the theorem. ∎
4.3. Sparse strongly balanced graphs
Given the result of Theorem 2.5, it is natural to ask about AoN for general graphs without a density requirement as in Definition 4.1. In this subsection we show that AoN at linear scale holds for graphs that are strongly balanced (see below) and have a sufficiently small number of vertices and edges, but with no restriction on the density. We do not have a complete understanding of the sparse case, and leave this as an interesting open question for future investigations.
Definition 4.8 (strongly balanced).
We say the graph is strongly balanced with parameter if we have
[TABLE]
for all nonempty subgraphs . That is to say, subgraphs of must be strictly less dense than , with the difference given by (4.6). The case corresponds to the definition of [RV86].
Theorem 4.9 (AoN at linear scale for small strongly balanced graphs).
Suppose is strongly balanced (Definition 4.8) with parameter , and satisfies , , and
[TABLE]
Then the model of planted in exhibits AoN at .
Example 4.10 (small trees and small cycles).
One can easily check that trees and cycles are strongly balanced with parameter ; this fact is also noted by [RV86]. Theorem 4.9 implies that if is a tree or cycle with
[TABLE]
then the model of planted in exhibits AoN at linear scale at . Compare with Example 5.4, which addresses AoN at the exponential scale for cycles much larger than (4.7). We also note that Theorem 4.9 does not apply to the -cycle with extra edges (Example 2.11), since this is not strongly balanced.
In preparation for the proof of Theorem 4.9 we introduce some new notation. For a subgraph , in keeping with our earlier notation (4.1) let denote the number of copies of in . Given a copy of in , let denote the number of ways to extend this to a copy of : we can rewrite this as
[TABLE]
where . Note also that if denotes a uniform copy of in , then we have
[TABLE]
where denotes a fixed copy of in .
Proof of Theorem 4.9.
We start by remarking that the conditions on and imply
[TABLE]
By Theorem 3.4, it suffices to check the condition (3.4). For , we now bound
[TABLE]
To form another copy of with , we need to choose vertices from the available vertices, and also choose edges among the at most possible edges. This gives a crude bound
[TABLE]
Next note that the strongly balanced assumption implies
[TABLE]
so . The strongly balanced assumption also implies
[TABLE]
If we then account for the enumeration of and , we obtain
[TABLE]
where the first binomial coefficient accounts for the number of vertex-induced subgraphs . Simplifying the bound gives
[TABLE]
We claim that is decreasing in , so that each term in the last sum is upper bounded by
[TABLE]
We also have the trivial inequality
[TABLE]
Combining the above bounds gives
[TABLE]
so it follows from (4.9) that , which gives condition (3.4). It remains to verify that is in fact nonincreasing in as claimed above. To this end we bound
[TABLE]
having used that (which clearly holds for large , since we assumed ). Then, since , , and , we can further bound
[TABLE]
where the last inequality again holds by (4.9). This concludes the proof. ∎
5. AoN at exponential scale
Recall the generalized expectation threshold from Definition 2.4, and let us define
[TABLE]
We then introduce the following:
Definition 5.1 (first-moment-stable).
We say the graph sequence is first-moment-stable if it holds uniformly in that
[TABLE]
for as in (5.1). Informally this says that if then we must have . We will see that it is equivalent to require ; the other direction holds automatically.
Definition 5.2 (first-moment-flat).
We say that a graph sequence is first-moment-flat if
[TABLE]
for all .
The conditions from Definitions 5.1 and 5.2 should be compared with the conditions that appeared in the result for dense graphs (Theorem 2.5): delocalized (Definition 4.3) and almost-balanced (condition (2.11)). To give a concrete instance, the clique with out-edges discussed in Examples 2.3 and 2.6 is delocalized, but not first-moment-stable.
Theorem 5.3 (AoN at exponential scale).
Let be a graph sequence with and bounded away from one. Suppose further that is first-moment-stable in the sense of Definition 5.1, and also that either or . Then AoN occurs at the exponential scale if and only if is first-moment-flat (Definition 5.2). In this case .
The main purpose of this section is to prove Theorem 5.3, which characterizes AoN at exponential scale for graphs that either have first moment threshold tending to zero, or are subpolynomial in size. This section is organized as follows:
- In §5.1 we return to the generalized setting of Definition 3.1, and show that if the prior satisfies a certain “replica weak separation” (RWS) condition, then AoN at exponential scale can only occur at the first moment threshold.
- In §5.2 we show that if the graph sequence is first-moment-stable (Definition 5.1), then the uniform measure on copies of in satisfies RWS.
- In §5.3 we prove the forward direction of Theorem 5.3 by showing that if the planted subgraph model has AoN at the exponential scale, then the graph sequence must be first-moment-flat.
- In §5.4 we prove the reverse direction of Theorem 5.3 by showing that if the graph sequence is first-moment-flat, then the corresponding planted subgraph model has AoN at the exponential scale.
Theorem 5.3 goes beyond the dense regime — this is illustrated by Examples 2.9, 2.10, 2.11 from §2.3, as well as by the following:
Example 5.4 (cycle).
Let be a cycle on vertices. If with , then must consist of disjoint paths, with , of lengths summing to . It follows that . We also note that an automorphism of is determined by how it acts on the endpoints of the paths, so crudely . It follows that
[TABLE]
This shows that is first-moment-stable (Definition 5.1) and first-moment-flat (Definition 5.2). It then follows from Theorem 5.3 that the model of planted in exhibits AoN at exponential scale at . Compare with Example 4.10 (which is restricted to smaller cycles, but gives AoN at linear scale), and with Example 2.11 (the cycle with out-edges).
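The structural fact used in Example 5.4, that a proper subgraph of a cycle is a disjoint union of paths, can be checked by brute force for small cycles. The following is a minimal sketch, not part of the paper's argument: the function names are hypothetical, and the bound `min(m, n - m)` on the number of path components of an `m`-edge subgraph is our own sanity check rather than the bound stated in the example.

```python
from itertools import combinations

def cycle_edges(n):
    """Edges of the cycle on vertices 0, ..., n-1 (requires n >= 3)."""
    return [(i, (i + 1) % n) for i in range(n)]

def count_paths(edges):
    """Number of components if `edges` forms a disjoint union of paths, else None."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        # Explore one connected component, counting vertices and degrees.
        stack, verts, degree_sum = [start], 0, 0
        seen.add(start)
        while stack:
            u = stack.pop()
            verts += 1
            degree_sum += len(adj[u])
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        # A connected subgraph of a cycle is a path iff it has verts - 1 edges.
        if degree_sum // 2 != verts - 1:
            return None
        components += 1
    return components

def proper_subgraphs_are_paths(n):
    """Check every nonempty proper edge subset of the n-cycle."""
    edges = cycle_edges(n)
    for m in range(1, n):
        for subset in combinations(edges, m):
            k = count_paths(subset)
            # Deleting n - m edges leaves at most n - m arcs, each with >= 1 edge.
            if k is None or k > min(m, n - m):
                return False
    return True
```

Calling `proper_subgraphs_are_paths(n)` for small `n` enumerates all edge subsets, so it is only practical for cycles with a handful of vertices.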
We also include an example where AoN does not occur even at the exponential scale:
Example 5.5 (lack of AoN at exponential scale).
Take where the ratio is a small constant . Let be the disjoint union of where each is a clique on vertices. Then
[TABLE]
for a small constant . It follows that
[TABLE]
On the other hand, for the full graph we have
[TABLE]
so is not first-moment-flat (Definition 5.2). It is however first-moment-stable (Definition 5.1). It follows from Theorem 5.3 that the model of planted in does not have AoN at the exponential scale. Indeed, for on the exponential scale, it will be possible to recover but not .
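To make the intuition behind Example 5.5 concrete, the sketch below computes the edge density at which the expected number of cliques of a given size in an Erdős–Rényi graph equals one. This rests on assumptions: we take "first moment threshold" to mean the density at which the expected count of copies equals one (the precise definition is (3.3), which is not reproduced here), and the names `N`, `k`, and both function names are illustrative. Cliques of different sizes cross this point at different densities, which is the kind of mismatch between components that the example exploits.

```python
from math import comb, exp, log

def expected_cliques(N, k, q):
    """Expected number of k-cliques in G(N, q): C(N, k) * q^C(k, 2)."""
    return comb(N, k) * q ** comb(k, 2)

def clique_first_moment_q(N, k):
    """Density q at which the expected number of k-cliques equals one.

    Assumption: this is used as a stand-in for the first moment threshold.
    """
    return exp(-log(comb(N, k)) / comb(k, 2))

if __name__ == "__main__":
    N = 1000
    for k in (10, 20):
        q_star = clique_first_moment_q(N, k)
        print(k, q_star, expected_cliques(N, k, q_star))
```

For densities strictly between the two values printed above, one clique size is expected to appear spuriously many times in the null model while the other is not.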
5.1. Locating the AoN threshold under RWS condition
In this subsection we show that if satisfies a “replica weak separation” condition (Definition 5.6 below), then at the exponential scale it can only exhibit all-or-nothing at the first moment threshold from (3.3).
Definition 5.6 (replica weak separation).
We say the measure satisfies replica weak separation (RWS) if
[TABLE]
where are independent samples from .
In §5.2 we show that if is first-moment-stable in the sense of Definition 5.1, then the uniform measure on copies of in satisfies the RWS condition of Definition 5.6. The following lemma explains how this is related to the earlier condition (3.6) which was used in the proof of Theorem 3.7:
Lemma 5.7.
If the prior satisfies (3.6) and , then it also satisfies (5.2).
Proof.
Recall from (3.3) that . It follows that
[TABLE]
The claim follows by recalling the assumption and sending . ∎
Theorem 5.8 (location of AoN threshold at exponential scale).
Assume satisfies RWS in the sense of condition (5.2) from Definition 5.6. If the corresponding planted model has all-or-nothing at the exponential scale at some critical value , then
[TABLE]
where is the first moment threshold defined in (3.3). In other words,
In preparation for the proof, let be the posterior probability of , given :
[TABLE]
where denotes the number of subsets that are contained with . Note that if then , and therefore . Note also that
[TABLE]
with probability one, since the prior is uniform on sets of size . The next lemma gives a bound on the derivative of in terms of the MMSE of the model:
Lemma 5.9.
Let be the Kullback–Leibler divergence between the laws of under the planted and null models. Then
[TABLE]
Proof.
Abbreviate and . Recalling (3.13), we have
[TABLE]
where and . Next note that for any function , we have
[TABLE]
In the above, refers to the indicator that . Now consider . If , then does not depend on . If , then depends on , and we have
[TABLE]
Substituting this into (5.6) gives
[TABLE]
Then, using that for all , we have
[TABLE]
The claim follows by making the change of variables . ∎
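For readers less familiar with the quantity appearing in Lemma 5.9, the following generic sketch computes the Kullback–Leibler divergence between two distributions on a finite set. It is purely illustrative: the function name and the toy distributions are ours, and it does not attempt to evaluate the divergence between the planted and null models themselves.

```python
from math import log

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same finite set."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0:
            return float("inf")  # p puts mass where q has none
        total += p_i * log(p_i / q_i)
    return total
```

The divergence is zero exactly when the two distributions coincide, and it is infinite when the first puts mass where the second has none.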
Corollary 5.10.
If the planted model has all-or-nothing at the exponential scale at threshold , then we must have for as in (3.3).
Proof.
Take and let . Since the MMSE is nondecreasing in , we have uniformly over all . Note also that , so Lemma 5.9 gives
[TABLE]
uniformly over . On the other hand, since always, it follows from (5.5) that
[TABLE]
for all . Combining the above bounds gives
[TABLE]
and the claim follows by sending . ∎
The next lemma says that under the RWS condition, in the “all” regime the mutual information between and must be close to its maximal value :
Lemma 5.11.
Let be the mutual information between and under the planted model . Suppose the prior satisfies the RWS condition (Definition 5.6). If the model is in the “all” regime in the sense that , then
[TABLE]
Proof.
It follows using (3.12) and (3.13) that with , we have
[TABLE]
where we note that under the planted model we have with probability one. The assertion of the lemma can then be rewritten as
[TABLE]
We then decompose where
[TABLE]
and is the remainder. The “all” assumption implies that for any fixed we have
[TABLE]
in probability as . We then crudely bound
[TABLE]
Combining with the RWS assumption (5.2) gives
[TABLE]
It follows by combining the above bounds that , and consequently we have , with high probability under . ∎
Corollary 5.12.
Suppose the prior satisfies the RWS condition (Definition 5.6). If the planted model has all-or-nothing at the exponential scale at threshold , then we must have for as in (3.3).
Proof.
Since always, it follows from (5.8) that for all we have
[TABLE]
Take and let : this is in the “all” regime, so combining with Lemma 5.11 gives
[TABLE]
Dividing through by gives
[TABLE]
and the claim follows by sending . ∎
Proof of Theorem 5.8.
Follows by combining Corollaries 5.10 and 5.12. ∎
5.2. First-moment-stable graphs satisfy RWS
Recall from Definition 5.1 that a graph sequence is first-moment-stable if for all with we have . Recall also the notations and from (4.8). In this subsection we give an equivalent characterization of first-moment-stability, and also show that it implies RWS.
Lemma 5.13 (alternate characterization of first-moment-stability).
If is bounded away from one, then is first-moment-stable in the sense of Definition 5.1 if and only if uniformly over with .
Proof.
Recall from (3.3) that the first moment threshold of is given by
[TABLE]
In particular, note that is bounded away from one if and only if . Next, for all subgraphs with , it holds uniformly that
[TABLE]
having used the assumption . Now recall from (4.8) that
[TABLE]
It follows that for with , we have always, and we have (hence first-moment-stability) if and only if . ∎
Corollary 5.14 (first-moment-stability implies RWS).
Let be the uniform prior on all copies of in . If is bounded away from one, and is first-moment-stable in the sense of Definition 5.1, then satisfies RWS in the sense of Definition 5.6.
Proof.
We can bound
[TABLE]
so the claim follows by applying Lemma 5.13 and sending . ∎
Thus, Corollary 5.14 tells us that Theorem 5.8 applies to first-moment-stable graphs.
5.3. Theorem 5.3 forward direction: AoN implies first-moment-flat
We now turn to the proof of Theorem 5.3. The result holds for the following class of graph sequences: all graphs that have first moment threshold tending to zero, or that contain a subpolynomial number of vertices. To this end we introduce the following useful technical condition:
[TABLE]
The next lemma explains that (5.9) holds exactly in the cases mentioned above:
Lemma 5.15.
We have the following equivalences:
- (a) if and only if ;
- (b) if and only if .
If either of the above holds, then satisfies (5.9).
Proof.
The first claim (a) follows immediately from (2.5). For the second claim (b), note that the upper bound in (4.2) says , so implies . Conversely, if , then combining with the lower bound in (4.2) gives
[TABLE]
This proves (b). ∎
Recall that we assume is bounded away from one, or equivalently . As a consequence, Lemma 5.15 case (b) includes dense graphs (Definition 4.1), since for such graphs we have
[TABLE]
However, Theorem 5.3 goes beyond dense graphs, as discussed above.
Proof of Theorem 5.3 forward direction.
We argue by contradiction. Assume AoN occurs at the exponential scale, but the graph is not first-moment-flat. This means that for some there must exist a subgraph with and
[TABLE]
Let . Since is first-moment-stable, Corollary 5.14 gives that the uniform measure on copies of in satisfies RWS (Definition 5.6). It then follows by Theorem 5.8 (in fact by Corollary 5.12) that . Moreover, the first-moment-stable assumption gives
[TABLE]
so for sufficiently close to one we will also have
[TABLE]
Consequently, for close to one, we can take to satisfy
[TABLE]
(Footnote 1: In fact, in light of (5.10) we can take . However, in this argument we use to highlight the places where the first-moment-stability assumption is required.)
Now recall that denotes the planted copy of . Since is in the “nothing” regime, with high probability the observed graph contains another copy of which has negligible overlap with the planted copy , in the sense that
[TABLE]
It follows that under the null model contains, with high probability, an approximate copy of — that is, a subgraph on vertices which can be made into a copy of by adding at most edges. As a consequence, also contains with high probability an approximate copy of the graph from (5.11), since .
Let count the total number of -approximate copies of contained in — that is, subgraphs on vertices which can be made into a copy of by adding at most edges. The preceding argument shows that with high probability under the null model . We will derive a contradiction by showing that . To this end note
[TABLE]
The number of is . On the other hand, for any with ,
[TABLE]
(This is the same reasoning as the easy direction of Lemma 5.13.) It follows from (5.11) that
[TABLE]
Combining these bounds gives
[TABLE]
where the last inequality holds by taking . Recalling (5.11) again now gives
[TABLE]
To conclude, recall from (5.10) that for sufficiently close to one we will have , which allows us to conclude that the above bound is . ∎
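The contradiction in the proof above follows the standard first moment method, which we record here in generic form as a reminder. The symbols $X$ (a nonnegative integer-valued count) and $\mathbb{Q}$ (the null model) are our own labels for this sketch and may differ from the paper's notation.

```latex
% Markov's inequality applied to a nonnegative integer-valued count X
% under the null model Q (both symbols are labels chosen for this sketch).
\[
  \mathbb{Q}(X \ge 1) \;\le\; \mathbb{E}_{\mathbb{Q}}[X],
  \qquad\text{so}\qquad
  \mathbb{E}_{\mathbb{Q}}[X] = o(1)
  \;\Longrightarrow\;
  \mathbb{Q}(X \ge 1) = o(1).
\]
```

Thus a count that is at least one with high probability cannot have vanishing expectation, which is the tension the proof exploits.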
5.4. Theorem 5.3 reverse direction: first-moment-flat implies AoN
Proof of Theorem 5.3 reverse direction.
Since we assumed and bounded away from one, it follows that , so we can apply Theorem 3.7. Hence it suffices to check the growth condition (3.6), that is,
[TABLE]
where and are i.i.d. draws from , the uniform measure over all copies of in .
Now notice that for the bound (5.12) is immediate, since in this case
[TABLE]
It therefore suffices to bound the case . Let be any fixed copy of in , and note that
[TABLE]
where is the set of all subgraphs with that can arise as an intersection of with another copy of . We then bound
[TABLE]
By the first-moment-flat condition (Definition 5.2) together with the first-moment-stable assumption,
[TABLE]
so we can bound
[TABLE]
Combining these bounds and rearranging gives
[TABLE]
To finish the proof, we now claim that
[TABLE]
That the left hand side is at most is clear. For the other part of the inequality, recall that a graph in must be realized as a (vertex-induced) intersection of two copies of . Hence, choosing the isomorphism class of the subgraph of ( choices), the vertices used ( choices), and the way to embed the graph in these vertices ( choices) implies the desired result. By the assumption (5.9), we have
[TABLE]
for all . This completes the proof. ∎
6. Conclusion
In this work we considered the model of a general subgraph planted in . We showed that, under various assumptions on , the AoN phenomenon in the planted model can be characterized in terms of the “generalized expectation thresholds” of in the null model . (See Theorems 2.5 and 5.3 for the precise statements.) A natural question is whether an AoN characterization can be obtained for all planted subgraphs . In a more general context, our results, along with the intuition described in Section 2.4, suggest that AoN can be characterized by merely studying structural properties of the “solution space” in the null model (corresponding, e.g., to the absence of a “condensation phase” in the language of random constraint satisfaction problems [KMRT+07]). It would be interesting to investigate this connection further.
Lastly, as indicated above, sharp thresholds in Boolean Fourier analysis have long been conjectured to be connected with computational hardness; see e.g. [KS06]. A prime example of such a connection is the fact that Boolean circuits of “low complexity” do not exhibit sharp threshold behavior [KS06, §6]. Meanwhile, on the inference side, a large amount of work in the past decade has been devoted to studying the existence of “computational-statistical” gaps: regimes where the inference task is information-theoretically possible, but appears intractable by efficient algorithms. Intriguingly, AoN (the inference analogue of sharp thresholds) has been empirically observed to appear (with a few puzzling exceptions) in models with a computational-statistical gap. For instance, we have seen that AoN appears for the planted clique model (Corollary 3.5), but not for the planted matching problem (Example 2.9). Correspondingly, there is a substantial body of evidence towards a computational-statistical gap in the planted clique problem (e.g., [BHK+19, FGR+17, GZ19]), but the planted matching problem does not exhibit such a gap (the maximum matching is polynomial-time computable, and gives non-trivial recovery up to the information-theoretic threshold [MMX21, DWXY21]). This leads us to ask:
Is AoN a provable barrier for a subclass of polynomial-time methods?
We consider this a natural and intriguing question for future work.
Acknowledgements
We acknowledge the support of Simons-NSF grant DMS-2031883 (E.M., Y.S., N.S., and I.Z.), the Vannevar Bush Faculty Fellowship ONR-N00014-20-1-2826 (E.M., Y.S., and I.Z.), the Simons Investigator Award 622132 (E.M.), the Sloan Research Fellowship (J.N.W.), NSF CAREER grant DMS-1940092 (N.S.), and the Solomon Buchsbaum Research Fund at MIT (N.S.).
References
- [ACO08] D. Achlioptas and A. Coja-Oghlan. Algorithmic barriers from phase transitions. In 2008 49th Annual IEEE Symposium on Foundations of Computer Science, pages 793–802. IEEE, 2008.
- [ACV14] E. Arias-Castro and N. Verzelen. Community detection in dense random networks. The Annals of Statistics, 42(3):940–969, 2014.
- [ALWZ21] R. Alweiss, S. Lovett, K. Wu, and J. Zhang. Improved bounds for the sunflower lemma. Annals of Mathematics, 194(3):795–815, 2021.
- [BCW21] T. Bell, S. Chueluecha, and L. Warnke. Note on sunflowers. Discrete Math., 344(7):112367, 2021.
- [BDT+20] V. Bagaria, J. Ding, D. Tse, Y. Wu, and J. Xu. Hidden Hamiltonian cycle recovery via linear programming. Operations Research, 68(1):53–70, 2020.
- [BHK+19] B. Barak, S. Hopkins, J. Kelner, P. K. Kothari, A. Moitra, and A. Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687–735, 2019.
- [BKM+19] J. Barbier, F. Krzakala, N. Macris, L. Miolane, and L. Zdeborová. Optimal errors and phase transitions in high-dimensional generalized linear models. Proceedings of the National Academy of Sciences, 116(12):5451–5460, 2019.
- [Bol81] B. Bollobás. Threshold functions for small subgraphs. Math. Proc. Cambridge Philos. Soc., 90(2):197–206, 1981.
