Sharp Online Hardness for Large Balanced Independent Sets
Abhishek Dhawan, Eren C. Kızıldağ, Neeladri Maitra

TL;DR
This paper investigates the size and algorithmic complexity of large balanced independent sets in dense bipartite graphs, establishing tight bounds and designing an online algorithm that nearly achieves the optimal size.
Contribution
It introduces a precise size characterization for large balanced independent sets in dense bipartite graphs and develops an online algorithm that approaches this bound, supported by a new lower bound using the OGP framework.
Findings
The largest $\gamma$-balanced independent set has size $\frac{\log_b n}{\gamma(1-\gamma)}$ whp.
Proposed online algorithm achieves size close to the upper bound with high probability.
No online algorithm can surpass the bound by a factor of $(1+\epsilon)$, establishing tightness.
Abhishek Dhawan. Partially supported by the NSF RTG grant DMS-1937241.
Eren C. Kızıldağ. Department of Statistics, University of Illinois Urbana-Champaign.
Neeladri Maitra.
Abstract
We study the algorithmic problem of finding large $\gamma$-balanced independent sets in dense random bipartite graphs; an independent set is $\gamma$-balanced if a $\gamma$ proportion of its vertices lie on one side of the bipartition. In the sparse regime, Perkins and Wang [PW24] established tight bounds within the low-degree polynomial (LDP) framework, showing a factor-$1/(1-\gamma)$ statistical–computational gap via the Overlap Gap Property (OGP) framework tailored for stable algorithms. However, these techniques do not appear to extend to the dense setting. For the related large independent set problem in the dense Erdős-Rényi random graph $G(n,p)$, the best known algorithm is an online greedy procedure that is inherently unstable, and LDP algorithms are conjectured to fail even in the “easy” regime where greedy succeeds.
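For constant $p$ and $\gamma$, we show that the largest $\gamma$-balanced independent set in the dense random bipartite graph has size $\alpha_{\rm STAT} := \frac{\log_b n}{\gamma(1-\gamma)}$ with high probability (whp), where $n$ is the size of each bipartition, $p$ is the edge probability, and $b = 1/(1-p)$. We design a two-stage online algorithm (revealing vertices sequentially and making irrevocable decisions based solely on current information) that achieves $(1-\epsilon)\alpha_{\rm COMP}$ whp for any $\epsilon>0$, where $\alpha_{\rm COMP} := (1-\gamma)\alpha_{\rm STAT}$. We complement this with a sharp lower bound, showing that no online algorithm can achieve $(1+\epsilon)\alpha_{\rm COMP}$ with nonnegligible probability.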
Our results suggest that the same factor-$1/(1-\gamma)$ gap is also present in the dense setting, supporting its conjectured universality. While the classical greedy procedure on $G(n,p)$ is straightforward, our algorithm is more intricate: it proceeds in two stages, incorporating a stopping time and suitable truncation to ensure that $\gamma$-balancedness, a global constraint, is met despite operating with limited information. Our lower bound utilizes the OGP framework. Although the traditional scope of the OGP has been stable algorithms, we build on a recent refinement of this framework for online models and extend it to the bipartite setting.
1 Introduction
In this paper, we study the algorithmic problem of finding large balanced independent sets in dense random bipartite graphs. While finding large independent sets, or even approximating them to within an $n^{1-\epsilon}$ factor, is NP-hard in the worst case [Hås99, Kho01], the situation becomes far more intriguing in the presence of randomness.
For the Erdős-Rényi random graph $G(n,p)$ with constant $p$, the largest independent set has size approximately $2\log_b n$ with high probability (whp), where $b = 1/(1-p)$ [Mat70, Mat76, GM75, BE76]. Moreover, a simple greedy algorithm operating in an online fashion (where vertices are revealed sequentially, and the decision at step $t$ depends only on partial information available at step $t$) finds an independent set of size approximately $\log_b n$ with high probability [GM75]. In 1976, Karp asked whether it is possible to design an efficient algorithm that, whp, finds an independent set of size $(1+\epsilon)\log_b n$ for some $\epsilon>0$ [Kar76]. Surprisingly, this question remains open, and the task is widely believed to be computationally intractable. It is worth mentioning that proving the hardness of Karp’s task unconditionally would imply $\mathsf{P}\neq\mathsf{NP}$.
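The online greedy procedure described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the helper names `sample_gnp` and `online_greedy` are hypothetical, and the graph is sampled up front only so that the output can be checked afterwards (the greedy rule itself only inspects edges between the arriving vertex and the vertices already accepted).

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Return adjacency sets of an Erdős-Rényi graph G(n, p)."""
    adj = [set() for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def online_greedy(adj, order):
    """Accept each arriving vertex iff it has no edge to the accepted set.

    Decisions are irrevocable and use only edges to already accepted vertices.
    """
    chosen = []
    for v in order:
        if all(u not in adj[v] for u in chosen):
            chosen.append(v)
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 1000, 0.5
    adj = sample_gnp(n, p, rng)
    order = list(range(n))
    rng.shuffle(order)
    chosen = online_greedy(adj, order)
    b = 1.0 / (1.0 - p)
    # Compare the greedy output with the benchmark log_b(n).
    print(len(chosen), math.log(n, b))
```

For small $n$ the printed size fluctuates around $\log_b n$; the statement in the text is asymptotic.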
Karp’s problem stands as a central question in average-case complexity and the algorithmic theory of random graphs. (For instance, Frieze explicitly highlighted it as a major open problem in his 2014 ICM plenary lecture [Fri14].) It is perhaps the earliest instance of a statistical-computational gap: a gap between the existential bound and the best known polynomial-time algorithm. This gap has been extensively studied (e.g., prompting Jerrum to propose the now-famous planted clique model [Jer92]), and a large body of work has since uncovered similar “factor-2 gaps” in other random graph models, suggesting a certain universality. For a broad overview of such gaps both in the context of random graphs and beyond, see the surveys [BPW18, Gam21, GMZ22, Gam25].
The sparse case, $G(n,d/n)$ with constant $d$, has seen substantial progress. In this case, the largest independent set has size approximately $2\frac{\log d}{d}n$ [Fri90, FL92, BGT10], while the best known efficient algorithm only achieves approximately $\frac{\log d}{d}n$, both whp [LW07]. (Both guarantees hold in the double limit $n\to\infty$ followed by $d\to\infty$.) Is it hard to find independent sets of size $(1+\epsilon)\frac{\log d}{d}n$? As noted above, resolving this question would imply a conclusion even stronger than $\mathsf{P}\neq\mathsf{NP}$. Accordingly, contemporary research has instead focused on providing rigorous evidence of hardness, e.g., by establishing unconditional lower bounds for certain classes of algorithms. This algorithmic gap led Gamarnik and Sudan to introduce the Overlap Gap Property (OGP) framework [GS14]. Leveraging this framework, tight hardness results were obtained for powerful algorithmic classes, including low-degree polynomial (LDP) algorithms [Wei22] and local algorithms [RV17]. These arguments were subsequently extended to sparse random bipartite graphs by Perkins and Wang [PW24], a work closely related to ours (see below); see also [DW24] for an extension to hypergraphs.
Dense Random Graphs
The situation is markedly different in the dense regime, with constant . In this setting, while the hardness results were recently established for LDP algorithms [HS25], it remains unknown whether the greedy algorithm can itself be implemented as an LDP. In fact, it has been conjectured in a recent AIM workshop [24] that for :
Conjecture 1.1.
No degree- polynomial can return an independent set of size .
That is, LDP algorithms likely fail even in the regime where the greedy algorithm succeeds, suggesting they may not be viable at all in the dense setting. Given that LDP algorithms are quite powerful in many high-dimensional problems, this is particularly surprising—especially since the greedy algorithm operates using only partial, sequential information without accessing the full graph. Thus, different techniques are needed for analyzing the greedy algorithm and the online setting.
A recent work [GKW25] has refined the OGP framework and subsequently obtained sharp lower bounds for a broad class of online algorithms, which includes the greedy algorithm as a special case. Our work extends these techniques to dense random bipartite graphs, as we detail below.
1.1 Random Bipartite Graphs
Bipartite graphs arise frequently in modeling real-world scenarios with an inherent two-part structure (e.g., job assignment). From a theoretical standpoint, many classical problems in graph theory and extremal combinatorics, such as Turán- and Ramsey-type questions, have natural bipartite counterparts. In our context, random bipartite graphs are a natural testbed for investigating the robustness of statistical-computational gaps (e.g., the factor-$2$ gap discussed above).
Our focus is on the dense Erdős-Rényi random bipartite graph defined on disjoint vertex sets $L$ and $R$, where $|L|=|R|=n$ and $p\in(0,1)$ is a constant. Each edge between $L$ and $R$ is included independently with probability $p$. In general, maximum independent sets in bipartite graphs can be found efficiently via max-flow. Likewise, finding large independent sets in this random model is algorithmically easy; in fact, even approximately counting or sampling such sets in the sparse case is tractable (see [DW24] and references therein). However, powerful statistical mechanics heuristics [MP87] suggest that introducing global constraints may lead to a glassy phase and computational hardness.
A natural global constraint, as studied in [PW24], is to consider balanced independent sets $I$, where $|I\cap L|=|I\cap R|$. More generally, they allow for a specified proportion of vertices from each side of the bipartition, which is also our focus. Indeed, introducing such a global constraint already leads to computational barriers: finding a largest balanced independent set in a bipartite graph is NP-hard [GJ90, Fei02]. For further references and detailed discussion, see [PW24].
In this paper, we study $\gamma$-balanced independent sets: those in which a $\gamma$ fraction of vertices lie in one part and a $1-\gamma$ fraction in the other, where $\gamma\le 1/2$ without loss of generality.
Definition 1.2.
Given a bipartite graph $G$ with bipartition $(L,R)$, an independent set $I$ of $G$ is $\gamma$-balanced if $\bigl||I\cap L|-\gamma|I|\bigr|<1$ or $\bigl||I\cap R|-\gamma|I|\bigr|<1$.
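As a concrete reading of Definition 1.2, the size condition can be checked directly. The sketch below is illustrative only: the function name `is_gamma_balanced` and its arguments are hypothetical, and it tests only the balance condition, not independence.

```python
def is_gamma_balanced(vertex_set, left, gamma):
    """Check the size condition of Definition 1.2.

    vertex_set -- candidate set I (any iterable of vertices)
    left       -- set of vertices on side L; all other vertices are on side R
    gamma      -- target proportion on one side

    Returns True if | |I ∩ L| - gamma*|I| | < 1 or | |I ∩ R| - gamma*|I| | < 1.
    Independence of the set is NOT checked here.
    """
    vertices = list(vertex_set)
    size = len(vertices)
    in_left = sum(1 for v in vertices if v in left)
    in_right = size - in_left
    return abs(in_left - gamma * size) < 1 or abs(in_right - gamma * size) < 1
```

The additive slack of $1$ in the definition absorbs rounding when $\gamma|I|$ is not an integer.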
For the sparse case, Perkins and Wang [PW24] provide a fairly comprehensive picture. They (i) determine the size of the largest $\gamma$-balanced independent set, (ii) give local/LDP algorithms that find a $\gamma$-balanced independent set smaller than this by a constant factor, and (iii) show that, for $d$ large, local/LDP algorithms fail to find independent sets larger than the algorithmic value by any factor $1+\epsilon$. That is, a factor-$1/(1-\gamma)$ statistical-computational gap emerges. In the balanced case $\gamma=1/2$, this reproduces the familiar factor-$2$ gap.
Can we extend these results to dense bipartite graphs? To begin with, their algorithmic result crucially exploits sparsity, specifically the fact that sparse graphs are locally ‘tree-like’. This structural property no longer holds in the dense regime. In fact, as discussed earlier and suggested by Conjecture 1.1, such algorithms may not even be viable candidates in the dense setting. Instead, online algorithms emerge as a natural candidate. However, enforcing the $\gamma$-balancedness constraint in an online setting with randomly ordered vertex arrivals poses new difficulties (see below). Furthermore, the core of their hardness result relies on the stability of local/LDP algorithms. Online algorithms, however, may be unstable; see [GKW25].
Dense Bipartite Graphs
In this paper, we characterize the landscape for dense random bipartite graphs, including (i) the statistical threshold $\alpha_{\rm STAT}$ for the largest $\gamma$-balanced independent set, (ii) an online algorithm achieving the threshold $\alpha_{\rm COMP}$, and (iii) a sharp algorithmic lower bound, showing that no online algorithm can surpass $\alpha_{\rm COMP}$. This suggests that $\alpha_{\rm COMP}$ is the computational threshold for this problem. In particular, our results establish a similar factor-$1/(1-\gamma)$ statistical-computational gap, thus reinforcing its apparent universality.
Before describing our main results, we highlight key challenges. First and foremost, any approach that incurs even mild logarithmic factors would fall short of addressing a constant-factor computational gap. In the sparse setting, the OGP framework has proven to be a powerful tool for addressing constant-factor gaps. However, this framework applies primarily to stable algorithms, including local/LDP algorithms. In light of Conjecture 1.1, such algorithms are unlikely to be viable in the dense case. Instead, online algorithms offer a more natural starting point, further motivated by the fact that the best known algorithm for $G(n,p)$ itself is an online algorithm. Despite this, there are only a handful of applications of the OGP framework to online algorithms; see Section 2.1. Even for classical greedy on $G(n,p)$, lower bounds via OGP require delicate refinements and have been established only very recently [GKW25]. In our setting, the technical challenges are compounded by two additional factors: (i) the greedy algorithm operates with only partial information, while the $\gamma$-balancedness constraint is global; and (ii) the random vertex arrival order, a core feature of online algorithms, can lead to situations revealing few or no cross-edges in the bipartite graph, further complicating the analysis (see below).
In the classical online setting (e.g., greedy on $G(n,p)$), vertices arrive sequentially and in random order. At each step $t$, the algorithm decides whether to include the incoming vertex by inspecting its connections to the independent set constructed thus far. Crucially, the decision for the incoming vertex must be made immediately; it cannot be deferred.
The introduction of the $\gamma$-balancedness constraint makes the situation more delicate. In the setting of [PW24], the algorithm is not online; once an independent set is constructed, it can be rebalanced (e.g., by discarding vertices) to satisfy the balancedness constraint. In our setting, however, such post-hoc pruning is not allowed, as it breaks the requirement that decisions are irrevocable. Consequently, the algorithm must be cognizant of the balancedness constraint throughout its execution.
This challenge is further compounded by the randomness in vertex arrival order. Crucially, if a majority of the vertices come from the same side of the bipartition during the early steps, the algorithm receives limited information about the status of cross-edges. In contrast, in $G(n,p)$, each step reveals the status of some new edges, regardless of the vertex order.
The random arrival order, along with the $\gamma$-balancedness constraint, also necessitates truncation, elaborated next. Consider the case where a long initial run of vertices all come from the same side of the bipartition. A greedy algorithm might add all of them, clearly violating the balancedness constraint. More subtly, even adding slightly more than $\log_b n$ of these vertices can prevent any subsequent vertex from the opposite side from being included, indicating that a careful truncation (without violating onlineness) is essential. Hence, while the analysis for classical greedy on $G(n,p)$ is rather easy, this is not the case in our setting. Moreover, any meaningful algorithmic lower bound should be oblivious to the arrival order, including adversarial ones such as the scenarios described above.
To address these challenges, we construct suitable auxiliary stochastic processes to track the evolution of the algorithm’s output, along with judiciously chosen stopping times that enforce the balancedness constraint without violating the online requirement.
1.2 Summary of Main Results
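Recall that we consider the random bipartite graph on vertex set $L\cup R$ with $|L|=|R|=n$, where each edge in $L\times R$ is present independently with probability $p$. Fix $\gamma\in(0,1/2]$ and set
$$\alpha_{\rm STAT} := \frac{\log_b n}{\gamma(1-\gamma)}, \qquad\text{where } b := \frac{1}{1-p}. \tag{1}$$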
For the remainder of this paper, we ignore all floor/ceiling operators for simplicity with the understanding that this does not affect the overall arguments.
Theorem 1.3 (Informal, see Theorem 3.1).
The largest $\gamma$-balanced independent set in the random bipartite graph has size $\bigl(1\pm o(1)\bigr)\alpha_{\rm STAT}$ whp.
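Theorem 1.3 identifies $\alpha_{\rm STAT}$ in (1) as the statistical threshold. Equipped with this, a natural algorithmic question arises: can we efficiently find large $\gamma$-balanced independent sets? We set
$$\alpha_{\rm COMP} := (1-\gamma)\,\alpha_{\rm STAT} = \frac{\log_b n}{\gamma}. \tag{2}$$
Our next result shows that $\alpha_{\rm COMP}$ is attainable in polynomial time.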
Theorem 1.4 (Informal, see Theorem 3.4).
There is an online algorithm, oblivious to the vertex arrival order, which, for any $\epsilon>0$, finds a $\gamma$-balanced independent set of size $(1-\epsilon)\alpha_{\rm COMP}$ whp.
See Definition 3.2 for a description of online algorithms. The decision at each time step takes polynomial time, so the overall runtime of our algorithm is polynomial in $n$. Notably, our algorithm is completely oblivious to the vertex arrival order. As mentioned earlier, while the analysis of the classical greedy algorithm on $G(n,p)$ is relatively straightforward, the situation here is more delicate. The presence of the $\gamma$-balancedness constraint, along with the random arrival order, introduces additional challenges. We address these by (i) analyzing suitable stochastic processes that track the algorithm’s evolution, and (ii) employing a careful truncation to ensure $\gamma$-balancedness.
Observe that there is a factor-$1/(1-\gamma)$ gap between the statistical threshold and the algorithmic value, reminiscent of the computational gap in the sparse setting. Our final result addresses this gap and establishes a sharp computational lower bound.
Theorem 1.5 (Informal, see Theorem 3.5).
For any , no online algorithm finds a -balanced independent set of size with probability at least .
We note that there is no restriction on the runtime of the algorithms ruled out. Furthermore, Theorem 1.5 establishes strong hardness—it rules out online algorithms that succeed even with vanishing probability. Importantly, the probability guarantee is essentially optimal: the probability that a randomly chosen -balanced set of size is an independent set is itself .
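To make the benchmark in the last sentence concrete, the following worked computation (a sketch, with $I$ a fixed vertex set of size $k$ having $\gamma k$ vertices in $L$ and $(1-\gamma)k$ in $R$, and rounding ignored) gives the probability that such a set is independent. The symbols $I$ and $k$ are generic placeholders here, not necessarily the paper's notation.

```latex
\Pr\bigl[I \text{ is independent}\bigr]
  = (1-p)^{|I\cap L|\cdot|I\cap R|}
  = (1-p)^{\gamma(1-\gamma)k^{2}}
  = b^{-\gamma(1-\gamma)k^{2}},
  \qquad b = \frac{1}{1-p}.
```

In particular, for $k=\Theta(\log n)$ this probability is $\exp\bigl(-\Theta(\log^2 n)\bigr)$, since all $\gamma(1-\gamma)k^2$ potential cross-edges must be absent independently.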
Taken together, Theorems 1.4 and 1.5 provide a tight characterization of the performance of online algorithms, providing rigorous evidence toward the following conjecture (we note that a version of this conjecture in the sparse regime appears in [PW24]).
Conjecture 1.6.
For any and , no polynomial-time algorithm finds a -balanced independent set of size in whp.
Future Queries
The setting of [GKW25] permits querying a limited set of future edges—edges incident to vertices not yet seen—at each step. That is, the decision at time is based not only on the edges revealed so far, but also on a restricted set of such future edges. In this augmented model, [GKW25] prove both lower bounds and algorithmic guarantees: specifically, they show that algorithms with modest access to future information can in fact exceed the threshold, albeit using quasi-polynomial time.
This naturally raises the following question: can algorithms with limited future information outperform the threshold for ? We show that the answer is yes:
Theorem 1.7 (Informal, see Theorem 3.8).
For any , there exists an online algorithm which makes limited future queries and finds a -balanced independent set of size with probability at least .
Our algorithm is online, and runs in super-polynomial time. It has a mild dependence on the vertex arrival order—indeed, no online algorithm that is fully oblivious to arrival order can surpass . See Section 3.3 for further discussion.
1.3 Proof Overview
In this section, we provide an overview of our proof techniques. The proof of the statistical threshold is a standard application of the first and second moment methods, similar to that for $G(n,p)$ but tailored to the bipartite setting. Much of our effort goes into the proof of the computational threshold.
Achievability result
The greedy algorithm for ordinary independent sets is inherently online. However, as mentioned earlier, the global nature of the $\gamma$-balancedness constraint introduces technical challenges. We overcome these challenges by designing a two-stage online algorithm with truncation steps, i.e., we stop adding vertices to the independent set from a partition once we have reached the desired number of vertices in that partition. More formally:
1. In stage one, we greedily add vertices to the independent set $I$ until $I$ contains the target number of vertices from one partition.
2. At this point, without loss of generality, let us assume that this partition is $L$. During stage two, we only add vertices in $R$ to the independent set, stopping once $I\cap R$ reaches its own target size.
Note that there are two “truncation” points in our algorithm (one in each stage). Furthermore, since we may assume at the beginning of stage two.
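The two-stage structure with its two truncation points can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `two_stage_online`, the oracle arguments `side_of` and `has_edge`, and the parameters `target_first` and `target_second` (standing in for the stage-one and stage-two target counts, whose exact values are not reproduced here) are all hypothetical.

```python
import random


def two_stage_online(arrivals, side_of, has_edge, target_first, target_second):
    """Two-stage online selection with truncation (illustrative sketch).

    arrivals       -- vertices in the order they are revealed
    side_of(v)     -- returns "L" or "R"
    has_edge(u, v) -- cross-edge oracle, queried only against accepted vertices
    target_first   -- (assumed) count that ends stage one on whichever side hits it
    target_second  -- (assumed) count that ends stage two on the other side
    """
    assert target_first <= target_second
    chosen = {"L": [], "R": []}
    open_side = None  # the only side still accepting vertices in stage two

    for v in arrivals:
        side = side_of(v)
        other = "R" if side == "L" else "L"
        if open_side is not None and side != open_side:
            continue  # first truncation: the completed side accepts nothing more
        # Irrevocable greedy test against accepted vertices on the other side.
        if any(has_edge(v, u) for u in chosen[other]):
            continue
        chosen[side].append(v)
        if open_side is None and len(chosen[side]) == target_first:
            open_side = other  # stage one ends, stage two begins
        elif open_side is not None and len(chosen[side]) == target_second:
            break  # second truncation: the output is complete
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 300, 0.5
    edges = {(u, v) for u in range(n) for v in range(n, 2 * n) if rng.random() < p}
    order = list(range(2 * n))
    rng.shuffle(order)
    result = two_stage_online(
        order,
        side_of=lambda v: "L" if v < n else "R",
        has_edge=lambda u, v: (min(u, v), max(u, v)) in edges,
        target_first=3,
        target_second=5,
    )
    print(result)
```

Because no accepted vertex is ever discarded, the balance of the output is controlled entirely by the two stopping rules, which is the point of the truncation.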
To prove our achievability result, we must show that enough vertices from $R$ are added to $I$ during stage two whp. Conditional on the set $I\cap L$ fixed at the end of stage one, an arbitrary vertex in $R$ considered during stage two is added to $I$ with probability $(1-p)^{|I\cap L|}$. Therefore, it is enough to show that sufficiently many vertices in $R$ remain to be exposed during stage two. The key part of our analysis, therefore, is the following result: stage one concludes sufficiently early whp, irrespective of the vertex arrival order (Lemma 5.1).
Impossibility result
The proof of the upper bound of our computational threshold falls within the OGP framework (see Section 2.1 for a brief history of the technique). As mentioned earlier, OGP-based arguments have predominantly served as a barrier to stable algorithms. As online algorithms may not be stable, one needs to refine the approach to adapt it to this setting. Our work is one of the first to do so.
Given an algorithm , we aim to bound the probability, denoted by , that finds a -balanced independent set of size at least . At the heart of our proof lies a sequence of correlated random graphs for . We define a successful event determined by running the algorithm on each of the graphs . Roughly speaking, denotes the event that for a specific timestep , the independent sets all have size at least . We then show the following:
, and 2. 2.
.
Combining the above completes the proof.
We note that the sequence of correlated graphs is defined with respect to the algorithm itself. In particular, the vertex arrival order determines the correlations within the sequence. This is in stark contrast to other applications of the OGP framework and is the key refinement to adapt the OGP technique to the (potentially unstable) online setting. For instance, in [Wei22, PW24, DW24], the authors define sequences of correlated random (hyper)graphs independent of the algorithm in question. They then apply the algorithm to each graph to construct a forbidden substructure whp (a sequence of independent sets which appears with low probability), thereby arriving at a contradiction.
1.4 Open Problems
We conclude this introduction with a description of potential future directions of inquiry.
One-Stage Algorithm
Our algorithm relies on a two-stage structure to enforce the $\gamma$-balancedness constraint. A natural question is whether a one-stage algorithm, akin to greedy on $G(n,p)$, can achieve the same guarantee. This seems difficult due to the $\gamma$-balancedness constraint, though one might try introducing a bias at each step: based on the vertex’s side of the bipartition, current balance, and connectivity, decide via a coin flip. We leave this for future work.
Hypergraphs
A recent work [DW24] extends the results of [GS14, RV17, Wei22] to sparse random $r$-uniform hypergraphs, obtaining algorithmic guarantees and sharp computational lower bounds for LDP algorithms, and demonstrating the emergence of an analogous statistical–computational gap. It also investigates the universality of this gap via a multipartite hypergraph version of the largest balanced independent set problem (introduced in [Dha25]) in sparse $r$-uniform $r$-partite hypergraphs, recovering and generalizing the results of [PW24]. Extending our results and the results of [GKW25] to dense random hypergraphs are both interesting directions for future work.
Optimization Problems with Global Constraints
The balanced independent set problem in bipartite graphs is an example of how a problem in $\mathsf{P}$ can be made $\mathsf{NP}$-hard by imposing global constraints. It is worth investigating whether such gaps persist across similar problems. For instance, the largest induced matching problem (finding the largest matching $M$ such that the subgraph induced by the endpoints of $M$ contains no edges other than those of $M$) is $\mathsf{NP}$-hard [Cam89]. The statistical threshold was determined in [Coo+21], while the computational threshold is unknown. Another problem to explore would be the $k$-partite graph analogue of the largest balanced independent set problem; we note that a coloring variant of this problem was suggested in [Cha23] for deterministic graphs.
2 Statistical-Computational Gaps and OGP: Prior Work
A wide range of problems across probability theory, computer science, high-dimensional statistics, and machine learning involve randomness and exhibit a common phenomenon: a statistical-computational gap. That is, there is often a discrepancy between what is information-theoretically possible and what is achievable by efficient algorithms. For example, in random optimization problems such as the one we study, the optimal value can often be identified through non-constructive means. However, known polynomial-time algorithms yield strictly suboptimal solutions, and no efficient method is known for finding a global optimum without brute-force search. The models with such an apparent gap include random constraint satisfaction problems (CSP) [MMZ05, AR06, AC08, GS17a, BH21, Yun24], spin glass models [Che+19, HS21, HS23, GJ21, GJW20, GJK23, Kız25, Sel25], number balancing and discrepancy minimization [MS25, GK23, Gam+23], Ising perceptron [Gam+22, LSZ24], as well as various computational problems over random graphs [GS14, GS17, GJW20, Wei22, PW24, DDL23, DW24] and more.
Standard complexity theory is tailored primarily to worst-case hardness and offers limited insight into average-case models (see [Ajt96, BBB21, GK21, VV25] for a few notable exceptions). Nevertheless, these gaps are a very active area of investigation; researchers have developed various frameworks for providing rigorous evidence of hardness. For an overview of these methods, we refer the reader to the excellent surveys [WX18, BPW18, Gam21, GMZ22, Gam25, Wei25].
2.1 Computational Gaps in Random Optimization Problems
For random optimization problems, arguably the most powerful framework for establishing algorithmic hardness is the Overlap Gap Property (OGP) introduced by Gamarnik and Sudan [GS14] (and formally named in [GL18]). Building on insight from statistical physics—particularly the intriguing connection between the onset of algorithmic hardness and geometric phase transitions in random CSPs [MMZ05, AC08, AR06]—the OGP framework has proven instrumental in establishing rigorous algorithmic barriers by leveraging intricate geometry of the optimization landscape. For surveys on OGP, see [Gam21, Gam25].
We briefly describe this framework in its original context: finding large independent sets in sparse random graphs. Gamarnik and Sudan [GS14] established that independent sets of size $(1+1/\sqrt{2})\frac{\log d}{d}n$ exhibit an ‘overlap-gap’: any two such sets have either a large or a small intersection, with no overlaps of intermediate size. This structural property enabled them to rule out local algorithms at this threshold, thereby refuting a conjecture of Hatami-Lovász-Szegedy [HLS14] which posited that local algorithms can find maximum independent sets in $d$-regular random graphs.
Subsequent work by Rahman-Virág [RV17] extended this hardness result down to the sharp threshold $\frac{\log d}{d}n$, below which polynomial-time algorithms are known [LW07]. Unlike the pairwise OGP considered in earlier work, their approach relied on analyzing overlaps among multiple independent sets, a notion termed multi-OGP, to obtain tight lower bounds. Recent works have introduced refined variants of the multi-OGP: for instance, asymmetric versions of OGP yielded tight lower bounds against LDP algorithms [Wei22, BH21], and the branching OGP [HS21] has emerged as a very powerful tool in the study of spin glasses. The OGP framework has since become the ‘bread-and-butter’ for proving sharp computational lower bounds in numerous random optimization problems. For certain models, such as Ising perceptron and discrepancy minimization, OGP-based hardness results are complemented by more traditional notions of average-case hardness (e.g., worst-case hardness of approximating the shortest vector in lattices) [VV25]. (Interestingly, there exist models exhibiting the OGP which remain solvable in polynomial time, e.g., by linear programming [LS24], beyond the classical counterexample of random XOR-SAT solvable by Gaussian elimination.) The literature on OGP is now quite extensive; we refer the reader to the references above.
OGP for Online Algorithms
The OGP framework is primarily tailored for stable algorithms: those whose output is insensitive to small variations in the input. (Informally, an algorithm $\mathcal{A}$ is stable if for any inputs $x,y$ with small $\|x-y\|$, the outputs $\mathcal{A}(x),\mathcal{A}(y)$ are close.) Many prominent algorithms for average-case models fall into this category, including local algorithms (e.g., factors of iid) [GS14, RV17], LDP algorithms, approximate message passing [GJ21], Boolean circuits with low depth [GJW21], as well as gradient descent and Langevin dynamics [GJW20]. OGP-based hardness arguments rely critically on this stability, for instance to construct interpolation arguments showing that the algorithm’s trajectory evolves smoothly and thus cannot avoid the intermediate overlap region, which is forbidden in the solution space. Online algorithms, however, may be unstable; see [GKW25, Proposition 1.1].
Can OGP-based barriers be extended to online algorithms? This question is especially relevant in the modern era of big data, where the online setting is a natural model of decision-making under uncertainty and has been extensively studied in the optimization and machine learning literature [RST10, RST11, RST11a, RS13, Haz+16]. This question was first addressed in [Gam+23], where sharp lower bounds for online algorithms were obtained for the Ising perceptron. More recently, OGP-based barriers for online algorithms were obtained for the graph alignment problem [DGH25] and the largest submatrix problem [BGG25].
Extending such online barriers to random graphs, the very setting in which the OGP first emerged, turned out to be quite challenging. This is particularly due to the lack of stability, a key feature that OGP-based arguments crucially build on. The first lower bounds for online algorithms in $G(n,p)$ were obtained in [GKW25] through novel technical refinements. Their arguments include (i) the construction of temporal interpolation paths that evolve with the algorithm (in contrast to earlier OGP-based barriers, which are algorithm-independent) as well as (ii) the use of stopping times tracking the size of the output. In the present paper, we extend the techniques of [GKW25] to the bipartite setting, where the arguments are further refined to (i) incorporate the $\gamma$-balancedness constraint, and (ii) handle the random arrival order, which may reveal very few cross-edges and thus provide limited information.
3 Main Results
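We begin by determining the size of the largest $\gamma$-balanced independent set in the random bipartite graph for constant $p$. Recall from (1) that $\alpha_{\rm STAT} = \frac{\log_b n}{\gamma(1-\gamma)}$, where $b = 1/(1-p)$.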
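Theorem 3.1.
Let $X_k$ denote the number of $\gamma$-balanced independent sets of size $k$. For any fixed $\epsilon>0$ and $\alpha_{\rm STAT}$ as in (1), the following hold:
(S1) For $k=(1+\epsilon)\alpha_{\rm STAT}$, $X_k=0$ whp.
(S2) For $k=(1-\epsilon)\alpha_{\rm STAT}$, $X_k\ge 1$ whp.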
Thus, the largest $\gamma$-balanced independent set is approximately of size $\alpha_{\rm STAT}$, which we refer to as the statistical threshold. Theorem 3.1 follows from a standard application of the first and second moment methods; see Section 4 for the proof.
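The location of the statistical threshold can be read off from a heuristic first-moment calculation. The sketch below is not the proof from Section 4: it ignores rounding and lower-order terms, writes $X_k$ (a placeholder name) for the number of $\gamma$-balanced independent sets of size $k$, and uses $\binom{n}{m}=n^{(1-o(1))m}$ for $m=O(\log n)$.

```latex
\mathbb{E}[X_k]
  \approx \binom{n}{\gamma k}\binom{n}{(1-\gamma)k}\,(1-p)^{\gamma(1-\gamma)k^{2}},
\qquad
\log_b \mathbb{E}[X_k]
  \approx k\log_b n - \gamma(1-\gamma)k^{2}
  = k\Bigl(\log_b n - \gamma(1-\gamma)k\Bigr).
```

The exponent changes sign at $k=\frac{\log_b n}{\gamma(1-\gamma)}$: for $k$ larger by a factor $1+\epsilon$ the expectation vanishes, which gives the upper bound by Markov's inequality, while the matching lower bound requires the second moment method.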
Given this benchmark, a natural algorithmic question arises: can we find such independent sets efficiently? Motivated by the fact that the best known algorithm for the maximum independent set problem in $G(n,p)$ is an online greedy algorithm, we naturally investigate the performance of online algorithms.
3.1 Algorithmic Setting
The class of online algorithms we consider is formalized as follows.
Definition 3.2.
Let have vertex set , where and . A randomized algorithm with internal randomness determined by seed runs for rounds and keeps track of sets and (initially ). At each round :
1. Based on and all information revealed so far, randomly selects a vertex and reveals the status of all edges , where .
2. Based on and all information revealed so far, then decides if and updates the sets: (i) if or (ii) if .
Per Definition 3.2, the vertex arrival order is random, determined jointly by the algorithm’s internal randomness and the randomness of (its edges). If (resp. ), then all edges from to vertices in (resp. ) are absent, so information arises only from edges to the opposite side of the bipartition inspected so far. The algorithm may select multiple vertices from the same side in succession, potentially revealing no new information; for example, inspecting only in the first rounds, which yields .
Our results hold for the most general setting: (i) the algorithmic bound (Theorem 3.4) is independent of the arrival order, and (ii) the hardness result (Theorem 3.5) applies to all online arrival scenarios allowed by Definition 3.2.
Our focus is on online algorithms that return large independent sets with specified probability, formalized as follows.
Definition 3.3.
For parameters and , an online algorithm operating according to Definition 3.2 is said to -optimize the -balanced independent set problem in if the following is satisfied when :
[TABLE]
3.2 Algorithmic Results
Equipped with Definitions 3.2 and 3.3, we now present our algorithmic results. Recall from (2). Our first result shows is achievable for any .
Theorem 3.4.
For any and , there is an online algorithm that -optimizes the -balanced independent set problem in , where
[TABLE]
See Section 5 for the proof. As noted earlier, the algorithm achieving is online, implemented in two stages. In contrast, the classical greedy algorithm on yields a straightforward online algorithm. The two-stage structure ensures that the global -balancedness constraint is satisfied by the final output, even though the algorithm itself operates using only local information.
We next complement Theorem 3.4 with a sharp lower bound.
Theorem 3.5.
For any , there exists no online algorithm that -optimizes the balanced independent set problem in , where
[TABLE]
We prove Theorem 3.5 through a refined version of the OGP framework adapted to the online setting, which leverages geometric properties of tuples of large -balanced independent sets. See Section 6 for the details.
Taken together, Theorems 3.4 and 3.5 indicate the presence of a factor- statistical-computational gap with respect to online algorithms, with serving as the computational threshold for this model. As noted earlier, the same factor gap also appears in the context of sparse random bipartite graphs and LDP algorithms [PW24], further supporting the universality of this gap.
Remark 3.6.
Observe that while there exists an algorithm succeeding modulo an exponentially small probability below , even those with success probability break down above . This is known as strong hardness, see [HS25]. We highlight that the probability guarantee in Theorem 3.5 is essentially the best possible: the probability that a randomly selected, -balanced set of size is an independent set is at most for some constant .
3.3 Surpassing with Limited Future Queries
Our algorithmic lower bound shows that online algorithms, operating exclusively based on the information available up to round , cannot surpass . This provides strong evidence for the conjecture that is the true computational threshold for this model.
At the same time, prior work [GKW25] made an intriguing observation. For the largest independent set problem in , they showed that granting the algorithm access to a limited amount of additional information allows it to exceed the computational threshold. (The resulting algorithm, albeit online, requires super-polynomial time.) This naturally raises the following question: for the balanced independent set problem in , can be surpassed if the algorithm is permitted a limited number of future queries at each round?
In this section, we show that the answer is yes: online algorithms augmented with a small number of future queries can indeed surpass . To make this precise, we extend the standard definition of an online algorithm (Definition 3.2) to allow future queries.
Definition 3.7.
For , let be the class of online algorithms operating according to Definition 3.2, with the following extension. At each round , the algorithm may, in addition to observing the edges for , query a (possibly random) set of vertex pairs (which may include vertices not yet revealed) and reveal the status of all edges in . The decision is then based on the combined information, where the number of future queries satisfies
[TABLE]
Note that may include pairs involving vertices from . For this reason, we refer to the edges in as future edges. The total amount of such future information is limited to .
Our final main result is as follows.
Theorem 3.8.
For any and , there exists and an online algorithm that -optimizes the -balanced independent set problem in , where
[TABLE]
Our algorithm proceeds in three phases, described informally as follows. The first phase runs for rounds and greedily constructs a -balanced independent set with and , of size for a suitable constant . The second phase is an exploration phase, where—using future queries—it identifies sets and not yet inspected, such that there are no edges (i) between and , and (ii) between and . In the third phase, it performs a brute-force search to identify a -balanced independent set of size inside , and augments this to .
Our analysis shows that the procedure articulated above can be implemented in an online fashion; see Section 7 for the details.
Importantly, our algorithm is not fully oblivious to the vertex arrival order. In fact, no online algorithm can produce a -balanced independent set of size (whp) while remaining completely oblivious to the arrival order. Suppose, for contradiction, that such an algorithm outputs a -balanced independent set of size . If the first arrivals all lie on the same side of the bipartition, then by the online constraint (decisions cannot be deferred), the algorithm must select at least vertices from that side. This, however, precludes adding vertices from the opposite side, since the expected number of vertices there with no edge to the chosen vertices is at most . Thus, some dependence on arrival order is unavoidable. That said, our algorithm’s dependence on arrival order is fairly mild: the only step where order matters is in the first greedy phase (Algorithm 3), designed specifically to avoid such pathological situations.
Number of Future Queries
Our analysis in Section 7 shows that it suffices to take
[TABLE]
where is any arbitrary constant satisfying
[TABLE]
In the balanced case, the condition on boils down to . In this case, one can choose such that as , e.g., by taking . By contrast, for , our analysis suggests that it is not possible to choose such that as . At first glance, this may seem surprising. However, note the following: as , we have . In particular, as the statistical-computational gap is “smaller” for smaller values of , one expects surpassing the gap to be somewhat “harder”. Indeed, our analysis reflects this: interpolating between and , we shift from to . We remark that the distinction between and is prevalent in the analogous setting of -balanced colorings. In fact, while the problem has been heavily studied for [Cha23, FK10, Dha25, DW25], no results are known for -balanced colorings for (see the discussion in [DW25, Section 1.3] on the challenges involved).
4 Statistical Threshold: Proof of Theorem 3.1
In this section, we will prove Theorem 3.1. Let the vertex set of be with . Recall the definition of the random variable . We define two new variables for denoting the number of -balanced independent sets in such that proportion of lies in the partition determined by . Note that
[TABLE]
Furthermore, it is easy to see by symmetry that and so it is enough to consider .
We first prove (S1) by a simple first moment argument. Note the following for any :
[TABLE]
For , we have
[TABLE]
as desired.
The remainder of this section is dedicated to the proof of (S2). Suppose . It is easy to verify from (4), using the inequality , that
[TABLE]
Moreover, as for it suffices to prove the claim when . We do so by showing . The result then follows by the Paley-Zygmund inequality.
For notational convenience, set . Thus
[TABLE]
where is the indicator of the event that is a -balanced independent set. Next,
[TABLE]
In what follows, we use the following parameterization:
[TABLE]
Clearly,
[TABLE]
Fix . The number of tuples subject to (6) is
[TABLE]
For any such tuple, the quantity depends solely on :
[TABLE]
where the term accounts for the double counted edges. Combining (5), (8), and (9), we obtain
[TABLE]
Combining this with (4), we arrive at
[TABLE]
Note the following as a result of the bounds in (7):
[TABLE]
and
[TABLE]
Applying these bounds to (10), we have
[TABLE]
where we use the fact that and . To bound , we consider cases.
- Case 1: . Clearly, in this case.
- Case 2: and or and . As , it follows that in this case.
- Case 3: and . In this case, we have
[TABLE]
once again.
- Case 4: and . Observe that we can further modify to get
[TABLE]
where we note that is well-defined since for . Using the bounds on and from (7), we control the harmonic mean of and as follows:
[TABLE]
Consequently,
[TABLE]
where we use the fact that and is sufficiently large in terms of . With this, (12) is again upper bounded by .
Combining all of the above cases, (11) becomes:
[TABLE]
as and . Using the Paley-Zygmund inequality [AS16],
[TABLE]
completing the proof of (S2).
5 Achievability Result: Proof of Theorem 3.4
Recall the statistical threshold from Theorem 3.1 and (1). In this section, we give an online algorithm that finds an independent set of size at least for arbitrary with high probability, where is as defined in (2). This gives a lower bound on the computational threshold.
Our algorithm will proceed in two stages:
1. In Stage One, we greedily find an independent set satisfying .
2. At this point, by relabeling the partitions if necessary, we may assume . During Stage Two, we only add vertices in to the independent set, stopping once .
Before we formally describe the algorithms for each stage, we make the following definition: for a given timestep and vertex , denote by the set of all its neighbors in . Let us now formally describe our greedy algorithm, which constitutes Stage One of our procedure.
Algorithm 1 Stage One
Initialize and .
while do
    Sample the random vertex .
    if then
        Set .
    else
        .
    end if
    if then
        , .
    else
        , .
    end if
    Update .
end while
Let be the random variable denoting the number of iterations of the while loop of Algorithm 1, i.e.,
[TABLE]
The key result for Stage One is the following lemma, which shows that is sublinear whp irrespective of the vertex arrival order.
Lemma 5.1.
.
We defer the proof of Lemma 5.1 to §5.1. Let us now formally describe Stage Two of our online algorithm. Note that at time , the contribution from one of the partitions to the constructed independent set is . By relabeling the partitions if necessary, we may assume this partition is . For the second stage, we employ the following algorithm.
Algorithm 2 Stage Two
for do
    Sample the random vertex .
    if then
        , .
    else
        , .
        if and then
            Set .
        else
            .
        end if
    end if
end for
Namely, if , we do not add it to the independent set. If , we add it to the independent set only if (i) is less than , and (ii) has no neighbors in . This truncation ensures the -balancedness constraint.
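The two-stage procedure can be simulated in a few lines. The sketch below is not the paper's algorithm verbatim, since several details are not recoverable here. It assumes that Stage One is a plain greedy that stops as soon as one side contributes ⌈(1-ε) log_b n⌉ vertices, that Stage Two caps the other side at ⌈((1-γ)/γ)(1-ε) log_b n⌉ as in Lemma 5.2, that γ ≤ 1/2, and that vertices arrive in a uniformly random order. All names are hypothetical.

```python
import math
import random

def two_stage_online_greedy(n, p, gamma, eps, seed=0):
    """Toy simulation; returns (counts per side, side fixed in Stage One, targets)."""
    rng = random.Random(seed)
    log_b_n = math.log(n) / -math.log(1.0 - p)       # log base b = 1/(1-p)
    small_target = math.ceil((1 - eps) * log_b_n)     # assumed Stage One stopping size
    large_target = math.ceil((1 - gamma) / gamma * (1 - eps) * log_b_n)
    arrivals = ["L"] * n + ["R"] * n
    rng.shuffle(arrivals)                             # assumed uniformly random arrival order
    opposite = {"L": "R", "R": "L"}
    chosen = {"L": 0, "R": 0}
    small_side = None                                 # side that completes Stage One
    for side in arrivals:
        # Edges are sampled lazily: only edges to already chosen vertices on the
        # opposite side matter, and each such edge is inspected at most once.
        no_neighbor = all(rng.random() >= p for _ in range(chosen[opposite[side]]))
        if small_side is None:                        # Stage One: plain greedy
            if no_neighbor:
                chosen[side] += 1
                if chosen[side] >= small_target:
                    small_side = side
        elif side != small_side and no_neighbor and chosen[side] < large_target:
            chosen[side] += 1                         # Stage Two: grow only the other side
    return chosen, small_side, (small_target, large_target)

counts, small_side, targets = two_stage_online_greedy(2000, 0.5, 0.5, 0.3, seed=1)
print(counts, small_side, targets)
```

The cap in Stage Two mirrors the truncation described above: once the second side reaches its target, further arrivals are rejected, so the final proportions are fixed by the two targets rather than by the arrival order.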
The key result for Stage Two is the following lemma.
Lemma 5.2.
$$\mathbb{P}\left[|I_{2n}\cap R|<\frac{(1-\gamma)}{\gamma}(1-\epsilon)\log_{b}n\,\Big|\,T_{f}\leq n^{1-\epsilon/2}\right]=\exp\left(-\Omega\left(n^{\epsilon/2}\right)\right).$$
We defer the proof of Lemma 5.2 to §5.2. Before we prove these key lemmas, let us complete the proof of our achievability result.
Proof of Theorem 3.4.
Consider running our two-stage algorithm with input . Let be the output and let be the number of iterations of Algorithm 1. Using the inequality , we have
[TABLE]
Note that conditionally on , the event implies the event that , where recall that by relabeling the partitions if necessary, we may assume that has size at time . Therefore, combining Lemmas 5.1 and 5.2 completes the proof. ∎
5.1 Stage One: Proof of Lemma 5.1
Recall Algorithm 1. To prove Lemma 5.1, we will analyze an alternate process, which does not necessarily produce an independent set, but is easier to analyze and can be coupled with our procedure.
Note that by definition, Algorithm 1 terminates at time . From this time point, we continue the algorithm as follows. Let . At each time step , we consider i.i.d. random variables for . We update
[TABLE]
As remarked before, need not be an independent set for any .
Observe that, by definition of , for any , the probability with which we add to is
[TABLE]
where we recall that . Consider an i.i.d. sequence of random variables for . The observation in (13) leads us to consider the following simple process:
- Initialize .
- At each time step , update as
[TABLE]
Consider the processes and , where for any , we respectively have and . As a consequence of (13), we note that
[TABLE]
where (resp. ) is the sigma algebra generated by all the information up to (and including) timestep for the process (resp. ).
Furthermore, as all edges are included independently in , we can couple the processes and on the same probability space using (14) as follows:
- Let . For each , given , consider independent random variables .
- For any , given , define as follows.
[TABLE]
Observe that for any , by construction , and furthermore, the variables make sure that marginally, .
Next, we note that if we define the analogous stopping time
[TABLE]
for the set process , then in fact , as for any , the constructions of the processes and are the same. In particular, to conclude the lemma, it is enough to show that
[TABLE]
Since the event implies that either or , we have
[TABLE]
where we introduce
[TABLE]
Since and , the RHS in (15) is at most
[TABLE]
where analogously
[TABLE]
Recall the process defining . Observe that irrespective of whether or , we always have
[TABLE]
where we recall is an i.i.d. sequence with distribution . As a consequence, we can bound
[TABLE]
Note that has mean . Furthermore, we have
[TABLE]
for sufficiently large. A standard Chernoff bound (see, e.g., [AS16]) yields
[TABLE]
completing the proof.
5.2 Stage Two: Proof of Lemma 5.2
Let us now analyze Algorithm 2 conditioning on the event . For brevity, we avoid stating this conditioning at every step of the proof. Once again, we will consider an alternate process . Note that during a single iteration of the for loop of Algorithm 2, we do nothing if or . Suppose the latter condition first occurs at timestep . From this time point, we continue the algorithm as follows. Let . At each time step , we do the following: if , we do nothing; if , we add to if .
Note that as we do nothing for vertices in and since , we have the following for each vertex , where :
[TABLE]
In particular,
[TABLE]
where we use the assumption that . By a simple Chernoff bound (see, e.g., [AS16]), we have
[TABLE]
By the definition of , we have
[TABLE]
From here, we may conclude that
[TABLE]
where we use the fact that as .
6 Impossibility Result: Proof of Theorem 3.5
In this section we prove that the class of online algorithms falling under Definition 3.2 fails to produce an independent set of size at least , for any , with suitably high probability. This shows that the factor- statistical-computational gap cannot be bridged by online algorithms. To achieve this result, we study correlated random graph families. Our proof goes by contradiction. Roughly speaking, under the assumption that an online algorithm indeed finds an independent set of size at least , we analyze over multiple correlated random graph families to exhibit the existence of a particular structure. The contradiction is then derived by arguing that the probability of such a structure existing must satisfy certain inequalities, which we prove fail to hold.
Note that the online algorithm we consider in Section 5 is randomized through the arrival of the vertices . We first argue that in Theorem 3.5, it is enough to consider algorithms that are deterministic instead. To do so, let us formally denote an algorithm as
[TABLE]
where is some abstract probability space capturing the randomness coming from the algorithm , is the vector indicating which edges are present in the random graph, and the output vector indicates which vertices are present in the constructed independent set by the algorithm . With a slight abuse of notation, for a bipartite graph on vertex bipartitions and , we let denote both the graph itself as well as the vector indicating which edges are present.
Lemma 6.1 (Reduction to deterministic algorithms).
There exists such that the deterministic algorithm defined as satisfies
[TABLE]
Proof.
Since
[TABLE]
and since by definition
[TABLE]
we note that the claimed inequality must hold for some . ∎
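The averaging step in this proof is elementary and can be illustrated on a toy example: if the overall success probability is an average of per-seed success probabilities, then some fixed seed does at least as well as the average. The seeds, weights, and probabilities below are hypothetical numbers chosen only for illustration.

```python
def best_seed(success_prob_by_seed: dict, seed_weights: dict):
    """Return the seed with the largest conditional success probability,
    that probability, and the weighted average over all seeds."""
    average = sum(seed_weights[s] * success_prob_by_seed[s] for s in seed_weights)
    seed = max(success_prob_by_seed, key=success_prob_by_seed.get)
    return seed, success_prob_by_seed[seed], average

probs = {"w1": 0.10, "w2": 0.40, "w3": 0.25}    # hypothetical P[success | seed]
weights = {"w1": 0.5, "w2": 0.25, "w3": 0.25}   # hypothetical distribution of the seed
seed, best, average = best_seed(probs, weights)
print(seed, best, average)
```

Fixing the seed returned here turns the randomized procedure into a deterministic one whose success probability over the graph alone is no smaller than that of the randomized procedure.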
Recalling Definition 3.3, we quickly note that as a consequence of Lemma 6.1, to prove Theorem 3.5, it is enough to show that no deterministic online algorithm -optimizes the balanced independent set problem in for the specified values of and . Therefore, for the rest of this section, we only consider deterministic online algorithms.
6.1 Correlated Random Graph Families
Consider a random graph . We will use as a base graph to define our correlated graph families.
Consider any online algorithm , and run it on the base graph . Let be the set of all the edges of queried by the algorithm up to and including timestep for , and let be the set of vertices exposed by in the first steps of the algorithm.
For any , we define a sequence of random graphs
[TABLE]
as follows:
- The graph is a copy of .
- For any and any , the status of the edge in is exactly the same as it is in , i.e., it is present in all the graphs if and only if it is present in , and absent in all the graphs otherwise.
- For any and for any edge , the status of the edge is independently decided for each graph . In other words, for any ,
[TABLE]
where the collection is a collection of i.i.d. random variables.
We immediately observe the following:
Remark 6.2.
- If we consider the entire graph array , then each entry of both the first column and the last row is the same as .
- For any and , marginally the distribution of is the same as .
- Since is deterministic, by construction, the behavior of the first steps of the algorithm is exactly the same on all the graphs .
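The construction of one row of the correlated family can be sketched as follows. This is a minimal illustration under stated assumptions: the base graph is stored as a dictionary from vertex pairs to edge indicators, the queried edges are given as a set of pairs, copy 0 is taken to be the base graph itself, and all other pairs are resampled independently with probability p. The function and variable names are hypothetical.

```python
import random

def correlated_copies(base, queried, m, p, seed=0):
    """base: dict mapping a vertex pair (u, v) to True/False (edge present or not).
    queried: set of pairs whose status has already been revealed.
    Returns m graphs: copy 0 equals the base graph, and every other copy agrees
    with it on `queried` while resampling each remaining pair independently."""
    rng = random.Random(seed)
    copies = [dict(base)]
    for _ in range(m - 1):
        copies.append({pair: (status if pair in queried else rng.random() < p)
                       for pair, status in base.items()})
    return copies

# Hypothetical 3-by-3 example: pairs (u, v) with u on one side and v on the other.
base = {(u, v): (u + v) % 2 == 0 for u in range(3) for v in range(3)}
queried = {(0, 0), (0, 1), (1, 0)}
copies = correlated_copies(base, queried, m=4, p=0.5, seed=7)
print(all(c[pair] == base[pair] for c in copies for pair in queried))
```

If the base graph itself has independent edges with probability p, each copy has the same marginal law as the base graph, and a deterministic algorithm that has only seen the queried pairs behaves identically on every copy.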
6.2 Forbidden tuples of independent sets
Consider an online algorithm . We wish to bound the probability that produces an independent set of size at least . Fix and define the stopping time
[TABLE]
We analyze the algorithm on the graphs . Define (resp. ) if and otherwise (resp. otherwise). Note that is random and depends on the edges as revealed by the algorithm in the first steps. We define the successful event
[TABLE]
where . The key results en route to the proof of Theorem 3.5 provide lower and upper bounds on . We state these results here and prove them in Sections 6.2.1 and 6.2.2, respectively.
Proposition 6.3.
Let denote the event that . Then .
Proposition 6.4.
Let for some sufficiently large constant . Then,
[TABLE]
Let us now combine the above propositions to prove our impossibility result.
Proof of Theorem 3.5.
Suppose there exists an online algorithm that -optimizes the -balanced independent set problem in . Fix for sufficiently large. By Propositions 6.3 and 6.4, we must have
[TABLE]
as desired. ∎
6.2.1 Lower Bound: Proof of Proposition 6.3
In this section, we will prove Proposition 6.3 by a careful application of Jensen’s inequality and conditional independence. Recall the definition of . We have
[TABLE]
where we use the fact that is determined by as is deterministic. Note that by the observations made in Remark 6.2, the algorithm is identical for the first steps on the graphs . With this in hand, we have
[TABLE]
where the last step follows since the random variables are i.i.d. Plugging this in above, we have
[TABLE]
Observing that and for , we can further simplify the above to
[TABLE]
where we use Jensen’s inequality in the final step. Once again, since , we have
[TABLE]
Plugging this into (18) completes the proof.
6.2.2 Upper Bound: Proof of Proposition 6.4
In this section, we will prove Proposition 6.4. Recall the definitions of the successful event from (17), the array from Section 6.1, the stopping time from (16), and the random parts of the vertex bipartition. Note that on the event , for any , , where . Recall also that for any , is the set of edges queried by the algorithm in the first steps, and denotes the set of vertices exposed in the first steps.
Observation 6.5.
We note that on the successful event , the following statements hold:
- By the construction of , the sets are identical for .
- has size for all .
- has the same size for all .
It is convenient at this point to introduce the following set:
[TABLE]
Next, let us define certain sets which will be convenient for us to decompose the successful event . Recall and . Additionally, for any set , we denote its power set by .
Definition 6.6 (Forbidden -tuples).
For any , , , , and , we define the set to be the set of all -tuples , where each has the following properties:
- For each , is a -balanced independent set in .
- for each .
- The sets are identical for .
- and for all .
Let us further define to be the size of the set .
Recalling Observation 6.5, and keeping Definition 6.6 in mind, we conclude that
[TABLE]
Thus to conclude Proposition 6.4, using (19) and a union bound, it is enough to obtain an upper bound on the sum
[TABLE]
To this end, we introduce a new random variable. For a fixed , and , let be the set of -tuples , where for each , is a -balanced independent set in with the following properties:
- for each .
- The sets are identical for .
- and for all .
We further upper bound (20) by
[TABLE]
where . The following key lemma provides a bound on .
Lemma 6.7.
For any fixed , , , and , we have
[TABLE]
Before we prove the above lemma, we note that it implies our desired result. Indeed, we have
[TABLE]
It follows that (20) is at most . Recalling (19), this completes the proof of Proposition 6.4.
Proof of Lemma 6.7.
Consider an arbitrary collection of -balanced sets . Recall that by definition of , we have is identical for each . With this in mind, we define the following sets:
[TABLE]
and for each and
[TABLE]
With these definitions in hand and recalling Remark 6.2 and Observation 6.5, we have
[TABLE]
where
[TABLE]
Consider a fixed . Let and . We have two cases to consider depending on which side contains proportion of the balanced independent set .
- Case 1: . Then, we have
[TABLE]
- Case 2: . Then, we have
[TABLE]
Combining both cases, by a simple counting argument we have that is at most
[TABLE]
Plugging the above into (21), we obtain
[TABLE]
Recall that . The above can be simplified to
[TABLE]
Note that since , we have
[TABLE]
With this in hand, we may simplify further to obtain
[TABLE]
where we plug in . Recall that . In particular, for sufficiently large, the above is , as desired. ∎
7 Future Queries: Proof of Theorem 3.8
Fix an arbitrary constant
[TABLE]
where . Indeed, otherwise there exists no -balanced independent set of size per Theorem 3.1. Note that as and , we have
[TABLE]
In what follows, we prove the existence of such an online algorithm with
[TABLE]
As noted, our algorithm proceeds in three phases.
Phase I: Greedy Rounds
In Phase I, we find a -balanced independent set of size for arising in (22). Proceeding analogously to the proof of Theorem 3.4, we first run Algorithm 1 until time where
[TABLE]
Plugging in place of in Lemma 5.1, we deduce
[TABLE]
The bound (23) ensures that (24) is not vacuous. Note that at time , the contribution from one of the partitions to the independent set is of size . By relabeling the partitions if necessary, we again assume this partition is . For the second stage, we employ the following modification of Algorithm 2. Set where is arbitrary.
Algorithm 3 Phase I: Part II
for do
    Sample a random vertex .
    , .
    if and then
        Set .
    else
        .
    end if
end for
Lemma 7.1.
.
Proof.
For , let denote the number of such that for which there are no edges between and . We have
[TABLE]
as are i.i.d. Bernoulli random variables with parameter . Clearly, it suffices to show that
[TABLE]
as we only add vertices as long as . As , we have
[TABLE]
by a standard Chernoff bound (see, e.g., [AS16]), completing the proof. ∎
Combining (24) and Lemma 7.1 through a union bound, we have the following with probability for : (i) , (ii) is a -balanced independent set of size where
[TABLE]
Namely, a -balanced independent set of size is found in time with high probability. This concludes Phase I of our algorithm. Note that for , —i.e., no future queries have been made.
In what follows, we condition on the outcome of the greedy rounds.
Phase II: Exploration Phase
We condition on the outcome of Phase I and fix a with
[TABLE]
Observe that is well defined if
[TABLE]
which is equivalent to (22). At round , we do the following:
- Sample a random vertex . If , update and ; otherwise update and .
- Query involving vertices with and and reveal the edge status of all pairs .
- Using the edges in the step above, identify a of size for which . Lemma 7.2 below justifies the existence of whp.
At the end of round , we set . Similarly, at round , we do the following:
- Sample a random vertex . If , update and ; otherwise update and .
- Query involving vertices with and and reveal the edge status of all pairs .
- Using the edges in the step above, identify a of size for which . Lemma 7.2 justifies the existence of whp.
We then set .
Lemma 7.2.
For satisfying (25), such sets and indeed exist with probability at least .
Proof.
Let
[TABLE]
Note that for
[TABLE]
Since we deterministically have , and by conditioning, stochastically dominates , which has mean (see, e.g., [Roc24, Chapter 4]). Proceeding identically to the proof of Lemma 7.1,
[TABLE]
as satisfies (25). Similarly, define
[TABLE]
Clearly, for
[TABLE]
As , and therefore stochastically dominates which has mean . Proceeding identically as above, we obtain that
[TABLE]
Noting that such and with sizes exist if and only if , we conclude the proof by combining (26) and (27) through a union bound. ∎
Phase III: Brute-Force Step
At round , we:
- Sample a vertex . If , update and ; otherwise update and .
- (Pre-Processing) Set and . Note that . Removing at most three vertices from the larger set if necessary, we may assume that .
- (Brute-Force Search) Query involving vertices where and and reveal the edge status of all pairs . Using this information, identify a -balanced independent set in of size , where
[TABLE]
Lemma 7.3 justifies the existence of whp. We note that this step has quasi-polynomial running time as mentioned in Section 3.3.
Set .
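The brute-force search can be sketched as an exhaustive enumeration over one side of the candidate sets. The code below is an illustration rather than the paper's procedure: the candidate sets, the edge set, and the two target sizes are hypothetical inputs, and the function name is not from the source.

```python
from itertools import combinations

def find_balanced_independent_set(left, right, edges, a, b):
    """Search for an independent set with a vertices from `left` and b from `right`.
    edges: set of (l, r) pairs. Returns (left_part, right_part) or None."""
    neighbors = {r: {l for (l, rr) in edges if rr == r} for r in right}
    for left_part in combinations(left, a):
        chosen = set(left_part)
        # Right vertices with no neighbor among the chosen left vertices.
        free = [r for r in right if not (neighbors[r] & chosen)]
        if len(free) >= b:
            return left_part, tuple(free[:b])
    return None

# Hypothetical toy instance with three candidate vertices per side.
edges = {(0, 0), (0, 1), (1, 1), (2, 2)}
print(find_balanced_independent_set([0, 1, 2], [0, 1, 2], edges, 2, 1))
print(find_balanced_independent_set([0, 1, 2], [0, 1, 2], edges, 2, 2))
```

Because the graph is bipartite, it suffices to enumerate subsets of one side and collect the common non-neighbors on the other. The number of subsets enumerated is a binomial coefficient in the size of the candidate set, so the cost of this step depends on how large the candidate sets and the target size are.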
Lemma 7.3.
The bipartite random graph with vertex set contains a -balanced independent set of size .
Proof.
Applying Theorem 3.1 to , it suffices to verify that
[TABLE]
which holds due to (25). ∎
Final Phases
For rounds :
- We randomly sample a vertex . If , update and . Otherwise, set and .
- If , set . Else, set .
The algorithm above is online and constructs a -balanced independent set of size . We finally verify the constraint (3) on the number of queries arising in Definition 3.7.
Number of Future Queries
Note that except . We have:
[TABLE]
This completes the proof of Theorem 3.8.
References
- [AC08] Dimitris Achlioptas and Amin Coja-Oghlan “Algorithmic barriers from phase transitions” In 2008 49th Annual IEEE Symposium on Foundations of Computer Science, 2008, pp. 793–802 IEEE
- [AR06] Dimitris Achlioptas and Federico Ricci-Tersenghi “On the solution-space geometry of random constraint satisfaction problems” In Proceedings of the thirty-eighth annual ACM symposium on Theory of computing, 2006, pp. 130–139
- [Ajt96] Miklós Ajtai “Generating hard instances of lattice problems” In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, 1996, pp. 99–108
- [AS16] Noga Alon and Joel H Spencer “The probabilistic method” John Wiley & Sons, 2016
- [BPW18] Afonso S Bandeira, Amelia Perry and Alexander S Wein “Notes on computational-to-statistical gaps: predictions using statistical physics” In arXiv preprint arXiv:1803.11132, 2018
- [BGT10] Mohsen Bayati, David Gamarnik and Prasad Tetali “Combinatorial approach to the interpolation method and scaling limits in sparse random graphs” In Proceedings of the forty-second ACM symposium on Theory of computing, 2010, pp. 105–114
- [BGG25] Shankar Bhamidi, David Gamarnik and Shuyang Gong “Finding a dense submatrix of a random matrix. Sharp bounds for online algorithms” In arXiv preprint arXiv:2507.19259, 2025
- [BBB21] Enric Boix-Adserà, Matthew Brennan and Guy Bresler “The Average-Case Complexity of Counting Cliques in Erdös-Rényi Hypergraphs” In SIAM Journal on Computing SIAM, 2021, pp. FOCS 19–39
