The Normalized Matching Property in Random and Pseudorandom Bipartite Graphs

Niranjan Balachandran; Deepanshu Kush

arXiv:1908.02628·math.CO·June 25, 2021

TL;DR

This paper investigates the conditions under which bipartite graphs exhibit the Normalized Matching Property (NMP), establishing thresholds in random graphs and showing that pseudorandom graphs can be made to have NMP after minor vertex removals.

Contribution

It establishes a sharp threshold for NMP in random bipartite graphs and shows that not-too-sparse pseudorandom bipartite graphs attain NMP after removing a vanishingly small fraction of vertices.

Findings

1. Identifies p = log(n)/k as a sharp threshold for NMP in G(k,n,p).

2. Shows that pseudorandom bipartite graphs have NMP after deleting a vanishingly small fraction of vertices.

3. Proves an "almost" vertex decomposition of such pseudorandom graphs into Euclidean trees, which are trees that have NMP.

Full text

The Normalized Matching Property in Random and Pseudorandom Bipartite Graphs

Niranjan Balachandran (Department of Mathematics, Indian Institute of Technology Bombay; email: [email protected]) and Deepanshu Kush (Department of Computer Science, University of Toronto; email: [email protected]; work was performed while the author was at IIT Bombay).

Abstract

A simple generalization of the Hall’s condition in bipartite graphs, the Normalized Matching Property (NMP) in a graph $G(X,Y,E)$ with vertex partition $(X,Y)$ states that for any subset $S\subseteq X$ , we have $\frac{|N(S)|}{|Y|}\geq\frac{|S|}{|X|}$ . In this paper, we show the following results about the Normalized Matching Property in random and pseudorandom graphs.

1. We establish $p=\frac{\log n}{k}$ as a sharp threshold for having NMP in $\mathbb{G}(k,n,p)$, which is the graph with $|X|=k,|Y|=n$ (assuming $k\leq n\leq\exp(o(k))$), and in which each pair $(x,y)\in X\times Y$ is an edge independently with probability $p$. This generalizes a classic result of Erdős-Rényi on the $\frac{\log n}{n}$ threshold for having a perfect matching in $\mathbb{G}(n,n,p)$.

2. We also show that a pseudorandom bipartite graph - upon deletion of a vanishingly small fraction of vertices - admits NMP, provided it is not too sparse. More precisely, a bipartite graph $G(X,Y)$, with $k=|X|\leq|Y|=n$, is said to be Thomason pseudorandom (following A. Thomason (Discrete Math., 1989)) with parameters $(p,\varepsilon)$ if each $x\in X$ has degree at least $pn$ and each pair of distinct $x,x^{\prime}\in X$ has at most $(1+\varepsilon)p^{2}n$ common neighbors. We show that for any large enough $(p,\varepsilon)$-Thomason pseudorandom graph $G(X,Y)$, there are “tiny” subsets $\mathrm{Del}_{X}\subset X,\ \mathrm{Del}_{Y}\subset Y$ such that the subgraph $G(X\setminus\mathrm{Del}_{X},Y\setminus\mathrm{Del}_{Y})$ has NMP, provided $p\gg\tfrac{1}{k}$. En route, we prove an “almost” vertex decomposition theorem: Every such Thomason pseudorandom graph admits - excluding a negligible portion of its vertex set - a partition of its vertex set into graphs that we call Euclidean trees. These are trees that have NMP, and which arise organically through the Euclidean GCD algorithm.

1 Introduction

Consider the following problems:

1. Suppose $k\leq n$ are positive integers. By a $k\times n$ *star-array* (or simply star-array), we mean a $k\times n$ array whose entries are symbols from the set $\{0,\star\}$. Given a $k\times n$ star-array, when is it possible to replace some of the $\star$ entries of the array by non-negative integers such that in the resulting array all the row sums equal $R$, and all the column sums equal $C$ for some integers $R,C>0$?

2. Let $q$ be a sufficiently large prime power and suppose $X,Y\subset\mathbb{F}_{q}$ with $|Y|=10|X|$, $|X|\geq q/100$. Is it possible to label each element of $Y$ with some element of $X$ such that each element of $X$ appears as a label exactly $10$ times, and further, for each $y\in Y$ labeled $x$, the sum $x+y$ is a quadratic residue? More generally, one can ask the same question with a subgroup $H\subset\mathbb{F}_{q}^{*}$ instead of the set of quadratic residues.

In both the problems posed above, there is a natural bipartite graph $G(X,Y,E)$ that captures the problem in its essence. Given a star-array $\mathcal{A}$, let $X$ and $Y$ denote the sets of rows and columns of $\mathcal{A}$ respectively, and let a vertex $x\in X$ be adjacent to $y\in Y$ in $G$ if and only if the $(x,y)$ entry of $\mathcal{A}$ is a $\star$. For the second problem, consider the bipartite graph $G(X,Y,E)$ where $X,Y$ are the given sets, and the pair $(x,y)$ is an edge in $G$ if and only if $x+y\in H$.

In the rest of the paper, $G(X,Y)$ shall denote a bipartite graph with vertex partition $(X,Y)$ ; we shall drop the $E$ in our notation for convenience. We say that $G=G(X,Y)$ has the Normalized Matching Property (NMP for short) if: For any $S\subseteq X$ , if we denote by $N(S)$ , its set of neighbors in $Y$ , then $\frac{|N(S)|}{|Y|}\geq\frac{|S|}{|X|}$ . In particular, if $|X|=|Y|$ , then this is the familiar Hall’s condition for the existence of a perfect matching in $G$ .
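The definition can be checked directly on small graphs. The following is a minimal brute-force sketch, not a method from the paper: the adjacency-list encoding (`adj[x]` is the set of neighbors of `x` in $Y=\{0,\ldots,n-1\}$), the function names, and the two toy graphs are all illustrative assumptions. It runs in time exponential in $|X|$.

```python
from itertools import combinations

def neighborhood(adj, S):
    """Return N(S): the union of the neighbor sets of the vertices in S."""
    out = set()
    for x in S:
        out |= adj[x]
    return out

def has_nmp_bruteforce(adj, n):
    """Check |N(S)|/n >= |S|/k for every nonempty subset S of X."""
    k = len(adj)
    for size in range(1, k + 1):
        for S in combinations(range(k), size):
            # Cross-multiply so the comparison stays in exact integers.
            if len(neighborhood(adj, S)) * k < size * n:
                return False
    return True

# Hypothetical toy graphs with k = 2 and n = 3.
path_like = [{0, 1}, {1, 2}]
lopsided = [{0}, {0, 1, 2}]   # vertex 0 sees only one of the three Y-vertices

print(has_nmp_bruteforce(path_like, 3))
print(has_nmp_bruteforce(lopsided, 3))
```

In the second graph the singleton $S=\{0\}$ already violates the inequality, since $\frac{1}{3}<\frac{1}{2}$.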

The following theorem of Kleitman [16] gives us an equivalent formulation of NMP in bipartite graphs:

Theorem 1.1.

The following statements are equivalent:

• $G$ with $|X|=k,|Y|=n$ has NMP.

• For any independent set $I$ in $G$, $\frac{|I_{X}|}{k}+\frac{|I_{Y}|}{n}\leq 1$.

• There exists a multiplicity function $m:E\to\mathbb{N}_{0}=\mathbb{N}\cup\{0\}$ such that $\displaystyle\sum_{\begin{subarray}{c}e\ni x\\ e\in E\end{subarray}}m(e)$ (resp. $\displaystyle\sum_{\begin{subarray}{c}e\ni y\\ e\in E\end{subarray}}m(e)$) is equal for all $x\in X$ (resp. for all $y\in Y$).

It is easy to see that the problems posed above simply ask if the associated bipartite graphs have NMP by virtue of the third part of Theorem 1.1.
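To make the link between the star-array problem and the multiplicity function concrete, here is a small sketch that verifies a proposed filling. The string encoding of the array, the function name, and the example arrays are illustrative assumptions, and $\star$ entries that are not replaced are treated as $0$.

```python
def is_valid_filling(star, filled):
    """star: list of equal-length strings over {'0', '*'};
    filled: list of lists of integers of the same shape.
    Returns True if filled is supported on the star positions, is
    non-negative, and has constant positive row sums and column sums."""
    k, n = len(star), len(star[0])
    for i in range(k):
        for j in range(n):
            value = filled[i][j]
            if value < 0 or (star[i][j] == '0' and value != 0):
                return False
    row_sums = {sum(row) for row in filled}
    col_sums = {sum(filled[i][j] for i in range(k)) for j in range(n)}
    return (len(row_sums) == 1 and len(col_sums) == 1
            and min(row_sums) > 0 and min(col_sums) > 0)

# Hypothetical 2 x 3 star-array: rows must sum to R = 3, columns to C = 2.
star = ["**0",
        "0**"]
good = [[2, 1, 0],
        [0, 1, 2]]
bad = [[3, 0, 0],
       [0, 1, 2]]

print(is_valid_filling(star, good))
print(is_valid_filling(star, bad))
```

The entries of a valid filling play the role of the multiplicities $m(e)$ on the edges of the associated bipartite graph, and counting the total in two ways forces $kR=nC$.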

The Normalized Matching Property in bipartite graphs was introduced by Graham and Harper [11] and has subsequently been studied in several papers (for instance [16, 24]) and in some monographs as well (for instance [4, 7]). The notion also extends very naturally to finite ranked posets; for a ranked poset $P$, let $L_{i}$ denote the set of all elements of $P$ with rank $i$. Then we say that $P$ has NMP if for each $i$, the bipartite graph of poset covering relations between $L_{i}$ and $L_{i+1}$ has NMP. NMP posets are objects of great interest, specifically in related decomposition problems (see [12, 13, 14, 22, 23] for some decomposition results). As a concrete instance, the Griggs conjecture, which states that any unimodal NMP poset admits a nested chain decomposition (see [14] or [25] for the relevant definitions), is still open - even for posets of rank $3$ - despite several attacks on the problem.

As it turns out, many interesting finite ranked posets arising from finite geometric structures have NMP. Indeed, the Boolean poset, the poset of affine flats in a finite projective $n$ -dimensional space and the poset of the subgroup lattice of abelian $p$ -groups all have NMP (see [21, 22, 23] respectively), i.e., in each of these posets, the associated bipartite graphs on the sets of elements of successive ranks within these posets have NMP. As is the case with Hall’s theorem for bipartite graphs, it is clear that graphs with “high density” are more likely to possess NMP. But in each of the instances listed above, the associated bipartite graphs are very sparse. This raises the following natural question: At what density does a typical bipartite graph have NMP?

To formulate the above question more precisely, we set up some asymptotic terminology and notation. Given functions $f,g$, we write $f\gg g$ (resp. $f\ll g$) if $\frac{f(n)}{g(n)}\to\infty$ (resp. $\frac{f(n)}{g(n)}\to 0$) as $n\to\infty$. We also write $f=o(g)$ to denote that $f\ll g$. We write $f=O(g)$ (resp. $f=\Omega(g)$) if there exist an absolute constant $C>0$ and $n_{0}$ such that for all $n\geq n_{0}$, $|f(n)|\leq C|g(n)|$ (resp. $|f(n)|\geq C|g(n)|$). If the constant $C$ involves a related parameter $\varepsilon$, then we write $f=O_{\varepsilon}(g)$ (resp. $f=\Omega_{\varepsilon}(g)$) to indicate the dependence of the implicit constant on the parameter $\varepsilon$.

To formalize the question posed above, we recall some standard terminology from the theory of random graphs. For a probability space $(\Omega,\mathbb{P})$ we say that an event $\mathcal{E}_{n}$ that depends on a parameter $n$ occurs with high probability (abbreviated as whp) if $\mathbb{P}(\mathcal{E}_{n})\to 1$ as $n\to\infty$. A graph property $\mathcal{P}$ is simply a collection of graphs, and a graph property is called monotone if whenever $G\in\mathcal{P}$ and $G\subset H$ then $H\in\mathcal{P}$ as well. The Erdős-Rényi random graph model $\mathbb{G}(n,p)$ introduced in [9] is the random graph where the vertex set is the set $[n]:=\{1,\ldots,n\}$ and each pair $\{i,j\}$ is an edge with probability $p=p(n)$ independently. A monotone graph property $\mathcal{P}$ is said to have a threshold $p_{0}=p_{0}(n)$ if whenever $p\gg p_{0}$ then $\mathbb{G}(n,p)$ has property $\mathcal{P}$ whp, and if $p\ll p_{0}$ then whp $\mathbb{G}(n,p)$ does not have property $\mathcal{P}$. A property $\mathcal{P}$ is said to have a sharp threshold $p_{0}(n)$ if for every fixed $\varepsilon>0$, $\mathbb{G}(n,p)$ has property $\mathcal{P}$ whp whenever $p\geq(1+\varepsilon)p_{0}$, and whp does not have property $\mathcal{P}$ whenever $p\leq(1-\varepsilon)p_{0}$.

The seminal paper of Erdős and Rényi [9] established sharp thresholds for several very natural monotone graph properties. A theorem of Bollobas and Thomason [6] showed that every monotone graph property admits a threshold. However, not all graph properties admit sharp thresholds; for instance, the property “ $\mathbb{G}(n,p)$ contains a cycle” admits a threshold which is sharp on one side but not the other (see [15] for more on sharp thresholds). In fact, the problem of determining sharp thresholds (if the graph property admits one) is a very popular motif in the theory of random graphs.

For bipartite graphs, Erdős and Rényi also introduced the random bipartite model $\mathbb{G}(n,n,p)$ where the vertex set is partitioned into two sets $X,Y$ of size $n$ each, and each pair $\{x,y\}$ with $x\in X,y\in Y$ is in $\mathbb{G}(n,n,p)$ independently with probability $p$ . One of the first results in this model is the result that $\frac{\log n}{n}$ is a sharp threshold for the existence of a perfect matching in $\mathbb{G}(n,n,p)$ [10]. As observed earlier, if $k=n$ , NMP is the same as Hall’s condition for bipartite graphs, so it is natural to seek the threshold for NMP in a slightly more general model for bipartite random graphs, which is what the question previously posed seeks to do.

Suppose $k\leq n$ are positive integers, and let $0\leq p\leq 1$ . Let $\mathbb{G}(k,n,p)$ denote the random bipartite graph with the vertex partition given by $(X,Y)$ with $|X|=k,|Y|=n$ , and each pair $(x,y)\in X\times Y$ is an edge in $\mathbb{G}$ independently with probability $p$ . Here both $k$ and $n$ should be thought of as parameters growing to infinity with $n$ being a function of $k$ that always satisfies $n\geq k$ . Our first main result in this paper establishes a sharp threshold for NMP in the sense stated above:

Theorem 1.2.

Suppose $k\leq n(k)\leq\exp(o(k))$, and let $0<\varepsilon,\delta<1$. There exists $k_{0}=k_{0}(\varepsilon,\delta)$ such that for $k\geq k_{0}(\varepsilon,\delta)$:

1. If $p\geq\frac{(1+\varepsilon)\log n}{k}$ then $\mathbb{P}[\mathbb{G}(k,n,p)\textrm{\ has\ NMP}]\geq 1-\delta$.

2. If $p\leq\frac{(1-\varepsilon)\log n}{k}$ then $\mathbb{P}[\mathbb{G}(k,n,p)\textrm{\ has\ NMP}]\leq\delta$.

In other words, $\mathbb{G}(k,n,p)$ has a sharp threshold for NMP at $p=\frac{\log n}{k}$ .
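A quick simulation can give a feel for the model, with the caveat that Theorem 1.2 is asymptotic and tiny instances need not show a sharp transition. The sketch below samples $\mathbb{G}(k,n,p)$ and estimates the probability of NMP by exhaustive checking; the function names, the parameter choices $k=6$, $n=12$, the trial count, and the seed are all illustrative assumptions.

```python
import math
import random
from itertools import combinations

def sample_gknp(k, n, p, rng):
    """adj[x] = neighbors of x in Y = {0..n-1}; each pair is kept independently with probability p."""
    return [{y for y in range(n) if rng.random() < p} for _ in range(k)]

def has_nmp(adj, n):
    """Exhaustive NMP check (exponential in k, so only for tiny k)."""
    k = len(adj)
    for size in range(1, k + 1):
        for S in combinations(range(k), size):
            nbrs = set().union(*(adj[x] for x in S))
            if len(nbrs) * k < size * n:
                return False
    return True

def estimate_nmp_probability(k, n, p, trials=200, seed=0):
    """Monte Carlo estimate of P[G(k, n, p) has NMP]."""
    rng = random.Random(seed)
    hits = sum(has_nmp(sample_gknp(k, n, p, rng), n) for _ in range(trials))
    return hits / trials

k, n = 6, 12
p0 = math.log(n) / k          # the threshold value log(n)/k
for c in (0.5, 1.0, 1.5, 2.0):
    p = min(1.0, c * p0)
    est = estimate_nmp_probability(k, n, p)
    print(f"p = {c:.1f} * log(n)/k: estimated P[NMP] = {est:.2f}")
```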

Note that if $n>\exp(k)$ or equivalently, if $\log n>k$ , then the expression for our threshold exceeds one. Also, for each fixed $p<1$ , if $C>1+\log(\frac{1}{1-p})$ and $n\geq\exp(Ck)$ , then a simple computation shows that the probability that $Y$ has at least one isolated vertex is bounded away from zero (this will be clear from the proof of Theorem 1.2; see Lemma 3.1). Hence, the range for $n$ in the statement of the theorem is essentially the widest possible one if one seeks a sharp threshold.
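The role of isolated vertices can be illustrated numerically. Each $y\in Y$ is isolated with probability $(1-p)^{k}$, so the expected number of isolated vertices in $Y$ is $n(1-p)^{k}$; an isolated $y$ violates the normalized matching inequality for $\{y\}$ in $G(Y,X)$, and NMP is symmetric in the two sides. The sketch below simply evaluates this expectation; the function name and the sample parameters are illustrative assumptions.

```python
import math

def expected_isolated_in_Y(k, n, p):
    """E[number of isolated vertices in Y] = n * (1 - p)^k in G(k, n, p)."""
    return n * (1.0 - p) ** k

k = n = 10**6
for eps in (0.5, 0.25):
    below = (1 - eps) * math.log(n) / k   # just below the threshold
    above = (1 + eps) * math.log(n) / k   # just above the threshold
    print(eps,
          expected_isolated_in_Y(k, n, below),
          expected_isolated_in_Y(k, n, above))
```

Below the threshold the expectation is roughly $n^{\varepsilon}$ and grows with $n$, while above it the expectation is roughly $n^{-\varepsilon}$ and tends to zero.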

Let us now return to the problems at the beginning of this section. To check if a given bipartite graph has NMP is computationally simple: form a bigger bipartite graph $G^{\prime}(X^{\prime},Y^{\prime})$ with $|X^{\prime}|=|Y^{\prime}|=nk$, with $X^{\prime}$ consisting of $n$ copies of $X$, $Y^{\prime}$ consisting of $k$ copies of $Y$, and $x^{\prime}y^{\prime}$ being an edge in $G^{\prime}$ if and only if $xy$ is an edge in $G$, where $x^{\prime}$ is a copy of $x$ and $y^{\prime}$ is a copy of $y$. Then it is straightforward to see that $G$ has NMP if and only if $G^{\prime}$ admits a perfect matching. Hence either problem admits a computationally simple solution. But let us relax our requirement and seek an answer only in an approximate sense: For the first problem, is it possible to replace each $\star$ entry with a non-negative integer such that with the exception of a negligible proportion of the rows/columns, the remaining rows and columns satisfy the aforementioned property? Or in the second problem, can we ignore a negligible proportion of elements from both sets $X,Y$, so that the desired property holds for the remaining elements? Since either of the originally posed problems is equivalent to asking if a given bipartite graph has NMP, this approximate version asks if a given bipartite graph “almost” has NMP in a certain sense that we shall formalize below.
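The blow-up reduction can be sketched in a few lines. The version below builds $G^{\prime}$ implicitly and looks for a perfect matching with a simple augmenting-path (Kuhn-style) search; the choice of matching algorithm, the vertex encoding, and the function name are illustrative assumptions rather than anything prescribed by the paper, and the recursion is only suitable for small inputs.

```python
def has_nmp_via_blowup(adj, n):
    """adj[x] = set of neighbors of x in Y = {0..n-1}; k = len(adj).
    Left vertices of G' are the n copies of each x, encoded as x * n + c.
    Right vertices of G' are the k copies of each y, encoded as y * k + d."""
    k = len(adj)
    left_count = k * n          # equals the number of right vertices, n * k

    def right_neighbors(u):
        x = u // n              # the original vertex that u is a copy of
        for y in adj[x]:
            for d in range(k):
                yield y * k + d

    match_right = {}            # right vertex -> left vertex currently matched to it

    def try_augment(u, seen):
        """Try to match left vertex u, rerouting earlier matches if needed."""
        for v in right_neighbors(u):
            if v in seen:
                continue
            seen.add(v)
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    matched = sum(try_augment(u, set()) for u in range(left_count))
    return matched == left_count    # perfect matching iff every left copy is matched

print(has_nmp_via_blowup([{0, 1}, {1, 2}], 3))
print(has_nmp_via_blowup([{0}, {0, 1, 2}], 3))
```

For larger graphs one would more likely use a maximum-flow formulation (capacity $n$ out of the source into each $x$, capacity $k$ from each $y$ into the sink) instead of materializing $nk$ copies.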

The bipartite graph considered in the second problem (with the subsets of $\mathbb{F}_{q}$) possesses certain regularity properties that are best described as “random-like”, as we shall soon see. Taking a cue from this, we impose the following reasonable hypotheses on the bipartite graphs that we shall consider: if all the vertices of $X$ have “almost” the same degree, and no two vertices of $X$ have “too many” common neighbors in $Y$ (so that there isn’t a clustering of edges between some subsets of $X$ and $Y$), is there an affirmative answer to the approximate version of these problems?

To formulate this in more precise terms, we need the notion of a pseudorandom bipartite graph. The notion of pseudorandomness was first introduced by Thomason in the 80s [20] and pseudorandomness in graphs is a well-studied notion (see [18] for a definitive survey). One of the more popular and well-understood models for pseudorandomness in graphs is the notion of an $(n,d,\lambda)$ graph (see [2]). An $(n,d,\lambda)$ graph is a graph on $n$ vertices which is $d$ -regular and which satisfies the following property: If $d=\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{n}$ are the eigenvalues of $G$ then $|\lambda_{i}|\leq\lambda$ for all $i>1$ .

Pseudorandom graphs, as the name suggests, have some properties very reminiscent of random graphs, and the most well-known is the Expander-Mixing Lemma (see [2]): Suppose $G$ is an $(n,d,\lambda)$ graph. If $U,W\subset V(G)$ then $|e(U,W)-\frac{d|U||W|}{n}|\leq\lambda\sqrt{|U||W|}$ , where $e(U,W)$ denotes the number of edges of the form $uw$ with $u\in U$ and $w\in W$ .
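As a sanity check of the Expander-Mixing Lemma on a familiar example, the sketch below uses the Petersen graph, which is $3$-regular on $10$ vertices with adjacency eigenvalues $3$, $1$ and $-2$, so it is a $(10,3,2)$ graph. The choice of graph, the function names, and the random sampling of subsets are illustrative assumptions; here $e(U,W)$ counts ordered pairs, so an edge inside $U\cap W$ is counted twice.

```python
import math
import random

def petersen_adjacency():
    """Adjacency sets of the Petersen graph on vertices 0..9."""
    adj = {v: set() for v in range(10)}
    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)
    for i in range(5):
        add_edge(i, (i + 1) % 5)            # outer 5-cycle
        add_edge(i, i + 5)                  # spokes
        add_edge(i + 5, (i + 2) % 5 + 5)    # inner pentagram
    return adj

def edge_count(adj, U, W):
    """Number of ordered pairs (u, w) with u in U, w in W and uw an edge."""
    return sum(len(adj[u] & W) for u in U)

def mixing_slack(adj, d, lam, U, W):
    """lam * sqrt(|U||W|) - |e(U,W) - d|U||W|/n|; non-negative if the lemma holds."""
    n = len(adj)
    deviation = abs(edge_count(adj, U, W) - d * len(U) * len(W) / n)
    return lam * math.sqrt(len(U) * len(W)) - deviation

adj = petersen_adjacency()
rng = random.Random(0)
worst = min(
    mixing_slack(adj, 3, 2,
                 {v for v in range(10) if rng.random() < 0.5},
                 {v for v in range(10) if rng.random() < 0.5})
    for _ in range(2000)
)
print(worst >= -1e-9)
```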

As mentioned earlier, Thomason introduced the notion of pseudorandomness which is a little more general, and in particular, we shall - in this paper - confine our attention to the notion of pseudorandomness in bipartite graphs as proposed by Thomason in [21].

Definition 1.1.

Suppose $0<p<1$ , and $0\leq\varepsilon<1$ . A bipartite graph $G$ with vertex classes $X$ and $Y$ of sizes $k$ and $n$ respectively with $k\leq n$ is called Thomason pseudorandom with parameters $(p,\varepsilon)$ if every vertex in $X$ has degree at least $pn$ , and every pair of distinct vertices in $X$ have at most $p^{2}n(1+\varepsilon)$ neighbors in common.
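Both conditions of Definition 1.1 involve only degrees and pairwise codegrees on the $X$ side, so they can be verified in time polynomial in the size of the graph. A minimal sketch follows; the adjacency-list encoding, the function name, and the two toy graphs are illustrative assumptions.

```python
from itertools import combinations

def is_thomason_pseudorandom(adj, n, p, eps):
    """adj[x] = set of neighbors of x in Y = {0..n-1}.
    Checks: every x has degree >= p*n, and every pair of distinct
    x, x' has at most (1 + eps) * p^2 * n common neighbors."""
    min_degree_ok = all(len(nbrs) >= p * n for nbrs in adj)
    codegree_bound = (1 + eps) * p * p * n
    codegree_ok = all(
        len(adj[a] & adj[b]) <= codegree_bound
        for a, b in combinations(range(len(adj)), 2)
    )
    return min_degree_ok and codegree_ok

# Hypothetical examples with k = 3, n = 4, p = 1/2 (so p*n = 2 and p^2*n = 1).
spread_out = [{0, 1}, {1, 2}, {2, 3}]   # codegrees are 0 or 1
clustered = [{0, 1}, {0, 1}, {2, 3}]    # first two vertices share 2 neighbors

print(is_thomason_pseudorandom(spread_out, 4, 0.5, 0.0))
print(is_thomason_pseudorandom(clustered, 4, 0.5, 0.5))
```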

At this juncture, a few remarks are in order. Thomason’s original definition in [21] actually only considers bipartite graphs with $|X|=|Y|=n$ . Secondly, Thomason’s definition in [21] is more in line with the original notion of pseudorandomness in [20]: A graph $G(X,Y)$ is pseudorandom with parameters $(p,\mu)$ for some $\mu\geq 0$ where the second condition states that every pair of vertices in $X$ have at most $p^{2}n+\mu$ common neighbors. The definition that we shall be using is a relaxation of the restriction that $|X|=|Y|$ , but also a restriction to the more natural and intuitive case where $\mu\leq\varepsilon p^{2}n$ .

Notions of pseudorandomness are usually “symmetric” or “global” in their definitions as in the definition in [20] or in the definition of an $(n,d,\lambda)$ graph. This latter notion is at first glance somewhat asymmetric in the sense that the conditions imposed on the degrees and codegrees are only for the vertices of $X$ . However, it is a simple exercise (which we shall not get into here) to show that these conditions also imply certain restrictions on the degrees and codegrees of the vertices of $Y$ as a consequence of the following analogue of the expander-mixing lemma (restricted to our setup):

Theorem 1.3 (Theorem 2 in [21]).

Let $G(X,Y)$ be a bipartite graph with $|X|=k\leq n=|Y|$ , which is Thomason pseudorandom with parameters $(p,\varepsilon)$ . Then for every subset $A\subseteq X$ of size at least $1/p$ and every subset $B\subseteq Y$ , with $|A|=a$ and $|B|=b$ ,

$$|e(A,B)-pab|\leq\sqrt{pnab(1+\varepsilon pa)}.$$

Again, we remark that Thomason’s theorem in [21] is stated for pseudorandom bipartite graphs $G(X,Y)$ with $|X|=|Y|=n$ and parameters $(p,\mu)$ . But a glance at the proof there immediately tells us that the same proof works in our general setup as well. The interesting point is that this asymmetric definition of pseudorandomness also yields the aforementioned theorem. A heuristic and somewhat simplistic explanation for this is that we are restricting ourselves to bipartite graphs, and it is precisely due to the bipartite structure of the graph that the arguments go through.

Another reason why we prefer to work with this notion of pseudorandomness is that it is combinatorial in its definition; it only considers the degrees of the vertices and codegrees of pairs of vertices of $X$ , which is computationally easy to verify. In addition, it is a reasonably robust notion which also allows us to generate several non-trivial examples of Thomason pseudorandom graphs. While it is true that many notions of pseudorandomness do pass onto subgraphs, we did not find any concrete statement in the literature that established the same here for this notion. So we took it on ourselves to prove its robustness; see the lemma in the Appendix for a precise statement.

Pseudorandom graphs enjoy several very interesting properties. It is not hard to show that $(n,d,\lambda)$ graphs with $d-\lambda\geq 2$ are $d$ -edge connected and as a simple consequence, it follows that for even $n$ , $(n,d,\lambda)$ graphs have a perfect matching [18]. In the more general context, it is conceivable that Thomason pseudorandom graphs admit “almost-perfect” matchings, i.e., admit a perfect matching on at least $(1-o(1))|V|$ vertices under not-too-restrictive conditions. The second result of our paper proves a more general version of this statement for NMP for Thomason pseudorandom graphs.

Before we formally state our result, we need the following definition.

Definition 1.2 (NMP-Approximability).

Suppose $\varepsilon>0$ . For functions $f,g:\mathbb{R}^{+}\rightarrow\mathbb{R}^{+}$ such that $f(x),g(x)\to 0$ as $x\to 0$ , a bipartite graph $G(X,Y)$ is said to be $(f,g,\varepsilon)$ -NMP approximable if there are subsets $\mathrm{Del}_{X}\subseteq X$ and $\mathrm{Del}_{Y}\subseteq Y$ such that:

• $\frac{|\mathrm{Del}_{X}|}{|X|}\leq f(\varepsilon)$, $\frac{|\mathrm{Del}_{Y}|}{|Y|}\leq g(\varepsilon)$.

• The bipartite subgraph induced on the sets $X\setminus\mathrm{Del}_{X}$ and $Y\setminus\mathrm{Del}_{Y}$ has NMP.
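The two requirements can be checked mechanically once candidate deletion sets are given. The sketch below deletes the sets, relabels the surviving $Y$-vertices, and tests NMP exhaustively on the induced subgraph; the function names and the toy example (a graph with one isolated vertex in $Y$) are illustrative assumptions, and the sketch does not attempt to find good deletion sets.

```python
from itertools import combinations

def has_nmp(adj, n):
    """Exhaustive NMP check; adj[x] = set of neighbors of x in Y = {0..n-1}."""
    k = len(adj)
    for size in range(1, k + 1):
        for S in combinations(range(k), size):
            nbrs = set().union(*(adj[x] for x in S))
            if len(nbrs) * k < size * n:
                return False
    return True

def delete_vertices(adj, n, del_x, del_y):
    """Return (adjacency, |Y'|) of the subgraph induced on X minus del_x and Y minus del_y."""
    kept_y = [y for y in range(n) if y not in del_y]
    relabel = {y: i for i, y in enumerate(kept_y)}
    new_adj = [
        {relabel[y] for y in adj[x] if y in relabel}
        for x in range(len(adj)) if x not in del_x
    ]
    return new_adj, len(kept_y)

# Hypothetical example: k = 2, n = 4, and y = 3 is isolated.
adj, n = [{0, 1}, {1, 2}], 4
del_x, del_y = set(), {3}
sub_adj, sub_n = delete_vertices(adj, n, del_x, del_y)

print(has_nmp(adj, n), has_nmp(sub_adj, sub_n))
print(len(del_x) / len(adj), len(del_y) / n)   # deleted fractions on each side
```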

We now state our second main result of the paper.

Theorem 1.4.

Suppose $0<\varepsilon<1$ , and let $\omega:\mathbb{N}\to\mathbb{R}^{+}$ be a non-negative valued function that satisfies $\omega(k)\to\infty$ as $k\to\infty$ . There exists an integer $k_{0}=k_{0}(\varepsilon,\omega)$ such that the following holds. Suppose $p\geq\frac{\omega(k)}{k}$ , $|X|=k,|Y|=n$ with $k_{0}<k\leq n$ , and suppose $G=G(X,Y)$ is a Thomason pseudorandom bipartite graph with parameters $(p,\varepsilon)$ . Then $G$ is $(f,g,\varepsilon)$ -NMP-approximable with

(a) $f(x)=O(x)$, $g(x)=O(\sqrt{x})$ if $n>\frac{k}{\sqrt{\varepsilon}}$, and

(b) $f(x)=g(x)=O(\sqrt[4]{x}\log\left(\frac{1}{x}\right))$ if $n\leq\frac{k}{\sqrt{\varepsilon}}$.

Note that in the statement of Theorem 1.4 the bounds $f=g=O(x^{1/4}\log(1/x))$ work for all $(k,n)$ . The first part of the theorem is a stronger conclusion when $n\gg k$ . At the level of generality of the statement of Theorem 1.4, it may in fact be necessary to delete some vertices from the graph in order to achieve NMP. Indeed, the definition of a Thomason pseudorandom graph does not preclude the existence of isolated vertices; in fact, one could add a few isolated vertices to $Y$ to get another pseudorandom graph with only slightly worse parameters! Also, on a less frivolous note, suppose $n=O(k)$ and $\omega(k)\ll\log k$ , and consider $\mathbb{G}(k,n,p)$ ; a consequence of the proof of the second item of Theorem 1.2 (which appears later in the paper as Lemma 3.1) shows that there are isolated vertices in $Y$ whp. Since $\mathbb{G}(k,n,p)$ is also Thomason pseudorandom whp it follows that over the sparser regime for $p$ (where Theorem 1.4 is applicable), the deletion of some vertices is indeed necessary to arrive at the conclusion of Theorem 1.4.

Theorem 1.4 essentially says that if we have a not-too-sparse pseudorandom bipartite graph, i.e., a Thomason pseudorandom graph with $p$ not too small, then we can remove a small fraction of vertices from both parts such that the graph induced by the remaining vertices has the normalized matching property. The sense of how small these sets are is described using the notion of NMP-Approximability defined above. As we shall see, the proof actually establishes an “approximate decomposition” theorem: the vertex set of any Thomason pseudorandom bipartite graph almost admits a decomposition into copies of what we call a Euclidean Tree - a small tree that arises canonically via the execution of the Euclidean algorithm. Furthermore, the entire process of obtaining $\mathrm{Del}_{X}$ and $\mathrm{Del}_{Y}$ is algorithmic (and efficient) in nature and we consider this to be a major feature of our argument. After the publication of this article, we learned that this notion of Euclidean Trees had been defined prior to our work in the context of graphic matroids (see [26]); we thank Attila Sali for bringing this to our attention. So we find it quite interesting to see it reappear in the context of a seemingly unrelated problem.

The rest of the paper is organised as follows. The next section gives some preliminaries and sets up terminology and tools that will be of use in the latter sections. In Section 3 we prove Theorem 1.2, and in Section 4, we prove Theorem 1.4. The paper concludes with some remarks and open questions in Section 5, and an Appendix. As mentioned earlier, the lemma in the Appendix can serve as a generator of several examples of Thomason-pseudorandom graphs for which Theorem 1.4 is applicable. The main reason for including the Lemma is that most of the standard and well-studied examples of pseudorandom graphs that arise from algebraic structures/posets tend to have $|X|=|Y|$ , or even in the cases where $|X|\neq|Y|$ , the corresponding bipartite graphs are much sparser than the ones we need in our hypothesis.

2 Preliminaries

Suppose $G(X,Y,E)$ is a bipartite graph. For $U\subseteq X\cup Y$ , set $U_{X}:=U\cap X$ , $U_{Y}:=U\cap Y$ . For sets $A\subseteq X,B\subseteq Y$ , by $G(A,B)$ we shall mean the subgraph of $G$ induced by the vertex set $A\cup B$ . For a vertex $x$ , $d(x)$ shall denote its degree, and for sets $A\subseteq X,B\subseteq Y$ , $e(A,B)$ shall denote the number of edges between $A$ and $B$ .

We shall repeatedly make use of the Chernoff bound:

Theorem 2.1.

[Chernoff Bound] (As in [15]) Suppose $X\sim Bin(n,p)$ is a binomial random variable and $\lambda:=\mathbb{E}(X)=np$ . Then for $t>0$

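$$\mathbb{P}(X\geq\mathbb{E}(X)+t)\leq\exp\left(-\frac{t^{2}}{2(\lambda+t/3)}\right),\qquad\mathbb{P}(X\leq\mathbb{E}(X)-t)\leq\exp\left(-\frac{t^{2}}{2\lambda}\right).$$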

A natural question that arises in the context of NMP is: If $G(X,Y)$ has NMP, then does $G(Y,X)$ also have NMP, i.e., is it true that for all $T\subseteq Y,\ \frac{|N(T)|}{|X|}\geq\frac{|T|}{|Y|}$ ? This is not immediately obvious from the definition of NMP, but it is indeed the case, as can be immediately seen from the second characterization of Theorem 1.1 which is symmetric in $X$ and $Y$ .

We begin with a simple proposition that will be instrumental in our proof of Theorem 1.2 in Section 3. For a graph $G(X,Y)$ that does not have NMP we say that a set of vertices $S\subseteq X$ witnesses the violation of NMP for $G(X,Y)$ if $\frac{|N(S)|}{|Y|}<\frac{|S|}{|X|}$ .

Lemma 2.1.

Suppose $G(X,Y)$ with $|X|=k$ , $|Y|=n$ does not have NMP. Then, if $T\subset Y$ witnesses the violation of NMP for $G(Y,X)$ , then $X\setminus N(T)\subset X$ witnesses the violation of NMP for $G(X,Y)$ . Moreover, either there exists $S\subset X$ that witnesses the violation of NMP for $G(X,Y)$ with $|S|\leq\frac{k}{2}$ , or there exists $T\subset Y$ that witnesses the violation of NMP for $G(Y,X)$ with $|T|<\frac{n}{2}+\frac{n}{k}$ .

Proof.

If $T\subset Y$ witnesses the violation of NMP for $G(Y,X)$ , then

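$$\frac{|N(T)|}{|X|}<\frac{|T|}{|Y|}\ \Rightarrow\ \frac{|X\setminus N(T)|}{|X|}>\frac{|Y\setminus T|}{|Y|}\geq\frac{|N(X\setminus N(T))|}{|Y|},$$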

where we subtracted both sides from $1$ and used the simple fact that $N(X\setminus N(T))\subseteq Y\setminus T$ in the final inequality. Now, to see the “moreover” part, as $G$ does not have NMP, first let $S$ be a minimal set that witnesses the violation of NMP for $G(X,Y)$ . By the minimality of $S$ , we have $|N(S)|\geq\frac{n}{k}(|S|-1)$ . If $|S|\leq\frac{k}{2}$ , then we are through, so suppose that $|S|>\frac{k}{2}$ . Let $T=Y\setminus N(S)$ . Then note that $|T|<\frac{n}{2}+\frac{n}{k}$ . But then by the argument above (which is symmetric in $X$ and $Y$ ), $T$ witnesses the violation of NMP for $G(Y,X)$ .

∎
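The first claim of Lemma 2.1 can be checked exhaustively on a small example: for every $T\subseteq Y$ that witnesses a violation for $G(Y,X)$, the set $X\setminus N(T)$ should witness a violation for $G(X,Y)$. The sketch below does this for one toy graph; the function names and the graph are illustrative assumptions.

```python
from itertools import combinations

def nbrs_of_x_set(adj, S):
    """N(S) in Y for a subset S of X."""
    out = set()
    for x in S:
        out |= adj[x]
    return out

def nbrs_of_y_set(adj, T):
    """N(T) in X for a subset T of Y."""
    return {x for x, nbrs in enumerate(adj) if nbrs & T}

def witnesses_in_y(adj, n, T):
    """True if |N(T)|/k < |T|/n, i.e. T witnesses a violation for G(Y, X)."""
    return len(nbrs_of_y_set(adj, T)) * n < len(T) * len(adj)

def witnesses_in_x(adj, n, S):
    """True if |N(S)|/n < |S|/k, i.e. S witnesses a violation for G(X, Y)."""
    return len(nbrs_of_x_set(adj, S)) * len(adj) < len(S) * n

# Hypothetical graph with k = 2, n = 3 that does not have NMP.
adj, n = [{0}, {0, 1, 2}], 3
all_x = set(range(len(adj)))

transferred = []
for size in range(1, n + 1):
    for T in map(set, combinations(range(n), size)):
        if witnesses_in_y(adj, n, T):
            S = all_x - nbrs_of_y_set(adj, T)
            transferred.append((T, S, witnesses_in_x(adj, n, S)))

print(transferred)
```

Here $T=\{1,2\}$ has $N(T)=\{1\}$, and its complement-side set $S=\{0\}$ has $N(S)=\{0\}$, so both inequalities are violated as the lemma predicts.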

We also take note of a couple of facts from literature on random graphs that will be useful in the proof of Theorem 1.2. By $d(x)$ (respectively $d(y)$ ) we mean the degree of vertex $x$ into $Y$ (respectively the degree of vertex $y$ into $X$ ) in $G(X,Y)=\mathbb{G}(k,n,p)$ .

Fact 2.2.

Let $p\geq\frac{(1+\varepsilon)\log n}{k}$ . For any fixed $r\in\mathbb{N}$ , in $G(X,Y)$ , $d(x)\geq r$ for all $x\in X$ and $d(y)\geq r$ for all $y\in Y$ whp.

This follows from the following well known result (see [5] for instance, chapter 3) that in $\mathbb{G}(n,n,p)$ if $p=\frac{\log n+(r-1)\log\log n+\omega(n)}{n}$ for any function $\omega(n)$ that goes to infinity with $n$ , then whp $\mathbb{G}(n,n,p)$ has minimum degree $r$ since the number of vertices of degree $r$ is approximately Poisson. The same argument extends to $\mathbb{G}(k,n,p)$ as well.

Fact 2.3.

Let $p\geq\frac{(1+\varepsilon)\log n}{k}$ and suppose $n\geq 2k$ . Then in $G(X,Y)$ , whp every $x\in X$ has degree at least $\frac{\varepsilon n\log n}{2k}$ .

This is an easy consequence of the Chernoff bound (Theorem 2.1). Indeed, since $\mathbb{E}[d(x)]=(1+\varepsilon)\frac{n\log n}{k}$ , it follows that

[TABLE]

We now introduce an important ingredient that is vital to the proof of Theorem 1.4. Suppose $\ell,L$ are positive integers with $\gcd(\ell,L)=1$ . A tree will be called a left-right tree if the two color classes of its vertex set are labelled as “left” and “right” respectively. Since a connected bipartite graph admits a unique $2$ -coloring of its vertices, a left-right tree can be thought of as a tree with a label on each vertex denoting its color class.

The Euclidean $\mathbf{(\ell,L)}$ -tree which we shall denote by $T_{\ell,L}$ , is a left-right tree on $\ell+L$ vertices with $\ell$ left vertices, and $L$ right vertices that is defined recursively as follows. If $\ell=1$ , $T_{1,L}$ is simply a star on $L+1$ vertices with one left vertex and $L$ right vertices. If $L=1$ , then $T_{\ell,1}$ is the star on $\ell+1$ vertices with one right vertex, and $\ell$ left vertices. In general, suppose $X=\{x_{1},\ldots,x_{\ell}\}$ and $Y=\{y_{1},\ldots,y_{L}\}$ are the left and right vertex sets respectively, and suppose $\ell<L$ . Let $M_{1}$ denote the matching consisting of the edges $\{x_{i},y_{i+L-\ell}\}$ for $1\leq i\leq\ell$ . We define $T_{\ell,L}=M_{1}\sqcup T_{\ell,L-\ell}$ where $\sqcup$ denotes an edge disjoint union, and $T_{\ell,L-\ell}$ is the corresponding Euclidean tree with left vertex set $X^{\prime}=X$ and right vertex set $Y^{\prime}=\{y_{1},\ldots,y_{L-\ell}\}$ . If $\ell>L$ then we define $M_{1}$ to be the matching $\{x_{i+\ell-L},y_{i}\}$ for all $1\leq i\leq L$ and define $T_{\ell,L}=M_{1}\sqcup T_{\ell-L,L}$ where $T_{\ell-L,L}$ is the Euclidean tree with left vertex set $X^{\prime}=\{x_{1},\ldots,x_{\ell-L}\}$ and right vertex set $Y^{\prime}=Y$ . A picture is worth a thousand words; see Figure 1 that illustrates the Euclidean tree $T_{3,7}$ , and Figure 2 that illustrates $T_{5,8}$ .
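The recursive definition can be unrolled into a short routine. The following Python sketch is illustrative only: the function name `euclidean_tree` and the encoding of the edge $\{x_{i},y_{j}\}$ as the pair `(i, j)` are assumptions made here, not notation from the paper.

```python
from math import gcd

def euclidean_tree(l, L):
    """Edges of T_{l,L}; the pair (i, j) stands for the edge {x_i, y_j} (1-indexed)."""
    if l < 1 or L < 1 or gcd(l, L) != 1:
        raise ValueError("l and L must be coprime positive integers")
    edges = []
    # Peel off one matching M_1 per step until one side has a single vertex.
    while l > 1 and L > 1:
        if l < L:
            # M_1 = {x_i, y_{i+L-l}}, then continue with T_{l, L-l} on y_1..y_{L-l}
            edges += [(i, i + L - l) for i in range(1, l + 1)]
            L -= l
        else:
            # M_1 = {x_{i+l-L}, y_i}, then continue with T_{l-L, L} on x_1..x_{l-L}
            edges += [(i + l - L, i) for i in range(1, L + 1)]
            l -= L
    if l == 1:
        edges += [(1, j) for j in range(1, L + 1)]   # star centred at x_1
    else:
        edges += [(i, 1) for i in range(1, l + 1)]   # star centred at y_1
    return edges

if __name__ == "__main__":
    for x, y in sorted(euclidean_tree(3, 7)):
        print(f"x{x} - y{y}")
```

For coprime inputs the routine returns $\ell+L-1$ edges, as one expects of a tree on $\ell+L$ vertices.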

The following lemma conveys why Euclidean trees are relevant to us.

Lemma 2.2.

Suppose $T=T_{\ell,L}$ is a Euclidean tree. Then if $X,Y$ denote the sets of left and right vertices respectively, then $T$ as the bipartite graph $T(X,Y)$ has NMP. Moreover, so does the graph obtained by making several vertex-disjoint copies $T(X_{i},Y_{i})$ of $T$ i.e., the graph $\mathcal{T}(\mathcal{X},\mathcal{Y})$ where $\mathcal{X}=X_{1}\sqcup\cdots\sqcup X_{r}$ , $\mathcal{Y}=Y_{1}\sqcup\cdots\sqcup Y_{r}$ .

Proof.

First assume that $\ell<L$ . If $\ell=1$ , then $T$ is simply a star with $L$ leaves, and clearly, $T$ has NMP. Suppose by induction that Euclidean trees with fewer than $\ell+L$ vertices have NMP. Let $S\subseteq X$ . Then since $T=M_{1}\sqcup T_{\ell,L-\ell}$ , it follows that $N(S)=\{y_{j+L-\ell}:x_{j}\in S\}\sqcup N^{\prime}(S)$ where $N^{\prime}(S)$ is the set of neighbors of $S$ among $\{y_{1},\ldots,y_{L-\ell}\}$ . But since $T_{\ell,L-\ell}$ has NMP, we have $|N^{\prime}(S)|\geq\frac{L-\ell}{\ell}|S|$ , so that $|N(S)|\geq|S|+\frac{L-\ell}{\ell}|S|=\frac{L}{\ell}|S|$ and that completes the proof. If $\ell>L$ , then the above argument works with $\ell$ swapped with $L$ throughout and the fact that $T(X,Y)$ has NMP if and only if $T(Y,X)$ does. Finally, the observation that $\mathcal{T}(\mathcal{X},\mathcal{Y})$ has NMP follows immediately from the third (multiplicity function) characterization of NMP in Theorem 1.1. ∎
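For small parameters the lemma can be checked exhaustively. The sketch below is a brute-force sanity check, not part of the proof: the helper `has_nmp` is a hypothetical name, and the edge list `T37` is the tree $T_{3,7}$ written out by hand from the recursive definition (the pair `(i, j)` stands for $\{x_{i},y_{j}\}$).

```python
from itertools import combinations

def has_nmp(edges, left, right):
    """Check |N(S)|/|right| >= |S|/|left| for every non-empty S inside `left`."""
    nbrs = {x: set() for x in left}
    for x, y in edges:
        nbrs[x].add(y)
    for size in range(1, len(left) + 1):
        for S in combinations(left, size):
            N = set().union(*(nbrs[x] for x in S))
            # cross-multiplied integer comparison avoids floating point
            if len(N) * len(left) < size * len(right):
                return False
    return True

# T_{3,7}: two matchings followed by a star centred at y_1.
T37 = [(1, 5), (2, 6), (3, 7), (1, 2), (2, 3), (3, 4), (1, 1), (2, 1), (3, 1)]
X, Y = [1, 2, 3], [1, 2, 3, 4, 5, 6, 7]

# Two vertex-disjoint copies: each vertex is tagged with the index of its copy.
copies = [((c, x), (c, y)) for c in (0, 1) for x, y in T37]
XX = [(c, x) for c in (0, 1) for x in X]
YY = [(c, y) for c in (0, 1) for y in Y]

if __name__ == "__main__":
    print(has_nmp(T37, X, Y), has_nmp(copies, XX, YY))
```

The check enumerates all $2^{|X|}-1$ subsets, so it is only meant for toy instances. An arbitrary tree need not pass it: the tree with edges $x_{1}y_{1},x_{2}y_{1},x_{2}y_{2},x_{2}y_{3}$ fails at $S=\{x_{1}\}$.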

We now describe what we call the “Euclidean $(\ell,L)$ -tree process”, which realizes the graphs $T_{\ell,L}$ through a series of steps. This process, along with the terminology we build here, will be relevant in Section 4 in the proof of Theorem 1.4. The description also justifies why we call them Euclidean trees.

Suppose $\ell<L$ . Consider the Euclidean algorithm on the pair $(\ell,L)$ as follows.

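$$\begin{aligned}L&=q_{m}\ell+r_{m-1},\\ \ell&=q_{m-1}r_{m-1}+r_{m-2},\\ &\;\;\vdots\\ r_{3}&=q_{2}r_{2}+r_{1},\\ r_{2}&=q_{1}r_{1}.\end{aligned}$$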

If we set $r_{m+1}=L,r_{m}=\ell,r_{0}=0$ , then we may write the equalities above as $r_{i+1}=q_{i}r_{i}+r_{i-1}$ for $1\leq i\leq m$ . The integer $m$ is referred to as the complexity of the Euclidean algorithm for the parameters $(\ell,L)$ . The following fact is well-known (see for instance, [17], page 360).

Fact 2.4.

The complexity of the Euclidean algorithm with input parameters $(\ell,L)$ is at most $2.078\log L+0.6723$ .
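To make the indexing $r_{i+1}=q_{i}r_{i}+r_{i-1}$ concrete, here is a small Python sketch that computes the remainders, the quotients and the complexity $m$ . The names `euclid_data` and `complexity_bound` are hypothetical, and reading $\log$ in Fact 2.4 as the natural logarithm is an assumption made here.

```python
from math import gcd, log

def euclid_data(l, L):
    """Run the Euclidean algorithm on coprime 0 < l < L.

    Returns (r, q) with r[0..m+1] and q[1..m] such that
    r[i+1] == q[i] * r[i] + r[i-1], r[m+1] == L, r[m] == l, r[1] == 1, r[0] == 0.
    q[0] is a placeholder (None) so that indices match the text.
    """
    if not (0 < l < L) or gcd(l, L) != 1:
        raise ValueError("expected coprime integers with 0 < l < L")
    rem, quo = [L, l], []
    while rem[-1] != 0:
        quo.append(rem[-2] // rem[-1])
        rem.append(rem[-2] % rem[-1])
    return rem[::-1], [None] + quo[::-1]

def complexity_bound(L):
    """The bound of Fact 2.4, with log taken to be the natural logarithm (assumption)."""
    return 2.078 * log(L) + 0.6723

if __name__ == "__main__":
    r, q = euclid_data(5, 8)
    m = len(q) - 1
    print("remainders:", r, "quotients:", q[1:], "complexity:", m)
    print("bound:", complexity_bound(8))
```

Consecutive Fibonacci numbers such as $(5,8)$ are the slowest inputs for the Euclidean algorithm, which is why they make a natural test case for the bound.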

We now describe $T_{\ell,L}$ as the evolution of an inductive sequence of trees through $m$ stages ( $m$ as above), and in order to do that, we need some additional terminology. By an $X$ $q$ -fan, we mean the tree $T_{1,q}$ , and by a $Y$ $q$ -fan, we mean $T_{q,1}$ . By an $X$ $q$ -thrill of size $r$ (the collective noun for fans is a thrill, so the nomenclature seemed appropriate) we mean a union of $r$ vertex disjoint $X$ $q$ -fans, and a $Y$ $q$ -thrill is defined analogously. For a fixed graph $F$ , an $F$ -factor in a graph $G$ is a spanning subgraph of $G$ consisting of vertex disjoint copies of $F$ . As an example, an $X$ $q$ -thrill admits a factoring by $X$ $q$ -fans.

By definition, $T_{\ell,L}$ is inductively obtained through a sequence of edge disjoint unions of matchings, until we finally terminate in a tree $T_{q,1}$ or $T_{1,q}$ , for some $q$ . We now invert this process.

Suppose $m$ as described above in the Euclidean algorithm is even (the odd case is analogous). Let $T_{1}:=T_{r_{\scriptscriptstyle 2},r_{\scriptscriptstyle 1}}=T_{r_{2},1}$ . Having inductively defined $T_{i-1}$ with left set $X^{(i-1)}$ , right set $Y^{(i-1)}$ and edge set $E_{i-1}$ , we define $T_{i}$ as follows. If $i$ is even, then the vertex set of $T_{i}$ has left set $X^{(i)}:=\{x_{1},\ldots,x_{r_{i}}\}$ , right set $Y^{(i)}=\{y_{1},\ldots,y_{r_{i+1}}\}$ , and the edges of $T_{i}$ consist of the edges of $T_{i-1}$ along with an additional $X$ $q_{i}$ -thrill of size $r_{i}$ between the vertices of $X^{(i-1)}$ and the vertices of $Y^{(i)}\setminus Y^{(i-1)}$ . If $i$ is odd, then $T_{i}$ has left vertex set $X^{(i)}:=\{x_{1},\ldots,x_{r_{i+1}}\}$ , right vertex set $Y^{(i)}:=\{y_{1},\ldots,y_{r_{i}}\}$ and the edges of $T_{i}$ consist of the edges of $T_{i-1}$ along with an additional $Y$ $q_{i}$ -thrill of size $r_{i}$ between the vertices of $X^{(i)}\setminus X^{(i-1)}$ and the vertices of $Y^{(i-1)}$ . In simpler terms, it is the same construction but with the roles of the left and right sets reversed as per the parity of $i$ . The main point is that the graphs $T_{i}$ are precisely the Euclidean trees $T_{r_{(i+1)},r_{i}}$ (or $T_{r_{i},r_{(i+1)}}$ depending on the parity of $i$ ) along with isolated vertices. While the inductive definition of the Euclidean tree $T_{\ell,L}$ appends one additional matching at each step, the Euclidean tree process accelerates this by adding a $q$ -thrill for an appropriate $q$ . In particular, $T_{m}$ is precisely $T_{\ell,L}$ and as we shall see in Section 4, it is particularly handy to think of $T_{\ell,L}$ as the end result of this evolving process. Figure 2 gives an illustration of this evolution for the Euclidean tree $T_{5,8}$ .

3 Threshold for NMP for $\mathbb{G}(k,n,p)$

In this section we prove Theorem 1.2, restated below for convenience. Throughout this section, we shall write $\mathbb{G}$ to denote $\mathbb{G}(k,n,p)$ . Unless stated otherwise, we shall assume $k\leq n\leq\exp(o(k))$ .

Theorem $\mathbf{1.2}$ .

Suppose $k\leq n(k)\leq\exp(o(k))$ , and let $0<\varepsilon,\delta<1$ . There exists $k_{0}=k_{0}(\varepsilon,\delta)$ such that for $k\geq k_{0}(\varepsilon,\delta)$ :

1. If $p\geq\frac{(1+\varepsilon)\log n}{k}$ then $\mathbb{P}[\mathbb{G}(k,n,p)\textrm{\ has\ NMP}]\geq 1-\delta$ .

2. If $p\leq\frac{(1-\varepsilon)\log n}{k}$ then $\mathbb{P}[\mathbb{G}(k,n,p)\textrm{\ has\ NMP}]\leq\delta$ .

We establish item 2 first i.e., that if $p$ is below the threshold then whp, $\mathbb{G}$ does not have NMP. The proof is straightforward as it simply shows the existence of an isolated vertex in $Y$ whp.

Lemma 3.1.

Let $n=n(k)$ be such that $k\leq n(k)$ for all $k\in\mathbb{N}$ . Let $0<\varepsilon<1$ . There exists $k_{0}=k_{0}(\varepsilon)$ such that for $k\geq k_{0}$ , if $p\leq\frac{(1-\varepsilon)\log n}{k}$ then $\mathbb{G}(k,n,p)$ does not have NMP whp.

Proof.

Let $G(X,Y)=\mathbb{G}$ and let $N$ denote the number of isolated vertices in $Y$ . Then $\mathbb{E}[N]=n(1-p)^{k}$ .

Claim 3.1.

Given $c>1$ , there exists a unique $x_{c}\in(0,1)$ such that for all $x\in(0,x_{c}]$ , $1-x\geq\exp(-cx)$ and equality holds only when $x=x_{c}$ . Moreover, as $c\rightarrow 1^{+}$ , $x_{c}\rightarrow 0^{+}$ .

The claim is a standard exercise in basic calculus, so we omit its proof.

Fix $c$ such that $1<c<\frac{1}{1-\varepsilon}$ . Since $p\leq\frac{(1-\varepsilon)\log n}{k}=o(1)$ , by the above claim, $1-p\geq\exp(-cp)$ for all sufficiently large $k$ . Consequently, $\mathbb{E}[N]=n\cdot(1-p)^{k}\geq\exp(-cpk+\log n)\geq\exp(\alpha\log n)=n^{\alpha}$ , which grows to infinity as $k$ does, where $\alpha=\alpha(\varepsilon)$ is defined to be $1-c(1-\varepsilon)>0$ . Now using the Chernoff bound (taking $t=\lambda=\mathbb{E}[N]$ in the second inequality in Theorem 2.1), we have

[TABLE]

for large $n$ . This concludes the proof. ∎
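The growth of $\mathbb{E}[N]=n(1-p)^{k}$ below the threshold is easy to see numerically. The sketch below evaluates it at $p=\frac{(1-\varepsilon)\log n}{k}$ ; the function name `expected_isolated` and the sample values $\varepsilon=0.5$ , $c=1.5$ , $n=2k$ are illustrative choices made here, not values used in the paper.

```python
from math import log

def expected_isolated(k, n, eps):
    """E[N] = n (1 - p)^k at p = (1 - eps) log(n) / k, N = number of isolated vertices in Y."""
    p = (1 - eps) * log(n) / k
    return n * (1 - p) ** k

eps = 0.5
c = 1.5                       # any constant with 1 < c < 1/(1 - eps) = 2
alpha = 1 - c * (1 - eps)     # exponent from the proof, here 0.25

if __name__ == "__main__":
    for k in (10**3, 10**4, 10**5):
        n = 2 * k
        print(k, n, expected_isolated(k, n, eps), n ** alpha)
```

Since $1-p\leq\exp(-p)$ , the same quantity is bounded above by $n^{\varepsilon}$ , so the expected number of isolated vertices lies between $n^{\alpha}$ and $n^{\varepsilon}$ once $p$ is small enough for the claim to apply.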

Lemma 3.1 establishes that the right threshold for having NMP in $\mathbb{G}$ must be at least as large as $\frac{\log n}{k}$ . The following is a heuristic argument that suggests that it is exactly $\frac{\log n}{k}$ . As mentioned in the Introduction, a classical result of Erdős-Rényi states that a sharp threshold for the existence of a perfect matching in the random bipartite graph $\mathbb{G}(n,n,p)$ is $p=\frac{\log n}{n}$ . In our present situation, suppose $k$ divides $n$ . Replicate each vertex of $X$ by a factor of $n/k$ to obtain the set $X^{\prime}$ . Define the graph $G^{\prime}(X^{\prime},Y)$ as follows. If $x^{\prime}\in X^{\prime}$ arises from the replication of the vertex $x\in X$ , then $x^{\prime}y\in E(G^{\prime})$ if and only if $xy\in E(G)$ . It is a straightforward exercise to see that the original graph $G(X,Y)$ has NMP if and only if $G^{\prime}(X^{\prime},Y)$ satisfies Hall’s condition, or equivalently, $G$ has NMP if and only if $G^{\prime}$ has a perfect matching. If this new bipartite graph behaved like $\mathbb{G}(n,n,p)$ (which it does not), then we would need $p\sim\frac{\log n}{n}$ for the existence of a perfect matching. But since each vertex of $X$ has been blown up to $n/k$ copies, it is intuitive to expect that each vertex of $G$ behaves like the union of all these $n/k$ vertices bundled together, which suggests a threshold of $\frac{n}{k}\cdot\frac{\log n}{n}=\frac{\log n}{k}$ . While this argument is just a heuristic, it suggests what the correct threshold ought to be, as we next show is indeed the case by establishing the remaining (and main) item $1$ of Theorem 1.2.
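The equivalence between NMP for $G$ and a perfect matching in the blow-up $G^{\prime}$ can be tested on small random instances. The following Python sketch assumes $k$ divides $n$ ; all function names are hypothetical, NMP is checked by brute force over subsets of $X$ , and the perfect matching is found with a standard augmenting-path routine (a choice made here for brevity, not a method taken from the paper).

```python
import random
from itertools import combinations

def random_bipartite(k, n, p, rng):
    """Sample G(k, n, p): adj[x] is the set of neighbours of x in Y = {0, ..., n-1}."""
    return [{y for y in range(n) if rng.random() < p} for _ in range(k)]

def has_nmp(adj, n):
    """Brute force: |N(S)| * k >= |S| * n for every non-empty S inside X."""
    k = len(adj)
    for size in range(1, k + 1):
        for S in combinations(range(k), size):
            nbrs = set().union(*(adj[x] for x in S))
            if len(nbrs) * k < size * n:
                return False
    return True

def blow_up(adj, n):
    """Replace every vertex of X by n/k copies with the same neighbourhood."""
    k = len(adj)
    if n % k:
        raise ValueError("this sketch assumes that k divides n")
    return [adj[x] for x in range(k) for _ in range(n // k)]

def has_perfect_matching(left_adj, n):
    """Augmenting-path matching between the n left vertices and Y."""
    match = [-1] * n  # match[y] = left vertex currently matched to y, or -1

    def augment(u, seen):
        for y in left_adj[u]:
            if y not in seen:
                seen.add(y)
                if match[y] == -1 or augment(match[y], seen):
                    match[y] = u
                    return True
        return False

    # If some left vertex cannot be matched, no perfect matching exists.
    return all(augment(u, set()) for u in range(len(left_adj)))

if __name__ == "__main__":
    rng = random.Random(0)
    G = random_bipartite(3, 6, 0.6, rng)
    print(has_nmp(G, 6), has_perfect_matching(blow_up(G, 6), 6))
```

The two printed values always agree, which is exactly the statement that $G$ has NMP if and only if $G^{\prime}$ satisfies Hall's condition.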

Here is an overview of the proof. Lemma 3.2 proves the theorem when $n/k$ is large (i.e., grows to infinity with $k$ ), and this part of the proof only takes recourse to Theorem 1.1. The general case, however, is a little more delicate. The basic idea there is to estimate the probability that there is a minimal set $S$ that violates the NMP condition. In that sense, our strategy follows a line of argument à la Erdős-Rényi, but we need some additional ideas and more careful analysis to carry it through.

Lemma 3.2.

Suppose $n=k\omega(k)$ where the function $\omega(k)\geq 1$ for all $k\in\mathbb{N}$ and satisfies $\omega(k)\rightarrow\infty$ as $k\rightarrow\infty$ . Let $0<\varepsilon,\delta<1$ . Then there exists $k_{0}=k_{0}(\varepsilon,\delta)$ such that for $k\geq k_{0}(\varepsilon,\delta)$ , if $p\geq\frac{(1+\varepsilon)\log n}{k}$ , then $\mathbb{P}[\mathbb{G}(k,n,p)\textrm{\ has\ NMP}]\geq 1-\delta$ .

Proof.

Let $0<\varepsilon<1/5$ , and let $|X|=k\leq n=|Y|$ . Since NMP is a monotone property, it suffices to establish the lemma for $p=\frac{(1+\varepsilon)\log n}{k}$ .

Suppose $\mathbb{G}$ fails to have NMP. By Theorem 1.1, there exists an independent set $I=I_{X}\cup I_{Y}$ in $\mathbb{G}$ such that $\frac{|I_{X}|}{k}+\frac{|I_{Y}|}{n}>1.$ Thus, from the union bound, the probability that $\mathbb{G}$ does not have NMP is at most $\sum_{\ell=1}^{k}P_{\ell}$ , where for $1\leq\ell\leq k$ ,

[TABLE]

Here, $P_{\ell}$ is an upper bound on the probability that there is a set $S\subseteq X$ of size $\ell$ and a set $T\subseteq Y$ of size $\left\lceil n\left(1-\frac{\ell}{k}\right)\right\rceil$ such that $S\cup T$ is an independent set. $P_{k}$ is an upper bound on the probability that $Y$ contains an isolated vertex.

We define $\varepsilon^{\prime}:=\varepsilon/2$ and split $\sum_{\ell}P_{\ell}$ into three cases according to whether $\ell$ is “small”, “intermediate”, or “large”. We repeatedly make use of the well-known bound $1+x\leq\exp(x)$ for all $x\in\mathbb{R}$ and the binomial coefficient bound $\binom{N}{K}\leq\left(\frac{eN}{K}\right)^{K}$ for all $K\leq N$ .

Small Case: $1\leq\ell\leq\varepsilon^{\prime}k$ .

Here, using $\binom{n}{\left\lceil n\left(1-\frac{\ell}{k}\right)\right\rceil}=\binom{n}{\left\lfloor\frac{n\ell}{k}\right\rfloor}$ followed by standard binomial coefficient bounds, (1) yields

[TABLE]

where to derive (4), we use the bounds $\left\lceil n\left(1-\frac{\ell}{k}\right)\right\rceil\geq n\left(1-\frac{\ell}{k}\right)$ , $1+\log\frac{k}{\ell}\leq 1+\log k\leq(1+\frac{\varepsilon}{8})\log k$ and $1+\frac{n}{k}\leq(1+\frac{\varepsilon}{8})\frac{n}{k}$ for large enough $k$ . This is where we crucially use our assumption that $n/k\rightarrow\infty$ as $k\rightarrow\infty$ . (5) follows by using the trivial fact that $\log k\leq\log n$ and taking out the common factor $\frac{n\ell}{k}\cdot\log n$ . (6) is obtained by using $\ell\geq 1$ , plugging in $\varepsilon^{\prime}=\varepsilon/2$ and working out that the expression in the square brackets in (5) is at most $-\varepsilon/8$ for small $\varepsilon$ . Finally, since $\frac{n}{k}>\frac{16}{\varepsilon}$ for large enough $k$ , it follows that $P_{\ell}<1/n^{2}$ in this case.

Intermediate Case: $\varepsilon^{\prime}k\leq\ell\leq(1-\varepsilon^{\prime})k$ .

Using the same expression for the upper bound on $P_{\ell}$ as in the previous case, we have

[TABLE]

Using the observation that in this case, $\frac{\ell}{k}(1-\frac{\ell}{k})\geq\varepsilon^{\prime}(1-\varepsilon^{\prime})$ and the trivial bound $1+\frac{n}{k}\leq\frac{2n}{k}$ , we obtain

[TABLE]

where the last inequality follows, on setting $x=\ell/k$ , from the fact that $x\log\frac{1}{x}<0.5$ for all $0<x<1$ . Hence, $P_{\ell}<\frac{1}{n^{\varepsilon n/3}}.$

Large Case: $(1-\varepsilon^{\prime})k\leq\ell<k$ .

This case is completely analogous to the small case. First, observe

[TABLE]

for large enough $k$ (again using $n/k\rightarrow\infty$ as $k\rightarrow\infty$ ) and we have that $P_{\ell}$ is at most

[TABLE]

where in the last step we use the bound $1+\log\frac{k}{k-\ell}\leq 1+\log k\leq\left(1+\frac{\varepsilon}{8}\right)\log k$ for large enough $k$ . Consequently,

[TABLE]

To explain the last step, the expression within the square brackets evaluates to $\frac{\varepsilon}{512}(\varepsilon^{2}+280\varepsilon-64)$ which is at most $\frac{-199\varepsilon}{12800}<\frac{-\varepsilon}{128}$ when $0<\varepsilon<1/5$ . But $\frac{n}{k}>256/\varepsilon$ for sufficiently large $k$ and $n$ since $n/k\rightarrow\infty$ . Thus, we have $\sum_{\ell}P_{\ell}=o(1)$ and that completes the proof of the lemma. ∎

Note that the argument in the intermediate case does not require $k=o(n)$ and in fact shows the following (in light of Theorem 1.1, switching from the independent set viewpoint to the violation of NMP viewpoint):

Corollary 3.2.

Given $\varepsilon>0$ , for any $k\leq n$ large enough, and vertex sets $X$ and $Y$ of sizes $k$ and $n$ respectively, the probability that there exists $S\subset X$ with $\varepsilon^{\prime}k\leq|S|\leq(1-\varepsilon^{\prime})k$ for $\varepsilon^{\prime}=\varepsilon/2$ such that $S$ witnesses a violation of NMP for $G(X,Y)=\mathbb{G}(k,n,p)$ is at most $n^{-\Omega_{\varepsilon}(n)}$ .

Interestingly, the proof of Lemma 3.2 actually works out for all $n\geq k$ if one assumes $p\geq\frac{10\log n}{k}$ in the hypothesis instead of the sharper assumption on $p$ . This, combined with Lemma 3.1, already establishes that $\frac{\log n}{k}$ is a threshold for NMP. The additional ideas employed in the remainder of this section are essentially only required to show that $\frac{\log n}{k}$ is a sharp threshold.

Proof of Theorem 1.2.

In light of Lemma 3.2, it suffices to prove the theorem assuming $\frac{n}{k}\leq\log n$ . $\log n$ here may be replaced by any slow-growing (but unbounded) function of $k$ or $n$ without much change to the rest of the argument, but we stick to $\log n$ for convenience.

By Lemma 2.1, either there exists $S\subset X$ with $|S|\leq k/2$ that witnesses a violation of NMP for $G(X,Y)$ , or there exists $T\subset Y$ with $|T|<\frac{n}{2}+\frac{n}{k}$ that witnesses the violation of NMP for $G(Y,X)$ (of course, these cases need not be mutually exclusive; we merely use that combined, they exhaust the event that NMP is violated). The proof naturally splits into cases (labelled $X$ and $Y$ respectively) according to whether the set witnessing the violation is a subset of $X$ or $Y$ . We shall show that either case occurs with low probability by exploiting certain properties of the minimal witness.

Case $X$ :

Define $\ell_{\min}$ to be the constant $\frac{18}{\varepsilon}$ if $1\leq\frac{n}{k}<2$ and $\frac{\varepsilon\log n}{2}$ if $2\leq\frac{n}{k}\leq\log n$ . In light of Facts 2.2 (for $r=\frac{36}{\varepsilon}$ if $1\leq\frac{n}{k}<2$ ) and 2.3, it follows that any minimal $S\subset X$ that witnesses the violation of NMP for $G(X,Y)$ must satisfy $|S|\geq\frac{k\delta(G)}{n}\geq\ell_{\min}$ whp, where $\delta(G)$ denotes the minimum degree of the vertices in $X$ . The choice of the peculiar constant $r=\frac{36}{\varepsilon}$ will become clear later.

Suppose $S\subset X$ is a minimal witness such that $\ell_{\min}\leq|S|=\ell\leq\varepsilon^{\prime}k$ where $\varepsilon^{\prime}=\frac{\varepsilon}{2}$ . We first claim that every $U\subset N(S)$ of size $\left\lceil\frac{n}{k}\right\rceil$ has (as a set) at least $2$ neighbors in $S$ . Indeed, suppose there is a subset $U$ of $\lceil\frac{n}{k}\rceil$ vertices in $N(S)$ which are the neighbors of only one vertex $x$ in $S$ . Then by the minimality of $S$ , it follows that the set $S^{\prime}=S\setminus\{x\}$ satisfies $\frac{n}{k}|S|-\lceil\frac{n}{k}\rceil>|N(S^{\prime})|\geq\frac{n}{k}(|S|-1)$ which is a contradiction, and that proves the claim.

We divide case $X$ further into two subcases. First, we bound the probability that there exists $S\subset X$ of size $\ell$ for which $\frac{4\ell n\log n}{k^{2}}<1$ (notice that this clearly implies $\ell\leq\varepsilon^{\prime}k$ ) which witnesses a violation of NMP for $G(X,Y)$ . So fix a choice for $S\subset X$ of size $\ell$ , and $T\subset Y$ (which will represent $N(S)$ ) of size equal to some integer in the interval $[\frac{n\ell}{k}-\frac{n}{k},\frac{n\ell}{k})$ . Fix a partition of $T$ into sets of size $\left\lceil\frac{n}{k}\right\rceil$ . By size considerations, there are at least $t=\left\lfloor\frac{n(\ell-1)}{k\lceil n/k\rceil}\right\rfloor\geq\left\lfloor\frac{\ell-1}{1+(k/n)}\right\rfloor\geq\left\lfloor\frac{\ell-1}{2}\right\rfloor$ such parts, and by the observation above, each such part admits at least two neighbors in $S$ . We conclude that the probability that there exists $S\subset X$ with $|S|\leq\frac{k^{2}}{4n\log n}$ which witnesses a violation of NMP for $G(X,Y)$ is at most

[TABLE]

To see why, observe that there are $\binom{k}{\ell}$ choices for $S$ , at most $n/k$ values for $|N(S)|$ (since $S$ minimally witnesses a violation of NMP), each of which is at most $\lfloor\frac{n\ell}{k}\rfloor$ . The probability that $e(S,Y\setminus N(S))=0$ is at most $(1-p)^{\ell\left\lceil n\left(1-\frac{\ell}{k}\right)\right\rceil}$ , and finally, the last expression is a bound on the probability that each of the $t$ blocks of vertices has at least $2$ neighbors in $S$ . The condition on $\ell$ that we have imposed in this subcase simply translates to the observation that the quantity in the right-most parenthesis that is raised to $t$ is less than $1$ . So, we have

[TABLE]

for $n,k$ sufficiently large and where in the final step, we used the fact that an infinite geometric series is at most twice the first term, when the common ratio is small enough. This expression is clearly $o(1)$ when $\frac{n}{k}\geq 2$ (and so $\ell_{\min}=\frac{\varepsilon\log n}{2}$ ). Further, it is at most $\frac{k^{3}}{32\log^{3}n}\left(\frac{4e\log^{2}n}{n^{\varepsilon/6}}\right)^{18/\varepsilon}=O(\frac{1}{\log^{3}n})=o(1)$ when $1\leq\frac{n}{k}<2$ .

For the subcase $\frac{k^{2}}{4n\log n}\leq\ell\leq\varepsilon^{\prime}k$ , we simply bound the probability of a minimal $S$ whose size is in this range by the probability that $S\cup\overline{N(S)}$ is independent, and sum over the entire range of $\ell$ again; we shall call the resulting bound $\Sigma_{2}$ . First, observe that in this subcase,

[TABLE]

and thus,

[TABLE]

as before and we are through.

Finally, observe that the case $\varepsilon^{\prime}k\leq|S|\leq k/2$ follows immediately from Corollary 3.2.

Case $Y$ :

There is a minimal witness $T\subset Y$ with $|T|=s\leq\frac{n}{2}+\frac{n}{k}$ that witnesses the violation of NMP for $G(Y,X)$ . This time though, since $k\leq n$ it follows that $|N(T)|\leq\lfloor\frac{ks}{n}\rfloor$ , and that for every $x\in N(T)$ there are at least $2$ neighbors in $T$ . Now, define $s_{\min}\coloneqq\frac{12}{\varepsilon}$ . As earlier, by Fact 2.2, the minimal $T\subset Y$ that witnesses the violation of NMP for $G(Y,X)$ must have size at least $s_{\min}$ whp. Again, we split this into two subcases: $s_{\min}\leq s\leq\varepsilon^{\prime}n$ and $s\geq\varepsilon^{\prime}n$ where again $\varepsilon^{\prime}=\varepsilon/2$ .

Suppose $s_{\min}\leq s\leq\varepsilon^{\prime}n$ . Analogous to how we divided Case $X$ into two subcases, let us first assume that $s\leq\frac{k}{2\log n}$ which in particular, lets us assume that $sp<1$ . Then the probability that such a witness exists of size in this range is at most

[TABLE]

where to derive (9), we use $\lfloor\frac{ks}{n}\rfloor\geq\frac{ks}{n}-1$ in the exponent and the cruder bound $\left\lfloor\frac{ks}{n}\right\rfloor\geq\frac{ks}{2n}$ elsewhere, which is applicable since by assumption, $\left\lfloor\frac{ks}{n}\right\rfloor\geq|N(T)|\geq s_{\min}>1$ . We also subsequently drop the range $\frac{k}{2\log n}\geq s\geq s_{\min}$ in the sum for convenience. Next, if $\frac{k}{2\log n}\leq s\leq\varepsilon^{\prime}n$ , then we simply bound the probability of there being a witness of size in this range by the probability that $T\cup\overline{N(T)}$ is an independent set (i.e. the final parenthesis in the expression for $M_{1}$ above is dropped) and sum over this range of $s$ again. The calculations (for the accordingly defined expression $M_{2}$ ) are very similar to those for $\Sigma_{2}$ in case $X$ and are omitted here.

Finally, if $|T|>\varepsilon^{\prime}n$ , then note that $S=X\setminus N(T)$ has size $(1-\varepsilon^{\prime})k\geq|S|\geq\varepsilon^{\prime}k$ , and by Lemma 2.1, $S$ witnesses the violation of NMP for $G(X,Y)$ and is covered by Corollary 3.2. ∎

4 Normalized Matching Property in Pseudorandom Graphs

In this section, we prove Theorem 1.4 which is restated below for convenience. Suppose $0<p<1$ and $0<\varepsilon<1$ . Recall that a bipartite graph $G(X,Y)$ with $|X|=k\leq n=|Y|$ is called Thomason pseudorandom with parameters $(p,\varepsilon)$ if every vertex in $X$ has degree at least $pn$ , and if every pair of vertices in $X$ have at most $p^{2}n(1+\varepsilon)$ neighbors in common.
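The two conditions in this definition are straightforward to test on a concrete graph. The sketch below is an illustration under assumptions made here: the function name `is_thomason_pseudorandom` is hypothetical, exact rationals are used to avoid rounding issues, and the example graph is the point-line incidence graph of the Fano plane with $p=3/7$ .

```python
from fractions import Fraction
from itertools import combinations

def is_thomason_pseudorandom(adj, n, p, eps):
    """adj[x] = set of neighbours in Y (|Y| = n) of the vertex x of X.

    Checks: every degree is at least p*n, and every pair of vertices of X
    has at most p^2 * n * (1 + eps) common neighbours.
    """
    if any(len(N) < p * n for N in adj):
        return False
    return all(len(adj[u] & adj[v]) <= p * p * n * (1 + eps)
               for u, v in combinations(range(len(adj)), 2))

# Fano plane: 7 points, 7 lines, 3 points per line, any two points on one line.
LINES = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
FANO = [{j for j, line in enumerate(LINES) if x in line} for x in range(7)]

if __name__ == "__main__":
    print(is_thomason_pseudorandom(FANO, 7, Fraction(3, 7), Fraction(0)))
```

Here every point lies on $3=pn$ lines and two points share exactly one line, which is below $p^{2}n=9/7$ , so the check passes even with $\varepsilon=0$ .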

Theorem 1.4.

Suppose $0<\varepsilon<1$ , and let $\omega:\mathbb{N}\to\mathbb{R}^{+}$ be a non-negative valued function that satisfies $\omega(k)\to\infty$ as $k\to\infty$ . There exists an integer $k_{0}=k_{0}(\varepsilon,\omega)$ such that the following holds. Suppose $p\geq\frac{\omega(k)}{k}$ , $|X|=k,|Y|=n$ with $k_{0}<k\leq n$ , and suppose $G=G(X,Y)$ is a Thomason pseudorandom bipartite graph with parameters $(p,\varepsilon)$ . Then $G$ is $(f,g,\varepsilon)$ -NMP-approximable with

(a) $f(x)=O(x)$ , $g(x)=O(\sqrt{x})$ if $n>\frac{k}{\sqrt{\varepsilon}}$ , and

(b) $f(x)=g(x)=O(\sqrt[4]{x}\log\left(\frac{1}{x}\right))$ if $n\leq\frac{k}{\sqrt{\varepsilon}}$ .

In what follows, $G=G(X,Y)$ is a Thomason pseudorandom graph with parameters $(p,\varepsilon)$ where $\varepsilon>0$ and $p\geq\frac{\omega(k)}{k}$ where $\omega(k)$ denotes a function that satisfies $\omega(k)\to\infty$ as $k\to\infty$ . As always, $|X|=k\leq n=|Y|$ , and $n,k$ are sufficiently large (depending on the choice of $\varepsilon$ and $\omega$ ). As in the proof of Theorem 1.2, we split the task of proving NMP-approximability into two cases: the first, in which $n$ is significantly larger than $k$ and the second, in which the two are comparable.

Here is a brief overview of the proof. Suppose that

$$\frac{n}{k}=\frac{L}{\ell},$$

where the latter is the representation in reduced form, i.e., $\gcd(\ell,L)=1$ and $\ell,L\in\mathbb{N}$ . Our strategy of proof is to show that we can find small sets $D_{X}\subset X,D_{Y}\subset Y$ such that $G(X\setminus D_{X},Y\setminus D_{Y})$ admits a vertex decomposition into copies of the Euclidean tree $T_{\ell,L}$ . Since $T_{\ell,L}$ has NMP by Lemma 2.2, this establishes the NMP-approximability of $G$ . An essential ingredient in the proof of both cases is Lemma 4.1 (which appears below) which basically states: If $G(X,Y,E)$ satisfies that for every subset $A\subseteq X$ of size at least $1/p$ and every subset $B\subseteq Y$ , we have $|e(A,B)-p|A||B||\leq\sqrt{pn|A||B|(1+\varepsilon p|A|)}$ , then all large enough subsets of $X,Y$ admit an almost partition into $X$ -thrills or $Y$ -thrills (as the case may be).

The application of this lemma in the first case ( $n/k$ large) is straightforward, but in the second case, it does not apply directly. The principal issue in the second case emanates from the possibility that in the reduced form $\ell,L$ are still large; for instance if $n,k$ are coprime, then $(\ell,L)=(k,n)$ and Lemma 4.1 does not apply. To circumvent this difficulty, we pre-process the graph, by deleting a small portion from both $X,Y$ to get $X^{\prime},Y^{\prime}$ so that the reduced form $(\ell,L)$ for $(|X^{\prime}|,|Y^{\prime}|)$ satisfies $\ell,L=O_{\varepsilon}(1)$ . Lemma 4.1 then applies in a multi-step process that we describe in Lemma 4.2.

Lemma 4.1.

Let $\varepsilon>0$ and $q\in\mathbb{N}$ be such that $q=\left\lfloor\frac{n}{k}\right\rfloor$ or $q=O_{\varepsilon}(1)$ . Suppose $G(X,Y,E)$ satisfies the conclusion of Theorem 1.3. Let $U\subseteq X$ and $V\subseteq Y$ and define $d_{0}=2\varepsilon n$ . Then there exist subsets $A\subseteq U,B\subseteq V$ such that if $|U|=u,|V|=v,|A|=a,$ and $|B|=b$ , then

• if $v=qu$ , then $G(U\setminus A,V\setminus B)$ is spanned by an $X$ $q$ -thrill where $a\leq d_{0}/q$ and $b\leq d_{0}$ ;

• if $u=qv$ , then $G(U\setminus A,V\setminus B)$ is spanned by a $Y$ $q$ -thrill where $a\leq qd_{0}$ and $b\leq d_{0}$ .

Proof.

First, assume that $|V|=q|U|$ . Let $\mathcal{F}$ be a maximal $X$ $q$ -thrill in $G(U,V)$ and let $\mathcal{F}\cap U=\tilde{U}$ , i.e., let $\tilde{U}$ denote the set of all those vertices in $U$ which belong to a $q$ -fan in $\mathcal{F}$ . Similarly, let $\mathcal{F}\cap V=\tilde{V}$ and set $A:=U\setminus\tilde{U},B:=V\setminus\tilde{V}$ . Since $\mathcal{F}$ is an $X$ $q$ -thrill, $q(u-a)=v-b$ which gives $b=qa$ . Note that we may assume that $a>1/p$ as otherwise, the bounds on $a$ and $b$ hold trivially since $1/p<d_{0}/q$ for either assumption on $q$ .

By the maximality of $\mathcal{F}$ , no vertex in $A$ has more than $q-1$ neighbors in $B$ , implying $e(A,B)<qa$ . Since $a>1/p$ , the aforementioned observation coupled with Theorem 1.3 implies

[TABLE]

so that

[TABLE]

Plugging $b=qa$ yields

[TABLE]

which upon further simplification, yields the following quadratic inequality in $a$ :

[TABLE]

Since $pn-q>0$ for either assumption on $q$ for $n$ sufficiently large,

[TABLE]

It now suffices to show that (for either assumption on $q$ ) $d\leq d_{0}/q$ . Note that $\frac{2}{p}+\frac{2}{\varepsilon p}<\frac{4k}{\varepsilon\omega(k)}$ . If $q=O_{\varepsilon}(1)$ , then we have for large enough $k$ that $\omega(k)>4q/\varepsilon^{2}$ and therefore, $\frac{4k}{\varepsilon\omega(k)}\leq\frac{4n}{\varepsilon\omega(k)}<\frac{\varepsilon n}{q}.$ If $q=\left\lfloor\frac{n}{k}\right\rfloor$ , then for large enough $k$ , we have that $\omega(k)>4/\varepsilon^{2}$ and therefore, $\frac{4k}{\varepsilon\omega(k)}\leq\varepsilon k\leq\frac{\varepsilon n}{q}.$

Now, assume that $u=qv$ . This case proceeds analogously to the previous one, with only minor changes at appropriate places. Let $\mathcal{F}$ now be a maximal $Y$ $q$ -thrill and let $\tilde{U}=\mathcal{F}\cap U$ and $\mathcal{F}\cap V=\tilde{V}$ . Define $A$ and $B$ as in the previous case. Then by the maximality of $\mathcal{F}$ , no vertex in $B$ has more than $q-1$ neighbors in $A$ , implying $e(A,B)<qb$ . Further, we have $a=qb$ . By Theorem 1.3, assuming $a>1/p$ as earlier, we have

[TABLE]

Upon plugging in $a=bq$ and working out as before, we obtain the quadratic inequality

[TABLE]

which is identical to (15) except with $b$ in place of $a$ and $q\varepsilon$ in place of $\varepsilon$ . Thus, it follows that $b<\frac{2}{p}+\varepsilon n+\frac{2}{\varepsilon pq}\leq\frac{2}{p}+\varepsilon n+\frac{2}{\varepsilon p}=d$ , therefore $a\leq qd$ . This implies the claimed bounds in terms of $d_{0}$ as before.∎
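The only property of the maximal thrill $\mathcal{F}$ used in the proof is that no leftover vertex of $A$ has $q$ or more neighbors in $B$ . A single greedy pass already produces a thrill with this property, as the following Python sketch shows. The function name `greedy_x_thrill` and the dictionary representation of the graph are assumptions made here, and the thrill found is maximal with respect to inclusion, not necessarily of maximum size.

```python
def greedy_x_thrill(adj, U, V, q):
    """Greedily build an inclusion-maximal X q-thrill in G(U, V).

    adj[x] is the set of neighbours of x. Returns (fans, A, B), where fans maps
    each covered x in U to its q leaves, and A, B are the uncovered parts of U, V.
    """
    free = set(V)
    fans = {}
    for x in U:
        avail = sorted(adj[x] & free)
        if len(avail) >= q:
            fans[x] = avail[:q]        # x becomes the centre of a q-fan
            free -= set(fans[x])
    # `free` only shrinks, so a skipped x has fewer than q neighbours in the final B.
    A = [x for x in U if x not in fans]
    return fans, A, sorted(free)

if __name__ == "__main__":
    adj = {0: {0, 1, 2}, 1: {0, 1}, 2: {3, 4, 5}}
    fans, A, B = greedy_x_thrill(adj, [0, 1, 2], list(range(6)), 2)
    print(fans, A, B)
```

When $|V|=q|U|$ the leftover sets satisfy $b=qa$ , matching the identity $q(u-a)=v-b$ used at the start of the proof.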

A few remarks are in order.

1. Though we have slightly stronger bounds on $a$ and $b$ in the second case (when $u=qv$ ), we simply use the stated bounds for the sake of ease of calculations later.

2. When $\varepsilon=0$ (for instance in the pseudorandom graphs that arise from the point-hyperplane incidences of projective geometries), the calculations above in fact yield $a<\frac{1}{p}+\sqrt{\frac{n}{pq}}$ when $v=qu$ and something analogous when $u=qv$ . In particular, the sizes of the deleted parts are considerably smaller in this case.

3. If $U\subset X^{\prime}\subset X,V\subset Y^{\prime}\subset Y$ then the conclusions of Lemma 4.1 hold even for the graph $G(X^{\prime},Y^{\prime})$ with the same parameters $(p,\varepsilon)$ since the lemma directly applies to the pair $(U,V)$ as a subset of $(X,Y)$ . This is vital to the way we apply the lemma in the proof of Theorem 1.4 part (b).

Proof of Theorem 1.4 part (a).

Suppose $n=qk+r$ , where $q=\left\lfloor\frac{n}{k}\right\rfloor$ and $r$ is an integer such that $0\leq r<k$ . Choose an arbitrary subset $C_{Y}\subset Y$ of size $r$ and define $Y_{1}=Y\setminus C_{Y}$ . Apply Lemma 4.1 to the sets $U=X$ and $V=Y_{1}$ to obtain $A\subset X$ and $B\subset Y_{1}$ such that $G(X\setminus A,Y_{1}\setminus B)$ is spanned by an $X$ $q$ -thrill and therefore has NMP (by Lemma 2.2). Define $\mathrm{Del}_{X}=A$ and $\mathrm{Del}_{Y}=C_{Y}\cup B$ so that

[TABLE]

and

[TABLE]

∎

Lemma 4.2.

Suppose $L/\ell$ is the representation in reduced form of $n/k$ , suppose $L,\ell=O_{\varepsilon}(1)$ and let $d_{0}=2\varepsilon n$ . There exist subsets $D_{X}\subset X,D_{Y}\subset Y$ with $|D_{X}|\leq\ell md_{0}$ and $|D_{Y}|\leq Lmd_{0}$ , such that $G(X\setminus D_{X},Y\setminus D_{Y})$ admits a $T_{\ell,L}$ -factor. Here, $m$ is the complexity of the Euclidean algorithm for the parameters $(\ell,L)$ as defined in Section 2.

Proof of Lemma 4.2.

Partition both $X$ and $Y$ arbitrarily into “blocks”, each of size $t=\gcd(k,n)$ . Let the blocks be denoted by $X_{1},\ldots,X_{\ell}$ and $Y_{1},\ldots,Y_{L}$ respectively. We shall refer to the $X_{i}$ blocks as left blocks and the $Y_{j}$ blocks as right blocks. Let $r_{i},q_{j}$ be the remainders and quotients as defined in Section 2. We shall now replicate the Euclidean $(\ell,L)$ -tree process with the vertices being replaced by these blocks, which we shall carry out in $m$ stages, beginning with stage $1$ .

In the rest of the proof of Lemma 4.2 we assume that $m$ is even; the $m$ odd case is completely analogous. We also define the sets $\mathcal{X}^{(i)}$ and $\mathcal{Y}^{(i)}$ analogous to the sets $X^{(i)}$ and $Y^{(i)}$ in the definition of the Euclidean tree (see Section 2) as follows. If $i$ is even,

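$$\mathcal{X}^{(i)}=X_{1}\sqcup\cdots\sqcup X_{r_{i}},\qquad\mathcal{Y}^{(i)}=Y_{1}\sqcup\cdots\sqcup Y_{r_{i+1}},$$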

and if $i$ is odd, then

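$$\mathcal{X}^{(i)}=X_{1}\sqcup\cdots\sqcup X_{r_{i+1}},\qquad\mathcal{Y}^{(i)}=Y_{1}\sqcup\cdots\sqcup Y_{r_{i}}.$$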

We also assume that $\mathcal{X}^{(0)}=\mathcal{Y}^{(0)}=\emptyset$ .

We induct on $m$ . At stage $i$ , we apply Lemma 4.1 to appropriately defined sets $U_{i}$ and $V_{i}$ to obtain sets $A_{i}\subset U_{i}$ and $B_{i}\subset V_{i}$ such that $G(U_{i}\setminus A_{i},V_{i}\setminus B_{i})$ is spanned by an $X$ $q_{i}$ -thrill or a $Y$ $q_{i}$ -thrill (depending on whether $i$ is even or odd respectively). In fact, it will turn out that $U_{i}$ and $V_{i}$ are large subsets of $\mathcal{X}^{(i)}$ and $\mathcal{Y}^{(i)}\setminus\mathcal{Y}^{(i-1)}$ respectively, when $i$ is even (and something analogous when $i$ is odd). We denote the set of deleted vertices from $X$ and $Y$ at the end of stage $i$ by $D^{X}_{i}$ and $D^{Y}_{i}$ respectively, and these are obtained by modifying $A_{i}$ and $B_{i}$ suitably, with the help of $D^{X}_{i-1}$ and $D^{Y}_{i-1}$ . We then show that $G_{i}=G(\mathcal{X}^{(i)}\setminus D^{X}_{i},\mathcal{Y}^{(i)}\setminus D^{Y}_{i})$ admits a $T_{i}$ -factor, where $T_{i}=T_{r_{i},r_{(i+1)}}$ as was defined in Section 2. By controlling the sizes of $D^{X}_{i}$ and $D^{Y}_{i}$ (which we denote by $d^{X}_{i}$ and $d^{Y}_{i}$ respectively) the Lemma follows by plugging in $i=m$ because $r_{m}=\ell$ and $r_{m+1}=L$ .

Let us get to the details now. For starters, we apply Lemma 4.1 to the “first” $r_{1}$ right blocks (recall that $r_{1}=1$ ) and the “first” $r_{2}$ left blocks. More precisely, we apply Lemma 4.1 to $U_{1}=\mathcal{X}^{(1)}=X_{1}\sqcup\cdots\sqcup X_{r_{2}}$ and $V_{1}=\mathcal{Y}^{(1)}=Y_{r_{1}}=Y_{1}$ so that $|U_{1}|=t\cdot r_{2}=t\cdot q_{1}r_{1}=q_{1}|V_{1}|$ . We obtain sets $A_{1}\subset U_{1}$ and $B_{1}\subset V_{1}$ such that $G(U_{1}\setminus A_{1},V_{1}\setminus B_{1})$ is spanned by a $Y$ $q_{1}$ -thrill. This terminates stage $1$ with $D^{X}_{1}:=A_{1}$ and $D^{Y}_{1}:=B_{1}$ ; consequently, by Lemma 4.1 $d^{X}_{1}\leq q_{1}d_{0}$ and $d^{Y}_{1}\leq d_{0}$ . This establishes the following:

$G_{1}=G(\mathcal{X}^{(1)}\setminus D^{X}_{1},\mathcal{Y}^{(1)}\setminus D^{Y}_{1})$ admits a $T_{1}$ -factor, with $d^{X}_{1}\leq q_{1}d_{0}$ and $d^{Y}_{1}\leq d_{0}$ .

Suppose now that for some $1<i\leq m$ , $G_{i-1}=G(\mathcal{X}^{(i-1)}\setminus D^{X}_{i-1},\mathcal{Y}^{(i-1)}\setminus D^{Y}_{i-1})$ admits a $T_{i-1}$ -factor, and

(1) if $i$ is even, then $d^{X}_{i-1}\leq(i-1)\cdot r_{i}d_{0}$ and $d^{Y}_{i-1}\leq(i-1)\cdot r_{i-1}d_{0}$ ;

(2) if $i$ is odd, then $d^{X}_{i-1}\leq(i-1)\cdot r_{i-1}d_{0}$ and $d^{Y}_{i-1}\leq(i-1)\cdot r_{i}d_{0}$ .

We shall show that there exist subsets $D^{X}_{i}\subset X$ and $D^{Y}_{i}\subset Y$ such that $G_{i}$ admits a $T_{i}$ -factor, and furthermore,

(a) if $i$ is even, then $|D^{X}_{i}|=d^{X}_{i}\leq ir_{i}d_{0}$ and $|D^{Y}_{i}|=d^{Y}_{i}\leq ir_{i+1}d_{0}$ ,

(b) if $i$ is odd, then $|D^{X}_{i}|=d^{X}_{i}\leq ir_{i+1}d_{0}$ and $|D^{Y}_{i}|=d^{Y}_{i}\leq ir_{i}d_{0}$ ,

which would establish the induction step.

Suppose $i$ is even. Let $S^{Y}_{i}$ be an arbitrary subset of $Y_{r_{(i-1)}+1}\sqcup\cdots\sqcup Y_{r_{(i+1)}}$ of size $q_{i}\cdot d^{X}_{i-1}$ . Define

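$$U_{i}=\mathcal{X}^{(i-1)}\setminus D^{X}_{i-1},\qquad V_{i}=\left(Y_{r_{(i-1)}+1}\sqcup\cdots\sqcup Y_{r_{(i+1)}}\right)\setminus S^{Y}_{i}.$$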

Since $r_{i+1}-r_{i-1}=q_{i}r_{i}$ we have $|V_{i}|=t(r_{i+1}-r_{i-1})-q_{i}d^{X}_{i-1}=q_{i}|U_{i}|$ , so by Lemma 4.1, we obtain sets $A_{i}\subset U_{i}$ and $B_{i}\subset V_{i}$ with $|A_{i}|\leq d_{0}/q_{i}$ and $|B_{i}|\leq d_{0}$ such that $G(U_{i}\setminus A_{i},V_{i}\setminus B_{i})$ is spanned by an $X$ $q_{i}$ -thrill.

By assumption, $G_{i-1}$ admits a $T_{i-1}$ -factor i.e., $G_{i-1}$ is spanned by vertex-disjoint copies of $T_{i-1}$ . Define $\mathrm{CORRUPT}^{X}_{i}$ to be the set of all those vertices in $\mathcal{X}^{(i-1)}\setminus D^{X}_{i-1}$ which belong to one of the above copies of $T_{i-1}$ that also contains at least one vertex from $A_{i}$ . Obviously, $A_{i}\subseteq\mathrm{CORRUPT}^{X}_{i}$ . Similarly, we define $\mathrm{CORRUPT}^{Y}_{i}$ as the set of vertices in $\mathcal{Y}^{(i-1)}\setminus D^{Y}_{i-1}$ which belong to a copy of $T_{i-1}$ that contains at least one vertex from $A_{i}$ . We refer to such copies of $T_{i-1}$ in $G_{i-1}$ (that contain at least one vertex from $A_{i}$ ) as corrupt copies. Define

$$\mathrm{CORRUPT}_{i}=\mathrm{CORRUPT}^{X}_{i}\cup\mathrm{CORRUPT}^{Y}_{i}$$

as the set of those vertices of $G_{i-1}$ that get “corrupted” due to the introduction of further deletions during stage $i$ (i.e. the set $A_{i}$ ). In other words, $\mathrm{CORRUPT}_{i}$ is the set of vertices touched by the corrupt copies. See Figure 3 for an illustration of the induction step.

Define

[TABLE]

and set $d^{X}_{i}\coloneqq|D^{X}_{i}|,d^{Y}_{i}\coloneqq|D^{Y}_{i}|$ . Note that every corrupt copy of $T_{i-1}$ in $G_{i-1}$ has $r_{i}$ vertices in $X$ and $r_{i-1}$ vertices in $Y$ . Therefore, we have the bounds

[TABLE]

Putting things together, we obtain the recurrences

[TABLE]

By the induction hypothesis we have $d^{X}_{i-1}\leq(i-1)\cdot r_{i}d_{0}$ and $d^{Y}_{i-1}\leq(i-1)\cdot r_{i-1}d_{0}$ . Therefore,

[TABLE]

and

[TABLE]

where in the final step, we use $r_{i+1}-r_{i-1}=q_{i}r_{i}$ and the fact that $1+r_{i-1}\leq r_{i}<r_{i+1}$ .

We now prove that $G_{i}$ admits a $T_{i}$ -factor. Recall from the preliminaries that if $T_{i-1}=T_{r_{i},r_{(i-1)}}$ is the Euclidean tree with left vertices $x_{1},\ldots,x_{r_{i}}$ and right vertices $y_{1},\ldots,y_{r_{(i-1)}}$ , then $T_{i}=T_{r_{i},r_{(i+1)}}$ is constructed on left vertices $x_{1},\ldots,x_{r_{i}}$ and right vertices $y_{1},\ldots,y_{r_{(i+1)}}$ , by adding to $T_{i-1}$ an $X$ $q_{i}$ -thrill of size $r_{i}$ between $x_{1},\ldots,x_{r_{i}}$ and $y_{r_{(i-1)}+1},\ldots,y_{r_{(i+1)}}$ . By Lemma 4.1, $G(\mathcal{X}^{(i)}\setminus D^{X}_{i},(\mathcal{Y}^{(i)}\setminus\mathcal{Y}^{(i-1)})\setminus D^{Y}_{i})$ is spanned by an $X$ $q_{i}$ -thrill. This, along with the copies of $T_{i-1}$ that span $G_{i-1}$ , gives us the desired $T_{i}$ -factoring of $G_{i}$ .

The proof of the inductive step when $i$ is odd i.e., $(2)\Rightarrow(b)$ is completely analogous ( $X$ swapped with $Y$ everywhere). The only small difference that arises is in the recurrences for $d^{X}_{i}$ and $d^{Y}_{i}$ because of the slightly different bounds for $|A_{i}|$ and $|B_{i}|$ given by Lemma 4.1 in this case. In particular, by following the same line of argument as in the proof of $(1)\Rightarrow(a)$ , we obtain, in this case

[TABLE]

But then, by using the trivial bound $q_{i}+r_{i-1}\leq r_{i+1}$ , we obtain the desired estimates $d^{X}_{i}\leq i\cdot r_{i+1}d_{0}$ and $d^{Y}_{i}\leq i\cdot r_{i}d_{0}$ .
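The growth of the deleted sets can be sanity-checked numerically. The sketch below iterates one reading of the recurrences, in units of $d_{0}$ , and asserts the claimed bounds (a) and (b) at every stage. The displayed recurrences are not reproduced here, so the increments are assumptions: for even $i$ they use the bounds stated in the text ( $|A_{i}|\leq d_{0}/q_{i}$ , $|B_{i}|\leq d_{0}$ , $|S^{Y}_{i}|=q_{i}d^{X}_{i-1}$ , and $r_{i}$ left and $r_{i-1}$ right vertices per corrupt copy); for odd $i$ they use the mirror image with $|A_{i}|\leq q_{i}d_{0}$ and $|B_{i}|\leq d_{0}$ , guessed from the stage $1$ bounds. The name `worst_case_deletions` is hypothetical.

```python
from fractions import Fraction


def worst_case_deletions(quotients):
    """Iterate assumed worst-case recurrences for (d^X_i, d^Y_i), in units of d_0.

    `quotients` is [q_1, ..., q_m]; returns a list of (i, dX_i, dY_i).
    """
    q = [None] + list(quotients)
    m = len(quotients)
    r = [0, 1]
    for i in range(1, m + 1):
        r.append(q[i] * r[i] + r[i - 1])

    dX, dY = Fraction(q[1]), Fraction(1)   # stage 1: d^X_1 <= q_1 d_0, d^Y_1 <= d_0
    stages = [(1, dX, dY)]
    for i in range(2, m + 1):
        if i % 2 == 0:
            a, b = Fraction(1, q[i]), Fraction(1)   # |A_i| <= d_0/q_i, |B_i| <= d_0
            # X side: old deletions + corrupt X-vertices.
            # Y side: old deletions + corrupt Y-vertices + S^Y_i + B_i.
            dX, dY = dX + r[i] * a, dY + r[i - 1] * a + q[i] * dX + b
            assert dX <= i * r[i] and dY <= i * r[i + 1]
        else:
            a, b = Fraction(q[i]), Fraction(1)      # assumed mirror-image bounds
            dX, dY = dX + r[i - 1] * b + q[i] * dY + a, dY + r[i] * b
            assert dX <= i * r[i + 1] and dY <= i * r[i]
        stages.append((i, dX, dY))
    return stages


print(worst_case_deletions([3, 2])[-1])   # (2, Fraction(9, 2), Fraction(17, 2))
```

For the quotients $(q_{1},q_{2})=(3,2)$ , that is $(\ell,L)=(3,7)$ , the final values $\frac{9}{2}d_{0}$ and $\frac{17}{2}d_{0}$ sit below the bounds $2\cdot 3\,d_{0}$ and $2\cdot 7\,d_{0}$ from (a).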

Thus, we have shown that there exist subsets $D_{X}=D^{X}_{m}\subset X,$ $D_{Y}=D^{Y}_{m}\subset Y$ such that $G(X\setminus D_{X},Y\setminus D_{Y})$ admits a $T_{\ell,L}$ -factor and consequently has NMP. Furthermore we have

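$$|D_{X}|=d^{X}_{m}\leq m\,r_{m}d_{0}=\ell md_{0}\qquad\text{and}\qquad|D_{Y}|=d^{Y}_{m}\leq m\,r_{m+1}d_{0}=Lmd_{0}.$$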

∎

We are now in a position to prove Theorem 1.4 part (b).

Proof of Theorem 1.4 part (b).

Suppose $G$ is a Thomason pseudorandom bipartite graph with parameters $(p,\varepsilon)$ and with vertex classes $X$ and $Y$ of sizes $k$ and $n$ respectively with $\frac{n}{k}\leq\frac{1}{\sqrt{\varepsilon}}$ .

Set $\alpha\coloneqq\sqrt[4]{\varepsilon^{3}}$ and $\eta\coloneqq\sqrt[4]{\varepsilon}$ and consider the interval $I=[n(1-\alpha),n]$ . Since its length is $\alpha n$ , there is an integer $N\in I$ such that $N$ is a multiple of $\lfloor\alpha n\rfloor$ . Also, since $\eta k\geq\alpha n$ (which follows from $\frac{n}{k}\leq\frac{1}{\sqrt{\varepsilon}}$ ), there is an integer $K$ in the interval $J=[k(1-2\eta),k(1-\eta)]$ such that $K$ is a multiple of $\lfloor\alpha n\rfloor$ . With $K$ and $N$ as defined above (note that $K\leq N$ ), simply pick subsets $C_{X}\subset X$ of size $k-K$ and $C_{Y}\subset Y$ of size $n-N$ arbitrarily and define a new graph $G^{\prime}=G(X\setminus C_{X},Y\setminus C_{Y})$ . Observe that if $L/\ell$ is the representation in reduced form of $N/K$ , then $L\leq\frac{1}{\sqrt[4]{\varepsilon^{3}}}=O_{\varepsilon}(1)$ . Applying Lemma 4.2 to $G^{\prime}$ (see Remark $3$ after Lemma 4.1), we obtain subsets $D_{X}\subset X\setminus C_{X}$ and $D_{Y}\subset Y\setminus C_{Y}$ such that $G(X\setminus\mathrm{Del}_{X},Y\setminus\mathrm{Del}_{Y})$ has NMP, where $\mathrm{Del}_{X}=C_{X}\cup D_{X}$ and $\mathrm{Del}_{Y}=C_{Y}\cup D_{Y}$ . By Fact 2.4 and the trivial bounds $K\leq n$ and $N\leq n$ , we have

[TABLE]

and similarly,

[TABLE]

and that completes the proof. ∎
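The choice of $K$ and $N$ at the start of the proof is elementary arithmetic and can be sketched as follows. The function name `trim_sizes` is hypothetical, and $\alpha,\eta$ are passed as exact fractions (rather than computed from $\varepsilon$ ) only to avoid floating-point rounding in the floor; the sketch takes the largest admissible multiples, whereas the proof only needs some multiple in each interval.

```python
from fractions import Fraction
from math import floor, gcd


def trim_sizes(k, n, alpha, eta):
    """Pick K and N as multiples of floor(alpha*n) in the intervals used in the proof.

    The proof takes alpha = eps^(3/4) and eta = eps^(1/4).
    Returns (K, N, ell, L), where L/ell is the reduced form of N/K.
    """
    step = floor(alpha * n)
    if step < 1:
        raise ValueError("alpha * n must be at least 1")
    N = (n // step) * step                      # largest multiple of step that is <= n
    K = (floor(k * (1 - eta)) // step) * step   # largest multiple of step <= k(1 - eta)
    if K < 1 or N < n * (1 - alpha) or K < k * (1 - 2 * eta):
        raise ValueError("no suitable multiple in the prescribed interval")
    g = gcd(K, N)
    return K, N, K // g, N // g


# eps = 1/256 gives alpha = 1/64 and eta = 1/4; here n/k = 4 <= 1/sqrt(eps) = 16.
print(trim_sizes(10_000, 40_000, Fraction(1, 64), Fraction(1, 4)))
# (7500, 40000, 3, 16)
```

In this example $L=16\leq 1/\alpha=64$ , in line with the observation that $L=O_{\varepsilon}(1)$ .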

5 Concluding Remarks

•

The main engine in the proof of Theorem 1.4 is Lemma 4.1, which is where the pseudorandomness is used in an explicit form. The rest of the proof of the theorem, including the inductive argument, uses this lemma as a black box. Hence, if we had an analogue of Lemma 4.1 for another model of pseudorandomness - call it Lemma 4.1* (say) - then the rest of the proof of Theorem 1.4 would go through, with the error estimates being dictated by Lemma 4.1* instead. The proof of Lemma 4.1 uses the notion of Thomason pseudorandomness explicitly only when we invoke Theorem 1.3, which is basically a statement that bounds the difference between $e(A,B)$ and $p|A||B|$ , the expected number of edges if the graph were random. For $(n,d,\lambda)$ graphs, the analogue of this theorem is the expander-mixing lemma, which provides precisely such an estimate.

We illustrate this by returning to problem 2 that was stated in the introduction. For $\varepsilon>0$ , and $q$ a sufficiently large prime power, let $H$ be a multiplicative subgroup of $\mathbb{F}_{q}^{*}$ of order at least $q^{1/2+\varepsilon}$ . Consider the Sum-Cayley graph $\Gamma_{q}(H)$ whose vertex set is $\mathbb{F}_{q}$ and in which vertices $x,y$ are adjacent if and only if $x+y\in H$ . A result of Alon and Bourgain (see [1]) states that $\Gamma_{q}(H)$ is a $(q,|H|,q^{1/2})$ graph, i.e., it is a regular graph on $q$ vertices, with degree $|H|$ , and every non-trivial eigenvalue of $\Gamma_{q}(H)$ is at most $q^{1/2}$ in absolute value. If $G$ is the bipartite graph described in the introduction following the description of problem 2, then it is not difficult to show, using the expander-mixing lemma, that for any $A\subset X,B\subset Y$ we have $|e(A,B)-\frac{|A||B||H|}{q}|<\sqrt{q|A||B|}$ . Then, via the argument in the proof of Lemma 4.1 we have: If $X,Y\subset\mathbb{F}_{q}$ with $|Y|=10|X|$ , $|X|\geq q/100$ , and $H$ is a subgroup of $\mathbb{F}_{q}^{*}$ of size at least $q^{1/2+\varepsilon}$ , then there exist $A\subset X,B\subset Y$ with $|A|\leq O(q^{1-\varepsilon})$ , and $|B|=10|A|$ such that $G(X\setminus A,Y\setminus B)$ has NMP. Consequently, every element of $Y\setminus B$ can be labeled by some element of $X\setminus A$ such that each label appears $10$ times, and further, for each $y\in Y\setminus B$ labeled $x$ , the sum $x+y\in H$ . This answers, in the affirmative, the approximate version of problem 2. One could pose more general questions of the same kind, but without the additional constraint that $|Y|$ is a multiple of $|X|$ . For instance, suppose $X,Y\subset\mathbb{F}_{q}$ and $|Y|=\frac{3}{2}|X|$ (say), with $|X|\geq\Omega(q)$ , and let $H$ be a subgroup of $\mathbb{F}_{q}^{*}$ of size at least $q^{1/2+\varepsilon}$ . Then one can similarly show that there exist subsets $\mathrm{Del}_{X}\subset X,\mathrm{Del}_{Y}\subset Y$ with $|\mathrm{Del}_{X}|\leq f(\varepsilon)|X|,|\mathrm{Del}_{Y}|\leq g(\varepsilon)|Y|$ such that if $X^{\prime},Y^{\prime}$ are the remaining sets, then one may form a star-array $\mathcal{A}$ of dimension $|X^{\prime}|\times|Y^{\prime}|$ whose rows and columns are labeled by the elements of $X^{\prime},Y^{\prime}$ respectively with the property that if the $(x,y)^{th}$ element of $\mathcal{A}$ is a star, then $x+y\in H$ . Furthermore, each row of $\mathcal{A}$ has precisely $3$ stars, and each column has precisely $2$ stars.

•

For a bipartite graph $G(X,Y)$ with $|X|=|Y|$ that admits a perfect matching, the Max-Min Greedy Matching problem that was introduced in [8] goes as follows. Given permutations $\sigma,\pi$ of the vertices of $X$ and $Y$ respectively, the vertices of $X$ are processed according to $\sigma$ , and each $x\in X$ is matched to its earliest available neighbor in $Y$ according to $\pi$ . If $M_{G}[\sigma,\pi]$ denotes the resulting greedy matching, determine $\rho[G]\coloneqq\frac{\max_{\pi}\min_{\sigma}|M_{G}[\sigma,\pi]|}{|X|}$ . This problem admits a natural generalization. Suppose $G(X,Y)$ is a bipartite graph, with $|X|=k,|Y|=n$ , with $k\leq n$ , and suppose $r=\lfloor n/k\rfloor$ . As before, let $\sigma,\pi$ be permutations of the vertices of $X$ and $Y$ respectively. We process the vertices of $X$ according to $\sigma$ and for each $x\in X$ , we choose its first $r$ neighbors in $Y$ , according to $\pi$ , that have not already been chosen by some previous vertex of $X$ . Let $m^{(r)}_{G}[\sigma,\pi]$ denote the number of vertices of $X$ for which one can choose $r$ such neighbors. Then determine $\rho_{r}[G]\coloneqq\frac{\max_{\pi}\min_{\sigma}m^{(r)}_{G}[\sigma,\pi]}{|X|}$ . Our proof of Lemma 4.1 can easily be adapted to establish the following: Suppose $\varepsilon>0$ , and let $\omega$ be a function such that $\omega(k)\to\infty$ as $k\to\infty$ . Then there exists $k_{0}=k_{0}(\varepsilon)$ such that whenever $n\geq k>k_{0}$ and $G(X,Y)$ is a $(p,\varepsilon)$ -Thomason pseudorandom bipartite graph with $|X|=k,|Y|=n$ , and $p\geq\frac{\omega(k)}{k}$ , then $\rho_{r}[G]\geq 1-O(\varepsilon)$ .

•

Our proof of Theorem 1.2 on closer examination reveals that $\mathbb{G}(k,n,p)$ does not have NMP whp for $p=\frac{\log n-\omega(n)}{k}$ for any arbitrary function $\omega$ that goes to infinity. However, to prove the existence of NMP with high probability, our proof cannot extend beyond $p=\frac{\log n+O(\sqrt{\log n})}{n}$ . While it is possible to improve (using our methods) our result to prove that $\mathbb{G}(k,n,p)$ has NMP whp for $p=\frac{\log n+f(n)}{k}$ for some $f=o(\log n)$ , the question of whether there is a sharp threshold for NMP of the form $p=\frac{\log n+\omega(n)}{k}$ remains open.

•

As remarked in the Introduction, our proof of Theorem 1.4 shows that $f(x)=g(x)=O(x^{1/4}\log(1/x))$ works uniformly for all pairs $(k,n)$ . Is it possible to improve this to $f(x)=g(x)=O(x)$ uniformly over all $(k,n)$ ?

•

We make a final remark pertaining to a remark following the statement of Theorem 1.4 in the Introduction. As we noted, the definition of Thomason pseudorandomness does not preclude the existence of isolated vertices unless a more symmetric definition of pseudorandomness is adopted. In that case, it would be interesting to see if one can arrive at a stronger conclusion than the statement of Theorem 1.4.
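As a small numerical illustration of the discrepancy estimate for the Sum-Cayley graph $\Gamma_{q}(H)$ discussed in the first remark above, the sketch below builds $H$ for a prime $q$ and compares $e(A,B)$ with $\frac{|A||B||H|}{q}$ . Restricting to prime $q$ (so that arithmetic modulo $q$ is field arithmetic), the choice $q=101$ , and the names `subgroup` and `edges_between` are assumptions made for illustration; such a small $q$ is not in the "sufficiently large" regime of the remark.

```python
from math import sqrt


def subgroup(q, order):
    """The unique subgroup of F_q^* of the given order (q prime, order | q - 1)."""
    if (q - 1) % order:
        raise ValueError("order must divide q - 1")
    # F_q^* is cyclic, so the image of x -> x^((q-1)/order) has exactly `order` elements.
    return {pow(x, (q - 1) // order, q) for x in range(1, q)}


def edges_between(A, B, H, q):
    """e(A, B): number of pairs (a, b) in A x B with a + b in H (mod q)."""
    return sum((a + b) % q in H for a in A for b in B)


q = 101                       # a small prime, for illustration only
H = subgroup(q, 50)           # the nonzero squares modulo 101
A, B = range(0, 20), range(30, 80)
e = edges_between(A, B, H, q)
discrepancy = abs(e - len(A) * len(B) * len(H) / q)
print(e, discrepancy, sqrt(q * len(A) * len(B)))
```

The printed discrepancy can be compared directly with the bound $\sqrt{q|A||B|}$ from the remark.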
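The generalized greedy process from the remark on Max-Min Greedy Matching above can be sketched as follows. The function computes $m^{(r)}_{G}[\sigma,\pi]$ for given orders; the name `greedy_r_matching` and the dictionary representation of $G$ are hypothetical, and the sketch assumes that a vertex which cannot find $r$ free neighbors claims none of them, a detail the description leaves open.

```python
def greedy_r_matching(adj, sigma, pi, r):
    """Count the vertices of X that obtain r neighbours in the greedy process.

    adj[x] is the set of neighbours of x in Y; sigma orders X and pi orders Y.
    Assumption (ours): a vertex with fewer than r free neighbours claims none.
    """
    rank = {y: pos for pos, y in enumerate(pi)}
    taken = set()
    satisfied = 0
    for x in sigma:
        # Free neighbours of x, earliest first according to pi.
        free = sorted((y for y in adj[x] if y not in taken), key=rank.__getitem__)
        if len(free) >= r:
            taken.update(free[:r])
            satisfied += 1
    return satisfied


adj = {"x1": {"a", "b", "c"}, "x2": {"b", "c", "d"}}
print(greedy_r_matching(adj, ["x1", "x2"], ["a", "b", "c", "d"], r=2))  # 2
print(greedy_r_matching(adj, ["x1", "x2"], ["b", "c", "a", "d"], r=2))  # 1
```

The two calls differ only in $\pi$ , which shows why the outer maximum over $\pi$ in the definition of $\rho_{r}[G]$ matters: a poor order on $Y$ lets the first vertex take both neighbors that the second one needs.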

Appendix: Robustness of Thomason pseudorandomness

Lemma.

Let $0<\varepsilon<\frac{1}{2}$ , and $k\leq n$ be positive integers. Suppose $G(X,Y)$ is a Thomason pseudorandom bipartite graph with parameters $(p_{0},\varepsilon_{0})$ with $|X|=k,|Y|=n$ , and suppose $p_{0}\geq\frac{1}{\sqrt{k}}$ . Then, for a given integer $D$ satisfying $\frac{\alpha}{2}n\leq D\leq\alpha n$ for $\alpha=\varepsilon^{3}$ , there exist subsets $C_{X}\subseteq X$ and $C_{Y}\subseteq Y$ such that

•

$|C_{Y}|=D$ and $|C_{X}|\leq\eta k$ , where $\eta=2\exp(-\frac{C}{\varepsilon})$ for some fixed constant $C$ ,

•

the subgraph induced by the sets $X\setminus C_{X}$ and $Y\setminus C_{Y}$ is Thomason pseudorandom with parameters $(p_{1},\varepsilon_{1})$ where $p_{1}=p_{0}(1-\varepsilon)$ and $\varepsilon_{1}\leq 5(\varepsilon_{0}+3\varepsilon)$ .

Proof.

Let $\eta=2\exp(-\frac{C}{\varepsilon})$ where $C$ shall be specified later. Let $T\subseteq Y$ be a uniformly random subset of $Y$ of size $D$ . Then by the tail bound of the hypergeometric distribution (see [19]) we have, for every $t\geq 0$ ,

[TABLE]

for every vertex $u\in X$ . Now, fix $t=\varepsilon p_{0}(\frac{n}{D}-1)$ . Call a vertex $u\in X$ bad with respect to $T$ if

[TABLE]

Then by equation 16, the expected number of bad vertices is at most $2ke^{-2t^{2}D}$ . Fix a set $C_{Y}\subseteq Y$ of size $D$ for which the set of bad vertices (which we shall call $C_{X}$ ) has size at most $2ke^{-2t^{2}D}$ .

Now, for a vertex $x\in X$ , let $N^{\prime}(x)=N(x)\cap(Y\setminus C_{Y})$ . Then for $x\in X\setminus C_{X}$ , as $x$ is not a bad vertex,

[TABLE]

where the inequality follows from the hypothesis (see Definition 1.1) that $G$ is Thomason pseudorandom. Also note that for any distinct vertices $u,v\in X\setminus C_{X}$ ,

[TABLE]

which follows since

[TABLE]

where the last inequality follows from the fact that $n-D\geq n(1-\alpha)$ . The required codegree bound then follows from the given condition on $\varepsilon_{1}$ .

It remains only to check that $2e^{-2t^{2}D}\leq\eta$ . To see this, observe that $t=\varepsilon p_{0}(\frac{n}{D}-1)\geq\varepsilon p_{0}(\frac{1}{\alpha}-1)$ and also note that $\varepsilon<\frac{1}{2}\Rightarrow 1-\alpha>\frac{7}{8}$ . Thus,

[TABLE]

where we may take the constant $C=\frac{49}{64}=0.765625$ in the definition of $\eta$ . ∎
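The final inequality can be checked numerically for concrete parameters. The sketch below evaluates $2e^{-2t^{2}D}$ and $\eta=2\exp(-C/\varepsilon)$ with $C=\frac{49}{64}$ under the hypotheses of the lemma; the function name `tail_vs_eta` and the sample parameters are hypothetical, chosen as powers of two so that the hypothesis checks are exact in floating point.

```python
from math import exp


def tail_vs_eta(eps, k, n, p0, D, C=49 / 64):
    """Return (2*exp(-2 t^2 D), eta) for t = eps*p0*(n/D - 1), eta = 2*exp(-C/eps)."""
    alpha = eps ** 3
    if not (0 < eps < 0.5 and k <= n and p0 * p0 * k >= 1):
        raise ValueError("hypotheses of the lemma are not met")
    if not alpha * n / 2 <= D <= alpha * n:
        raise ValueError("D must lie in [alpha*n/2, alpha*n]")
    t = eps * p0 * (n / D - 1)
    return 2 * exp(-2 * t * t * D), 2 * exp(-C / eps)


# eps = 1/4, so alpha = 1/64; k = n = 4096, p0 = 1/sqrt(k) = 1/64, D = alpha*n = 64.
tail, eta = tail_vs_eta(eps=0.25, k=4096, n=4096, p0=1 / 64, D=64)
print(tail <= eta)  # True
```

Here $2t^{2}D\approx 7.75$ while $C/\varepsilon=3.0625$ , so the bound holds with room to spare.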

One interesting consequence of the proof of the lemma is that if we seek $\eta=\textrm{poly}(\varepsilon)$ then one has a randomized algorithm to choose a set $T\subset Y$ and a related $BAD(T)\subset X$ with $|T|=D,|BAD(T)|\leq\eta k$ such that deleting these sets from $Y,X$ respectively results in another Thomason pseudorandom graph with only slightly worse parameters.

It is known (see [21]) that bipartite graphs arising from the point-hyperplane incidence structure of a projective geometry of dimension $d$ over a finite field $\mathbb{F}_{q}$ are Thomason pseudorandom with parameters $p=n^{-1/2}(1+o(1))$ and $\varepsilon=0$ . More generally, one can take the point-block incidence structure arising from a symmetric block design as the “seed” Thomason pseudorandom graph, which upon the application of the lemma above gives us several other examples of Thomason pseudorandom graphs with parameters that are relevant in Theorem 1.4.

Bibliography

[1] N. Alon, J. Bourgain, Additive Patterns in Multiplicative Subgroups, Geom. Funct. Anal., 24 (2014), No. 3, 721-739.
[2] N. Alon, J. Spencer, The Probabilistic Method, 4th ed., Wiley, 2016.
[3] I. Anderson, Some problems in combinatorial number theory, PhD thesis, University of Nottingham, United Kingdom (1967).
[4] I. Anderson, Combinatorics of Finite Sets, Dover Publications, Mineola, NY (2002).
[5] B. Bollobás, Random Graphs, 2nd ed., Cambridge University Press (2001).
[6] B. Bollobás, A. Thomason, Threshold functions, Combinatorica, 7 (1986), 35-38.
[7] K. Engel, Sperner theory, in Encyclopedia of Mathematics, Cambridge University Press (1997).
[8] A. Eden, U. Feige, M. Feldman, Max-min greedy matching, NetEcon '19, Proc. 14th Workshop on the Economics of Networks, Systems and Computation, Article 10 (2019).
