Some new results in random matrices over finite fields

Kyle Luh; Sean Meehan; Hoi H. Nguyen

arXiv:1907.02575·math.CO·December 9, 2020

TL;DR

This paper investigates the distribution properties of random matrices over finite fields, providing new characterizations of random walks and analyzing eigenvalue and polynomial divisibility probabilities, revealing universal behaviors.

Contribution

It introduces novel characterizations of random walks with large discrepancy and extends universality results for matrix eigenvalues and polynomial divisibility over finite fields.

Findings

01

Distribution of ranks of random matrices over F_p analyzed

02

Probability of eigenvalue-free matrices characterized

03

Divisibility of characteristic polynomials by irreducible polynomials studied

Abstract

In this note we give various characterizations of random walks with possibly different steps that have relatively large discrepancy from the uniform distribution modulo a prime p, and use these results to study the distribution of the rank of random matrices over F_p and the equi-distribution behavior of normal vectors of random hyperplanes. We also study the probability that a random square matrix is eigenvalue-free, or when its characteristic polynomial is divisible by a given irreducible polynomial in the limit n to infinity in F_p. We show that these statistics are universal, extending results of Stong and Neumann-Praeger beyond the uniform model.

Equations

\mathbf{P}(M\in GL(n,\mathbb{F}_{q}))=\frac{(q^{n}-1)(q^{n}-q)\cdots(q^{n}-q^{n-1})}{q^{n^{2}}}=\prod_{i=1}^{n}(1-q^{-i}).

\mathbf{P}(M\text{ has rank }n-k)=\frac{1}{q^{k^{2}}}\frac{\prod_{i=1}^{n}(1-q^{-i})\prod_{i=k+1}^{n}(1-q^{-i})}{\prod_{i=1}^{k}(1-q^{-i})\prod_{i=1}^{n-k}(1-q^{-i})}.

\lim_{q\to\infty}\lim_{n\to\infty}\mathbf{P}(M\text{ is eigenvalue-free})=1/e.

\lim_{n\to\infty}\mathbf{P}(\lambda_{z-1}(M)=\lambda)=\prod_{r=1}^{\infty}\Big(1-\frac{1}{q^{r}}\Big)\frac{1}{q^{\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i}(1/q)_{m_{i}(\lambda)}},

\lim_{n\to\infty}\mathbf{P}(M(\mathbb{Z}_{p}^{n})/\mathbb{Z}_{p}^{n}\simeq B)=M_{p}(\lambda)=\frac{\prod_{k=1}^{\infty}(1-p^{-k})}{|\operatorname{Aut}(B)|}.

\lim_{n\to\infty}\mathbf{P}(\lambda_{\phi}(M)=\lambda)=M_{q^{\deg(\phi)}}(\lambda)

\mathbf{P}(\phi(x)\mid\det(M-x))=\mathbf{P}(\lambda_{\phi}(M)\neq\emptyset)\to 1-\prod_{i=1}^{\infty}(1-q^{-i\deg(\phi)}).

\lim_{n\to\infty}\mathbf{P}(\wedge_{i=1}^{k}\lambda_{\phi_{i}}=\lambda_{i})=\prod_{i=1}^{k}\lim_{n\to\infty}\mathbf{P}(\lambda_{\phi_{i}}=\lambda_{i}).

\lim_{n\to\infty}\mathbf{P}(M\text{ has rank }n-k)=\frac{1}{p^{k^{2}}}\frac{\prod_{i=k+1}^{\infty}(1-p^{-i})}{\prod_{i=1}^{k}(1-p^{-i})}.

\max_{a}\mathbf{P}(\xi=a)\leq 1-\alpha.

\mathbf{P}(\operatorname{rank}(M)=n-d)=\frac{1}{p^{d^{2}}}\frac{\prod_{i=d+1}^{\infty}(1-p^{-i})}{\prod_{i=1}^{d}(1-p^{-i})}+O(e^{-n^{c^{\prime}}}),

\sup_{a\in\mathbb{F}_{p}}\Big|\mathbf{P}\Big(\sum\xi_{i}w_{i}=a\Big)-1/p\Big|\leq\exp(-n^{c}),

\big|\mathbf{P}(w_{i}=0)-1/p\big|\leq O(\exp(-n^{c^{\prime}})).

\big|\mathbf{P}(\exists\,\mathbf{w}=(w_{1},\dots,w_{n})\in W_{n-1}^{\perp}:w_{i}=a\wedge w_{j}=1)-1/p\big|\leq O(\exp(-n^{c^{\prime}})).

\mathbf{P}\Big(\wedge_{a=0}^{p-1}\big(|n_{a}/n-1/p|\leq\delta/p\big)\Big)\geq 1-e^{-c\delta^{2}n/p}.

\Big{|}{\mathbf{P}}(\phi(x)|\det(M-x))-[1-\prod_{i=1}^{\infty}(1-p^{-i\deg(\phi)})]\Big{|}=O(\exp(-cn^{1-\varepsilon}/p^{2})),

\lim_{p\to\infty}\lim_{n\to\infty}\mathbf{P}(M\text{ is eigenvalue-free})=1/e.

\begin{pmatrix}R_{1}&0&\dots&0\\ 0&R_{2}&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&R_{k}\end{pmatrix}.

R=\begin{pmatrix}C(\phi_{i}^{\lambda_{1}})&0&\dots&0\\ 0&C(\phi_{i}^{\lambda_{2}})&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&C(\phi_{i}^{\lambda_{m}})\end{pmatrix},

C(\phi):=\begin{pmatrix}0&0&\dots&0&-a_{0}\\ 1&0&\dots&0&-a_{1}\\ 0&1&\dots&0&-a_{2}\\ \vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&\dots&1&-a_{d-1}\end{pmatrix}.

M=\begin{pmatrix}1&0&\dots&0\\ 1&1&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&1\end{pmatrix}

Z_{\operatorname{Mat}(n,q)}:=\frac{1}{|GL(n,q)|}\sum_{\alpha\in\operatorname{Mat}(n,q)}\prod_{\phi,|\lambda_{\phi}(\alpha)|>0}x_{\phi,\lambda_{\phi}(\alpha)}.

1+\sum_{n=1}^{\infty}Z_{\operatorname{Mat}(n,q)}u^{n}=\prod_{\phi}\Big[1+\sum_{n\geq 1}\sum_{\lambda\vdash n}x_{\phi,\lambda}\frac{u^{n\deg(\phi)}}{q^{\deg(\phi)\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q^{\deg(\phi)}})_{m_{i}(\lambda_{\phi})}}\Big].

\prod_{\phi}\Big[1+\sum_{n\geq 1}\sum_{\lambda\vdash n}\frac{u^{n\deg(\phi)}}{q^{\deg(\phi)\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q^{\deg(\phi)}})_{m_{i}(\lambda_{\phi})}}\Big]=(1-u)^{-1}.

1+\sum_{n=1}^{\infty}\frac{H_{n,q}}{|GL(n,q)|}u^{n}=\Big{(}1+\sum_{i=1}^{\infty}\sum_{\lambda\vdash i}\frac{u^{i}}{q^{\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q})_{m_{i}(\lambda_{\phi})}}\Big{)}^{1-q}(1-u)^{-1},

\frac{H_{n,q}}{|GL(n,q)|}\to\Big{(}1+\sum_{i=1}^{\infty}\frac{q^{i(i-1)}}{[q]_{i}}\Big{)}^{1-q}\text{ as }n\to\infty.

\Big{(}1+\sum_{i=1}^{\infty}\frac{q^{i(i-1)}}{[q]_{i}}\Big{)}^{1-q}\to 1/e\text{ as }q\to\infty,

Q=\{a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}\mid M_{i}\leq x_{i}\leq M_{i}^{\prime}\text{ and }x_{i}\in\mathbb{Z}\text{ for all }1\leq i\leq r\}

\Phi:(x_{1},\dots,x_{r})\mapsto a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}.

Q_{t}=\{a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}\mid tM_{i}\leq x_{i}\leq tM_{i}^{\prime}\text{ and }x_{i}\in\mathbb{Z}\text{ for all }1\leq i\leq r\}

Full text

Some new results in random matrices over finite fields

Kyle Luh

Center of Mathematical Sciences and Applications

Harvard University

20 Garden St.

Cambridge, MA 02138 USA

[email protected]

Sean Meehan

Department of Mathematics

The Ohio State University

231 W 18th Ave

Columbus, OH 43210 USA

[email protected]

and

Hoi H. Nguyen

Department of Mathematics

The Ohio State University

231 W 18th Ave

Columbus, OH 43210 USA

[email protected]

Abstract.

In this note we give various characterizations of random walks with possibly different steps that have relatively large discrepancy from the uniform distribution modulo a prime $p$ , and use these results to study the distribution of the rank of random matrices over ${\mathbb{F}}_{p}$ and the equi-distribution behavior of normal vectors of random hyperplanes. We also study the probability that a random square matrix is eigenvalue-free, or when its characteristic polynomial is divisible by a given irreducible polynomial in the limit $n\to\infty$ in ${\mathbb{F}}_{p}$ . We show that these statistics are universal, extending results of Stong and Neumann-Praeger beyond the uniform model.

2010 Mathematics Subject Classification:

15B52, 20G40

1. Introduction

Let $q$ be a prime power, and let $\operatorname{Mat}(n,q)$ (resp. $GL(n,q)$) be the set of all $n\times n$ matrices (resp. the group of all non-singular $n\times n$ matrices) with entries in the field of $q$ elements. Given a (class) function $f$ that depends on $M$ (such as the rank of $M$, or the factors of its characteristic polynomial), it is natural to study the behavior of $f$ for a “typical” matrix, such as one sampled uniformly at random; we call this the uniform model.

1.1. Some statistics for the uniform model

Our first example is on the rank of $M$ . By exposing the columns of $M$ one by one, it is not hard to show that the probability that $M$ belongs to $GL(n,q)$ is exactly

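$${\mathbf{P}}(M\in GL(n,{\mathbb{F}}_{q}))=\frac{(q^{n}-1)(q^{n}-q)\cdots(q^{n}-q^{n-1})}{q^{n^{2}}}=\prod_{i=1}^{n}(1-q^{-i}).$$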

More generally, for $0\leq k\leq n$ we can show that

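$${\mathbf{P}}(M\text{ has rank }n-k)=\frac{1}{q^{k^{2}}}\frac{\prod_{i=1}^{n}(1-q^{-i})\prod_{i=k+1}^{n}(1-q^{-i})}{\prod_{i=1}^{k}(1-q^{-i})\prod_{i=1}^{n-k}(1-q^{-i})}.$$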
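The rank formula for the uniform model can be checked by exhaustive enumeration for very small $n$ and prime $q=p$. The sketch below is only an illustration: the helper names (`rank_mod_p`, `rank_prob_formula`, `rank_prob_bruteforce`) are ours, and the brute force handles prime fields only, since it uses integer arithmetic modulo $p$.

```python
from fractions import Fraction
from itertools import product


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) of a matrix given as a list of rows."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a pivot in this column at or below the current rank.
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def rank_prob_formula(n, k, q):
    """P(rank = n - k) for a uniform n x n matrix over F_q, as displayed above."""
    def prod(lo, hi):
        out = Fraction(1)
        for i in range(lo, hi + 1):
            out *= 1 - Fraction(1, q ** i)
        return out
    return (Fraction(1, q ** (k * k)) * prod(1, n) * prod(k + 1, n)
            / (prod(1, k) * prod(1, n - k)))


def rank_prob_bruteforce(n, k, p):
    """Same probability, obtained by listing all p^(n^2) matrices over F_p."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        if rank_mod_p(rows, p) == n - k:
            count += 1
    return Fraction(count, p ** (n * n))
```

For example, `rank_prob_formula(2, 1, 2)` and `rank_prob_bruteforce(2, 1, 2)` both count the rank-one $2\times 2$ matrices over ${\mathbb{F}}_{2}$. The enumeration grows like $p^{n^{2}}$, so it is only practical for $n\leq 3$ or so.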

Our next example is the probability that $M$ has no eigenvalue in ${\mathbb{F}}_{q}$ (equivalently, $M$ has no one-dimensional invariant subspace). Beautiful results by Stong [36] and Neumann-Praeger [25] (see also [15]) showed that this probability tends to the derangement probability of a random permutation. We have

$$\lim_{q\to\infty}\lim_{n\to\infty}{\mathbf{P}}(M\text{ is eigenvalue-free})=1/e.\tag{2}$$

More generally, Stong showed that in the $q\to\infty$ limit, the probability that the characteristic polynomial of $M$ factors into $n_{i}$ irreducible factors of degree $i$ is the same as the probability that an element of $S_{n}$ factors into $n_{i}$ cycles of length $i$; Hansen and Schmutz [19] also obtained similar results for joint cycle structures.

In a different direction, random matrices over finite fields are also a source of random partitions. For instance, it follows from [16] (and the references therein) that for the uniform model

$$\lim_{n\to\infty}{\mathbf{P}}(\lambda_{z-1}(M)=\lambda)=\prod_{r=1}^{\infty}\Big(1-\frac{1}{q^{r}}\Big)\frac{1}{q^{\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i}(1/q)_{m_{i}(\lambda)}},\tag{3}$$

where $\lambda_{z-1}(M)$ is the partition corresponding to $z-1$ in the rational canonical form of a uniform matrix $M$ . We refer the reader to Section 2 for a precise definition of $\lambda_{\phi}(M)$ .

The measure above has been studied extensively in Number Theory in the context of the Cohen-Lenstra heuristics. Indeed, assume that $B$ is a $p$-group, $B={\mathbb{Z}}/p^{\lambda_{1}}{\mathbb{Z}}\times\dots\times{\mathbb{Z}}/p^{\lambda_{m}}{\mathbb{Z}}$. Friedman and Washington [14] showed that for a Haar random matrix $M$ with entries in ${\mathbb{Z}}_{p}$

$$\lim_{n\to\infty}{\mathbf{P}}(M({\mathbb{Z}}_{p}^{n})/{\mathbb{Z}}_{p}^{n}\simeq B)=M_{p}(\lambda)=\frac{\prod_{k=1}^{\infty}(1-p^{-k})}{|\operatorname{Aut}(B)|}.$$

We also refer the reader to [42, 43] and the references therein for related results.

In the spirit of (3), Fulman [15, 16] also showed for the uniform model that as $n\to\infty$ , for any fixed irreducible polynomial $\phi$ ,

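$$\lim_{n\to\infty}{\mathbf{P}}(\lambda_{\phi}(M)=\lambda)=M_{q^{\deg(\phi)}}(\lambda)$$

and in particular,

$${\mathbf{P}}(\phi(x)\mid\det(M-x))={\mathbf{P}}(\lambda_{\phi}(M)\neq\emptyset)\to 1-\prod_{i=1}^{\infty}(1-q^{-i\deg(\phi)}).\tag{4}$$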

Moreover, it was also shown that these statistics are asymptotically independent for different $\phi$ in the sense that for any different irreducible polynomials $\phi_{1},\dots,\phi_{k}$

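$$\lim_{n\to\infty}{\mathbf{P}}(\wedge_{i=1}^{k}\lambda_{\phi_{i}}=\lambda_{i})=\prod_{i=1}^{k}\lim_{n\to\infty}{\mathbf{P}}(\lambda_{\phi_{i}}=\lambda_{i}).$$

We refer the reader to Section 2 for a useful tool to deduce Equations (2), (3) and (4).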
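To get a feel for the size of the limiting divisibility probability $1-\prod_{i\geq 1}(1-q^{-i\deg(\phi)})$, one can simply truncate the infinite product. The following is a minimal numerical sketch; the function name `divisibility_limit` and the truncation parameter `terms` are our own choices, not notation from the text.

```python
def divisibility_limit(q, deg, terms=200):
    """Approximate 1 - prod_{i>=1} (1 - q^(-i*deg)) by truncating the product."""
    prod = 1.0
    for i in range(1, terms + 1):
        prod *= 1.0 - float(q) ** (-i * deg)
    return 1.0 - prod
```

For $q=2$ and $\deg(\phi)=1$ the value is roughly $0.711$, while for large $q$ it is close to $q^{-\deg(\phi)}$, the first-order term of the product.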

1.2. Our main results

Motivated by the universality phenomenon in Random Matrix Theory, we wonder if the above statistics also hold for other models of $M$. While there have been many results addressing universality of random matrices in characteristic zero (to study the spectral behavior of various models of random matrices), we have not seen much in the literature addressing universality behavior in the finite field setting. In fact, to the best of our knowledge, although there have been partial results such as [1, 2, 3, 6, 10, 12, 20, 21, 35], universality results for matrices over finite fields only appeared very recently in [23, 24, 27, 30, 42, 43]. For instance, regarding the rank distribution, a simple consequence of results of Maples [23, 24] (see also [27]) and of Wood [42] is the following.

Theorem 1.3.

Let $\alpha>0$ be fixed. Assume that $M=(m_{ij})$ is a random matrix where $m_{ij}$ are iid copies of a random $\alpha$ -balanced distribution in ${\mathbb{F}}_{p}$ . Then we have

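$$\lim_{n\to\infty}{\mathbf{P}}(M\text{ has rank }n-k)=\frac{1}{p^{k^{2}}}\frac{\prod_{i=k+1}^{\infty}(1-p^{-i})}{\prod_{i=1}^{k}(1-p^{-i})}.$$

Here we say that a random variable $\xi$ in ${\mathbb{F}}_{p}$ is $\alpha$-balanced (for simplicity, our notion here is weaker than those from [23, 24, 27, 30] in that $\alpha$ is fixed) if

$$\max_{a}{\mathbf{P}}(\xi=a)\leq 1-\alpha.$$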
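The $\alpha$-balanced condition is easy to test for a concrete distribution. The sketch below computes the largest admissible $\alpha$ for the reduction modulo $p$ of a uniform measure on a finite set of integers, using the set $\{-1,0,3\}$ that appears later in the text as an example; the helper names `reduce_uniform` and `best_alpha` are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def reduce_uniform(support, p):
    """Law of (uniform on `support`) reduced mod p, as {residue: probability}."""
    counts = Counter(a % p for a in support)
    total = len(support)
    return {r: Fraction(c, total) for r, c in counts.items()}


def best_alpha(dist):
    """Largest alpha such that max_a P(xi = a) <= 1 - alpha."""
    return 1 - max(dist.values())
```

With these definitions, `best_alpha(reduce_uniform([-1, 0, 3], 2))` is `Fraction(1, 3)` because $-1$ and $3$ collide modulo $2$, whereas modulo $5$ the three residues stay distinct and the value is `Fraction(2, 3)`. A result of $0$ means the reduction is a single point and the variable is not $\alpha$-balanced for any $\alpha>0$.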

The method of [23, 24] (see also [27]) relies on a swapping technique from [9, 38, 4], and can yield an exponentially small error bound of type $\exp(-c\alpha n)$ in Theorem 1.3. However, this technique is quite delicate, and does not seem to extend to other interesting models of matrices such as symmetric matrices. The method of [42, 43] relies mainly on the moment method, which extends rather easily to matrices with entries in ${\mathbb{Z}}/r{\mathbb{Z}}$ for composite $r$ (and controls other algebraic statistics besides the rank), but one has to assume $r$ to be sufficiently small.

One of the main goals of this note is to provide three alternative methods, which we will call the “arithmetic approach” (after [28, 40]), “geometric approach” (after [32]) and “combinatorial approach” (after [8]). Although the error bounds obtained by these methods are usually of subexponential type (rather than exponential type), we believe that the methods will be extremely useful for the study of random matrices in finite fields. For instance the methods can be adapted to matrices with constraints such as symmetric matrices and antisymmetric matrices [31] to answer a question from [5]. To highlight a result of this method, we show in Section 6 the following result.

Theorem 1.4 (Rank distribution).

Assume that $0\leq d\leq n^{c}$ for a sufficiently small constant $c$ . Assume that $p\leq\exp(n^{c})$ . Then for a random $n\times n$ matrix $M$ with entries being iid copies of an $\alpha$ -balanced random variable $\xi$ in ${\mathbb{F}}_{p}$ we have

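$${\mathbf{P}}(\operatorname{rank}(M)=n-d)=\frac{1}{p^{d^{2}}}\frac{\prod_{i=d+1}^{\infty}(1-p^{-i})}{\prod_{i=1}^{d}(1-p^{-i})}+O(e^{-n^{c^{\prime}}}),$$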

where $c^{\prime}$ is another (sufficiently small) positive constant depending on $c$ and $\alpha$ .

Note that we can also establish a similar rank distribution for rectangular matrices of size $(n+u)\times n$ for a fixed $u$ by a similar method. These results are not new and are weaker than existing results in the literature (see for instance [23, 24], [27, Theorem A.4] and [30, Theorem 5.3]). However, as mentioned, our approach is new and seems to be robust (for instance, it can be used to prove Theorems 1.5, 1.6, 1.7 and 1.8 below). More precisely, to establish Theorem 1.4 we will analyze the normal vectors ${\mathbf{w}}=(w_{1},\dots,w_{n})$ of random subspaces, for which we will show that the random sum $\sum_{i}\xi_{i}w_{i}$ spreads out uniformly in ${\mathbb{F}}_{p}$ very quickly.

Theorem 1.5 (Non-structure of the normal vectors).

With the same assumption as in Theorem 1.4, let $X_{1},\dots,X_{n-d},X_{n-d+1}$ be the first $n-d+1$ column vectors of $M$ . Let ${\mathbf{w}}=(w_{1},\dots,w_{n})$ be any non-zero vector that is orthogonal to $W_{n-d}=\langle X_{1},\dots,X_{n-d}\rangle$ . Then with probability at least $1-\exp(-\Theta(n))$ with respect to $X_{1},\dots,X_{n-d}$ we have

$$\sup_{a\in{\mathbb{F}}_{p}}\Big|{\mathbf{P}}\Big(\sum\xi_{i}w_{i}=a\Big)-1/p\Big|\leq\exp(-n^{c}),$$

where $\xi_{i}$ are iid copies of $\xi$ .
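The quantity controlled in Theorem 1.5, namely the largest deviation of ${\mathbf{P}}(\sum_{i}\xi_{i}w_{i}=a)$ from $1/p$, can be computed exactly for a given vector ${\mathbf{w}}$ by convolving one step at a time. The sketch below is only meant to illustrate the contrast between a structured (sparse) ${\mathbf{w}}$ and a spread-out one; the function names and the toy step law are our assumptions.

```python
from fractions import Fraction


def walk_distribution(weights, step_law, p):
    """Exact law of sum_i xi_i * w_i mod p, with xi_i iid from step_law.

    step_law maps a value of xi to its probability (as a Fraction).
    """
    law = [Fraction(0)] * p
    law[0] = Fraction(1)
    for w in weights:
        new = [Fraction(0)] * p
        for s, ps in enumerate(law):
            if ps == 0:
                continue
            for v, pv in step_law.items():
                new[(s + v * w) % p] += ps * pv
        law = new
    return law


def discrepancy(weights, step_law, p):
    """max_a |P(sum_i xi_i w_i = a) - 1/p|."""
    law = walk_distribution(weights, step_law, p)
    return max(abs(x - Fraction(1, p)) for x in law)
```

For instance, with $\xi$ uniform on $\{0,1\}$ and $p=3$, the vector $(1,0,\dots,0)$ gives discrepancy exactly $1/3$ regardless of its length, while the all-ones vector of length $20$ already gives a discrepancy below $10^{-5}$.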

We will prove the above result by showing that with very high probability the normal vectors do not have any structure (in any arithmetic, geometric, or combinatorial sense). On the other hand, we will also show that the normal vectors actually behave like a uniform vector in ${\mathbb{F}}_{p}^{n}$ . This can be seen as a discrete analog of [29] where it was shown that normalized normal vectors of a random hyperplane in ${\mathbb{R}}^{n}$ behave like a uniform vector on the unit sphere.

Theorem 1.6 (Uniformity of the normal vectors).

With the same assumption as in Theorem 1.4, and conditioning on the event that the subspace $W_{n-1}$ generated by $X_{1},\dots,X_{n-1}$ has full rank, we have

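• For each $i\in\{1,\cdots,n\}$,

$$\big|{\mathbf{P}}(w_{i}=0)-1/p\big|\leq O(\exp(-n^{c^{\prime}})).$$

• For each $i\neq j$, and for any $a\in{\mathbb{F}}_{p}$ we have

$$\big|{\mathbf{P}}(\exists\,{\mathbf{w}}=(w_{1},\dots,w_{n})\in W_{n-1}^{\perp}:w_{i}=a\wedge w_{j}=1)-1/p\big|\leq O(\exp(-n^{c^{\prime}})).$$

• Furthermore, with $n_{a}$ being the number of $i$ such that $w_{i}=a$, if we assume $\delta<1$ and $\delta^{-2}p=o(n/\log n)$ then

$${\mathbf{P}}\Big(\wedge_{a=0}^{p-1}\big(|n_{a}/n-1/p|\leq\delta/p\big)\Big)\geq 1-e^{-c\delta^{2}n/p}.\tag{6}$$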

where $c>0$ is absolute.

We need $p$ to be smaller than $n$ in Eq. (6) so that $n_{a}$ is not vanishing on average. We remark that the above result holds trivially for the uniform model (when $X_{1},\dots,X_{n-1}$ are chosen uniformly at random from ${\mathbb{F}}_{p}^{n}$), as in this case ${\mathbf{w}}$ is distributed as a uniform vector. However, it is not clear at all why ${\mathbf{w}}$ also behaves like a uniform random vector even when the $X_{i}$ are sampled differently.

In the above results there is a natural connection between the ranks and the normal vectors. Somewhat more surprisingly, we show that these quantities can also be used to study the characteristic polynomials. Namely, we can obtain the following analog of Equation (4).

Theorem 1.7 (Divisibility of the characteristic polynomials).

With the same assumption as in Theorem 1.4, let $\varepsilon<1$ be any fixed constant. For a prime $p$ and fixed polynomial $\phi$ such that $C_{\phi}\leq p\leq n^{(1-\varepsilon)/2}$ and $\phi$ is irreducible over ${\mathbb{F}}_{p}$ we have

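$$\Big|{\mathbf{P}}(\phi(x)|\det(M-x))-\Big[1-\prod_{i=1}^{\infty}(1-p^{-i\deg(\phi)})\Big]\Big|=O(\exp(-cn^{1-\varepsilon}/p^{2})),$$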

where $c$ and the implied constant depend on $\varepsilon$ and $C_{\phi}$ is a constant that depends only on the degree of $\phi$ .

Also, we will show the following analog of Equation (2).

Theorem 1.8 (Universality for eigenvalue-free matrices).

Assume that $M$ is as in Theorem 1.4. We have

$$\lim_{p\to\infty}\lim_{n\to\infty}{\mathbf{P}}(M\text{ is eigenvalue-free})=1/e.$$

Thus, for instance, our result works for the following simple-looking model of random matrices of (mean zero) integral entries. Let $A\subset{\mathbb{Z}}$ be a finite deterministic set of integers (such as $A=\{-1,0,3\}$ or $A=\{-7,-1,0,1,2\}$ , etc.), and let $p$ be a prime so that the projection $\pi(A)$ of $A$ onto ${\mathbb{Z}}/p{\mathbb{Z}}$ is not a single point. Let $\xi$ be the image of the uniform measure on $A$ under $\pi$ , then with the same notations as above we have the following result.

Corollary 1.9.

Among all $|A|^{n^{2}}$ matrices whose entries are all in $A$:

• a $(\frac{1}{p^{d^{2}}}\frac{\prod_{i=d+1}^{\infty}(1-p^{-i})}{\prod_{i=1}^{d}(1-p^{-i})}+O(e^{-n^{c^{\prime}}}))$-portion have rank $n-d$ over ${\mathbb{F}}_{p}$;

• a $(1-\prod_{i=1}^{\infty}(1-p^{-i\deg(\phi)})+O(\exp(-cn^{1-o(1)}/p^{2})))$-portion have characteristic polynomial divisible by a given irreducible polynomial $\phi(x)$, for primes $p\lesssim n^{1-o(1)}$;

• a $(e^{-1}+o(1))$-portion are eigenvalue-free as $n\to\infty$ and $p\to\infty$;

• the normal vectors satisfy Theorems 1.5 and 1.6.
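For tiny $n$ the first item of Corollary 1.9 can be explored by listing all $|A|^{n^{2}}$ matrices and reducing them modulo $p$. The sketch below does this and also evaluates the limiting proportion; it is an illustration only (the corollary is asymptotic in $n$, so no agreement is expected at $n=2$), and all function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def rank_portions(A, n, p):
    """Map corank d -> exact portion of matrices with entries in A of rank n - d mod p."""
    counts = Counter()
    for entries in product(A, repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        counts[n - rank_mod_p(rows, p)] += 1
    total = len(A) ** (n * n)
    return {d: Fraction(c, total) for d, c in sorted(counts.items())}


def limiting_portion(d, p, terms=200):
    """Truncation of p^(-d^2) * prod_{i>d} (1 - p^-i) / prod_{i<=d} (1 - p^-i)."""
    num = 1.0
    for i in range(d + 1, terms + 1):
        num *= 1.0 - float(p) ** (-i)
    den = 1.0
    for i in range(1, d + 1):
        den *= 1.0 - float(p) ** (-i)
    return float(p) ** (-(d * d)) * num / den
```

Calling `rank_portions((-1, 0, 3), 2, 5)` enumerates the $81$ matrices with entries in $A=\{-1,0,3\}$ and groups them by corank modulo $5$; comparing with `limiting_portion(d, 5)` for growing $n$ (say $n=3$, with $3^{9}$ matrices) gives a rough sense of the convergence.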

1.10. Notation

We write ${\mathbf{P}}$ for probability and ${\mathbf{E}}$ for expected value. For an event $\mathcal{E}$ , we write $\bar{\mathcal{E}}$ for its complement. We write $\exp(x)$ for the exponential function $e^{x}$ . We use $[n]$ to denote $\{1,\dots,n\}$ .

For a given index set $J\subset[n]$ and a vector ${\mathbf{x}}=(x_{1},\dots,x_{n})$ , we write ${\mathbf{x}}|_{J}$ to be the subvector of ${\mathbf{x}}$ of components indexed from $J$ . Similarly, if $H$ is a subspace then $H|_{J}$ is the subspace spanned by ${\mathbf{x}}|_{J}$ for ${\mathbf{x}}\in H$ .

For a vector ${\mathbf{w}}=(w_{1},\dots,w_{n})$ we let $\operatorname{supp}({\mathbf{w}})=\{i\in[n]|w_{i}\neq 0\}$ . We will also write ${\mathbf{x}}\cdot{\mathbf{w}}$ for the dot product $\sum_{i=1}^{n}x_{i}w_{i}$ . We say ${\mathbf{w}}$ is a normal vector for a subspace $H$ if ${\mathbf{x}}\cdot{\mathbf{w}}=0$ for every ${\mathbf{x}}\in H$ .

For $I,J\subset[n]$ , the matrix $M_{I\times J}$ is the submatrix of the rows and columns indexed from $I$ and $J$ respectively. Sometimes we will also write $M_{n}$ for $M_{n\times n}$ if there is no confusion.

We write $\|.\|_{{\mathbb{R}}/{\mathbb{Z}}}$ to be the distance to the nearest integer. Sometimes, for a matrix $M$ we write ${\mathbf{r}}_{i}(M)$ and ${\mathbf{c}}_{i}(M)$ for its $i$-th row and column respectively. We write $X=O(Y)$, $Y=\Omega(X)$, $X\lesssim Y$, or $Y\gtrsim X$ if $|X|\leq CY$ for some fixed $C$. We also write $X\asymp Y$ if $X\lesssim Y$ and $Y\lesssim X$.

Our paper is organized as follows. We will first discuss tools to prove Equations (2), (3) and (4) in Section 2. We will present our characterization methods in Sections 3, 4, 5, and then use these results to prove Theorem 1.5 in Section 6. We will present a short proof of Theorem 1.4 in Section 7 and of Theorem 1.6 in Section 8. The remaining two sections are reserved to prove Theorem 1.7 and Theorem 1.8 respectively.

2. The uniform model

In this part we discuss the method to prove Equations (2), (3) and (4). Although this is not the main goal of the note, we would like to present it here for pedagogical purposes, as for most of the cases the universal statistics are computed from the uniform model. We refer the reader to [16] for a comprehensive survey on the method and its other applications.

We first introduce a simple representative for $\operatorname{Mat}(n,q)$ (or $GL(n,q)$ ) modulo the conjugacy action of $GL(n,q)$ . To motivate the formulas, let us first introduce a simpler variant for the permutation groups $S_{n}$ . For a permutation $\pi$ let $n_{i}(\pi)$ be the number of cycles of length $i$ of $\pi$ . The cycle index of a subgroup $G$ of $S_{n}$ is defined as $\frac{1}{|G|}\sum_{\pi\in G}\prod_{i\geq 1}x_{i}^{n_{i}(\pi)}$ . The function $f(u,{\mathbf{x}})=1+\sum_{n\geq 1}\frac{u^{n}}{n!}\sum_{\pi\in S_{n}}\prod_{i\geq 1}x_{i}^{n_{i}(\pi)}$ is called the cycle index generating function of the symmetric groups and Pólya’s result shows that $f(u,{\mathbf{x}})=\prod_{m\geq 1}e^{\frac{x_{m}u^{m}}{m}}$ . This formula is useful in the study of (conjugacy) class functions of permutations.
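Pólya's product formula can be verified directly for small $n$. Taking the coefficient of $u^{n}$ in $\prod_{m\geq 1}e^{x_{m}u^{m}/m}$ and differentiating the exponential gives the recurrence $n\,a_{n}=\sum_{m=1}^{n}x_{m}a_{n-m}$ with $a_{0}=1$, which can be compared with a brute-force sum over $S_{n}$. The sketch below does this; the function names are ours, and the variables $x_{i}$ are passed as a dictionary of numeric values.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_counts(perm):
    """Return {cycle length: number of cycles} for a permutation given as a tuple."""
    seen, counts = set(), {}
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        counts[length] = counts.get(length, 0) + 1
    return counts


def cycle_index_bruteforce(n, x):
    """(1/n!) * sum over pi in S_n of prod_i x[i]^(n_i(pi))."""
    total = Fraction(0)
    for perm in permutations(range(n)):
        term = Fraction(1)
        for length, mult in cycle_counts(perm).items():
            term *= Fraction(x[length]) ** mult
        total += term
    return total / factorial(n)


def cycle_index_from_polya(n, x):
    """Coefficient of u^n in prod_m exp(x_m u^m / m), via k*a_k = sum_m x_m a_{k-m}."""
    a = [Fraction(1)]
    for k in range(1, n + 1):
        a.append(sum(Fraction(x[m]) * a[k - m] for m in range(1, k + 1)) / k)
    return a[n]
```

Setting $x_{1}=0$ and all other $x_{i}=1$ singles out derangements; for $n=4$ both functions return $9/24=3/8$, the proportion of derangements in $S_{4}$.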

For matrices over ${\mathbb{F}}_{q}$ , the cycle index generating functions can be described by first giving some information on the conjugacy classes. Let $\lambda$ be a partition of some non-negative integer $|\lambda|$ into integer parts $\lambda_{1}\geq\lambda_{2}\geq\dots\geq 0$ . In what follows $m_{i}(\lambda)$ denotes the number of parts of $\lambda$ of size $i$ , $\lambda^{\prime}$ is the partition dual to $\lambda$ , and $(u/q)_{i}$ denotes $(1-u/q)\cdots(1-u/q^{i})$ .

Recall that we define the characteristic polynomial of an $n\times n$ matrix $M$ as $\det(M-x)$ . Assume that the irreducible decomposition of the characteristic polynomial of a matrix $M$ has the form $\det(M-x)=\prod_{i=1}^{k}\phi_{i}(x)^{\lambda_{\phi_{i}}}$ . The rational canonical form of the conjugacy class containing $M$ is a matrix of form

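$$\begin{pmatrix}R_{1}&0&\dots&0\\ 0&R_{2}&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&R_{k}\end{pmatrix},$$

where each matrix $R_{i}$ has the form

$$R=\begin{pmatrix}C(\phi_{i}^{\lambda_{1}})&0&\dots&0\\ 0&C(\phi_{i}^{\lambda_{2}})&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&C(\phi_{i}^{\lambda_{m}})\end{pmatrix},$$

and $\sum_{i}\lambda_{i}=\lambda_{\phi_{i}}$. Also we have the constraint that $\sum_{i=1}^{k}\lambda_{\phi_{i}}\deg(\phi_{i})=n$. Here for $\phi(x)=\sum_{i=0}^{d-1}a_{i}x^{i}+x^{d}$, the companion matrix $C(\phi)$ is defined as

$$C(\phi):=\begin{pmatrix}0&0&\dots&0&-a_{0}\\ 1&0&\dots&0&-a_{1}\\ 0&1&\dots&0&-a_{2}\\ \vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&\dots&1&-a_{d-1}\end{pmatrix}.$$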

In other words, we have the decomposition of ${\mathbb{F}}_{q}^{n}=\oplus_{\phi}V_{\phi}$ where the characteristic polynomial of $\alpha$ on $V_{\phi}$ is $\phi^{k}$ , and furthermore $V_{\phi}=\oplus_{i}V_{i}$ where $V_{i}$ are cyclic subspaces with dimension $\lambda_{i}\deg(\phi)$ .
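As a small sanity check on the definition of the companion matrix, one can build $C(\phi)$ over ${\mathbb{F}}_{p}$ and confirm that $\phi(C(\phi))=0$, as the Cayley-Hamilton theorem predicts. The sketch below assumes $p$ prime and represents $\phi(x)=x^{d}+a_{d-1}x^{d-1}+\dots+a_{0}$ by the list `[a_0, ..., a_{d-1}]`; the function names are hypothetical.

```python
def companion(a, p):
    """Companion matrix over F_p of phi(x) = x^d + a[d-1] x^(d-1) + ... + a[0]."""
    d = len(a)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1              # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = (-a[i]) % p    # last column holds -a_0, ..., -a_{d-1}
    return C


def mat_mul(X, Y, p):
    """Product of two square matrices over F_p."""
    d = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(d)) % p for j in range(d)]
            for i in range(d)]


def phi_at_companion(a, p):
    """Evaluate phi(C(phi)) over F_p with Horner's rule."""
    d = len(a)
    C = companion(a, p)
    identity = [[int(i == j) for j in range(d)] for i in range(d)]
    R = identity  # the leading coefficient of phi is 1
    for coeff in reversed(a):
        R = mat_mul(R, C, p)
        R = [[(R[i][j] + coeff * identity[i][j]) % p for j in range(d)]
             for i in range(d)]
    return R
```

For $\phi(x)=x^{2}+x+1$ over ${\mathbb{F}}_{2}$, `companion([1, 1], 2)` is `[[0, 1], [1, 1]]` and `phi_at_companion([1, 1], 2)` is the zero matrix.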

Note that in the data given above, each irreducible polynomial $\phi$ is assigned a partition $\lambda_{\phi}(\alpha)$. For example, for $M=I_{n}$ we have $\lambda_{z-1}=(1,1,\dots,1)$ and $\lambda_{\phi}=\emptyset$ for all other $\phi$; while for

$$M=\begin{pmatrix}1&0&\dots&0\\ 1&1&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&1\end{pmatrix}$$

we have $\lambda_{z-1}=(2,1,\dots,1)$ and $\lambda_{\phi}=\emptyset$ for all other $\phi$.

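To introduce the cycle index formula for $\operatorname{Mat}(n,q)$, let $x_{\phi,\lambda}$ be variables corresponding to pairs of polynomials and partitions. Define

$$Z_{\operatorname{Mat}(n,q)}:=\frac{1}{|GL(n,q)|}\sum_{\alpha\in\operatorname{Mat}(n,q)}\prod_{\phi,|\lambda_{\phi}(\alpha)|>0}x_{\phi,\lambda_{\phi}(\alpha)}.$$

Beautiful results of Kung [22] and Stong [36] showed that

$$1+\sum_{n=1}^{\infty}Z_{\operatorname{Mat}(n,q)}u^{n}=\prod_{\phi}\Big[1+\sum_{n\geq 1}\sum_{\lambda\vdash n}x_{\phi,\lambda}\frac{u^{n\deg(\phi)}}{q^{\deg(\phi)\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q^{\deg(\phi)}})_{m_{i}(\lambda_{\phi})}}\Big].$$

Note that one can also define $Z_{GL(n,q)}$ similarly. The above formula allows one to study class functions for matrices over ${\mathbb{F}}_{q}$. We now give a proof of Equation (2); Equations (3) and (4) can be proved similarly by specializing the variables $x_{\phi,\lambda}$ appropriately.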

Proof.

(of Equation (2)) In the cycle index formula above, by specializing the variables $x_{\phi,\lambda}$ we may count different subsets of $Z_{Mat(n,q)}$ . For instance if we set $x_{\phi,\lambda}=1$ we get everything so,

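$$\prod_{\phi}\Big[1+\sum_{n\geq 1}\sum_{\lambda\vdash n}\frac{u^{n\deg(\phi)}}{q^{\deg(\phi)\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q^{\deg(\phi)}})_{m_{i}(\lambda_{\phi})}}\Big]=(1-u)^{-1}.\tag{7}$$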

We want to count matrices with no fixed subspace. In terms of $x_{\phi,\lambda}$ this is the same as $x_{\phi,\lambda}=0$ for linear $\phi$ and $x_{\phi,\lambda}=1$ otherwise. Making this assignment and using (7) we have,

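$$1+\sum_{n=1}^{\infty}\frac{H_{n,q}}{|GL(n,q)|}u^{n}=\Big(1+\sum_{i=1}^{\infty}\sum_{\lambda\vdash i}\frac{u^{i}}{q^{\sum_{i}(\lambda_{i}^{\prime})^{2}}\prod_{i\geq 1}(\frac{1}{q})_{m_{i}(\lambda_{\phi})}}\Big)^{1-q}(1-u)^{-1},$$

where $H_{n,q}$ is the number of derangements in $GL(n,q)$. Now the $n^{th}$ coefficient of this generating function tends, as $n\to\infty$, to the first factor of the product evaluated at $u=1$, and by a result of Fine-Herstein [13] we have (with $[q]_{i}=\prod_{k=0}^{i-1}(q^{i}-q^{k})$),

$$\frac{H_{n,q}}{|GL(n,q)|}\to\Big(1+\sum_{i=1}^{\infty}\frac{q^{i(i-1)}}{[q]_{i}}\Big)^{1-q}\text{ as }n\to\infty.$$

Some cursory analysis (using the fact that the asymptotic behavior of the sum is determined by its first term) shows

$$\Big(1+\sum_{i=1}^{\infty}\frac{q^{i(i-1)}}{[q]_{i}}\Big)^{1-q}\to 1/e\text{ as }q\to\infty,$$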

as desired. ∎
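The convergence of $\big(1+\sum_{i\geq 1}q^{i(i-1)}/[q]_{i}\big)^{1-q}$ to $1/e$ can also be observed numerically. The sketch below truncates the series and evaluates it with exact rational arithmetic before taking the power; the function names and the truncation length `terms` are our own choices.

```python
from fractions import Fraction
from math import exp


def gl_order(i, q):
    """[q]_i = prod_{k=0}^{i-1} (q^i - q^k)."""
    out = 1
    for k in range(i):
        out *= q ** i - q ** k
    return out


def eigenvalue_free_limit(q, terms=30):
    """Truncation of (1 + sum_{i>=1} q^(i(i-1)) / [q]_i)^(1 - q)."""
    s = Fraction(1)
    for i in range(1, terms + 1):
        s += Fraction(q ** (i * (i - 1)), gl_order(i, q))
    return float(s) ** (1 - q)
```

For small $q$ the value is far from $1/e$ (about $0.289$ for $q=2$), while for $q=1009$ it agrees with $e^{-1}\approx 0.3679$ to about three decimal places, consistent with the first term $1/(q-1)$ of the sum dominating.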

3. Structures of vectors in ${\mathbb{F}}_{p}^{n}$ : an almost optimal characterization

Let $G$ be an (additive) abelian group. A set $Q$ is a generalized arithmetic progression (GAP) of rank $r$ if it can be expressed in the form

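$$Q=\{a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}\mid M_{i}\leq x_{i}\leq M_{i}^{\prime}\text{ and }x_{i}\in{\mathbb{Z}}\text{ for all }1\leq i\leq r\}$$

for some elements $a_{0},\ldots,a_{r}$ of $G$, and for some integers $M_{1},\ldots,M_{r}$ and $M^{\prime}_{1},\ldots,M^{\prime}_{r}$. One can think of $Q$ as the image of an integer box $B=\{(x_{1},...,x_{r})\in{\mathbb{Z}}^{r}|M_{i}\leq x_{i}\leq M_{i}^{\prime}\}$ under the linear map

$$\Phi:(x_{1},\dots,x_{r})\mapsto a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}.$$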

Given $Q$ with a representation as above, the numbers $a_{i}$ are generators of $Q$ , the numbers $M_{i}$ and $M_{i}^{\prime}$ are dimensions of $Q$ , and ${\operatorname{Vol}}(Q):=|B|$ is the volume of $Q$ associated to this presentation (i.e. this choice of $a_{i},M_{i},M_{i}^{\prime}$ ). We say that $Q$ is proper for this presentation if the above linear map is one to one, or equivalently if $|Q|=|B|$ . For an integer $t\geq 1$ , we let $Q_{t}$ denote the dilation of $Q$ by $t$ , i.e.

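$$Q_{t}=\{a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}\mid tM_{i}\leq x_{i}\leq tM_{i}^{\prime}\text{ and }x_{i}\in{\mathbb{Z}}\text{ for all }1\leq i\leq r\}$$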

and we say $Q$ is $t$ -proper if $Q_{t}$ is also proper. If $-M_{i}=M_{i}^{\prime}$ for all $i\geq 1$ and $a_{0}=0$ , we say that $Q$ is symmetric for this presentation. A coset progression in $G$ is a set of type $H+Q$ , where $H$ is a subgroup of $G$ .
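The notions of volume, properness and $t$-properness are easy to experiment with in $G={\mathbb{Z}}/p{\mathbb{Z}}$. The sketch below builds a GAP from its generators and dimensions and tests whether the map from the (dilated) box is one to one; the function names and the numerical example are illustrative assumptions only.

```python
from itertools import product


def gap_elements(p, a0, gens, lows, highs, t=1):
    """Set of a0 + sum_i x_i * gens[i] mod p with t*lows[i] <= x_i <= t*highs[i]."""
    ranges = [range(t * lo, t * hi + 1) for lo, hi in zip(lows, highs)]
    return {(a0 + sum(x * a for x, a in zip(xs, gens))) % p
            for xs in product(*ranges)}


def volume(lows, highs, t=1):
    """Number of integer points in the box dilated by t."""
    vol = 1
    for lo, hi in zip(lows, highs):
        vol *= t * hi - t * lo + 1
    return vol


def is_t_proper(p, a0, gens, lows, highs, t=1):
    """True if the dilated box maps injectively into Z/pZ, i.e. |Q_t| equals its volume."""
    return len(gap_elements(p, a0, gens, lows, highs, t)) == volume(lows, highs, t)
```

For example, in ${\mathbb{Z}}/17{\mathbb{Z}}$ the symmetric rank-two GAP with generators $1$ and $5$ and dimensions $-1\leq x_{i}\leq 1$ is proper (nine distinct elements), but it is not $2$-proper: the dilated box has $25$ points, which cannot fit injectively into a group of order $17$.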

Our main result here is that, if a random walk in ${\mathbb{Z}}/p{\mathbb{Z}}$ does not spread out evenly fast, then the steps must be arithmetically correlated (and vice versa).

Theorem 3.1 (Arithmetic structure, characterization I).

Let $\varepsilon<1$ and $C$ be positive constants. Suppose $\mu$ is a random variable that is $\alpha$ -balanced taking values in ${\mathbb{Z}}/p{\mathbb{Z}}$ and that ${\mathbf{w}}=(w_{1},\cdots,w_{n})\in({\mathbb{Z}}/p{\mathbb{Z}})^{n}$ is such that

[TABLE]

where $\mu_{1},\cdots,\mu_{n}$ are independent and identically distributed copies of $\mu$ and $p$ is an odd prime possibly depending on $n$ . Then for any $n^{\epsilon/2}\leq n^{\prime}\leq n$ , there is a set $W^{\prime}$ of $n-n^{\prime}$ components $w_{i}$ such that one of the following holds.

•

For $p\lesssim n^{C}$ , there exists a GAP of rank one $Q$ that contains $W^{\prime}$ , where

[TABLE]

•

For $p\gtrsim n^{C}$ , there exists a proper symmetric GAP $Q$ of rank $r=O_{C,\epsilon}(1)$ that contains $W^{\prime}$ , where

[TABLE]

Note that our characterization is almost optimal in the sense that it nearly implies the backward direction: if ${\mathbf{w}}$ satisfies the conclusion of the theorem, then $\rho=\Omega(n^{-C})$. It also implies that $\rho\lesssim n^{-1/2+o(1)}$ (for any $p$), recovering a result by Maples (see Theorem 4.1); this is because if a positive portion of the $w_{i}$ are non-zero, then the set $Q$ must have size at least $1$ and $r\geq 1$. Our presentation here follows [30] with some modifications (in [30] we focused only on large $p$, and on the quantity $\sup_{a\in{\mathbb{Z}}/p{\mathbb{Z}}}{\mathbf{P}}(\mu_{1}w_{1}+\cdots+\mu_{n}w_{n}=a)$ rather than on $\rho$ as above). We will make use of two results from [39] by Tao and Vu. The first result allows one to pass from coset progressions to proper coset progressions without any substantial loss.

Theorem 3.2.

[39, Corollary 1.18] There exists a positive integer $C_{1}$ such that the following statement holds. Let $Q$ be a symmetric coset progression of rank $d\geq 0$ and let $t\geq 1$ be an integer. Then there exists a $t$-proper symmetric coset progression $P$ of rank at most $d$ such that we have

[TABLE]

We also have the size bound

[TABLE]

The second result, which is directly relevant to us, says that as long as $|kX|$ grows slowly compared to $|X|$ , then it can be contained in a structure. This is a long-range version of the Freiman-Ruzsa theorem.

Theorem 3.3.

[39, Theorem 1.21] There exists a positive integer $C_{2}$ such that the following statement holds: whenever $d,k\geq 1$ and $X\subset G$ is a non-empty finite set such that

[TABLE]

then there exists a proper symmetric coset progression $H+Q$ of rank $0\leq d^{\prime}\leq d-1$ and size $|H+Q|\geq 2^{-2^{C_{2}d^{2}2^{6d}}}k^{d^{\prime}}|X|$ and $x,x^{\prime}\in G$ such that

[TABLE]

Note that any GAP $Q=\{a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}:-N_{i}\leq x_{i}\leq N_{i}\hbox{ for all }1\leq i\leq r\}$ is contained in a symmetric GAP $Q^{\prime}=\{x_{0}a_{0}+x_{1}a_{1}+\dots+x_{r}a_{r}:-1\leq x_{0}\leq 1,-N_{i}\leq x_{i}\leq N_{i}\hbox{ for all }1\leq i\leq r\}$ . Thus, by combining Theorem 3.3 with Theorem 3.2 we obtain the following

Corollary 3.4.

Whenever $d,k\geq 1$ and $X\subset G$ is a non-empty finite set such that

[TABLE]

then there exists a 2-proper symmetric coset progression $H+P$ of rank $0\leq d^{\prime}\leq d$ and size $|H+P|\leq 2^{d}(C_{1}d)^{3d^{2}/2}2^{d2^{C_{2}d^{2}2^{6d}}}|kX|$ such that

[TABLE]

Proof.

(of Theorem 3.1) First, for convenience we will pass to symmetric distributions. Let $\psi=\mu-\mu^{\prime}$ be the symmetrization and let $\psi^{\prime}$ be a lazy version of $\psi$ so that

[TABLE]

Notice that $\psi^{\prime}$ is symmetric as $\psi$ is symmetric. We can check that $\max_{x}{\mathbf{P}}(\psi=x)\leq 1-\alpha$ , and so

[TABLE]

We assume that ${\mathbf{P}}(\psi^{\prime}=t_{j})={\mathbf{P}}(\psi^{\prime}=-t_{j})=\beta_{j}/2>0$ for $1\leq j\leq l,t_{j}\neq 0$ , and that ${\mathbf{P}}(\psi^{\prime}=0)=\beta_{0}$ , where $t_{j_{1}}\pm t_{j_{2}}\neq 0\ \operatorname{mod}\ p$ for all $1\leq j_{1},j_{2}\leq l$ and $j_{1}\neq j_{2}$ . Denote $S=\mu_{1}w_{1}+\dots+\mu_{n}w_{n}$ . Consider $a\in{\mathbb{Z}}/p{\mathbb{Z}}$ where ${\mathbf{P}}(S=a)$ is maximum (or minimum). Using the standard notation $e_{p}(x)$ for $\exp(2\pi\sqrt{-1}x/p)$ , we have

[TABLE]

So

[TABLE]

By independence

[TABLE]

It follows that

[TABLE]

where we made the change of variable $x\rightarrow x/2$ (in ${\mathbb{Z}}/p{\mathbb{Z}}$ ) and used the triangle inequality.

By convexity, we have that $|\sin\pi z|\geq 2\|z\|$ for any $z\in{\mathbb{R}}$ , where $\|z\|:=\|z\|_{{\mathbb{R}}/{\mathbb{Z}}}$ is the distance of $z$ to the nearest integer. Thus,

[TABLE]
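The elementary inequality $|\sin\pi z|\geq 2\|z\|$ used above is easy to test numerically. The following is only a sanity check on a finite grid (not a proof); the grid and the tolerance are arbitrary choices of ours.

```python
import math

def dist_to_nearest_integer(z):
    """||z||_{R/Z}: the distance from z to the nearest integer."""
    return abs(z - round(z))

def sine_lower_bound_holds(z, tol=1e-12):
    """Check |sin(pi z)| >= 2 ||z|| up to a small floating-point tolerance."""
    return abs(math.sin(math.pi * z)) >= 2 * dist_to_nearest_integer(z) - tol

# Grid of test points in [-3, 3]; equality is attained at integers and half-integers.
grid = [k / 1000 for k in range(-3000, 3001)]
print(all(sine_lower_bound_holds(z) for z in grid))
```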

Hence for each $w_{i}$

[TABLE]

Consequently, we obtain a key inequality

[TABLE]

Large level sets. Now we consider the level sets $S_{m}:=\{x|x\neq 0\wedge\sum_{i=1}^{n}\sum_{j=1}^{l}\beta_{j}\|\frac{xt_{j}w_{i}}{p}\|^{2}\leq m\}$ . We have

[TABLE]

As $\sum_{m\geq 1}\exp(-m)<1$ , there must be a large level set $S_{m}$ such that

[TABLE]

In fact, since $\rho\geq n^{-C}$ , we can assume that $m=O(\log n)$ . The bound $|S_{m}|\geq\exp(m-2)\rho p$ guarantees that $S_{m}$ is non-empty. Now we consider two cases.
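To get a feeling for the level sets $S_{m}$ , one can compute them directly for small parameters. The sketch below makes the simplifying assumption $l=1$ , $t_{1}=1$ , $\beta_{1}=1$ (this is our toy choice, not the setting of the proof), so that $S_{m}=\{x\neq 0:\sum_{i}\|xw_{i}/p\|^{2}\leq m\}$ . The two test vectors are hypothetical: one has entries in a short progression dilated by $17$ , the other has entries spread over ${\mathbb{Z}}/p{\mathbb{Z}}$ .

```python
def norm_rz(z):
    """Distance from z to the nearest integer."""
    return abs(z - round(z))

def energy(x, w, p):
    """sum_i ||x w_i / p||^2, the quantity defining S_m when l = 1, t_1 = 1, beta_1 = 1."""
    return sum(norm_rz((x * wi % p) / p) ** 2 for wi in w)

def level_set(m, w, p):
    """All non-zero x in Z/pZ whose energy is at most m."""
    return [x for x in range(1, p) if energy(x, w, p) <= m]

p = 1009
# Entries c * 17 with c in {-2, ..., 2}: a short progression hidden by a dilation.
structured = [((i % 5) - 2) * 17 % p for i in range(60)]
# Entries spread over Z/pZ (powers of 3), used as an unstructured comparison.
generic = [pow(3, i, p) for i in range(1, 61)]
for m in (1, 2, 4):
    print(m, len(level_set(m, structured, p)), len(level_set(m, generic, p)))
```

For the structured vector, every $x$ with $17x$ small modulo $p$ has small energy, which is the mechanism behind the large level sets exploited in the two cases below.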

Case 1. We assume $p\lesssim n^{C}$ . We know that $S_{m}$ is non-empty, and hence there exists $x_{0}\neq 0$ so that

[TABLE]

Set

[TABLE]

Then by definition of $\xi$ , we have

[TABLE]

Thus we can rewrite the above as

[TABLE]

Thus there exists an index $j_{0}$ so that $\beta_{j_{0}}\sum_{i=1}^{n}\|\frac{x_{0}t_{j_{0}}w_{i}}{p}\|^{2}\leq 2\alpha^{-1}m\beta_{j_{0}}$ , that is

[TABLE]

So, for most $w_{i}$

[TABLE]

More precisely, by averaging, the set of $w_{i}$ satisfying (15) has size at least $n-n^{\prime}$ . We call this set $W^{\prime}$ . The set $\{w_{1},\dots,w_{n}\}\backslash W^{\prime}$ has size at most $n^{\prime}$ and this is the exceptional set that appears in Theorem 3.1. By definition, for $w_{i}$ from this set we have

[TABLE]

Hence we have seen that, after a dilation by $x_{0}t_{j_{0}}$ , $W^{\prime}$ belongs to the arithmetic progression $P$ of rank one and of size $O(p\sqrt{(\log n)/n^{\prime}})$ ,

[TABLE]

Notice that in this case we don’t have to assume $n^{\prime}\geq n^{\varepsilon/2}$ .
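The mechanism of Case 1 can be illustrated numerically: a single good frequency $x_{0}$ dilates most of the $w_{i}$ into a short interval around $0$ . In the sketch below we again take $t_{j_{0}}=1$ (an assumption made for simplicity), choose $x_{0}$ by minimizing $\sum_{i}\|xw_{i}/p\|^{2}$ over all non-zero $x$ , and use a hypothetical vector whose entries are small integers disguised by the dilation $123$ .

```python
def centered(a, p):
    """Representative of a mod p in the interval [-(p-1)/2, (p-1)/2] (p odd)."""
    a %= p
    return a - p if a > p // 2 else a

def best_dilation(w, p):
    """Non-zero x minimizing sum_i ||x w_i / p||^2 (equivalently the sum of squared centered residues)."""
    return min(range(1, p), key=lambda x: sum(centered(x * wi, p) ** 2 for wi in w))

p = 1009
hidden = [((7 * i) % 11) - 5 for i in range(80)]   # small integers in [-5, 5]
w = [(h * 123) % p for h in hidden]                # the structure is hidden by a dilation
x0 = best_dilation(w, p)
dilated = [centered(x0 * wi, p) for wi in w]
print(x0, max(abs(d) for d in dilated))
```

After the dilation by $x_{0}$ all entries lie in a rank one progression centered at $0$ , whose length is controlled by the energy at $x_{0}$ .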

Case 2. We assume $p\gtrsim n^{C}$ . By double-counting we have

[TABLE]

So, for most $w_{i}$ ,

[TABLE]

for some large constant $C_{0}$ .

By averaging, the set of $w_{i}$ satisfying (17) has size at least $n-n^{\prime}$ . We call this set $W^{\prime}$ . The set $\{w_{1},\dots,w_{n}\}\backslash W^{\prime}$ has size at most $n^{\prime}$ and this is the exceptional set that appears in Theorem 3.1. In the rest of the proof, we are going to show that $W^{\prime}$ is a dense subset of a proper GAP.

Since $\|\cdot\|$ is a norm, by the triangle inequality, we have for any $a\in kW^{\prime}$

[TABLE]

More generally, for any $k^{\prime}\leq k$ and $a\in k^{\prime}W^{\prime},$

[TABLE]

Dual sets. Define

[TABLE]

where the constant $200$ is ad hoc and any sufficiently large constant would do. We have

[TABLE]

To see this, define $T_{a}:=\sum_{x\in S_{m}}\sum_{j=1}^{l}\beta_{j}\cos\frac{2\pi at_{j}x}{p}$ . Using the fact that $\cos 2\pi z\geq 1-100\|z\|^{2}$ for any $z\in{\mathbb{R}}$ , we have, for any $a\in S_{m}^{\ast},$

[TABLE]

On the other hand, using the basic identity $\sum_{a\in{\mathbb{Z}}/p{\mathbb{Z}}}\cos\frac{2\pi az}{p}=p{\mathbf{I}}_{z=0}$ , we have (taking into account that $t_{j_{1}}\neq t_{j_{2}}\ \operatorname{mod}\ p$ )

[TABLE]

Equation (20) then follows from the last two estimates and averaging.

Next, for a properly chosen constant $c_{1}$ we set

[TABLE]

By (19) we have $\cup_{k^{\prime}=1}^{k}k^{\prime}W^{\prime}\subset S_{m}^{\ast}$ . Next, set

[TABLE]

We have $kW^{{}^{\prime\prime}}\subset S_{m}^{\ast}\cup\{0\}$ . This results in the critical bound

[TABLE]

We are now in a position to apply Corollary 3.4 with $X$ as the set of distinct elements of $W^{{}^{\prime\prime}}$ . As $k=\Omega(\sqrt{\frac{\alpha_{n}^{\prime}n^{\prime}}{m}})=\Omega(\sqrt{\frac{\alpha_{n}^{\prime}n^{\prime}}{\log n}})$ ,

[TABLE]

It follows from Corollary 3.4 that $kX$ is a subset of a 2-proper symmetric coset progression $H+P$ of rank $r=O_{C,\epsilon_{0}}(1)$ and cardinality

[TABLE]

Now we use the special property of ${\mathbb{Z}}/p{\mathbb{Z}}$ that its only proper subgroup is the trivial one. As $|kX|=O(n^{C})$ , and as $p\gtrsim n^{C}$ , the only way that $|kX|\gtrsim|H+P|$ is that $H=\{0\}$ . Consequently, $kX$ is now a subset of $P$ , a 2-proper symmetric GAP of rank $r=O_{C,\epsilon_{0}}(1)$ and cardinality

[TABLE]

To this end, we apply the following dividing trick from [28, Lemma A.2].

Lemma 3.5.

Assume that $0\in X$ and that $P=\{\sum_{i=1}^{r}x_{i}a_{i}:|x_{i}|\leq N_{i}\}$ is a 2-proper symmetric GAP that contains $kX$ . Then $X\subset\{\sum_{i=1}^{r}x_{i}a_{i}:|x_{i}|\leq 2N_{i}/k\}$ .

Combining (23) and Lemma 3.5 we thus obtain a GAP $Q$ that contains $X$ and

[TABLE]

concluding the proof. ∎

Before concluding the section, we record here an elementary but useful result beyond the polynomial regime.

Theorem 3.6 (degenerate case).

Let $\varepsilon<1$ and $\delta$ be positive constants such that $\delta<\varepsilon$ . Let $p$ be an odd prime number. Suppose $\mu$ is a random variable that is $\alpha$ -balanced taking values in ${\mathbb{Z}}/p{\mathbb{Z}}$ . Also, assume that ${\mathbf{w}}=(w_{1},\cdots,w_{n})\in({\mathbb{Z}}/p{\mathbb{Z}})^{n}$ is such that

[TABLE]

where $\mu_{1},\cdots,\mu_{n}$ are independent and identically distributed copies of $\mu$ . Then for any $n^{\varepsilon/2}\leq n^{\prime}\leq n$ , there is a set $W^{\prime}$ of $n-n^{\prime}$ components $w_{i}$ and a GAP $Q$ of rank one that contains $W^{\prime}$ , where

[TABLE]

We note that the bound on $Q$ above is very close to the trivial bound $p$ . The result is effective for not too large $p$ .

Proof.

We proceed as in the proof of Theorem 3.1 up to (12), which gives

[TABLE]

We recall the level sets $S_{m}=\{x|x\neq 0\wedge\sum_{i=1}^{n}\sum_{j=1}^{l}\beta_{j}\|\frac{xt_{j}w_{i}}{p}\|^{2}\leq m\}$ . We have

[TABLE]

As $\sum_{m\geq 1}\exp(-m)<1$ , there must be a large level set $S_{m}$ such that

[TABLE]

In fact, since $\rho\geq\exp(-n^{\delta})$ , we can assume that $m=O(n^{\delta})$ . The bound $|S_{m}|\geq\exp(m-2)\rho p$ guarantees that $S_{m}$ is non-empty. Our next step is almost identical to the proof of the first part of Theorem 3.1. As $S_{m}$ is non-empty, there exists $x_{0}\neq 0$ so that

[TABLE]

With $\alpha^{\prime}$ as in (14) we have $\alpha^{\prime}\geq\alpha/2$ , and we can rewrite the above as

[TABLE]

Thus there exists an index $j_{0}$ so that $\beta_{j_{0}}\sum_{i=1}^{n}\|\frac{x_{0}t_{j_{0}}w_{i}}{p}\|^{2}\leq 2\alpha^{-1}m\beta_{j_{0}}$ , that is

[TABLE]

So, letting $W^{\prime}$ be the set of $w_{i}$ such that $\|\frac{x_{0}t_{j_{0}}w_{i}}{p}\|^{2}\leq\frac{2\alpha^{-1}m}{n^{\prime}}$ , we see that $W^{\prime}$ has at least $n-n^{\prime}$ elements. By definition, for $w_{i}\in W^{\prime}$ we have $\|\frac{x_{0}t_{j_{0}}w_{i}}{p}\|^{2}=O(\frac{m}{n^{\prime}})$ , and this implies that after a dilation by $x_{0}t_{j_{0}}$ the set $W^{\prime}$ belongs to the arithmetic progression $P$ of rank one

[TABLE]

Notice that the size of $P$ is bounded by $O(p\sqrt{(m)/n^{\prime}})=O(p\sqrt{n^{\delta}/n^{\prime}})$ as desired.

∎

4. Structures of vectors in ${\mathbb{F}}_{p}^{n}$ : a geometric approach

From now on, for simplicity we will assume that our random variables are iid Bernoulli (taking values $\pm 1$ with probability 1/2) and that $p\geq 3$ ; the general $\alpha$ -balanced case can be treated almost identically (see Remark 4.9).

Let ${\mathbf{w}}=(w_{1},\dots,w_{n})$ be a non-zero vector in ${\mathbb{F}}_{p}^{n}$ , where $p$ is a prime. We first cite a result of Erdős-Littlewood-Offord type from [23]

Theorem 4.1.

Let $c_{nsp}>0$ be a constant, and assume that

[TABLE]

Then

[TABLE]

where the implied constant depends on $c_{nsp}$ , and where $X=(x_{1},\dots,x_{n})$ and $x_{i}$ are iid Bernoulli.

In what follows, if not specified, we always assume our deterministic vector ${\mathbf{w}}$ to satisfy the non-sparsity property (25). We remark that this non-sparsity property passes to all other dilations $t{\mathbf{w}}$ of ${\mathbf{w}}$ in ${\mathbb{F}}_{p}^{n}$ for non-zero $t$ .

As mentioned in the introduction, our treatment in this section is motivated by the work of Rudelson and Vershynin (in characteristic zero) [32] and we hope to develop a “geometric” characterization of the steps of our random walk in ${\mathbb{F}}_{p}$ if the walk spreads out slowly. This task is not straightforward; as we will see, there are many simple concepts in characteristic zero for which it is hard to find natural (and equally useful) analogs in the finite field setting (for instance, the notion of compressible and incompressible vectors).

In some situations, if ${\mathbf{w}}=(w_{1},\dots,w_{n})$ is a vector in ${\mathbb{F}}_{p}^{n}$ , then by viewing ${\mathbb{F}}_{p}$ as the interval ${\mathbf{I}}_{p}=[-(p-1)/2,(p-1)/2]$ in ${\mathbb{Z}}$ , we will consider the components $w_{i}$ as integers from this interval. We then write ${\mathbf{w}}^{\prime}$ as the vector in ${\mathbb{R}}^{n}$

[TABLE]

Definition 4.2.

Let $0<\gamma<1$ and $\kappa$ be given. Let ${\mathbf{w}}^{\prime}=(w_{1}^{\prime},\dots,w_{n}^{\prime})\in{\mathbb{R}}^{n}$ be a non-zero vector in $\frac{1}{p}{\mathbb{Z}}^{n}$ where $\|{\mathbf{w}}^{\prime}\|_{\infty}\leq 1/2$ . We define $\mathbf{ULCD}_{\gamma,\kappa}({\mathbf{w}}^{\prime})$ to be the smallest positive integer $L$ such that

[TABLE]

where ${\operatorname{dist}}(L{\mathbf{w}}^{\prime},{\mathbb{Z}}^{n})$ denotes the smallest Euclidean distance from $L{\mathbf{w}}^{\prime}$ to an element of ${\mathbb{Z}}^{n}$ .

Throughout this paper, $\gamma<1$ is an absolute constant (such as $\gamma=1/8$ ), and $\kappa\leq n^{c}$ , for some positive constant $c\leq 1/2$ to be chosen.

This definition is in characteristic zero. Here we used the notion of $\mathbf{ULCD}$ (compared to the notion of $\mathbf{LCD}$ from [32]) to emphasize that $(w_{1}^{\prime},\dots,w_{n}^{\prime})$ is not normalized (i.e. its $\ell_{2}$ -norm might not be unit). Notice that

[TABLE]

Furthermore, if $\mathbf{ULCD}_{\gamma,\kappa}=p$ then by definition we would have for all $1\leq L\leq p-1$ that

[TABLE]

Remark 4.3.

Note that if for some $T>1$ we have $|w_{i}^{\prime}|\leq 1/2T$ for all $i$ , then

[TABLE]

This is because otherwise, $\|Lw_{i}^{\prime}\|_{{\mathbb{R}}/{\mathbb{Z}}}=|Lw_{i}^{\prime}|$ and hence $\sum_{i}\|Lw_{i}^{\prime}\|_{{\mathbb{R}}/{\mathbb{Z}}}^{2}=\sum_{i}\|Lw_{i}^{\prime}\|_{2}^{2}$ , which cannot be smaller than $\gamma^{2}\|L{\mathbf{w}}^{\prime}\|_{2}^{2}$ by definition as $\gamma<1$ .
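A brute-force computation of the $\mathbf{ULCD}$ may help to fix ideas. The sketch below assumes that the defining condition reads ${\operatorname{dist}}(L{\mathbf{w}}^{\prime},{\mathbb{Z}}^{n})<\min(\gamma\|L{\mathbf{w}}^{\prime}\|_{2},\kappa)$ , modelled on the LCD of [32]; the strict inequality and the exact form of the threshold are our assumptions, and the function name `ulcd` is hypothetical. The input ${\mathbf{w}}$ is a vector of integers with $|w_{i}|\leq p/2$ , and ${\mathbf{w}}^{\prime}={\mathbf{w}}/p$ .

```python
import math

def ulcd(w, p, gamma, kappa):
    """Smallest integer L >= 1 with dist(L w', Z^n) < min(gamma * ||L w'||_2, kappa), w' = w / p.

    The form of this condition is an assumption (see the lead-in).
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w)) / p
    for L in range(1, p + 1):
        # dist(L w', Z^n) computed from exact integer residues of L * w_i mod p
        dist = math.sqrt(sum(min((L * wi) % p, p - (L * wi) % p) ** 2 for wi in w)) / p
        if dist < min(gamma * L * norm_w, kappa):
            return L
    raise ValueError("w must be a non-zero vector")

# Small entries: |w_i'| <= 3/101 < 1/32, so no wrap-around occurs for L <= 16.
print(ulcd([1, 2, -1, 3], 101, 1 / 8, 2.0))
# Entries close to 1/2: doubling already brings the vector close to the integer lattice.
print(ulcd([50, -50, 50, 50], 101, 0.5, 10.0))
```

Under this reading the search always terminates with $L\leq p$ , since $p{\mathbf{w}}^{\prime}\in{\mathbb{Z}}^{n}$ , and small entries force a large value, in the spirit of Remark 4.3.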

Our result below says that if $\mathbf{ULCD}_{\gamma,\kappa}({\mathbf{w}}^{\prime})$ is large then the concentration probability is small. In our notation $t{\mathbf{w}}$ is another vector in ${\mathbb{F}}_{p}^{n}$ , which again can be viewed as a vector in ${\mathbf{I}}_{p}^{n}=[-(p-1)/2,(p-1)/2]^{n}$ . We then define $(t{\mathbf{w}})^{\prime}$ as $\frac{1}{p}t{\mathbf{w}}$ accordingly in this projection to characteristic zero.

Theorem 4.4 (Geometric structure, characterization II).

Let $p\geq 3$ be a prime, and let $C>0$ be an arbitrary constant. Let ${\mathbf{w}}=(w_{1},\dots,w_{n})$ be a non-zero vector in ${\mathbb{Z}}^{n}$ , where $|w_{i}|\leq p/2$ , and let ${\mathbf{w}}^{\prime}=\frac{1}{p}{\mathbf{w}}=(\frac{w_{1}}{p},\dots,\frac{w_{n}}{p})$ . Then

(1)

If there is no non-zero $t\in{\mathbb{F}}_{p}$ such that $\|(t{\mathbf{w}})^{\prime}\|_{2}<\kappa$ then we have

[TABLE]

(2)

Otherwise, assume that $1\leq\|{\mathbf{w}}^{\prime}\|_{2}\leq\kappa$ and ${\mathbf{w}}$ satisfies (25), with $\kappa\leq C\sqrt{n}$ and

[TABLE]

Then

[TABLE]

where the implied constants depend on $C,\gamma,c_{nsp}$ , and where $X=(x_{1},\dots,x_{n})$ and $x_{i}$ are iid Bernoulli in the concentration definition $\rho(w)$ .

Corollary 4.5.

Assume that there exists a quantity $\rho\gtrsim\exp(-\Theta(\kappa^{2}))$ such that $\rho({\mathbf{w}})\geq\rho$ . Then there exists a dilation $t{\mathbf{w}}$ of ${\mathbf{w}}$ , where $t\in{\mathbb{F}}_{p}$ is non-zero, so that with ${\mathbf{w}}^{\prime}=t{\mathbf{w}}$ we have $\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ and there exists $L=L({\mathbf{w}})\geq 1$ such that

[TABLE]

and

[TABLE]

We next deduce another elementary but useful result, which will be used later on in the applications.

Corollary 4.6.

Assume that ${\mathbf{w}}$ has at least $m$ non-zero coordinates, and $p<\sqrt{m}$ . We then have

[TABLE]

Proof.

As $t{\mathbf{w}}^{\prime}$ has at least $m$ non-zero coordinates for any non-zero $t$ , we have that

[TABLE]

and we are in scenario (1) of Theorem 4.4. ∎

We now present a proof of Theorem 4.4.

Proof.

(of Theorem 4.4) Write $e_{p}(x)=e^{2\pi ix/p}$ , then for any $r\in{\mathbb{F}}_{p}$ we have

[TABLE]

So

[TABLE]

where we used the fact that $|\sin\pi z|\geq 2\|z\|_{{\mathbb{R}}/{\mathbb{Z}}}$ for any $z\in{\mathbb{R}}$ , where $\|z\|_{{\mathbb{R}}/{\mathbb{Z}}}$ is the distance of $z$ to the nearest integer, and that

[TABLE]

From here, (1) follows as $\|tw_{l}^{\prime}\|_{{\mathbb{R}}/{\mathbb{Z}}}=\|(tw)_{l}^{\prime}\|_{{\mathbb{R}}/{\mathbb{Z}}}.$
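The Fourier-analytic starting point of the proof, namely ${\mathbf{P}}(X\cdot{\mathbf{w}}=r)=\frac{1}{p}\sum_{t\in{\mathbb{F}}_{p}}e_{p}(-rt)\prod_{i}\cos(2\pi tw_{i}/p)$ for iid Bernoulli $x_{i}$ , can be checked against brute-force enumeration for small $n$ . The function names and the test vector below are hypothetical.

```python
import cmath
import math
from itertools import product

def prob_bruteforce(w, p, r):
    """P(X.w = r mod p) by enumerating all 2^n sign patterns."""
    hits = sum(1 for signs in product((-1, 1), repeat=len(w))
               if sum(s * wi for s, wi in zip(signs, w)) % p == r % p)
    return hits / 2 ** len(w)

def prob_fourier(w, p, r):
    """The same probability via Fourier inversion on Z/pZ."""
    total = 0
    for t in range(p):
        character = cmath.exp(-2j * math.pi * r * t / p)
        total += character * math.prod(math.cos(2 * math.pi * t * wi / p) for wi in w)
    return (total / p).real

w, p = [1, 3, 4, 9, 2, 6], 11
for r in range(p):
    print(r, prob_bruteforce(w, p, r), round(prob_fourier(w, p, r), 12))
```

The term $t=0$ contributes exactly $1/p$ , so the deviation from uniformity is governed by the products of cosines over $t\neq 0$ .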

We are now in the assumption of (2). For each integer $m$ , let $T(m,p/2)$ be the (level) set of $t\in{\mathbb{F}}_{p}$ corresponding to $m$ ,

[TABLE]

By the non-sparsity of ${\mathbf{w}}$ , we can show that $T(c_{nsp}n/64,p/2)$ is not all of ${\mathbb{F}}_{p}$ (we can show this by using the fact that if $w_{i}\neq 0$ then $\sum_{t\in{\mathbb{F}}_{p}}\|\frac{tw_{i}}{p}\|_{{\mathbb{R}}/{\mathbb{Z}}}=\sum_{t\in{\mathbb{F}}_{p}}\|\frac{t}{p}\|_{{\mathbb{R}}/{\mathbb{Z}}}$ ). Thus

[TABLE]

Our next claim shows that the level sets consist of well separated intervals.

Claim 4.7 (spacing of the level sets).

Assume that $m<\kappa^{2}/4<c_{nsp}n/64$ and $s_{1}<s_{2}\in T(m,p/2)$ and

[TABLE]

Then

[TABLE]

Consequently we have

[TABLE]

Proof.

(of Claim 4.7) Assume that $t_{1},t_{2}\in T(m,p/2)$ , then by the triangle inequality,

[TABLE]

Thus

[TABLE]

where in the last estimate we used $|t_{2}-t_{1}|>\gamma^{-1}\kappa/\|{\mathbf{w}}^{\prime}\|_{2}$ . Thus by the definition of $\mathbf{ULCD}_{\gamma,\kappa}$ we must have

[TABLE]

∎

Next, we will need a Cauchy-Davenport-type bound on the size of sumsets in ${\mathbb{F}}_{p}$ . Observe from the Cauchy-Schwarz inequality that $k(\|x_{1}\|_{{\mathbb{R}}/{\mathbb{Z}}}^{2}+\dots+\|x_{k}\|_{{\mathbb{R}}/{\mathbb{Z}}}^{2})\geq\|x_{1}+\dots+x_{k}\|_{{\mathbb{R}}/{\mathbb{Z}}}^{2}$ , and so

[TABLE]

where we view these sets as subsets of ${\mathbb{F}}_{p}$ . Hence, by Cauchy-Davenport’s inequality in ${\mathbb{F}}_{p}$ ([37]) we have that

[TABLE]
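For the reader's convenience, the Cauchy-Davenport inequality $|A+B|\geq\min\{p,|A|+|B|-1\}$ for non-empty $A,B\subset{\mathbb{F}}_{p}$ can be tested on random sets. The sketch below is a numerical check only; the prime, the seed and the number of trials are arbitrary choices of ours.

```python
import random

def sumset(A, B, p):
    """A + B in Z/pZ."""
    return {(a + b) % p for a in A for b in B}

def cauchy_davenport_holds(A, B, p):
    """Check |A + B| >= min(p, |A| + |B| - 1) for non-empty A, B and prime p."""
    return len(sumset(A, B, p)) >= min(p, len(A) + len(B) - 1)

p = 101
rng = random.Random(0)
trials = [(rng.sample(range(p), rng.randint(1, 60)),
           rng.sample(range(p), rng.randint(1, 60))) for _ in range(200)]
print(all(cauchy_davenport_holds(A, B, p) for A, B in trials))
```

Equality holds, for instance, when $A$ and $B$ are short arithmetic progressions with the same common difference.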

Thus for all $m\leq\min\{c_{nsp}\kappa^{2}/64,c_{nsp}n/64\}$ , by choosing $k=\lfloor\sqrt{\min\{c_{nsp}\kappa^{2}/64,c_{nsp}n/64\}/m}\rfloor$ we have

[TABLE]

where we used $|T(c_{nsp}n/64,p/2)|-1<p$ .

We deduce

[TABLE]

completing the proof of the theorem. ∎

Remark 4.8.

Under the assumption of (2) of Theorem 4.4, we have actually shown a stronger estimate that

[TABLE]

We note that Theorem 4.1 can be deduced from Theorem 4.4 by setting $\kappa=C\sqrt{n}$ with sufficiently large $C$ ; for this there is a dilation ${\mathbf{w}}=t{\mathbf{w}}$ with $\|{\mathbf{w}}^{\prime}\|_{2}$ of order $\sqrt{n}$ but $\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ . We then just apply (2) of Theorem 4.4, noting that $L\geq 1$ .

Remark 4.9.

When $x_{i}$ are iid copies of an $\alpha$ -balanced random variable, then by Equation (12), by convexity, and by the fact that $1-\alpha/2\leq\sum_{j=1}^{l}\beta_{j}\leq 1$ we have

[TABLE]

It thus boils down to studying bounds for the concentration probability of ${\mathbf{w}}$ , which we have done in the proof of Theorem 4.4.

4.10. Some properties of ULCD

Roughly speaking, our next result is similar to Theorem 4.4, but instead of working with the concentration event $X\cdot{\mathbf{w}}=r$ we are working with a coarser event that $X\cdot{\mathbf{w}}$ belongs to an arc in ${\mathbb{F}}_{p}$ . We find it more convenient to write in mod 1 as follows.

Theorem 4.11 (anti-concentration modulo one).

Assume that $0<a_{1},\dots,a_{n}<1$ . Assume that

[TABLE]

Then for any

[TABLE]

we have

[TABLE]

where $\xi_{i}$ are iid Bernoulli.

Note that we need this result because at some point we need to pass to characteristic zero, and take distance to ${\mathbb{Z}}$ . A key difference of this bound compared to the classical small ball estimate (say studied in [32]) is that we are looking at the balls modulo one, rather than with respect to the whole real line.

Proof.

(of Theorem 4.11) Let $\mu$ be the distribution of $\sum_{i}a_{i}\xi_{i}$ modulo one, where we can write $\mu=\mu_{1}\ast\dots\ast\mu_{n}$ , and where $\mu_{i}(a_{i})=\mu_{i}(-a_{i})=1/2$ . Let $L_{0}=\lfloor 1/\varepsilon\rfloor$ . We use the Erdős-Turán inequality,

[TABLE]

As $\xi_{i}$ are iid Bernoulli, bounding the cosine as in the proof of Theorem 4.4 we have

[TABLE]

Now by definition of $\mathbf{ULCD}_{\gamma,\kappa}$ , as $L_{0}\leq 1/\varepsilon<\mathbf{ULCD}_{\gamma,\kappa}((a_{1},\dots,a_{n}))/2$ , for any $k\leq L_{0}$ we have

[TABLE]

and so

[TABLE]

Summing over all $k\leq L_{0}$ we have

[TABLE]

as desired. ∎

We remark that the bound above depends on $\|{\mathbf{a}}\|_{2}$ , and becomes almost meaningless if $\|{\mathbf{a}}\|_{2}$ is small, say of order $O(1)$ . To avoid this situation, we will need to consider vectors ${\mathbf{w}}^{\prime}$ that have large size and large $\mathbf{ULCD}_{\gamma,\kappa}({\mathbf{w}}^{\prime})$ at the same time.

Our next result roughly says that a non-sparse vector cannot have very small $\mathbf{ULCD}$ , at least with respect to ${\mathbb{F}}_{p}$ with not too large and not too small $p$ . To be more precise, we have the following.

Remark 4.12.

As we will be working with vectors ${\mathbf{w}}$ satisfying (25), we easily see that for any $t\neq 0$ in ${\mathbb{F}}_{p}$

[TABLE]

Notice that this quantity is larger than $\kappa^{2}$ if $p\leq n^{1/2}/\kappa$ , and in this case the first part of Theorem 4.4 holds, and hence automatically

[TABLE]

As such, in what follows we will be working with

[TABLE]

Lemma 4.13 (LCD and size in fields of small order).

Assume that $\kappa=n^{c}$ for a positive constant $c<1/16$ . Assume that $p$ is a prime smaller than $\exp(c\kappa^{2})$ , and ${\mathbf{w}}\in{\mathbb{F}}_{p}^{n}$ is a vector satisfying (25) and such that $\rho({\mathbf{w}})\geq 2\exp(-\kappa^{2}/2)$ . Then there exists $t\in{\mathbb{F}}_{0}={\mathbb{F}}_{p}\setminus\{0\}$ so that with ${\mathbf{w}}=t{\mathbf{w}}$ the norm $\|{\mathbf{w}}^{\prime}\|_{2}$ has order $\kappa$ and either $\mathbf{ULCD}_{\gamma,\kappa}({\mathbf{w}}^{\prime})=p$ (in which case we can apply (26)) or else

[TABLE]

We remark that this result is perhaps the most important one in our treatment, as it allows us to assume that the $\mathbf{ULCD}$ is sufficiently large to make sense of the bounds. In characteristic zero, this bound is straightforward if the vector is incompressible (being far from sparse vectors).

Before proving this lemma, we first need the following simple statement.

Claim 4.14.

Assume that ${\mathbf{w}}\in{\mathbb{F}}_{p}^{n}$ is a non-zero vector satisfying (25) and such that $\rho({\mathbf{w}})\geq 2\exp(-\kappa^{2}/2)$ , with $\kappa=o(\sqrt{n})$ . Then there exists $t\in{\mathbb{F}}_{0}={\mathbb{F}}_{p}\setminus 0$ so that with ${\mathbf{w}}=t{\mathbf{w}}$ we have

[TABLE]

Proof.

(of Claim 4.14) As ${\mathbf{w}}$ satisfies (25) and $\rho({\mathbf{w}})\geq 2\exp(-\kappa^{2}/2)$ , (1) of Theorem 4.4 does not apply, and so there is a fiber ${\mathbf{w}}=t{\mathbf{w}}$ such that $\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ . If $\|{\mathbf{w}}^{\prime}\|_{2}\geq\kappa/2$ then we would be done. Otherwise we just consider the sequence ${\mathbf{w}},2{\mathbf{w}},3{\mathbf{w}}$ , etc. By the triangle inequality (where we recall that $(t{\mathbf{w}})^{\prime}=\frac{1}{p}(tw_{1}(\ \operatorname{mod}\ p),\dots,tw_{n}(\ \operatorname{mod}\ p))$ ) we have

[TABLE]

On the other hand, by (25) $\sum_{k\in{\mathbb{F}}_{p}}\|(k{\mathbf{w}})^{\prime}\|_{2}$ has order $\sqrt{n}p$ , so there must exist a smallest $k_{0}\geq 2$ such that $\|((k_{0}+1){\mathbf{w}})^{\prime}\|_{2}\geq\kappa$ . It then follows that $\kappa/2\leq\|(k_{0}{\mathbf{w}})^{\prime}\|_{2}<\kappa$ . ∎
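The rescaling argument of Claim 4.14 can be traced numerically: starting from a vector of small norm, the norms $\|(k{\mathbf{w}})^{\prime}\|_{2}$ grow by at most $\|{\mathbf{w}}^{\prime}\|_{2}$ at each step, so they cannot jump over the window $[\kappa/2,\kappa)$ . The sketch below uses a hypothetical vector with small entries; the function names are ours.

```python
import math

def dilated_norm(w, t, p):
    """||(t w)'||_2: norm of (t w mod p) / p with residues centered in [-(p-1)/2, (p-1)/2]."""
    return math.sqrt(sum(min((t * wi) % p, p - (t * wi) % p) ** 2 for wi in w)) / p

def rescale_into_window(w, p, kappa):
    """Smallest k >= 1 with kappa/2 <= ||(k w)'||_2 < kappa, or None if there is none."""
    for k in range(1, p):
        if kappa / 2 <= dilated_norm(w, k, p) < kappa:
            return k
    return None

p, kappa = 10007, 2.0
w = [((i * i) % 7) - 3 for i in range(200)]   # non-zero entries in [-3, 1]
k0 = rescale_into_window(w, p, kappa)
print(k0, dilated_norm(w, 1, p), dilated_norm(w, k0, p))
```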

Proof.

(of Lemma 4.13) Assume that we are not in the first case, and also assume to the contrary that we are not in the second case either. We will iterate the following process, which will then result in a contradiction. Set

[TABLE]

We start from any ${\mathbf{u}}_{1}={\mathbf{w}}^{\prime}=(w_{1}/p,\dots,w_{n}/p)$ in the fiber $t{\mathbf{w}}$ of ${\mathbf{w}}$ with $\kappa/2\leq\|{\mathbf{u}}_{1}\|_{2}<\kappa$ .

Step 1: Let $D_{1}=\mathbf{ULCD}({\mathbf{u}}_{1})$ , then $2\leq D_{1}\leq\min\{\kappa^{1+\beta},p-1\}$ . Let ${\mathbf{u}}_{1}^{\prime}=D_{1}{\mathbf{u}}_{1}(=(D_{1}{\mathbf{w}})^{\prime})$ , then we have

[TABLE]

Step 2: If this vector has norm smaller than $\kappa/2$ , then we use Claim 4.14 to dilate appropriately by $C_{1}\geq 2$ so that $\kappa/2\leq\|C_{1}{\mathbf{u}}_{1}^{\prime}\|_{2}\leq\kappa$ , and set

[TABLE]

We then return to Step 1 and iterate the process. Note that while the $D_{i}$ are bounded by $\kappa^{1+\beta}$ , we don’t have such a bound for the $C_{i}$ .

Now for each $1\leq t\leq p-1$ we can always write

[TABLE]

where $r_{1}<D_{1},s_{1}<C_{1},r_{2}<D_{2},s_{2}<C_{2},\dots$ . Indeed, to verify this we first divide $t$ by $D_{1}$ and get a remainder $r_{1}$ ; we then divide the quotient by $C_{1}$ to get a remainder $s_{1}$ , and then divide the new quotient by $D_{2}$ , and so on until the last step. Now as $t\leq p-1\leq 2^{\kappa^{2}}$ (this is where we require $p$ to be small), and as $C_{i},D_{i}\geq 2$ , we must stop the division process after $\kappa^{2}$ steps.
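The successive divisions described above amount to a mixed-radix expansion of $t$ with radices $D_{1},C_{1},D_{2},C_{2},\dots$ . A minimal sketch follows; the radices in the example are hypothetical values, chosen only so that each is at least $2$ .

```python
def mixed_radix_digits(t, radices):
    """Divide t successively by the radices; return the remainders and the final quotient."""
    digits = []
    for base in radices:
        t, r = divmod(t, base)
        digits.append(r)
    return digits, t

def reconstruct(digits, radices, last_quotient):
    """Invert mixed_radix_digits: t = r_1 + D_1 (s_1 + C_1 (r_2 + D_2 (...)))."""
    value = last_quotient
    for r, base in zip(reversed(digits), reversed(radices)):
        value = value * base + r
    return value

# Radices listed in the order D_1, C_1, D_2, C_2, ... (hypothetical values).
radices = [7, 3, 5, 2, 9, 4, 6, 2]
digits, rest = mixed_radix_digits(123456, radices)
print(digits, rest, reconstruct(digits, radices, rest))
```

Since every radix is at least $2$ , the quotient at least halves at each step, which is why the expansion of $t\leq 2^{\kappa^{2}}$ has at most $\kappa^{2}$ non-trivial steps.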

Next we analyze the norm $\|t{\mathbf{u}}_{1}\|_{2}=\|(t{\mathbf{w}})^{\prime}\|_{2}$ . We write, with $t_{1}=t$

[TABLE]

Thus by the triangle inequality, and as $r_{1}\leq D_{1}-1<\kappa^{1+\beta}$ we have

[TABLE]

We next consider

[TABLE]

By the triangle inequality

[TABLE]

where in the last estimate we used the fact that $0\leq s_{1}\leq C_{1}-1$ and $C_{1}$ is the largest integer so that $\kappa/2\leq\|C_{1}{\mathbf{u}}_{1}^{\prime}\|_{2}\leq\kappa$ (where we recall that $\|{\mathbf{u}}_{1}^{\prime}\|_{2}\leq\kappa$ , and the role of $C_{1}$ was only to dilate this vector if its norm was much smaller than this, as in the proof of Claim 4.14). The analysis for $\|t_{2}{\mathbf{u}}_{2}\|_{2}$ and other terms can be done similarly.

Adding all the bounds, we hence obtain

[TABLE]

Now as this is true for all $t\in{\mathbb{F}}_{p}$ , we thus have

[TABLE]

On the other hand, by (25), as $w^{\prime}$ has at least $c_{nsp}n$ non-zero entries, the left hand side can be shown to be at least $c_{nsp}np/64$ , which is a contradiction if $\kappa\leq n^{c}=n^{1/4-\beta}$ . ∎

With the same proof, we record the following corollary which will be used later.

Corollary 4.15 (ULCD cannot be small).

Assume that $p\leq\exp(c\kappa^{2})$ and that $\kappa=n^{c}$ for $c<1/16$ . Assume that ${\mathbf{w}}\in{\mathbb{F}}_{p}^{m}$ , for $m\geq\kappa^{4+2\varepsilon}$ , and ${\mathbf{w}}$ has at least $\kappa^{4+2(1/4-c)}$ non-zero components. Then either $\|(t{\mathbf{w}})^{\prime}\|_{2}>\kappa$ for all $t$ , or there exists ${\mathbf{w}}^{\prime}=(t{\mathbf{w}})^{\prime}$ such that $\kappa/2\leq\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ and that

[TABLE]

5. Structures of vectors in ${\mathbb{F}}_{p}^{n}$ : a combinatorial approach

Now we present our third characterization. Let $\mu$ be an $\alpha$ -balanced distribution in ${\mathbb{F}}_{p}$ . For simplicity, we again assume $\mu$ to be Bernoulli $\pm 1$ ; the general $\alpha$ -balanced case can be treated almost identically as in the previous two sections. Our goal here is the following.

Theorem 5.1 (Combinatorial structure, characterization III).

Let $k\geq 1$ be an integer. Let $f:{\mathbb{Z}}^{+}\to{\mathbb{Z}}^{+}$ be any function such that $f(x)\leq x/100$ . For any non-zero vector ${\mathbf{w}}=(w_{1},\cdots,w_{n})\in{\mathbb{F}}_{p}^{n}$ we have

[TABLE]

where $\mu_{1},\cdots,\mu_{n}$ are independent and identically distributed copies of $\mu$ , and where $R_{k}({\mathbf{w}})$ is the number of solutions to $\pm w_{i_{1}}\pm\dots\pm w_{i_{2k}}=0(\ \operatorname{mod}\ p)$ , and $k\leq n/f(|\operatorname{supp}({\mathbf{w}})|)$ .
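The quantity $R_{k}({\mathbf{w}})$ can be computed either by enumeration or through the Fourier identity $R_{k}({\mathbf{w}})=\frac{1}{p}\sum_{t\in{\mathbb{F}}_{p}}\big(2\sum_{j}\cos(2\pi tw_{j}/p)\big)^{2k}$ . The sketch below assumes that solutions are counted as ordered index tuples $(i_{1},\dots,i_{2k})$ together with a choice of signs (this counting convention is our assumption); the function names and the test vector are hypothetical.

```python
import math
from itertools import product

def R_bruteforce(w, p, k):
    """Number of (indices, signs) with +-w_{i_1} +- ... +- w_{i_2k} = 0 mod p."""
    count = 0
    for idx in product(range(len(w)), repeat=2 * k):
        for signs in product((-1, 1), repeat=2 * k):
            if sum(s * w[i] for s, i in zip(signs, idx)) % p == 0:
                count += 1
    return count

def R_fourier(w, p, k):
    """The same count via orthogonality of characters on Z/pZ."""
    total = sum((2 * sum(math.cos(2 * math.pi * t * wi / p) for wi in w)) ** (2 * k)
                for t in range(p))
    return round(total / p)

w, p = [1, 2, 5], 7
print(R_bruteforce(w, p, 1), R_fourier(w, p, 1))
print(R_bruteforce(w, p, 2), R_fourier(w, p, 2))
```

The Fourier form makes the link with the cosine sums appearing in the proof below explicit: large values of $|\sum_{j}\cos(2\pi tw_{j}/p)|$ are exactly what makes $R_{k}({\mathbf{w}})$ large.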

Our approach here is somewhat similar to [8], which in turn follows the original approach of Halász in [18]. However, the key difference here is that we are estimating the deviation of $\sup_{a\in{\mathbb{Z}}/p{\mathbb{Z}}}{\mathbf{P}}(\mu_{1}w_{1}+\cdots+\mu_{n}w_{n}=a)$ from $1/p$ rather than giving an upper bound for $\sup_{a\in{\mathbb{Z}}/p{\mathbb{Z}}}{\mathbf{P}}(\mu_{1}w_{1}+\cdots+\mu_{n}w_{n}=a)$ as in [8].

Proof.

(of Theorem 5.1) We follow the proof of Theorem 4.4 until Equation (27) that

[TABLE]

where $k=\Theta(\sqrt{f(|\operatorname{supp}({\mathbf{w}})|)/m})$ .

Denote $T^{\prime}:=\{l\in{\mathbb{F}}_{p},|\sum_{j=1}^{n}\cos(2\pi lw_{j}/p)|\geq n-100f(|\operatorname{supp}({\mathbf{w}})|)\}$ . Then we see that

[TABLE]

By Markov’s inequality,

[TABLE]

By expanding out the RHS and summing over $l\in{\mathbb{F}}_{p}$ instead, we can bound the RHS from above by

[TABLE]

The rest can be completed as in Theorem 4.4:

[TABLE]

as claimed. ∎

6. Non-structures of normal vectors

In this section we use the three characterizations above to establish Theorem 1.5. First, it is easy to show that normal vectors are non-sparse with high probability.

Lemma 6.1.

There exists an absolute constant $c_{nsp}>0$ such that with probability $1-\exp(-\Theta(n))$ any normal vector $w$ of $span(X_{1},\dots,X_{n-1})$ satisfies (25).

Proof.

This follows from Odlyzko’s lemma, see for instance [30, 27]. ∎

Now we use the results from Sections 3, 4, and 5 to show that the normal vectors cannot have any structure. In our first proposition, we use the structure from Section 3.

Proposition 6.2 (Normal vectors cannot have additive structures).

Let $C>0$ . Let $X_{1},\dots,X_{n-d}$ be the first $n-d$ columns of a matrix $M$ whose entries are iid copies of an $\alpha$ -balanced random variable, where $d\leq cn$ for some sufficiently small constant $c$ . Let ${\mathbf{w}}$ be any non-zero vector that is orthogonal to $X_{1},\dots,X_{n-d}$ . Then with probability at least $1-\exp(-\Theta(n))$ , the vector ${\mathbf{w}}$ cannot have structure as in the conclusion of Theorem 3.1. In particular, we have

[TABLE]

where the implied constant depends on $C$ .

Proof.

(of Proposition 6.2) First of all, from Lemma 6.1, with a loss of $\exp(-\Theta(n))$ in probability we can assume that ${\mathbf{w}}$ is not sparse. Assume that

[TABLE]

where $C\geq 1/2$ . Also, by Corollary 4.6, it suffices to assume $p\gtrsim n^{1/2}.$

For convenience, let $p^{\prime}=\min\{p,\rho^{-1}\}$ , and so $p^{\prime}\gtrsim n^{1/2}$ . Then by Theorem 3.1, we have a generalized arithmetic progression $P$ of rank $O(1)$ in ${\mathbb{F}}_{p}$ and of size $O(1+\min\{\rho^{-1}/n^{\varepsilon},p/n^{\varepsilon}\})=O(p^{\prime}/n^{\varepsilon})$ that contains all but $n^{2\varepsilon}$ entries of ${\mathbf{w}}$ . Note that the number of ways to choose such a $P$ is bounded by

[TABLE]

Given $P$ , the number of vectors ${\mathbf{w}}$ whose $n-n^{2\varepsilon}$ components are from $P$ is at most

[TABLE]

provided that $p\leq\exp(n^{1-2\varepsilon})$ .

Given ${\mathbf{w}}$ for which $\rho({\mathbf{w}})\geq\rho$ , the probability that ${\mathbf{w}}$ is orthogonal to $X_{1},\dots,X_{n-d}$ is bounded by

[TABLE]

provided that $d\leq cn$ for some small positive constant $c$ .

Taking a union bound over only $p^{O(1)}\rho^{-O(1)}$ choices of $P$ , we obtain the claim. ∎

We next use the result from Section 4 to show that the random normal vector does not have small $\mathbf{ULCD}_{\gamma,\kappa}$ .

Proposition 6.3 (Normal vectors cannot have small ULCD).

Assume that $p\leq\exp(c\kappa^{2})$ and that $\kappa=n^{c}$ for $c<1/16$ . Let $X_{1},\dots,X_{n-d}$ be the first $n-d$ columns of a random $(-1,1)$ Bernoulli matrix, where $d\leq n^{c}$ . Let ${\mathbf{w}}$ be any non-zero vector that is orthogonal to $X_{1},\dots,X_{n-d}$ . Then with probability at least $1-\exp(-\Theta(n))$ , we have

[TABLE]

with some $c^{\prime}$ depending on $c$ and $\gamma$ . In particular, Theorem 1.5 holds.

We remark that in the above proposition we assume $M$ to be a $(-1,1)$ Bernoulli matrix. Our treatment also works for other integral matrices with $\|M\|_{2}=O(\sqrt{n})$ (here the random entries of $M$ take values in ${\mathbb{Z}}$ , although in our results we view $M$ as a matrix with entries from ${\mathbb{Z}}/p{\mathbb{Z}}$ (or ${\mathbb{F}}_{p}$ ) via the natural map ${\mathbb{Z}}\to{\mathbb{Z}}/p{\mathbb{Z}}$ ), but it does not seem to extend to the $\alpha$ -balanced ensemble as in Proposition 6.2 (although Theorem 4.4 holds in this setting). The main reason is that at some point in the proof we pass to a net of vectors in ${\mathbb{R}}^{n}$ , and under the action of $M$ the size of this net will blow up if $M$ has large norm, see (28).

Proof.

(of Proposition 6.3) We will show that with high probability, there does not exist ${\mathbf{w}}$ in the fiber of $t{\mathbf{w}}$ such that $\kappa/2\leq\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ and that

[TABLE]

To do this, we divide this range into $O(\kappa^{2})$ dyadic intervals $(D_{i},D_{i+1}=2D_{i})$ . For $D=D_{i}$ , let

[TABLE]

Lemma 6.4 (Size of the approximating net).

Let $c_{0}>0$ be given sufficiently small compared to $c$ (where $\kappa=n^{c}$ ). Then $S_{D}$ accepts an $O(\kappa/D)$ -net ${\mathcal{N}}$ of size $D(C\kappa D/\sqrt{n})^{n}$ if $\kappa D\geq c_{0}\sqrt{n}$ and of size $Dn^{c_{0}n}$ if $\kappa D<c_{0}\sqrt{n}$ , and such that ${\mathcal{N}}\subset S_{D}$ .

Before proving this result by following [32], let us introduce a fact that will be useful for our nets.

Fact 6.5.

Assume that ${\mathcal{S}}$ accepts a $\delta$ -net ${\mathcal{U}}$ . Then ${\mathcal{S}}$ also accepts a $2\delta$ -net ${\mathcal{U}}^{\prime}$ such that ${\mathcal{U}}^{\prime}\subset{\mathcal{S}}$ and which has size at most $|{\mathcal{U}}|$ .

Proof.

By throwing away vectors from ${\mathcal{U}}$ if needed, we assume that each $u\in{\mathcal{U}}$ $\delta$ -approximates at least one vector $s^{\prime}$ from ${\mathcal{S}}$ . Let ${\mathcal{U}}^{\prime}$ be a collection of such $s^{\prime}$ (for each $u$ we choose an arbitrary $s^{\prime}$ from ${\mathcal{S}}$ that is $\delta$ -approximated by $u$ ). Thus ${\mathcal{U}}^{\prime}\subset{\mathcal{S}}\mbox{ and }|{\mathcal{U}}^{\prime}|\leq|{\mathcal{U}}|$ . Now for any $s\in{\mathcal{S}}$ , there exists $u\in{\mathcal{U}}$ such that $\|u-s\|_{2}\leq\delta$ , and by definition there also exists $s^{\prime}\in{\mathcal{U}}^{\prime}$ such that $\|u-s^{\prime}\|_{2}\leq\delta$ . Thus we have $\|s-s^{\prime}\|_{2}\leq 2\delta$ , so ${\mathcal{U}}^{\prime}$ is a $2\delta$ -net of ${\mathcal{S}}$ . ∎
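The construction in the proof of Fact 6.5 is simple enough to carry out explicitly. In the sketch below, points are tuples of floats and the example sets (a grid in the unit square and a coarser family of centers) are hypothetical choices of ours.

```python
import math

def is_net(S, N, radius):
    """True if every point of S is within the given radius of some point of N."""
    return all(any(math.dist(s, u) <= radius for u in N) for s in S)

def inner_net(S, U, delta):
    """From a delta-net U of S (whose points need not lie in S), build a 2*delta-net inside S."""
    chosen = []
    for u in U:
        # keep one point of S approximated by u; if there is none, u is thrown away
        for s in S:
            if math.dist(u, s) <= delta:
                chosen.append(s)
                break
    return chosen

S = [(i / 10, j / 10) for i in range(11) for j in range(11)]
U = [(0.125 + 0.25 * a, 0.125 + 0.25 * b) for a in range(4) for b in range(4)]
delta = 0.18
U_prime = inner_net(S, U, delta)
print(is_net(S, U, delta), len(U_prime), is_net(S, U_prime, 2 * delta + 1e-9))
```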

Proof of Lemma 6.4.

By taking a union bound over a small number of choices (at most $O(\kappa\times(D/\kappa))=O(D)$ choices) we assume that for some $T\in\kappa/D\cdot{\mathbb{Z}}$ we have

[TABLE]

By definition, as $\|L{\mathbf{w}}^{\prime}\|_{{\mathbb{R}}/{\mathbb{Z}}}\leq\kappa$ and $D\leq L\leq 2D$ , there exists ${\mathbf{p}}\in{\mathbb{Z}}^{n}$ such that

[TABLE]

This implies that

[TABLE]

and hence

[TABLE]

Thus

[TABLE]

Now as $\|{\mathbf{w}}^{\prime}\|_{2}<T+2\kappa/L$ , we also have $\|{\mathbf{p}}/L\|_{2}\leq T+3\kappa/L$ and so

[TABLE]

Let ${\mathcal{N}}$ be the collection of vectors $T\frac{{\mathbf{p}}}{\|{\mathbf{p}}\|_{2}}$ , where $T$ ranges over $O(D)$ choices in the set $\kappa/D\cdot{\mathbb{Z}}$ , and ${\mathbf{p}}$ ranges over all integer vectors in ${\mathbb{Z}}^{n}$ satisfying $\|{\mathbf{p}}\|_{2}\leq 4D\kappa$ .

Now we bound the size of ${\mathcal{N}}$ based on the magnitude of $\kappa D$ .

Case 1. If $\kappa D\geq c_{0}\sqrt{n}$ , then the number of integral vectors ${\mathbf{p}}$ of norm at most $3\kappa D$ is known to be bounded by $(C\kappa D/\sqrt{n})^{n}$ , and so

[TABLE]

Case 2. If $\kappa D\leq c_{0}\sqrt{n}$ , where $c_{0}$ is sufficiently small, then all but $O((\kappa D)^{2})$ entries of ${\mathbf{p}}$ are zero. So the number of such vectors ${\mathbf{p}}$ is bounded by $\binom{n}{(\kappa D)^{2}}(O(1))^{(\kappa D)^{2}}$ , and so

[TABLE]

Finally, we can always assume ${\mathcal{N}}$ to consist of vectors from $S_{D}$ by using Fact 6.5. ∎
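The two counting regimes in the proof above can be explored numerically in small dimensions. The sketch below is an illustration only: the function name `count_integer_points` and the chosen parameters are assumptions, and the comparison bound in the usage note is a crude support-counting bound in the spirit of Case 2, not the exact estimate of the lemma.

```python
import math

def count_integer_points(n, radius_sq):
    """Number of p in Z^n with ||p||_2^2 <= radius_sq.

    Dynamic programming over coordinates: counts[s] is the number of integer
    vectors built so far whose squared norm equals s.
    """
    counts = [0] * (radius_sq + 1)
    counts[0] = 1  # the empty vector has squared norm 0
    for _ in range(n):
        new = [0] * (radius_sq + 1)
        for s, c in enumerate(counts):
            if c == 0:
                continue
            x = 0
            while s + x * x <= radius_sq:
                # x = 0 contributes once, x != 0 contributes for both signs
                new[s + x * x] += c if x == 0 else 2 * c
                x += 1
        counts = new
    return sum(counts)

# Small-radius regime: with squared radius 3 in dimension 10, at most 3 entries are non-zero.
n, radius_sq = 10, 3
exact = count_integer_points(n, radius_sq)
support_bound = math.comb(n, radius_sq) * (2 * math.isqrt(radius_sq) + 1) ** radius_sq
```

Here `support_bound` first chooses a set of `radius_sq` coordinates containing the support and then an entry of absolute value at most `isqrt(radius_sq)` for each of them, mirroring the $\binom{n}{(\kappa D)^{2}}(O(1))^{(\kappa D)^{2}}$ count of Case 2.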

Now we use the obtained net to show that normal vectors in iid matrices cannot have small $\mathbf{ULCD}$ .

In brief, the method below works as follows: for ${\mathbf{w}}^{\prime}=(w_{1}/p,\dots,w_{n}/p)$ (viewed as a vector in ${\mathbb{Q}}^{n}$ ) we have $M{\mathbf{w}}^{\prime}\in{\mathbb{Z}}^{n}$ . We approximate this vector by an element ${\mathbf{u}}^{\prime}$ of the obtained net, and then bound the probability for each net element separately. After approximation, $M{\mathbf{u}}^{\prime}$ is close to ${\mathbb{Z}}^{n}$ in $\ell_{2}$ -norm, and so we can apply the classical Erdős-Turán bound.

Now we complete the proof of the proposition. Assume otherwise, then by the argument above, by passing to an appropriate $t{\mathbf{w}}$ , we can assume that $\kappa/2\leq\|{\mathbf{w}}^{\prime}\|_{2}<\kappa$ , and that ${\mathbf{w}}^{\prime}\in S_{D}$ for some $D_{i}$ from $O(\kappa^{2})$ dyadic intervals. As ${\mathbf{w}}$ is orthogonal to $X_{1},\dots,X_{n-1}$ in ${\mathbb{F}}_{p}$ , we then have the following key property for ${\mathbf{w}}^{\prime}=\frac{1}{p}{\mathbf{w}}$

[TABLE]

where $M$ is the $n\times(n-d)$ matrix formed by $X_{1},\dots,X_{n-1}$ .

By Lemma 6.4, there exists ${\mathbf{u}}^{\prime}\in{\mathcal{N}}$ such that

[TABLE]

It is well known that $\|M\|=O(\sqrt{n})$ with probability at least $1-\exp(-\Theta(n))$ , and so we will condition on this event. (We note that this is the only place where we use $\|M\|=O(\sqrt{n})$ , to prevent the net from expanding.) We then have

[TABLE]

Therefore,

[TABLE]

Let ${\mathcal{E}}$ be this event, whose probability will be bounded shortly. By Theorem 4.11, as obviously $\kappa/D>1/D$ , we have

[TABLE]

where in the last estimate we used the fact that $D\leq\exp(c^{\prime}\kappa^{2})$ with sufficiently small $c^{\prime}$ .

By Lemma A.1 we thus have for some absolute positive constant $C^{\prime}$

[TABLE]

Putting together using union bound over all ${\mathbf{u}}^{\prime}$ from the net, as $\kappa=n^{c}$ , we obtain in the case $\kappa D\geq c_{0}\sqrt{n}$ a bound

[TABLE]

Note that here we have to assume $\kappa=o(n^{1/4})$ at least.

Also, in the second case that $\kappa D<c_{0}\sqrt{n}$ , noting that $D\geq k^{5/4-c}$

[TABLE]

assuming that $c_{0}$ is sufficiently large compared to $c$ , and that $c\leq 1/16$ . ∎

In our last result of this subsection, by using the terminology of Section 5, we show the following.

Proposition 6.6 (Normal vectors cannot have combinatorial structure).

Assume that $p\leq\exp(c\kappa^{2})$ and that $\kappa=n^{c}$ for $c<1/16$ . Let $X_{1},\dots,X_{n-d}$ be the first $n-d$ columns of a matrix $M$ whose entries are iid copies of an $\alpha$ -balanced random variable, where $d\leq c^{\prime\prime}n$ for some sufficiently small constant $c^{\prime\prime}$ . Let ${\mathbf{w}}$ be any non-zero vector that is orthogonal to $X_{1},\dots,X_{n-d}$ . Then with probability at least $1-\exp(-\bar{c}n)$ , we have that

[TABLE]

where $c^{\prime\prime},\bar{c}$ and $\hat{c}$ are constants that only depend on $c$ and $\alpha$ .

Note that this result holds for $\alpha$ -balanced ensembles where we don’t have to assume $\|M\|_{2}=O(\sqrt{n})$ .

Let $Z=(z_{1},\dots,z_{n})$ be any vector in ${\mathbb{F}}_{p}^{n}$ . We first record the following elementary relation (where we recall $\rho(.)$ from Theorems 3.1 and 4.1).

Fact 6.7.

For any $I\subset[n]$ we have

[TABLE]

Proof.

It suffices to show this for $I=[k]$ . We first write

[TABLE]

and hence

[TABLE]

where we note that both sides are non-negative.

We can bound the minimum in a similar fashion

[TABLE]

and so

[TABLE]

where we note that both sides are non-positive.

Putting this together, we thus obtain

[TABLE]

completing the proof. ∎

We next need the following key definitions and results from [8].

Definition 6.8.

For an ${\mathbf{a}}\in{\mathbb{F}}_{p}^{n}$ , $k\in\mathbb{N}$ and $\delta\in[0,1]$ , we define $R_{k}^{\delta}({\mathbf{a}})$ to be the number of solutions to

[TABLE]

that satisfy $|\{i_{1},\dots,i_{2k}\}|\geq(1+\delta)k$ .

We will make use of the observation from [8] that $R_{k}({\mathbf{a}})$ is never much larger than $R_{k}^{\delta}({\mathbf{a}})$ .

Lemma 6.9 (Lemma 1.6, [8]).

For all integers $k,n$ with $k\leq n/2$ and any prime $p$ , ${\mathbf{a}}\in{\mathbb{F}}_{p}^{n}$ and $\delta\in(0,1)$ ,

[TABLE]

As we will have the occasion to deal with subsets of vectors which we consider as vectors in their own right, we introduce the notation $|{\mathbf{a}}|$ to mean the dimension of a vector ${\mathbf{a}}$ . By ${\mathbf{b}}\subset{\mathbf{a}}$ we mean that ${\mathbf{b}}$ is a truncation of ${\mathbf{a}}$ . The key technical result in [8] is the following combinatorial lemma, which helps control the number of vectors with many “local” arithmetic relations.

Lemma 6.10 (Theorem 1.7, [8]).

Denote

[TABLE]

Then

[TABLE]

At this point, we fix $\delta=1/2$ , $k=\lceil n^{1/8}\rceil$ and define

[TABLE]

So roughly speaking, this is the set of ${\mathbf{a}}$ which are not arithmetically rich. As Lemma 6.10 suggests, this set captures most of the vectors. More precisely we have the following (see also [8]).

Corollary 6.11.

If $p\leq\exp(c\kappa^{2})$ and $\kappa=n^{c}$ for $c<1/16$ , then for $t\geq n^{1/16}$

[TABLE]

Proof.

(of Corollary 6.11) We can assume that $t\leq p$ , otherwise the statement is trivially true as the left-hand side is zero. Fix a subset $S\subset[n]$ with $|S|\geq n^{1/4}$ and enumerate the vectors ${\mathbf{a}}$ with $\operatorname{supp}({\mathbf{a}})=S$ . By assumption, ${\mathbf{a}}\notin H_{t}$ so the restriction ${\mathbf{a}}|_{S}$ of ${\mathbf{a}}$ to the set $S$ is an element of ${\mathbf{B}}_{k,n^{1/4},\geq t}(|S|)$ . Therefore, Lemma 6.10 guarantees that the number of possible choices for ${\mathbf{a}}|_{S}$ is at most

[TABLE]

where the second inequality follows from our assumption that $t\leq p$ . We obtain the final result by summing over all subsets $S$ . ∎

The next lemma is a simple consequence of Theorem 5.1.

Lemma 6.12.

Suppose that ${\mathbf{a}}\in H_{t}$ . If $p\leq\exp(c\kappa^{2})$ and $\kappa=n^{c}$ for $c<1/16$ , then if $t\geq n^{1/16}$ , there exists a constant $C>0$ such that

[TABLE]

Proof.

(of Lemma 6.12) Let ${\mathbf{b}}$ be a subvector of ${\mathbf{a}}$ with $|\operatorname{supp}({\mathbf{b}})|\geq n^{1/4}$ and $R_{k}^{\delta}({\mathbf{b}})\leq t2^{2k}|{\mathbf{b}}|^{2k}/p$ . In the notation of Theorem 5.1, if we let $f(x)=\sqrt{x}$ then

[TABLE]

This expression is dominated by the first term by our bound on the range of $p$ and our choice of $k$ . We recall that

[TABLE]

to finish the proof. ∎

Now we complete one of our main results of the section.

Proof.

(of Proposition 6.6) We let ${\mathcal{V}}$ denote the vectors in ${\mathbb{F}}_{p}^{n}$ with support larger than $c_{nsp}n$ and ${\mathcal{W}}$ the set of non-zero vectors ${\mathbf{w}}\in{\mathbb{F}}_{p}^{n}$ such that

[TABLE]

By Corollary 4.6, we can assume $p\gtrsim\sqrt{n}$ . Observe that

[TABLE]

By Lemma 6.1,

[TABLE]

Therefore, it suffices to focus on the vectors in ${\mathcal{V}}$ . Note that any ${\mathbf{w}}\in{\mathcal{V}}$ must reside in $H_{p}$ since $R_{k}^{1/2}\leq 2^{2k}|{\mathbf{b}}|^{2k}$ .

There are two cases to consider.

Case 1. We begin with vectors in ${\mathcal{V}}\cap{\mathcal{W}}\cap H_{n^{1/16}}$ . Let ${\mathbf{a}}$ be such a vector. Then by the definition of ${\mathcal{W}}$ and by Lemma 6.12,

[TABLE]

Because of the lower bound, by Theorem 3.6 there exists a generalized arithmetic progression $P$ of rank one in ${\mathbb{F}}_{p}$ and of size $O(p/n^{\varepsilon})$ that contains all but $n^{2\varepsilon}$ entries of ${\mathbf{a}}$ . Note that the number of ways to choose such a $P$ is bounded by $p^{O(1)}$ . For a fixed $P$ , the number of vectors ${\mathbf{a}}$ with at least $n-n^{2\varepsilon}$ components in $P$ is at most

[TABLE]

The probability that any ${\mathbf{a}}\in{\mathcal{V}}\cap{\mathcal{W}}\cap H_{n^{1/16}}$ with at least $n-n^{2\varepsilon}$ components in $P$ is orthogonal to $X_{1},\dots,X_{n-d}$ is bounded by

[TABLE]

provided that $d\leq cn$ for some small constant $c$ . Finally, we take a union bound over $p^{O(1)}$ choices of $P$ to conclude the proof.

Case 2. We address the remaining vectors. Let $\tau=n^{1/16}$ . We show that no vector in $({\mathcal{W}}\cap{\mathcal{V}})\setminus H_{n^{1/16}}$ is orthogonal to $X_{1},\dots,X_{n-d}$ . We can now partition $({\mathcal{W}}\cap{\mathcal{V}})\setminus H_{n^{1/16}}$ as

[TABLE]

where $J$ is the smallest integer such that $2^{J}\tau\geq p$ . Clearly, $J\leq\kappa^{2}$ . We then have

[TABLE]

Combining Corollary 6.11 and Lemma 6.12, we have (noting trivially that $2^{j}\tau\geq n^{1/16}$ )

[TABLE]

where the last line follows from small enough $c^{\prime\prime}$ .

Combining the above estimates, we can conclude that

[TABLE]

as desired. ∎

7. Distribution of ranks revisited

In this section we give a short proof of Theorem 1.4. We start with a high-dimensional lemma, which, in some sense, is a discrete analog of [33], where the distance of a random vector to a subspace of codimension $d$ in ${\mathbb{R}}^{n}$ is considered.

Lemma 7.1.

Assume that $H$ is a subspace in ${\mathbb{F}}_{p}^{n}$ of codimension $d$ , and such that for any ${\mathbf{w}}\in H$ we have $\rho({\mathbf{w}})\leq\delta$ . Then

[TABLE]

Proof.

(of Lemma 7.1) Let ${\mathbf{v}}_{1},\dots,{\mathbf{v}}_{d}$ be a basis of $H$ . Our assumption says that for any $t_{1},\dots,t_{d}$ , not all zero, we have

[TABLE]

Note that in ${\mathbb{F}}_{p}^{n}$

[TABLE]

We have

[TABLE]

Now by our assumption

[TABLE]

Thus we have

[TABLE]

completing the proof. ∎
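For small examples, the concentration quantity $\rho$ appearing in the hypothesis of Lemma 7.1 can be computed exactly. The sketch below assumes that $\rho({\mathbf{w}})=\sup_{a}{\mathbf{P}}(\sum_{i}\xi_{i}w_{i}=a)$ with iid uniform $\pm 1$ signs $\xi_{i}$ (the Bernoulli case; the paper's definition from Theorems 3.1 and 4.1 covers more general entries), and the function name `rho` is hypothetical.

```python
from fractions import Fraction

def rho(w, p):
    """Exact max over a of P(xi_1*w_1 + ... + xi_n*w_n = a mod p), xi_i iid uniform on {-1, +1}.

    The distribution of the partial sums modulo p is updated one coordinate at a time.
    """
    dist = [Fraction(0)] * p
    dist[0] = Fraction(1)
    for wi in w:
        new = [Fraction(0)] * p
        for a, prob in enumerate(dist):
            if prob:
                new[(a + wi) % p] += prob / 2
                new[(a - wi) % p] += prob / 2
        dist = new
    return max(dist)

# A vector with distinct powers of 2 modulo 7 spreads the mass far better than (1, 1).
example_structured = rho([1, 1], 5)
example_spread = rho([1, 2, 4], 7)
```

Vectors with small `rho` are the ones for which Lemma 7.1 gives a bound close to the uniform value.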

Now we apply Propositions 6.2, 6.3 and 6.6 to prove the following.

Lemma 7.2.

Assume that $p\leq\exp(c\kappa^{2})$ and that $\kappa=n^{c}$ for $c<1/16$ and $0\leq d,u\leq n^{c}$ . There exists an event ${\mathcal{E}}_{d}$ with probability ${\mathbf{P}}({\mathcal{E}}_{d})\geq 1-e^{-n^{c^{\prime}}}$ such that the following holds

[TABLE]

where $W_{n-u}$ is the subspace generated by $X_{1},\dots,X_{n-u}$ .

Assuming this lemma, we can complete the proof of Theorem 1.4 by direct calculation or by applying [30, Theorem 5.3]; we leave the details to the reader as an exercise.

Proof.

(of Lemma 7.2) We have seen from Propositions 6.3 and 6.6 that there is an event ${\mathcal{E}}$ with ${\mathbf{P}}({\mathcal{E}})\geq 1-e^{-n^{c^{\prime}}}$ such that for any ${\mathbf{w}}\in W_{n-u}$ we have

[TABLE]

Now conditioning on this event, if ${\operatorname{rank}}(W_{n-u})=n-u-d$ then the codimension of $W_{n-u}$ is $u+d$ , and hence by Lemma 7.1 we have

[TABLE]

as claimed. ∎

8. Equi-distribution of the normal vectors

In this section we prove Theorem 1.6. For convenience we decompose the task into two parts.

Proposition 8.1.

With the same assumption as in Theorem 1.6 we have

•

For each $i\in\{1,\cdots,n\}$ , we have

[TABLE]

•

For each $i\neq j$ , and for any $a\in{\mathbb{F}}_{p}$ we have

[TABLE]

Proof.

(of Proposition 8.1) We prove the first item of Proposition 8.1. It suffices to show this for $i=1$ . We seek to bound the probability of the event that our normal vector ${\mathbf{v}}$ has $v_{1}=0$ under the condition that our first $n-1$ columns achieve full rank, i.e. ${\operatorname{rank}}(M_{n\times(n-1)})=n-1$ . Suppose we are given such a normal vector. Then restricting to the bottom $n-1$ rows, we see that this is equivalent to the event that the submatrix $M_{(n-1)\times(n-1)}$ has a nontrivial nullspace. So we rewrite ${\mathbf{P}}(v_{1}=0\,|\,{\operatorname{rank}}(M_{n\times(n-1)})=n-1)$ as ${\mathbf{P}}(M_{(n-1)\times(n-1)}\mbox{ is singular}\,|\,{\operatorname{rank}}(M_{n\times(n-1)})=n-1)$ . We can simply view this as ${\mathbf{P}}({\operatorname{rank}}(M_{(n-1)\times(n-1)})=n-2\,|\,{\operatorname{rank}}(M_{n\times(n-1)})=n-1)=:{\mathbf{P}}(A|B)={\mathbf{P}}(A\cap B)/{\mathbf{P}}(B)$ .

By Theorem 1.4 (see also [27, Theorem A.4]), we know that

[TABLE]

Now consider the event $A\cap B.$ This is the event that rows ${\mathbf{r}}_{2},\cdots,{\mathbf{r}}_{n}$ span a subspace $H$ of dimension $n-2$ and ${\mathbf{r}}_{1}$ is not in the span of $H$ , which can be expressed as ${\mathbf{P}}(A^{\prime}\cap B^{\prime})={\mathbf{P}}(A^{\prime}){\mathbf{P}}(B^{\prime}|A^{\prime})$ . For ${\mathbf{P}}(A^{\prime})$ , we again use Theorem 1.4,

[TABLE]

For ${\mathbf{P}}(B^{\prime}|A^{\prime})$ , our previous section tells us that if we condition on rows ${\mathbf{r}}_{2},\cdots,{\mathbf{r}}_{n}$ having rank equal to $n-2$ , then our normal vector ${\mathbf{v}}^{\prime}$ exists and has large $\mathbf{ULCD}$ , i.e. $\rho({\mathbf{v}}^{\prime})\leq\exp(-c\kappa^{2})$ . So the probability that ${\mathbf{r}}_{1}$ is in the span of $H$ under this condition is

[TABLE]

Putting this all together, we have

[TABLE]

Now we prove the second item. It suffices to assume $(i,j)=(1,2)$ . The event that $w_{1}=a,w_{2}=1$ is equivalent to the event that $a{\mathbf{r}}_{1}+{\mathbf{r}}_{2}+{\mathbf{r}}_{3}w_{3}+\cdots+{\mathbf{r}}_{n}w_{n}=0$ , where ${\mathbf{r}}_{i}$ is the $i$ -th row of our matrix. If $a$ is zero, we are done via the previous argument, so assume $a$ is nonzero.

Let $H$ be the span of rows ${\mathbf{r}}_{3},\cdots,{\mathbf{r}}_{n}$ , which has full rank $n-2$ by our rank assumption on $M_{n\times(n-1)}$ and the fact that $a{\mathbf{r}}_{1}+{\mathbf{r}}_{2}+{\mathbf{r}}_{3}w_{3}+\cdots+{\mathbf{r}}_{n}w_{n}=0$ . Further, let $\pi$ be the projection to the orthogonal complement $H^{\perp}$ . For each evaluation of rows ${\mathbf{r}}_{3},\cdots,{\mathbf{r}}_{n}$ , $\pi$ is deterministic and $\pi({\mathbf{r}})=\langle{\mathbf{r}},{\mathbf{n}}\rangle,$ where ${\mathbf{n}}$ is the deterministic normal vector. Applying this projection to the linear combination, we have

[TABLE]

Since ${\mathbf{n}}$ is deterministic, the two inner products take any prescribed values $b,c\in{\mathbb{F}}_{p}$ with probability $1/p$ each, up to an error $O(\exp{(-n^{c})})$ , by Proposition 6.3. This means that $a$ , as the ratio of the inner products, is also uniformly distributed with probability $1/p$ and similar error. ∎
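The reduction used in the first item (the first coordinate of the normal vector vanishes exactly when the bottom square submatrix is singular) can be checked on small examples through the classical cofactor description of a normal vector. This is a sketch under assumptions: the helper names `det_mod` and `normal_vector` and the sample matrix are hypothetical, indices are 0-based, and the cofactor formula is a standard linear-algebra fact rather than the argument used in the proof.

```python
def det_mod(A, p):
    """Determinant of a square matrix modulo p by Laplace expansion (only for tiny matrices)."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_mod(minor, p)
    return total % p

def normal_vector(M, p):
    """Signed maximal minors of an n x (n-1) matrix M.

    The returned v satisfies sum_i v[i] * M[i][j] = 0 mod p for every column j,
    because that sum is the determinant of a matrix with a repeated column.
    """
    n = len(M)
    return [((-1) ** i * det_mod(M[:i] + M[i + 1:], p)) % p for i in range(n)]

# Hypothetical 3 x 2 example over F_5.
p = 5
M = [[1, 2], [3, 4], [0, 1]]
v = normal_vector(M, p)
```

By construction `v[0]` equals the determinant of `M` with its first row removed, so it vanishes exactly when that square submatrix is singular.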

We next prove the second part of Theorem 1.6, restated for convenience.

Proposition 8.2.

Assume that $p\lesssim n/\log n$ . Let $n_{a}$ denote the number of indices $i$ such that $w_{i}=a$ . Then for any $\delta<1$ , possibly depending on $n$ , such that $\delta^{-2}p=o(n/\log n)$ , we have

[TABLE]

Proof.

(of Proposition 8.2) Let ${\mathcal{E}}$ denote the set of vectors under consideration up to scaling, and $\bar{{\mathcal{E}}}$ be the complement. We first have the following elementary fact.

Fact 8.3.

We have

[TABLE]

Proof.

First, we note that each $n_{a}$ has distribution $\mathrm{Bin}(n,\frac{1}{p})$ with variance

[TABLE]

Letting $0<\delta<1$ and $\mu$ denote the mean of this distribution, the upper-tail and lower-tail Chernoff inequalities combine to form the following bound:

[TABLE]

For each $a$ in $\{0,1,\cdots,p-1\}$ , let $F_{a}$ denote the event that $|n_{a}-\frac{n}{p}|<\delta n/p.$ Then a trivial union bound gives

[TABLE]

Since there are $p^{n}$ different choices for ${\mathbf{v}}$ , the number of non-equidistributed vectors ${\mathbf{v}}$ is at most $p^{n}(2pe^{-c\delta^{2}n/p})$ . ∎
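The Chernoff step in this proof can be compared with the exact binomial tail for moderate parameters. The sketch below is illustrative only: it assumes the standard two-sided multiplicative form $2\exp(-\delta^{2}\mu/3)$ (so the unspecified constant $c$ is taken to be $1/3$ here), and the function names and the sample values of $n$, $p$, $\delta$ are hypothetical.

```python
import math

def binomial_two_sided_tail(n, p, delta):
    """Exact P(|N - n/p| >= delta * n/p) for N ~ Bin(n, 1/p)."""
    q = 1.0 / p
    mu = n * q
    total = 0.0
    for k in range(n + 1):
        if abs(k - mu) >= delta * mu:
            total += math.comb(n, k) * q ** k * (1 - q) ** (n - k)
    return total

def chernoff_bound(n, p, delta):
    """Two-sided multiplicative Chernoff bound 2*exp(-delta^2 * (n/p) / 3), valid for 0 < delta < 1."""
    return 2 * math.exp(-delta ** 2 * (n / p) / 3)

# Hypothetical parameters: n = 200 coordinates, residues modulo p = 5, deviation delta = 1/2.
n, p, delta = 200, 5, 0.5
tail = binomial_two_sided_tail(n, p, delta)
bound = chernoff_bound(n, p, delta)
union_bound = p * bound  # union over the p residues a
```

Multiplying `union_bound` by $p^{n}$ gives a count of non-equidistributed vectors of the same shape as the one stated in Fact 8.3.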

Now we complete our result. Let ${\mathbf{w}}$ be an arbitrary vector in ${\mathbb{F}}_{p}^{n}$ . We seek to upper bound ${\mathbf{P}}({\mathbf{w}}\mbox{ is normal and }{\mathbf{w}}\in\bar{{\mathcal{E}}})$ . Immediately, this probability is bounded above by

[TABLE]

Similar to our previous sections, we may decompose the sum into classes where ${\mathbf{w}}$ is sparse and ${\mathbf{w}}$ is non-sparse. By Lemma 6.1, the contribution over our sparse vectors is negligible. For our non-sparse vectors, we appeal to Proposition 6.3. We can now bound the sum via:

[TABLE]

as long as $\delta^{-2}p=o(n/\log n)$ , completing the proof. ∎

9. Proof of Theorem 1.7

It suffices to prove Theorem 9.1 below. Again, for simplicity we will assume $M_{n}$ to be an iid Bernoulli matrix taking values $\pm 1$ with probability 1/2 and $p\geq 3$ . We recall that $\phi(x)$ has degree $d$ and $p$ is sufficiently large and

[TABLE]

Let $\alpha$ be a root of $\phi$ and consider the field extension ${\mathbb{F}}_{q}={\mathbb{F}}_{p}[\alpha]$ . Notice that any element $x$ of this field has the form

[TABLE]

More importantly, the event $\phi(x)|D_{M}(x)$ is equivalent to the event that $M_{n}-\alpha$ has rank at most $n-1$ in this field ${\mathbb{F}}_{q}$ . In other words, let $W_{n-k}$ be the subspace in ${\mathbb{F}}_{q}^{n}$ generated by the first $n-k$ columns of the matrix $M_{n}-\alpha$ (equivalently, the columns of $M_{[n]\times[n-k]}-\alpha$ ). Then the event that $M_{n}-\alpha$ has rank at most $n-1$ is the union of the (disjoint) events ${\mathcal{E}}_{k}$ that $W_{n-k}$ has rank $n-k$ and the $(n-k+1)$ -th column $X_{n-k+1}-\alpha{\mathbf{e}}_{n-k+1}$ belongs to $W_{n-k}$ . In what follows we mainly focus on the case of ${\mathcal{E}}_{1}$ ; the treatment of ${\mathcal{E}}_{2},{\mathcal{E}}_{3},\dots$ will be discussed later, and summing over these events will imply Theorem 1.7.
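Arithmetic in the extension ${\mathbb{F}}_{q}={\mathbb{F}}_{p}[\alpha]$ used above can be made concrete by storing each element as its coefficient vector in the basis $1,\alpha,\dots,\alpha^{d-1}$ . The following is a minimal sketch under assumptions: the function name `mul_mod_phi`, the list encoding and the example $\phi(x)=x^{2}+1$ over ${\mathbb{F}}_{3}$ are hypothetical choices for illustration, and $\phi$ is assumed monic.

```python
def mul_mod_phi(x, y, phi, p):
    """Multiply x and y in F_p[alpha]/(phi).

    Elements are coefficient lists [x_0, ..., x_{d-1}] standing for
    x_0 + x_1*alpha + ... + x_{d-1}*alpha^(d-1); phi is monic and given as
    [phi_0, ..., phi_{d-1}, 1].
    """
    d = len(phi) - 1
    prod = [0] * (2 * d - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            prod[i + j] = (prod[i + j] + xi * yj) % p
    # Reduce from the top degree down, using
    # alpha^d = -(phi_0 + phi_1*alpha + ... + phi_{d-1}*alpha^(d-1)).
    for k in range(2 * d - 2, d - 1, -1):
        c = prod[k]
        if c:
            prod[k] = 0
            for t in range(d):
                prod[k - d + t] = (prod[k - d + t] - c * phi[t]) % p
    return prod[:d]

# Hypothetical example: phi(x) = x^2 + 1 is irreducible over F_3, so F_9 = F_3[alpha].
p, phi = 3, [1, 0, 1]
alpha = [0, 1]
alpha_squared = mul_mod_phi(alpha, alpha, phi, p)         # alpha^2 = -1 = 2
one_plus_alpha_sq = mul_mod_phi([1, 1], [1, 1], phi, p)   # (1 + alpha)^2 = 2*alpha
```

With this encoding, a vector in ${\mathbb{F}}_{q}^{n}$ splits into $d$ vectors over ${\mathbb{F}}_{p}$ , one per power of $\alpha$ , which is the decomposition of the normal vector used in the rest of this section.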

Theorem 9.1.

There exists an absolute constant $C$ such that

[TABLE]

Consider the normal vector ${\mathbf{v}}=(v_{1},\dots,v_{n})$ of $W_{n-1}$ (i.e. the column space of the matrix $M_{[n]\times[n-1]}-\alpha$ ). This vector can be written as

[TABLE]

where ${\mathbf{u}}_{i}=(u_{1i},\dots,u_{ni})^{T}\in{\mathbb{F}}_{p}^{n}$ . Notice that for $1\leq j\leq n-1$ we have ${\mathbf{v}}\cdot(X_{j}-\alpha{\mathbf{e}}_{j})=0$ . Hence

[TABLE]

This implies that

[TABLE]

Let

[TABLE]

By fixing the last coordinates $u_{in}=f_{i}\in{\mathbb{F}}_{p}$ of each ${\mathbf{u}}_{i}$ (and with a loss of a multiplicative factor $p^{d}$ in probability), and by fixing $X={\mathbf{r}}_{n}(M_{[n]\times[n-1]})$ (i.e. $X$ is the last row of $M_{[n]\times[n-1]}$ ), with ${\mathbf{v}}_{i}$ being the truncated vectors $(u_{1i},\dots,u_{(n-1)i})$ , we can rewrite (30) as (with ${\mathbf{v}}_{d}={\mathbf{v}}_{0}$ )

[TABLE]

In what follows we set

[TABLE]

Conditioning on $f_{i},X$ , we will show the following key lemma.

Lemma 9.2.

With probability at least $1-\exp(-\Theta(n))$ with respect to the columns of the matrix $M_{n}$ , for any vector ${\mathbf{v}}$ that is orthogonal to the first $n-1$ columns of $M_{n}-\alpha$ , the subspace $H$ generated by ${\mathbf{v}}_{1},\dots,{\mathbf{v}}_{d}$ (defined as above) in ${\mathbb{F}}_{p}^{n}$ cannot contain a non-zero $m$ -sparse vector. In other words, there do not exist coefficients $\alpha_{i}\in{\mathbb{F}}_{p}$ , not all zero, such that

[TABLE]

We can actually prove a slightly more general version of this lemma, which will be used to control ${\mathcal{E}}_{2},{\mathcal{E}}_{3},\dots$ .

Lemma 9.3.

With probability at least $1-\exp(-\Theta(n))$ with respect to the columns of the matrix $M_{n}$ , for any vector ${\mathbf{v}}$ that is orthogonal to the first $n^{\prime}$ columns of $M_{n}-\alpha$ (where $n^{\prime}=(1-o(1))n$ ) there do not exist coefficients $\alpha_{i}\in{\mathbb{F}}_{p}$ , not all zero, such that

[TABLE]

We remark that the above lemmas are somewhat similar to Propositions 6.2, 6.3, 6.6, but the situation here is much more complicated as the relation between ${\mathbf{v}}_{i}$ and $X_{1},\dots,X_{n^{\prime}}$ is non-trivial (for instance ${\mathbf{v}}_{i}$ is not orthogonal to $X_{j}$ ), and also the diagonal entries are perturbed by $\alpha$ .

We postpone the proof of this lemma for a moment, and let us use it to prove the following result, which automatically implies Theorem 9.1.

Theorem 9.4.

On the event of Lemma 9.2 we have

[TABLE]

Proof.

(of Theorem 9.4) Notice that the event $(X_{n}-\alpha{\mathbf{e}}_{n})\cdot{\mathbf{v}}=0$ implies that (by (30), using the same notations for ${\mathbf{v}}_{i},f_{i},x_{n}$ ) for all $1\leq i\leq d$

[TABLE]

In other words, conditioning on $x_{n}$ , and by letting $Y_{n}=X_{n}|_{[n-1]}$ and by choosing deterministic numbers $g_{i}\in{\mathbb{F}}_{p}$ appropriately we have

[TABLE]

Now as $\sum_{i}t_{i}{\mathbf{v}}_{i}$ is not very sparse for any non-trivial choice of $(t_{1},\dots,t_{d})$ , by Corollary 4.6 we have

[TABLE]

Thus we have

[TABLE]

∎

Notice that in the above proof, with $Z_{n}=X_{n}-\alpha{\mathbf{e}}_{n}$ , then the event $Z_{n}\cdot{\mathbf{v}}=0$ can be written as

[TABLE]

where ${\operatorname{tr}}:{\mathbb{F}}_{q}\to{\mathbb{F}}_{p}$ is the field trace. We just showed that

[TABLE]

In the same way, we show the following more general version of Theorem 9.1.

Theorem 9.5.

On the event of Lemma 9.3, as long as $k=o(n)$ we have

[TABLE]

In particular,

[TABLE]

Proof.

(of Theorem 9.5) Notice that the event $Z_{n-k+1}=X_{n-k+1}-\alpha{\mathbf{e}}_{n-k+1}\in W_{n-k}$ is equivalent to the event that this vector is orthogonal to an orthogonal basis ${\mathbf{w}}_{1},\dots,{\mathbf{w}}_{k}$ of $W_{n-k}^{\perp}$ in ${\mathbb{F}}_{q}^{n}$ . We have

[TABLE]

Now observe that ${\mathbf{v}}=\sum_{i}t_{i}{\mathbf{w}}_{i}$ is a non-zero vector that is orthogonal to the first $n^{\prime}$ columns of $M_{n}-\alpha$ . By Lemma 9.3 and by the proof of Theorem 9.4 (via Equation (34)) we have

[TABLE]

Plugging this bound into Equation (9) for each non-zero tuple $(t_{1},\dots,t_{d})$ we complete the proof.

∎

For the rest of this section we will be focusing on Lemma 9.2. The proof of Lemma 9.3 can be done similarly. Indeed, assume that ${\mathbf{w}}_{i}=\sum_{k=0}^{d-1}\alpha^{k}{\mathbf{v}}_{ik}$ with ${\mathbf{v}}_{ik}\in{\mathbb{F}}_{p}^{n}$ , then $\alpha_{i}{\mathbf{w}}_{i}$ can be expressed as $\sum_{k=0}^{d-1}\alpha^{k}\alpha_{ik}{\mathbf{v}}_{ik}$ for some $\alpha_{ik}\ \in{\mathbb{F}}_{p}$ , one of which is non-zero if $\alpha_{i}$ is non-zero. Hence if $\sum_{i}\alpha_{i}{\mathbf{w}}_{i}=\sum_{k=0}^{d-1}\alpha^{k}(\sum_{i}\alpha_{ik}{\mathbf{v}}_{ik})$ is $m$ -sparse in ${\mathbb{F}}_{q}$ , then $\sum_{i}\alpha_{ik}{\mathbf{v}}_{ik}$ are $m$ -sparse for any $0\leq k\leq d-1$ . Choose one index $k$ where $\alpha_{k}$ is non-zero, and hence not all $\alpha_{ik}$ are zero.

In what follows we prove the key lemmas on non-sparsity. We will mainly focus on Lemma 9.2 because the proof for Lemma 9.3 is almost identical as long as $n^{\prime}=(1-o(1))n$ .

9.6. Proof of Lemma 9.2

Let us assume that $\sum_{i}c_{i}{\mathbf{v}}_{i}$ is an $m$ -sparse vector. We first note that by a proper “rotation”, we can assume that this vector is ${\mathbf{v}}_{1}$ . This can be seen from the following identity, where ${\mathbf{v}}$ and ${\mathbf{u}}_{i}$ are as before:

[TABLE]

By iterating (32) we have (with ${\mathbf{g}}=X$ and $Q$ from (36), and $t_{i}$ are deterministic, being determined by $f_{i}$ from (32))

[TABLE]

Our goal is that, assuming that ${\mathbf{v}}_{1}$ is $m$ -sparse, then conditioning on a realization of $\binom{n}{m}p^{m}=\exp(o(n))$ possible values of ${\mathbf{v}}_{1}$ , we show that the probability ${\mathbf{P}}(Q^{d}{\mathbf{v}}_{1}+\sum_{j=0}^{d-1}t_{j}Q^{j}{\mathbf{g}}={\mathbf{v}}_{1})$ is very small, so that after taking union bound the probability is still negligible (where we will use the assumption that $p\leq n^{1/2-\varepsilon}$ and $m\leq n^{1-2\varepsilon}$ ).

We will estimate ${\mathbf{P}}(Q^{d}{\mathbf{v}}_{1}+\sum_{j=0}^{d-1}t_{j}Q^{j}{\mathbf{g}}={\mathbf{v}}_{1})$ by a decoupling process, which roughly speaking allows us to pass from polynomials of $Q$ to multilinear forms whose factors are sparser matrices than $Q$ but are independent, so that we can control the probability more easily. This process can be described as follows: assume that $M_{i}$ is the matrix obtained from $Q$ by replacing all entries by zero, except those in the $i$ -th block of rows (in general we decompose the rows of $Q$ into $2^{d}$ groups, each with consecutive indices of size approximately $(n-1)/2^{d}$ , and our matrices are formed by rows within a group). Then we can write

[TABLE]

where $M_{i1}$ and $M_{i2}$ are submatrices obtained from $M_{i}$ by dividing the $i$ -th block of rows into two groups of (almost) equal size.

In general, in each step of our process, we decrease the polynomial degrees of a given matrix (the total degree remains the same), but double the number of matrices, and hence the probability. We will rely on the following well-known decoupling result (see for instance [7, 41]).

Claim 9.7.

Let $X,Y$ be two random vectors in ${\mathbb{R}}^{k}$ and ${\mathbb{R}}^{l}$ , and let $f:{\mathbb{R}}^{k+l}\to{\mathbb{R}}$ be a function. Then for any $a$ we have

[TABLE]

where $X^{\prime}$ is an independent copy of $X$ , and $Y^{\prime}$ is an independent copy of $Y$ .

In our application below $X,Y,X^{\prime},Y^{\prime}$ , etc. will be matrices. Now we describe the process in more detail.

•

We start by decomposing $Q=M_{1}+M_{2}$ in (36). Factoring out, we will obtain a sum of many products of $M_{1}$ and $M_{2}$ . In the next step we will be using Claim 9.7 to remove $M_{1}^{d}$ and $M_{2}^{d}$ (i.e. the highest degree polynomial of $M_{1}$ and $M_{2}$ ) accordingly.

•

In general, assume that we have $XZ_{1}XZ_{2}\dots Z_{k}X$ , where $Z_{i}$ are products of matrices that do not contain $X$ (it is possible that $Z_{i}$ is just the identity matrix), and $X$ appears $k$ times; we then decompose $X=X^{\prime}+X^{\prime\prime}$ into block matrices, expanding the products we obtain

[TABLE]

where in $R$ the total number of appearances of $X^{\prime}$ and $X^{\prime\prime}$ are at most $k-1$ . We then keep $X^{\prime}$ , decouple $X^{\prime\prime}$ by another independent matrix $Y^{\prime\prime}$ of the same distribution to remove $X^{\prime}Z_{1}X^{\prime}Z_{2}\dots Z_{k}X^{\prime}$ , and then keep $X^{\prime\prime}$ and $Y^{\prime\prime}$ , decouple $X^{\prime}$ by another independent matrix $Y^{\prime}$ of the same distribution to remove $X^{\prime\prime}Z_{1}X^{\prime\prime}Z_{2}\dots Z_{k}X^{\prime\prime}-Y^{\prime\prime}Z_{1}X^{\prime\prime}Z_{2}\dots Z_{k}Y^{\prime\prime}$ . More precisely, by Claim 9.7 (with $\{{\mathbf{z}}_{1},{\mathbf{z}}_{2}\}=\{{\mathbf{f}},{\mathbf{v}}_{1}\}$ )

[TABLE]

It is crucial to note that by doing so, if the highest degree of $X$ (which was decomposed into $X^{\prime}+X^{\prime\prime}$ ) was also $k$ , then the products having $k$ factors of $X^{\prime}$ (and also $Y^{\prime},X^{\prime\prime},Y^{\prime\prime}$ ) are canceled out in $A(X^{\prime},X^{\prime\prime})-A(X^{\prime},Y^{\prime\prime})-A(Y^{\prime},X^{\prime\prime})+A(Y^{\prime},Y^{\prime\prime})$ .

•

In summary, after each round of the decoupling process, we will not create higher polynomials elsewhere but replace $X$ which appears $k$ times in the product form by four matrices $X^{\prime},X^{\prime\prime},Y^{\prime},Y^{\prime\prime}$ which appear at most $k-1$ times. Hence, after $4^{d}$ steps of iterating the process, we will create a sum of many multilinear forms in which each matrix factor appears at most once. In other words they might have the form

[TABLE]

where $R,S$ are also multilinear forms which might also contain $X_{1},\dots,X_{d},\dots$ but the total degrees are smaller than $d$ .

Notice that in the matrix $X_{i}$ there are approximately $n/4^{d}$ non-zero rows.

Proof.

(of Lemma 9.2) We have seen that

[TABLE]

Set

[TABLE]

A simplified case. In order to motivate our next step, let us assume for now that the RHS of (37) has only the term $X_{1}\dots X_{d}$ , so that we are estimating the probability ${\mathbf{P}}(X_{1}X_{2}\dots X_{d}{\mathbf{v}}_{1}=0)$ .

Let $1\leq m_{0}\leq m$ be fixed. We will assume ${\mathbf{v}}_{1}$ to have exactly $m_{0}$ non-zero entries (noting that taking a union bound over $m_{0}$ will not significantly change our bounds). Then by the fact that $\sup_{a}{\mathbf{P}}(\sum_{i=1}^{m}\xi_{i}x_{i}=a)\leq 1-c$ for some absolute constant $c$ and by the fact that $X_{d}$ has $C_{d}n$ iid row vectors (where $C_{d}\approx 1/4^{d}$ ) we have that

[TABLE]

In the next step, conditioning on this event of $X_{d}$ , we consider the vector $X_{d-1}(X_{d}{\mathbf{v}}_{1})$ and apply the following fact (by relying on Theorem 4.1)

Claim 9.8.

Assume that ${\mathbf{w}}$ is not $C_{d}n$ -sparse, and $X$ is a random matrix with $n^{\prime}$ iid rows as in $M$ . Then for sufficiently large $p$ , and for $p\lesssim\sqrt{n}$

[TABLE]

Proof.

Note that if ${\mathbf{w}}$ is not $C_{d}n$ -sparse then by Theorem 4.1

[TABLE]

As such, as $p$ is sufficiently large the probability under consideration is bounded by

[TABLE]

where we take a union bound over all possible positions for the zero coordinates of $X{\mathbf{w}}$ . ∎

We remark that the above might continue to hold for small $p$ , but we will not focus on this case for simplicity.

We next iterate Claim 9.8, by taking union bound and with an assumption that $m\lesssim_{d}\sqrt{n}$ we obtain a bound

[TABLE]

for the event that $X_{1}\dots X_{d}{\mathbf{v}}_{1}=0$ (or $X_{1}\dots X_{d}{\mathbf{v}}_{1}$ is $m$ -sparse).

General case. First let us record here a slightly more general variant of Claim 9.8. We say that a random vector ${\mathbf{w}}$ is $m$ -free if all but at most $m$ coordinates of ${\mathbf{w}}$ are determined. So for instance $m$ -sparse vectors are $m$ -free because all but at most $m$ coordinates of ${\mathbf{w}}$ are zero. By an identical proof to Claim 9.8, we have

Claim 9.9.

Assume that ${\mathbf{w}}$ is not $C_{d}n$ -free, and $X$ is a random matrix with $n^{\prime}$ iid rows as in $M$ . Then for sufficiently large $p$ , and for $p\lesssim\sqrt{n}$

[TABLE]

To continue, recall that we would like to estimate ${\mathbf{P}}((\sum_{\sigma}X_{\sigma(1)}X_{\sigma(2)}\dots X_{\sigma(d)}+R){\mathbf{z}}_{1}+S{\mathbf{z}}_{2}=0)$ . Here we cannot expose $X_{d}$ , and then $X_{d-1}$ , etc. one by one in order as in the simplified case above, because $X_{i}$ might not appear exactly in the $i$ -th position of each multilinear form. However, we can adjust the process using the following observation.

Fact 9.10.

For any $i$ , and for any vector ${\mathbf{u}}$ , the vector $X_{i}{\mathbf{u}}$ of ${\mathbb{R}}^{n}$ has non-zero entries only in the $i$ -th block. In particular, if a vector $\sum_{i}X_{i}{\mathbf{u}}_{i}$ is $n^{\prime}$ -sparse, then $X_{1}{\mathbf{u}}_{1}$ is also $n^{\prime}$ -sparse.
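The block structure behind this fact is easy to see on a toy matrix. The sketch below is only an illustration under assumptions: the helper names `row_blocks`, `matvec` and `support`, the modulus and the sample matrix are hypothetical, and the blocks are taken to be consecutive groups of rows as in the description of the decoupling process.

```python
def row_blocks(Q, num_blocks):
    """Split Q into matrices M_1, ..., M_B of the same shape.

    M_b keeps the b-th block of consecutive rows of Q and is zero elsewhere,
    so that Q is the sum of the returned matrices.
    """
    n = len(Q)
    bounds = [i * n // num_blocks for i in range(num_blocks + 1)]
    parts = []
    for b in range(num_blocks):
        lo, hi = bounds[b], bounds[b + 1]
        parts.append([row[:] if lo <= r < hi else [0] * len(row)
                      for r, row in enumerate(Q)])
    return parts, bounds

def matvec(A, u, p):
    """Matrix-vector product modulo p."""
    return [sum(a * x for a, x in zip(row, u)) % p for row in A]

def support(v):
    """Set of indices of the non-zero coordinates of v."""
    return {i for i, x in enumerate(v) if x}

# Hypothetical 4 x 4 example modulo 5 with two row blocks.
p = 5
Q = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2], [1, 0, 0, 1]]
parts, bounds = row_blocks(Q, 2)
u = [1, 1, 0, 0]
images = [matvec(M_b, u, p) for M_b in parts]
```

Since the supports of the vectors in `images` lie in disjoint blocks, the support of their sum is the disjoint union of the individual supports, so sparsity of the sum forces sparsity of each summand.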

Now we describe the method. First, based on Fact 9.10 we just need to address all the multilinear forms beginning with $X_{1}$ . The vector obtained by summing over these forms is of type $X_{1}(\sum_{i=2}^{d}a_{i}X_{i}R_{i}{\mathbf{w}}_{i})$ . To estimate the probability that this vector is not $m$ -free, we will focus only on the submatrix restricted by the columns of $X_{1}$ having the same index as the rows of $X_{2}$ , and condition on the remaining entries of $X_{1}$ . Only using the randomness from this submatrix $X_{1}^{\prime}$ of $X_{1}$ , it suffices to show that $(X_{1}^{\prime}+F)(X_{2}R_{2}{\mathbf{v}})$ is not $n^{\prime}$ -free with high probability. To show this, assuming that we already know that $X_{2}R_{2}{\mathbf{v}}$ is non-sparse, we can use (39) (i.e. Claim 9.9), where $f_{i}$ is allowed to depend on $F$ . Now to show that $X_{2}R_{2}{\mathbf{v}}$ is not $n^{\prime}$ -free with high probability (where the randomness is on $X_{2},\dots,X_{d}$ ), where $R_{2}$ is a sum of multilinear forms without $X_{1}$ and $X_{2}$ , we again focus on the submatrix $X_{2}^{\prime}$ of $X_{2}$ restricted by the columns having the same indices as the rows of $X_{3}$ , and continue forward. So in the last step of our argument (or the first step if we go backward as in the simplified case above), we just need to show that $X_{d}{\mathbf{v}}_{1}$ is not $n^{\prime}$ -free with high probability with respect to the randomness of $X_{d}$ and for some appropriate number $n^{\prime}=\Theta_{d}(n)$ , but this was exactly (38). ∎

10. Proof of Theorem 1.8

We are going to prove the following result.

Proposition 10.1 (asymptotic independence).

Let $d\geq 1$ be fixed. Then for any distinct numbers $a_{1},\dots,a_{d}$ in ${\mathbb{F}}_{p}$ ,

[TABLE]

where ${\mathcal{E}}_{a}$ is the event that $a$ is an eigenvalue of $M$ .
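The event ${\mathcal{E}}_{a}$ can be tested directly: $a$ is an eigenvalue of $M$ over ${\mathbb{F}}_{p}$ exactly when $M-a$ is singular over ${\mathbb{F}}_{p}$. The following is a minimal sketch of that test together with a small Monte Carlo experiment. The function names, the choice of Bernoulli $(0,1)$ entries, and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import random


def rank_mod_p(rows, p):
    """Rank over F_p (p prime) of a matrix given as a list of rows."""
    a = [[x % p for x in row] for row in rows]
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        # Find a pivot in this column at or below the current rank row.
        pivot = next((r for r in range(rank, n_rows) if a[r][col]), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        inv = pow(a[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        a[rank] = [(x * inv) % p for x in a[rank]]
        for r in range(n_rows):
            if r != rank and a[r][col]:
                f = a[r][col]
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


def eigenvalues_mod_p(m, p):
    """List of a in F_p such that M - a is singular over F_p."""
    n = len(m)
    found = []
    for a in range(p):
        shifted = [[m[i][j] - (a if i == j else 0) for j in range(n)]
                   for i in range(n)]
        if rank_mod_p(shifted, p) < n:
            found.append(a)
    return found


def empirical_eigenvalue_free_rate(n, p, trials, seed=0):
    """Fraction of sampled n x n Bernoulli(0,1) matrices with no eigenvalue in F_p."""
    rng = random.Random(seed)
    free = 0
    for _ in range(trials):
        m = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        if not eigenvalues_mod_p(m, p):
            free += 1
    return free / trials
```

For instance, `eigenvalues_mod_p([[0, 1], [-1, 0]], 5)` finds the roots of $x^{2}+1$ in ${\mathbb{F}}_{5}$, while the same matrix has no eigenvalue in ${\mathbb{F}}_{3}$. For large $n$ and $p$ one could compare the output of `empirical_eigenvalue_free_rate` with $e^{-1}$, the value suggested by the $Pois(1)$ limit used in the proof of Theorem 1.8; small parameters should not be expected to match closely.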

Proof.

(of Theorem 1.8) Let $X_{n,p}$ denote the random variable

[TABLE]

By Proposition 10.1, one easily has ${\mathbf{E}}X_{n,p}=1+p\times\exp(-c_{1}n/p^{2})=1+o(1)$ , and more generally for any fixed integer $d\geq 1$

[TABLE]

where $X\sim Pois(1)$. It thus follows that $X_{n,p}$ is asymptotically distributed as $Pois(1)$. In particular,

[TABLE]

∎
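The moment comparison with $Pois(1)$ can be made concrete under two assumptions that are not visible in the displays above: that $X_{n,p}=\sum_{a\in{\mathbb{F}}_{p}}\mathbf{1}_{{\mathcal{E}}_{a}}$ counts the eigenvalues of $M$ in ${\mathbb{F}}_{p}$, and that Proposition 10.1 gives ${\mathbf{P}}(\wedge_{i}{\mathcal{E}}_{a_{i}})=p^{-d}+O(\varepsilon)$ uniformly over distinct $a_{1},\dots,a_{d}$, where $\varepsilon$ is a placeholder for the error term. A sketch of the resulting factorial moment computation:

```latex
\begin{align*}
\mathbf{E}\big[X_{n,p}(X_{n,p}-1)\cdots(X_{n,p}-d+1)\big]
  &= \sum_{\substack{(a_1,\dots,a_d)\in\mathbb{F}_p^d \\ \text{pairwise distinct}}}
     \mathbf{P}\Big(\bigwedge_{i=1}^{d}\mathcal{E}_{a_i}\Big) \\
  &= p(p-1)\cdots(p-d+1)\,\big(p^{-d}+O(\varepsilon)\big) \\
  &= \prod_{j=0}^{d-1}\Big(1-\frac{j}{p}\Big) + O\big(p^{d}\varepsilon\big).
\end{align*}
```

If $p\to\infty$ and $p^{d}\varepsilon\to 0$, the right-hand side tends to $1$, which is the $d$-th factorial moment of a $Pois(1)$ random variable. This is consistent with the first-moment estimate ${\mathbf{E}}X_{n,p}=1+o(1)$ stated in the proof.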

It remains to justify the above independence result, Proposition 10.1.

10.2. Proof of Proposition 10.1

The event $\wedge{\mathcal{E}}_{a_{i}}$ is equivalent to the event that there exist ${\mathbf{v}}_{i},1\leq i\leq d$ such that

[TABLE]

We condition on $M_{[n]\times[n-1]}$ . Let ${\mathbf{u}}_{i}$ be a normal vector of $M_{[n]\times[n-1]}-a_{i}$ . The event $\wedge{\mathcal{E}}_{a_{i}}$ then implies that

[TABLE]

We are going to show that the probability of this event with the randomness on $X_{n}$ is $\frac{1}{p^{d}}$ modulo a small error term for almost all realizations of $M_{[n]\times[n-1]}$. To be more precise, we will restrict to the following event of probability close to 1.

Lemma 10.3.

Let $c_{d}$ be a sufficiently small constant to be chosen later. With probability at least $1-O(\exp(-\Theta(n)))$ with respect to $M_{[n]\times[n-1]}$, none of the vectors $\sum_{i}\beta_{i}{\mathbf{u}}_{i}$ with $\beta_{i}\in{\mathbb{F}}_{p}$, not all zero, is $c_{d}n$-sparse.

Assuming this for now we can conclude our main result.

Proof.

(of Proposition 10.1) Conditioning on the event considered in Lemma 10.3, we can then just follow the proof of Theorem 9.4 applied to the events in (40). More precisely, the estimates following Equation (33) hold with $g_{i}=a_{i}{\mathbf{e}}_{n}\cdot{\mathbf{u}}_{i}$ for $1\leq i\leq d$. ∎

10.4. Proof of Lemma 10.3

Set $m=c_{d}n$ for sufficiently small $c_{d}$. By taking a union bound (with a loss of a multiplicative factor $p^{d}$ in probability), it suffices to consider the probability that ${\mathbf{w}}=\sum_{i}\beta_{i}{\mathbf{u}}_{i}$ is $m$-sparse for a fixed choice of $\beta_{1},\dots,\beta_{d}$.

Lemma 10.5.

With probability at least $1-O(\exp(-\Theta(n)))$ , ${\mathbf{w}}$ cannot be $m$ -sparse.

As in the previous section, let $Q$ denote the matrix $M_{[n-1]\times[n-1]}^{T}$. The next claim says that the vector ${\mathbf{w}}$ can be viewed as a null vector of a polynomial of degree $d$ in $Q$. More specifically, let $X$ be the last row of $M_{[n]\times[n-1]}$.

Claim 10.6.

There exist coefficients $\alpha_{i}$ and $\alpha_{i}^{\prime}$ such that

[TABLE]

It is clear that, by using this claim, we can complete the proof of Lemma 10.3 by following exactly the decoupling process in the proof of Theorem 9.4 in the previous section (which in turn yields the bound $1-\exp(-\Theta(n))$, obtained at the last step of the process). This bound is clearly strong enough to absorb all union bounds of type $p^{O(d)}$ given the range of $p$. It thus remains to justify the claim above.

Proof.

(of Claim 10.6) Fix $f_{i}=u_{in}$ (hence we lose another multiplicative factor of $p^{d}$ in probability), and let ${\mathbf{w}}_{i}$ be the concatenation of ${\mathbf{u}}_{i}$. By $(M-a_{i}){\mathbf{u}}_{i}=0$ we have

[TABLE]

As such

[TABLE]

In other words, under the action of $Q-a_{1}$, we eliminate ${\mathbf{w}}_{1}$ (or rather, we change it to a deterministic vector). Iterating the process, we then obtain that

[TABLE]

where $\alpha_{d}\neq 0,\alpha_{i},\alpha_{i}^{\prime}$ depend on $\beta_{1},\dots,\beta_{d},a_{1},\dots,a_{d}$ and $X$ . ∎

11. Remarks

First, among the three characterizations provided, Theorem 3.1 was less effective in our current applications because of the polynomial restriction, but this result is expected to have other implications beyond random matrix theory because of its near optimality. The remaining two characterizations, Theorem 4.4 and Theorem 5.1, yield sub-exponential bounds in applications. The latter is more amenable to perturbations (in that we do not have to assume $\|M\|_{2}=O(\sqrt{n})$).

Second, we remark that the error bounds in Theorem 9.1 and in Proposition 10.1 are of the form $\exp(-n^{1-o(1)}/p^{2})$, which were obtained by applying Corollary 4.6. Compared to the justification for the uniform model in Section 2, our approach seems natural and does explain the main terms in Theorem 1.7 (obtained from Theorem 9.1 and Theorem 9.5) and in Theorem 1.8 (obtained from Proposition 10.1). Following our treatment in Section 7, it is natural to expect that these error bounds can be improved to $\exp(-n^{c})$, but for this improvement one has to show that the vectors ${\mathbf{v}}_{1}$ in (36) or ${\mathbf{w}}$ in Claim 10.6 have large ULCD or large $R_{k}$, and this task seems to be extremely challenging.

Universality is an extremely complicated phenomenon. While we have addressed only a few universal examples for random matrices in (prime) fields in the current note, there remain many interesting and tantalizing questions. Besides the obvious (and doable) direction of extending the current results to general finite fields ${\mathbb{F}}_{q}$, we conjecture that the following statistics of the uniform model are universal for iid random matrix models:

• the distributions of $\lambda_{\phi_{1}},\dots,\lambda_{\phi_{k}}$ are asymptotically independent for different irreducible polynomials $\phi_{1},\dots,\phi_{k}$ (which would then generalize (5));

• the results of Stong and of Hansen and Schmutz [19] connecting the distribution of degrees of irreducible factors of the characteristic polynomial to the cycle lengths of a random permutation.

Lastly, we conjecture that for a fixed random matrix model (such as the Bernoulli $(0,1)$ or $(-1,1)$ model), the considered statistics over ${\mathbb{F}}_{p}$ for different primes $p$ are asymptotically independent.

Acknowledgements. The authors are thankful to J. Koenig for helpful comments. The first author is supported in part by the National Science Foundation postdoctoral fellowship DMS-1702533. The last two authors are partially supported by National Science Foundation grants DMS-1600782 and CAREER DMS-1752345.

Appendix A Tensorization lemma

The following is an analog of [32, Lemma 2.2].

Lemma A.1.

Let $K,\delta_{0}\geq 0$ be given. Assume that $\xi_{1},\dots,\xi_{n}$ are iid real-valued random variables and that ${\mathbf{P}}(\|\xi_{i}\|_{{\mathbb{R}}/{\mathbb{Z}}}<\delta)\leq K\delta$ for all $\delta\geq\delta_{0}$. Then

[TABLE]

where $C_{0}$ is absolute.

Proof.

Assume that $\delta\geq\delta_{0}$ . By Chebyshev’s inequality

[TABLE]

On the other hand,

[TABLE]

For $0\leq u\leq 1$ we use ${\mathbf{P}}(\|\xi_{i}\|_{{\mathbb{R}}/{\mathbb{Z}}}\leq\delta u)\leq{\mathbf{P}}(\|\xi_{i}\|_{{\mathbb{R}}/{\mathbb{Z}}}\leq\delta)\leq K\delta$ , while for $u\geq 1$ we have ${\mathbf{P}}(\|\xi_{i}\|_{{\mathbb{R}}/{\mathbb{Z}}}\leq\delta u)\leq K\delta u$ . Thus

[TABLE]

as desired. ∎
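The displays of this proof are not reproduced above. As a hedged sketch only, assume that the conclusion of Lemma A.1 has the usual tensorized form ${\mathbf{P}}(\sum_{i}\|\xi_{i}\|_{{\mathbb{R}}/{\mathbb{Z}}}^{2}<\delta^{2}n)\leq(C_{0}K\delta)^{n}$ for $\delta\geq\delta_{0}$; this form is an assumption modeled on the standard Euclidean tensorization argument, not a quotation of the lemma. The steps described in the text then read as follows, with the first line using the exponential form of Markov's inequality together with independence.

```latex
\begin{align*}
\mathbf{P}\Big(\sum_{i=1}^{n}\|\xi_i\|_{\mathbb{R}/\mathbb{Z}}^{2}<\delta^{2}n\Big)
  &\le \mathbf{E}\exp\Big(n-\frac{1}{\delta^{2}}\sum_{i=1}^{n}\|\xi_i\|_{\mathbb{R}/\mathbb{Z}}^{2}\Big)
   = e^{n}\prod_{i=1}^{n}\mathbf{E}\exp\big(-\|\xi_i\|_{\mathbb{R}/\mathbb{Z}}^{2}/\delta^{2}\big), \\
\mathbf{E}\exp\big(-\|\xi_i\|_{\mathbb{R}/\mathbb{Z}}^{2}/\delta^{2}\big)
  &= \int_{0}^{\infty}2ue^{-u^{2}}\,
     \mathbf{P}\big(\|\xi_i\|_{\mathbb{R}/\mathbb{Z}}\le\delta u\big)\,du \\
  &\le K\delta\int_{0}^{1}2ue^{-u^{2}}\,du
     + K\delta\int_{1}^{\infty}2u^{2}e^{-u^{2}}\,du
   \;\le\; 2K\delta .
\end{align*}
```

Here the two integrals evaluate to $1-e^{-1}$ and $e^{-1}+\int_{1}^{\infty}e^{-u^{2}}\,du$, whose sum is below $2$, so under the stated assumption one may take $C_{0}=2e$.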

Bibliography

[1] G. V. Balakin, The distribution of the rank of random matrices over a finite field (Russian, with English summary), Teor. Veroyatn. Primen. 13 (1968), 631-641.
[2] J. Blomer, R. Karp, and E. Welzl, The rank of sparse random matrices over finite fields, Random Structures Algorithms 10 (1997), no. 4, 407-419.
[3] R. P. Brent and B. D. McKay, Determinants and ranks of random matrices over ${\mathbb{Z}}_{m}$, Discrete Math. 66 (1987), no. 1-2, 35-49.
[4] J. Bourgain, V. Vu and P. M. Wood, On the singularity probability of discrete random matrices, Journal of Functional Analysis 258 (2010), no. 2, 559-603.
[5] J. Clancy, T. Leake, N. Kaplan, S. Payne, and M. M. Wood, On a Cohen-Lenstra heuristic for Jacobians of random graphs, Journal of Algebraic Combinatorics 42 (2015), no. 3, 701-723.
[6] C. Cooper, On the rank of random matrices, Random Structures Algorithms 16 (2000), no. 2, 209-232.
[7] K. Costello, T. Tao, V. Vu, Random symmetric matrices are almost surely non-singular, Duke Math. J. 135 (2006), 395-413.
[8] A. Ferber, V. Jain, K. Luh and W. Samotij, On the counting problem in inverse Littlewood–Offord theory, arxiv.org/abs/1904.10425.
