Spectrum of random perturbations of Toeplitz matrices with finite symbols

Anirban Basak; Elliot Paquette; Ofer Zeitouni

arXiv:1812.06207·math.PR·November 14, 2019


TL;DR

This paper studies how the eigenvalues of Toeplitz matrices with finite symbols are affected by small random perturbations, showing they converge to a distribution determined by the symbol evaluated on the unit circle.

Contribution

It extends previous results to non-triangular Toeplitz matrices with more general noise, confirming predictions about eigenvalue distributions under perturbations.

Findings

01

Eigenvalue empirical measure converges to the law of the symbol on the unit circle.

02

Results apply to non-triangular matrices and non-Gaussian noise.

03

Confirms pseudo-spectrum predictions for eigenvalue behavior.

Abstract

Let $T_{N}$ denote an $N \times N$ Toeplitz matrix with finite, $N$-independent symbol $a$. For $E_{N}$ a noise matrix satisfying mild assumptions (ensuring, in particular, that $N^{-1/2}\|E_{N}\|_{\mathrm{HS}} \to_{N \to \infty} 0$ at a polynomial rate), we prove that the empirical measure of eigenvalues of $T_{N} + E_{N}$ converges to the law of $a(U)$, where $U$ is uniformly distributed on the unit circle in the complex plane. This extends results from arXiv:1712.00042 to the non-triangular setup and to noise that is not complex Gaussian, and confirms predictions obtained in Reichel and Trefethen (1992) using the notion of pseudospectrum.


Equations

$${\bm{a}}(\lambda):=\sum_{k=-\infty}^{\infty}a_{k}\lambda^{k},\qquad\lambda\in\mathbb{S}^{1},$$

$$T_{N}:=\begin{bmatrix}a_{0}&a_{1}&a_{2}&\cdots&\cdots&a_{N-1}\\ a_{-1}&a_{0}&a_{1}&\ddots&&\vdots\\ a_{-2}&a_{-1}&\ddots&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&a_{1}&a_{2}\\ \vdots&&\ddots&a_{-1}&a_{0}&a_{1}\\ a_{-(N-1)}&\cdots&\cdots&a_{-2}&a_{-1}&a_{0}\end{bmatrix}.$$

$${\bm{a}}(\lambda)=\sum_{k=-d_{2}}^{d_{1}}a_{k}\lambda^{k},\qquad\text{for some }d_{1},d_{2}\geq 0,$$

$$L_{N}^{A}:=\frac{1}{N}\sum_{i=1}^{N}\delta_{\lambda_{i}},$$

$$\mathbb{E}\Big[\sum_{i,j=1}^{N}|e_{i,j}|^{2}\Big]=O(N^{2}),$$

$$\mathbb{P}\big(s_{\min}(E_{N}+M_{N})\leq N^{-\beta}\big)=o(1).$$

$$\mathcal{L}_{\mu}(z):=\int\log|z-x|\,d\mu(x),\qquad z\in\mathbb{C}.$$

$$\mathbb{P}\big(\|\Delta_{N}\|\geq N^{-\gamma_{0}}\big)=o(1),$$

$$\mathcal{L}_{L_{N}^{A+\Delta}}(z)\to\mathcal{L}_{\mu}(z),\qquad\text{as }N\to\infty,\ \text{in probability}.$$

$$\mathcal{L}_{L_{N}^{A+N^{-\gamma}E}}(z)\to\mathcal{L}_{\mu}(z),\qquad\text{as }N\to\infty,\ \text{in probability}.$$

$${\bm{a}}(\lambda)=\sum_{k=-d_{2}}^{d_{1}}a_{k}\lambda^{k},$$

$$P_{z,{\bm{a}}}(\lambda):=({\bm{a}}(\lambda)-z)\cdot\lambda^{d_{2}}=0,$$

$$\mathcal{L}_{\mu_{\bm{a}}}(z)=\log|a_{d_{1}}|+\sum_{k=1}^{d}\log_{+}|\lambda_{k}(z)|,$$

$$\lim_{N\to\infty}\mathcal{L}_{L_{N}^{T+\Delta}}(z)=\lim_{N\to\infty}\frac{1}{N}\log|\det(T_{N}(z)+\Delta_{N})|=\left\{\begin{array}{ll}\log|\lambda_{1}(z)|+\log|\lambda_{2}(z)|&\mbox{ if }z\in\mathcal{R}_{2}\\ \log|\lambda_{1}(z)|&\mbox{ if }z\in\mathcal{R}_{1}\\ 0&\mbox{ if }z\in\mathcal{R}_{0}\end{array}\right.,$$

$$\det(T_{N}(z)[X;Y])\sim\left\{\begin{array}{ll}|\lambda_{1}(z)|^{N}\cdot|\lambda_{2}(z)|^{N}&\mbox{ if }X=Y=[N]\\ |\lambda_{1}(z)|^{N}&\mbox{ if }X=[N-1],\,Y=[N]\setminus\{1\}\\ 1&\mbox{ if }X=[N-2],\ Y=[N]\setminus\{1,2\}\end{array}\right.,$$

$$\sum_{\ell\neq k}P_{\ell}(z)=o\Big(N^{-C}\cdot\prod_{j=1}^{k}|\lambda_{j}(z)|^{N}\Big)=\Omega\big(|P_{k}(z)|\big),\qquad z\in\mathcal{R}_{k},\ k=0,1,2,$$

$$\lim_{N\to\infty}\Big(\mathcal{L}_{L_{N}^{A+\Delta}}(z)-\mathcal{L}_{L_{N}^{A+N^{-\gamma}E}}(z)\Big)=0,\qquad\text{in probability}.$$

$$\lim_{N\to\infty}\frac{1}{N}{\rm tr}\big((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*}\big)^{k}-\mathbb{E}\Big(\Big|z-\sum_{i=-d_{2}}^{d_{1}}a_{i}U^{i}\Big|^{2k}\Big)=0.$$

$$T_{N}=\sum_{m=0}^{d_{1}}a_{m}J_{N}^{m}+\sum_{n=1}^{d_{2}}a_{-n}(J_{N}^{*})^{n},$$

$$\frac{1}{N}{\rm tr}\big((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*}\big)^{k}=\frac{1}{N}\sum_{\underline{m},\underline{n}}b_{\underline{m},\underline{n}}(z)\,{\rm tr}\big[J_{N}^{m_{1}}(J_{N}^{*})^{n_{1}}\cdots J_{N}^{m_{2k}}(J_{N}^{*})^{n_{2k}}\big]$$

$$\mathbb{E}\Big(\Big|z-\sum_{i=-d_{2}}^{d_{1}}a_{i}U^{i}\Big|^{2k}\Big)=\sum_{\underline{m},\underline{n}}b_{\underline{m},\underline{n}}(z)\,\mathbb{E}\big[U^{M_{\underline{m},\underline{n}}}\big],$$

$${\rm tr}\big[J_{N}^{m_{1}}(J_{N}^{*})^{n_{1}}\cdots J_{N}^{m_{2k}}(J_{N}^{*})^{n_{2k}}\big]=0$$

$$\lim_{N\to\infty}\frac{1}{N}{\rm tr}\big[J_{N}^{m_{1}}(J_{N}^{*})^{n_{1}}\cdots J_{N}^{m_{2k}}(J_{N}^{*})^{n_{2k}}\big]=1$$

$$\mathcal{L}_{L_{N}^{T+\Delta}}(z)\to\mathcal{L}_{\mu_{\bm{a}}}(z),\qquad\text{in probability}.$$

$$\mathcal{L}_{L_{N}^{T+N^{-\gamma}E}}(z)\to\mathcal{L}_{\mu_{\bm{a}}}(z),\qquad\text{in probability}.$$

$$P_{k}(z):=\sum_{\substack{X,Y\subset[N]\\ |X|=|Y|=k}}(-1)^{{\rm sgn}(\sigma_{X}){\rm sgn}(\sigma_{Y})}\det(T_{N}(z)[X^{c};Y^{c}])\cdot\det(\Delta_{N}[X;Y]),$$

$$\mathcal{S}_{d_{1},d_{2}}:=\Big\{(i,j)\in[N]\times[N]:i-j\notin\{-(N-\ell);\ \ell=1,2,\ldots,d_{1}\}\cup\{N-\ell;\ \ell=1,2,\ldots,d_{2}\}\Big\}.$$

$$(\Delta_{N})_{i,j}\neq 0\quad\text{only if}\quad i-j\in\mathcal{S}_{d_{1},d_{2}}^{c},$$

$$\mathcal{S}_{d}:=\Big\{z\in B_{\mathbb{C}}(0,R):d_{1}-d_{0}(z)=d\ \text{ and }\ |\lambda_{d_{0}(z)}(z)|>1>|\lambda_{d_{0}(z)+1}(z)|\Big\}.$$

$$B_{\mathbb{C}}(0,R)\setminus\Big(\bigcup_{\ell=-d_{2}}^{d_{1}}\mathcal{S}_{\ell}\Big)\subset\Big\{z\in B_{\mathbb{C}}(0,R):P_{z,{\bm{a}}}(\lambda)=0\ \text{ for some }\lambda\in\mathbb{S}^{1}\Big\}.$$


Full text

Spectrum of random perturbations of Toeplitz matrices with finite symbols

Anirban Basak∗

∗International Center for Theoretical Sciences

Tata Institute of Fundamental Research

Bangalore 560089, India

and

Department of Mathematics, Weizmann Institute of Science

POB 26, Rehovot 76100, Israel


Elliot Paquette‡

‡Department of Mathematics, The Ohio State University

Tower 100, 231 W 18th Ave, Columbus, Ohio 43210, USA

and

Ofer Zeitouni§

§Department of Mathematics, Weizmann Institute of Science

POB 26, Rehovot 76100, Israel

and

Courant Institute, New York University

251 Mercer St, New York, NY 10012, USA

(Date: December 14, 2018. Revised November 7, 2019.)

Abstract.

Let $T_{N}$ denote an $N\times N$ Toeplitz matrix with finite, $N$-independent symbol ${\bm{a}}$. For $E_{N}$ a noise matrix satisfying mild assumptions (ensuring, in particular, that ${N^{-1/2}\|E_{N}\|_{{\rm HS}}}\to_{N\to\infty}0$ at a polynomial rate), we prove that the empirical measure of eigenvalues of $T_{N}+E_{N}$ converges to the law of ${\bm{a}}(U)$, where $U$ is uniformly distributed on the unit circle in the complex plane. This extends results from [2] to the non-triangular setup and to noise that is not complex Gaussian, and confirms predictions obtained in [16] using the notion of pseudospectrum.

1. Introduction

Let $\mathbb{S}^{1}:=\{z\in\mathbb{C}:|z|=1\}$ denote the unit circle in the complex plane. Let ${\bm{a}}:\mathbb{S}^{1}\mapsto\mathbb{C}$ be a function given by

$${\bm{a}}(\lambda):=\sum_{k=-\infty}^{\infty}a_{k}\lambda^{k},\qquad\lambda\in\mathbb{S}^{1},$$

where $\{a_{k}\}_{k=-\infty}^{\infty}$ is an absolutely summable complex valued sequence. We denote by $T_{N}:=T_{N}({\bm{a}})$ the Toeplitz matrix of dimension $N\times N$ with symbol ${\bm{a}}$ , given by

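$$T_{N}:=\begin{bmatrix}a_{0}&a_{1}&a_{2}&\cdots&\cdots&a_{N-1}\\ a_{-1}&a_{0}&a_{1}&\ddots&&\vdots\\ a_{-2}&a_{-1}&\ddots&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&a_{1}&a_{2}\\ \vdots&&\ddots&a_{-1}&a_{0}&a_{1}\\ a_{-(N-1)}&\cdots&\cdots&a_{-2}&a_{-1}&a_{0}\end{bmatrix}.$$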

From the definition it is clear that when ${\bm{a}}$ is a Laurent polynomial, i.e.

$${\bm{a}}(\lambda)=\sum_{k=-d_{2}}^{d_{1}}a_{k}\lambda^{k},\qquad\text{for some }d_{1},d_{2}\geq 0,$$

then $T_{N}$ is a (finitely) banded Toeplitz matrix which can be thought of as a piece from an infinite Toeplitz matrix; we refer to such matrices as Toeplitz matrices with finite symbols.
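The banded structure can be made concrete with a short sketch. It is only an illustration and not part of the paper: the helper name `toeplitz_from_symbol` and the dictionary encoding of the coefficients are assumptions, as is the indexing convention that the $(i,j)$ entry of $T_{N}$ equals $a_{j-i}$.

```python
def toeplitz_from_symbol(coeffs, n):
    """Return the n x n Toeplitz matrix whose (i, j) entry is a_{j-i}.

    coeffs maps an integer k to the Laurent coefficient a_k; missing keys
    are treated as zero, so the matrix is banded when coeffs is finite.
    """
    return [[coeffs.get(j - i, 0) for j in range(n)] for i in range(n)]


# Symbol a(lambda) = lambda + lambda^2, i.e. d_1 = 2 and d_2 = 0.
T = toeplitz_from_symbol({1: 1, 2: 1}, 5)
for row in T:
    print(row)
```

With this symbol only the first two superdiagonals are non-zero, so the matrix is upper triangular and banded.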

For any $N\times N$ matrix $A_{N}$ we denote the empirical measure of its eigenvalues, also called its empirical spectral distribution (esd), by $L_{N}^{A}$. That is,

$$L_{N}^{A}:=\frac{1}{N}\sum_{i=1}^{N}\delta_{\lambda_{i}},$$

where $\lambda_{1},\lambda_{2},\ldots,\lambda_{N}$ are the eigenvalues of $A_{N}$ . In this paper, we find the limit of the empirical spectral distribution (esd) of random perturbations of Toeplitz matrices with finite symbols. This generalizes those results in [2] that deal with triangular Toeplitz matrices with finite symbols (and also with twisted Toeplitz matrices, which we cannot generalize to the non-triangular case, see Remarks 1.7 and 1.9 below). In contrast with [2], we allow for rather general perturbations, as codified in Assumption 1.1.

Assumption 1.1.

Let $\{E_{N}\}_{N\in\mathbb{N}}$ be a sequence of matrices, with possibly complex valued entries, such that the following hold:

(i)

$$\mathbb{E}\Big[\sum_{i,j=1}^{N}|e_{i,j}|^{2}\Big]=O(N^{2}),$$

where $\{e_{i,j}\}_{i,j=1}^{N}$ are the entries of $E_{N}$.

(ii)

For any $\alpha\in(0,\infty)$ , there exists a $\beta\in(0,\infty)$ , depending only on $\alpha$ , so that for any fixed deterministic matrix $M_{N}$ with $\|M_{N}\|=O(N^{\alpha})$ , we have

$$\mathbb{P}\big(s_{\min}(E_{N}+M_{N})\leq N^{-\beta}\big)=o(1).$$

Let $\mu_{\bm{a}}$ denote the law of ${\bm{a}}(U)$ , where $U$ is a random variable uniformly distributed on the unit circle in the complex plane. Equipped with Assumption 1.1 we now state the main result of this paper.

Theorem 1.2.

Let $T_{N}$ be any $N\times N$ Toeplitz matrix with a symbol ${\bm{a}}$ , where ${\bm{a}}$ is a Laurent polynomial. Assume that $\{E_{N}\}_{N\in\mathbb{N}}$ satisfy Assumption 1.1. Then, for any $\gamma>\frac{1}{2}$ , the esd $L_{N}^{T+N^{-\gamma}E}$ of $T_{N}+N^{-\gamma}E_{N}$ converges weakly, in probability, to $\mu_{\bm{a}}$ .
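To get a feel for the limit $\mu_{\bm{a}}$ appearing in Theorem 1.2, one can evaluate the symbol along the unit circle. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the symbol ${\bm{a}}(\lambda)=\lambda+\lambda^{2}$ is just an example, and an equally spaced grid stands in for the uniform random variable $U$.

```python
import cmath
import math


def symbol(coeffs, lam):
    """Evaluate the Laurent polynomial a(lam) = sum_k a_k * lam**k."""
    return sum(a_k * lam ** k for k, a_k in coeffs.items())


def sample_limit_law(coeffs, num_points):
    """Return a(U) at num_points equally spaced points U on the unit circle."""
    return [
        symbol(coeffs, cmath.exp(2j * math.pi * t / num_points))
        for t in range(num_points)
    ]


points = sample_limit_law({1: 1, 2: 1}, 8)  # a(lambda) = lambda + lambda^2
for p in points:
    print(p)
```

Plotting such points for a fine grid traces the closed curve ${\bm{a}}(\mathbb{S}^{1})$ on which $\mu_{\bm{a}}$ is supported.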

Assumption 1.1(i) holds as soon as the second moment of each of the entries (of both complex and real parts) is uniformly bounded. By [15, Theorem 2.1], whenever the entries of $E_{N}$ are i.i.d. (complex or real) with common $N$ -independent distribution having a finite variance, Assumption 1.1(ii) holds. Therefore, Theorem 1.2 holds in that setup. In the next remark, we summarize other cases where Assumption 1.1, and hence Theorem 1.2, hold.

Remark 1.3.

Assumption 1.1 holds under various relaxed assumptions on the noise matrix $E_{N}$ , which we list below.

(1)

When the entries of $E_{N}$ are independent and dominated by a single distribution (in the Fourier-analytic sense) that has a $\kappa$ -controlled second moment for some $\kappa>0$ , see [15, Definition 2.2 and Remark 2.8].

(2)

When the entries of $E_{N}$ are independent, satisfy a uniform anti-concentration bound near zero, and have a uniform lower bound on the truncated variance, see [4, Lemma A.1]. Furthermore, [15, Theorem 2.9] and [4, Lemma A.1] allow $E_{N}$ to be a sparse random matrix.

(3)

When the entries of $E_{N}$ have an inhomogeneous variance profile satisfying appropriate assumptions, by a recent result of Cook [6]. Specifically, by [6, Theorem 1.24], the assumption is satisfied when the variance profile is super-regular, see [6, Definition 1.23] for a precise formulation.

(4)

When $E_{N}=\sqrt{N}U_{N}$ , where $U_{N}$ is a Haar distributed unitary matrix, see [18, Theorem 1.1].

Remark 1.4.

We believe that the sequence $N^{-\gamma}$ in Theorem 1.2 can be replaced by any sequence $\mathfrak{a}_{N}$ satisfying $\sqrt{N}\mathfrak{a}_{N}\to_{N\to\infty}0$ . We chose to work with $N^{-\gamma}$ in order to somewhat simplify the proofs.

Remark 1.5.

A general notion developed to deal with perturbations of non-normal matrices is that of pseudospectrum, see [17] for an extensive review. This notion provides worst-case estimates and does not focus on the evaluation of limits of empirical measures under random perturbation. However, Theorem 1.2 is consistent with predictions based on pseudospectrum. For a thorough discussion of how pseudospectrum relates to Theorem 1.2, see [2, Section 1.3] and [16].

Our approach to the proof of Theorem 1.2 differs from the one employed in [2], which derived a deterministic equivalence that worked only for complex i.i.d. Gaussian perturbations (in particular, even real Gaussian perturbations are not covered by [2]). Instead, our approach is based on a perturbation idea that can be traced back in this context to [9]. See Section 1.1 below for a further discussion on this.

To describe the approach of this paper we first recall the important notion of logarithmic potential associated with a probability measure $\mu$ .

Definition 1.1 (Log-potential).

For a probability measure $\mu$ supported on the complex plane define its log-potential as follows:

$$\mathcal{L}_{\mu}(z):=\int\log|z-x|\,d\mu(x),\qquad z\in\mathbb{C}.$$

As a first step, we will show that there exists a random matrix $\Delta_{N}$ , with a polynomially decaying spectral norm, such that the conclusion of Theorem 1.2 holds with $N^{-\gamma}E_{N}$ replaced by $\Delta_{N}$ .

Theorem 1.6.

Let $T_{N}$ be any $N\times N$ Toeplitz matrix with a symbol ${\bm{a}}$ , where ${\bm{a}}$ is a Laurent polynomial. Then, there exists a random matrix $\Delta_{N}$ with

$$\mathbb{P}\big(\|\Delta_{N}\|\geq N^{-\gamma_{0}}\big)=o(1), \tag{1.1}$$

for some $\gamma_{0}>0$ , so that $L_{N}^{T+\Delta}$ converges weakly, in probability, to $\mu_{\bm{a}}$ . Equivalently, for Lebesgue almost every $z\in\mathbb{C}$ , ${\mathcal{L}}_{L_{N}^{T+\Delta}}(z)\to{\mathcal{L}}_{\mu_{\bm{a}}}(z)$ , in probability.

Remark 1.7.

We do not know the analogue of Theorem 1.6 for the twisted Toeplitz matrices considered in [2], and their non-triangular generalizations. For this reason, we cannot extend Theorem 1.2 to the general banded twisted case. See however Remark 1.9 below for the case of upper triangular twisted Toeplitz matrices.

We next state the replacement principle alluded to above. Here and in the sequel, $B_{\mathbb{C}}(c,R)$ denotes the open ball in the complex plane of center $c$ and radius $R$ .

Theorem 1.8 (Replacement principle).

Let $A_{N}$ be any deterministic matrix with a bounded operator norm. Suppose $\Delta_{N}$ and $E_{N}$ are random matrices. Let $\mu$ be a probability measure on $\mathbb{C}$ whose support is contained in $B_{\mathbb{C}}(0,R_{0}/2)$ for some $R_{0}<\infty$ . Assume the following.

(a)

$E_{N}$ and $\Delta_{N}$ are independent. $\Delta_{N}$ satisfies (1.1) and $E_{N}$ satisfies Assumption 1.1.

(b)

For Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ , the empirical distribution of the singular values of $A_{N}-z\operatorname{Id}_{N}$ converges weakly to the law induced by $|X-z|$ , where $X\sim\mu$ .

(c)

For Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ ,

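$$\mathcal{L}_{L_{N}^{A+\Delta}}(z)\to\mathcal{L}_{\mu}(z),\qquad\text{as }N\to\infty,\ \text{in probability}.$$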

Then, for any $\gamma>\frac{1}{2}$ , for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ ,

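$$\mathcal{L}_{L_{N}^{A+N^{-\gamma}E}}(z)\to\mathcal{L}_{\mu}(z),\qquad\text{as }N\to\infty,\ \text{in probability}.$$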

Theorem 1.8 is a generalization of the replacement lemma in [9, Theorem 5], with the advantage that it allows for more general noise models $E_{N}$ and that it is stated directly in terms of logarithmic potentials and avoids the need to realize the $*$ -limit of $A_{N}$ as a regular element of a non-commutative probability space. It may be of independent interest beyond the study of perturbations of Toeplitz matrices.

Remark 1.9.

Theorem 1.8 shows that [2, Theorem 4.1] remains true if one replaces there the complex Gaussian noise $G_{N}$ by a noise $E_{N}$ satisfying Assumption 1.1. This can be seen by using in Theorem 1.8 $\Delta_{N}=N^{-\gamma}G_{N}$ , and using [2, Lemma 4.6] to verify condition (b) of the theorem.

1.1. Related results and extensions

The study of the limiting esd of random perturbations of Toeplitz matrices can be traced back to [7] where in the simplest case of ${\bm{a}}(\lambda)=\lambda$ , i.e. when the Toeplitz matrix is the standard Jordan matrix, they derive the limit by studying a relevant Grushin problem. On the other hand [9] derives the limit in the same set-up by first analyzing the limit of the log-potential of the esd of a specific (deterministic) perturbation of the Jordan matrix. Then they use an argument similar in spirit to that of Theorem 1.8 which allows them to replace that specific perturbation by a polynomially vanishing Gaussian perturbation. When the Toeplitz matrix is non-triangular with an arbitrary symbol it is not straightforward to find the required perturbation. Furthermore, it is not clear whether there exists at all some deterministic perturbation allowing one to apply [9, Theorem 5]. Theorem 1.6 of this paper shows that one can indeed find a random perturbation which does that job. Moreover, instead of appealing to [9, Theorem 5] we use Theorem 1.8 which enables us to consider a broad class of random perturbations.
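For the Jordan matrix, the effect of a tiny perturbation can be checked by hand. The sketch below is an illustration only: it assumes a deterministic perturbation consisting of a single entry $\delta$ in the lower left corner, which is a natural choice but is not claimed to be the one used in [9]. For that matrix every solution of $\lambda^{N}=\delta$ is an eigenvalue with eigenvector $(1,\lambda,\ldots,\lambda^{N-1})$, and the code verifies this numerically.

```python
import cmath
import math


def corner_perturbed_jordan(n, delta):
    """Jordan block (ones on the superdiagonal) plus delta in entry (n, 1)."""
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        mat[i][i + 1] = 1.0
    mat[n - 1][0] = delta
    return mat


def eigenpair(n, delta, k):
    """k-th solution of lam**n = delta with vector (1, lam, ..., lam**(n-1))."""
    lam = delta ** (1.0 / n) * cmath.exp(2j * math.pi * k / n)
    return lam, [lam ** i for i in range(n)]


def residual(mat, lam, vec):
    """Largest entry of |mat @ vec - lam * vec|."""
    n = len(vec)
    return max(
        abs(sum(mat[i][j] * vec[j] for j in range(n)) - lam * vec[i])
        for i in range(n)
    )


n = 100
delta = n ** -1.0  # polynomially small corner entry
mat = corner_perturbed_jordan(n, delta)
worst = max(residual(mat, *eigenpair(n, delta, k)) for k in range(n))
radius = delta ** (1.0 / n)  # common modulus of all n eigenvalues
print(worst, radius)
```

The unperturbed Jordan matrix has the single eigenvalue zero, whereas here all $N$ eigenvalues sit on a circle of radius $\delta^{1/N}$, which tends to one when $\delta$ decays only polynomially in $N$. This matches the limit ${\bm{a}}(U)=U$ for the symbol ${\bm{a}}(\lambda)=\lambda$.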

Recently in [2] the limiting spectral distribution of Gaussian perturbations of triangular Toeplitz matrices has been derived by adopting a different strategy. The key to the proof in [2] lies in the following observation: If for Lebesgue a.e. $z\in\mathbb{C}$ the number of polynomially small singular values of $M_{N}-z\operatorname{Id}_{N}$ is not too large, where $\{M_{N}\}_{N\in\mathbb{N}}$ is some sequence of matrices and $\operatorname{Id}_{N}$ is the identity matrix, then the limiting esd of Gaussian perturbations of $M_{N}$ can be described by the Brown measure associated with the limiting operator. So it boils down to finding estimates on the number of small singular values. When $M_{N}=T_{N}$ , a triangular Toeplitz (or a twisted Toeplitz) matrix, this task has been accomplished in [2]. If $T_{N}$ is a non-triangular matrix then the approach to finding bounds on the number of small singular values that is used in [2] fails.

Let us add that recent works of Sjöstrand and Vogel [12, 13] also deal with the limiting spectrum of Gaussian perturbations of general Toeplitz matrices. They use yet another strategy which is similar in spirit to the one adopted in [7]. In particular, their methods are robust enough that in [13] they apply them to Toeplitz operators with unbounded symbols.

There are several possible extensions of this paper that one can pursue. For example, one may be interested in understanding finer details of the spectrum, such as the behavior of the outliers of random perturbations of Toeplitz matrices. Building on the ideas of this paper the behavior of the outliers has been studied in a follow-up work [3].

Another interesting question would be to study the limiting esd of random perturbations of Toeplitz matrices with infinite symbols; as mentioned above, for certain perturbations this was achieved in [13]. A careful inspection of the proof of Theorem 1.2 of this paper reveals that one can build on the strategies developed in this paper to consider the case of Toeplitz matrices with a slowly growing bandwidth. For ease of writing and explanation we chose to work with a fixed bandwidth. The case of Toeplitz matrices with a general infinite symbol is at present beyond the scope of our methods.

Outline of the rest of the paper

We will show in Section 3 that Theorem 1.2 is an immediate consequence of Theorems 1.6 and 1.8. In Section 2 we provide the outlines of the proofs of Theorems 1.6 and 1.8. The proofs of these two theorems are carried out in Sections 4 and 5, respectively. Appendix A contains some algebraic results that are used in the proofs.

Acknowledgements

AB is partially supported by a Start-up Research Grant (SRG/2019/001376) from Science and Engineering Research Board of Govt. of India, and ICTS–Infosys Excellence Grant. OZ is partially supported by Israel Science Foundation grant 147/15 and funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement number 692452). We thank the anonymous referees for helpful comments that enhanced the presentation of this paper.

2. Outlines of proofs of Theorems 1.6 and 1.8

We begin with an outline of the proof of Theorem 1.6. From [14, Theorem 2.8.3] and the fact that the support of $\mu_{\bm{a}}$ is compact, it suffices to show that for Lebesgue a.e. $z$ in some large compact subset of the complex plane, $\mathcal{L}_{L_{N}^{T+\Delta}}(z)\to\mathcal{L}_{\mu_{{\bm{a}}}}(z)$ in probability. Toward this goal, it is useful first to obtain a different representation of the limit.

Lemma 2.1.

Let

$${\bm{a}}(\lambda)=\sum_{k=-d_{2}}^{d_{1}}a_{k}\lambda^{k},$$

for some $d_{1},d_{2}\in\mathbb{N}$ . For any $z\in\mathbb{C}$ let $\lambda_{1}(z),\lambda_{2}(z),\ldots,\lambda_{d}(z)$ be the roots of the polynomial equation

$$P_{z,{\bm{a}}}(\lambda):=({\bm{a}}(\lambda)-z)\cdot\lambda^{d_{2}}=0, \tag{2.1}$$

where $d:=d_{1}+d_{2}$ . Then, for any $z\in\mathbb{C}$ ,

$$\mathcal{L}_{\mu_{\bm{a}}}(z)=\log|a_{d_{1}}|+\sum_{k=1}^{d}\log_{+}|\lambda_{k}(z)|,$$

where for $x\geq 0$ , $\log_{+}(x):=\max\{\log x,0\}$ .

The proof of Lemma 2.1 is a straightforward modification of that of [2, Lemma 4.3]. We omit the details.
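Lemma 2.1 can be sanity-checked numerically. The sketch below is an illustration rather than the omitted proof: it fixes the example symbol ${\bm{a}}(\lambda)=\lambda+\lambda^{2}$ (so $d_{1}=2$, $d_{2}=0$, $a_{d_{1}}=1$), approximates the integral defining the log-potential by an average over an equally spaced grid on the circle, and compares it with the root formula. The function names and the grid size are assumptions made for this example.

```python
import cmath
import math


def log_potential_numeric(z, num_points=4096):
    """Grid average of log|z - a(e^{i theta})| for a(lam) = lam + lam^2."""
    total = 0.0
    for t in range(num_points):
        lam = cmath.exp(2j * math.pi * t / num_points)
        total += math.log(abs(z - (lam + lam * lam)))
    return total / num_points


def log_potential_roots(z):
    """Root formula: log|a_2| + sum of log_+ |root|, where a_2 = 1."""
    disc = cmath.sqrt(1 + 4 * z)
    roots = [(-1 + disc) / 2, (-1 - disc) / 2]  # roots of lam^2 + lam - z
    # log_+(x) = log(max(x, 1)), which also avoids log(0) for a zero root.
    return sum(math.log(max(abs(r), 1.0)) for r in roots)


for z in (3.0, 0.1, -0.5 + 0.2j):
    print(z, log_potential_numeric(z), log_potential_roots(z))
```

The three test points have two, one and zero roots outside the unit disc, respectively, so they exercise each case of the formula. Points $z$ on the curve ${\bm{a}}(\mathbb{S}^{1})$ are avoided because the integrand is singular there.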

We next sketch the proof of Theorem 1.6, in the special case where $T_{N}$ is the Toeplitz matrix with symbol ${\bm{a}}(\lambda)=\lambda+\lambda^{2}$ . Set $T_{N}(z):=T_{N}-z\operatorname{Id}_{N}$ where $z\in\mathbb{C}$ . By Lemma 2.1, the form of the limiting log-potential depends on the number of roots of the polynomial $P_{z,{\bm{a}}}(\lambda)$ greater than one in modulus. This yields (open) regions $\mathcal{R}_{\ell}\subset\mathbb{C}$ , $\ell=0,1,2$ , whose boundaries have zero Lebesgue measure and the closure of whose union is $\mathbb{C}$ , so that for all $z\in{\mathcal{R}}_{\ell}$ there are exactly $\ell$ roots of the equation $\lambda+\lambda^{2}-z=0$ that are greater than one in modulus. Thus, to establish Theorem 1.6 we need to find a noise matrix $\Delta_{N}$ such that the following holds:

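$$\lim_{N\to\infty}\mathcal{L}_{L_{N}^{T+\Delta}}(z)=\lim_{N\to\infty}\frac{1}{N}\log|\det(T_{N}(z)+\Delta_{N})|=\left\{\begin{array}{ll}\log|\lambda_{1}(z)|+\log|\lambda_{2}(z)|&\mbox{ if }z\in\mathcal{R}_{2}\\ \log|\lambda_{1}(z)|&\mbox{ if }z\in\mathcal{R}_{1}\\ 0&\mbox{ if }z\in\mathcal{R}_{0}\end{array}\right., \tag{2.2}$$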

where $\lambda_{1}(z)$ and $\lambda_{2}(z)$ are the roots of the relevant equation arranged in the non-increasing order of their moduli. We refer the reader to Figure 1 for an illustration of the regions $\mathcal{R}_{\ell},\,\ell=0,1,2$ .
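The regions $\mathcal{R}_{\ell}$ are easy to probe numerically for this symbol, since the relevant equation is quadratic. The helper below is a hypothetical illustration (its name is not from the paper): it counts the roots of $\lambda^{2}+\lambda-z=0$ lying strictly outside the unit disc.

```python
import cmath


def region_index(z):
    """Number of roots of lam^2 + lam - z = 0 with modulus greater than one."""
    disc = cmath.sqrt(1 + 4 * z)
    roots = ((-1 + disc) / 2, (-1 - disc) / 2)
    return sum(1 for r in roots if abs(r) > 1)


for z in (3.0, 0.1, -0.5):
    print(z, "lies in R_%d" % region_index(z))
```

Points on the boundaries between regions, where some root has modulus exactly one, are not handled specially. They form a set of zero Lebesgue measure and are numerically unstable to classify.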

(We will see later that it is enough to consider the noise $\Delta_{N}$ supported on the lower left elements $\Delta_{N}(N,1),\Delta_{N}(N,2),\Delta_{N}(N-1,1)$ .)

To derive (2.2), we expand the determinant of $T_{N}(z)+\Delta_{N}$ . The latter can be written as a linear combination of products of determinants of various sub-matrices of $T_{N}(z)$ and $\Delta_{N}$ (see Lemma A.1 below). We identify the dominant term in this expansion, as follows. Let $A_{N}[X;Y]$ denote the sub-matrix of $A_{N}$ induced by the rows and the columns indexed by $X$ and $Y$ , respectively. Recalling Widom’s theorem concerning the determinant of a finitely banded Toeplitz matrix (see [11, 19]), we obtain

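$$\det(T_{N}(z)[X;Y])\sim\left\{\begin{array}{ll}|\lambda_{1}(z)|^{N}\cdot|\lambda_{2}(z)|^{N}&\mbox{ if }X=Y=[N]\\ |\lambda_{1}(z)|^{N}&\mbox{ if }X=[N-1],\,Y=[N]\setminus\{1\}\\ 1&\mbox{ if }X=[N-2],\ Y=[N]\setminus\{1,2\}\end{array}\right., \tag{2.3}$$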

where we write $a_{N}\sim b_{N}$ to indicate that there exists some absolute constant $C>0$ such that $N^{-C}a_{N}\leq b_{N}\leq N^{C}a_{N}$ , for all large $N$ .

From (2.3) we see that if $z\in\mathcal{R}_{0}$ or $\mathcal{R}_{1}$ then there are sub-matrices of $T_{N}(z)$ whose determinants are of larger magnitude than that of $T_{N}(z)$ . We also note that the expansion of $\det(T_{N}(z)+\Delta_{N})$ has terms that are products of determinants of these sub-matrices and the determinant of relevant sub-matrices of the noise matrix $\Delta_{N}$ (of fixed dimension), where the latter can be chosen to be non-zero and only polynomially (in $N$ ) decaying. It follows that if the determinants of those sub-matrices of $T_{N}(z)$ are of maximal exponential growth among the determinants of all possible sub-matrices of $T_{N}(z)$ , then $\frac{1}{N}\log|\det(T_{N}(z)+\Delta_{N})|$ converges to the limit in (2.2). This not only explains how the limit arises but also identifies potential candidates for the dominant terms (depending on the location of $z$ in the complex plane) in the expansion of the determinant, and gives a heuristic for the proof of Theorem 1.6.

To justify this heuristic and obtain an actual proof of Theorem 1.6 in the case under consideration, it is natural to extend (2.3) and claim that

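$$\sum_{\ell\neq k}P_{\ell}(z)=o\Big(N^{-C}\cdot\prod_{j=1}^{k}|\lambda_{j}(z)|^{N}\Big)=\Omega\big(|P_{k}(z)|\big),\qquad z\in\mathcal{R}_{k},\ k=0,1,2, \tag{2.4}$$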

for some large absolute constant $C$ , with large probability, where $P_{\ell}(z)$ is the homogeneous polynomial of degree $\ell$ in the entries of $\Delta_{N}$ , in the expansion of the determinant of $T_{N}(z)+\Delta_{N}$ . In (2.4) we have used the standard notations $a_{n}=o(b_{n})$ and $a_{n}=\Omega(b_{n})$ to denote $\lim_{n\to\infty}a_{n}/b_{n}=0$ and $\liminf_{n\to\infty}a_{n}/b_{n}>0$ , respectively. Finding bounds on $P_{\ell}(z)$ requires the same for $\det(T_{N}(z)[X;Y])$ for all subsets $X,Y\subset[N]$ such that $|X|=|Y|=N-\ell$ . As $T_{N}(z)[X;Y]$ is not necessarily a Toeplitz matrix for arbitrary choices of $X,Y\subset[N]$ we can no longer rely on Widom’s result. We overcome this obstacle by noting that any upper triangular finitely banded Toeplitz matrix $T_{N}$ can be represented as a product of bidiagonal matrices, where the bidiagonal matrices depend on the roots of the polynomial equation associated with the symbol of the Toeplitz matrix in question. Since the determinant of any sub-matrix of a bidiagonal matrix is easily computable (see Lemma A.3) one can then use the Cauchy-Binet theorem to find a bound on $\det(T_{N}(z)[X;Y])$ . Using this and some combinatorial arguments, we then obtain the desired bound on $P_{\ell}(z)$ whenever the entries of $\Delta_{N}$ are uniformly polynomially vanishing.
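The bidiagonal factorization can be verified directly in the running example. The sketch below is an illustration under explicit assumptions: it takes ${\bm{a}}(\lambda)=\lambda+\lambda^{2}$, writes $T_{N}(z)=J_{N}+J_{N}^{2}-z\operatorname{Id}_{N}$ with $J_{N}$ the shift having ones on the superdiagonal, and checks that this equals $(J_{N}-\lambda_{1}\operatorname{Id}_{N})(J_{N}-\lambda_{2}\operatorname{Id}_{N})$, where $\lambda_{1},\lambda_{2}$ are the roots of $\lambda^{2}+\lambda-z$. The precise form of the factors used in the paper's proof may differ; the helper names are hypothetical.

```python
import cmath


def shift_power(n, m):
    """J_N^m: ones on the m-th superdiagonal, with (J_N)_{i,j} = 1 if j = i + 1."""
    return [[1.0 if j - i == m else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square matrix product for lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bidiagonal(n, root):
    """J_N - root * Id_N, an upper bidiagonal matrix."""
    return [[(1.0 if j == i + 1 else 0.0) - (root if i == j else 0.0)
             for j in range(n)] for i in range(n)]


n, z = 6, 0.3 + 0.2j
disc = cmath.sqrt(1 + 4 * z)
roots = ((-1 + disc) / 2, (-1 - disc) / 2)  # roots of lam^2 + lam - z

# T_N(z) = J + J^2 - z * Id for the symbol lam + lam^2.
j1, j2 = shift_power(n, 1), shift_power(n, 2)
t_z = [[j1[i][k] + j2[i][k] - (z if i == k else 0.0) for k in range(n)]
       for i in range(n)]

product = matmul(bidiagonal(n, roots[0]), bidiagonal(n, roots[1]))
max_gap = max(abs(product[i][k] - t_z[i][k]) for i in range(n) for k in range(n))
print(max_gap)
```

The identity holds because upper triangular Toeplitz matrices are polynomials in $J_{N}$, so factoring the polynomial factors the matrix.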

We emphasize that the approach described above generalizes easily to any triangular finitely banded Toeplitz matrix. The general case requires a modification, since non-triangular Toeplitz matrices cannot be decomposed into a product. We resolve this issue by using the following simple key observation: any Toeplitz matrix with finite symbol can be viewed as a sub-matrix of an upper triangular Toeplitz matrix of slightly larger dimension with another finite symbol. Using this observation, we can then follow the same scheme as described above to find an upper bound on $P_{\ell}(z)$ .
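The embedding observation can be illustrated concretely. The sketch below rests on an assumption about how the embedding is realized: it uses the upper triangular Toeplitz matrix of dimension $N+d_{2}$ with symbol $\lambda^{d_{2}}{\bm{a}}(\lambda)$ and recovers $T_{N}$ from its first $N$ rows and last $N$ columns. The coefficients and helper names are hypothetical, and entries are indexed so that the $(i,j)$ entry is $a_{j-i}$.

```python
def toeplitz(coeffs, n):
    """n x n Toeplitz matrix with (i, j) entry a_{j-i}; missing coefficients are zero."""
    return [[coeffs.get(j - i, 0) for j in range(n)] for i in range(n)]


def embed_in_upper_triangular(coeffs, n):
    """Return (big, d2): an upper triangular Toeplitz matrix of size n + d2
    with symbol lam**d2 * a(lam), where -d2 is the lowest power in a."""
    d2 = max(0, -min(coeffs))
    shifted = {k + d2: a_k for k, a_k in coeffs.items()}
    return toeplitz(shifted, n + d2), d2


coeffs = {-2: 5, -1: 4, 0: 3, 1: 2}  # hypothetical symbol with d_1 = 1, d_2 = 2
n = 6
big, d2 = embed_in_upper_triangular(coeffs, n)

# Rows 0..n-1 and columns d2..n+d2-1 of the big matrix recover T_N.
sub = [row[d2:d2 + n] for row in big[:n]]
print(sub == toeplitz(coeffs, n))
```

Multiplying the symbol by $\lambda^{d_{2}}$ is the same shift that turns ${\bm{a}}(\lambda)-z$ into the polynomial $P_{z,{\bm{a}}}(\lambda)$.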

To complete the proof of (2.4) we then need to find a lower bound on the predicted dominant term, $P_{k}(z)$ . This is obtained using an anti-concentration estimate, which is shown to hold whenever the entries of $\Delta_{N}$ are assumed to have a bounded density, which we will impose since the matrix $\Delta_{N}$ is an auxiliary matrix and does not appear in the statement of our main theorem, Theorem 1.2. See Lemma 4.1 and Proposition 4.5. This will prove (2.4). To finish the proof of Theorem 1.6, we then obtain an (easy) matching upper bound on $P_{k}(z)$ .

We next outline the proof of Theorem 1.8. It suffices to show that for Lebesgue a.e. $z$ in a compact subset of $\mathbb{C}$ ,

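$$\lim_{N\to\infty}\Big(\mathcal{L}_{L_{N}^{A+\Delta}}(z)-\mathcal{L}_{L_{N}^{A+N^{-\gamma}E}}(z)\Big)=0,\qquad\text{in probability}. \tag{2.5}$$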

Using the assumptions of Theorem 1.8 and standard perturbation results for the spectrum of Hermitian matrices, it readily follows that $\nu_{A_{N}+\Delta_{N}}^{z}$ and $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}$ , the empirical distributions of the singular values of $A_{N}(z)+\Delta_{N}$ and $A_{N}(z)+N^{-\gamma}E_{N}$ , respectively, have the same limit, and that limit is $\mu_{z}$ , the law of $|X-z|$ where $X\sim\mu$ . As $\log(\cdot)$ is unbounded both near $0$ and $\infty$ , the limit in (2.5) is not immediate from this. Using bounds on the Hilbert-Schmidt norms of the relevant matrices the singularity near $\infty$ can be taken care of. Treating the singularity of $\log(\cdot)$ near $0$ involves two steps. As the integral of $\log(\cdot)$ near zero, with respect to $\mu_{z}$ is negligible, using assumptions (b)-(c) the same can be shown to hold for $\nu^{z}_{A_{N}+\Delta_{N}}$ . Hence, it suffices to show that the integral of $\log(\cdot)$ on the interval $(0,\varepsilon)$ with respect to $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}$ goes to zero as $\varepsilon\downarrow 0$ .

The latter is obtained by standard arguments, as follows. We use Assumption 1.1(ii) to deduce that it is enough to integrate $\log(\cdot)$ in $(N^{-\kappa_{\star}},\varepsilon)$ for some small constant $\kappa_{\star}$ . Now, using bounds on Hilbert-Schmidt norms of $E_{N}$ and $\Delta_{N}$ one can derive a bound on the difference of the Stieltjes transforms of $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}$ and $\nu^{z}_{A_{N}+\Delta_{N}}$ . Using this, one obtains that the difference of the total mass of any interval near zero, under $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}$ and $\nu^{z}_{A_{N}+\Delta_{N}}$ , is negligible. Upon using an integration by parts, this gives the required control on the integral of $\log(\cdot)$ near $0$ under $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}$ and completes the proof.

3. Proof of Theorem 1.2 using Theorems 1.6 and 1.8

We will take $\Delta_{N}$ provided by Theorem 1.6, set $A_{N}=T_{N}$ and $\mu=\mu_{\bm{a}}$ in Theorem 1.8, and verify that the hypotheses of the latter hold. Clearly, $A_{N}$ has uniformly bounded operator norm. The assumption (a) is obvious. To see that assumption (b) holds, it is enough to check that for every positive integer $k$ ,

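$$\lim_{N\to\infty}\frac{1}{N}{\rm tr}\big((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*}\big)^{k}-\mathbb{E}\Big(\Big|z-\sum_{i=-d_{2}}^{d_{1}}a_{i}U^{i}\Big|^{2k}\Big)=0. \tag{3.1}$$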

To check (3.1) we first note that

$$T_{N}=\sum_{m=0}^{d_{1}}a_{m}J_{N}^{m}+\sum_{n=1}^{d_{2}}a_{-n}(J_{N}^{*})^{n}, \tag{3.2}$$

where $J_{N}$ is the nilpotent matrix given by $(J_{N})_{i,j}={\bf 1}_{j=i+1}$ . Using this observation we then expand ${\rm tr}((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*})^{k}$ and find the limit of each term in that expansion. To work out this step we need to introduce some notation.

Let $\underline{m}:=(m_{1},\ldots,m_{2k})$ and $\underline{n}:=(n_{1},\ldots,n_{2k})$ with $n_{i},m_{i}$ non-negative integers bounded by $\max(d_{1},d_{2})$ , and set $M:=M_{\underline{m},\underline{n}}:=\sum m_{i}-\sum n_{i}$ . We say that $(\underline{m},\underline{n})$ is balanced if $M_{\underline{m},\underline{n}}=0$ . Using (3.2) we find that

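$$\frac{1}{N}{\rm tr}\big((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*}\big)^{k}=\frac{1}{N}\sum_{\underline{m},\underline{n}}b_{\underline{m},\underline{n}}(z)\,{\rm tr}\big[J_{N}^{m_{1}}(J_{N}^{*})^{n_{1}}\cdots J_{N}^{m_{2k}}(J_{N}^{*})^{n_{2k}}\big]$$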

for appropriate coefficients $b_{\underline{m},\underline{n}}(z)$ , while

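$$\mathbb{E}\Big(\Big|z-\sum_{i=-d_{2}}^{d_{1}}a_{i}U^{i}\Big|^{2k}\Big)=\sum_{\underline{m},\underline{n}}b_{\underline{m},\underline{n}}(z)\,\mathbb{E}\big[U^{M_{\underline{m},\underline{n}}}\big],$$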

with the same coefficients $b_{\underline{m},\underline{n}}(z)$ . Note that

$${\rm tr}\big[J_{N}^{m_{1}}(J_{N}^{*})^{n_{1}}\cdots J_{N}^{m_{2k}}(J_{N}^{*})^{n_{2k}}\big]=0$$

if $(\underline{m},\underline{n})$ is not balanced, while

[TABLE]

if $(\underline{m},\underline{n})$ is balanced. Similarly, $\mathbb{E}[U^{M_{\underline{m},\underline{n}}}]$ equals $1$ if $(\underline{m},\underline{n})$ is balanced, and vanishes otherwise. Combining these facts, we obtain (3.1), and thus verify that assumption (b) holds.
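For $k=1$ the moment comparison can be verified by hand, and the following sketch does so numerically. It assumes that (3.1) compares $N^{-1}{\rm tr}((z\operatorname{Id}_{N}-T_{N})(z\operatorname{Id}_{N}-T_{N})^{*})^{k}$ with $\mathbb{E}|z-{\bm{a}}(U)|^{2k}$, and it uses the convention $(T_{N})_{i,j}=a_{j-i}$; the symbol coefficients and function names are hypothetical.

```python
def toeplitz(n, coeffs):
    """Toeplitz matrix with entries a_{j-i}; coeffs maps index -> coefficient."""
    return [[coeffs.get(j - i, 0) for j in range(n)] for i in range(n)]


def normalized_trace_moment(n, coeffs, z):
    """(1/n) tr((z - T_n)(z - T_n)^*), i.e. the squared Hilbert-Schmidt norm over n."""
    shifted = dict(coeffs)
    shifted[0] = shifted.get(0, 0) - z  # entries of T_n - z Id (same moduli as z Id - T_n)
    t = toeplitz(n, shifted)
    return sum(abs(x) ** 2 for row in t for x in row) / n


def limit_moment(coeffs, z):
    """E|z - a(U)|^2 for U uniform on the unit circle, by orthogonality of the powers of U."""
    shifted = dict(coeffs)
    shifted[0] = shifted.get(0, 0) - z
    return sum(abs(a) ** 2 for a in shifted.values())


# Hypothetical symbol a(U) = U^{-1} + 0.5 + 2 U^2 and test point z (assumed values).
symbol = {-1: 1.0, 0: 0.5, 2: 2.0}
z = 0.3 + 0.4j
for n in (10, 50, 250):
    print(n, normalized_trace_moment(n, symbol, z), limit_moment(symbol, z))
```

The gap between the two quantities is exactly $N^{-1}\sum_{j}|j|\,|a_{j}|^{2}$, a boundary effect that vanishes as $N\to\infty$; the higher moments behave in the same way, with the balanced terms playing the role of the diagonal.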

Assumption (c) holds because, from Theorem 1.6, we see that for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ ,

[TABLE]

We have checked all assumptions of Theorem 1.8; applying the latter we conclude that for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ and for any $\gamma>\frac{1}{2}$ ,

[TABLE]

By the proof of [14, Theorem 2.8.3] and the fact that the support of $\mu_{\bm{a}}$ is compact, this implies the convergence in probability of $L_{N}^{T+N^{-\gamma}E}$ to $\mu_{\bm{a}}$ in the vague topology, and hence in the weak topology. ∎

4. Proof of Theorem 1.6

In this section we prove Theorem 1.6. As outlined in Section 2, the key is to establish (2.4). Turning to this task, introduce, for any $k\in[N]$ ,

[TABLE]

where ${X}^{c}:=[N]\setminus X$ , ${Y}^{c}:=[N]\setminus Y$ , and, for $Z\in\{X,Y\}$ , $\sigma_{Z}$ is the permutation on $[N]$ which places all the elements of $Z$ before all the elements of ${Z}^{c}$ , but preserves the order of the elements within the two sets. Define

[TABLE]

To prove Theorem 1.6 we will choose $\Delta_{N}$ which satisfies the following band structure:

[TABLE]

where $(\Delta_{N})_{i,j}$ denotes the $(i,j)$ -th entry of $\Delta_{N}$ . That is, $\Delta_{N}$ has non-zero entries only in its lower left and upper right corners, and the widths of those corners are determined by $d_{1}$ and $d_{2}$ , respectively. As indicated in (2.3) such a band structure is necessary (as we will see it is also sufficient) to have a non-zero contribution from the sub-matrices of $T_{N}(z)$ whose determinants are of larger magnitudes compared to that of the whole matrix, in the expansion of $\det(T_{N}(z)+\Delta_{N})$ . Recall from (2.2)-(2.3) that the dominant term depends on the number of roots of $P_{z,{\bm{a}}}(\cdot)$ of (2.1), that are greater than one in modulus. Hence, we split the complex plane into regions according to the number of roots of $P_{z,{\bm{a}}}(\cdot)$ with modulus greater than one, using the following notation.

Let $\{-\lambda_{i}(z)\}_{i=1}^{d}$ , $d:=d_{1}+d_{2}$ , be the roots of the equation $P_{z,{\bm{a}}}(\cdot)=0$ , arranged so that $|\lambda_{1}(z)|\geq|\lambda_{2}(z)|\geq\cdots\geq|\lambda_{d}(z)|$ . (Hereafter $\{-\lambda_{\ell}(z)\}$ will denote the roots of the equation $P_{z,{\bm{a}}}(\lambda)=0$ ; this change in notation is adopted to avoid the unnecessary appearance of signs in the determinant of the sub-matrices of $J_{N}-\lambda\operatorname{Id}_{N}$ .) For $z\in\mathbb{C}$ , let $d_{0}(z)$ denote the number of roots of the equation $P_{z,{\bm{a}}}(\cdot)=0$ that are greater than or equal to one in modulus. Fixing $R<\infty$ , for $-d_{2}\leq\mathfrak{d}\leq d_{1}$ we define

[TABLE]

Note that

[TABLE]

If $P_{z,{\bm{a}}}(\lambda)=0$ for some $\lambda\in\mathbb{S}^{1}$ then we also have that

[TABLE]

Therefore $B_{\mathbb{C}}(0,R)\setminus(\cup_{\ell=-d_{2}}^{d_{1}}\mathcal{S}_{\ell})$ is contained in a set of Lebesgue measure zero and hence it is enough to consider $z\in\cup_{\ell=-d_{2}}^{d_{1}}\mathcal{S}_{\ell}$ . Further let $\mathcal{N}$ be the set of $z$'s for which $P_{z,{\bm{a}}}(\cdot)$ admits a double root. It follows from [5, Lemma 11.4] that $\mathcal{N}$ is finite.

The next lemma identifies the dominant term in the expansion of $\det(T_{N}(z)+\Delta_{N})$ .

Lemma 4.1.

Fix $\mathfrak{d}$ such that $-d_{2}\leq\mathfrak{d}\leq d_{1}$ . Let $\Delta_{N}$ be such that

[TABLE]

for some $\gamma_{\star}>d$ , where $\{\updelta_{i,j}\}$ are uniformly bounded real valued independent random variables with uniformly bounded densities with respect to the Lebesgue measure. Then, for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ , and any $\varepsilon_{0}>0$ ,

[TABLE]

where an empty product by convention is set to one.

Lemma 4.1 yields a lower bound on the order of the magnitude of the predicted dominant term in the expansion of $\det(T_{N}(z)+\Delta_{N})$ . Next we need to show that the sum of the rest of the terms is of smaller order. To show this, we split it into two sums: $\sum_{\ell<|\mathfrak{d}|}P_{\ell}(z)$ and $\sum_{\ell>|\mathfrak{d}|}P_{\ell}(z)$ . The second sum will be shown to be polynomially small compared to the leading term, whereas the first will be shown to be exponentially small. This is the content of the two following lemmas.

Lemma 4.2.

Let $\mathfrak{d},\Delta_{N}$ , and $\gamma_{\star}$ be as in Lemma 4.1. Then, for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ ,

[TABLE]

Lemma 4.3.

Under the same set-up as in Lemma 4.2, for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ , we have

[TABLE]

for some small constant $\bar{\varepsilon}:=\bar{\varepsilon}(z,{\bm{a}})\in(0,1)$ .

The proofs of Lemmas 4.2 and 4.3 are in Section 4.1, while the proof of Lemma 4.1 is postponed to Section 4.2. To complete the proof of Theorem 1.6, we will also need an upper bound on the dominant term, which is contained in the next lemma, whose proof is deferred to Section 4.2.

Lemma 4.4.

Under the same set-up as in Lemma 4.1, for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ , there exists a constant $C_{0}$ depending on $z$ and ${\bm{a}}$ only, so that

[TABLE]

Equipped with these four lemmas, we now complete the proof of Theorem 1.6.

Proof of Theorem 1.6.

From the definition of $\Delta_{N}$ it follows that there are at most $d$ non-zero entries in each row of $\Delta_{N}\Delta_{N}^{*}$ . Furthermore, each entry of $\Delta_{N}\Delta_{N}^{*}$ is at most $O(N^{-2\gamma_{\star}})$ . Therefore, by the Gershgorin circle theorem, it follows that $\|\Delta_{N}\|=O(N^{-\gamma_{\star}})$ , establishing the desired property (1.1). Next, as in the proof of Theorem 1.2, the weak convergence of $L_{N}^{T+\Delta}$ to $\mu_{{\bm{a}}}$ follows from the convergence, for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ , of the log-potentials:

[TABLE]

To this end, recalling the definition of $P_{k}(z)$ from (4.1) and applying Lemma A.1 we have that

[TABLE]

for any integer $\mathfrak{d}$ between $-N$ and $N$ . Setting $\varepsilon_{0}=\frac{\gamma_{\star}-d}{4}>0$ in Lemma 4.1 and combining Lemmas 4.1-4.3 we note that for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ , there exists an event of probability at least $1-o(1)$ such that, on that event, we have

[TABLE]

for all large $N$ . This in turn implies that

[TABLE]

for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ . Finally, combining Lemmas 4.1 and 4.4 we obtain that for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ ,

[TABLE]

Combining (4.5)-(4.7) we now deduce (4.4) for Lebesgue a.e. $z\in\mathcal{S}_{\mathfrak{d}}$ and any integer $\mathfrak{d}$ such that $-d_{2}\leq\mathfrak{d}\leq d_{1}$ . This completes the proof. ∎

4.1. Upper bound on non-dominant terms

Recall from Section 2 that to establish bounds on the predicted non-dominant terms, one uses the fact that any upper triangular Toeplitz matrix with a finite symbol can be expressed as a product of bidiagonal matrices. To use the same representation for a non-triangular Toeplitz matrix we view it as a sub-matrix of an upper triangular Toeplitz matrix of a slightly larger dimension. Toward this end, we introduce the following definition.

Definition 4.1 (Toeplitz with a shifted symbol).

Let $T_{N}$ be a Toeplitz matrix with finite symbol ${\bm{a}}(\lambda)=\sum_{\ell=-d_{2}}^{d_{1}}a_{\ell}\lambda^{\ell}$ and as before $d=d_{1}+d_{2}$ . For $\bar{d}_{1},\bar{d}_{2}\in\mathbb{N}$ such that $\bar{d}_{1}+\bar{d}_{2}=d$ and $z\in\mathbb{C}$ , set $T_{N}(z;\bar{d}_{1},\bar{d}_{2}):=T_{N}(z;\bar{d}_{1},\bar{d}_{2})({\bm{a}})$ to be the $N\times N$ Toeplitz matrix with first row and column

[TABLE]

respectively, where $a^{\prime}_{j}:=a_{j}-z\cdot{\bf 1}_{\{j=0\}}$ , $j=-d_{2},-d_{2}+1,\ldots,d_{1}$ . That is,

[TABLE]

From Definition 4.1, it follows that

[TABLE]

Note that $T_{N+d_{2}}(z;d,0)$ is an upper triangular Toeplitz matrix. Since $\{-\lambda_{\ell}(z)\}_{\ell=1}^{d}$ are the roots of the equation $P_{z,{\bm{a}}}(\lambda)=0$ we obtain that

[TABLE]

where we recall that $J_{n}$ is the nilpotent matrix given by $(J_{n})_{i,j}={\bf 1}_{j=i+1}$ , for $i,j\in[n]$ .
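The factorization of an upper triangular banded Toeplitz matrix into bidiagonal factors is a matrix version of factoring a polynomial into linear terms. The sketch below checks this on a small integer example; it is a minimal illustration under the assumption that the polynomial is monic (in the text the product carries an additional leading coefficient), and the values of the $\lambda_{\ell}$ and the helper names are hypothetical.

```python
def identity_plus_shift(n, lam):
    """The bidiagonal matrix J_n + lam * Id_n, with (J_n)[i][j] = 1 if j == i + 1."""
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def poly_from_roots(lams):
    """Coefficients (lowest degree first) of the product of (x + lam) over lams."""
    coeffs = [1]
    for lam in lams:
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += lam * c      # contribution of lam * x^k
            new[k + 1] += c        # contribution of x * x^k
        coeffs = new
    return coeffs


def upper_toeplitz(n, coeffs):
    """Upper triangular Toeplitz matrix whose k-th superdiagonal equals coeffs[k]."""
    return [[coeffs[j - i] if 0 <= j - i < len(coeffs) else 0 for j in range(n)]
            for i in range(n)]


def bidiagonal_product(n, lams):
    """Product of the bidiagonal matrices J_n + lam * Id_n over lams."""
    prod = identity_plus_shift(n, 1)
    prod = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # start from Id_n
    for lam in lams:
        prod = matmul(prod, identity_plus_shift(n, lam))
    return prod


lams = [2, -3, 1]  # hypothetical values; the roots of the polynomial are -2, 3, -1
n = 7
assert upper_toeplitz(n, poly_from_roots(lams)) == bidiagonal_product(n, lams)
```

Because all factors are polynomials in $J_{n}$ they commute, so the order of the product is immaterial; this is what allows the Cauchy-Binet theorem to be applied factor by factor.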

Hence, recalling the definition of $\{P_{k}(z)\}_{k=1}^{N}$ from (4.1), applying the Cauchy-Binet theorem, and writing $S+\ell:=\{x+\ell,x\in S\}$ for any set of integers $S$ and an integer $\ell$ , we obtain that

[TABLE]

where

[TABLE]

and $\check{Z}:=[N+d_{2}]\setminus Z$ for any set $Z\subset[N+d_{2}]$ . Equipped with this preparatory decomposition of $P_{k}(z)$ , we are now ready to step into the proof of Lemma 4.2.

Proof of Lemma 4.2.

From the definition of the noise matrix it follows that the number of non-zero rows (and also non-zero columns) in $\Delta_{N}$ is at most $d$ . This means that $P_{k}(z)=0$ for any $k>d$ . Therefore, it is enough to show that (4.3) holds with the sum in the numerator being replaced by $P_{k}(z)$ , where $|\mathfrak{d}|<k\leq d$ .

To achieve this, we need to simplify (4.8); this simplification, summarized in (4.12) and (4.13) below, will also be useful in the proof of Lemma 4.3. From (4.8)-(4.9), we see that each $X_{i}$ is of cardinality $k+d_{2}$ . Therefore, we write

[TABLE]

and for brevity we also denote $\mathcal{X}_{k}:=(X_{1},X_{2},\ldots,X_{d+1})$ . Applying Lemma A.3 we see that

[TABLE]

only when $\mathcal{X}_{k}\in L_{{\bm{\ell}},k}$ for some ${\bm{\ell}}:=(\ell_{1},\ell_{2},\ldots,\ell_{d})$ with $0\leq\ell_{i}\leq N-k\leq N+d_{2}$ , $i=1,2,\ldots,d$ , where

[TABLE]

Since

[TABLE]

we have the following equivalent representation of $L_{{\bm{\ell}},k}$ :

[TABLE]

where

[TABLE]

We also note that in (4.8) the outer sum is over $X,Y\subset[N]$ and due to the constraint (4.9) we only need to consider $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ , where

[TABLE]

Thus applying Lemma A.3 again, from (4.8) we now deduce that

[TABLE]

where

[TABLE]

Returning to the proof of the lemma, it suffices to bound $Q_{{\bm{\ell}},k}$ . Turning to this task, we assume without loss of generality that $|(\Delta_{N})_{i,j}|\leq 1$ . This implies that

[TABLE]

for every $X,Y\subset[N]$ such that $|X|=|Y|=k$ . On the other hand, the definition of $d_{0}=d_{0}(z,{\bm{a}})$ and the fact that $z\in\mathcal{S}_{\mathfrak{d}}$ imply that there are no roots of $P_{z,{\bm{a}}}(\cdot)$ on the unit circle, hence we deduce that there exists $\varepsilon_{\star}=\varepsilon_{\star}(z,{\bm{a}})>0$ , such that

[TABLE]

Hence,

[TABLE]

where the last inequality follows from the fact that $|\mathfrak{d}|<k\leq d$ . To finish the proof it remains to find an upper bound on the cardinality of $\mathfrak{L}_{{\bm{\ell}},k}$ . We claim that

[TABLE]

Equipped with (4.17), it now follows from (4.16) that

[TABLE]

Since $a_{d_{2}}\neq 0$ implies that $\{\lambda_{\ell}(z)\}$ are bounded away from zero, (4.18) together with (4.12) yield (4.3).

It remains to establish the bound (4.17). To this end, set

[TABLE]

For the $\{x_{i,j}\}$ to satisfy $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ , we observe that the $\{\delta_{i,j}(\mathcal{X}_{k})\}$ ’s can be chosen in at most

[TABLE]

ways. Next, recall that $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ implies that

[TABLE]

Thus, $\{x_{1,\ell}\}_{\ell=1}^{k}$ and $\{\delta_{i,j}(\mathcal{X}_{k})\}$ automatically fix $\mathcal{X}_{k}$ . Since the number of choices of $\{x_{1,\ell}\}_{\ell=1}^{k}$ is at most $\binom{N}{k}\leq\binom{N}{d}$ , as $k\leq d$ , for all large $N$ , the claim (4.17) follows from (4.20). The proof of the lemma is now complete. ∎

Next we show that for $z\in\mathcal{S}_{\mathfrak{d}}$ the sum $\sum_{k<|\mathfrak{d}|}P_{k}(z)$ is of smaller order compared to the dominant term $P_{|\mathfrak{d}|}(z)$ .

Proof of Lemma 4.3.

We first claim that for any $k<|\mathfrak{d}|$ , the set $\mathfrak{L}_{{\bm{\ell}},k}$ (see (4.11)) being nonempty forces either $\sum_{i=d_{0}+1}^{d}\hat{\ell}_{i}$ or $\sum_{i=1}^{d_{0}}\hat{\ell}_{i}$ to be close to $N$ , depending on whether $\mathfrak{d}>0$ or $\mathfrak{d}<0$ . This observation will then be combined with the bounds (4.15) and (4.17) to complete the proof.

Consider first the case $\mathfrak{d}=d_{1}-d_{0}>0$ . For any $k<\mathfrak{d}$ we have that $d_{0}+k+d_{2}+1\leq d_{1}+d_{2}=d$ and hence for any $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}\subset L_{{\bm{\ell}},k}$ ,

[TABLE]

As $\delta_{i,j}(\mathcal{X}_{k})\leq\hat{\ell}_{i}+k+d_{2}$ for $i\in[d]\setminus[d_{0}]$ and $j\in[k+d_{2}+1]$ , it further implies that if $\mathfrak{L}_{{\bm{\ell}},k}\neq\emptyset$ then we must have

[TABLE]

Next we consider the case $\mathfrak{d}<0$ . For any $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ we have that $x_{1,k+1}=N+1$ . Therefore

[TABLE]

On the other hand, we have that $x_{d+1,\ell}=\ell$ for $\ell\in[d_{2}]$ . Since $x_{i+1,\ell}\leq x_{i,\ell}<x_{i+1,\ell+1}$ for any $\ell\in[k+d_{2}-1]$ , and $\{x_{i,\ell}\}$ are integers, using induction, we further obtain that

[TABLE]

for any $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ . As $k+1\leq|\mathfrak{d}|=d_{0}-d_{1}=d_{2}-(d-d_{0})$ we find that $x_{d_{0}+1,k+1}=k+1$ . Hence, from (4.23) we deduce that

[TABLE]

for any $\mathcal{X}_{k}\in\mathfrak{L}_{{\bm{\ell}},k}$ . Noting that $\delta_{i,k+1}(\mathcal{X}_{k})\leq\hat{\ell}_{i}-1$ for all $i\in[d_{0}]$ , we obtain

[TABLE]

Thus, (4.22) and (4.24) imply that, if $k<|\mathfrak{d}|$ , then

[TABLE]

To complete the proof of the lemma we now use (4.13)-(4.15) and (4.17) to conclude that for any $k<|\mathfrak{d}|$ ,

[TABLE]

for all large $N$ , for some sufficiently small $\bar{\varepsilon}>0$ , depending only on $z$ and ${\bm{a}}$ . The proof finishes upon using (4.12). ∎

4.2. Lower and upper bounds on the dominant term

We will first prove Lemma 4.1, which is a lower bound on the dominant term. The proof is based on the following elementary anti-concentration bound for homogeneous polynomials of independent random variables, which may be of independent interest.

Proposition 4.5.

Fix $k,n\in\mathbb{N}$ and let $\{U_{i}\}_{i=1}^{n}$ be a sequence of independent real-valued random variables whose laws possess densities with respect to the Lebesgue measure that are uniformly bounded by one. Let $Q_{k}(U_{1},U_{2},\ldots,U_{n})$ be a homogeneous polynomial of degree $k$ such that the degree of each variable is at most one. That is,

[TABLE]

for some collection of complex valued coefficients $\{a(\mathcal{I});\,\mathcal{I}\in\binom{[n]}{k}\}$ , where $\binom{[n]}{k}$ denotes the set of all subsets of $[n]$ of cardinality $k$ .

Assume that there exists an $\mathcal{I}_{0}\in\binom{[n]}{k}$ such that $|a(\mathcal{I}_{0})|\geq c_{\star}$ for some absolute constant $c_{\star}>0$ . Then for any $\varepsilon\in(0,{e^{-1}}]$ we have

[TABLE]

Proof.

As the densities of $\{U_{i}\}_{i\in[n]}$ are uniformly bounded by one, the desired anti-concentration property is immediate for $k=1$ . To prove the general case, we proceed by induction. To this end, we introduce some notation. Order the elements of $\mathcal{I}_{0}$ and denote them by $i_{1}^{0},i_{2}^{0},\ldots,i_{k}^{0}$ . For $j\leq k$ , define $\mathcal{I}^{0}_{j}:=\{i_{j}^{0},i_{j+1}^{0},\ldots,i_{k}^{0}\}$ . Set

[TABLE]

For $1\leq j\leq k-1$ , we iteratively define

[TABLE]

Equipped with the above notations we see that

[TABLE]

and $Q_{1}^{0}=a(\mathcal{I}_{0})$ . We will prove inductively that

[TABLE]

from which the desired anti-concentration bound follows by taking $j=k+1$ . Hence, it only remains to prove (4.25).

For $j=2$ , $Q_{j}^{0}$ is a homogeneous polynomial of degree $1$ in the variables $U_{i}$ , and (4.25) follows from the assumptions on $\{U_{\ell}\}_{\ell=1}^{n}$ and the fact that $|a(\mathcal{I}_{0})|\geq c_{\star}$ . Assuming that (4.25) holds for $j=j_{*}$ and fixing $\delta\in(0,1)$ , we have that with $C_{j}:={(8e)}^{j-1}(c_{\star}\wedge 1)^{-1}$ ,

[TABLE]

where we have used the fact that $Q_{j_{*}}^{1}$ and $Q_{j_{*}}^{0}$ are independent of $U_{i_{j_{*}}^{0}}$ , and the bound on the density for the latter. Using integration by parts, for any probability measure $\mu$ supported on $[0,\infty)$ we have that

[TABLE]

Therefore, using the induction hypothesis,

[TABLE]

Since for $\delta\leq e^{-1}$ we have that $\log(1/\delta)\geq 1$ , combining the above with (4.2) and setting $\delta=\varepsilon$ we establish (4.25) for $j=j_{*}+1$ . This completes the proof. ∎

Equipped with Proposition 4.5 we now begin the proof of Lemma 4.1.

Proof of Lemma 4.1.

Recalling (4.1) we note that $P_{|\mathfrak{d}|}(z)$ is a homogeneous polynomial of degree $|\mathfrak{d}|$ in the entries of the noise matrix $\Delta_{N}$ such that the degree of each entry of $\Delta_{N}$ is one. Therefore, to apply Proposition 4.5 we only need to show that there exist $X,Y\subset[N]$ with $|X|=|Y|=|\mathfrak{d}|$ such that $\det(T_{N}(z)[X^{c};Y^{c}])$ is bounded below. The choice of such subsets will depend on the sign of $\mathfrak{d}$ . Hence, the proof is split into two cases.

Considering the case $\mathfrak{d}>0$ we set $X=[N]\setminus[N-\mathfrak{d}]$ and $Y=[\mathfrak{d}]$ . Recalling Definition 4.1 we find that

[TABLE]

We apply Widom’s result on the determinant of finitely banded Toeplitz matrices, in particular [5, Theorem 2.8] to deduce that for any $z\in\mathbb{C}\setminus\mathcal{N}$ , one has

[TABLE]

for some collection of coefficients $\{C_{\mathcal{I}}\}$ , where we recall that $\mathcal{N}$ is the collection of $z$'s such that $P_{z,{\bm{a}}}(\cdot)$ has double roots. Furthermore, the coefficients $\{C_{\mathcal{I}}\}$ are bounded both below and above, for any $z\in B_{\mathbb{C}}(0,R)\setminus\mathcal{N}$ . As $z\in\mathcal{S}_{\mathfrak{d}}$ and $d_{1}-\mathfrak{d}=d_{0}(z)$ , using (4.15) we therefore deduce that there exists some small positive constant $c_{0}>0$ so that, for all large $N$ ,

[TABLE]

From the definition of $\Delta_{N}$ it follows that for the above choices of $X$ and $Y$ the determinant of $\Delta_{N}[X;Y]$ , ignoring the factor $N^{-\gamma_{\star}\mathfrak{d}}$ , is a homogeneous polynomial of degree $\mathfrak{d}$ of independent uniformly bounded random variables with uniformly bounded densities. Therefore, we are in a position to apply Proposition 4.5.

Without loss of generality, assuming that the densities of $\{(\Delta_{N})_{i,j}\}_{i,j=1}^{N}$ are uniformly bounded by one, we apply Proposition 4.5 for

[TABLE]

with $c_{\star}=c_{0}$ and $\varepsilon=N^{-\varepsilon_{0}/2}c_{\star}$ to arrive at (4.2) for any $z\in\mathcal{S}_{\mathfrak{d}}\setminus\mathcal{N}$ . As $\mathcal{N}$ contains at most finitely many points this proves the lemma when $\mathfrak{d}>0$ .

Turning to prove the same for $\mathfrak{d}<0$ , we reverse the roles of $X$ and $Y$ . That is, we now set $X=[-\mathfrak{d}]$ and $Y=[N]\setminus[N+\mathfrak{d}]$ and follow the same steps as above.

For $\mathfrak{d}=0$ the proof is straightforward. From its definition we have $P_{0}(z)=\det(T_{N}(z))$ . Upon setting $X=Y=\emptyset$ in (4.27) the result is immediate. Now the proof of the lemma is complete. ∎

We end this section with the proof of Lemma 4.4. Its proof is very similar to that of Lemma 4.2. Hence, only an outline is provided.

Proof of Lemma 4.4.

We split the proof into two cases: $\mathfrak{d}\neq 0$ and $\mathfrak{d}=0$ . First, let us consider $\mathfrak{d}\neq 0$ . As $\gamma_{\star}>d$ we find from (4.12)-(4.15) and (4.17) that

[TABLE]

If $\mathfrak{d}=0$ then the desired result follows from Widom’s result (see [5, Theorem 2.8]). ∎

5. Proof of Theorem 1.8

We recall from Section 2 that to prove Theorem 1.8 it suffices to establish (2.5). As outlined there, the key to the latter is to bound the difference of the mass of intervals near zero under the measures $\nu^{z}_{A_{N}+\Delta_{N}}$ and $\nu^{z}_{A_{N}+E_{N}}$ , the empirical distributions of the singular values of $A_{N}(z)+\Delta_{N}$ and $A_{N}(z)+N^{-\gamma}E_{N}$ , respectively, where $A_{N}(z):=A_{N}-z\operatorname{Id}_{N}$ . This in turn will be achieved by controlling the differences of the Stieltjes transforms of the corresponding measures. So, we begin this section with its definition.

Definition 5.1.

The Stieltjes transform of a probability measure $\mu$ on $\mathbb{R}$ is defined as

[TABLE]

To obtain a bound on the probability of any interval under $\mu$ from that of $G_{\mu}(\cdot)$ we use the following two inequalities. These are a consequence of [9, Eqns. (6)-(8)]: for any $\tau,\varrho>0$ , and $a,b\in\mathbb{R}$ such that $b-a>\varrho$ we have

[TABLE]

and

[TABLE]

Now to control the difference of the Stieltjes transforms of $\nu^{z}_{A_{N}+\Delta_{N}}$ and $\nu^{z}_{A_{N}+E_{N}}$ we also need the symmetrized form of the Stieltjes transform, as follows. For an $N\times N$ matrix $C_{N}$ , define

[TABLE]

and the Stieltjes transform

[TABLE]

$G_{C_{N}}(\cdot)$ is the Stieltjes transform of the symmetrized version of the empirical measure of the singular values of $C_{N}$ . Equipped with the above notation we have the following lemma.

Lemma 5.1.

For $C_{N}$ and $D_{N}$ any $N\times N$ matrices,

[TABLE]

where $\|\cdot\|_{{\rm HS}}$ denotes the Hilbert-Schmidt norm.

Proof.

Using the resolvent identity we have that

[TABLE]

Recall the following version of the Cauchy-Schwarz inequality: for any two $(2N)\times(2N)$ matrices $A_{N}$ and $B_{N}$

[TABLE]

Since for any Hermitian matrix $H_{N}$ one has $\|(\xi-H_{N})^{-1}\|\leq 1/|\Im(\xi)|$ , the claim follows from (5.5) upon using (5.6) with $A_{N}=\left(\xi-\widetilde{D}_{N}\right)^{-1}\cdot\left(\xi-\widetilde{C}_{N}\right)^{-1}$ and $B_{N}=\widetilde{C}_{N}-\widetilde{D}_{N}$ . ∎

As a last preliminary step, we need the following easy lemma.

Lemma 5.2.

For any probability measure $\mu$ ,

[TABLE]

for Lebesgue almost every $z\in\mathbb{C}$ .

Proof.

For $\varepsilon<1$ , set $F(z,\varepsilon):=\int\log(1/|x-z|){\bf 1}_{\{|x-z|\leq\varepsilon\}}d\mu(x)$ . Fix $z_{0}\in\mathbb{C}$ . Note that, by Fubini’s theorem,

[TABLE]

In particular, for any $\delta>0$ , with ${\mathcal{A}}_{\varepsilon}(\delta):=\{z\in B_{\mathbb{C}}(z_{0},1):F(z,\varepsilon)>\delta\}$ , we obtain that

[TABLE]

In particular, $\mbox{\rm Leb}(\cap_{\varepsilon<1}{\mathcal{A}}_{\varepsilon}(\delta))=0$ . Using the monotonicity of $F(z,\varepsilon)$ , we conclude that for Lebesgue almost every $z\in B_{\mathbb{C}}(z_{0},1)$ , $\limsup_{\varepsilon\to 0}F(z,\varepsilon)\leq\delta$ . Taking a sequence $\delta_{n}\to 0$ gives (5.7), first for Lebesgue almost every $z\in B_{\mathbb{C}}(z_{0},1)$ , and then for almost every $z$ . ∎
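The Fubini step rests on the local integrability of the logarithm in the plane. The following polar-coordinates computation is a sketch added for illustration and is not quoted from the paper; it makes explicit the rate at which the relevant integral vanishes.

```latex
\int_{\{w\in\mathbb{C}:\,|w|\le\varepsilon\}}\log\frac{1}{|w|}\,d\mathrm{Leb}(w)
  \;=\; 2\pi\int_{0}^{\varepsilon} r\log\frac{1}{r}\,dr
  \;=\; \pi\varepsilon^{2}\Big(\log\frac{1}{\varepsilon}+\frac{1}{2}\Big)
  \;\xrightarrow[\varepsilon\to 0]{}\;0 .
```

Since $\mu$ is a probability measure, exchanging the order of integration bounds $\int_{B_{\mathbb{C}}(z_{0},1)}F(z,\varepsilon)\,d\mathrm{Leb}(z)$ by the quantity above, and Markov's inequality then bounds $\mathrm{Leb}({\mathcal{A}}_{\varepsilon}(\delta))$ by $\delta^{-1}$ times the same quantity.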

We are now ready to prove Theorem 1.8.

Proof of Theorem 1.8.

To establish (1.3) we first claim that $\nu^{z}_{A_{N}+N^{-\gamma}E_{N}}\Rightarrow\mu_{z}$ , in probability, for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ , where $\mu_{z}$ is the law of $|X-z|$ and $X\sim\mu$ . The argument is similar to that employed in the proof of Theorem 1.2. Write $B_{N}:=A_{N}+N^{-\gamma}E_{N}$ and $B_{N}(z):=B_{N}-z\operatorname{Id}_{N}$ . We have that $\nu^{z}_{A_{N}}\Rightarrow\mu_{z}$ by assumption (b), while Assumption 1.1(i) and Markov’s inequality imply that

[TABLE]

for any $\gamma>1/2$ . On the other hand, by the Hoffman-Wielandt inequality, see [1, Lemma 2.1.19], the map $D_{N}\mapsto L_{N}^{D}$ , viewed as a map from the space of $N\times N$ Hermitian matrices equipped with the normalized Hilbert-Schmidt norm $N^{-1/2}\|\cdot\|_{\rm HS}$ to the space of probability measures equipped with the weak topology, is continuous. Note that for any matrix $A$ , the singular values of $A$ coincide with the moduli of the eigenvalues of the matrix

[TABLE]

up to double the multiplicity for each singular value. In particular,

[TABLE]

in probability, by (5.8). We conclude from that and the above mentioned continuity of the empirical measure in the (normalized) Hilbert-Schmidt norm that

[TABLE]

as claimed.
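The link between singular values and the eigenvalues of a Hermitian block matrix can be checked directly. The sketch below assumes the standard Hermitization, with $A$ and $A^{*}$ in the off-diagonal blocks and zero diagonal blocks (the exact form displayed in the text is an assumption here), and verifies that its square is block diagonal with blocks $AA^{*}$ and $A^{*}A$; the function names and the test matrix are hypothetical.

```python
def conj_transpose(a):
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def hermitization(a):
    """Block matrix [[0, A], [A^*, 0]] for a square matrix A (assumed convention)."""
    n = len(a)
    top = [[0j] * n + list(row) for row in a]
    bottom = [list(row) + [0j] * n for row in conj_transpose(a)]
    return top + bottom


def square_is_block_diagonal(a, tol=1e-12):
    """Check that H^2 = diag(A A^*, A^* A) for the Hermitization H of A."""
    n = len(a)
    h = hermitization(a)
    h2 = matmul(h, h)
    a_astar = matmul(a, conj_transpose(a))
    astar_a = matmul(conj_transpose(a), a)
    for i in range(n):
        for j in range(n):
            if abs(h2[i][j] - a_astar[i][j]) > tol:
                return False
            if abs(h2[n + i][n + j] - astar_a[i][j]) > tol:
                return False
            if abs(h2[i][n + j]) > tol or abs(h2[n + i][j]) > tol:
                return False
    return True


A = [[1 + 2j, 2 + 0j], [-1j, 3 + 0j]]  # arbitrary test matrix
assert square_is_block_diagonal(A)
```

Because the eigenvalues of $AA^{*}$ and $A^{*}A$ are the squared singular values of $A$, every eigenvalue of the block matrix has modulus equal to a singular value of $A$, with each singular value counted twice.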

To complete the proof we need to extend the convergence of $\nu^{z}_{B_{N}}$ to the convergence of the integral of $\log(\cdot)$ against this measure. To this end, using (5.8) again and the fact that the operator norm of $A_{N}(z)$ is bounded, we see that there exists a compact set $\mathbb{K}\subset\mathbb{R}$ such that for any $z\in B_{\mathbb{C}}(0,R_{0})$

[TABLE]

Hence, for any $\varepsilon>0$ ,

[TABLE]

for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ . Note that

[TABLE]

Thus, (5.7) together with (5.9) imply that it only remains to show that given any $\delta>0$ , there exists $\varepsilon_{0}(\delta)$ such that for any $\varepsilon\leq\varepsilon_{0}(\delta)$

[TABLE]

for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ and some large constant $C_{0}$ . To prove this, we first show that an analogue of (5.10) holds for the empirical measure of the singular values of $A_{N}+\Delta_{N}$ .

Turning to this task, using (1.1) and arguing similarly to the steps leading to (5.9), we obtain that

[TABLE]

for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ , and further, for any $\varepsilon>0$ ,

[TABLE]

for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ . Together with assumptions (b)-(c), we conclude that for Lebesgue a.e. $z\in B_{\mathbb{C}}(0,R_{0})$ , given any $\delta>0$ , there exists $\varepsilon_{0}(\delta)$ such that for all $\varepsilon\leq 2\varepsilon_{0}(\delta)$ ,

[TABLE]

Having shown (5.12) we now proceed to the proof of (5.10). Using Assumption 1.1(ii) we see that there exists a sufficiently large constant $\kappa_{\star}$ such that

[TABLE]

where $s_{\min}(H)$ is the minimal singular value of a matrix $H$ . Hence,

[TABLE]

Now to control the integral of $\log(\cdot)$ over $(N^{-\kappa_{\star}},\varepsilon)$ we apply (5.4) to deduce that

[TABLE]

on the event

[TABLE]

where $\tau=N^{-\delta^{\prime}/4}$ and $\delta^{\prime}=\min\{\frac{1}{2}(\gamma-\frac{1}{2}),\gamma_{0}\}$ .

Let $\widetilde{\nu}_{A_{N}+\Delta_{N}}^{z}$ and $\widetilde{\nu}_{B_{N}}^{z}$ denote the symmetrized versions of the probability measures $\nu_{A_{N}+\Delta_{N}}^{z}$ and $\nu_{B_{N}}^{z}$ , respectively. Setting $\varrho=N^{-\delta^{\prime}/8}$ , $\kappa=\delta^{\prime}/16$ , and using (5.1)-(5.2) and (5.14) in the second inequality, we have that

[TABLE]

on $\Omega_{N}$ , for all large $N$ , where in the third inequality we used the symmetry of $\tilde{\nu}_{A_{N}+\Delta_{N}}^{z}$ and $\varrho=o(N^{-\kappa})$ .

It remains to bound the integral of $\log(\cdot)$ over $(N^{-\kappa},\varepsilon)$ . Toward this, using integration by parts we note that, for $0\leq a_{1}<a_{2}<1$ and any probability measure $\mu$ on $\mathbb{R}$ ,

[TABLE]

Arguing as in (5.15) we obtain

[TABLE]

where in the last step we have used the fact that $t+N^{-\kappa}\leq 2t$ for any $t\geq N^{-\kappa}$ , and a change of variables. Similar reasoning yields that

[TABLE]

Thus combining (5.15), and (5.17)-(5.18), and using (5.16) we deduce that for $\varepsilon\leq\varepsilon_{0}(\delta)$ sufficiently small and all large $N$ ,

[TABLE]

on the event $\Omega_{N}$ , where $C_{0}^{\prime}$ is some large constant. Finally, (5.8) and (1.1) imply that $\mathbb{P}(\Omega_{N}^{c})=o(1)$ . Therefore, combining (5.12) and (5.13), the claim in (5.10) now follows. This completes the proof of the theorem. ∎

Appendix A Some algebraic facts

In this section we collect a couple of standard matrix results which have been used in the proofs appearing in Section 4.

The first result shows that the determinant of the sum of the two matrices can be expressed as a linear combination of products of the determinants of appropriate sub-matrices. The proof follows from the definition of the determinant, see e.g. [10]. We adopt the convention that the determinant of the matrix of size zero is one. For an $N\times N$ matrix $A$ , and $X,Y\subseteq[N]$ we write $A[X;Y]$ for the sub-matrix of $A$ which consists of the rows in $X$ and the columns in $Y$ .

Lemma A.1.

For any $N\times N$ matrices $A$ and $B$ we have

[TABLE]

where ${X}^{c}:=[N]\setminus X$ , ${Y}^{c}:=[N]\setminus Y$ and $\sigma_{Z}$ for $Z\in\{X,Y\}$ is the permutation on $[N]$ which places all the elements of $Z$ before all the elements of ${Z}^{c}$ , but preserves the order of elements within the two sets.
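The expansion of the determinant of a sum can be tested by brute force on small matrices. The sketch below is illustrative only: it assumes the sign attached to the pair $(X,Y)$ equals $(-1)^{\sum_{x\in X}x+\sum_{y\in Y}y}$, which is the product of the signs of $\sigma_{X}$ and $\sigma_{Y}$, and all function names and test matrices are hypothetical.

```python
from itertools import combinations, permutations


def det(m):
    """Leibniz-formula determinant; the matrix of size zero has determinant one."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def sub(m, rows, cols):
    """Sub-matrix of m with the given rows and columns (0-based indices)."""
    return [[m[i][j] for j in cols] for i in rows]


def det_sum_expansion(a, b):
    """Sum over X, Y with |X| = |Y| of sign * det(A[X^c; Y^c]) * det(B[X; Y])."""
    n = len(a)
    total = 0
    for k in range(n + 1):
        for x in combinations(range(n), k):
            xc = [i for i in range(n) if i not in x]
            for y in combinations(range(n), k):
                yc = [j for j in range(n) if j not in y]
                # 0-based index sums have the same parity as the 1-based ones.
                sign = -1 if (sum(x) + sum(y)) % 2 else 1
                total += sign * det(sub(a, xc, yc)) * det(sub(b, x, y))
    return total


A = [[2, -1, 0], [3, 4, 1], [0, 5, -2]]
B = [[1, 0, 2], [0, -3, 1], [4, 1, 0]]
A_plus_B = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]
assert det(A_plus_B) == det_sum_expansion(A, B)
```

Grouping the terms by $k=|X|=|Y|$ gives the decomposition of $\det(T_{N}(z)+\Delta_{N})$ into the quantities $P_{k}(z)$ used in Section 4, each of which is homogeneous of degree $k$ in the entries of the second matrix.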

The next lemma evaluates the determinant of any sub-matrix of a bidiagonal matrix.

Lemma A.2 ([8, Lemma 2.2]).

Let $A_{N}$ be an upper bi-diagonal matrix and $X,Y\subset[N]$ such that $|X|=|Y|$ . Then $\det(A_{N}[X;Y])$ equals the product of the diagonal entries of $A_{N}[X;Y]$ .
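Lemma A.2 can also be confirmed exhaustively for a small matrix. The sketch below enumerates all pairs of index sets of equal size for one upper bidiagonal matrix with arbitrary integer entries; it is a sanity check under these assumed values, not a proof, and the helper names are hypothetical.

```python
from itertools import combinations, permutations


def det(m):
    """Leibniz-formula determinant for small integer matrices."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def upper_bidiagonal(diag, superdiag):
    """Matrix with the given diagonal and first superdiagonal, zero elsewhere."""
    n = len(diag)
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = diag[i]
        if i + 1 < n:
            a[i][i + 1] = superdiag[i]
    return a


def minors_equal_diagonal_products(a):
    """True if det(a[X;Y]) equals the product of the diagonal of a[X;Y] for all X, Y."""
    n = len(a)
    for k in range(1, n + 1):
        for x in combinations(range(n), k):
            for y in combinations(range(n), k):
                s = [[a[i][j] for j in y] for i in x]
                diag_prod = 1
                for p in range(k):
                    diag_prod *= s[p][p]
                if det(s) != diag_prod:
                    return False
    return True


A = upper_bidiagonal([2, 3, 5, 7, 11], [1, -1, 4, 6])
assert minors_equal_diagonal_products(A)
```

The property is special to the bidiagonal structure: a sub-matrix of such a matrix never contains a nonzero entry above its diagonal together with the matching nonzero entry below it, so only the identity permutation can contribute to the determinant.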

The next lemma, which follows readily from Lemma A.2, evaluates the determinant of any sub-matrix of a bidiagonal Toeplitz matrix.

Lemma A.3 ([8, Lemma 2.3]).

Let $A_{N}=J_{N}+\mathfrak{z}\operatorname{Id}_{N}$ , $\mathfrak{z}\in\mathbb{C}$ , $X=\{x_{1}<x_{2}<\cdots<x_{k}\}\subset[N]$ , and $Y=\{y_{1}<y_{2}<\ldots<y_{k}\}\subset[N]$ . Then, with $y_{k+1}=\infty$ ,

[TABLE]

Bibliography


[1] Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni. An introduction to random matrices. No. 118. Cambridge University Press, 2010.
[2] A. Basak, E. Paquette, and O. Zeitouni. Regularization of non-normal matrices by Gaussian noise - the banded Toeplitz and twisted Toeplitz cases. Forum of Mathematics, Sigma, 7, E3, 2019.
[3] A. Basak and O. Zeitouni. Outliers of random perturbations of Toeplitz matrices with finite symbols. arXiv preprint, arXiv:1905.10244, 2019.
[4] C. Bordenave and D. Chafaï. Around the circular law. Probability Surveys, 9, 1–89, 2012.
[5] A. Böttcher and S. M. Grudsky. Spectral Properties of Banded Toeplitz Matrices. Vol. 96, SIAM, 2005.
[6] N. Cook. Lower bounds for the smallest singular value of structured random matrices. The Annals of Probability, 46 (6), 3442–3500, 2018.
[7] R. B. Davies and M. Hager. Perturbations of Jordan matrices. Journal of Approximation Theory, 156, 82–94, 2009.
[8] O. N. Feldheim, E. Paquette, and O. Zeitouni. Regularization of non-normal matrices by Gaussian noise. International Mathematics Research Notices, 18, 8724–8751, 2015.
