On heavy-tail phenomena in some large deviations problems

Fanny Augeri

arXiv:1706.06184·math.PR·June 21, 2017

TL;DR

This paper investigates heavy-tail phenomena in large deviations, showing they can be explained by translation mechanisms, and applies these results to spectral measures, eigenvalues, traces, and last-passage times with heavy-tailed distributions.

Contribution

It establishes a general large deviations principle for functionals under heavy-tailed measures, revealing translation as the key mechanism behind observed deviations.

Findings

01. Heavy-tail phenomena explained by translation mechanisms.

02. Large deviations principles for spectral measures and eigenvalues.

03. Application to last-passage times with heavy-tailed weights.

Abstract

In this paper, we revisit the proof of the large deviations principle of Wiener chaoses partially given by Borell, and then by Ledoux in its full form. We show that some heavy-tail phenomena observed in large deviations can be explained by the same mechanism as for the Wiener chaoses, meaning that the deviations are created, in a sense, by translations. More precisely, we prove a general large deviations principle for a certain class of functionals $f_{n}:\mathbb{R}^{n}\to\mathcal{X}$, where $\mathcal{X}$ is some metric space, under the $n$-fold probability measure $\nu_{\alpha}^{n}$, where $\nu_{\alpha}=Y_{\alpha}^{-1}e^{-|x|^{\alpha}}dx$, $\alpha\in(0,2]$, for which the large deviations are due to translations. We retrieve, as an application, the large deviations principles known for the Wigner matrices without Gaussian tails, of the empirical spectral measure by Bordenave and Caputo,…

Equations

$$\forall x\in B,\quad I_{\Psi}(x)=\inf\Big\{\frac{1}{2}|h|^{2}: x=\Psi^{(d)}(h),\ h\in\mathcal{H}\Big\},$$

$$\forall h\in\mathcal{H},\quad \Psi^{(d)}(h)=\int\Psi(x+h)\,d\mu(x).$$

$$||h||_{\ell^{\alpha}}=\Big(\sum_{i=1}^{n}|h_{i}|^{\alpha}\Big)^{1/\alpha}.$$

$$-\inf_{B^{\circ}}J\leq\liminf_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_{n}\in B)\leq\limsup_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_{n}\in B)\leq-\inf_{\overline{B}}J,$$

$$f_{n}(X_{n}+v(n)^{1/\alpha}h_{n})\simeq F_{n}(h_{n}),$$

$$J_{\alpha}=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta},$$

$$\forall x\in\mathcal{X},\quad I_{n,\delta}(x)=\inf\big\{||h||_{\ell^{\alpha}}^{\alpha}: d(F_{n}(h),x)<\delta,\ h\in\mathbb{R}^{n}\big\}.$$

$$I_{\alpha}=\sup_{\delta>0}\inf_{n\in N}I_{n,\delta}.$$

$$\forall x\in\mathcal{X},\quad I_{\alpha}(x)=\sup_{\delta>0}\inf_{n\in N}I_{n,\delta}(x).$$

$$\sup_{h_{n}\in rB_{\ell^{\alpha}}}d\big(f_{n}(X_{n}+v(n)^{1/\alpha}h_{n}),F_{n}(h_{n})\big)\underset{\substack{n\to+\infty\\ n\in N}}{\longrightarrow}0,$$

$$\mathbb{E}\sup_{||h||_{\ell^{2}}\leq t_{\delta}(n)}\mathcal{L}_{n}(h)\leq\delta,$$

$$\mathcal{L}_{n}(h)=\sup_{X_{n}+rv(n)^{1/\alpha}B_{\ell^{\alpha}}}d\big(f_{n}(x+h),f_{n}(x)\big),$$

$$(\log n)^{\alpha/2}=o\Big(\log\frac{t_{\delta}(n)^{2}}{v(n)}\Big)\text{ if }\alpha\neq 1,\text{ or }v(n)=o\big(t_{\delta}(n)^{2}\big)\text{ if }\alpha=1.$$

$$I_{\alpha}(x)=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta}(x).$$

$$d\big(f_{n}(X_{n}+v(n)^{1/\alpha}h_{n}),F_{n}(h_{n})\big)\underset{\substack{n\to+\infty\\ n\in N}}{\longrightarrow}0,$$

$$(\log n)^{\alpha/2}=o\Big(\log\frac{1}{L_{2}(n)^{2}v(n)}\Big)\text{ if }\alpha\in(1,2),\text{ and }v(n)=o\Big(\frac{1}{L_{2}(n)^{2}}\Big)\text{ if }\alpha=1.$$

$$\forall x\in\mathcal{X},\quad \tilde{I}_{\alpha}(x)=\inf_{n}\big\{||h||_{\ell^{\alpha}}^{\alpha}: x=F_{n}(h),\ h\in\mathbb{R}^{n}\big\}.$$

$$I_{\alpha}(x)=\sup_{\delta>0}\inf_{B(x,\delta)}\tilde{I}_{\alpha}.$$

$$\big\|t^{-d}\Psi(x+th)-\Psi^{(d)}(h)\big\|\underset{t\to+\infty}{\longrightarrow}0,$$

$$\liminf_{\substack{n\to+\infty\\ n\in N}}\frac{1}{v(n)}\log\nu_{\alpha}^{n}\big(E+v(n)^{1/\alpha}h_{n}\big)\geq-\limsup_{\substack{n\to+\infty\\ n\in N}}c_{\alpha}(h_{n}),$$

$$\limsup_{\substack{n\to+\infty\\ n\in N}}\frac{1}{v(n)}\log\nu_{\alpha}^{n}\big(x\notin E+\{c_{\alpha}\leq rv(n)\}\big)\leq-r,$$

$$I_{n,\delta}^{+}(x)=\inf\big\{||h||_{\ell^{\alpha}}^{\alpha}: d(F_{n}(h),x)<\delta,\ h\in\mathbb{R}_{+}^{n}\big\}.$$

$$I_{\alpha}(x)=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta}^{+}(x).$$

$$\forall A\in\mathcal{H}_{n}^{(\beta)},\quad W_{\alpha}(A)=b\sum_{i}|A_{i,i}|^{\alpha}+\sum_{i<j}\Big(a_{1}|\Re A_{i,j}|^{\alpha}+a_{2}|\Im A_{i,j}|^{\alpha}\Big),$$

$$\mu_{A}=\frac{1}{n}\sum_{i=1}^{n}\delta_{\lambda_{i}},$$

$$\mu_{X/\sqrt{n}}\underset{n\to+\infty}{\leadsto}\mu_{sc},$$

$$\mu_{sc}=\frac{1}{2\pi}\sqrt{4-x^{2}}\,\mathds{1}_{|x|\leq 2}\,dx.$$

$$\lambda_{X/\sqrt{n}}\underset{n\to+\infty}{\longrightarrow}2,$$

$$I_{\alpha}(\mu)=\sup_{\delta>0}\inf_{n\in\mathbb{N}}\big\{W_{\alpha}(A): A\in\mathcal{H}_{n}^{(\beta)},\ d(\mu,\mu_{sc}\boxplus\mu_{n^{1/\alpha}A})<\delta\big\},$$

Taxonomy

Topics: Spectral Theory in Mathematical Physics · advanced mathematical theories · Random Matrices and Applications

Full text

On heavy-tail phenomena in some large deviations problems

Fanny Augeri (Institut de Mathématiques de Toulouse, France)

Abstract

In this paper, we revisit the proof of the large deviations principle of Wiener chaoses partially given by Borell [20], and then by Ledoux [31] in its full form. We show that some heavy-tail phenomena observed in large deviations can be explained by the same mechanism as for the Wiener chaoses, meaning that the deviations are created, in a sense, by translations. More precisely, we prove a general large deviations principle for a certain class of functionals $f_{n}:\mathbb{R}^{n}\to\mathcal{X}$ , where $\mathcal{X}$ is some metric space, under the $n$ -fold probability measure $\nu_{\alpha}^{n}$ , where $\nu_{\alpha}=Y_{\alpha}^{-1}e^{-|x|^{\alpha}}dx$ , $\alpha\in(0,2]$ , for which the large deviations are due to translations. We retrieve, as an application, the large deviations principles known for the Wigner matrices without Gaussian tails in [19], [4], [5] of the empirical spectral measure, the largest eigenvalue, and traces of polynomials. We also apply our large deviations result to the last-passage time, which yields a large deviations principle when the weights follow the law $Z_{\alpha}^{-1}e^{-x^{\alpha}}\mathds{1}_{x\geq 0}dx$ , with $\alpha\in(0,1)$ .

1 Introduction

In [31], Ledoux proposed a large deviations principle for the Wiener chaoses based on the approach Borell gave in [20] for estimating their tail distribution. The main feature that stands out from the proof is that the large deviations of Wiener chaoses are due to translations by elements of the Cameron-Martin space. The lower bound is an application of the Cameron-Martin formula, whereas the upper bound relies on the Gaussian isoperimetric inequality.

More precisely, let $(E,\mathcal{H},\mu)$ be an abstract Wiener space, where $E$ is a separable Banach space, $\mu$ is a Gaussian measure on $E$, and $\mathcal{H}$ is the reproducing kernel (see [33] or [23, chapter 4] for proper definitions). Let also $\Psi$ be a homogeneous Wiener chaos of degree $d$ taking values in some Banach space $B$, that is, a random variable in the subspace spanned in $L^{2}(\mu;B)$ by Hermite polynomials of degree $d$. From [31], we know that $t^{-d}\Psi$ follows a large deviations principle with speed $t^{2}$ and good rate function $I_{\Psi}$ defined by,

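$$\forall x\in B,\quad I_{\Psi}(x)=\inf\Big\{\frac{1}{2}|h|^{2}: x=\Psi^{(d)}(h),\ h\in\mathcal{H}\Big\}, \tag{1}$$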

where $|\ |$ denotes the norm of the reproducing kernel $\mathcal{H}$ , and

$$\forall h\in\mathcal{H},\quad \Psi^{(d)}(h)=\int\Psi(x+h)\,d\mu(x). \tag{2}$$

We believe that Borell and Ledoux’s approach is extremely fruitful, and that it can shed new light on heavy-tail phenomena appearing in the large deviations of certain models, where the large deviations are also created, in a sense, by translations. We already used this approach in a previous work [5] to deal with the large deviations of traces of powers of Gaussian Wigner matrices. Indeed, this problem can be reformulated as understanding the large deviations of Gaussian chaoses defined on spaces with growing dimension. Although this problem cannot be solved directly by using the large deviations principle of Wiener chaoses, the same outline of proof can be carried out in this case, and yields a rate function with a structure similar to (1).

We would like here to push this approach further in a more general setting, and give some elements showing that heavy-tail phenomena in the large deviations of certain models can be understood using the paradigm of the Wiener chaoses. To this end, we propose a general large deviations result for a certain class of functionals $f_{n}:\mathbb{R}^{n}\to\mathcal{X}$, where $\mathcal{X}$ is some metric space, under the $n$-fold probability measure $\nu_{\alpha}^{n}$, where $\nu_{\alpha}=Y_{\alpha}^{-1}e^{-|x|^{\alpha}}dx$, with $\alpha\in(0,2]$, for which the large deviations are governed by translations.

As an application of this result, we will retrieve the large deviations principles of different spectral functionals of the so-called Wigner matrices without Gaussian tails. Introduced in [19] by Bordenave and Caputo, the model of Wigner matrices without Gaussian tails designates Wigner matrices whose entries have tail distributions behaving like $e^{-ct^{\alpha}}$ , with $c>0$ , and $\alpha\in(0,2)$ . This model gives rise to a heavy-tail phenomenon which enables one to derive full large deviations principles for the spectral measure [19] (see [26] in the Wishart matrix case), the largest eigenvalue [4], and the traces of powers [5].

In the more restricted setting where we assume that the entries have a density with respect to Lebesgue measure which is proportional to $e^{-c|x|^{\alpha}}$, with $c>0$ and $\alpha\in(0,2)$, the large deviations principles of these spectral functionals will follow in a unified way from our general large deviations result.

Another application of this result will consist in a large deviations principle for the last-passage time when the weights are independent and have a density on $\mathbb{R}^{+}$ proportional to $e^{-x^{\alpha}}$ for $\alpha\in(0,1)$ .

2 Main results

Let us present the main results of this paper. For $\alpha>0$, we denote by $\nu_{\alpha}$ the probability measure on $\mathbb{R}$ with density $Y_{\alpha}^{-1}e^{-|x|^{\alpha}}$ with respect to Lebesgue measure, and by $\nu_{\alpha}^{n}$ its $n$-fold product measure on $\mathbb{R}^{n}$. Similarly, we denote by $\mu_{\alpha}$ the probability measure on $\mathbb{R}^{+}$ with density $Z_{\alpha}^{-1}e^{-x^{\alpha}}$. We will denote for any $h\in\mathbb{R}^{n}$,

$$||h||_{\ell^{\alpha}}=\Big(\sum_{i=1}^{n}|h_{i}|^{\alpha}\Big)^{1/\alpha}.$$
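To make these objects concrete, here is a minimal Python sketch, not taken from the paper. It relies on two standard facts: $\int_{0}^{\infty}e^{-x^{\alpha}}dx=\Gamma(1+1/\alpha)$, so that $Z_{\alpha}=\Gamma(1+1/\alpha)$ and $Y_{\alpha}=2\Gamma(1+1/\alpha)$, and the observation that if $G$ follows a Gamma law with shape $1/\alpha$ and scale $1$, then $G^{1/\alpha}$ has density proportional to $e^{-x^{\alpha}}$ on $\mathbb{R}^{+}$. All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def normalizing_constant_two_sided(alpha):
    """Y_alpha = integral over R of exp(-|x|^alpha) dx = 2 * Gamma(1 + 1/alpha)."""
    return 2.0 * math.gamma(1.0 + 1.0 / alpha)


def normalizing_constant_one_sided(alpha):
    """Z_alpha = integral over [0, inf) of exp(-x^alpha) dx = Gamma(1 + 1/alpha)."""
    return math.gamma(1.0 + 1.0 / alpha)


def sample_mu_alpha(alpha, rng):
    """Draw one sample from mu_alpha via the change of variables G -> G**(1/alpha)."""
    return rng.gammavariate(1.0 / alpha, 1.0) ** (1.0 / alpha)


def sample_nu_alpha(alpha, rng):
    """nu_alpha is symmetric: attach an independent random sign to a mu_alpha sample."""
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * sample_mu_alpha(alpha, rng)


def l_alpha_norm(h, alpha):
    """||h||_{l^alpha} = (sum_i |h_i|^alpha)^(1/alpha); only a quasi-norm when alpha < 1."""
    return sum(abs(x) ** alpha for x in h) ** (1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility of the illustration only
    x_n = [sample_nu_alpha(0.5, rng) for _ in range(5)]  # one draw of X_n for n = 5
    print(x_n, l_alpha_norm(x_n, 0.5))
```

For $\alpha=2$ the first function returns $\sqrt{\pi}$, which matches the Gaussian density $e^{-x^{2}}/\sqrt{\pi}$.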

We recall that a sequence of random variables $(Z_{n})_{n\in\mathbb{N}}$ taking values in some topological space $\mathcal{X}$ equipped with the Borel $\sigma$-field $\mathcal{B}$ follows a large deviations principle (LDP) with speed $\upsilon(n)$ and rate function $J:\mathcal{X}\to[0,+\infty]$ if $J$ is lower semicontinuous, $\upsilon(n)$ increases to infinity, and for all $B\in\mathcal{B}$,

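$$-\inf_{B^{\circ}}J\leq\liminf_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_{n}\in B)\leq\limsup_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_{n}\in B)\leq-\inf_{\overline{B}}J,$$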

where $B^{\circ}$ denotes the interior of $B$ and $\overline{B}$ the closure of $B$ . We recall that $J$ is lower semicontinuous if its $t$ -level sets $\{x\in\mathcal{X}:J(x)\leq t\}$ are closed, for any $t\in[0,+\infty)$ . Furthermore, if all the level sets are compact, then we say that $J$ is a good rate function.

The purpose of the general large deviations result we will present, is to identify a class of functionals $f_{n}:\mathbb{R}^{n}\to\mathcal{X}$ , where $\mathcal{X}$ is some metric space, for which the large deviations are created by translations. Let us describe first informally the assumptions we will make. Let $X_{n}$ follow the law $\nu_{\alpha}^{n}$ . We will assume that $f_{n}(X_{n})$ admits a kind of deterministic equivalent under additive deformations, given by a certain function $F_{n}$ , that is,

$$f_{n}(X_{n}+v(n)^{1/\alpha}h_{n})\simeq F_{n}(h_{n}), \tag{3}$$

in probability, for any sequence $h_{n}\in\mathbb{R}^{n}$ , $\sup_{n}||h_{n}||_{\ell^{\alpha}}<+\infty$ , where $v(n)$ will eventually be the speed of deviations. It is convenient to think of $F_{n}(h_{n})$ as a deterministic equivalent of $f_{n}(X_{n}+v(n)^{1/\alpha}h_{n})$ , where we took the large $n$ limit on the variable $X_{n}$ . Under this assumption, we will show that a large deviations lower bound for $f_{n}(X_{n})$ at speed $v(n)$ , holds with rate function,

$$J_{\alpha}=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta},$$

where

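$$\forall x\in\mathcal{X},\quad I_{n,\delta}(x)=\inf\big\{||h||_{\ell^{\alpha}}^{\alpha}: d(F_{n}(h),x)<\delta,\ h\in\mathbb{R}^{n}\big\}.$$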

This rate function $J_{\alpha}$ can be interpreted by saying that to make a deviation around some $F_{n}(h_{n})$ , $X_{n}$ needs to make a translation by $v(n)^{1/\alpha}h_{n}$ , which one pays at the exponential scale $v(n)$ by $||h_{n}||_{\ell^{\alpha}}^{\alpha}$ .
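The cost $||h||_{\ell^{\alpha}}^{\alpha}$ behaves very differently for $\alpha<1$ and $\alpha=2$, which gives some intuition for the heavy-tail picture. The following Python sketch is an illustration under an assumption that is not part of the paper: we fix the total displacement $\sum_{i}|h_{i}|$ and compare the cost of concentrating it on one coordinate with the cost of spreading it evenly. The helper names are hypothetical.

```python
def translation_cost(h, alpha):
    """Cost ||h||_{l^alpha}^alpha = sum_i |h_i|^alpha of translating by h."""
    return sum(abs(x) ** alpha for x in h)


def spread_evenly(total, k, n):
    """Vector of length n carrying `total` split equally over its first k coordinates."""
    return [total / k] * k + [0.0] * (n - k)


if __name__ == "__main__":
    n, total = 8, 4.0
    for alpha in (0.5, 2.0):
        costs = {k: translation_cost(spread_evenly(total, k, n), alpha) for k in (1, 4, 8)}
        print(alpha, costs)
```

Since $t\mapsto t^{\alpha}$ is concave for $\alpha<1$, the concentrated translation is the cheapest among those with the same total displacement, whereas for $\alpha=2$ spreading is cheaper. This is only a heuristic companion to the statement above, not a step of the proof.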

For the upper bound, we will further assume that for any $r>0$, the deterministic equivalent (3) holds uniformly in $||h_{n}||_{\ell^{\alpha}}\leq r$. The upper bound will rely on sharp large deviation inequalities for $\nu_{\alpha}^{n}$, where we will need, except in the Gaussian case, to neglect the Euclidean enlargements appearing naturally. We thus make the assumption that $f_{n}$ has a small, in expectation, local Lipschitz constant with respect to $||\ ||_{\ell^{2}}$ when $\alpha<2$. Finally, under some compactness property of $F_{n}$, we will prove that a large deviations upper bound holds for $f_{n}(X_{n})$ with speed $v(n)$ and rate function,

$$I_{\alpha}=\sup_{\delta>0}\inf_{n\in N}I_{n,\delta}.$$

Thus, if we moreover assume that the upper bound rate function $I_{\alpha}$ matches the lower bound rate function $J_{\alpha}$, we will get a full large deviations principle with speed $v(n)$. More precisely, we will prove the following result.

2.1 Theorem.

Let $(\mathcal{X},d)$ be a metric space. Let $\alpha\in(0,2]$ and $N\subset\mathbb{N}$ an infinite subset. Let $X_{n}$ be a random variable with law $\nu_{\alpha}^{n}$ . Let $f_{n},F_{n}:\mathbb{R}^{n}\to\mathcal{X}$ be measurable functions. Let $(v(n))_{n\in N}$ be a sequence going to $+\infty$ . Define for $\delta>0$ and $n\in N$ , the function

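$$\forall x\in\mathcal{X},\quad I_{n,\delta}(x)=\inf\big\{||h||_{\ell^{\alpha}}^{\alpha}: d(F_{n}(h),x)<\delta,\ h\in\mathbb{R}^{n}\big\}.$$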

We set

$$\forall x\in\mathcal{X},\quad I_{\alpha}(x)=\sup_{\delta>0}\inf_{n\in N}I_{n,\delta}(x).$$

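We assume:

(i). (Uniform deterministic equivalent). For any $r>0$,

$$\sup_{h_{n}\in rB_{\ell^{\alpha}}}d\big(f_{n}(X_{n}+v(n)^{1/\alpha}h_{n}),F_{n}(h_{n})\big)\underset{\substack{n\to+\infty\\ n\in N}}{\longrightarrow}0,$$

in probability.

(ii). (Control of the Lipschitz constant). If $\alpha<2$, then for any $\delta>0$ and $r>0$, there is a sequence $t_{\delta}(n)$ such that,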

$$\mathbb{E}\sup_{||h||_{\ell^{2}}\leq t_{\delta}(n)}\mathcal{L}_{n}(h)\leq\delta,$$

with

$$\mathcal{L}_{n}(h)=\sup_{X_{n}+rv(n)^{1/\alpha}B_{\ell^{\alpha}}}d\big(f_{n}(x+h),f_{n}(x)\big),$$

satisfying,

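$$(\log n)^{\alpha/2}=o\Big(\log\frac{t_{\delta}(n)^{2}}{v(n)}\Big)\text{ if }\alpha\neq 1,\text{ or }v(n)=o\big(t_{\delta}(n)^{2}\big)\text{ if }\alpha=1.$$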

(iii). (Compactness). For any $r>0$, $\cup_{n\in N}F_{n}(rB_{\ell^{\alpha}})$ is relatively compact.

(iv). (Upper bound = lower bound). For any $x\in\mathcal{X}$,

$$I_{\alpha}(x)=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta}(x). \tag{6}$$

Then $(f_{n}(X_{n}))_{n\in N}$ satisfies a LDP with speed $v(n)$ and good rate function $I_{\alpha}$ .

Let us make some remarks on the assumptions of this theorem.

2.2 Remarks.

(a). We will prove that under the assumption that for any sequence $h_{n}\in\mathbb{R}^{n}$ , $n\in N$ , such that $\sup_{n}||h_{n}||_{\ell^{\alpha}}<+\infty$ ,

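$$d\big(f_{n}(X_{n}+v(n)^{1/\alpha}h_{n}),F_{n}(h_{n})\big)\underset{\substack{n\to+\infty\\ n\in N}}{\longrightarrow}0, \tag{7}$$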

in probability, the lower bound of the LDP holds with the rate function (6).

(b). The assumption $(i)$ that the approximation (7) holds uniformly in $h_{n}\in rB_{\ell^{\alpha}}$ is crucial for deriving the upper bound of the LDP with rate function (4), and is one of the most constraining assumptions of Theorem 2.1. In the applications we develop when $\alpha<2$ , this is proven by some concentration inequality and chaining arguments, which can be carried out successfully due to the “sparsity” of the ball $B_{\ell^{\alpha}}$ .

(c). The formulation of assumption $(ii)$ on the Lipschitz constant of $f_{n}$ is specially designed to include polynomial functionals $f_{n}$, such as the trace of a polynomial in random matrices. In other words, it says that the “local” Lipschitz constant of $f_{n}$ is small enough, uniformly on the set $X_{n}+rv(n)^{1/\alpha}B_{\ell^{\alpha}}$. Note that when $f_{n}$ is $L_{2}(n)$-Lipschitz with respect to $||\ ||_{\ell^{2}}$, a sufficient condition for assumption $(ii)$ to be fulfilled is

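$$(\log n)^{\alpha/2}=o\Big(\log\frac{1}{L_{2}(n)^{2}v(n)}\Big)\text{ if }\alpha\in(1,2),\text{ and }v(n)=o\Big(\frac{1}{L_{2}(n)^{2}}\Big)\text{ if }\alpha=1.$$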

This assumption ensures that the deviations of $f_{n}(X_{n})$ are explained by a heavy-tail phenomenon. For example, it fails to hold for empirical means under $\nu_{\alpha}^{n}$ when $\alpha\in[1,2)$ .

(d). The compactness assumption of $(iii)$ is made to ensure that $I_{\alpha}$ is a good rate function. As one can observe in the proof, without it, the upper bound of the LDP holds only for compact sets.

(e). The rate function $I_{\alpha}$ can be simplified in certain cases. Define the function $\tilde{I}_{\alpha}$ by,

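$$\forall x\in\mathcal{X},\quad \tilde{I}_{\alpha}(x)=\inf_{n}\big\{||h||_{\ell^{\alpha}}^{\alpha}: x=F_{n}(h),\ h\in\mathbb{R}^{n}\big\}.$$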

One can see that,

$$I_{\alpha}(x)=\sup_{\delta>0}\inf_{B(x,\delta)}\tilde{I}_{\alpha}.$$

Thus if $\tilde{I}_{\alpha}$ is lower semi-continuous, then $I_{\alpha}=\tilde{I}_{\alpha}$ .

The proof is in line with the ideas and the framework developed by Borell and Ledoux in [20], [21] and [31], [23], for the large deviations for Wiener chaoses. To make a parallel with their approach, one can observe that the first step in their proof is to show some deterministic equivalent for the Wiener chaoses when deformed in a direction of the reproducing kernel, that is, by [23, chapter 5 (5.7)], for any $h\in\mathcal{H}$ ,

$$\big\|t^{-d}\Psi(x+th)-\Psi^{(d)}(h)\big\|\underset{t\to+\infty}{\longrightarrow}0,$$

in probability, and even uniformly in $h\in\mathcal{O}$, the unit ball of $\mathcal{H}$, along a discretization of $\Psi$ by [23, chapter 5 (5.9)], where $\Psi^{(d)}$ is defined in (2). Similarly, we make the assumption $(i)$ that a uniform deterministic equivalent holds for the functionals $f_{n}$.

For the lower bound, we replace the use of the Cameron-Martin formula, used in the context of abstract Wiener space, with a lower bound estimate of the probability of translated events, that is,

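$$\liminf_{\substack{n\to+\infty\\ n\in N}}\frac{1}{v(n)}\log\nu_{\alpha}^{n}\big(E+v(n)^{1/\alpha}h_{n}\big)\geq-\limsup_{\substack{n\to+\infty\\ n\in N}}c_{\alpha}(h_{n}), \tag{10}$$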

for a given sequence $h_{n}\in\mathbb{R}^{n}$ , subsets $E$ such that $\liminf_{n}\nu_{\alpha}^{n}(E)>0$ , and where $c_{\alpha}$ is some weight function. In the Gaussian case $\alpha=2$ , the translation formula of the Gaussian measure gives this estimate with $c_{\alpha}(h)=||h||_{\ell^{2}}^{2}$ . When $\alpha<2$ , one can mimic the Gaussian case to get such an estimate (10) with $c_{\alpha}(h)=||h||_{\ell^{\alpha}}^{\alpha}$ , whereas when $\alpha>2$ , we believe that there is a competition between the speed and the dimension which is not workable in the applications.
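As a one-dimensional sanity check of the Gaussian cost $c_{2}(h)=||h||_{\ell^{2}}^{2}$, the following Python sketch evaluates $t^{-2}\log\nu_{2}(E+th)$ for $E=[-1,1]$, where $\nu_{2}$ has density $e^{-x^{2}}/\sqrt{\pi}$. This is a toy setting chosen for illustration (dimension fixed to $1$, with $t^{2}$ playing the role of the speed), not the regime of the paper where $n\to+\infty$. The function names are hypothetical.

```python
import math


def nu2_interval(a, b):
    """nu_2([a, b]) for 0 <= a <= b, where nu_2 has density exp(-x^2)/sqrt(pi).

    nu_2([a, b]) = (erf(b) - erf(a)) / 2 = (erfc(a) - erfc(b)) / 2; the erfc form
    keeps relative precision far in the tail.
    """
    return 0.5 * (math.erfc(a) - math.erfc(b))


def normalized_log_prob(t, h=1.0):
    """(1 / t^2) * log nu_2(E + t*h) with E = [-1, 1]; requires t*h >= 1."""
    shift = t * h
    return math.log(nu2_interval(shift - 1.0, shift + 1.0)) / (t * t)


if __name__ == "__main__":
    for t in (5.0, 10.0, 20.0):
        # The values decrease slowly towards -h^2 = -1 as t grows.
        print(t, normalized_log_prob(t))
```

The convergence is slow because of the cross term $2t$ in $(t-1)^{2}$, which is negligible only on the scale $t^{2}$.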

Whereas the Gaussian isoperimetric inequality is used in the proof of the upper bound of the deviations of Wiener chaoses, ours will rely on sharp large deviation inequalities for $\nu_{\alpha}^{n}$ with respect to the weight function $c_{\alpha}$ , that is

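$$\limsup_{\substack{n\to+\infty\\ n\in N}}\frac{1}{v(n)}\log\nu_{\alpha}^{n}\big(x\notin E+\{c_{\alpha}\leq rv(n)\}\big)\leq-r, \tag{11}$$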

for some “large enough” subsets $E$. We will show that we can take $c_{\alpha}(h)=||h||_{\ell^{\alpha}}^{\alpha}$, which together with (10) will allow us to make the upper and lower bounds match. In the Gaussian case, this is due to the Gaussian isoperimetric inequality, whereas when $\alpha<2$, we will have to appeal to sharp inf-convolution inequalities for $\nu_{\alpha}^{n}$. This is in particular where assumption $(ii)$ plays its role, since it enables us, when $\alpha<2$, to neglect the Euclidean balls which come naturally in the deviation inequality of $\nu_{\alpha}^{n}$, and to consider subsets $E$ which are indeed large enough.

These two estimates (10) and (11) are behind the limitation in Theorem 2.1 to the probability measures $\nu_{\alpha}^{n}$ for $\alpha\in(0,2]$ . For example, if one replaces the measure $\nu_{\alpha}$ by the probability measure on $\mathbb{R}_{+}$ with density $Z_{\alpha}^{-1}e^{-x^{\alpha}}$ , one can show that (10) holds provided $h_{n}$ has all its coordinates non-negative (and $n=o(v(n))$ if $\alpha>1$ ). But then, we will have to prove (11) with $c_{\alpha}(h)=||h||_{\ell^{\alpha}}^{\alpha}$ if the coordinates of $h$ are non-negative, and $+\infty$ otherwise, which we do not know how to obtain for the subsets $E$ we are dealing with in the proof.

This said, we can give a version of Theorem 2.1 for the probability measure $\mu_{\alpha}$ , with density $Z_{\alpha}^{-1}e^{-x^{\alpha}}\mathds{1}_{x\geq 0}$ , which will be sufficient to prove a LDP result for the last-passage time.

2.3 Theorem.

Let $\alpha\in(0,1]$ and $N\subset\mathbb{N}$ an infinite subset. Let $X_{n}$ be a random variable distributed according to $\mu_{\alpha}^{n}$. Let $f_{n},F_{n}:\mathbb{R}^{n}\to\mathbb{R}$ be measurable functions. Let $(v(n))_{n\in N}$ be a sequence going to $+\infty$. Define $I_{\alpha}$ as in (4), and for $\delta>0$ and $n\in N$,

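$$I_{n,\delta}^{+}(x)=\inf\big\{||h||_{\ell^{\alpha}}^{\alpha}: d(F_{n}(h),x)<\delta,\ h\in\mathbb{R}_{+}^{n}\big\}.$$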

Assume $(i)$-$(ii)$-$(iii)$ from Theorem 2.1, and,

(iv)’. For any $x\in\mathcal{X}$,

$$I_{\alpha}(x)=\sup_{\delta>0}\limsup_{\substack{n\to+\infty\\ n\in N}}I_{n,\delta}^{+}(x).$$

Then $(f_{n}(X_{n}))_{n\in N}$ satisfies a LDP with speed $v(n)$ and good rate function $I_{\alpha}$ .

2.4 Remark.

We only state this result for $\alpha\in(0,1]$ because for $\alpha>1$ , we know how to get the lower bound (10) for a sequence $h_{n}\in\mathbb{R}_{+}^{n}$ only under the additional assumption on the speed that $n=o(v(n))$ . But this condition and the requirement $(ii)$ cannot be met simultaneously in the applications we will present.

2.1 Applications to Wigner matrices

We present now the applications of Theorem 2.1 to Wigner matrices. We denote by $\mathcal{H}_{n}^{(\beta)}$ the set of Hermitian matrices when $\beta=2$ , and symmetric matrices when $\beta=1$ , of size $n$ . We define $\mathcal{S}_{\alpha}$ the class of Wigner matrices whose law is of density $Z_{W_{\alpha}}^{-1}e^{-W_{\alpha}}$ with respect to the Lebesgue measure $\ell_{n}^{(\beta)}$ on $\mathcal{H}_{n}^{(\beta)}$ , where

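$$\forall A\in\mathcal{H}_{n}^{(\beta)},\quad W_{\alpha}(A)=b\sum_{i}|A_{i,i}|^{\alpha}+\sum_{i<j}\Big(a_{1}|\Re A_{i,j}|^{\alpha}+a_{2}|\Im A_{i,j}|^{\alpha}\Big), \tag{12}$$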

for some $b,a_{1},a_{2}\in(0,+\infty)$ , and where $Z_{W_{\alpha}}$ is the normalizing constant.

We will denote by $\mu_{A}$ the empirical spectral measure of a matrix $A\in\mathcal{H}_{n}^{(\beta)}$ , that is,

$$\mu_{A}=\frac{1}{n}\sum_{i=1}^{n}\delta_{\lambda_{i}},$$

where $\lambda_{1},...,\lambda_{n}$ are the eigenvalues of $A$ , and we will denote by $\lambda_{A}$ the largest eigenvalue of $A$ .

We will say that $X$ is a Wigner matrix if $X$ is a random Hermitian matrix with independent coefficients (up to the symmetry) such that $(X_{i,i})_{1\leq i\leq n}$ are identically distributed and $(X_{i,j})_{i<j}$ are identically distributed. If $\mathbb{E}|X_{1,2}-\mathbb{E}X_{1,2}|^{2}=1$, then by Wigner’s theorem (see [2, Theorem 2.1.1, Exercise 2.1.16], [6, Theorem 2.5]), almost surely,

$$\mu_{X/\sqrt{n}}\underset{n\to+\infty}{\leadsto}\mu_{sc},$$

where $\leadsto$ denotes the weak convergence, and $\mu_{sc}$ is the semi-circular law defined by,

$$\mu_{sc}=\frac{1}{2\pi}\sqrt{4-x^{2}}\,\mathds{1}_{|x|\leq 2}\,dx.$$
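As a quick numerical check of this normalization, the following Python sketch integrates the semi-circular density $\frac{1}{2\pi}\sqrt{4-x^{2}}$ on $[-2,2]$ with a plain midpoint rule. It is only an illustration: the helper names and the choice of quadrature are assumptions made here, not part of the paper.

```python
import math


def semicircle_density(x):
    """Density of the semi-circular law: sqrt(4 - x^2) / (2*pi) on [-2, 2], else 0."""
    if abs(x) > 2.0:
        return 0.0
    return math.sqrt(4.0 - x * x) / (2.0 * math.pi)


def semicircle_moment(k, steps=20000):
    """Approximate the k-th moment of the semi-circular law with the midpoint rule."""
    a, b = -2.0, 2.0
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        total += (x ** k) * semicircle_density(x)
    return total * width


if __name__ == "__main__":
    # Total mass, variance and fourth moment; the even moments are the Catalan numbers.
    print(semicircle_moment(0), semicircle_moment(2), semicircle_moment(4))
```

The unit second moment is consistent with the normalization $\mathbb{E}|X_{1,2}-\mathbb{E}X_{1,2}|^{2}=1$ used in Wigner’s theorem.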

If we assume furthermore that $\mathbb{E}{X_{1,1}}^{2}<+\infty$ and $\mathbb{E}X_{1,2}^{4}<+\infty$ , then we know by [7], [6, Theorem 5.1],

$$\lambda_{X/\sqrt{n}}\underset{n\to+\infty}{\longrightarrow}2,$$

in probability.

As a consequence of Theorem 2.1, we have the following large deviations principles, originally proven in [19], in the case of the empirical spectral measure and in [4] for the largest eigenvalue.

2.5 Theorem.

Let $\alpha\in(0,2)$. Assume $X$ is in the class $\mathcal{S}_{\alpha}$ and that $\mathbb{E}|X_{1,2}|^{2}=1$. Then $(\mu_{X/\sqrt{n}})_{n\in\mathbb{N}}$ follows a LDP with respect to the weak topology with speed $n^{1+\alpha/2}$ and good rate function $I_{\alpha}$, defined for any probability measure $\mu$ on $\mathbb{R}$ by,

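$$I_{\alpha}(\mu)=\sup_{\delta>0}\inf_{n\in\mathbb{N}}\big\{W_{\alpha}(A): A\in\mathcal{H}_{n}^{(\beta)},\ d(\mu,\mu_{sc}\boxplus\mu_{n^{1/\alpha}A})<\delta\big\},$$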

where $d$ is a distance compatible with the weak topology, $\boxplus$ stands for the free convolution (see [2, section 2.3.3] for a definition), and $\mu_{sc}$ is the semi-circular law.

2.6 Remark.

In [19], the rate function $I_{\alpha}$ is computed explicitly for measures $\mu_{sc}\boxplus\nu$ , where $\nu$ is a symmetric probability measure, for which we have

[TABLE]

2.7 Theorem.

Let $\alpha\in(0,2)$. Assume that $X$ is in the class $\mathcal{S}_{\alpha}$ and that $\mathbb{E}|X_{1,2}|^{2}=1$. Then $(\lambda_{X/\sqrt{n}})_{n\in\mathbb{N}}$ follows a LDP with speed $n^{\alpha/2}$ and good rate function $J_{\alpha}$, defined for any $x\in\mathbb{R}$ by,

[TABLE]

with

[TABLE]

and where $g_{\mu_{sc}}$ denotes the Stieltjes transform of $\mu_{sc}$ , that is,

[TABLE]

2.8 Remark.

The constant $c$ can be computed explicitly, we refer the reader to [4, section 8] for more details.

If $\textbf{X}=(X_{1},...,X_{p})$ is a collection of independent centered Wigner matrices such that $\mathbb{E}M_{1,2}^{2}=1$ for any $M\in\{X_{1},...,X_{p}\}$, and with entries having finite moments of order $d$, then for any non-commutative polynomial $P\in\mathbb{C}\langle\textbf{X}\rangle$ of total degree $d$, we know by [2, Theorem 5.4.2],

[TABLE]

in probability, where $\tau_{n}=\frac{1}{n}\mathrm{tr}$ and $\textbf{s}=(s_{1},...,s_{p})$ is a free family of $p$ semi-circular variables in a non-commutative probability space $(\mathcal{A},\tau)$ (see [2, section 5.3] for a definition).

Concerning the large deviations of such normalized traces of polynomials in independent matrices in the class $\mathcal{S}_{\alpha}$ , with $\alpha\in(0,2]$ we have the following result.

2.9 Theorem.

Let $\alpha\in(0,2]$ and $p,d\in\mathbb{N}$ , $d>\alpha$ . Assume $\textbf{X}=(X_{1},...,X_{p})$ is a collection of independent Wigner matrices in the class $\mathcal{S}_{\alpha}$ , such that for $M\in\{X_{1},...,X_{p}\}$ , $\mathbb{E}|M_{1,2}|^{2}=1$ . We assume that $X_{i}$ is distributed according to $Z_{W_{\alpha}}^{-1}e^{-W_{\alpha,i}}d\ell_{n}^{(\beta)}$ , where $W_{\alpha,i}$ is of the form (12). Let $P\in\mathbb{C}\langle\textbf{X}\rangle$ be a non-commutative polynomial of total degree $d$ . We denote by $\tau_{n}$ the state $\frac{1}{n}\mathrm{tr}$ on $\mathcal{H}_{n}^{(\beta)}$ . The sequence

[TABLE]

satisfies a LDP with speed $n^{\alpha\big{(}\frac{1}{2}+\frac{1}{d}\big{)}}$ and good rate function $K_{\alpha}$ , defined for all $x\in\mathbb{R}$ by

[TABLE]

where for any $\sigma\in\{-1,1\}$ ,

[TABLE]

where $W_{\alpha}(\textbf{H})=\sum_{i=1}^{p}W_{\alpha,i}(H_{i})$ and $P_{d}$ is the homogeneous part of degree $d$ of $P$ .

2.10 Remark.

Unlike the previous results on deviations of the spectral measure and the largest eigenvalue, this one allows us to consider Gaussian matrices. As we will see in the proof, the mechanism of deviations of traces of polynomials is the same in both cases $\alpha\in(0,2)$ and $\alpha=2$. This is essentially due to the fact that, even in the Gaussian case, a heavy-tail phenomenon appears when the degree of the polynomial is strictly greater than $2$, since there are no exponential moments.

This large deviations principle is an extension, although in a more restricted setting, of the large deviations principle proven in [5], in the case where $p=1$ and $P=X^{d}$ for some $d\geq 3$ , for Gaussian matrices and Wigner matrices without Gaussian tails.
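The speeds appearing in Theorems 2.5, 2.7 and 2.9 can be tabulated side by side. The short Python helper below is a bookkeeping sketch only, with a hypothetical function name; it simply evaluates the exponents $1+\alpha/2$, $\alpha/2$ and $\alpha(\frac{1}{2}+\frac{1}{d})$ stated in those theorems, under their respective ranges of $\alpha$.

```python
def ldp_speed_exponents(alpha, d):
    """Exponents e such that the LDP speed is n**e, as stated in Theorems 2.5, 2.7, 2.9.

    alpha: tail parameter in (0, 2]; d: total degree of the polynomial, with d > alpha.
    Theorems 2.5 and 2.7 are stated for alpha in (0, 2) only, so those entries are
    None when alpha == 2.
    """
    if not (0.0 < alpha <= 2.0):
        raise ValueError("alpha must lie in (0, 2]")
    if d <= alpha:
        raise ValueError("Theorem 2.9 requires d > alpha")
    heavy_tailed = alpha < 2.0
    return {
        "spectral_measure": 1.0 + alpha / 2.0 if heavy_tailed else None,
        "largest_eigenvalue": alpha / 2.0 if heavy_tailed else None,
        "trace_of_polynomial": alpha * (0.5 + 1.0 / d),
    }


if __name__ == "__main__":
    print(ldp_speed_exponents(1.0, 4))
    print(ldp_speed_exponents(2.0, 3))
```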

2.2 Application to last-passage percolation

Let $d\in\mathbb{N}$ , $d\geq 2$ . We denote by $\mathbb{Z}_{+}^{d}$ the subset of vectors of $\mathbb{Z}^{d}$ with non-negative coordinates. Let $(X_{v})_{v\in\mathbb{Z}_{+}^{d}}$ be a collection of weights. We will call a directed path a path in which at each step, one coordinate is increased by $1$ . For $v_{1},v_{2}\in\mathbb{Z}^{d}_{+}$ , we denote by $\Pi(v_{1},v_{2})$ the set of directed paths from $v_{1}$ to $v_{2}$ . We will identify a path with the set of its vertices. We define the last-passage time $T_{v_{1},v_{2}}(X)$ , by

$$T_{v_{1},v_{2}}(X)=\max_{\pi\in\Pi(v_{1},v_{2})}\sum_{v\in\pi}X_{v}.$$
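For intuition, the last-passage time can be computed by dynamic programming. The Python sketch below treats the case $d=2$ and assumes the usual definition, namely the maximum over directed paths of the sum of the weights along the path; the function name and the nested-list representation of the weights are illustrative choices, not the paper's notation.

```python
def last_passage_time(weights):
    """Last-passage time from (0, 0) to the opposite corner of a 2D grid.

    weights[i][j] is the weight X_v at the vertex v = (i, j). A directed path
    increases exactly one coordinate by 1 at each step, so the best path into
    (i, j) arrives either from (i - 1, j) or from (i, j - 1).
    """
    rows, cols = len(weights), len(weights[0])
    best = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i > 0 and j > 0:
                previous = max(best[i - 1][j], best[i][j - 1])
            elif i > 0:
                previous = best[i - 1][j]
            elif j > 0:
                previous = best[i][j - 1]
            else:
                previous = 0.0  # the origin has no predecessor
            best[i][j] = weights[i][j] + previous
    return best[rows - 1][cols - 1]


if __name__ == "__main__":
    grid = [[1.0, 2.0], [3.0, 4.0]]
    print(last_passage_time(grid))
```

On the $2\times 2$ grid above there are two directed paths, with sums $1+2+4$ and $1+3+4$, so the recursion selects the second one.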

We know by a work of Martin [34] that if the weights $X_{v}$ are i.i.d. random variables with common distribution function $F$ satisfying,

[TABLE]

then for any $v\in\mathbb{R}^{d}_{+}$ ,

[TABLE]

where $g$ is a continuous function on $\mathbb{R}^{d}_{+}$ .

As an application of Theorem 2.3, we will get the following LDP for the last-passage time.

2.11 Theorem.

Let $\alpha\in(0,1)$. For any $n\in\mathbb{N}$, we set $T(X)=T_{0,(n,...,n)}(X)$. Let $(X_{v})_{v\in\mathbb{Z}^{d}_{+}}$ be a family of i.i.d. random variables distributed according to $\mu_{\alpha}$. The sequence $T(X)/n$ satisfies a LDP with speed $n^{\alpha}$ and good rate function $L_{\alpha}$, defined by

[TABLE]

2.3 Concentration inequalities

In order to prove that assumption $(i)$ holds in the context of Wigner matrices in the class $\mathcal{S}_{\alpha}$ when $\alpha\in(0,2)$ for the largest eigenvalue and the empirical spectral measure, we will prove some concentration inequalities for Wigner matrices which we would like to present as they can be of independent interest.

To derive such concentration inequalities for functions of the spectrum of random matrices, we will follow the classical argument which consists in considering our functionals as functions of the entries, and taking advantage of the concentration property of the law of the underlying random matrix. This approach is made possible in the setting where the spectrum is a smooth function of the entries, which will be our case as we will work with Hermitian matrices.

For Wigner matrices with bounded entries, or satisfying a Log-Sobolev inequality, or also for certain unitarily or orthogonally invariant models, concentration inequalities for Lipschitz (convex) linear statistics of the eigenvalues and for the largest eigenvalue, have been extensively studied by Guionnet-Zeitouni [28], Guionnet [27, Part II], and Ledoux [32, Chapter 8 §8.5] (see also [2, sections 2.3, 4.4]).

More precisely, we will provide concentration inequalities for the linear statistics, the spectral measure and the largest eigenvalue of random Hermitian matrices satisfying a certain concentration property which will be indexed by some $\alpha\in(0,2]$ . As we will see, this concentration property will capture the gradation of speeds of large deviations for the spectral functionals we are interested in, as it has been observed in Theorems 2.5 and 2.7.

We now present the concentration property with which we will be working.

2.12 Definition.

Let $\alpha\in(0,2]$ . We will say in the following that a Wigner matrix $X$ satisfies the concentration property $\mathcal{C}_{\alpha}$ , if there is a constant $\kappa>0$ , such that for any Borel subset $A$ of $\mathcal{H}_{n}^{(\beta)}$ , such that $\mathbb{P}(X\in A)\geq 1/2$ , and any $t>0$ ,

[TABLE]

if $\alpha\in[1,2]$ , and

[TABLE]

if $\alpha\in(0,1)$ , where for any $p>0$ ,

[TABLE]

with

[TABLE]

When $\alpha\in[1,2]$ , the motivation for defining this concentration property $\mathcal{C}_{\alpha}$ comes from Talagrand’s famous two-level deviation inequality [39] for the measure $\nu_{\alpha}^{n}$ , which says that there is a constant $L>0$ such that for any $n\in\mathbb{N}$ , any Borel subset $A$ of $\mathbb{R}^{n}$ with $\nu_{\alpha}^{n}(A)>0$ , and $r>0$ ,

[TABLE]

and similarly for $\mu_{\alpha}$ .

In particular, the Wigner matrices in the class $\mathcal{S}_{\alpha}$ for $\alpha\in[1,2]$ satisfy the concentration property $\mathcal{C}_{\alpha}$ with some $\kappa$ depending on the parameters $b,a_{1},a_{2}$ of the law of $X$ (see (12)). More generally, we know by the results of Bobkov-Ledoux [16, Corollary 3.2], and Gozlan [25, Proposition 1.2] that if $X$ is a Wigner matrix with entries satisfying a certain Poincaré-type inequality, where the underlying metric on $\mathbb{R}^{m}$ , $m=1,2$ , is the following,

[TABLE]

where $\omega_{\alpha}(t)=\mathrm{sg}(t)\max(|t|,|t|^{\alpha})$ , $\mathrm{sg}(t)$ standing for the sign of $t$ , then $X$ satisfies the concentration property $\mathcal{C}_{\alpha}$ with some constant $\kappa$ depending on the spectral gap. We will go into more detail about this functional inequality in section 5, and present a workable criterion for a Wigner matrix to satisfy $\mathcal{C}_{\alpha}$ when $\alpha\in[1,2]$ .

When $\alpha\in(0,1)$ , the concentration property of the law of Wigner matrices in the class $\mathcal{S}_{\alpha}$ differs significantly from the case where $\alpha\in[1,2]$ . We know by Talagrand [38, Proposition 5.1] that as $\nu_{\alpha}$ does not have exponential tails, $\nu_{\alpha}^{n}$ cannot satisfy a dimension-free concentration inequality. Transporting $\nu_{1}^{n}$ onto $\nu_{\alpha}^{n}$ , we will prove the following deviation inequality.

2.13 Proposition.

Let $n\in\mathbb{N}$ , $n\geq 2$ . There is a constant $c>0$ depending on $\alpha$ , such that for any $r>0$ , $A$ Borel subset of $\mathbb{R}^{n}$ , and $C>0$ such that $\nu_{\alpha}^{n}(A)>1/C$ ,

[TABLE]

We will discuss in remark 5.4 in section 5.2 the optimality of such a deviation inequality for $\nu_{\alpha}$ . The above proposition justifies the definition of the concentration property $\mathcal{C}_{\alpha}$ in the case where $\alpha\in(0,1)$ , as it implies that Wigner matrices in the class $\mathcal{S}_{\alpha}$ satisfy this property when $\alpha\in(0,1)$ .

Regarding the linear statistics of Wigner matrices having concentration $\mathcal{C}_{\alpha}$ , we will consider different families of functions depending on whether $\alpha\in(0,1)$ or $\alpha\in[1,2]$ . To this end, we define $\mathcal{M}_{s}^{\alpha}$ as the set of finite signed measures $\sigma$ whose total variation $|\sigma|$ has a finite $\alpha^{\text{th}}$ -moment. Following [37, Chapter 2 §5.1], we define, when $\alpha\in(0,1)$ , the fractional integrals of order $\alpha+1$ of $\sigma\in\mathcal{M}_{s}^{\alpha}$ by

[TABLE]

This definition interpolates, for non-integer orders, the usual iterated integral (see [37, Chapter 1 §2.3] for more details). With these definitions, we will prove the following deviation inequalities.

2.14 Proposition.

Let $\alpha\in(0,2]$ . Let $X$ be a Wigner matrix having concentration $\mathcal{C}_{\alpha}$ with some $\kappa>0$ . There is a constant $c_{\alpha}>0$ such that if $\alpha\in[1,2]$ and $f:\mathbb{R}\to\mathbb{R}$ is some $1$ -Lipschitz function, then for any $t>0$ ,

[TABLE]

if $\alpha\in(0,1)$ , $f$ is $1$ -Lipschitz and moreover $f=\mathcal{I}_{\pm}^{\alpha+1}(\sigma)$ for some $\sigma\in\mathcal{M}_{s}^{\alpha}$ such that $|\sigma|(\mathbb{R})\leq m$ , then for any $t>0$ ,

[TABLE]

where $m_{f}$ denotes a median of $\mu_{X/\sqrt{n}}$ .

2.15 Remark.

The reason for considering the class of functions $\mathcal{I}_{\pm}^{\alpha+1}(\mathcal{M}_{s}^{\alpha})$ in the case $\alpha\in(0,1)$ is that we only understand the stability of the empirical spectral measure with respect to $||\ ||_{\ell^{\alpha}}$ through a certain distance $d_{\alpha}$ which controls this class of functions (see section 5.4 for more details).

Still in the case $\alpha<1$ , note that one cannot expect the above concentration inequality to be true for all Lipschitz functions, since a change of large deviations speed may occur as the entries of $X$ do not have exponential tails. Indeed, for example if $X$ is in the class $\mathcal{S}_{\alpha}$ , Theorem 2.9 tells us the speed of large deviations of $\frac{1}{n}\mathrm{tr}(X/\sqrt{n})$ is $n^{3\alpha/2}$ .

2.16 Remark.

One can identify the image $\mathcal{I}_{\pm}^{\alpha+1}(\mathcal{M}_{s}^{\alpha})$ , by a minor change of [37, Theorem 6.3]. To ease the notation, we will only describe $\mathcal{I}_{+}^{\alpha+1}(\mathcal{M}_{s}^{\alpha})$ . For any $\varphi\in L^{1}(\mathbb{R})$ , one can define the fractional integral of order $\alpha$ by,

[TABLE]

The function above is well-defined almost everywhere as $t^{-\alpha}\varphi(x-t)$ is integrable on a neighborhood of $0$ for almost all $x$ by Fubini's theorem. With this definition, the set $\mathcal{I}_{+}^{\alpha+1}(\mathcal{M}_{s}^{\alpha})$ consists of the functions $f$ such that there is some $\varphi\in L^{1}(\mathbb{R})$ and $\sigma\in\mathcal{M}_{s}^{\alpha}$ , such that

[TABLE]

2.17 Remark.

Note also that the exponential bound can be simplified in the case $\alpha\in(0,1)$ if $m\geq c_{0}$ , where $c_{0}$ is a constant independent of $n$ . One gets then, for any $t>0$ ,

[TABLE]

In order to state our concentration inequality for the spectral measure, we will work with the following distance $d$ defined on the set of probability measures on $\mathbb{R}$ , denoted by $\mathcal{P}(\mathbb{R})$ , in order to quantify the deviations:

[TABLE]

where $\mathcal{K}$ is a compact subset of $\{z\in\mathbb{C}:\Im z\geq 2\}$ with an accumulation point, such that $\mathrm{diam}(\mathcal{K})\leq 1$ , and with $g_{\mu}$ the Stieltjes transform of $\mu$ , that is,

[TABLE]

where $\mathbb{C}^{+}=\{z\in\mathbb{C}:\Im z>0\}$ . This distance metrizes the weak topology on $\mathcal{P}(\mathbb{R})$ by [2, Theorem 2.4.4].

We will prove the following concentration inequalities for the empirical spectral measure and the largest eigenvalues of Wigner matrices having concentration $\mathcal{C}_{\alpha}$ .

2.18 Proposition.

Let $\alpha\in(0,2]$ . Let $X$ be a Wigner matrix satisfying $\mathcal{C}_{\alpha}$ with some $\kappa>0$ . There exists a constant $c_{\alpha}>0$ , depending on $\alpha$ , such that for any $t>0$ ,

[TABLE]

where $\delta_{n}=O\big{(}\kappa n^{-1}(\log n)^{(1/\alpha-1)_{+}}\big{)}$ , and where for $\alpha\in[1,2]$ ,

[TABLE]

whereas for $\alpha\in(0,1)$

[TABLE]

2.19 Proposition.

Let $\alpha\in(0,2]$ . Let $X$ be a Wigner matrix satisfying $\mathcal{C}_{\alpha}$ for some $\kappa>0$ . There is a constant $c_{\alpha}>0$ , such that for any $t>0$ ,

[TABLE]

where

[TABLE]

if $\alpha\in[1,2]$ , and

[TABLE]

if $\alpha\in(0,1)$ , and where $\varepsilon_{n}=O(\kappa n^{-1/2}(\log n)^{(1/\alpha-1)_{+}})$ , uniformly in $H\in\mathcal{H}_{n}^{(\beta)}$ .

2.4 Spectral variation inequalities

We would also like to advertise some spectral variation inequalities, which are not particularly new, but which are perhaps a little less known in the form we propose. Indeed, to obtain the concentration inequality of Proposition 2.18, we need to understand the stability of the spectrum of Hermitian matrices with respect to the distance $||\ ||_{\ell^{p}}$ for $p\geq 1$ or $||\ ||_{\ell^{p}}^{p}$ when $p<1$ .

For $p\geq 1$ , define the $L^{p}$ -Wasserstein distance on the set of probability measures on $\mathbb{R}$ with finite $p^{\text{th}}$ -moment by,

[TABLE]

where the infimum is over all couplings $\pi$ between $\mu$ and $\nu$ , two probability measures on $\mathbb{R}$ with finite $p^{\text{th}}$ -moment.

When $p\geq 1$ , we get as a mere consequence of Lidskii’s theorem (see [15, Theorem III.4.1]) the following lemma.

2.20 Lemma.

Let $p\in[1,2]$ , and $A,B\in\mathcal{H}_{n}^{(\beta)}$ .

[TABLE]

As a consequence,

[TABLE]
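As a numerical sanity check of the mechanism behind this lemma, the following Python sketch compares the $L^{p}$ -Wasserstein distance between two empirical spectral measures with the normalized Schatten $p$ -norm of $A-B$ . It assumes NumPy is available, and the helper names `random_hermitian`, `wasserstein_p`, `schatten_bound` and `check` are illustrative, not taken from the paper. It relies on two standard facts rather than on the exact form of the lemma: on $\mathbb{R}$ , for $p\geq 1$ , an optimal coupling of two empirical measures with $n$ atoms pairs the sorted atoms, and Lidskii's theorem gives $\sum_{i}|\lambda_{i}(A)-\lambda_{i}(B)|^{p}\leq\sum_{i}|\lambda_{i}(A-B)|^{p}$ for sorted eigenvalues.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n complex Hermitian matrix (illustrative test input)."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (g + g.conj().T) / 2

def wasserstein_p(a_eigs, b_eigs, p):
    """W_p between two empirical measures on R with the same number of atoms.

    For p >= 1 an optimal coupling on the real line matches sorted atoms.
    """
    a_sorted, b_sorted = np.sort(a_eigs), np.sort(b_eigs)
    return float(np.mean(np.abs(a_sorted - b_sorted) ** p) ** (1.0 / p))

def schatten_bound(a, b, p):
    """(n^{-1} sum_i |lambda_i(A - B)|^p)^{1/p}, the Lidskii-type upper bound."""
    diff_eigs = np.linalg.eigvalsh(a - b)
    return float(np.mean(np.abs(diff_eigs) ** p) ** (1.0 / p))

def check(n=30, p=1.5, seed=0):
    """Return (W_p distance, upper bound) for two random Hermitian matrices."""
    rng = np.random.default_rng(seed)
    a, b = random_hermitian(n, rng), random_hermitian(n, rng)
    w = wasserstein_p(np.linalg.eigvalsh(a), np.linalg.eigvalsh(b), p)
    return w, schatten_bound(a, b, p)

if __name__ == "__main__":
    w, bound = check()
    print(f"W_p = {w:.4f} <= bound = {bound:.4f}")
```

The check only illustrates the eigenvalue comparison; passing from the Schatten norm to the entrywise $\ell^{p}$ -norm is a separate step that is not tested here.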

For $p<1$ , on the other hand, we obtain from Rotfel'd's inequality (see [15, Theorem IV.2.14] or [41]) the following.

2.21 Lemma.

Let $p\in(0,1)$ . Let $A,B\in\mathcal{H}_{n}^{(\beta)}$ . For any $t\in\mathbb{R}$ ,

[TABLE]

where $\lambda_{1}(A),...,\lambda_{n}(A)$ denote the eigenvalues of $A$ , and similarly for $B$ . Furthermore, there is a positive constant $C_{p}$ , such that for any $A,B\in\mathcal{H}_{n}^{(\beta)}$ ,

[TABLE]

with

[TABLE]

Acknowledgements

I would like to thank my supervisor Charles Bordenave for his inspiring advice and the many fruitful conversations which helped me build the present paper. I am also grateful to Franck Barthe and Michel Ledoux for precious conversations and references, as well as Guillaume Aubrun for pointing out to me the result of [24, Proposition 3.2.2]. I would also like to thank IMPA, where this work was partially carried out, for its welcome.

2.5 Organization of the paper

In section 3, we prove some inf-convolution inequalities for $\nu_{\alpha}^{n}$ . As the large deviations of our functional $f_{n}$ are governed by translates, we will need some sharp deviation inequalities with respect to the metric $||\ ||_{\ell^{\alpha}}$ (or $||\ ||_{\ell^{\alpha}}^{\alpha}$ when $\alpha<1$ ). We will provide a family of weights $W_{\alpha,\varepsilon}$ which captures the asymptotics of the tail distribution of $\nu_{\alpha}^{n}$ , that is, behaving like $||x||_{\ell^{\alpha}}^{\alpha}$ when $||x||_{\infty}\gg 1$ . This will be done by transporting and tensoring the family of optimal weights known for the exponential law due to Talagrand [40, Theorem 1.2].

In section 4, we give a proof of Theorems 2.1 and 2.3. The upper bound relies on Proposition 4.1, which gives a sharp large deviations upper bound for $\nu_{\alpha}^{n}$ with respect to the metric $||\ ||_{\ell^{\alpha}}$ using the inf-convolution inequalities proven in section 3. The lower bound is given by Proposition 4.4, which estimates at the exponential scale $v(n)$ the probability, under $\nu_{\alpha}^{n}$ , of an event translated by some element $v(n)^{1/\alpha}h_{n}$ .

The rest of the paper is devoted to applications to Wigner matrices and the last-passage time.

In section 5, we prove the concentration inequalities of Propositions 2.18 and 2.19 for the largest eigenvalue, linear statistics and empirical spectral measure of Wigner matrices satisfying the concentration property $\mathcal{C}_{\alpha}$ defined in (15) and (16). To do so, we will prove and discuss the spectral variation inequalities of Lemmas 2.20 and 2.21 in section 5.4.

In section 6, we show some uniform deterministic equivalents for the spectral measure, largest eigenvalue and traces of non-commutative polynomials of deformed Wigner matrices in the class $\mathcal{S}_{\alpha}$ . To make the equivalents for the spectral measure and largest eigenvalue hold uniformly for $\alpha<2$ , we make use of the concentration inequalities proved in section 5, and perform a classical chaining argument.

In section 7, we provide a deterministic equivalent for the last-passage time under additive deformations of the weights. The strategy to make our equivalent hold uniformly will be the same as for the case of the spectral measure and largest eigenvalue of Wigner matrices in the class $\mathcal{S}_{\alpha}$ , meaning that it will rely on concentration and chaining arguments.

In section 8, we apply Theorem 2.1 in the setting of Wigner matrices in the class $\mathcal{S}_{\alpha}$ , to the spectral measure, the largest eigenvalue (for $\alpha\in(0,2)$ ) and to traces of non-commutative polynomials (for $\alpha\in(0,2]$ ). Using the uniform deterministic equivalents proved in section 6, we give a proof of Theorems 2.5, 2.7, and 2.9.

Finally we prove in section 9, the large deviations principle for the last-passage time of Theorem 2.11 by applying Theorem 2.3 and using the uniform deterministic equivalent proved in section 7.

3 Inf-convolution inequalities for $\nu_{\alpha}^{n}$

Let $\nu$ be a probability measure on $\mathbb{R}^{n}$ , and let $w$ be a measurable function on $\mathbb{R}^{n}$ taking non-negative values. Following Maurey (see [35]), we will say that $(\nu,w)$ satisfies the $\tau$ -property if for any non-negative measurable function $f$ on $\mathbb{R}^{n}$ ,

[TABLE]

where $\Box$ denotes the inf-convolution, that is,

[TABLE]

The $\tau$ -property is closely linked to transportation-cost inequalities. By the Kantorovitch duality (see [42, Theorem 5.10]), and the duality of the entropy (see [22, Lemma 6.2.13]), it is known that, under mild assumptions on $w$ , the following general inf-convolution inequality,

[TABLE]

satisfied for any non-negative measurable function $f$ is equivalent to the following transportation-cost inequality: for any $\mu$ probability measure on $\mathbb{R}^{n}$ ,

[TABLE]

where $D(\mu||\nu)$ is the relative entropy of $\mu$ with respect to $\nu$ , and

[TABLE]

In particular, under the assumption that $w$ is upper semi-continuous, Kantorovitch duality is valid by [42, Theorem 5.10], so that the equivalence above between (24) and (26) holds.

One can observe that if $(\nu,w)$ satisfies the $\tau$ -property, then by Jensen’s inequality, it satisfies also the general inf-convolution inequality (24), and therefore $\nu$ satisfies the transportation-cost inequality (25) with cost function $w$ .

Conversely, according to [25, Proposition 4.13], if $\nu$ satisfies the transportation-cost inequality (25) with cost function $w$ , then $(\nu,w\Box w)$ satisfies the $\tau$ -property. If moreover $w$ is sub-additive, then one can see that $w\Box w=w$ and thus $(\nu,w)$ satisfies the $\tau$ -property. Whereas if $w$ is convex, then $w\Box w=2w(./2)$ so that $(\nu,2w(./2))$ satisfies the $\tau$ -property. This remark will be useful later when we will need to translate a transportation-cost inequality into a $\tau$ -property.
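The two identities $w\Box w=w$ (sub-additive case) and $w\Box w=2w(./2)$ (convex case) can be checked numerically on a grid. The sketch below is only an illustration under stated assumptions: it uses the standard convention $(f\Box w)(x)=\inf_{y}\{f(y)+w(x-y)\}$ , replaces the infimum over $\mathbb{R}$ by a minimum over a finite grid, and picks the example weights $w(x)=x^{2}$ and $w(x)=|x|^{1/2}$ , which are not the weights used in the paper. Note that the sub-additive identity also uses $w(0)=0$ .

```python
import math

def inf_convolution(f, w, grid):
    """Discrete inf-convolution: (f box w)(x) = min over y in grid of f(y) + w(x - y)."""
    return [min(f(y) + w(x - y) for y in grid) for x in grid]

# Dyadic grid k/4 on [-10, 10]; for even numerators, x/2 is again a grid point.
GRID = [k / 4 for k in range(-40, 41)]

def convex_w(x):
    """Example convex weight (an assumption made for illustration)."""
    return x * x

def subadditive_w(x):
    """Example sub-additive weight vanishing at 0 (an assumption made for illustration)."""
    return math.sqrt(abs(x))

def max_gap_convex():
    """Largest gap between w box w and 2 w(./2) at grid points x with x/2 on the grid."""
    values = inf_convolution(convex_w, convex_w, GRID)
    return max(abs(v - 2 * convex_w(x / 2))
               for k, (x, v) in enumerate(zip(GRID, values)) if k % 2 == 0)

def max_gap_subadditive():
    """Largest gap between w box w and w over the whole grid."""
    values = inf_convolution(subadditive_w, subadditive_w, GRID)
    return max(abs(v - subadditive_w(x)) for x, v in zip(GRID, values))

if __name__ == "__main__":
    print("convex case gap:", max_gap_convex())
    print("sub-additive case gap:", max_gap_subadditive())
```

In both examples the minimizer ( $y=x/2$ in the convex case, $y=0$ in the sub-additive case) lies on the grid, so the discrete minimum agrees with the true infimum up to rounding.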

More importantly for us, the $\tau$ -property yields deviation bounds with respect to enlargements by the weight $w$ . We know from [35, Lemma 4], that if $(\nu,w)$ satisfies the $\tau$ -property, then for any Borel subset $A$ of $\mathbb{R}^{n}$ , and any $t>0$ ,

[TABLE]

We define another form of inf-convolution inequality, designed to enable us to get the best constants in our weight functions, (and also to deal with the measure $\nu_{\alpha}^{n}$ when $\alpha\in(0,1)$ ), which we will call the truncated $\tau$ -property. More precisely, we will say that a measure $\nu$ on $\mathbb{R}^{n}$ with the weight function $w$ , satisfies the $A_{0}$ -truncated $\tau$ -property, where $A_{0}$ is a Borel subset of $\mathbb{R}^{n}$ , if (23) is true for any non-negative measurable function $f$ such that $f=+\infty$ on $A_{0}^{c}$ .

This $A_{0}$ -truncated $\tau$ -property yields a deviation inequality with respect to enlargement by the weight $w$ of the following form: for any Borel subset $A$ of $\mathbb{R}^{n}$ such that $\nu(A)>0$ , and any $r>0$ ,

[TABLE]

The goal of this section is to find, for the measure $\nu_{\alpha}^{n}$ , when $\alpha\in(0,2)$ , a family of weights $W_{\alpha,\varepsilon}$ for which a truncated $\tau$ -property is satisfied, and which captures the asymptotics of the tail distribution of $\nu_{\alpha}^{n}$ . More precisely, we will prove the following proposition.

3.1 Proposition.

Let $\alpha>0$ . If $\alpha=1$ , then for any $\varepsilon<1/2$ , $(\nu_{1}^{n},W_{1,\varepsilon})$ satisfies the $\tau$ -property with

[TABLE]

where

[TABLE]

If $\alpha\neq 1$ , there are some constants $\kappa>0$ and $\varepsilon_{0}\in(0,1)$ such that for $\varepsilon\in(0,\varepsilon_{0})$ and $m\geq 1$ , $(\nu_{\alpha}^{n},W_{\alpha,\varepsilon}^{(m)})$ satisfies the $mB_{\ell^{\infty}}$ -truncated $\tau$ -property, where

[TABLE]

with

[TABLE]

The rest of this section will be devoted to proving the above proposition. We will reduce the problem in a first phase to the one-dimensional case, and to an estimation of the monotone rearrangement of $\nu_{1}$ onto $\nu_{\alpha}$ .

Like the usual $\tau$ -property (see [35, Lemma 1]), the truncated version of the $\tau$ -property tensorizes in the following way.

3.2 Lemma.

Let $\nu_{i}$ be a probability measure defined on some measurable space $\mathcal{X}_{i}$ , $A_{i}$ be some measurable subset of $\mathcal{X}_{i}$ and $w_{i}:\mathcal{X}_{i}\to\mathbb{R}_{+}$ be a measurable function, for $i=1,2$ .

If $(\nu_{i},w_{i})$ satisfies the $A_{i}$ -truncated $\tau$ -property for $i=1,2$ , then $(\nu_{1}\otimes\nu_{2},w)$ satisfies the $A_{1}\times A_{2}$ -truncated $\tau$ -property with

[TABLE]

Since we are dealing with the product measure $\nu_{\alpha}^{n}$ , we can focus on studying the $\tau$ -property for the one-dimensional marginal $\nu_{\alpha}$ .

For the exponential measure, we have the following result due to Talagrand, which gives a family of optimal weights $c_{\lambda}$ .

3.3 Proposition ([39, Theorem 1.2]).

Let $\lambda\in(0,1)$ . Define the weight function $c_{\lambda}$ for any $x\in\mathbb{R}$ by,

[TABLE]

For any $\lambda\in(0,1)$ , $\nu_{1}$ satisfies a transportation-cost inequality (25) with cost function $c_{\lambda}$ .

Note that, $c_{\lambda}(x)\sim_{\pm\infty}(1-\lambda)|x|$ . Thus, when $\lambda\ll 1$ , $c_{\lambda}$ captures the exact asymptotics of the tail distribution of the exponential law.

For technical reasons, we prefer to work with a different family of weights than the one defined in Proposition 3.3. In the following corollary, we reformulate Talagrand’s result for the symmetric exponential measure $\nu_{1}$ .

3.4 Corollary.

Let $\delta>0$ . We define the weight function $w_{\delta}$ , for any $t\in\mathbb{R}$ , by

[TABLE]

For any $\delta\in(0,1/2)$ , $(\nu_{1},w_{\delta})$ satisfies the $\tau$ -property. As a consequence, $(\nu_{1}^{n},W_{1,\delta})$ satisfies the $\tau$ -property, with $W_{1,\delta}$ defined in Proposition 3.1.

This reformulation reveals in particular the structure of the enlargements given by the weights $c_{\lambda}$ which consist in a mixture of $\ell^{2}$ and $\ell^{1}$ -balls.

Proof.

As $c_{\lambda}$ is a convex function, we know by [25, Proposition 4.13] that $(\nu_{1},2c_{\lambda}(./2))$ satisfies the $\tau$ -property. To prove Corollary 3.4, it suffices to prove that $w_{\delta}\leq 2c_{\delta}(./2)$ for any $\delta\in(0,1/2)$ . Since both functions are even, it is sufficient to prove the inequality on $\mathbb{R}_{+}$ . Let $t>0$ . By Taylor’s formula

[TABLE]

for some $y\in[0,t]$ . If $t\leq 2/\delta^{2}$ and $\delta\leq 1/2$ , we get

[TABLE]

If $t\geq 1/\delta^{2}$ , we have

[TABLE]

Thus, $2c_{\delta}(t/2)\geq(1-2\delta)t$ for $t\geq 2/\delta^{2}$ .

After tensorization (see [35, Lemma 1]), we obtain that $(\nu_{1}^{n},W_{1,\delta})$ satisfies the $\tau$ -property with $W_{1,\delta}$ defined in Proposition 3.1.

∎

For $\alpha\neq 1$ , the general strategy is to transport this $\tau$ -property of the symmetric exponential law to obtain a $\tau$ -property for $\nu_{\alpha}$ . The following lemma extends a result of Maurey [35, Lemma 2] to our setting of the truncated $\tau$ -property.

3.5 Lemma.

Let $\mu$ be a probability measure on $\mathbb{R}^{n}$ and let $\psi:\mathbb{R}^{n}\to\mathbb{R}^{n}$ be a bijective measurable map. Assume $(\mu,w)$ satisfies the $\tau$ -property. Let $A$ be a Borel subset of $\mathbb{R}^{n}$ and let $\tilde{w}$ be a weight function such that,

[TABLE]

Then, $(\mu\circ\psi^{-1},\tilde{w})$ satisfies the $A$ -truncated $\tau$ -property.

Proof.

Let $f:\mathbb{R}^{n}\to\mathbb{R}$ be a measurable non-negative function being $+\infty$ on $A^{c}$ . Applying the $\tau$ -property of $(\mu,w)$ to $f\circ\psi$ , we get

[TABLE]

But, as $\psi$ is a bijection and $f=+\infty$ on $A^{c}$ ,

[TABLE]

From the assumption on $\tilde{w}$ , we deduce

[TABLE]

Therefore,

[TABLE]

∎

In particular, in the one-dimensional case, if $(\mu,w)$ satisfies the $\tau$ -property and $w$ is even and non-decreasing on $\mathbb{R}_{+}$ , then $\mu\circ\psi^{-1}$ satisfies the $A$ -truncated $\tau$ -property with any even weight function $\tilde{w}$ such that

[TABLE]

where $\Delta_{A}$ is defined for any $s\geq 0$ by,

[TABLE]

If $\mu$ and $\nu$ are two probability measures on $\mathbb{R}$ , we define the monotone rearrangement $T$ of $\mu$ onto $\nu$ by,

[TABLE]

This defines a unique non-decreasing map if the distribution function of $\nu$ is invertible, which sends $\mu$ to $\nu$ .

Let $\psi$ be the monotone rearrangement of $\nu_{1}$ onto $\nu_{\alpha}$ . One can easily check that $\psi$ is an odd function, and that its restriction $\varphi$ on $\mathbb{R}^{+}$ satisfies,

[TABLE]

where $Z_{\alpha}$ is the normalizing constant of $\mu_{\alpha}$ , so that $\varphi$ is the monotone rearrangement of $\mu_{1}$ onto $\mu_{\alpha}$ . Thus, we are reduced to understanding the behavior of the map $\varphi$ and how it deforms the weights $c_{\varepsilon}$ of Proposition 3.3.
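To make the monotone rearrangement concrete, here is a minimal Python sketch for the single case $\alpha=2$ , chosen only because the tail of $\nu_{2}$ is available in the standard library through `math.erfc`. It assumes the normalizations $\nu_{1}(dx)=\frac{1}{2}e^{-|x|}dx$ and $\nu_{2}(dx)=\pi^{-1/2}e^{-x^{2}}dx$ , and the function name `psi` and the bisection parameters are illustrative choices. For $x\geq 0$ the map is obtained by matching tails, $\nu_{2}((\psi(x),+\infty))=\nu_{1}((x,+\infty))$ , and is extended to $x<0$ by oddness.

```python
import math

def psi(x, iterations=200):
    """Monotone rearrangement of nu_1 (density e^{-|x|}/2) onto nu_2 (density e^{-x^2}/sqrt(pi)).

    For x >= 0 it solves erfc(y)/2 = e^{-x}/2 for y by bisection, i.e. it matches
    the upper tails of the two measures; oddness extends the map to x < 0.
    """
    if x < 0:
        return -psi(-x, iterations)
    target = math.exp(-x)
    lo, hi = 0.0, 40.0  # erfc is decreasing, erfc(0) = 1 and erfc(40) underflows to 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if math.erfc(mid) > target:
            lo = mid   # tail still too heavy: move right
        else:
            hi = mid   # tail too light: move left
    return (lo + hi) / 2

if __name__ == "__main__":
    for x in (0.0, 1.0, 5.0, 20.0):
        print(x, psi(x))
```

For other values of $\alpha$ one would need the incomplete gamma function to express the tail of $\nu_{\alpha}$ , which is outside the scope of this sketch.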

3.1 Behavior of the monotone rearrangement

When $\alpha\geq 1$ , we have the following estimate on the monotone rearrangement due to Talagrand [39].

3.6 Lemma ([39, Lemma 2.5]).

Let $\alpha\geq 1$ . Let $\psi$ be the monotone rearrangement sending $\nu_{1}$ to $\nu_{\alpha}$ . Denote by $\Delta$ the function defined for any $s\geq 0$ by,

[TABLE]

There is a constant $c>0$ depending on $\alpha$ such that for any $s\geq 0$ ,

[TABLE]

3.7 Remark.

In [39, Lemma 2.5], this estimate is derived for the monotone rearrangement $\varphi$ of $\mu_{1}$ onto $\mu_{\alpha}$ . But since,

[TABLE]

one easily deduces the same estimate for $\psi$ , together with the fact that if $x,y$ have opposite signs,

[TABLE]

where $c^{\prime}$ is some constant and where we used the fact that $|x-y|=|x|+|y|$ .

To get the exact asymptotics of the tail distribution of $\nu_{\alpha}$ , we will need the following finer estimate on the monotone rearrangement.

3.8 Lemma.

Let $\alpha\geq 1$ . Define for any $m\geq 1$ ,

[TABLE]

There is a constant $\gamma$ depending on $\alpha$ , such that for any $\varepsilon\in(0,1)$ , and $s\geq m\varepsilon^{-1}$ ,

[TABLE]

Proof.

By definition of $\psi$ , we have for any $x\in\mathbb{R}$ ,

[TABLE]

where $Z_{\alpha}$ is the normalizing constant of $\mu_{\alpha}$ . Let $s\geq m\varepsilon^{-1}$ and $x,y\in\mathbb{R}$ such that $0\leq|x|\leq m$ , and $|x-y|=s$ . If $x$ and $y$ have the same sign, we can assume without loss of generality that both $x,y\geq 0$ . As $x\leq m\leq s$ , we have $y=x+s$ . Thus,

[TABLE]

We have, on one hand, as $s\geq 1$ ,

[TABLE]

And on the other hand,

[TABLE]

Therefore, as $s\geq m\varepsilon^{-1}$ ,

[TABLE]

for some constant $\gamma>0$ . Now, if $x$ and $y$ have opposite signs, we can assume without loss of generality that $x\leq 0$ and $y\geq 0$ . Then, $y\geq s-m$ so that,

[TABLE]

Thus, we can find some constant $\gamma^{\prime}$ such that $|\psi^{-1}(y)-\psi^{-1}(x)|\geq(1-\gamma^{\prime}\varepsilon)s^{\alpha}$ .

∎

3.9 Remark.

The truncation we performed here is made to ensure we get the best constant (that is $1$ ) in the estimate of the large increments of the monotone rearrangement. Indeed, defining $\Delta$ as in (30), we would get for $s\gg 1$ ,

[TABLE]

with $2^{1-\alpha}<1$ .
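The origin of the factor $2^{1-\alpha}$ can be seen from the following heuristic computation, which is only a sketch: it assumes that $\psi^{-1}(u)$ behaves like $\mathrm{sg}(u)|u|^{\alpha}$ for $|u|\gg 1$ and ignores lower-order corrections. Without truncation, the symmetric pair $x=-s/2$ , $y=s/2$ is admissible, whereas the constraint $|x|\leq m$ with $s\geq m\varepsilon^{-1}$ forces one of the two points to carry almost all of the increment.

```latex
% Heuristic only: assumes \psi^{-1}(u) \approx \mathrm{sg}(u)|u|^{\alpha} for |u| \gg 1, with \alpha \geq 1.
\[
x=-\tfrac{s}{2},\quad y=\tfrac{s}{2}
\;\Longrightarrow\;
|\psi^{-1}(y)-\psi^{-1}(x)|\approx\Big(\tfrac{s}{2}\Big)^{\alpha}+\Big(\tfrac{s}{2}\Big)^{\alpha}
=2^{1-\alpha}s^{\alpha},
\]
\[
0\leq x\leq m,\quad y=x+s
\;\Longrightarrow\;
|\psi^{-1}(y)-\psi^{-1}(x)|\approx(x+s)^{\alpha}-x^{\alpha}\geq s^{\alpha}.
\]
```

The last inequality is the super-additivity of $u\mapsto u^{\alpha}$ on $\mathbb{R}_{+}$ for $\alpha\geq 1$ .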

When $\alpha<1$ , we get the following estimate on the monotone rearrangement of $\nu_{1}$ onto $\nu_{\alpha}$ . Note that as $\nu_{\alpha}$ does not have an exponential tail, the rearrangement map cannot be a Lipschitz function.

3.10 Lemma.

Let $\alpha\in(0,1)$ . Let $\varphi$ be the monotone rearrangement of $\mu_{1}$ onto $\mu_{\alpha}$ . There is a constant $K>0$ depending on $\alpha$ such that for any $x,y\in[0,+\infty)$ ,

[TABLE]

Proof.

This proof is very much in the spirit of [39, Lemma 2.5]. We begin by bounding from above

[TABLE]

when $x\geq 1$ . The change of variable $u=y^{\alpha}$ gives,

[TABLE]

Let $m=\lceil\frac{1}{\alpha}\rceil$ . Integrating by parts $m$ times, we get

[TABLE]

As $\frac{1}{\alpha}-m\leq 0$ , we deduce for any $x\geq 1$ ,

[TABLE]

where $K>0$ is some constant depending on $\alpha$ which will vary along the proof. Therefore, for any $x\geq 1$ ,

[TABLE]

By definition $\varphi$ satisfies for any $x>0$ ,

[TABLE]

This implies that $\varphi$ is an increasing homeomorphism of $\mathbb{R}_{+}$ . For $\varphi(x)\geq 1$ , we have

[TABLE]

From (34), we see that $\varphi$ is differentiable, and $\varphi^{\prime}$ satisfies for any $x\geq 0$ ,

[TABLE]

Thus by (35), we get for $t\geq\varphi^{-1}(1)$ ,

[TABLE]

Dividing by $\varphi(t)^{1-\alpha}$ and integrating on $[\varphi^{-1}(1),x]$ we get

[TABLE]

for any $x\geq\varphi^{-1}(1)$ . Hence,

[TABLE]

for $x\geq\varphi^{-1}(1)$ . By (36) we deduce

[TABLE]

Since $\varphi^{\prime}$ is continuous, at the price of taking $K$ larger, we have

[TABLE]

Let $x\geq 0$ , and $y\in\mathbb{R}$ such that $x+y\geq 0$ . If $x,x+y\leq 1$ ,

[TABLE]

Whereas if $x,x+y\geq 1$ ,

[TABLE]

Now, if $0\leq x\leq 1\leq x+y$ ,

[TABLE]

In conclusion, for any $x\geq 0$ , $x+y\geq 0$ ,

[TABLE]

The mean value theorem yields

[TABLE]

Using the convexity of $x\mapsto|x|^{\frac{1}{\alpha}-1}$ , if $1/\alpha-1\geq 1$ , or its sub-additivity, when $1/\alpha-1\in(0,1)$ , we get

[TABLE]

with $a_{\alpha}=\max(1,2^{\frac{1}{\alpha}-2})$ . Together with (38), this gives the claim. ∎
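The constant $a_{\alpha}$ comes from the elementary inequality $|u+v|^{q}\leq\max(1,2^{q-1})(|u|^{q}+|v|^{q})$ with $q=\frac{1}{\alpha}-1$ , which follows from convexity when $q\geq 1$ and from sub-additivity when $q\in(0,1)$ . The short Python sketch below checks it on random samples. The function names, the sampling range and the number of trials are illustrative assumptions, and a random check is of course no substitute for the proof.

```python
import random

def a_alpha(alpha):
    """Constant a_alpha = max(1, 2^{1/alpha - 2}) appearing at the end of the proof."""
    return max(1.0, 2.0 ** (1.0 / alpha - 2.0))

def worst_ratio(alpha, trials=10_000, seed=0):
    """Largest observed |u+v|^q / (a_alpha (|u|^q + |v|^q)), with q = 1/alpha - 1."""
    rng = random.Random(seed)
    q = 1.0 / alpha - 1.0
    worst = 0.0
    for _ in range(trials):
        u, v = rng.uniform(-10, 10), rng.uniform(-10, 10)
        denom = a_alpha(alpha) * (abs(u) ** q + abs(v) ** q)
        if denom > 0:
            worst = max(worst, abs(u + v) ** q / denom)
    return worst

if __name__ == "__main__":
    for alpha in (0.2, 0.5, 0.8):
        print(alpha, worst_ratio(alpha))
```

A ratio never exceeding $1$ is consistent with the inequality for the sampled values of $\alpha\in(0,1)$ .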

As in the case $\alpha\geq 1$ , we can refine the estimate of Lemma 3.10 to get the following result.

3.11 Lemma.

Let $\alpha\in(0,1)$ . Let $\psi$ be the monotone rearrangement of $\nu_{1}$ onto $\nu_{\alpha}$ . Let $\varepsilon\in(0,1)$ . Define the function $\Delta_{m}$ by,

[TABLE]

There is some constant $\gamma>0$ , such that

[TABLE]

Proof.

Since $\varphi$ and $\psi$ are linked by the relation (31), the same estimate as in Lemma 3.10 holds for the Brenier map $\psi$ . Therefore, we have for any $|s|\leq\psi^{-1}(m)$ , and $t\in\mathbb{R}$ ,

[TABLE]

with $K\geq 1$ . Fix $|x|\leq m$ , and $y\in\mathbb{R}$ . We have

[TABLE]

But we know from (37) that for $m\geq 1$ , $\psi^{-1}(m)\geq c_{0}m^{\alpha}$ , with some constant $c_{0}>0$ . Thus, for $m\geq 1$ , there is a constant $\gamma>0$ , which will vary along the proof without changing name, such that

[TABLE]

We deduce that for $|y-x|\leq m/\varepsilon$ ,

[TABLE]

Let $s=|y-x|$ . Assume now $s\geq m/\varepsilon$ . Proceeding as in the proof of Lemma 3.8 in the case $\alpha\geq 1$ , we assume first that $x,y\geq 0$ . As $s\geq m\geq x$ , we must have $y=x+s$ . Then,

[TABLE]

On one hand, as $\alpha<1$ , we have using the sub-additivity of $u\in\mathbb{R}^{+}\mapsto u^{\alpha}$ ,

[TABLE]

and on the other hand, by (33),

[TABLE]

where $C$ is some constant depending on $\alpha$ . Thus,

[TABLE]

As $\log s\leq(2/\alpha)s^{\alpha/2}$ for $s\geq 1$ , we deduce that

[TABLE]
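The elementary bound $\log s\leq(2/\alpha)s^{\alpha/2}$ used in the previous step follows from $\log u\leq u$ for $u>0$ , applied to $u=s^{\alpha/2}$ . Written out as a short worked derivation:

```latex
\[
\log s=\frac{2}{\alpha}\,\log\big(s^{\alpha/2}\big)\leq\frac{2}{\alpha}\,s^{\alpha/2},
\qquad s>0,
\]
% since \log u \leq u - 1 \leq u for every u > 0.
```

In particular the bound holds for every $s\geq 1$ , which is the range needed in the proof.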

If $x$ and $y$ have opposite signs, we can assume $x\leq 0$ and $y\geq 0$ , thus $y\geq s-m$ and we get,

[TABLE]

As $s\geq m/\varepsilon$ , we deduce

[TABLE]

which ends the proof of the claim. ∎

3.2 A family of weights for $\nu_{\alpha}$

Using transport arguments, we obtain in this section a family of weights for $\nu_{\alpha}$ which captures its exact tail distribution.

3.12 Proposition.

Let $\alpha>0$ , $\alpha\neq 1$ , and $m\geq 1$ . There exist some constants $\kappa,\varepsilon_{0}>0$ depending on $\alpha$ such that for any $\varepsilon\in(0,\varepsilon_{0})$ , $(\nu_{\alpha},w_{\alpha,\varepsilon}^{(m)})$ satisfies the $[-m,m]$ -truncated $\tau$ -property where,

[TABLE]

Proof.

Let $\varepsilon\in(0,1)$ and $m\geq 1$ . Let $\delta>0$ such that

[TABLE]

With this choice of $\delta$ , we will prove that for $s\geq 0$ ,

[TABLE]

with the appropriate constants $\kappa$ and $\varepsilon_{0}$ , $w_{\delta}$ defined in Corollary 3.4, and where $\Delta_{m}$ is as in (32). Using the result of Lemma 3.5, this will yield the claim.

Let $\varepsilon$ be small enough such that $w_{\delta}$ is non-decreasing. This is possible since $\delta^{2}\leq 2\varepsilon^{\alpha}$ . Let $s\geq m/\varepsilon$ . If $\varepsilon$ is small enough, we have by Lemma 3.8 or 3.11,

[TABLE]

If $\alpha>1$ , then by Lemma 3.8 we get, as $\delta^{2}\leq 4\varepsilon^{\alpha}$ ,

[TABLE]

for some constant $\kappa$ which will vary along the proof. Similarly, when $\alpha<1$ , we get by Lemma 3.11,

[TABLE]

Now let $s\leq m/\varepsilon$ . Assume $\alpha\geq 1$ . By Lemma 3.6 and the fact that $w_{\delta}$ is non-decreasing, we have

[TABLE]

where $c$ is some positive constant. Without loss of generality, we can assume $c\leq 1/2$ . Then, as $m\varepsilon^{-1}\leq 4\delta^{-2}$ , we have $cs\leq 2\delta^{-2}$ , so that we get

[TABLE]

Using the fact that $\delta e^{-\frac{1}{\delta}}\geq c_{1}e^{-2/\delta}$ , for some constant $c_{1}>0$ , we get the claim in the case $\alpha>1$ . Assume now $\alpha<1$ . From Lemma 3.11 and the fact that $w_{\delta}$ is non-decreasing, we deduce

[TABLE]

Without loss of generality, we can assume that $\gamma\geq 2$ . As $m\varepsilon^{-1}\leq 4\delta^{-2}$ and $s\leq m/\varepsilon$ , we have

[TABLE]

Thus,

[TABLE]

with some $a>0$ . But, we can find some constant $c_{2}>0$ such that

[TABLE]

which, recalling that $(m\varepsilon^{-1})^{\alpha}=4\delta^{-2}$ gives the claim.

∎

We can now give a proof of Proposition 3.1.

Proof of Proposition 3.1.

As $(\nu_{\alpha},w_{\alpha,\varepsilon}^{(m)})$ satisfies the $[-m,m]$ -truncated $\tau$ -property for $\varepsilon\in(0,\varepsilon_{0})$ , for some $\varepsilon_{0}>0$ and any $m\geq 1$ by Proposition 3.12, we deduce by the tensorization property of the $\tau$ -property (see Lemma 3.2) that $(\nu_{\alpha}^{n},W_{\alpha,\varepsilon}^{(m)})$ satisfies the $mB_{\ell^{\infty}}$ -truncated $\tau$ -property with $W_{\alpha,\varepsilon}^{(m)}$ defined as in (29).

∎

4 Large deviations

We will prove in this section Theorem 2.1. As sketched in the introduction, the proof consists in first looking for large deviations inequalities for $\nu_{\alpha}^{n}$ and lower-bound estimates of the probability of translates.

As a consequence of the truncated $\tau$ -property of Proposition 3.1, satisfied by $\nu_{\alpha}^{n}$ and the weight functions $W_{\alpha,\varepsilon}^{(m)}$ , we deduce an isoperimetric-type bound for $\nu_{\alpha}^{n}$ with respect to the metric $||\ ||_{\ell^{\alpha}}$ (or $||\ ||_{\ell^{\alpha}}^{\alpha}$ in the case $\alpha<1$ ). This estimate will be of paramount importance to derive the upper bound of Theorem 2.1.

4.1 Proposition.

Let $\alpha>0$ , $\alpha\neq 2$ . Let $r>0$ . Let $v(n)$ , $t(n)$ be two sequences going to $+\infty$ as $n$ goes to $+\infty$ . Let $E$ and $F$ be Borel subsets of $\mathbb{R}^{n}$ such that

[TABLE]

For $\alpha\neq 1$ , we assume that

[TABLE]

whereas for $\alpha=1$ , we assume $v(n)=o(t(n)^{2})$ . Then,

[TABLE]

4.2 Remark.

For $\alpha=2$ , the Gaussian isoperimetric inequality (see [32, Theorem 2.5]) entails the same result without any further assumption on the speed $v(n)$ or the set $E$ than $\liminf_{n}\nu_{2}^{n}(E)>0$ .

Proof.

Before going into the proof per se, we need to relate the enlargements by the weights $W_{\alpha,\varepsilon}^{(m)}$ , for which we know that $(\nu_{\alpha}^{n},W_{\alpha,\varepsilon}^{(m)})$ satisfies the $\tau$ -property, and therefore a deviation inequality of the type (28), to the $\ell^{\alpha}$ -balls. This is the subject of the following lemma.

4.3 Lemma.

Let $\alpha>0$ . With the notation of Proposition 3.1, for any $r>0$ , $m\geq 1$ and $\varepsilon\in(0,\varepsilon_{0})$ ,

[TABLE]

with $k_{m}(\varepsilon)=\sqrt{\kappa}e^{\frac{1}{2}(\frac{m}{\varepsilon})^{\alpha/2}}$ . Moreover, there is a function $l:\mathbb{R}_{+}\to\mathbb{R}_{+}$ , such that

[TABLE]

Proof.

We will prove only the first statement, the proof for the second one being similar. Let $y\in\mathbb{R}^{n}$ . By cutting the entries of $y$ , we can find $y_{1},y_{2}\in\mathbb{R}^{n}$ , such that $y=y_{1}+y_{2}$ , for any $i\in\{1,...,n\}$ , $y_{1}(i)y_{2}(i)=0$ , and

[TABLE]

By the very definition of $W_{\alpha,\varepsilon}^{(m)}$ ,

[TABLE]

and

[TABLE]

Thus, if we let

[TABLE]

and if $W_{\alpha,\varepsilon}^{(m)}(y)\leq r(1-\kappa\varepsilon^{(\alpha/2)\wedge 1})$ , then $||y_{1}||_{\ell^{2}}\leq k_{m}(\varepsilon)\sqrt{r}$ , and $||y_{2}||_{\ell^{\alpha}}^{\alpha}\leq r$ .

∎

With this lemma proven, we can now give the proof of Proposition 4.1. We start with the case $\alpha=1$ . As $v(n)=o(t(n)^{2})$ , for $n$ large enough, we have $l(\varepsilon)\sqrt{rv(n)}\leq t(n)$ . Then, by Lemma 4.3, we have

[TABLE]

But by assumption, $F+t(n)B_{\ell^{2}}\subset E$ . Thus,

[TABLE]

We deduce that,

[TABLE]

As $(\nu_{1}^{n},W_{1,\varepsilon})$ satisfies the $\tau$ -property by Corollary 3.4, we have the following deviation inequality (see (27)),

[TABLE]

As $\liminf_{n}\nu_{1}^{n}(F)>0$ , we get

[TABLE]

Letting $\varepsilon$ go to $0$ , we get the claim.

Let now $\alpha\neq 1$ . Let $\varepsilon\in(0,\varepsilon_{0})$ and set $m=c(\log n)^{1/\alpha}$ , with some $c>0$ which is to be chosen later. By Lemma 4.3

[TABLE]

From the assumption that $(\log n)^{\alpha/2}=o(\log\frac{t(n)}{\sqrt{v(n)}})$ we deduce that for $n$ large enough,

[TABLE]

In particular for $n$ large enough,

[TABLE]

Put another way

[TABLE]

Thus,

[TABLE]

As by assumption $F+t(n)B_{\ell^{2}}\subset E$ , we get

[TABLE]

As $(\nu_{\alpha}^{n},W_{\alpha,\varepsilon}^{(m)})$ satisfies the $mB_{\ell^{\infty}}$ -truncated $\tau$ -property by Proposition 3.1, we deduce the following deviation inequality (see (28)),

[TABLE]

But,

[TABLE]

Let $\Phi=\varphi^{\otimes n}$ , defined by $\Phi(x)=(\varphi(x_{i}))_{1\leq i\leq n}$ , where $\varphi$ is the monotone rearrangement map sending $\mu_{1}$ to $\mu_{\alpha}$ . Then $\Phi$ sends $\mu_{1}^{n}$ to $\mu_{\alpha}^{n}$ , so that,

[TABLE]

From (37), we deduce

[TABLE]

for some constant $K>0$ . But $\int||x||_{\infty}^{1/\alpha}d\mu_{1}^{n}(x)\leq c_{0}(\log n)^{1/\alpha}$ , for some constant $c_{0}\geq 1$ . Therefore,

[TABLE]

Thus by Markov’s inequality,

[TABLE]

since we chose $m=c(\log n)^{1/\alpha}$ . As $\liminf_{n}\nu_{\alpha}^{n}(F)>0$ by assumption, we deduce that for $c$ large enough,

[TABLE]

Therefore,

[TABLE]

which gives the claim by taking $\varepsilon\to 0$ . ∎
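The moment bound $\int||x||_{\infty}^{1/\alpha}d\mu_{1}^{n}(x)\leq c_{0}(\log n)^{1/\alpha}$ used in the proof above can be checked by hand in the two cases $\alpha=1$ and $\alpha=1/2$ . The sketch below assumes that $\mu_{1}$ is the standard exponential law on $\mathbb{R}_{+}$ , and uses the classical identities $\mathbb{E}M_{n}=H_{n}$ and $\mathrm{Var}(M_{n})=\sum_{k\leq n}k^{-2}$ for the maximum $M_{n}$ of $n$ independent exponential variables. The function names are illustrative only and do not come from the paper.

```python
import math


def harmonic(n: int, power: int = 1) -> float:
    """Generalized harmonic number sum_{k=1}^n k^(-power)."""
    return sum(1.0 / k**power for k in range(1, n + 1))


def max_moment(n: int, q: int) -> float:
    """E[M_n^q] for M_n the maximum of n i.i.d. Exp(1) variables, q in {1, 2}.

    Uses E[M_n] = H_n and Var(M_n) = sum_{k<=n} 1/k^2.
    """
    h1 = harmonic(n, 1)
    if q == 1:
        return h1
    if q == 2:
        return h1**2 + harmonic(n, 2)
    raise ValueError("this sketch only implements q = 1 and q = 2")


def moment_ratio(n: int, q: int) -> float:
    """Ratio E[M_n^q] / (log n)^q, which should stay bounded in n."""
    return max_moment(n, q) / math.log(n) ** q
```

For $n=10,100,1000$ the ratio stays between $1$ and $2$ for both exponents, which is consistent with a constant $c_{0}\geq 1$ independent of $n$ .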

We show in the next proposition that we can bound from below the probability of translates under $\nu_{\alpha}^{n}$ .

4.4 Proposition.

Let $\alpha\in(0,2]$ . Let $v(n)$ be a sequence going to $+\infty$ as $n$ goes to $+\infty$ . Fix some $r>0$ . Let $E$ be some Borel subset of $\mathbb{R}^{n}$ such that

[TABLE]

(i). For any sequence $h_{n}$ of elements of $\mathbb{R}^{n}$ ,

[TABLE]

(ii). If $\alpha\in(0,1]$ , then for any sequence $h_{n}\in\mathbb{R}_{+}^{n}$ ,

[TABLE]

4.5 Remark.

One can obtain the estimate $(ii)$ when $\alpha\in(1,2]$ for the measures $\mu_{\alpha}$ with the additional assumption $n=o(v(n))$ on the speed, which is actually very restrictive in the applications we have in mind. This is one of the reasons for the limitation of Theorem 2.3 to the case $\alpha\leq 1$ , since we do not know how to produce a meaningful lower bound of such translated sets in the case $\alpha\in(1,2]$ . Similarly, when $\alpha>2$ , one can see, at least for integer $\alpha$ , that the estimate $(i)$ does not hold unless $n=o(v(n))$ .

Proof.

The proof will essentially follow the lines of [23, Theorem 5.1]. Indeed, in the Gaussian case $\alpha=2$ , this lower bound is derived from the translation formula of the Gaussian measure. The proof for $\alpha<2$ will consist in mimicking the Gaussian case.

If the $\limsup$ in the right-hand side of $(i)$ is infinite, then the statement is trivial. If it is finite, we take some $\tau>0$ , such that $||h_{n}||_{\ell^{\alpha}}^{\alpha}\leq\tau$ , for all $n\in\mathbb{N}$ . For any $h\in\mathbb{R}^{n}$ , let $W_{\alpha}(h)=\sum_{i=1}^{n}|h_{i}|^{\alpha}$ . Then, we have,

[TABLE]

where $\ell_{n}$ denotes the Lebesgue measure on $\mathbb{R}^{n}$ , and $Z_{n}$ is the normalizing factor. If $\alpha\in(0,1]$ , then for any $s,t\in\mathbb{R}$ ,

[TABLE]

Thus,

[TABLE]

Therefore,

[TABLE]

which gives the claim in the case $\alpha\in(0,1]$ . Note that the same argument for $\mu_{\alpha}$ instead of $\nu_{\alpha}$ gives without changes the estimate $(ii)$ .
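The mechanism behind the case $\alpha\in(0,1]$ can be illustrated numerically in dimension one. The sketch below assumes that the elementary inequality used is the sub-additivity $|s+t|^{\alpha}\leq|s|^{\alpha}+|t|^{\alpha}$ , which gives the pointwise density bound $e^{-|x+h|^{\alpha}}\geq e^{-|h|^{\alpha}}e^{-|x|^{\alpha}}$ and hence $\nu_{\alpha}(E+h)\geq e^{-|h|^{\alpha}}\nu_{\alpha}(E)$ for an interval $E$ . The normalizing constant cancels on both sides, so unnormalized masses are compared. All function names are hypothetical.

```python
import math


def is_subadditive(s: float, t: float, alpha: float) -> bool:
    """Check |s + t|^alpha <= |s|^alpha + |t|^alpha (valid for 0 < alpha <= 1)."""
    return abs(s + t) ** alpha <= abs(s) ** alpha + abs(t) ** alpha + 1e-12


def unnormalized_mass(a: float, b: float, alpha: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of exp(-|x|^alpha) over [a, b]."""
    width = (b - a) / steps
    return width * sum(
        math.exp(-abs(a + (i + 0.5) * width) ** alpha) for i in range(steps)
    )


def translate_bound_holds(a: float, b: float, h: float, alpha: float) -> bool:
    """Check mass([a, b] + h) >= exp(-|h|^alpha) * mass([a, b])."""
    shifted = unnormalized_mass(a + h, b + h, alpha)
    original = unnormalized_mass(a, b, alpha)
    return shifted >= math.exp(-abs(h) ** alpha) * original
```

The check is one-dimensional only; the product structure of $\nu_{\alpha}^{n}$ and the additivity of $W_{\alpha}$ over coordinates are what carry the same bound to $\mathbb{R}^{n}$ .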

Now, if $\alpha\in(1,2]$ , we have for any $s,t\in\mathbb{R}$ ,

[TABLE]

where $\mathrm{sg}(st)$ stands for the sign of $st$ . Thus, for any $y,h\in\mathbb{R}^{n}$ ,

[TABLE]

where

[TABLE]

and $v(y,h)=\mathrm{sg}(yh)|y|^{\alpha-1}|h|$ . We have,

[TABLE]

Jensen’s inequality yields,

[TABLE]

But, by Cauchy-Schwarz inequality,

[TABLE]

But $\int v(x,h)d\nu_{\alpha}(x)=0$ for any $h\in\mathbb{R}$ , since $v(-x,h)=-v(x,h)$ and $\nu_{\alpha}$ is symmetric. Thus,

[TABLE]

Using the fact that $\alpha\leq 2$ , we get,

[TABLE]

where $c>0$ is some constant. As $W_{\alpha}(h_{n})\leq\tau$ , we have

[TABLE]

Note that it was actually very important that we did not bound $\mathrm{sg}(xy)$ by $1$ in (41), so that $v(.,h)$ is of mean $0$ under $\nu_{\alpha}$ , and $\int V(x,h_{n})^{2}d\nu_{\alpha}^{n}(x)$ is not too big. When one replaces $\nu_{\alpha}$ by $\mu_{\alpha}$ , this is exactly where one needs to make an assumption on the speed to identify the leading term.

By assumption, we know that there is some $\eta>0$ such that for $n$ large enough, $\nu_{\alpha}^{n}(E)>\eta$ . Thus, we get for $n$ large enough,

[TABLE]

Taking the $\liminf$ at the exponential scale $v(n)$ , we get the claim.

∎

We can now give a proof of Theorem 2.1. We will essentially follow the proof of the LDP of Wiener chaoses (see [31]), replacing the use of the Cameron-Martin formula by Proposition 4.4, and the Gaussian isoperimetric inequality with Proposition 4.1.

Proof of Theorem 2.1.

Without loss of generality we can and will assume that $N=\mathbb{N}$ . **Property of the rate function:** By assumption $(iv)$ , for any $x\in\mathcal{X}$ ,

[TABLE]

This formulation shows that $I_{\alpha}(x)<+\infty$ if and only if there is a sequence $h_{n}\in\mathbb{R}^{n}$ , such that

[TABLE]

Thus, $I_{\alpha}(x)\leq\tau$ , for some fixed $\tau\geq 0$ , if and only if $x$ is a limit point of a sequence $(F_{n}(h_{n}))_{n\in N}$ such that $\limsup_{n}W_{\alpha}(h_{n})\leq\tau$ . Therefore, $I_{\alpha}$ is lower semi-continuous. Moreover,

[TABLE]

As by assumption $(iv)$ the set on the right-hand side is compact, we conclude that $I_{\alpha}$ is a good rate function.

**Lower bound:** Let $x\in\mathcal{X}$ be such that $I_{\alpha}(x)<+\infty$ . By assumption $(iv)$ , there is a sequence $h_{n}\in\mathbb{R}^{n}$ such that

[TABLE]

Let $\delta>0$ . For $n$ large enough,

[TABLE]

Let

[TABLE]

Note that

[TABLE]

By assumption $(i)$ , $\mathbb{P}(X_{n}\in E)$ goes to $1$ as $n$ goes to $+\infty$ . From Proposition 4.4, we deduce

[TABLE]

**Upper bound:** Let $A$ be a closed subset of $\mathcal{X}$ . We can assume without loss of generality that $\inf_{A}I_{\alpha}>0$ . Let $r>0$ be such that $\inf_{A}I_{\alpha}>r$ . Put another way,

[TABLE]

As $I_{\alpha}$ is a good rate function, we can find a $\delta>0$ such that

[TABLE]

where $V_{\delta}$ denotes the $\delta$ -neighborhood for the distance $d$ . Thus,

[TABLE]

Let

[TABLE]

Define, similarly as for the lower bound, the event

[TABLE]

By assumption $(i)$ , we know that $\mathbb{P}(X_{n}\in E_{\delta})$ goes to $1$ as $n$ goes to $+\infty$ . We claim that

[TABLE]

Indeed, if $h_{n}\in r^{1/\alpha}B_{\ell^{\alpha}}$ and $x\in E_{\delta}$ , then $I_{\alpha}(F_{n}(h_{n}))\leq r$ , from the definition (4) of $I_{\alpha}$ , and

[TABLE]

so that $x+v(n)^{1/\alpha}h_{n}\in U$ . With this observation we get,

[TABLE]

If $\alpha=2$ , we get by the Gaussian isoperimetric inequality (see [32, Theorem 2.5]) for any $n$ large enough so that $\mathbb{P}(X_{n}\in E_{\delta})\geq 1/2$ ,

[TABLE]

which gives the upper bound.

Let now $\alpha<2$ , and $t=t_{\delta/4}$ , where $t_{\delta/4}$ is given by assumption $(ii)$ . With the notation of Theorem 2.1 define,

[TABLE]

By Markov’s inequality and assumption $(ii)$ , we deduce

[TABLE]

From assumption $(i)$ , we deduce that $\liminf_{n}\mathbb{P}(X_{n}\in F)>0$ . Furthermore, we claim that

[TABLE]

Recall that

[TABLE]

Now, if $X_{n}\in F$ and $h\in tB_{\ell^{2}}$ , then by definition of $\mathcal{L}_{n}$ , for all $k\in rB_{\ell^{\alpha}}$

[TABLE]

which yields (43) by the triangle inequality. Thus the requirements of Proposition 4.1 are met, and we get

[TABLE]

As this inequality is true for any $r<\inf_{A}I_{\alpha}$ , we get the upper bound.

∎

We will end this section with the proof of Theorem 2.3.

Proof of Theorem 2.3.

We will follow the same steps as for the proof of Theorem 2.1. The compactness assumption $(iii)$ , and the assumption $(iv)^{\prime}$ yield that $I_{\alpha}$ is a good rate function. As shown in the proof of Theorem 2.1, a large deviations upper bound holds with speed $v(n)$ and rate function $I_{\alpha}$ , under the assumptions $(i)-(ii)-(iii)$ . Thus, we only have to prove the lower bound. Let $x\in\mathcal{X}$ such that $I_{\alpha}^{+}(x)<+\infty$ . We know that there is a sequence $h_{n}\in\mathbb{R}_{+}^{n}$ such that

[TABLE]

Proceeding as in the proof of Theorem 2.1, if $\delta>0$ , then for $n$ large enough,

[TABLE]

Let

[TABLE]

Note that

[TABLE]

By assumption $(i)$ , $\mathbb{P}(X_{n}\in E)$ goes to $1$ as $n$ goes to $+\infty$ . From Proposition 4.4, we deduce

[TABLE]

which ends the proof of the lower bound. Due to assumption $(iv)^{\prime}$ the lower bound and upper bound rate functions match so that a full LDP holds. ∎

5 Concentration inequalities

We will prove in this section the concentration inequalities of Propositions 2.14, 2.18 and 2.19 for the linear statistics, the empirical spectral measure and largest eigenvalue of Wigner matrices satisfying the concentration property $\mathcal{C}_{\alpha}$ introduced by definition 2.12.

5.1 Some examples of Wigner matrices satisfying $\mathcal{C}_{\alpha}$

Before going into the proofs, we will review some workable criteria for a Wigner matrix to satisfy the concentration property $\mathcal{C}_{\alpha}$ when $\alpha\in[1,2]$ . The case $\alpha=2$ of normal concentration has drawn most of the attention, and we refer the reader to [32, section 8.5], [28] or also [27, Part II] for a presentation of the different examples of classical models of random matrices having normal concentration.

When $\alpha\in[1,2]$ we introduce the notion of Poincaré-type inequalities in the finite-dimensional setting. Let $d_{m}$ be some distance on $\mathbb{R}^{m}$ . For a smooth function $f:\mathbb{R}^{m}\to\mathbb{R}$ , we define the length of the gradient of $f$ with respect to the distance $d_{m}$ by,

[TABLE]

We say that a probability measure $\mu$ satisfies a Poincaré-type inequality on $(\mathbb{R}^{m},d_{m})$ if there is some $\lambda>0$ , such that for any smooth $f:\mathbb{R}^{m}\to\mathbb{R}$ ,

[TABLE]

where the length of the gradient is taken with respect to $d_{m}$ .

Following Gozlan [25, Definition 1.1], we will say that a probability measure $\mu$ on $\mathbb{R}^{m}$ satisfies $\mathbb{SP}(\omega_{\alpha},\lambda)$ if it satisfies the Poincaré-type inequality on $(\mathbb{R}^{m},d_{\omega_{\alpha}})$ with spectral gap $\lambda$ , where $d_{\omega_{\alpha}}$ is the distance defined in (18).

By the results of Bobkov-Ledoux [16, Corollary 3.2], and Gozlan [25, Proposition 1.2], we know that if a Wigner matrix $X$ has entries satisfying $\mathbb{SP}(\omega_{\alpha},\lambda)$ , then it satisfies a two-level deviations inequality: for any Borel subset $A$ of $\mathcal{H}_{n}^{(\beta)}$ such that $\mathbb{P}(X\in A)\geq 1/2$ , and $r>0$ ,

[TABLE]

where $L$ only depends on $\lambda$ , and by [25, Proposition 1.2] can be taken as

[TABLE]

with $w(t)=\min(|t|^{2},|t|)$ for any $t\in\mathbb{R}$ . In particular, such a Wigner matrix has concentration $\mathcal{C}_{\alpha}$ .

5.1 Remark.

We note that when $\alpha>2$ , the Poincaré-type inequality $\mathbb{SG}(\omega_{\alpha},\lambda)$ yields a different deviation inequality (the one above is also true for $\alpha>2$ but not sharp) where the mixed enlargement is replaced by $\sqrt{r}B_{\ell^{2}}\cap r^{\frac{1}{\alpha}}B_{\ell^{\alpha}}$ (see [25] for more details).

A workable criterion for a probability measure on $\mathbb{R}$ of the form $\mu=e^{-V}dx$ is given by Gozlan [25, Proposition 1.2] in terms of a growth condition of the potential $V$ . More precisely, if

[TABLE]

then $\mu$ satisfies $\mathbb{SG}(\omega_{\alpha},\lambda)$ on $\mathbb{R}$ . We mention also that a criterion is available in higher dimension (although more intricate) in [25, Proposition 3.5], which one may use for the complex entries of Wigner matrices.

In the case $\alpha=1$ of the classical Poincaré inequality, we know by Bobkov [17] (or by Bakry, Barthe, Cattiaux, and Guillin [8]) that any log-concave law on $\mathbb{R}^{n}$ satisfies a Poincaré inequality with a certain spectral gap depending on the dimension. Thus, any Wigner matrix with entries whose laws are log-concave will satisfy $\mathcal{C}_{1}$ .

When $\alpha\in[1,2]$ , the concentration property $\mathcal{C}_{\alpha}$ is equivalent (see [32, Proposition 1.3]) to the following deviation inequality of Lipschitz functions around their medians, which will be useful in the applications.

5.2 Lemma.

Let $\alpha\in[1,2]$ . Let $X$ be a Wigner matrix with entries satisfying $\mathcal{C}_{\alpha}$ for some $\kappa>0$ . Let $f:\mathcal{H}_{n}^{(\beta)}\to\mathbb{R}$ be a function respectively $L_{2}$ -Lipschitz and $L_{\alpha}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ , and $||\ ||_{\ell^{\alpha}}$ . Then, for any $t>0$ ,

[TABLE]

where $m_{f}$ denotes the median of $f(X)$ .

5.2 A deviation inequality for $\nu_{\alpha}^{n}$ , $\alpha\in(0,1)$

In the case $\alpha\in(0,1)$ , we will show that the Wigner matrices in the class $\mathcal{S}_{\alpha}$ satisfy the concentration property $\mathcal{C}_{\alpha}$ . This fact will follow from the study of the concentration property of the product measures $\mu_{\alpha}^{n}$ and $\nu_{\alpha}^{n}$ . It can be shown that the probability measure $\nu_{\alpha}^{n}$ satisfies a weak Poincaré inequality (see [9, Chapter 7 §7.5]). The derivation of a deviations inequality from the weak Poincaré inequality has been investigated by Barthe, Cattiaux and Roberto [11], and yields a concentration inequality with respect to Euclidean enlargements. We will follow another path which consists, as it was the case for $\alpha\geq 1$ , in transporting Talagrand’s deviation inequality for the symmetric exponential law (17) onto $\nu_{\alpha}$ with $\alpha<1$ , using the estimate on the monotone rearrangement map proved in Lemma 3.10. We start with the one-sided probability measure $\mu_{\alpha}$ .

5.3 Proposition.

Let $n\in\mathbb{N}$ , $n\geq 2$ , and $\alpha\in(0,1)$ . There is a constant $c>0$ depending on $\alpha$ , such that for any $r>0$ , $A$ Borel subset of $\mathbb{R}_{+}^{n}$ , and $C>0$ such that $\mu_{\alpha}^{n}(A)>1/C$ ,

[TABLE]

5.4 Remark.

This deviation inequality is not optimal in the sense that it fails to capture the Gaussian fluctuations of empirical means from the central limit theorem. This is due to the $(\log n)^{1/\alpha-1}$ factor in front of the $\ell^{2}$ -ball, which comes from the fact that the increasing rearrangement from $\mu_{1}$ to $\mu_{\alpha}$ is not a Lipschitz function.

But on the other hand, the $(\log n)^{\frac{1}{\alpha}-1}$ factor seems to be sharp, since it yields a non-trivial deviation inequality for

[TABLE]

where $m$ is the median of the maximum function under $\mu_{\alpha}^{n}$ . But from the extreme value theory (see [30, Theorem 1.6.2, Corollary 1.6.3]),

[TABLE]

converges in law to the Gumbel distribution $G$ , where

[TABLE]

for some constants $c_{1},c_{2}$ . Moreover, as the Gumbel distribution has a right-tail behaving like $e^{-t}$ , we see that the $B_{\ell^{1}}$ part in the enlargement of the deviations inequality of Proposition 5.3 is justified.

Proof of Proposition 5.3.

Let $\Phi=\varphi^{\otimes n}:\mathbb{R}^{n}\to\mathbb{R}^{n}$ , defined by $\Phi(x)=(\varphi(x_{i}))_{1\leq i\leq n}$ , which sends $\mu_{1}^{n}$ to $\mu_{\alpha}^{n}$ . Let $r>0$ , and $A$ be a measurable subset of $\mathbb{R}_{+}^{n}$ such that $\mu_{1}^{n}(A)>0$ . In a first step, we will use Lemma 3.10 to see how the map $\Phi$ transforms the set $A+\sqrt{r}B_{\ell^{2}}+rB_{\ell^{1}}$ . Actually, to transport the deviation inequality of $\mu_{1}^{n}$ it is sufficient to understand how $\Phi$ deforms $A^{\prime}+\sqrt{r}B_{\ell^{2}}+rB_{\ell^{1}}$ for a well-chosen subset $A^{\prime}$ of $A$ such that $\mu_{1}^{n}(A^{\prime})>0$ . To this end, define

[TABLE]

where $C$ is some constant which will be chosen later. Let $x\in A^{\prime}$ , $y\in B_{\ell^{2}}$ , and $z\in B_{\ell^{1}}$ . By Lemma 3.10, we have

[TABLE]

where the inequality has to be understood coordinate-wise, the functions being applied coordinate by coordinate to the vectors in $\mathbb{R}^{n}$ , and where $K$ is a constant depending on $\alpha$ which will vary in the rest of the proof without changing name. Thus,

[TABLE]

For $C\log n\geq 1$ , we have

[TABLE]

Once again by Lemma 3.10, we get

[TABLE]

where again this inequality is valid coordinate-wise. Using the convexity of the power function $t\mapsto|t|^{\frac{1}{\alpha}-1}$ , or its sub-additivity, we get

[TABLE]

Note that Hölder’s inequality implies

[TABLE]

with $\frac{1}{\gamma}=\frac{1}{2}(\frac{1}{\alpha}+1)$ . Thus,

[TABLE]

Therefore,

[TABLE]

We now simplify the enlargement on the right-hand side. Observe that for any $0<a\leq b\leq c$ ,

[TABLE]

Indeed, if $x\in r^{1/b}B_{\ell^{b}}$ , then

[TABLE]

and

[TABLE]

Thus, $x=x\mathds{1}_{|x|\geq 1}+x\mathds{1}_{|x|<1}$ , with $x\mathds{1}_{|x|\geq 1}\in r^{1/a}B_{\ell^{a}}$ and $x\mathds{1}_{|x|<1}\in r^{1/c}B_{\ell^{c}}$ . Therefore, as $\alpha\leq 2\alpha\leq 2$ , $\alpha\leq\gamma\leq 2\alpha$ , and $C\log n\geq 1$ ,

[TABLE]
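The splitting argument for $0<a\leq b\leq c$ can be made concrete on a small vector. The sketch below cuts a vector at the threshold $|x_{i}|=1$ and compares the resulting power sums, under the convention (assumed here) that $x\in r^{1/q}B_{\ell^{q}}$ means $\sum_{i}|x_{i}|^{q}\leq r$ . The helper names are illustrative.

```python
def power_sum(x: list[float], q: float) -> float:
    """Return sum_i |x_i|^q, so x lies in r^(1/q) B_{l^q} iff power_sum(x, q) <= r."""
    return sum(abs(xi) ** q for xi in x)


def split_by_magnitude(x: list[float]) -> tuple[list[float], list[float]]:
    """Split x into its entries of modulus >= 1 and its entries of modulus < 1."""
    large = [xi if abs(xi) >= 1 else 0.0 for xi in x]
    small = [xi if abs(xi) < 1 else 0.0 for xi in x]
    return large, small


def decomposition_holds(x: list[float], a: float, b: float, c: float) -> bool:
    """Check the ball inclusion on the single vector x, with r = sum_i |x_i|^b."""
    r = power_sum(x, b)
    large, small = split_by_magnitude(x)
    # |t|^a <= |t|^b when |t| >= 1 and a <= b; |t|^c <= |t|^b when |t| < 1 and b <= c.
    return power_sum(large, a) <= r and power_sum(small, c) <= r
```

For instance, with $x=(2,-0.5,0.3,-1.5,0.9)$ and $(a,b,c)=(1/2,1,2)$ one gets $r=5.2$ , while the large part has $\ell^{1/2}$ power sum about $2.64$ and the small part has $\ell^{2}$ power sum $1.15$ .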

Thus,

[TABLE]

Applying the deviation inequality (17) of $\mu_{1}^{n}$ , we get

[TABLE]

where $L>0$ is some constant independent of $n$ . But, since

[TABLE]

for some numerical constant $c_{0}>0$ , we have by Markov’s inequality

[TABLE]

Thus,

[TABLE]

But, as $\mu_{\alpha}^{n}=\mu_{1}^{n}\circ\Phi^{-1}$ , and $\Phi$ is a bijection,

[TABLE]

Using (47), we deduce

[TABLE]

Adjusting the constant $c$ we get the claim. ∎

As observed in remark 3.7, the monotone rearrangement $\psi$ of $\nu_{1}$ onto $\nu_{\alpha}$ , satisfies the same estimate of Lemma 3.10 as $\varphi$ . Therefore, the same arguments as for the proof of Proposition 5.3 can be carried out, and yield a similar deviation inequality for $\nu_{\alpha}^{n}$ which we stated in Proposition 2.13.

In view of this deviation inequality for $\nu_{\alpha}^{n}$ , we see that a Wigner matrix in the class $\mathcal{S}_{\alpha}$ when $\alpha\in(0,1)$ satisfies the concentration property $\mathcal{C}_{\alpha}$ .

As for the case where $\alpha\in[1,2]$ , the concentration property $\mathcal{C}_{\alpha}$ can be translated into a deviation inequality for Lipschitz or Hölder functions when $\alpha\in(0,1)$ , as stated in the following lemma.

5.5 Lemma.

Let $\alpha\in(0,1)$ . Assume $X$ satisfies the concentration property $\mathcal{C}_{\alpha}$ for some $\kappa>0$ . Let $f:\mathcal{H}_{n}^{(\beta)}\to\mathbb{R}$ be a function respectively $L_{1}$ -Lipschitz and $L_{2}$ -Lipschitz with respect to $||\ ||_{\ell^{1}}$ , and $||\ ||_{\ell^{2}}$ . There is a constant $c>0$ depending on $\alpha$ , such that if $f$ is moreover $L_{\alpha}$ -Lipschitz with respect to $||\ ||_{\ell^{\alpha}}^{\alpha}$ , then for any $t>0$ ,

[TABLE]

whereas if

[TABLE]

for some $L_{\alpha}^{\prime}>0$ , then for any $t>0$ ,

[TABLE]

where $m_{f}$ is the median of $f(X)$ .

5.3 Concentration inequalities for the largest eigenvalue

We will prove in this section Proposition 2.19. We will see that it follows easily from Weyl’s inequality [15, Theorem III.2.1], as it enables one to compute the Lipschitz constants of the largest eigenvalue function with respect to the distances $||\ ||_{\ell^{p}}$ when $p\in[1,2]$ and $||\ ||_{\ell^{p}}^{p}$ when $p\in(0,1)$ on $\mathcal{H}_{n}^{(\beta)}$ .
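The chain of comparisons behind this Lipschitz estimate can be checked by hand on $2\times 2$ real symmetric matrices, where the eigenvalues are explicit. The sketch below is restricted to that toy case and omits the $n^{-1/2}$ normalization. It verifies that the gap between largest eigenvalues is at most the operator norm of the difference, which is at most its $p$ -Schatten (quasi-)norm, which is at most its entrywise $\ell^{p}$ (quasi-)norm for $p\leq 2$ . A matrix $[[a,b],[b,d]]$ is encoded as the triple `(a, b, d)`; this encoding and the function names are assumptions made for the illustration.

```python
import math

Sym2 = tuple[float, float, float]  # (a, b, d) encodes the matrix [[a, b], [b, d]]


def eigenvalues(m: Sym2) -> tuple[float, float]:
    """Eigenvalues of a 2x2 real symmetric matrix, largest first."""
    a, b, d = m
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius


def schatten(m: Sym2, p: float) -> float:
    """p-Schatten (quasi-)norm: (sum of |eigenvalue|^p)^(1/p)."""
    return sum(abs(lam) ** p for lam in eigenvalues(m)) ** (1 / p)


def entrywise(m: Sym2, p: float) -> float:
    """Entrywise l^p (quasi-)norm; the off-diagonal entry b appears twice."""
    a, b, d = m
    return (abs(a) ** p + 2 * abs(b) ** p + abs(d) ** p) ** (1 / p)


def weyl_chain_holds(m1: Sym2, m2: Sym2, p: float, tol: float = 1e-9) -> bool:
    """Check |lmax(m1) - lmax(m2)| <= op norm <= Schatten-p <= entrywise l^p."""
    diff = (m1[0] - m2[0], m1[1] - m2[1], m1[2] - m2[2])
    gap = abs(eigenvalues(m1)[0] - eigenvalues(m2)[0])
    op_norm = max(abs(lam) for lam in eigenvalues(diff))
    return (
        gap <= op_norm + tol
        and op_norm <= schatten(diff, p) + tol
        and schatten(diff, p) <= entrywise(diff, p) + tol
    )
```

For $p=2$ the last inequality is an equality, since both sides are the Frobenius norm.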

Proof of Proposition 2.19.

Let $\alpha\in(0,2]$ . Let $X$ be a Wigner matrix satisfying the concentration property $\mathcal{C}_{\alpha}$ for some $\kappa>0$ . By Weyl’s inequality [15, Theorem III.2.1], the function

[TABLE]

is $n^{-1/2}$ -Lipschitz with respect to the $p$ -Schatten (pseudo-)norm $||\ ||_{p}$ for any $p>0$ , which is defined by

[TABLE]

Let $m_{f}$ denote the median of $f(X)$ , and $t>0$ . As $\alpha\leq 2$ , we have $||\ ||_{\alpha}\leq||\ ||_{\ell^{\alpha}}$ by [43, Theorem 3.32]. Thus, $f$ is also $n^{-1/2}$ -Lipschitz with respect to $||\ ||_{\ell^{\alpha}}$ . Applying Lemmas 5.2 and 5.5 successively to $f$ and $-f$ , we deduce that for any $t>0$ ,

[TABLE]

with $h_{\alpha}$ defined in Proposition 2.19, and where $c_{\alpha}$ is some constant depending on $\alpha$ . Integrating the above inequality (49), we get

[TABLE]

if $\alpha\in(0,1)$ , and

[TABLE]

if $\alpha\in[1,2]$ , which gives the claim.

∎

5.4 Two lemmas on spectral variation of Hermitian matrices

In view of Lemmas 5.2 and 5.5, proving the concentration inequalities of Propositions 2.14 and 2.18 requires computing the Lipschitz constants of the empirical spectral measure of Hermitian matrices, with respect to $||\ ||_{\ell^{p}}$ when $p\in[1,2]$ , and $||\ ||_{\ell^{p}}^{p}$ when $p\in(0,1)$ , and a well-chosen distance on $\mathcal{P}(\mathbb{R})$ .

We will prove and discuss in this subsection Lemmas 2.20 and 2.21. For $p>0$ , we denote by $\mathcal{W}_{p}$ the $L^{p}$ -Wasserstein distance, defined for any probability measures $\mu$ , $\nu$ on $\mathbb{R}$ with finite $p^{\text{th}}$ -moments by,

[TABLE]

if $p\geq 1$ and by,

[TABLE]

if $p\in(0,1)$ , where the infimum is taken over all couplings $\pi$ between $\mu$ and $\nu$ .
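For two empirical measures with the same number $n$ of atoms, the infimum over couplings is attained at a permutation, so the transport cost can be computed by brute force for small $n$ . The sketch below compares this brute-force optimum with the cost of the monotone (sorted) coupling. It assumes the usual convention that the cost is the average of $|x-y|^{p}$ , to which a power $1/p$ is then applied when $p\geq 1$ ; only the cost before taking that power is computed. Function names are illustrative.

```python
from itertools import permutations


def pairing_cost(xs: list[float], ys: list[float], p: float) -> float:
    """Average of |x - y|^p under the pairing xs[i] <-> ys[i]."""
    return sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)


def sorted_cost(xs: list[float], ys: list[float], p: float) -> float:
    """Cost of the monotone coupling, which pairs order statistics."""
    return pairing_cost(sorted(xs), sorted(ys), p)


def optimal_cost(xs: list[float], ys: list[float], p: float) -> float:
    """Brute-force optimum over all permutations (only practical for small n)."""
    return min(pairing_cost(xs, list(perm), p) for perm in permutations(ys))
```

For $p\geq 1$ the two costs coincide, because the cost is convex. For $p<1$ the cost is concave and the sorted coupling can be strictly worse: with atoms $\{0,1\}$ and $\{1,2\}$ and $p=1/2$ , the sorted coupling costs $1$ , whereas leaving the common atom $1$ in place and sending $0$ to $2$ costs $\sqrt{2}/2$ .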

We begin with the proof of Lemma 2.20.

Proof of Lemma 2.20.

By Lidskii’s theorem (see [15, Corollary III 4.2]), we have

[TABLE]

where $\lambda^{\downarrow}(A)$ denotes the vector of eigenvalues of $A$ in decreasing order, and $\prec$ the majorisation relation between vectors of $\mathbb{R}^{n}$ (see [15, Chapter II] for a proper definition). Thus, by [15, Theorem II.3.1] we get, since $x\mapsto|x|^{p}$ is convex as $p\geq 1$ ,

[TABLE]

Using the decreasing coupling between the spectra of $A$ and $B$ , we get

[TABLE]

where $||\ ||_{p}$ denotes the $p$ -Schatten norm, defined in (48). But as $p\leq 2$ , we have by [43, Theorem 3.32],

[TABLE]

which ends the proof of the first inequality of Lemma 2.20.

As a consequence of the Kantorovitch-Rubinstein duality (see [42, Particular case 5.16]), we have

[TABLE]

where $d$ is as in (20). Besides, Jensen’s inequality yields for any $p\geq 1$ ,

[TABLE]

Therefore,

[TABLE]

which gives the second claim of the lemma. ∎

5.6 Remark.

When $p>2$ , the inequality for $A,B\in\mathcal{H}_{n}^{(\beta)}$ ,

[TABLE]

is no longer true, since for $B=0$ it amounts to (52), which is false when $p>2$ , by taking $A=uu^{*}$ , where $u$ is the constant vector.

When $p<1$ , one may hope for the inequality

[TABLE]

to hold. But taking formally $p\to 0$ , would yield

[TABLE]

where $\lambda(A),\lambda(B)$ denote the sets of eigenvalues of $A$ and $B$ . But one can see that changing one entry of a matrix can change the whole spectrum, which disproves (55).

The moral of remark 5.6 is that one cannot have (54) with a constant $1$ on the right-hand side. The cost function $|\ |^{p}$ behaves quite badly when $p<1$ , as it is not convex (see [36] for this transportation problem with concave costs); in particular, the optimal transport map is not necessarily the monotone rearrangement, contrary to the case $p\geq 1$ . We will therefore not investigate further the question of having a spectral variation inequality involving the $L^{p}$ -Wasserstein distance. We prefer to deal with another distance on $\mathcal{P}_{p}(\mathbb{R})$ , the set of probability measures on $\mathbb{R}$ with finite $p^{\text{th}}$ moments, which induces the same topology as $\mathcal{W}_{p}$ and dominates $d$ . This distance is chosen so that, applied to empirical spectral measures, it will be controlled by $||\ ||_{\ell^{p}}^{p}$ in the case where $p\in(0,1)$ .

To this end, let $p\in(0,1)$ and define for any $\mu,\nu\in\mathcal{P}_{p}(\mathbb{R})$ ,

[TABLE]

Taking formally $p$ to $0$ , we retrieve the Kolmogorov-Smirnov distance $d_{KS}$ . Recall that by integrating by parts, we can write

[TABLE]

where NBV denotes the set of normalized functions with bounded variations, that is, functions which are the integrals of finite signed measures, and

[TABLE]

whenever $f$ is the distribution function of the finite signed measure $\sigma$ , and $|\sigma|$ is the total variation of $\sigma$ .

We can actually have a similar formulation for $d_{p}$ , by introducing the fractional integrals of order $p+1$ on $\mathcal{M}_{s}^{p}$ , the set of finite signed measures $\sigma$ such that $|\sigma|$ has a finite $p^{\text{th}}$ -moment, which we defined in (19). We recall that fractional integrals enjoy the following integration by parts formula (see [37, (5.16)]): for $\mu,\nu\in\mathcal{M}_{s}^{p}$ ,

[TABLE]

Thus, we can write

[TABLE]

where the supremum is taken on all $\sigma\in\mathcal{M}_{s}^{p}$ , such that $|\sigma|(\mathbb{R})\leq 1$ . The inequality $d_{p}\geq$ (58) is the consequence of the integration by parts formula (57), whereas the equality is given by taking $\sigma=\delta_{t}$ , for $t\in\mathbb{R}$ . We investigate now the link between the distances $d$ , defined in (20), $\mathcal{W}_{p}$ and $d_{p}$ when $p\in(0,1)$ .

5.7 Proposition.

Let $p\in(0,1)$ . Then, $d_{p}$ , defined in (56), is a distance on $\mathcal{P}_{p}(\mathbb{R})$ , and metrizes the weak topology. More precisely, there is a constant $C_{p}>0$ such that

[TABLE]

for all $\mu,\nu\in\mathcal{P}_{p}(\mathbb{R})$ . One can choose

[TABLE]

Furthermore,

[TABLE]

5.8 Remark.

We actually do not know if the distances $d_{p}$ and $\mathcal{W}_{p}$ are comparable, meaning that the reverse inequality $d_{p}\geq K_{p}\mathcal{W}_{p}$ is true for some $K_{p}>0$ . We do know however, by remark 5.6, that such an inequality cannot hold with some constant $K_{p}$ staying bounded when $p\to 0$ .

Proof.

In view of the formulation of $d_{p}$ as (58), the point of (59) is to represent the function $t\mapsto(z-t)^{-1}$ as the fractional integral of order $p+1$ of some function. The constant $C_{p}$ will arise as a bound, uniform in $\Im z\geq 1$ , on the $L^{1}$ norm of this function divided by $\Gamma(p+1)$ .

The fractional integral of order $p+1$ of the function $t\mapsto(z-t)^{-1}$ is given in [37], which we state in the next lemma.

5.9 Lemma ([37, Chapter 2 (5.25)]).

Let $p\in(0,1)$ . For any $z\in\mathbb{C}$ , $\Im z>0$ , we have

[TABLE]

with

[TABLE]

where $\zeta^{p}$ is the principal branch of the $p^{\text{th}}$ -power on $\mathbb{C}\setminus\mathbb{R}_{-}$ .

Let $\Im z\geq 1$ and $h$ as in (62). We have

[TABLE]

where we used $\Gamma(p+2)=(p+1)\Gamma(p+1)$ . Therefore,

[TABLE]

But, one can recognize an Euler integral of the first kind in the definition of $C_{p}$ , by making successively the changes of variables $t=\tan u$ , and $v=(\cos u)^{2}$ , which yields,

[TABLE]

Therefore by [3, (2.13)], we deduce the value for $C_{p}$ claimed in (60).

Inequality (61) is the consequence of the sub-additivity of the function $x\mapsto x^{p}$ on $\mathbb{R}^{+}$ . More precisely, for any $x,y,t\in\mathbb{R}$ ,

[TABLE]

Integrating the above inequality under a coupling $P$ of two probability measures with finite $p^{\text{th}}$ -moment yields the claim.

From (59), we deduce that the topology induced by $d_{p}$ on $\mathcal{P}_{p}(\mathbb{R})$ is finer than the weak topology, and by (61) that it is coarser than the one induced by $\mathcal{W}_{p}$ . But $\mathcal{W}_{p}$ induces the weak topology on $\mathcal{P}_{p}(\mathbb{R})$ by [42, Theorem 6.9] (as $|\ |^{p}$ is a metric on $\mathbb{R}$ for $p\leq 1$ ), therefore $d_{p}$ induces the weak topology on this set. ∎

We finally prove that the distance $d_{p}$ we introduced, when applied to spectral measures of Hermitian matrices, is dominated by $||\ ||_{\ell^{p}}^{p}$ for $p\in(0,1)$ ; this will directly imply the result of Lemma 2.21.

5.10 Lemma.

Let $p\in(0,1)$ . Let $A,B\in\mathcal{H}_{n}^{(\beta)}$ .

[TABLE]

where $d_{p}$ is defined in (56). In particular,

[TABLE]

where $C_{p}$ is as in (60).

5.11 Remark.

Defining the distance

[TABLE]

for any $\mu,\nu\in\mathcal{P}_{p}(\mathbb{R})$ , we see that we have a similar representation as for $d_{p}$ , that is,

[TABLE]

where $\sigma$ runs over the elements of $\mathcal{M}_{s}^{p}$ such that $|\sigma|(\mathbb{R})\leq 1$ . Moreover, we clearly get the same inequality as (63) for $d_{p}$ .

Proof.

As $p\leq 2$ , the second inequality of (63) is due to [43, Theorem 3.32]. To prove the first inequality, we begin by recalling an inequality originally due to Rotfel’d (see Thompson [41] for an extension and a simpler proof). Let $F:\mathbb{R}_{+}^{2n}\to\mathbb{R}$ be a concave symmetric function. Then for any $A,B\in\mathcal{H}_{n}^{(\beta)}$ positive semi-definite,

[TABLE]

where $\lambda(C)$ denotes the vector of eigenvalues of a Hermitian matrix $C$ . Note that since $F$ is symmetric, there is no ambiguity in the writing. Let $t\in\mathbb{R}$ . We have,

[TABLE]

In particular, if we denote $\lambda_{1}(C)\geq\lambda_{2}(C)\geq...\geq\lambda_{n}(C)$ the eigenvalues of some Hermitian matrix $C$ , then by Weyl’s inequality [15, Theorem III.2.1], for any $i\in\{1,...,n\}$ ,

[TABLE]

Therefore,

[TABLE]

Define

[TABLE]

Since $A,B$ are Hermitian,

[TABLE]

As $F$ is non-decreasing coordinate-wise,

[TABLE]

Rotfel’d inequality gives

[TABLE]

Thus,

[TABLE]

Applying this inequality with $A+B$ , $-B$ instead of $A$ and $B$ , we get the first claim. The inequality (63) is just a reformulation of the above inequality and a use of the comparison (52) between $\ell^{p}$ -(quasi)-norm and $p$ -Schatten (quasi)-norm. Finally, using Proposition 5.7, we deduce that (64) is true. ∎

With Lemmas 5.10 and 2.20, we can now give a proof of Propositions 2.14 and 2.18.

Proof of Proposition 2.14.

Let $\alpha\in(0,2]$ and let $X$ be a Wigner matrix satisfying the concentration property $\mathcal{C}_{\alpha}$ with some $\kappa>0$ . Lemma 2.20 and Hölder’s inequality allow us to say that if $f:\mathbb{R}\to\mathbb{R}$ is $1$ -Lipschitz, then the function

[TABLE]

where $\lambda_{1}(Y),...,\lambda_{n}(Y)$ denote the eigenvalues of $Y$ , is $n^{-\frac{1}{2}-\frac{1}{p}}$ -Lipschitz with respect to $||\ ||_{\ell^{p}}$ for any $p\in[1,2]$ . Thus, using Lemma 5.2, we deduce the concentration inequality for the linear statistics of Lipschitz functions of Proposition 2.14 in the case $\alpha\in[1,2]$ .

Assume now that $\alpha\in(0,1)$ , that $f$ is $1$ -Lipschitz, and moreover that $f$ can be written $f=\mathcal{I}_{\pm}^{1+\alpha}(\sigma)$ for some $\sigma\in\mathcal{M}_{s}^{\alpha}$ such that $|\sigma|(\mathbb{R})\leq m$ . Then by Lemma 5.10 (and remark 5.11), we know that the map (65) is $\Gamma(\alpha+1)^{-1}n^{-1-\frac{\alpha}{2}}m$ -Lipschitz with respect to $||\ ||_{\ell^{\alpha}}^{\alpha}$ . Thus we can deduce from Lemma 5.5 the second concentration inequality of Proposition 2.14.

∎

We prove now Proposition 2.18.

Proof of Proposition 2.18.

Fix some $z\in\mathcal{K}$ . Let $f_{z}$ denote the function on $\mathcal{H}_{n}^{(\beta)}$ defined by,

[TABLE]

As $\Im z\geq 1$ , we see that the function $x\mapsto(z-x)^{-1}$ is $1$ -Lipschitz. Moreover, we know by Lemma 5.9 that when $\alpha\in(0,1)$ ,

[TABLE]

with $||h||_{\ell^{1}}\leq\Gamma(\alpha+1)C_{\alpha}$ , where $C_{\alpha}$ is as in (60). Let $m_{z}$ be the median of $f_{z}(X)$ . Let also $r>0$ . We deduce by Proposition 2.14, and using remark 2.17 in the case $\alpha\in(0,1)$ , that there is a constant $c_{\alpha}$ depending on $\alpha$ such that,

[TABLE]

where $k_{\alpha}$ is defined in the statement of Proposition 2.18. Integrating this inequality, we get

[TABLE]

with $\varepsilon_{n}=O(\kappa n^{-1}(\log n)^{(\frac{1}{\alpha}-1)_{+}})$ , uniformly in $z\in\mathbb{C}$ , $\Im z\geq 1$ . With this notation, we get for any $t>0$ ,

[TABLE]

Let $\mathcal{N}_{t}$ be a $t$ -net of $\mathcal{K}$ . As $z\mapsto f_{z}(X)$ is $1$ -Lipschitz on $\{z\in\mathbb{C}:\Im z\geq 1\}$ , we have

[TABLE]

As $\mathcal{K}$ is a subset of $\mathbb{C}$ of diameter less than $1$ , we can find a $t$ -net $\mathcal{N}_{t}$ such that $|\mathcal{N}_{t}|\leq t^{-2}$ . Thus,

[TABLE]

which, adjusting the constant $c_{\alpha}$ , gives the claim.

∎
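The counting step $|\mathcal{N}_{t}|\leq t^{-2}$ can be illustrated with a grid. The Python sketch below is not the construction of the paper: it assumes, for illustration only, that $\mathcal{K}$ sits inside a unit square with lower-left corner `corner`, and it uses the cell centres of a grid of side $t\sqrt{2}$, so that every point of the square is within distance $t$ of a centre. The names `grid_net` and `dist_to_net` are ours.

```python
import math

def grid_net(corner, t):
    """Centres of a grid of side t*sqrt(2) covering the unit square at `corner`."""
    side = t * math.sqrt(2)
    k = math.ceil(1 / side)  # number of cells per direction
    return [corner + complex((i + 0.5) * side, (j + 0.5) * side)
            for i in range(k) for j in range(k)]

def dist_to_net(z, net):
    """Distance from the point z to the closest point of the net."""
    return min(abs(z - w) for w in net)

t = 0.1
net = grid_net(0j, t)
# sample points of the unit square on a regular 21 x 21 lattice
samples = [complex(a / 20, b / 20) for a in range(21) for b in range(21)]
worst = max(dist_to_net(z, samples_point := z and z or z, ) if False else dist_to_net(z, net)
            for z in samples)
print(len(net), worst)
```

For this grid the cardinality is $\lceil 1/(t\sqrt{2})\rceil^{2}$, which stays below $t^{-2}$ once $t$ is small enough.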

6 Deterministic equivalents for Wigner matrices

We will prove in this section some uniform deterministic equivalents for the spectral measure and largest eigenvalue of deformed Wigner matrices having concentration $\mathcal{C}_{\alpha}$ for $\alpha\in(0,2)$ (see definition 2.12), using the inequalities proved in the preceding section. We will also prove a deterministic equivalent for traces of polynomials of deformed Wigner matrices, but which will not rely on concentration arguments. In particular, these deterministic equivalents will entail that assumption $(i)$ of Theorem 2.1 holds for the spectral measure, the largest eigenvalue and the traces of polynomials of Wigner matrices in $\mathcal{S}_{\alpha}$ . More precisely, we will prove the following propositions.

6.1 Proposition.

Let $\alpha\in(0,2)$ . Let $X$ be a Wigner matrix such that $\mathbb{E}|X_{1,2}-\mathbb{E}X_{1,2}|^{2}=1$ and satisfying the concentration property $\mathcal{C}_{\alpha}$ . For any $r>0$ ,

[TABLE]

in probability, where $d$ is the distance defined in (20) .

6.2 Remark.

This statement fails when $\alpha=2$ since $X/\sqrt{n}$ is in $rn^{1/2}B_{\ell^{2}}$ for some $r>0$ , with positive probability uniform in $n$ . On the one hand, by Wigner’s theorem (see [2])

[TABLE]

in probability, where for any $a>0$ ,

[TABLE]

On the other hand, by continuity of the free convolution (see [14, Proposition 4.13]),

[TABLE]

in probability, and we have $\mu_{sc}\boxplus\mu_{sc}=\mu_{sc,\sqrt{2}}$ by [2, Example 5.3.26].
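The identity $\mu_{sc}\boxplus\mu_{sc}=\mu_{sc,\sqrt{2}}$ can be sanity-checked at the level of moments, since free convolution adds free cumulants. The Python sketch below assumes that $\mu_{sc,a}$ denotes the semicircle law rescaled by $a$ (variance $a^{2}$), whose even moments are $C_{k}a^{2k}$ with $C_{k}$ the Catalan numbers. The helper names are ours, and the recursion is the standard moment-cumulant formula over non-crossing partitions.

```python
from math import comb

def moments_from_free_cumulants(kappa, order):
    """Moments m_0..m_order from free cumulants kappa = {s: kappa_s}."""
    m = [1] + [0] * order

    def compositions_sum(parts, total):
        # sum over i_1 + ... + i_parts = total of m[i_1] * ... * m[i_parts]
        if parts == 0:
            return 1 if total == 0 else 0
        return sum(m[i] * compositions_sum(parts - 1, total - i)
                   for i in range(total + 1))

    for n in range(1, order + 1):
        # m_n = sum_s kappa_s * sum_{i_1+...+i_s = n-s} m_{i_1} ... m_{i_s}
        m[n] = sum(k * compositions_sum(s, n - s)
                   for s, k in kappa.items() if s <= n)
    return m

def catalan(k):
    return comb(2 * k, k) // (k + 1)

# The variance-1 semicircle law has kappa_2 = 1 as its only free cumulant;
# adding two copies gives kappa_2 = 2.
kappa_sum = {2: 1 + 1}
m = moments_from_free_cumulants(kappa_sum, 8)
# moments of the semicircle law of variance 2: C_k * 2**k in even degree 2k
expected = [catalan(n // 2) * 2 ** (n // 2) if n % 2 == 0 else 0
            for n in range(9)]
print(m, expected)
```

Matching the first moments is of course only a consistency check, not a proof of the identity.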

6.3 Proposition.

Let $\alpha\in(0,2)$ . Let $X$ be a centered Wigner matrix satisfying the concentration property $\mathcal{C}_{\alpha}$ such that $\mathbb{E}|X_{1,2}|^{2}=1$ . Define the function $\rho$ by,

[TABLE]

For any $r>0$ ,

[TABLE]

in probability.

For the traces of polynomials of independent Wigner matrices we will prove the next proposition.

6.4 Proposition.

Let $\alpha\in(0,2]$ . Let $P\in\mathbb{C}\langle\textbf{X}\rangle$ be a non-commutative polynomial of total degree $d>\alpha$ . Let $\textbf{X}=(X_{1},...,X_{p})$ be a family of independent centered Wigner matrices with entries having finite $(d+1)^{\text{th}}$ -moments, such that $\mathbb{E}|M_{1,2}|^{2}=1$ for any $M\in\{X_{1},...,X_{p}\}$ . For any $r>0$ ,

[TABLE]

in probability, where $P_{d}$ is the homogeneous part of degree $d$ of $P$ , $\textbf{s}=(s_{1},...,s_{p})$ is a free family of $p$ semi-circular variables in a non-commutative probability space $(\mathcal{A},\tau)$ and,

[TABLE]

It is interesting to note that, for polynomials, we are able to make the approximation hold uniformly in $H\in rB_{\ell^{2}}$ , which is why we can consider the Gaussian case in our large deviations principle of Theorem 2.9.

6.1 Deterministic equivalents in expectation

Our approach to proving Propositions 6.1 and 6.3 consists in first showing the proposed uniform deterministic equivalents in expectation, and then using the concentration inequalities of Section 5 together with a chaining argument to show that these equivalents hold uniformly in probability.

For the empirical spectral measure, we have such a uniform deterministic equivalent in expectation by the following result of Bordenave and Caputo [19].

6.5 Theorem ([19, Theorem 2.6]).

Let $X$ be a Wigner matrix such that $\mathbb{E}|X_{1,2}-\mathbb{E}X_{1,2}|^{2}=1$ , $\mathbb{E}|X_{1,2}|^{3}<+\infty$ , and $\mathbb{E}X_{1,1}^{2}<+\infty$ . There exists a universal constant $c>0$ such that for any $H\in\mathcal{H}_{n}^{(\beta)}$ ,

[TABLE]

where $\delta$ is defined for any $\mu,\nu\in\mathcal{P}(\mathbb{R})$ ,

[TABLE]

where $g_{\mu}$ and $g_{\nu}$ denote the Stieltjes transforms of $\mu$ and $\nu$ .

For the largest eigenvalue, we will prove the following proposition.

6.6 Proposition.

Let $\alpha\in(0,2)$ . Let $X$ be a centered Wigner matrix such that $\mathbb{E}|X_{1,2}|^{2}=1$ and $\mathbb{E}|X_{1,1}|^{4},\mathbb{E}|X_{1,2}|^{4}<+\infty$ . For any $r>0$ ,

[TABLE]

where $\rho$ is the function defined in (67).

Proof.

In a first step, we will perform a truncation and convolution argument similar to the one used in [18, Proposition 4.1, step 1], in order to reduce the problem to the case where the entries of $X$ satisfy a Poincaré inequality. Let $\varepsilon>0$ and let $G$ be a GUE matrix, that is, $G=\frac{1}{\sqrt{2}}(B+B^{*})$ where $B$ is a matrix with i.i.d complex Gaussian entries with covariance $\frac{1}{2}I_{2}$ , independent from $X$ . We set $X^{(\varepsilon)}$ to be the Hermitian matrix with $(i,j)$ -entry,

[TABLE]

and $Y^{(\varepsilon)}=(1+\varepsilon^{2})^{-1/2}(X^{(\varepsilon)}+\varepsilon G)$ . By [10, Theorem 1.2], $Y^{(\varepsilon)}$ has entries satisfying a Poincaré inequality.

We know by [29, Theorem 2] that there is some constant $C>0$ such that for any centered Wigner matrix $H$ ,

[TABLE]

This inequality yields as the entries of $X$ have finite fourth moments,

[TABLE]

But, using Weyl’s inequality [15, Theorem III.2.1], and the fact that $\rho$ is $1$ -Lipschitz, we see that $A\mapsto|\lambda_{X/\sqrt{n}+A}-\rho(\lambda_{A})|$ is $2$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ . Thus, we can focus on proving Proposition 6.6 when $X$ has entries satisfying a Poincaré inequality. We now make a further reduction of the statement to a convergence in probability, and to the case where the supremum is taken over the set $\mathcal{G}$ of $m$ -sparse matrices $A$ (meaning at most $m$ entries are non-zero) with spectral radius bounded by $r$ , for some fixed $r,m>0$ .

Note that by Weyl’s inequality and (52), we have for any $A\in rB_{\ell^{\alpha}}$ ,

[TABLE]

As $||X/\sqrt{n}||$ converges in $L^{2}$ by [2, Theorem 2.1.22, 27], we deduce that, uniformly in $A\in rB_{\ell^{\alpha}}$ , $|\lambda_{X/\sqrt{n}+A}-\rho(\lambda_{A})|$ is uniformly integrable. Therefore it suffices to prove that for any $t>0$ ,

[TABLE]

Let $A\in rB_{\ell^{\alpha}}$ , and $M_{1}\geq...\geq M_{n^{2}}$ be the values $|A_{i,j}|$ in non-increasing order. We have,

[TABLE]

Let now $m\in\mathbb{N}$ and $v_{1},...,v_{m}$ the locations of the $m$ largest values of $|A_{v}|$ . Define $A^{(m)}$ to be the matrix,

[TABLE]

As $\alpha<2$ , we deduce,

[TABLE]

Thus, again by Weyl’s inequality, it is sufficient to prove for any fixed $m\in\mathbb{N}$ , $r>0$ , and $t>0$ ,

[TABLE]

To prove this claim, we will follow a rather classical argument relying on the Frobenius formula used in the study of finite rank perturbations as in [13] for example, to determine the behavior of the largest eigenvalue of deformed models.

Diagonalize $A=UDU^{*}$ , with $U$ of size $n\times m$ such that $U^{*}U=I_{m}$ . By the Frobenius formula (see [13, section 4.1]), $\lambda_{X/\sqrt{n}+A}$ is either in the spectrum of $X/\sqrt{n}$ , denoted $\sigma(X/\sqrt{n})$ , or the largest zero of the function,

[TABLE]

Our main task consists in proving that this function is uniformly close on any compact subset of $\{\Re z>\lambda_{X/\sqrt{n}}\}$ to the following deterministic limit function,

[TABLE]
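For intuition about the Frobenius (determinant) formula, consider the rank-one case $A=\theta uu^{*}$ : by the matrix determinant lemma, an eigenvalue $z$ of $M+A$ lying outside the spectrum of $M$ must satisfy $1-\theta\langle u,(z-M)^{-1}u\rangle=0$ . The Python sketch below checks this on a $2\times 2$ real symmetric example. It is a toy illustration with our own choice of matrices, a deterministic diagonal $M$ standing in for $X/\sqrt{n}$ , and it is not the function $f_{n,A}$ of the paper; the names `largest_eig_sym2` and `secular` are ours.

```python
import math

def largest_eig_sym2(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    mean, half_gap = (a + c) / 2, (a - c) / 2
    return mean + math.hypot(half_gap, b)

def secular(z, diag, theta, u):
    """1 - theta * <u, (z - M)^{-1} u> for the diagonal matrix M = diag(diag)."""
    return 1 - theta * sum(ui * ui / (z - d) for ui, d in zip(u, diag))

diag = (-1.0, 1.0)                   # spectrum of the unperturbed matrix M
theta = 3.0                          # strength of the rank-one perturbation
u = (math.cos(0.7), math.sin(0.7))   # unit vector

# entries of M + theta * u u^T
a = diag[0] + theta * u[0] ** 2
b = theta * u[0] * u[1]
c = diag[1] + theta * u[1] ** 2

top = largest_eig_sym2(a, b, c)
residual = secular(top, diag, theta, u)
print(top, residual)
```

Here the top eigenvalue leaves the spectrum of $M$ and is a zero of the secular function, up to rounding.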

6.7 Lemma.

Let $\delta>0$ and define,

[TABLE]

For any subset $\Omega$ compactly included in $\{z\in\mathbb{C}:\Re z>2+\delta\}$ and $t>0$ ,

[TABLE]

where $f_{n,A}$ and $f_{A}$ are defined in (70), (71).

Assume for the moment that this lemma is true. Note that the functions $f_{A}$ , $A\in\mathcal{G}$ , form a normal family of holomorphic functions on $\{z\in\mathbb{C}:\Re z>2\}$ . By [1, Chapter 5, Theorem 2], it is thus a pre-compact family in the space of holomorphic functions on $\{z\in\mathbb{C}:\Re z>2\}$ . We deduce by Hurwitz’s theorem [1, Chapter 5, Theorem 10] that for any $\delta>0$ and any open subset $\Omega$ compactly included in $\{z:\Re z>2\}$ , there is some $t>0$ such that for any holomorphic function $g$ defined on a neighborhood of $\Omega$ , and any $A\in rB_{\ell^{\alpha}}$ such that $\sup_{\Omega}||f_{A}-g||<t$ , either $f_{A}$ does not have any zeros in $\Omega$ and therefore neither does $g$ , or to each zero of $f_{A}$ in $\Omega$ there corresponds a zero of $g$ in $\Omega$ which is $\delta$ -close.

Let $\delta,r>0$ . We set

[TABLE]

Let also $\Omega$ be some open subset compactly included in $\{z:\Re z>2+\delta\}$ such that $[2+2\delta,\rho(r)]\subset\Omega$ . We deduce that for any $\delta>0$ there is a $t>0$ , such that,

[TABLE]

As this $t$ does not depend on $A\in\mathcal{G}$ , we get from Lemma 6.7

[TABLE]

It remains to show that $\mathbb{P}(V_{\delta,r})$ goes to [math] as $n\to+\infty$ uniformly in $A\in rB_{\ell^{\alpha}}$ . Note that almost surely (taking an arbitrary coupling of the matrices $X$ ), we have by the Hoffman-Wielandt inequality (51),

[TABLE]

Thus, by Wigner’s theorem, almost surely, $\mu_{X/\sqrt{n}+A}$ converges weakly towards $\mu_{sc}$ uniformly in $A\in rB_{\ell^{2}}$ . By lower-semicontinuity of the map

[TABLE]

we deduce that

[TABLE]

almost surely. Using the above convergence, (68) and the convergence of the largest eigenvalue of $X/\sqrt{n}$ to $2$ in probability, we can conclude that

[TABLE]

which gives the claim of Proposition 6.6. Thus, we are reduced to show Lemma 6.7.

∎

Proof of Lemma 6.7.

Let $\delta>0$ and $\Omega$ as in the statement of Lemma 6.7. Let $\eta=\inf\{\Re z:z\in\Omega\}-2$ and $\zeta$ a Lipschitz function such that

[TABLE]

Let $u,v$ be some unit vectors and $z\in\Omega$ . We set

[TABLE]

where $R(z)=(z-Y/\sqrt{n})^{-1}$ . By Weyl’s inequality, this defines an $L_{\Omega}$ -Lipschitz function with respect to $||\ ||_{\ell^{2}}$ , where $L_{\Omega}$ is a constant depending on the set $\Omega$ . As the entries of $X$ satisfy a Poincaré inequality, $X$ has concentration $\mathcal{C}_{1}$ . We deduce from Lemma 5.2 that for $n$ large enough,

[TABLE]

where $\delta_{n}=O(n^{-1/2})$ . Note that $\mathcal{R}_{z}$ defines a $1/(\eta-\delta)^{2}$ -Lipschitz function in $z\in\Omega$ . As $\Omega$ is relatively compact, we deduce by an $\varepsilon$ -net argument that for any $t>0$ ,

[TABLE]

In the following lemma, we show an isotropic-like property.

6.8 Lemma.

Let $\delta>0$ and $\Omega$ a subset compactly included in $\{z\in\mathbb{C}:\Re z>2+\delta\}$ . Let $X$ be a Wigner matrix satisfying the assumptions of Proposition 6.6. For any $m\in\mathbb{N}$ ,

[TABLE]

where $\mathcal{V}_{m}$ denotes the set of unit $m$ -sparse vectors, meaning with at most $m$ non-zero entries, $\zeta$ is as in (72), and $R(z)=(z-X/\sqrt{n})^{-1}$ .

Proof.

By polarization, it is sufficient to prove this lemma where the supremum ranges over vectors $v=u$ . Moreover, by symmetry, it is enough to show this statement for $\Omega\cap\mathbb{C}^{+}$ . Because $\mathcal{R}_{z}$ , as a function of $z$ , is a Lipschitz function on $\Omega$ , we only need to show for any $\varepsilon>0$ ,

[TABLE]

with $\Omega_{\varepsilon}=\{z\in\Omega:\Im z\geq\varepsilon\}$ . Let $u\in\mathcal{V}_{m}$ . For any $z\in\mathbb{C}^{+}$ , we have on one hand,

[TABLE]

On the other hand, expanding the scalar product,

[TABLE]

As $\lambda_{X/\sqrt{n}}$ converges to $2$ in probability, we are reduced to prove for any $\varepsilon>0$ ,

[TABLE]

Even though this is a classical estimate of random matrix theory, for the sake of completeness we give a proof here. We start with the case of the off-diagonal entries. We set $H=X/\sqrt{n}$ and we write $R$ as a short-hand for $R(z)$ . Let $i\neq j$ . We have the following resolvent identity (see [12, Lemma 3.5]),

[TABLE]

where $R^{(i)}$ is the resolvent of the matrix $H$ where we removed the $i^{\text{th}}$ -row and $i^{\text{th}}$ -column, and $\sum^{(i)}$ means that the summation is over $\{1,...,n\}\setminus\{i\}$ . By Cauchy-Schwarz inequality we have

[TABLE]

But, as $R^{(i)}$ is independent of $(H_{k,i})_{k}$ and $(H_{k,l})_{k\leq l}$ are centered and independent,

[TABLE]

Recall Ward’s identity (see [12, (3.6)]),

[TABLE]

Thus,

[TABLE]

To deal with the diagonal entries, we start from the Schur complement formula (see [2, Lemma 2.4.6]),

[TABLE]

where $H^{(i)}$ denotes the $i^{\text{th}}$ -column of $H$ where the entry $H_{i,i}$ is removed. Let $\mathcal{F}^{(i)}$ be the $\sigma$ -algebra generated by the variables $H_{k,l}$ for $k,l\neq i$ . We find,

[TABLE]

where $\gamma=\mathbb{E}(X_{1,2}^{2})$ and $\gamma^{\prime}=\mathbb{E}|X_{1,2}|^{4}-1$ . Introducing the missing diagonal terms, using Ward’s identity again and the fact that $|\gamma|\leq 1$ , we find,

[TABLE]

where $c$ is some positive constant depending on $\mathbb{E}|X_{1,2}|^{4}$ . This yields,

[TABLE]

From Wigner’s theorem, we know that $n^{-1}\mathrm{tr}R^{(i)}$ converges to $g_{\mu_{sc}}$ in probability for any $\Im z>0$ . Note that the $R^{(i)}$ are identically distributed for $i=1,...,n$ . We deduce from (73) and the fact that $g_{\mu_{sc}}(z)^{-1}=z-g_{\mu_{sc}}(z)$ (see [2, Example 5.3.26]),

[TABLE]

which yields,

[TABLE]

for any $z\in\mathbb{C}^{+}$ . As the functions $R_{i,i}$ and $g_{\mu_{sc}}$ are $\varepsilon^{-2}$ -Lipschitz on $\{z\in\mathbb{C}^{+}:\Im z>\varepsilon\}$ , we can extend by an $\varepsilon$ -net argument, this convergence uniformly on any bounded subset of $\{z\in\mathbb{C}^{+}:\Im z>\varepsilon\}$ , for any $\varepsilon>0$ .
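The Lipschitz constant $\varepsilon^{-2}$ comes from the resolvent identity; we spell out this standard computation for convenience. For $z,w$ with $\Im z,\Im w>\varepsilon$ , and with the convention $g_{\mu}(z)=\int(z-x)^{-1}d\mu(x)$ ,

$$R(z)-R(w)=(w-z)R(z)R(w),\qquad ||R(z)||\leq(\Im z)^{-1},$$

so that

$$|R_{i,i}(z)-R_{i,i}(w)|\leq|z-w|\,||R(z)||\,||R(w)||\leq\varepsilon^{-2}|z-w|,$$

and similarly

$$|g_{\mu_{sc}}(z)-g_{\mu_{sc}}(w)|=\Big|\int\frac{w-z}{(z-x)(w-x)}d\mu_{sc}(x)\Big|\leq\varepsilon^{-2}|z-w|.$$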

∎

We come back now to the proof of Lemma 6.7. The above lemma yields that for any $t>0$ ,

[TABLE]

Note that $m$ -sparse matrices have $m$ -sparse eigenvectors. Using the fact that the spectral radius of matrices in $\mathcal{G}$ is bounded and a union bound, we deduce that for any $s>0$ ,

[TABLE]

where $||Y||_{\ell^{\infty}}=\sup_{i,j}|Y_{i,j}|$ , for any matrix $Y$ . As the matrices $(I_{m}-g_{\mu_{sc}}(z)D)$ , $z\in\Omega$ , $A\in\mathcal{G}$ form a pre-compact subset of $\mathcal{H}_{m}^{(\beta)}$ , the continuity of the determinant on $\mathcal{H}_{m}^{(\beta)}$ , allows us to conclude the proof of Lemma 6.7.

∎

6.2 A chaining argument

We will now give a proof of Propositions 6.1 and 6.3. As it will rely on a chaining argument, we will need the following lemma.

6.9 Lemma.

Let $m\in\mathbb{N}$ and let $B_{\ell^{p}}$ denote the $\ell^{p}$ -ball of $\mathbb{C}^{m}$ for any $p>0$ . Fix some $0<p<q<\infty$ . We denote by $N(B_{\ell^{p}},\varepsilon B_{\ell^{q}})$ , the covering number of $B_{\ell^{p}}$ by $\varepsilon B_{\ell^{q}}$ , that is, the minimal number of translates of $\varepsilon B_{\ell^{q}}$ needed to cover $B_{\ell^{p}}$ . There is a constant $c>0$ depending on $p,q$ , such that for $c(\frac{\log m}{m})^{\frac{1}{p}-\frac{1}{q}}\leq\varepsilon\leq c^{-1}$ ,

[TABLE]

Proof.

This estimate is a consequence of the upper bound on entropy numbers of embeddings of $\ell_{p}^{m}$ in $\ell_{q}^{m}$ given in [24, Proposition 3.2.2]. Let $0<p<q<\infty$ . Denote by $\ell_{p}^{m}$ the space $\mathbb{R}^{m}$ equipped with the (quasi)-norm $||\ ||_{\ell_{p}}$ . We define, for $k\in\mathbb{N}$ ,

[TABLE]

From [24, Proposition 3.2.2], we know that there is a constant $c>0$ such that for $\log_{2}(2m)\leq k\leq 2m$ ,

[TABLE]

Thus, if we set $k=\lambda\log_{2}(2m)$ , for some $\lambda\geq 1$ such that $k\leq 2m$ , we deduce the following rough bound,

[TABLE]

for some constant $c^{\prime}>0$ . Let now $\varepsilon>0$ and set $\lambda$ such that $\varepsilon=c^{\prime}\lambda^{\frac{1}{q}-\frac{1}{p}}$ . The above inequality tells us that if $1\leq\lambda\leq 2m/\log_{2}(2m)$ , then there are $(2m)^{\lambda}$ balls $\varepsilon B_{\ell^{q}}$ covering $B_{\ell^{p}}$ , that is,

[TABLE]

which yields the claim. ∎
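For the reader's convenience, the last step can be made explicit. Solving $\varepsilon=c^{\prime}\lambda^{\frac{1}{q}-\frac{1}{p}}$ for $\lambda$ , with $c^{\prime}$ as in the proof above, gives

$$\lambda=\Big(\frac{c^{\prime}}{\varepsilon}\Big)^{\frac{pq}{q-p}},\qquad\log N(B_{\ell^{p}},\varepsilon B_{\ell^{q}})\leq\lambda\log(2m)=\Big(\frac{c^{\prime}}{\varepsilon}\Big)^{\frac{pq}{q-p}}\log(2m).$$

The constraint $1\leq\lambda\leq 2m/\log_{2}(2m)$ translates into

$$c^{\prime}\Big(\frac{\log_{2}(2m)}{2m}\Big)^{\frac{1}{p}-\frac{1}{q}}\leq\varepsilon\leq c^{\prime},$$

which is, up to the value of the constant, the range of $\varepsilon$ in the statement of the lemma.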

We are now ready to give a proof of Propositions 6.1 and 6.3.

Proof of Proposition 6.1.

Let $H\in\mathcal{H}_{n}^{(\beta)}$ . As $X$ satisfies $\mathcal{C}_{\alpha}$ for some constant $\kappa>0$ , we see that $X+\sqrt{n}H$ also satisfies $\mathcal{C}_{\alpha}$ with the same constant $\kappa$ . We know from Proposition 2.18 and Theorem 6.5, that for any $t>0$ ,

[TABLE]

with $k_{\alpha}$ defined in Proposition 2.18 and $\varepsilon_{n}=O\big{(}n^{-1/2}(\log n)^{(1/\alpha-1)_{+}}\big{)}$ , uniformly in $H\in\mathcal{H}_{n}^{(\beta)}$ . Note that the map

[TABLE]

is $n^{-1/2}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ by Lemma 2.20. We deduce using an $\varepsilon$ -net argument that for $n$ large enough,

[TABLE]

where $N(rn^{1/\alpha}B_{\ell^{\alpha}},tn^{1/2}B_{\ell^{2}})$ denotes the covering number of $rn^{1/\alpha}B_{\ell^{\alpha}}$ by $tn^{1/2}B_{\ell^{2}}$ . But, the homogeneity of the norm gives,

[TABLE]

with $t^{\prime}=t/r$ . We get from Lemma 6.9 applied with $m=n^{2}$ ,

[TABLE]

This shows that the covering number is negligible with respect to the speed of the deviations, which concludes the chaining argument. ∎
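To see why the covering number is negligible, one can count powers of $n$ . The Python sketch below assumes the bound from the proof of Lemma 6.9 in the form $\log N(B_{\ell^{p}},\varepsilon B_{\ell^{q}})\leq(c^{\prime}/\varepsilon)^{\frac{pq}{q-p}}\log(2m)$ , and takes $p=\alpha$ , $q=2$ , $\varepsilon=t^{\prime}n^{\frac{1}{2}-\frac{1}{\alpha}}$ and $m=n^{2}$ . It computes, with exact fractions, the power of $n$ carried by $(c^{\prime}/\varepsilon)^{\frac{pq}{q-p}}$ ; the function names are ours.

```python
from fractions import Fraction

def covering_exponent(p, q):
    """Exponent s in lambda = (c'/eps)**s, that is s = p*q/(q - p)."""
    p, q = Fraction(p), Fraction(q)
    return p * q / (q - p)

def growth_exponent(alpha):
    """Power of n in lambda when eps = t' * n**(1/2 - 1/alpha), p = alpha, q = 2."""
    alpha = Fraction(alpha)
    return (1 / alpha - Fraction(1, 2)) * covering_exponent(alpha, 2)

for alpha in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    speed = 1 + alpha / 2  # the deviations occur at speed n**(1 + alpha/2)
    print(alpha, growth_exponent(alpha), speed)
```

Under these assumptions the exponent equals $1$ for every $\alpha\in(0,2)$ , so the logarithm of the covering number is of order $n\log n$ , which is small compared with $n^{1+\frac{\alpha}{2}}$ .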

We finally give a proof of Proposition 6.3.

Proof of Proposition 6.3.

Let $r>0$ . As in the proof of Proposition 6.1, we deduce from Propositions 2.19 and 6.6, that for any $A\in\mathcal{H}_{n}^{(\beta)}$ and $t>0$ ,

[TABLE]

where $h_{\alpha}$ is defined in Proposition 2.19, $\delta_{n}=O(n^{-1/2}(\log n)^{(1/\alpha-1)_{+}})$ uniformly in $A\in rB_{\ell^{2}}$ , and $\rho$ is as in (67).

Note that the map $x\mapsto\rho(x)$ is $1$ -Lipschitz. From Weyl’s inequality [15, Theorem III.2.1], we deduce that

[TABLE]

is $2$ -Lipschitz with respect to the Hilbert-Schmidt norm on $\mathcal{H}_{n}^{(\beta)}$ . Using an $\varepsilon$ -net argument as in the proof of Proposition 6.1, it is sufficient to prove that for any fixed $t>0$ , the covering number $N(B_{\ell^{\alpha}},tB_{\ell^{2}})$ is negligible at the exponential scale $n^{\alpha/2}$ , that is

[TABLE]

But from Lemma 6.9, we know that,

[TABLE]

which ends the proof of the claim. ∎

6.3 Traces of polynomials of deformed Wigner matrices

We will now prove Proposition 6.4. Contrary to the spectral measure or the largest eigenvalue, the proof will consist in a simple moment computation.

Proof of Proposition 6.4.

By linearity it is sufficient to show the statement when $P$ is a monomial, which we will assume from now on. We can write $P=X_{i_{1}}...X_{i_{q}}$ , with $q\leq d$ . Define the matrix $Q$ with coefficients in $\mathbb{C}\langle\textbf{X}\rangle$ , by

[TABLE]

Observe that by cyclicity of the trace, for any $\textbf{Y}\in(\mathcal{H}_{n}^{(\beta)})^{p}$ , $\mathrm{tr}Q(\textbf{Y})^{q}=q\mathrm{tr}P(\textbf{Y})$ . Therefore,

[TABLE]
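The cyclicity identity $\mathrm{tr}Q(\textbf{Y})^{q}=q\mathrm{tr}P(\textbf{Y})$ can be checked on small matrices. The Python sketch below assumes that $Q$ is the $q\times q$ block matrix carrying the factor $X_{i_{k}}$ in block $(k,k+1)$ , indices taken cyclically, and zeros elsewhere; this is our reading of the definition and may differ from the paper's in inessential ways. It uses exact integer arithmetic on real symmetric $2\times 2$ matrices, and all helper names are ours.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def cyclic_block_matrix(factors):
    """Block matrix with factors[k] in block (k, k+1 mod q), zeros elsewhere."""
    q, n = len(factors), len(factors[0])
    big = [[0] * (q * n) for _ in range(q * n)]
    for k, y in enumerate(factors):
        r0, c0 = k * n, ((k + 1) % q) * n
        for i in range(n):
            for j in range(n):
                big[r0 + i][c0 + j] = y[i][j]
    return big

def power(a, e):
    """a**e for an integer e >= 1."""
    out = a
    for _ in range(e - 1):
        out = matmul(out, a)
    return out

Y1 = [[1, 2], [2, -1]]
Y2 = [[0, 3], [3, 4]]
Y3 = [[2, 1], [1, 5]]
factors = [Y1, Y2, Y1, Y3]  # the monomial P = X1 X2 X1 X3, so q = 4

P = factors[0]
for y in factors[1:]:
    P = matmul(P, y)

Q = cyclic_block_matrix(factors)
lhs = trace(power(Q, len(factors)))  # tr Q(Y)^q
rhs = len(factors) * trace(P)        # q tr P(Y)
print(lhs, rhs)
```

The diagonal blocks of $Q^{q}$ are the $q$ cyclic rotations of the product, each of which has the same trace as $P$ .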

Write $Z=Q(\textbf{X}/\sqrt{n})$ and $K=Q(\textbf{H})$ . We know from the proof of [5, Lemma 2.1] that,

[TABLE]

Let us define the $q$ -Schatten (quasi-)norm on $(\mathcal{H}_{n}^{(\beta)})^{p}$ , for any $q>0$ by,

[TABLE]

Note that for any $\textbf{Y}\in(\mathcal{H}_{n}^{(\beta)})^{p}$ ,

[TABLE]

Thus, for any $m\in\mathbb{N}$ ,

[TABLE]

As $\textbf{H}\in rB_{\ell^{2}}$ , $\mathrm{tr}|K|^{2}\leq r^{2}$ . Without loss of generality we can assume $r\geq 1$ . Thus,

[TABLE]

But we know from Wigner’s theorem (see [2, Lemma 2.1.6]), that there is a constant $c\geq 1$ , such that

[TABLE]

Besides,

[TABLE]

By Jensen’s inequality, we deduce

[TABLE]

Therefore,

[TABLE]

We deduce from (75) and (77) that

[TABLE]

uniformly in $\textbf{H}\in rB_{\ell^{2}}$ and where $\tau_{n}=\frac{1}{n}\mathrm{tr}$ . It is now sufficient to prove that $n^{q/d-1}\mathrm{tr}P(\textbf{H})$ converges to $0$ uniformly in $\textbf{H}\in rB_{\ell^{\alpha}}$ , as soon as $q<d$ . Assume first $q\geq\alpha$ . Using the non-commutative Hölder’s inequality (see [15, Corollary IV.2.6]), we get

[TABLE]

The arithmetic-geometric mean inequality yields,

[TABLE]

As $q\geq\alpha$ , we deduce

[TABLE]

We conclude that when $\alpha\leq q<d$ ,

[TABLE]

If $q<\alpha$ , then $q=1$ and $\alpha>1$ . By Jensen’s inequality,

[TABLE]

Thus, as $d>\alpha$ ,

[TABLE]

Besides, we know by [2, Theorem 5.4.2], that

[TABLE]

where $\textbf{s}$ is a family of $p$ free semi-circular variables defined on a non-commutative probability space $(\mathcal{A},\tau)$ . This ends the proof of the proposition. ∎

7 Deterministic equivalent for the last-passage time

We will prove in this section the analogue of the results for Wigner matrices of the preceding section, for the last-passage time. More precisely, we will provide a deterministic equivalent for the last-passage time when the matrix of weights is deformed by some matrix $nH$ , where $||H||_{\ell^{\alpha}}$ is bounded for some $\alpha\in(0,1)$ .

Let $\mathcal{A}$ denote the set of finite vectors $(v_{0},...,v_{m})$ , which we will call admissible, such that $v_{i}\in\{0,...,n\}^{d}$ , $v_{0}=(0,...,0)$ , $v_{m}=(n,...,n)$ , and for any $i\in\{0,...,m-1\}$ , $v_{i}<v_{i+1}$ , where $<$ denotes the lexicographic order. With this definition we set, for any $H\in\mathbb{R}^{I}$ , where $I=\{0,...,n\}^{d}$ ,

[TABLE]

where $V=(v_{0},...,v_{m})$ for some $m\in\mathbb{N}$ , where $g$ is as in (14), and where we denote here, for better readability, by $x^{+}$ the positive part of $x\in\mathbb{R}$ ( $x^{+}=x_{+}$ , our former notation). With this notation, we will prove the following proposition.

7.1 Proposition.

Let $\alpha\in(0,1)$ . Let $X=(X_{v})_{v\in\mathbb{Z}_{+}^{d}}$ be a family of i.i.d random variables following the law $\mu_{\alpha}$ . For any $r>0$ ,

[TABLE]

in probability, where $Y^{+}$ denotes the multi-matrix $(Y_{v}^{+})_{v}$ .

We will follow the same arguments as for the proof of the uniform deterministic equivalent of the empirical spectral measure and the largest eigenvalue of Wigner matrices. We will begin by showing that the deterministic equivalent (80) we propose, holds uniformly in expectation. This is the object of the following lemma.

7.2 Lemma.

Let $\alpha\in(0,1)$ . Let $X=(X_{v})_{v\in\mathbb{Z}_{+}^{d}}$ be a family of i.i.d non-negative random variables with common distribution function satisfying (13). For any $r>0$ ,

[TABLE]

where $\mathcal{T}_{n}(H)$ is as in (80).

Proof.

Let $\mathcal{A}_{m}$ denote the subset of vectors of $\mathcal{A}$ of size less than or equal to $m$ , and define $\hat{\mathcal{T}}_{n}^{(m)}$ by,

[TABLE]

and $\mathcal{T}_{n}^{(m)}$ ,

[TABLE]

where $V=(v_{0},...,v_{p})$ for some $p\leq m$ , and $g$ is as in (14). We begin by proving that there is some constant $C>0$ depending on $\alpha$ , such that for any $||H||_{\ell^{\alpha}}\leq r$ ,

[TABLE]

In the following $C$ will denote a constant which depends only on $\alpha$ and which may vary from line to line. Let $\pi$ be an optimal path for the last-passage time $T(X+nH)^{+}$ , and denote by $v_{1},...,v_{m-1}$ the locations of the $m-1$ largest values of $H^{+}$ on the path $\pi$ , sorted in lexicographic order. Add $v_{0}=(0,...,0)$ and $v_{m}=(n,...,n)$ , to get $V=\{v_{0},...,v_{m}\}\in\mathcal{A}_{m}$ . We have

[TABLE]

As $(x+y)^{+}\leq x^{+}+y^{+}$ , we deduce

[TABLE]

Now observe that if $M_{1}\geq...\geq M_{d(n+1)}$ are the values of $H^{+}$ (or of $H^{-}$ ) along $\pi$ in decreasing order, we have since $\sum_{i}M_{i}^{\alpha}\leq r^{\alpha}$ , for any $k\in\{1,...,d(n+1)\}$ ,

[TABLE]

Therefore,

[TABLE]

for some constant $C>0$ . This proves the upper bound of (81). On the other hand, let $V=\{v_{0},...,v_{p}\}\in\mathcal{A}_{m}$ . Considering the optimal paths from $v_{i}$ to $v_{i+1}$ in the last-passage time $T_{v_{i},v_{i+1}}(X)$ , for $i=0,...,p-1$ and their concatenation $\pi$ , we get,

[TABLE]

Indeed, if $v\in\pi$ , then

[TABLE]

by distinguishing the cases $H_{v}\geq 0$ , or ( $H_{v}\leq 0$ and $X_{v}+nH_{v}\geq 0$ ), or ( $H_{v}\leq 0$ and $X_{v}+nH_{v}\leq 0$ ). Turning our attention to the first sum in (83), we deduce by bounding the first $n^{\alpha}$ largest weights of $H_{v}^{-}$ by $X_{v}/n$ , and using the bound (82) for the rest of the terms,

[TABLE]

By (40) we have,

[TABLE]

for some constant $c>0$ . We thus proved,

[TABLE]

On the other hand, focusing now on the second term of (83),

[TABLE]

But $||H||_{\ell^{\alpha}}\leq r$ , thus

[TABLE]

Therefore,

[TABLE]

which concludes the proof of the lower bound of (81). Comparing $\mathcal{T}^{(m)}_{n}$ and $\hat{\mathcal{T}}^{(m)}_{n}$ , we get using the translation invariance in law (by vectors of $\mathbb{Z}^{d}_{+}$ ) of $(X_{v})_{v\in\mathbb{Z}^{d}_{+}}$ ,

[TABLE]

As $\mathbb{E}T_{0,\lfloor nw\rfloor}(X)$ is coordinate-wise non-decreasing as a function of $w\in\mathbb{R}_{+}^{2}$ , and converges to $g(w)$ which is continuous by [34, Theorem 2.3], we deduce that $w\mapsto\mathbb{E}T_{0,\lfloor nw\rfloor}(X)$ converges uniformly to $g$ on $[0,1]^{2}$ by Dini’s Theorem. Thus,

[TABLE]

where $\varepsilon(n)\to+\infty$ when $n\to+\infty$ .

Now, using the same argument as for the upper bound of (81), we see that

[TABLE]

for any $||H||_{\ell^{\alpha}}\leq r$ . Indeed, if $V$ achieves the supremum in $\mathcal{T}_{n}(H)$ , then taking $V^{\prime}$ the $m$ largest values of $H^{+}$ on $V$ , we get

[TABLE]

Thus, using (82), we get the claim. To summarize, we got by (81), (84), and (85),

[TABLE]

for some constant $C>0$ and for any $||H||_{\ell^{\alpha}}\leq r$ , which gives finally the claim by taking the $\limsup$ as $n\to+\infty$ , and then as $m\to+\infty$ . ∎

We can now give a proof of Proposition 7.1.

Proof of Proposition 7.1.

Let $H\in\mathbb{R}^{I}$ . Note that $X\mapsto T(X+nH)$ is $1$ -Lipschitz with respect to $||\ ||_{\ell^{1}}$ on $\mathbb{R}^{I}$ . As $||\ ||_{\ell^{1}}\leq||\ ||_{\ell^{\alpha}}$ since $\alpha<1$ , we deduce that $X\mapsto T(X+nH)$ is also $1$ -Lipschitz with respect to $||\ ||_{\ell^{\alpha}}$ . Moreover by Hölder’s inequality, $X\mapsto T(X+nH)$ is $\sqrt{n}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ . We get by Lemma 5.5, for any $t>0$ ,

[TABLE]

where $m$ is the median of $T(X+nH)$ , $c$ is some strictly positive constant, and

[TABLE]

Integrating this inequality we get,

[TABLE]

uniformly in $H$ . Using the result of Lemma 7.2, we deduce that for $n$ large enough,

[TABLE]

where $\delta_{n}=O((\log n)^{\frac{1}{\alpha}-1}n^{-\frac{1}{2}})$ . Let now $r>0$ . Note that

[TABLE]

is $2$ -Lipschitz with respect to $||\ ||_{\ell^{1}}$ on $\mathbb{R}^{I}$ . Besides, by Lemma 6.9 for any $\varepsilon>0$ , the covering number of $rB_{\ell^{\alpha}}$ by $\ell^{2}$ -balls of radii $\varepsilon$ satisfies,

[TABLE]

Since this estimate is negligible with respect to the concentration bound (86), we deduce using an $\varepsilon$ -net argument as in the proofs of Propositions 6.1 and 6.3, that

[TABLE]

which ends the proof of the claim. ∎
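The Lipschitz property used at the start of this proof, namely that the last-passage time moves by at most the $\ell^{1}$ -distance between two weight configurations, can be checked numerically. The Python sketch below assumes $d=2$ and nearest-neighbour up/right paths from $(0,0)$ to the opposite corner of the grid, which is the usual setting but may differ in details from the definition of $T$ used in the paper; the function name is ours.

```python
import random

def last_passage_time(w):
    """Max over up/right lattice paths from (0,0) to the far corner of the summed weights."""
    n = len(w)
    best = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            prev = 0.0
            if i > 0 and j > 0:
                prev = max(best[i - 1][j], best[i][j - 1])
            elif i > 0:
                prev = best[i - 1][j]
            elif j > 0:
                prev = best[i][j - 1]
            best[i][j] = w[i][j] + prev  # dynamic programming recursion
    return best[-1][-1]

rng = random.Random(0)
size = 6
x = [[rng.expovariate(1.0) for _ in range(size)] for _ in range(size)]
# perturb every weight to get a second configuration y
y = [[v + rng.uniform(-0.5, 0.5) for v in row] for row in x]

l1 = sum(abs(a - b) for ra, rb in zip(x, y) for a, b in zip(ra, rb))
gap = abs(last_passage_time(x) - last_passage_time(y))
print(gap, l1)
```

The inequality holds because $T$ is a maximum of sums of coordinates, each of which is $1$ -Lipschitz for $||\ ||_{\ell^{1}}$ .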

8 Applications to Wigner matrices

We apply in this section Theorem 2.1 in the setting of Wigner matrices, and we derive the LDP of Theorems 2.5, 2.7 and 2.9. Throughout this section, $X$ will designate a Wigner matrix in the class $\mathcal{S}_{\alpha}$ for some $\alpha\in(0,2]$ . It is clear that Theorem 2.1 remains valid in the context of Wigner matrices in the class $\mathcal{S}_{\alpha}$ , with the corresponding change in the rate function $I_{\alpha}$ , obtained by replacing the weight function $||\ ||_{\ell^{\alpha}}^{\alpha}$ by $W_{\alpha}$ , which defines the law of a Wigner matrix in $\mathcal{S}_{\alpha}$ (see (12)).

8.1 Large deviations of the empirical spectral measure

Proof of Theorem 2.5.

From Proposition 6.1, we know that assumption $(i)$ of Theorem 2.1 is satisfied with

[TABLE]

and

[TABLE]

where $m$ is the (real) dimension of $\mathcal{H}_{n}^{(\beta)}$ , with the metric $d$ on $\mathcal{P}(\mathbb{R})$ defined in (20), and $v(m)=n^{1+\frac{\alpha}{2}}$ .

By Lemma 2.20, we see that $f_{m}$ is $n^{-1}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ on $\mathcal{H}_{n}^{(\beta)}$ and $d$ on $\mathcal{P}(\mathbb{R})$ . By the remark 2.2 (c), and from the fact that $\alpha<2$ , we deduce that the assumption $(ii)$ of Theorem 2.1 holds. Besides, as $\alpha\leq 2$ , we have by [43, Theorem 3.32]

[TABLE]

Thus for any $r>0$ ,

[TABLE]

which shows that $\cup_{m}F_{m}(rB_{\ell^{\alpha}})$ is relatively compact by Prokhorov’s theorem, and that $(iii)$ is verified.

To prove $(iv)$ it is sufficient to show that for a fixed $H\in\mathcal{H}_{p}^{(\beta)}$ , there is a sequence $H_{n}\in\mathcal{H}_{n}^{(\beta)}$ , $n\geq p$ , such that

[TABLE]

For any $k\in\mathbb{N}$ , let $H_{kp}=\oplus_{i=1}^{k}k^{-1/\alpha}H\in\mathcal{H}_{kp}^{(\beta)}$ . We have $W_{\alpha}(H_{kp})=W_{\alpha}(H)$ , as $W_{\alpha}(\lambda Y)=\lambda^{\alpha}W_{\alpha}(Y)$ for any $\lambda>0$ , and

[TABLE]
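The invariance $W_{\alpha}(H_{kp})=W_{\alpha}(H)$ only uses $\alpha$ -homogeneity and additivity over blocks. The Python sketch below illustrates it with the model weight $\sum_{i,j}|M_{i,j}|^{\alpha}$ used as a stand-in for $W_{\alpha}$ (an assumption made for illustration, since $W_{\alpha}$ in (12) may weight diagonal and off-diagonal entries differently). It also checks that $(kp)^{1/\alpha}k^{-1/\alpha}=p^{1/\alpha}$ , which is why the rescaled blocks all have the spectrum of $p^{1/\alpha}H$ . The function names are ours.

```python
def ell_alpha_weight(matrix, alpha):
    """sum_{i,j} |M_ij|**alpha, used here as a stand-in for W_alpha."""
    return sum(abs(v) ** alpha for row in matrix for v in row)

def scaled_direct_sum(h, k, alpha):
    """Block-diagonal matrix made of k copies of k**(-1/alpha) * h."""
    p = len(h)
    scale = k ** (-1 / alpha)
    out = [[0.0] * (k * p) for _ in range(k * p)]
    for b in range(k):
        for i in range(p):
            for j in range(p):
                out[b * p + i][b * p + j] = scale * h[i][j]
    return out

alpha, k = 0.5, 3
h = [[1.0, -2.0], [-2.0, 0.5]]
p = len(h)
h_kp = scaled_direct_sum(h, k, alpha)

weight_h = ell_alpha_weight(h, alpha)
weight_h_kp = ell_alpha_weight(h_kp, alpha)
block_scale = (k * p) ** (1 / alpha) * k ** (-1 / alpha)
print(weight_h, weight_h_kp, block_scale, p ** (1 / alpha))
```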

Now, if $n=kp+l$ , with $k\in\mathbb{N}$ and $1\leq l\leq p$ , we define

[TABLE]

We have,

[TABLE]

Thus,

[TABLE]

Besides,

[TABLE]

As $W_{\alpha}(H_{kp})=W_{\alpha}(H)$ , and $\mu_{(kp)^{1/\alpha}H_{kp}}=\mu_{p^{1/\alpha}H}$ , we get the claim (87).

∎

8.2 Large deviations of the largest eigenvalue

Proof of Theorem 2.7.

We begin by giving back to $J_{\alpha}$ its variational form. We claim that for any $x\in\mathbb{R}$ ,

[TABLE]

where $\rho$ is the function

[TABLE]

Let us prove first that

[TABLE]

When $x<2$ , both sides of (89) are infinite. If $x\geq 2$ , we denote by $\mathcal{J}_{\alpha}$ the right-hand side of (89). The function $x\in(0,1]\mapsto\rho(1/x)$ is the inverse of the Stieltjes transform of $\mu_{sc}$ on $[2,+\infty)$ (see [2, Example 5.3.26]). Thus, we can write

[TABLE]

As $W_{\alpha}$ is $\alpha$ -homogeneous, and $\lambda_{tA}=t\lambda_{A}$ , for any $t\geq 0$ , we get

[TABLE]

Thus, $J_{\alpha}=\mathcal{J}_{\alpha}$ . As $J_{\alpha}$ is clearly lower semi-continuous, the equality (88) holds by the remark 2.2 (e).
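The inversion property used above can be checked numerically. The Python sketch below assumes the explicit forms $\rho(x)=x+1/x$ for $x\geq 1$ and $\rho(x)=2$ otherwise, and $g_{\mu_{sc}}(z)=\frac{1}{2}(z-\sqrt{z^{2}-4})$ for real $z\geq 2$ ; these are the usual expressions, consistent with the relation $g_{\mu_{sc}}(z)^{-1}=z-g_{\mu_{sc}}(z)$ , but they are stated here as assumptions since the defining display is not reproduced. The function names are ours.

```python
import math

def rho(x):
    """Assumed form of rho: x + 1/x for x >= 1, and 2 otherwise."""
    return x + 1 / x if x >= 1 else 2.0

def stieltjes_semicircle(z):
    """g(z) = (z - sqrt(z^2 - 4)) / 2 for real z >= 2."""
    return (z - math.sqrt(z * z - 4)) / 2

thetas = [1.0, 1.5, 3.0, 10.0]
# g(rho(theta)) should equal 1/theta, i.e. x -> rho(1/x) inverts g on [2, +inf)
errors = [abs(stieltjes_semicircle(rho(th)) - 1 / th) for th in thetas]
# check the fixed-point relation 1/g(z) = z - g(z) at a sample point
z0 = 2.5
g0 = stieltjes_semicircle(z0)
fixed_point_gap = abs(1 / g0 - (z0 - g0))
print(errors, fixed_point_gap)
```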

We check now the assumptions of Theorem 2.1. Assumption $(i)$ of Theorem 2.1 is met by the result of Proposition 6.3, with

[TABLE]

where as before $m$ is the dimension of $\mathcal{H}_{n}^{(\beta)}$ , and $v(m)=n^{\alpha/2}$ . Weyl’s inequality [15, Theorem III.2.1] shows that $f_{m}$ is $n^{-1/2}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ , and thus assumption $(ii)$ is satisfied as $\alpha<2$ by the remark 2.2 (c). Besides, note that for any $H\in\mathcal{H}_{n}^{(\beta)}$ ,

[TABLE]

where we used in the second inequality the fact that $\alpha\leq 2$ and [43, Theorem 3.32]. As $\rho$ is non-decreasing, we deduce for any $r>0$ that,

[TABLE]

which proves that $(iii)$ is satisfied. To show that $(iv)$ holds, it suffices to observe that if $H\in\mathcal{H}_{n}^{(\beta)}$ , and if we set for any $m\geq n$ ,

[TABLE]

then $W_{\alpha}(H_{m})=W_{\alpha}(H)$ , and provided $\lambda_{H}\geq 0$ , we have $\lambda_{H}=\lambda_{H_{m}}$ , so that in particular $\rho(\lambda_{H})=\rho(\lambda_{H_{m}})$ . ∎

8.3 Large deviations of non-commutative polynomials

Finally, we give a proof of Theorem 2.9.

Proof of Theorem 2.9.

By a homogeneity argument similar to the one in the proof of Theorem 2.7, we get for any $x\in\mathbb{R}$ ,

[TABLE]

where $P_{d}$ denotes the homogeneous part of degree $d$ of $P$ . From the remark 2.2 (e), we get as $K_{\alpha}$ is lower semi-continuous, that

[TABLE]

Assumption $(i)$ of Theorem 2.1 is a consequence of Proposition 6.4 with the speed $v(m)=n^{\alpha(\frac{1}{2}+\frac{1}{d})}$ and

[TABLE]

where $m$ is the real dimension of $(\mathcal{H}_{n}^{(\beta)})^{p}$ .

Let us now prove assumption $(ii)$. Note that by linearity, it suffices to prove assumption $(ii)$ when $P$ is a monomial of total degree $k\geq 1$ less than or equal to $d$, which we will assume from now on. If $k=1$, there are two cases to consider. First, we see by Hölder’s inequality that $f_{m}$ is $n^{-1}$-Lipschitz with respect to $||\ ||_{\ell^{2}}$. If $d=1$, then $\alpha\in(0,1)$ and $v(n)=n^{3\alpha/2}$ in this case, so we conclude by remark 2.2 (c) that assumption $(ii)$ holds. If $d\geq 2$ and $k=1$, then we deduce again by remark 2.2 (c) that assumption $(ii)$ is fulfilled, as $v(n)=n^{\alpha(\frac{1}{2}+\frac{1}{d})}$.

In the case $k\geq 2$ , we will need to understand the stability of the function $f_{m}$ with respect to the Euclidean norm. This is the object of the following lemma.

8.1 Lemma.

There is a constant $C_{d,p}>0$ depending on $d$ and $p$ , such that for any monomial $q\in\mathbb{C}\langle\textbf{X}\rangle$ of total degree $d\geq 2$ , and $\textbf{Y},\textbf{H}\in(\mathcal{H}_{n}^{(\beta)})^{p}$ ,

[TABLE]

where for any $q>0$ , $||\ ||_{q}$ denotes the $q$ -Schatten norm on $(\mathcal{H}_{n}^{(\beta)})^{p}$ , defined in (76).

Proof.

Let

[TABLE]

By the mean value theorem, we have

[TABLE]

Note that if $R\in\mathbb{C}\langle\textbf{X}\rangle$ is a monomial of degree $d-1$ in X, then by (79), we have

[TABLE]

As $\nabla_{X_{i}}f$ is the sum of at most $d$ monomials of degree $d-1$ in X, we get by the triangle inequality and the above observation,

[TABLE]

Thus,

[TABLE]

As $\textbf{Z}\mapsto||\textbf{Z}||_{2(d-1)}^{d-1}$ is convex, we get

[TABLE]
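
The convexity invoked in this step rests on two standard facts, sketched here under the assumption that $||\ ||_{2(d-1)}$ is a genuine norm on $(\mathcal{H}_{n}^{(\beta)})^{p}$ (which is the case for Schatten norms of index at least $1$). The interpolation parameter $\theta\in[0,1]$ and the points $\textbf{Z}_{0},\textbf{Z}_{1}$ are auxiliary symbols introduced only for this illustration.

```latex
% Assumptions: d \ge 2, \theta \in [0,1], and \|\cdot\|_{2(d-1)} is a norm.
\begin{align*}
\big\|(1-\theta)\textbf{Z}_{0}+\theta\textbf{Z}_{1}\big\|_{2(d-1)}^{d-1}
 &\le \Big((1-\theta)\|\textbf{Z}_{0}\|_{2(d-1)}+\theta\|\textbf{Z}_{1}\|_{2(d-1)}\Big)^{d-1}
 % triangle inequality, and t \mapsto t^{d-1} is non-decreasing on [0,\infty)
 \\
 &\le (1-\theta)\|\textbf{Z}_{0}\|_{2(d-1)}^{d-1}+\theta\|\textbf{Z}_{1}\|_{2(d-1)}^{d-1}
 % t \mapsto t^{d-1} is convex on [0,\infty) since d-1 \ge 1
 .
\end{align*}
```

In particular, the value of $\textbf{Z}\mapsto||\textbf{Z}||_{2(d-1)}^{d-1}$ at any point of a segment is bounded by the larger of its values at the two endpoints.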

As $2(d-1)\geq 2$ , we have

[TABLE]

This inequality together with (91) yields the claim (8.1). ∎

We now return to the proof of assumption $(ii)$ of Theorem 2.1. Let $r\geq 1$.

Let $\textbf{K}\in rB_{\ell^{\alpha}}$ , and set $\textbf{Y}=\textbf{X}+n^{\frac{1}{2}+\frac{1}{d}}\textbf{K}$ . As we assumed $P$ is a monomial of total degree $k$ , from the preceding Lemma 8.1, we have for any $\textbf{H}\in(\mathcal{H}_{n}^{(\beta)})^{p}$ ,

[TABLE]

where $c$ is some constant depending on $p$ and $d$. Using the fact that $x^{k-1}\leq 1+x^{d-1}$ for any $1\leq k\leq d$ and $x\geq 0$, we get,

[TABLE]
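
The elementary inequality $x^{k-1}\leq 1+x^{d-1}$ used in this step can be checked by splitting according to the position of $x$ relative to $1$. A short verification, with the convention $x^{0}=1$ when $k=1$:

```latex
% Two cases, for 1 \le k \le d and x \ge 0.
\begin{align*}
0\le x\le 1 &:\qquad x^{k-1}\le 1\le 1+x^{d-1}, \\
x\ge 1 &:\qquad x^{k-1}\le x^{d-1}\le 1+x^{d-1}
  \qquad (\text{since } k-1\le d-1).
\end{align*}
```

This is what allows a bound valid for monomials of degree $k$ to be stated uniformly in terms of the maximal degree $d$.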

Let $\delta\in(0,1)$ and $t_{\delta}=\delta n^{\frac{1}{2}+\frac{1}{d}}$ . For $\textbf{H}\in t_{\delta}B_{\ell^{2}}$ ,

[TABLE]

With the notation of Theorem 2.1, we have

[TABLE]

where $m$ is the dimension of $(\mathcal{H}_{n}^{(\beta)})^{p}$ . By convexity, we deduce

[TABLE]

But by Wigner’s theorem (see [2, Lemma 2.1.6]),

[TABLE]

for some constant $c_{0}>0$ . As $\textbf{K}\in rB_{\ell^{\alpha}}$ with $\alpha\leq 2$ , we deduce as $k\geq 2$ ,

[TABLE]

Thus,

[TABLE]

where $C$ is some positive constant depending on $p$ and $d$ . This shows that assumption $(ii)$ is satisfied.
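
The passage from $\textbf{K}\in rB_{\ell^{\alpha}}$ to a bound in $\ell^{2}$ relies on the monotonicity of $\ell^{q}$ (quasi-)norms in the exponent. A minimal sketch, assuming that $B_{\ell^{\alpha}}$ denotes the unit ball $\{||x||_{\ell^{\alpha}}\leq 1\}$; the vector $x\in\mathbb{R}^{N}$ and the exponent $\gamma$ are generic symbols introduced for illustration.

```latex
% Claim: for 0 < \alpha \le \gamma and x \in \mathbb{R}^N, \|x\|_{\ell^\gamma} \le \|x\|_{\ell^\alpha}.
% By homogeneity it suffices to treat \|x\|_{\ell^\alpha} = 1, so that |x_i| \le 1 for all i.
\begin{align*}
\|x\|_{\ell^{\gamma}}^{\gamma}
  = \sum_{i=1}^{N}|x_{i}|^{\gamma}
  \le \sum_{i=1}^{N}|x_{i}|^{\alpha}
  = \|x\|_{\ell^{\alpha}}^{\alpha} = 1
  \qquad (\text{since } |x_{i}|\le 1 \text{ and } \gamma\ge\alpha).
\end{align*}
% With \gamma = 2 and \alpha \le 2:  \textbf{K} \in rB_{\ell^\alpha} \implies \|\textbf{K}\|_{\ell^2} \le r.
```

The same argument with $\gamma=1$ gives $||\ ||_{\ell^{1}}\leq||\ ||_{\ell^{\alpha}}$ whenever $\alpha\leq 1$.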

We show now that assumption $(iii)$ holds. Using (79) for $q=d$ , we get

[TABLE]

where $C^{\prime}$ is some constant depending on $P$ . This proves condition $(iii)$ of Theorem 2.1. To show that the last assumption $(iv)$ is met, it suffices to observe that for any fixed $\textbf{H}\in(\mathcal{H}_{n}^{(\beta)})^{p}$ , with the same construction as in (90), there is a sequence $\textbf{H}_{m}\in(\mathcal{H}_{m}^{(\beta)})^{p}$ , for $m\geq n$ , such that

[TABLE]

and $W_{\alpha}(\textbf{H})=W_{\alpha}(\textbf{H}_{m})$ .

∎

9 Application to last-passage time

We prove in this last section Theorem 2.11.

Proof of Theorem 2.11.

We will verify the assumptions of Theorem 2.3. Assumption $(i)$ holds due to Proposition 7.1 with $v(n)=n^{\alpha}$ , and

[TABLE]

where $\mathcal{T}_{n}$ is defined in (80), $X^{+}$ denotes the matrix with coefficients $(X^{+}_{v})_{v}$ , and $m$ is the dimension of $\mathbb{R}^{I}$ . As

[TABLE]

is $n^{-1/2}$ -Lipschitz with respect to $||\ ||_{\ell^{2}}$ , assumption $(ii)$ is satisfied by the remark 2.2 (c).

Using the fact that $||\ ||_{\ell^{1}}\leq||\ ||_{\ell^{\alpha}}$ when $\alpha\leq 1$ , on $\mathbb{R}^{I}$ , we see that the condition $(iii)$ of Theorem 2.3 is met. To prove $(iv)^{\prime}$ , we first observe that

[TABLE]

Indeed, since the function $g$ is superadditive by [34, Proposition 2.1], we deduce that

[TABLE]

for any $H\in\mathbb{R}^{I}$. Therefore, both sides of (92) are infinite if $x<g(1,...,1)$. Now if $x\geq g(1,...,1)$, and $H\in\mathbb{R}^{I}$ is such that $\mathcal{T}_{n}(H)=x$, then denoting by $\{v_{0},...,v_{p}\}$ the element of $\mathcal{A}_{m}$ achieving the supremum in (80), we get,

[TABLE]

Using the superadditivity of $g$, this yields

[TABLE]

with equality for the matrix $H$ whose entries are all zero except $H_{(n,...,n)}=x-g(1,...,1)$. This proves the equality (92). In particular, $L_{\alpha}$ is lower semi-continuous and therefore by the remark 2.2 (e), we deduce,

[TABLE]

As the matrices $H\in\mathbb{R}^{I}$ with $H_{v}=(x-g(1,...,1))_{+}\mathds{1}_{v=(n,...,n)}$ achieve (92) for any $n$, we deduce,

[TABLE]

Finally, as $\mathcal{T}_{n}(H)=\mathcal{T}_{n}(H^{+})$ , where $H^{+}$ is the matrix $(H_{v}^{+})_{v\in\{0,...,n\}^{d}}$ , we get

[TABLE]

This proves the last assumption $(iv)^{\prime}$ of Theorem 2.3.

∎

Bibliography


[1] L. Ahlfors. Complex analysis: An introduction of the theory of analytic functions of one complex variable. Second edition. McGraw-Hill Book Co., New York-Toronto-London, 1966.
[2] G. W. Anderson, A. Guionnet, and O. Zeitouni. An introduction to random matrices, volume 118 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010.
[3] E. Artin. The gamma function. Translated by Michael Butler. Athena Series: Selected Topics in Mathematics. Holt, Rinehart and Winston, New York-Toronto-London, 1964.
[4] F. Augeri. Large deviations principle for the largest eigenvalue of Wigner matrices without Gaussian tails. Electron. J. Probab., 21:Paper No. 32, 49, 2016.
[5] F. Augeri. On the large deviations of traces of random matrices. arXiv:1605.03894, accepted for publication in the Annales de l’Institut Henri Poincaré, May 2016.
[6] Z. Bai and J. W. Silverstein. Spectral analysis of large dimensional random matrices. Springer Series in Statistics. Springer, New York, second edition, 2010.
[7] Z. D. Bai and Y. Q. Yin. Necessary and sufficient conditions for almost sure convergence of the largest eigenvalue of a Wigner matrix. Ann. Probab., 16(4):1729–1741, 1988.
[8] D. Bakry, F. Barthe, P. Cattiaux, and A. Guillin. A simple proof of the Poincaré inequality for a large class of probability measures including the log-concave case. Electron. Commun. Probab., 13:60–66, 2008.
