Universal limits of substitution-closed permutation classes

Frédérique Bassino; Mathilde Bouvel; Valentin Féray; Lucas Gerin; Mickaël Maazoun; Adeline Pierrot

arXiv:1706.08333·math.PR·September 22, 2020


TL;DR

This paper investigates the limiting behavior of uniform random permutations within substitution-closed classes, revealing a universal limit under certain conditions and identifying two other regimes with distinct limiting objects.

Contribution

It introduces a framework linking permutation class generating series to permuton limits, including a universal limit and regimes related to stable trees, using singularity analysis.

Findings

01

The limit depends on the generating series of simple permutations.

02

Under mild conditions, the limit is a deformation of the Brownian separable permuton.

03

Two other regimes with different limiting objects are identified.

Abstract

We consider uniform random permutations in proper substitution-closed classes and study their limiting behavior in the sense of permutons. The limit depends on the generating series of the simple permutations in the class. Under a mild sufficient condition, the limit is an elementary one-parameter deformation of the limit of uniform separable permutations, previously identified as the Brownian separable permuton. This limiting object is therefore in some sense universal. We identify two other regimes with different limiting objects. The first one is degenerate; the second one is nontrivial and related to stable trees. These results are obtained thanks to a characterization of the convergence of random permutons through the convergence of their expected pattern densities. The limit of expected pattern densities is then computed by using the substitution tree encoding of permutations…

Full text

Universal limits of substitution-closed permutation classes

Frédérique Bassino
Université Paris 13, Sorbonne Paris Cité, LIPN, CNRS UMR 7030, F-93430 Villetaneuse, France

Mathilde Bouvel

Valentin Féray
Institut für Mathematik, Universität Zürich, Winterthurerstr. 190, CH-8057 Zürich, Switzerland

Lucas Gerin
CMAP, École Polytechnique, CNRS, Route de Saclay, F-91128 Palaiseau Cedex, France

Mickaël Maazoun
École Normale Supérieure de Lyon, UMPA UMR 5669 CNRS, 46 allée d’Italie, F-69364 Lyon Cedex 07, France

Adeline Pierrot
LRI, Université Paris-Sud, Bat. 650 Ada Lovelace, F-91405 Orsay Cedex, France

Abstract.

We consider uniform random permutations in proper substitution-closed classes and study their limiting behavior in the sense of permutons.

The limit depends on the generating series of the simple permutations in the class. Under a mild sufficient condition, the limit is an elementary one-parameter deformation of the limit of uniform separable permutations, previously identified as the Brownian separable permuton. This limiting object is therefore in some sense universal. We identify two other regimes with different limiting objects. The first one is degenerate; the second one is nontrivial and related to stable trees.

These results are obtained thanks to a characterization of the convergence of random permutons through the convergence of their expected pattern densities. The limit of expected pattern densities is then computed by using the substitution tree encoding of permutations and performing singularity analysis on the tree series.

Key words and phrases:

permutation patterns, Brownian excursion, permutons

2010 Mathematics Subject Classification:

60C05,05A05

1 Introduction
1.1 Permutation classes and their limit
1.2 The permuton viewpoint
1.3 Substitution-closed classes
1.4 Our results: Universality
1.5 Our results: Beyond universality
1.6 Limits of proportions of pattern occurrences
1.7 Outline of the proof
1.8 Organization of the paper
2 Convergence of random permutons
2.1 Deterministic permutons and extracted permutations
2.2 Random permutons and convergence in distribution
3 Coding permutations by trees
3.1 Substitution trees
3.2 Induced trees
4 Exact enumeration of various families of trees
4.1 Generating functions of $\mathcal{S}$ -canonical trees (possibly with marked leaves)
4.2 Generating function counting trees with marked leaves inducing a given tree
5 Asymptotic analysis: The standard case $S^{\prime}(R_{S})>2/(1+R_{S})^{2}-1$
5.1 Definition of the biased Brownian separable permuton and statement of the theorem
5.2 Asymptotics of the generating function of trees with no or one marked leaf
5.3 Asymptotics of the generating function of marked trees with a given induced tree
5.4 Probability of tree patterns
5.5 Back to permutations
5.6 Occurrences of nonseparable patterns
6 Asymptotic analysis: The degenerate case $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$
6.1 Asymptotic behavior of the main series
6.2 Probability of given patterns
6.3 Hypothesis $(CS)$ and convergence of uniform random simple permutations
7 Asymptotic analysis: The critical case $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$
7.1 The case $\delta\in(1,2)$ .
7.2 The case $\delta>2$ .
A Complex analysis toolbox
A.1 Aperiodicity and Daffodil Lemma
A.2 Transfer theorem
A.3 Singular differentiation
A.4 Exponents of dominant singularity
A.5 An analytic implicit function theorem
A.6 Proof of Lemma 6.3
A.7 Proof of Lemma 7.3
B On the simulations given in the introduction
B.1 Biased Brownian permuton
B.2 Stable permutons
B.3 Simulations of permutations in classes

1. Introduction

The aim of this paper is to study the asymptotic behavior of a permutation of large size, picked uniformly at random in a substitution-closed permutation class generated by a given (finite or infinite) family of simple permutations satisfying additional conditions. We first give a few definitions necessary to present the recent literature on related problems, and to state our results.

1.1. Permutation classes and their limit

For any positive integer $n$ , the set of permutations of $[n]:=\{1,2,\ldots,n\}$ is denoted by $\mathfrak{S}_{n}$ . We write permutations of $\mathfrak{S}_{n}$ in one-line notation as $\sigma=\sigma(1)\sigma(2)\dots\sigma(n)$ . For a permutation $\sigma$ in $\mathfrak{S}_{n}$ , the size $n$ of $\sigma$ is denoted by $|\sigma|$ .

For $\sigma\in\mathfrak{S}_{n}$ , and $I\subset[n]$ of cardinality $k$ , let $\operatorname{pat}_{I}(\sigma)$ be the permutation of $\mathfrak{S}_{k}$ induced by $\{\sigma(i):i\in I\}$ . For example for $\sigma=65831247$ and $I=\{2,5,7\}$ we have

$$\operatorname{pat}_{\{2,5,7\}}(65831247)=312$$

since the values in the subsequence $\sigma(2)\sigma(5)\sigma(7)=514$ are in the same relative order as in the permutation $312$ . A permutation $\pi=\operatorname{pat}_{I}(\sigma)$ is a pattern involved (or contained) in $\sigma$ , and the subsequence $(\sigma(i))_{i\in I}$ is an occurrence of $\pi$ in $\sigma$ . When a pattern $\pi$ has no occurrence in $\sigma$ , we say that $\sigma$ avoids $\pi$ . The pattern containment relation defines a partial order on $\mathfrak{S}=\cup_{n}\mathfrak{S}_{n}$ : we write $\pi\preccurlyeq\sigma$ if $\pi$ is a pattern of $\sigma$ .
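The extraction of an induced pattern can be sketched in a few lines of Python. This is only a minimal illustration: the function name `pat` and the encoding of permutations as tuples in one-line notation are conventions chosen here, not notation from the paper.

```python
def pat(sigma, positions):
    """Pattern of `sigma` induced by a set of 1-indexed positions.

    `sigma` is a permutation in one-line notation, given as a tuple.
    (Hypothetical helper written for illustration.)
    """
    # Read the selected entries from left to right.
    values = [sigma[i - 1] for i in sorted(positions)]
    # Replace each value by its rank among the selected values.
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)


sigma = (6, 5, 8, 3, 1, 2, 4, 7)
print(pat(sigma, {2, 5, 7}))  # (3, 1, 2)
```

The subsequence read at positions 2, 5, 7 is 5, 1, 4, whose ranks are 3, 1, 2, in agreement with the example above.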

A permutation class is a family $\mathcal{C}$ of permutations that is downward closed for $\preccurlyeq$ , i.e. for any $\sigma\in\mathcal{C}$ and any pattern $\pi\preccurlyeq\sigma$ , it holds that $\pi\in\mathcal{C}$ . For every set $B$ of patterns, we denote by $\mathrm{Av}(B)$ the set of all permutations that avoid every pattern in $B$ . Clearly, for all $B$ , $\mathrm{Av}(B)$ is a permutation class. Conversely (see for instance [20, Paragraph 5.1.2]), every class $\mathcal{C}$ of permutations can be defined by a set $B$ of excluded patterns. Moreover, for any given $\mathcal{C}$ , we can define uniquely a set $B$ such that $\mathcal{C}=\mathrm{Av}(B)$ : it is enough to impose that $B$ is chosen minimal (for set inclusion) among all $B^{\prime}$ such that $\mathcal{C}=\mathrm{Av}(B^{\prime})$ . This $B$ (which happens to be an antichain) is called the basis of the class $\mathcal{C}$ . The basis of a permutation class may be finite or infinite (see [20, Paragraph 7.2.3]).

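One of the many ways in which permutation classes can be studied is by looking at the features of a typical large permutation $\sigma$ in the class. A particularly interesting characteristic is the frequency of occurrence of a pattern $\pi$, especially when it is considered for all $\pi$ simultaneously. Denote by $\mathrm{occ}(\pi,\sigma)$ the number of occurrences of a pattern $\pi\in\mathfrak{S}_{k}$ in $\sigma\in\mathfrak{S}_{n}$ and by $\operatorname{\widetilde{occ}}(\pi,\sigma)$ the pattern density of $\pi$ in $\sigma$. More formally

$$\mathrm{occ}(\pi,\sigma)=\#\big\{I\subset[n]\text{ of cardinality }k:\ \operatorname{pat}_{I}(\sigma)=\pi\big\},\qquad\operatorname{\widetilde{occ}}(\pi,\sigma)=\frac{\mathrm{occ}(\pi,\sigma)}{\binom{n}{k}}=\mathbb{P}\big(\operatorname{pat}_{\bm{I}}(\sigma)=\pi\big),$$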

where $\bm{I}$ is randomly and uniformly chosen among the $\binom{n}{k}$ subsets of $[n]$ with $k$ elements. The study of the asymptotics of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ , where $\bm{\sigma}_{n}$ is a uniform random permutation of size $n$ in a permutation class $\mathcal{C}$ and $\pi\in\mathfrak{S}$ is a fixed pattern, has been carried out in several cases.
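For small sizes, the number of occurrences and the pattern density can be computed by brute force over all $k$-element subsets of positions. The sketch below assumes permutations are stored as tuples; the names `standardize`, `occ` and `occ_density` are illustrative choices, and the enumeration is exponential in $k$, so it is only meant for experimentation.

```python
from itertools import combinations
from math import comb


def standardize(values):
    """Return the permutation with the same relative order as `values`."""
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)


def occ(pi, sigma):
    """Number of k-subsets I of positions such that pat_I(sigma) == pi."""
    k = len(pi)
    # combinations() preserves the left-to-right order of sigma,
    # so every generated tuple is a subsequence of sigma.
    return sum(1 for sub in combinations(sigma, k)
               if standardize(sub) == tuple(pi))


def occ_density(pi, sigma):
    """Pattern density: occ(pi, sigma) divided by binomial(n, k)."""
    return occ(pi, sigma) / comb(len(sigma), len(pi))


sigma = (6, 5, 8, 3, 1, 2, 4, 7)
print(occ((3, 1, 2), sigma), occ_density((3, 1, 2), sigma))
```

Since every $k$-subset induces exactly one pattern of size $k$, the densities of all patterns of a given size sum to $1$; for $k=2$ this says that the densities of $12$ and $21$ are complementary.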

• The behavior of $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]$ for various classes $\mathcal{C}$ and fixed $\pi$ was investigated by Bóna [18, 19], Homberger [35], Chang, Eu and Fu [24] and Rudolf [55].

• Janson, Nakamura and Zeilberger [39] considered higher moments and joint moments, rigorously when $\mathcal{C}=\mathfrak{S}$, and also empirically for various classes $\mathcal{C}$. Somewhat later, Janson [38] gave for $\mathcal{C}=\mathrm{Av}(132)$ the joint limit in distribution of the random variables $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$, properly rescaled so as to yield a nontrivial limit.

A parallel line of work consists in studying the asymptotic shape of the diagram of $\bm{\sigma}_{n}$ . The diagram of a permutation $\sigma\in\mathfrak{S}_{n}$ is the set of points $\{(i,\sigma(i)),\,1\leq i\leq n\}$ in the Cartesian plane. To gain understanding of typical large permutations in $\mathcal{C}$ , one can investigate the geometric properties of the diagram of $\bm{\sigma}_{n}$ as $n\to\infty$ , possibly after rescaling this diagram so that it fits into a unit square.

• Madras and Liu [43], Atapour and Madras [10] and Madras and Pehlivan [44] considered the asymptotic shape of $\bm{\sigma}_{n}$ when $\mathcal{C}=\mathrm{Av}(\tau)$ for small patterns $\tau$.

• In parallel, Miner and Pak [48] described very precisely the asymptotic shape of $\bm{\sigma}_{n}$, when $\mathcal{C}=\mathrm{Av}(\tau)$ for the $6$ patterns $\tau$ in $\mathfrak{S}_{3}$. These shapes are related to Brownian excursion, as explained by Hoffman, Rizzolo and Slivken [33, 34].

• Bevan describes the limit shape of permutations in so-called connected monotone grid classes [15, Chapter 6].

These two points of view may seem different, but they are in fact tightly bound together. Indeed, as we shall see in Section 2, it follows from results of [36] that the convergence of pattern densities characterizes the convergence of the diagrams, seen as permutons. This important property was actually the main motivation for the introduction of permutons in [36].

1.2. The permuton viewpoint

A permuton is a probability measure on the unit square $[0,1]^{2}$ with uniform marginals, i.e. its pushforwards by the projections on the axes are both the Lebesgue measure on $[0,1]$ . Permutons generalize permutation diagrams in the following sense: to every permutation $\sigma\in\mathfrak{S}_{n}$ , we associate the permuton $\mu_{\sigma}$ with density

$$\mu_{\sigma}(dx\,dy)=n\,\mathbf{1}_{\sigma(\lceil xn\rceil)=\lceil yn\rceil}\,dx\,dy.$$

Note that it amounts to replacing every point $(i,\sigma(i))$ in the diagram of $\sigma$ (normalized to the unit square) by a square of the form $[(i-1)/n,i/n]\times[(\sigma(i)-1)/n,\sigma(i)/n]$ , which has mass $1/n$ uniformly distributed.
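To make the construction concrete, the following sketch computes the $\mu_{\sigma}$-mass of an axis-parallel rectangle exactly, by summing the contributions of the $n$ small squares. The helper names `overlap` and `permuton_mass` are hypothetical, and exact rational arithmetic is used only to avoid rounding issues in the checks.

```python
from fractions import Fraction


def overlap(lo, hi, a, b):
    """Length of the intersection of the intervals [lo, hi] and [a, b]."""
    return max(Fraction(0), min(hi, b) - max(lo, a))


def permuton_mass(sigma, a, b, c, d):
    """Mass that mu_sigma gives to the rectangle [a, b] x [c, d].

    Each point (i, sigma(i)) contributes density n on the square
    [(i-1)/n, i/n] x [(sigma(i)-1)/n, sigma(i)/n].
    """
    n = len(sigma)
    total = Fraction(0)
    for i, s in enumerate(sigma, start=1):
        wx = overlap(Fraction(i - 1, n), Fraction(i, n), a, b)
        wy = overlap(Fraction(s - 1, n), Fraction(s, n), c, d)
        total += n * wx * wy
    return total


sigma = (2, 4, 1, 3)
# Mass of the square attached to the point (1, sigma(1)) = (1, 2).
print(permuton_mass(sigma, Fraction(0), Fraction(1, 4),
                    Fraction(1, 4), Fraction(1, 2)))  # 1/4
# Uniform marginal: a vertical strip of width 5/12 has mass 5/12.
print(permuton_mass(sigma, Fraction(1, 3), Fraction(3, 4),
                    Fraction(0), Fraction(1)))  # 5/12
```

The second call illustrates the uniform-marginal property: the mass of a vertical strip depends only on its width, whatever the permutation.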

Permutons were first considered by Hoppen, Kohayakawa, Moreira, Rath and Sampaio in [36], with the point of view of characterizing permutation sequences with convergent pattern densities. The name permuton and the measure point of view were given afterwards by Glebov, Grzesik, Klimošová and Král [31]. This recently introduced concept has already been the subject of many articles, including:

• results on the set of possible pattern densities of permutons [31, 32, 40];

• a large deviation principle in the space of permutons, giving access to the analysis of random permutations with fixed pattern densities [40];

• the description of the limiting distribution of the number of fixed points (and more generally of cycles of a given length) for “equi-continuous” sequences of permutations with a limiting permuton [50];

• central limit theorems and refinements for pattern occurrences in random permutation models associated to permutons [29];

• the permuton convergence of some exponentially tilted models of random permutations [49];

• a study of permuton-valued processes, in the context of random sorting networks [53].

In the context of this article, the theory of permutons is a nice framework to state scaling limit results for sequences of (random) permutations. Indeed, the space $\mathcal{M}$ of permutons is equipped with the topology of weak convergence of measures, which makes it a compact metric space. This allows one to define convergent sequences of permutations: we say that $(\sigma_{n})_{n}$ converges to $\mu$ when $(\mu_{\sigma_{n}})\to\mu$ weakly. Accordingly, for a sequence $(\bm{\sigma_{n}})_{n}$ of random permutations, we will consider the convergence in distribution of the associated random measures $(\mu_{\bm{\sigma_{n}}})$ in the weak topology. The limiting object is then a random permuton.

By definition, convergence to a permuton encodes the first-order asymptotics of the shape of a sequence of permutations. As we shall see in Section 2, it also encodes the first-order asymptotics of pattern densities: a sequence $(\bm{\sigma}_{n})_{n}$ of random permutations converges in distribution to a random permuton if and only if the sequences of random variables $(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))_{n}$ converge in distribution, jointly for all $\pi\in\mathfrak{S}$ . Moreover, for any pattern $\pi$ , the limit distribution of the density of $\pi$ can be expressed as a function of the limit permuton.

Our previous article [12] studies the limit of the class $\mathcal{C}=\mathrm{Av}(2413,3142)$ of separable permutations, in terms of pattern densities and permutons.

Theorem 1.1.

Let $\bm{\sigma}_{n}$ be a uniform random separable permutation of size $n$ . There exists a random permuton $\bm{\mu}$ , called the Brownian separable permuton, such that $(\mu_{\bm{\sigma}_{n}})_{n}$ converges in distribution to $\bm{\mu}$ .

The result of [12] is more precise, and describes the asymptotic joint distribution of the random variables $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ as a measurable functional of a signed Brownian excursion. This object is a normalized Brownian excursion whose strict local minima are decorated with an i.i.d. sequence of balanced signs in $\{+,-\}$ . In another paper [42], the fifth author gives a direct construction of $\bm{\mu}$ from this signed Brownian excursion. In particular, $\bm{\mu}$ is not equal almost surely to a given permuton; in this regard, separable permutations behave differently from all other classes analysed so far in the literature (which converge to a deterministic permuton).

The class of separable permutations is the smallest nontrivial substitution-closed class, as defined in the next subsection. The present paper aims at showing a convergence result similar to Theorem 1.1 for other substitution-closed classes. We will see that in many cases the limit belongs to a one-parameter family of deformations of the Brownian separable permuton: the biased Brownian separable permuton $\bm{\mu}^{(p)}$ of parameter $p\in(0,1)$ is obtained from a biased signed Brownian excursion (defined similarly to the signed Brownian excursion but with each sign having probability $p$ of being a $+$ ). Simulations of the biased Brownian separable permuton are given in Fig. 1. A precise definition will be given in Section 5 (Eq. 20).

Finally, we mention that although permutons are a very nice, natural, and powerful way of studying “limits of permutation classes” (in particular because they unify many earlier results, as explained above), this approach has its weaknesses. Most importantly, it gives no information beyond the first order. For instance, the results of [48, 33] describe the canoe shape of large permutations in classes avoiding one pattern of length three. As the width of the canoe is $o(n)$, one only sees the diagonal (or antidiagonal) in the permuton limit, and has no information about the fluctuations around this limit.

1.3. Substitution-closed classes

Definition 1.2.

Let $\theta=\theta(1)\cdots\theta(d)$ be a permutation of size $d$, and let $\pi^{(1)},\dots,\pi^{(d)}$ be $d$ other permutations. The substitution of $\pi^{(1)},\dots,\pi^{(d)}$ in $\theta$ is the permutation of size $|\pi^{(1)}|+\dots+|\pi^{(d)}|$ obtained by replacing each $\theta(i)$ by a sequence of integers isomorphic to $\pi^{(i)}$ while keeping the relative order induced by $\theta$ between these subsequences.

This permutation is denoted by $\theta[\pi^{(1)},\dots,\pi^{(d)}]$. We sometimes refer to $\theta$ as the skeleton of the substitution.

When $\theta$ is $12\ldots k$ (resp. $k\ldots 21$), for any value of $k\geq 2$, we write $\oplus$ (resp. $\ominus$) instead of $\theta$. Note that the specific value of $k$ does not appear in this notation, but can be recovered by counting the number of permutations $\pi^{(i)}$ which are substituted in $\oplus$ (resp. $\ominus$).

Examples of substitution (see Fig. 2 below) are conveniently presented by representing permutations by their diagrams: the diagram of $\theta[\pi^{(1)},\dots,\pi^{(d)}]$ is obtained by blowing up each point $\theta_{i}$ of $\theta$ into a square containing the diagram of $\pi^{(i)}$.
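The substitution operation of Definition 1.2 can be sketched as follows. The function name `substitute` and the tuple encoding are illustrative assumptions; the idea is simply that the block placed at position $i$ is shifted upwards by the total size of the blocks sitting at smaller values of $\theta$.

```python
def substitute(theta, blocks):
    """Return theta[blocks[0], ..., blocks[d-1]] in one-line notation.

    `theta` and every block are permutations given as tuples.
    (Illustrative helper, not code from the paper.)
    """
    d = len(theta)
    assert len(blocks) == d, "one block per entry of the skeleton"
    # offset[i]: number of values lying below the block substituted at position i,
    # i.e. the total size of the blocks attached to smaller values of theta.
    offset = [0] * d
    running = 0
    for value in range(1, d + 1):
        i = theta.index(value)
        offset[i] = running
        running += len(blocks[i])
    result = []
    for i, block in enumerate(blocks):
        result.extend(offset[i] + x for x in block)
    return tuple(result)


print(substitute((2, 4, 1, 3), [(2, 1), (1,), (1, 2), (1,)]))  # (4, 3, 6, 1, 2, 5)
print(substitute((2, 1), [(1, 2), (1,)]))                      # (2, 3, 1)
```

The second call is $\ominus[12,1]=231$: with skeleton $21$, the first block is placed above the second one.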

By definition of permutation classes, if $\theta[\pi^{(1)},\dots,\pi^{(d)}]\in\mathcal{C}$ for some permutation class $\mathcal{C}$ , then $\theta,\pi^{(1)},\dots,\pi^{(d)}\in\mathcal{C}$ . The converse is not always true.

Definition 1.3.

A permutation class $\mathcal{C}$ is substitution-closed if, for every $\theta,\pi^{(1)},\dots,\pi^{(d)}$ in $\mathcal{C}$ , $\theta[\pi^{(1)},\dots,\pi^{(d)}]\in\mathcal{C}$ .

The focus of this paper will be substitution-closed classes. To study such classes it is essential to observe that any permutation has a canonical decomposition using substitutions, which can be encoded in a tree. This decomposition is canonical in the same sense as the decomposition of integers into products of primes. In this analogy, simple permutations play the role of prime numbers and the substitution plays the role of the product. We first give a simple definition: a permutation $\sigma$ is $\oplus$-indecomposable (resp. $\ominus$-indecomposable) if it cannot be written as $\oplus[\pi^{(1)},\pi^{(2)}]$ (resp. $\ominus[\pi^{(1)},\pi^{(2)}]$) or, equivalently, if there is no $d$ such that $\sigma$ can be written as $\oplus[\pi^{(1)},\dots,\pi^{(d)}]$ (resp. $\ominus[\pi^{(1)},\dots,\pi^{(d)}]$).

Definition 1.4.

A simple permutation is a permutation of size $n>2$ that does not map any nontrivial interval (i.e. a range in $[n]$ containing at least two and at most $n-1$ elements) onto an interval.

For instance, $451326$ is not simple as it maps $[3;5]$ onto $[1;3]$ . The smallest simple permutations are $2413$ and $3142$ (there is no simple permutation of size $3$ ).
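Definition 1.4 translates directly into a quadratic-time test: a window of consecutive positions is mapped onto an interval exactly when the difference between its largest and smallest values equals its length minus one. The sketch below follows the convention of this paper that simple permutations have size greater than $2$; the function name `is_simple` is an illustrative choice.

```python
from itertools import permutations


def is_simple(sigma):
    """True if `sigma` (a tuple) is simple in the sense of Definition 1.4."""
    n = len(sigma)
    if n <= 2:
        return False  # convention of the paper: 1, 12 and 21 are not simple
    for i in range(n):
        lo = hi = sigma[i]
        for j in range(i + 1, n):
            lo, hi = min(lo, sigma[j]), max(hi, sigma[j])
            length = j - i + 1
            # A window of 2..n-1 positions whose values are consecutive
            # is a nontrivial interval mapped onto an interval.
            if length < n and hi - lo == length - 1:
                return False
    return True


print(is_simple((4, 5, 1, 3, 2, 6)))  # False
print([p for p in permutations(range(1, 5)) if is_simple(p)])
# [(2, 4, 1, 3), (3, 1, 4, 2)]
```

The enumeration over size $4$ recovers the two smallest simple permutations, and the same loop over size $3$ returns an empty list.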

Remark: Usually in the literature, the definition of a simple permutation requires only $n\geq 2$ and not $n>2$, so that $12$ and $21$ are considered to be simple. However, in our work, $12$ and $21$ do not play the same role as the other simple permutations, which is why we do not consider them to be simple.

Theorem 1.5 (Decomposition of permutations, Proposition 2 in [1]).

Every permutation $\sigma$ of size $n\geq 2$ can be uniquely decomposed as either:

• $\alpha[\pi^{(1)},\dots,\pi^{(d)}]$, where $\alpha$ is simple (of size $d\geq 4$),

• $\oplus[\pi^{(1)},\dots,\pi^{(d)}]$, where $d\geq 2$ and $\pi^{(1)},\dots,\pi^{(d)}$ are $\oplus$-indecomposable,

• $\ominus[\pi^{(1)},\dots,\pi^{(d)}]$, where $d\geq 2$ and $\pi^{(1)},\dots,\pi^{(d)}$ are $\ominus$-indecomposable.

This decomposition theorem can be applied recursively inside the permutations $\pi^{(i)}$ appearing in the items above, until we reach permutations of size $1$ . Doing so, a permutation $\sigma$ can be naturally encoded by a rooted planar tree, whose internal nodes are labeled by the skeletons of the substitutions that are considered along the recursive decomposition process, and whose leaves correspond to the elements of $\sigma$ . This construction provides a one-to-one correspondence between permutations and canonical trees (defined below) that maps the size to the number of leaves.
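The recursive decomposition can be sketched as a short program. This is a naive illustration under our own conventions (trees as nested tuples `(label, children)`, leaves as the string `"leaf"`, labels `"+"` and `"-"` for $\oplus$ and $\ominus$); it is not an optimized algorithm and is not taken from the paper. In the simple case it uses the fact that the blocks are the maximal proper intervals, found greedily from left to right.

```python
def standardize(values):
    """Return the permutation with the same relative order as `values`."""
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)


def split(sigma, cuts):
    """Cut `sigma` at the given positions and standardize each piece."""
    bounds = [0] + cuts + [len(sigma)]
    return [standardize(sigma[a:b]) for a, b in zip(bounds, bounds[1:])]


def canonical_tree(sigma):
    """Canonical tree of the permutation `sigma` (a tuple)."""
    n = len(sigma)
    if n == 1:
        return "leaf"
    # Direct sum: cut after j entries whenever they are exactly {1, ..., j}.
    cuts = [j for j in range(1, n) if max(sigma[:j]) == j]
    if cuts:
        return ("+", [canonical_tree(b) for b in split(sigma, cuts)])
    # Skew sum: cut after j entries whenever they are exactly {n-j+1, ..., n}.
    cuts = [j for j in range(1, n) if min(sigma[:j]) == n - j + 1]
    if cuts:
        return ("-", [canonical_tree(b) for b in split(sigma, cuts)])
    # Simple root: blocks are the maximal proper intervals, read left to right.
    cuts, i = [], 0
    while i < n:
        end, lo, hi = i + 1, sigma[i], sigma[i]
        for j in range(i + 1, n):
            lo, hi = min(lo, sigma[j]), max(hi, sigma[j])
            if hi - lo == j - i and j - i + 1 < n:
                end = j + 1  # longest proper window starting at i that is an interval
        if end < n:
            cuts.append(end)
        i = end
    starts = [0] + cuts
    alpha = standardize([sigma[a] for a in starts])  # skeleton of the substitution
    return (alpha, [canonical_tree(b) for b in split(sigma, cuts)])


print(canonical_tree((2, 3, 1)))
# ('-', [('+', ['leaf', 'leaf']), 'leaf'])
print(canonical_tree((4, 3, 6, 1, 2, 5))[0])
# (2, 4, 1, 3)
```

Because the direct-sum and skew-sum cuts are taken as fine as possible, a child of a `"+"` node is never itself a `"+"` node (and similarly for `"-"`), which matches the uniqueness statement of Theorem 1.5.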

Definition 1.6.

A canonical tree is a rooted planar tree whose internal nodes carry labels satisfying the following constraints.

• Internal nodes are labeled by $\oplus,\ominus$, or by a simple permutation.

• A node labeled by $\alpha$ has degree $|\alpha|$; nodes labeled by $\oplus$ and $\ominus$ have degree at least $2$. (Throughout the paper, by degree of a node in a tree, we mean the number of its children, which is sometimes called arity in other works. Note that it is different from the graph-degree: for us, the edge to the parent, if it exists, is not counted in the degree.)

• A child of a node labeled by $\oplus$ (resp. $\ominus$) cannot be labeled by $\oplus$ (resp. $\ominus$).

Canonical trees are known in the literature under several names: decomposition trees, substitution trees, … We choose the term canonical because we consider many variants of substitution trees in this paper, but only these canonical ones provide a one-to-one correspondence with permutations.

The representation of permutations by their canonical trees is essential in the study of substitution-closed classes. The reason is that, for any such class $\mathcal{C}$ , the set of canonical trees of permutations in $\mathcal{C}$ can be easily described.

Proposition 1.7.

Let $\mathcal{C}$ be a substitution-closed permutation class, and assume that $12,21\in\mathcal{C}$ (otherwise, $\mathcal{C}=\{12\ldots k:k\geq 1\}$ or $\mathcal{C}=\{k\ldots 21:k\geq 1\}$ or $\mathcal{C}=\{1\}$, and these cases are trivial). Denote by $\mathcal{S}$ the set of simple permutations in $\mathcal{C}$. The set of canonical trees encoding permutations of $\mathcal{C}$ is the set of all canonical trees built on the set of nodes $\{\oplus,\ominus\}\cup\{\alpha:\alpha\in\mathcal{S}\}$.

Proof.

First, if a canonical tree contains a node labeled by a simple permutation $\alpha\notin\mathcal{S}$ , then the corresponding permutation $\sigma$ contains the pattern $\alpha\notin\mathcal{C}$ , and hence $\sigma\notin\mathcal{C}$ . Second, by induction, all canonical trees built on $\{\oplus,\ominus\}\cup\{\alpha:\alpha\in\mathcal{S}\}$ encode permutations of $\mathcal{C}$ , because $\mathcal{C}$ is substitution-closed. If necessary, details can be found in [1, Lemma 11]. ∎

For instance, the class $\mathrm{Av}(2413,3142)$ of separable permutations studied in [12] corresponds to the set of all canonical trees built on $\{\oplus,\ominus\}$ , i.e., to $\mathcal{S}=\emptyset$ . It is therefore the smallest nontrivial substitution-closed class.
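As a sanity check of this correspondence for small sizes, one can count separable permutations in two independent ways: by brute-force pattern avoidance, and by counting canonical trees with $n$ leaves whose internal nodes are all labeled $\oplus$ or $\ominus$. The sketch below does both; all function names are hypothetical, and the recurrence used for trees is our own elementary derivation from Definition 1.6, not a formula quoted from the paper.

```python
from itertools import combinations, permutations


def standardize(values):
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)


def avoids(sigma, basis):
    """True if `sigma` contains no pattern of `basis`."""
    return not any(standardize(sub) == p
                   for p in basis
                   for sub in combinations(sigma, len(p)))


def count_separable_bruteforce(n):
    basis = [(2, 4, 1, 3), (3, 1, 4, 2)]
    return sum(1 for s in permutations(range(1, n + 1)) if avoids(s, basis))


def count_plus_minus_trees(n_max):
    """Number of canonical trees on {+, -} with n leaves, for n = 1..n_max."""
    u = [0, 1]  # u[n]: trees with n leaves whose root is not '+'
    r = [0, 0]  # r[n]: trees with n leaves whose root is '+'
    for n in range(2, n_max + 1):
        # Root '+': a first child that is not '+'-rooted (m leaves), followed by
        # a nonempty sequence of such trees (u[n-m] if one tree, r[n-m] if several).
        rn = sum(u[m] * (u[n - m] + r[n - m]) for m in range(1, n))
        r.append(rn)
        u.append(rn)  # by the +/- symmetry, '-'-rooted trees are equally many
    return [1 if n == 1 else 2 * r[n] for n in range(1, n_max + 1)]


print([count_separable_bruteforce(n) for n in range(1, 7)])
# [1, 2, 6, 22, 90, 394]
print(count_plus_minus_trees(6))
# [1, 2, 6, 22, 90, 394]
```

Both computations give the large Schröder numbers, as expected for separable permutations.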

Observation 1.8.

Let $\mathcal{C}$ be any substitution-closed permutation class, and let $\mathcal{S}$ be the set of simple permutations in $\mathcal{C}$ . Because $\mathcal{C}$ is a class, it holds that for all $\alpha\in\mathcal{S}$ , if $\alpha^{\prime}$ is a simple permutation such that $\alpha^{\prime}\preccurlyeq\alpha$ , then $\alpha^{\prime}\in\mathcal{S}$ . Whenever a set $\mathcal{S}$ of simple permutations satisfies this property, we say that $\mathcal{S}$ is downward-closed (implicitly: for $\preccurlyeq$ and among the set of simple permutations).

Thanks to their encoding by families of trees, it can be proved that substitution-closed permutation classes (possibly satisfying additional constraints) share a common behavior. For example, the canonical tree representation of their elements implies that all substitution-closed classes with finitely many simple permutations have an algebraic generating function [1, Corollary 14]. (This is actually easy, the main contribution of [1] being to generalize this algebraicity result to all classes containing a finite number of simple permutations, again using canonical trees as a key tool.) Our work illustrates this universality paradigm in probability theory: we prove that the biased Brownian separable permuton is the limiting permuton of many substitution-closed classes (see Theorem 1.10 and to a lesser extent Theorem 7.8).

1.4. Our results: Universality

Let $\mathcal{S}$ be a (finite or infinite) set of simple permutations. We denote by $\langle\mathcal{S}\rangle_{n}$ the set of permutations of size $n$ whose canonical trees use only nodes $\oplus$ , $\ominus$ and $\alpha\in\mathcal{S}$ , and we define $\langle\mathcal{S}\rangle=\cup_{n}\langle\mathcal{S}\rangle_{n}$ . From Proposition 1.7 and Observation 1.8, every substitution-closed permutation class $\mathcal{C}$ containing $12$ and $21$ can be written as $\mathcal{C}=\langle\mathcal{S}\rangle$ for a downward-closed set $\mathcal{S}$ of simple permutations (which is just the set of simple permutations in $\mathcal{C}$ ).

Remark 1.9.

For a generic (not necessarily downward-closed) set $\mathcal{S}$ of simple permutations, $\langle\mathcal{S}\rangle$ is a family of permutations more general than a substitution-closed permutation class. The results that we obtain apply not only to permutation classes but also to such sets of permutations.

Note however that our work does not consider substitution-closed sets of permutations not containing either $12$ or $21$ (as mentioned above, a permutation class not containing one of these two permutations is necessarily trivial, but there might be interesting such substitution-closed sets). In principle, such sets of permutations could also be studied by the approach developed in this paper, but we prefer to leave such cases outside of our study. Indeed, covering them would require redoing all computations, modifying the combinatorial equations that we start from (see Proposition 12 p. 12) and all equations that follow, so as not to allow the nodes labeled $\oplus$ and/or $\ominus$.

We are interested in the asymptotic behavior of a uniform permutation $\bm{\sigma}_{n}$ in $\langle\mathcal{S}\rangle_{n}$ which we describe in terms of permutons. Let

$$S(z)=\sum_{\alpha\in\mathcal{S}}z^{|\alpha|}$$

be the generating function of $\mathcal{S}$ and let $R_{S}\in[0,+\infty]$ be the radius of convergence of $S$ .

Theorem 1.10 (Main Theorem: the standard case).

Let $\mathcal{S}$ be a set of simple permutations such that

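$$R_{S}>0\quad\text{and}\quad\lim_{r\rightarrow R_{S}\atop r<R_{S}}S^{\prime}(r)>\frac{2}{(1+R_{S})^{2}}-1.\tag{H1}$$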

For every $n\geq 1$ , let $\bm{\sigma}_{n}$ be a uniform permutation in $\langle\mathcal{S}\rangle_{n}$ , and let $\mu_{\bm{\sigma}_{n}}$ be the random permuton associated with $\bm{\sigma}_{n}$ . The sequence $(\mu_{\bm{\sigma}_{n}})_{n}$ tends in distribution in the weak convergence topology to the biased Brownian separable permuton $\bm{\mu}^{(p)}$ whose parameter $p$ is given in (21) p. 21.

An important point in Theorem 1.10 is that the limiting object depends on $\mathcal{S}$ only through the parameter $p$ . It turns out that $p$ only depends on the number of occurrences of the patterns $12$ and $21$ in the elements of $\mathcal{S}$ . We illustrate this universality of the limiting object in Fig. 3, by showing large uniform random permutations in two different substitution-closed classes: the first one has a finite set of simple permutations $\mathcal{S}=\{2413,3142,24153,42513\}$ , while the second is the substitution closure of $\mathrm{Av}(321)$ , which contains infinitely many simple permutations and satisfies (H1) (as we will explain below). Although this is hard to see in the picture, the corresponding values of the parameter $p$ are different, namely $0.5$ and approximately $0.6$ respectively (see Section 5, Examples 5.3 and 5.5).

In the following, to lighten the notation, we write $S^{\prime}(R_{S}):=\lim_{r\rightarrow R_{S}\atop r<R_{S}}S^{\prime}(r)$ . Note that $S^{\prime}(R_{S})$ may be $\infty$ .

The case when Condition (H1) of Theorem 1.10 is not satisfied is discussed in the next section. When Condition (H1) is satisfied the case is called standard because there are natural and easy sufficient conditions to ensure this case (that are given below). Moreover, this case includes most sets $\mathcal{S}$ studied so far in the literature on permutation classes, to our knowledge. This gives a fairly precise (and positive) answer to an important question raised in our previous article [12]: is the Brownian separable permuton universal (in the sense that it describes the limit of a large family of substitution-closed classes)?

We now give several cases in which Condition (H1) of Theorem 1.10 is satisfied.

•

If $S$ is a generating function with radius of convergence $R_{S}>\sqrt{2}-1$ , (H1) is satisfied. Indeed, the condition $R_{S}>\sqrt{2}-1$ implies $\frac{2}{(1+R_{S})^{2}}-1<0$ , and $S^{\prime}(R_{S})$ is nonnegative since $S^{\prime}$ (like $S$ ) is a series with nonnegative coefficients. In particular, the situation where $R_{S}>\sqrt{2}-1$ covers the cases where there are finitely many simple permutations in the class (then $S$ is a polynomial and $R_{S}=\infty$ ), and more generally where $R_{S}=1$ (i.e. the number of simple permutations of size $n$ grows subexponentially).

•

If $S^{\prime}$ is divergent at $R_{S}$ , (H1) is trivially verified. In particular, this happens when $S$ is a rational generating function, or when $S$ has a square root singularity at $R_{S}$ .
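The two sufficient conditions above can be checked numerically once $R_{S}$ and $S^{\prime}(R_{S})$ are known. The following is a minimal sketch, assuming these two quantities are supplied by the user; the function names `h1_threshold` and `satisfies_h1` are hypothetical and only restate the inequality $S^{\prime}(R_{S})>2/(1+R_{S})^{2}-1$ together with $R_{S}>0$ .

```python
import math


def h1_threshold(radius: float) -> float:
    """Right-hand side of the inequality in (H1): 2 / (1 + R_S)^2 - 1."""
    return 2.0 / (1.0 + radius) ** 2 - 1.0


def satisfies_h1(radius: float, s_prime_at_radius: float) -> bool:
    """Return True when R_S > 0 and S'(R_S) > 2 / (1 + R_S)^2 - 1.

    `s_prime_at_radius` may be math.inf when S' diverges at R_S.
    """
    if radius <= 0:
        return False
    if math.isinf(radius):
        # S is a polynomial: the threshold tends to -1 while S' >= 0.
        return True
    return s_prime_at_radius > h1_threshold(radius)


# The threshold vanishes exactly at R_S = sqrt(2) - 1 and is negative beyond it.
print(h1_threshold(math.sqrt(2) - 1))
# A square-root singularity at R_S = 1/3 makes S' diverge, so (H1) holds.
print(satisfies_h1(1 / 3, math.inf))
```

For $R_{S}=1/3$ the threshold is positive, so the first criterion does not apply and the divergence of $S^{\prime}$ is what is used.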

In the literature, there are quite a few examples of permutation classes whose set $\mathcal{S}$ of simple permutations has been enumerated. We can therefore ask whether Condition (H1) applies to them. In most examples we could find, it is indeed satisfied, and this follows from the discussion above. We record these examples here.

•

Classes with finitely many simple permutations have attracted a fair amount of attention, see [1] and subsequently [13, 21, 23].

•

Several families of simple permutations with a bounded number of elements of each size have appeared in the literature: the family of exceptional simple permutations (also called simple parallel alternations in [22]), the family of wedge simple permutations (see also [22]), the families of oscillations and quasi-oscillations (see [14]), and the families of simple permutations contained in the following three classes: $\mathrm{Av}(4213,3142)$ , $\mathrm{Av}(4213,1342)$ and $\mathrm{Av}(4213,3124)$ – see [5].

•

The family of simple pin-permutations has a rational generating function – see [14].

•

The generating function $S$ is also rational when $\mathcal{S}$ is the set of simple permutations contained in several permutation classes defined by the avoidance of two patterns of size $4$ , namely $\mathrm{Av}(3124,4312)$ – see [51], $\mathrm{Av}(2143,4312)$ and $\mathrm{Av}(1324,4312)$ – see [2], $\mathrm{Av}(2143,4231)$ – see [3], $\mathrm{Av}(1324,4231)$ – see [6], $\mathrm{Av}(4312,3142)$ and $\mathrm{Av}(4231,3124)$ – see [5].

•

The generating function of the set $\mathcal{S}$ of simple permutations of the class $\mathrm{Av}(4231,35142,42513,351624)$ , enumerated in [7], is also rational.

•

We come back to the above example, where $\mathcal{C}$ is the substitution closure of $\mathrm{Av}(321)$ . This class has been studied in [11], where an explicit basis of avoided patterns is given. In this case, $\mathcal{S}$ is the set of simple permutations avoiding $321$ , whose generating function $S$ is computed in [8]: it has a square-root singularity at $R_{S}=\tfrac{1}{3}$ , which proves that (H1) is fulfilled.

In addition to verifying Condition (H1), we have computed the numerical value of the parameter $p$ for some of the above-mentioned sets $\mathcal{S}$ of simple permutations; see Examples 5.3, 5.4 and 5.5 (p. 5.3).

Notably absent from the above list is the class $\mathrm{Av}(2413)$ , enumerated in [56, 17]. Since the avoided pattern, $2413$ , is simple, this class is substitution-closed. Its generating series behaves as $C(\rho-z)^{3/2}$ around its dominant singularity $\rho=1/8$ . This prevents the set of simple permutations in this class from satisfying Condition (H1); compare with Proposition 5.8.

1.5. Our results: Beyond universality

When $R_{S}>0$ , for the two remaining cases $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$ and $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$ , the asymptotic behavior of $\mu_{\bm{\sigma}_{n}}$ is qualitatively different, and the results require slight additional hypotheses and notation. As a consequence, for the moment we only briefly describe these behaviors, the results being stated with full rigor later.

•

Case $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$ . This is a degenerate case.

We first show in Theorem 6.9 that, with a small additional assumption which will be called $(CS)$ , the sequence $(\mu_{\bm{\sigma}_{n}})$ of random permutons converges. If uniform simple permutations in $\mathcal{S}\cap\mathfrak{S}_{n}$ have a limit (in the sense of permutons), we show that the limit of permutations in $\langle\mathcal{S}\rangle$ is the same (see Proposition 6.10 and the subsequent comment). This explains the terminology “degenerate”: all permutations in the class (or set) $\langle\mathcal{S}\rangle$ are close to the simple ones, and the “composite” structure of permutations does not appear in the limit.

•

Case $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$ . This critical case is more subtle.

We again need to assume the above mentioned hypothesis $(CS)$ . According to the behavior of $S$ near $R_{S}$ , the limiting permuton of $(\mu_{\bm{\sigma}_{n}})$ can either be the (biased) Brownian separable permuton (Theorem 7.8) or belong to a new family of stable permutons (Theorem 7.6). Finite substructures of stable permutons are connected to those of the random stable tree (see [28]), which explains the terminology. Two simulations are presented in Fig. 4.

We believe that the above-mentioned class $\mathrm{Av}(2413)$ belongs to the degenerate regime. Indeed, in the critical regime, the singularity exponent of the class should be smaller than $1$ , and cannot be $3/2$ , as for $\mathrm{Av}(2413)$ . Since there is no direct description of the simple permutations in $\mathrm{Av}(2413)$ , it seems however out of reach to prove our hypothesis $(CS)$ for this specific class. We are therefore unable to describe its limiting permuton and leave this open for further research. We refer to [45, Fig. 7] for simulations of uniform random permutations in this class.

Remark 1.11.

The variety of behaviors that we observe can be informally understood in terms of trees. We have seen in Section 1.3 that permutations in $\langle\mathcal{S}\rangle$ can be encoded by trees. Taking a uniform element in $\langle\mathcal{S}\rangle_{n}$ , we can prove that the corresponding tree is a multi-type Galton-Watson tree conditioned on having $n$ leaves. This link with conditioned Galton-Watson trees is not used in this paper, but may give intuition on our results.

In the standard and critical cases, these Galton-Watson trees are critical. It is therefore not surprising to see two different limiting behaviors. When the law of reproduction has finite variance, we get one behavior related to the Brownian excursion and the Brownian continuum random tree (that we describe as the universal one). When the law of reproduction has infinite variance, we get the behavior related to stable trees. In the standard case, the law of reproduction always has finite variance.

On the contrary, in the degenerate case, the underlying Galton-Watson tree model is subcritical. At the limit, such trees conditioned to be large have one internal node of very high degree ([37, Theorem 7.1, case (ii)]). This node corresponds to a large simple permutation in the tree encoding a uniform random permutation $\bm{\sigma}_{n}$ in $\langle\mathcal{S}\rangle$ . It is therefore not surprising that $\bm{\sigma}_{n}$ is asymptotically close to a uniform simple permutation in $\langle\mathcal{S}\rangle$ .

Remark 1.12.

The reader may have noticed that all cases where we describe the asymptotic behavior of $\mu_{\bm{\sigma}_{n}}$ are such that $R_{S}>0$ .

Observe that it is always the case for proper permutation classes (i.e., permutation classes different from $\mathfrak{S}$ ). Indeed, from the Marcus-Tardos Theorem [46], the number of permutations of size $n$ in a proper class is at most $c^{n}$ , for some constant $c$ . For the class $\mathfrak{S}$ , we however do have $R_{S}=0$ , since there are asymptotically $e^{-2}n!(1+\mathcal{O}(1/n))$ simple permutations of size $n$ [4, Theorem 5]. In this case, the sequence $(\mu_{\bm{\sigma}_{n}})_{n}$ of permutons associated with a uniform permutation $\bm{\sigma}_{n}$ in $\mathfrak{S}$ converges in distribution to the uniform measure on $[0,1]^{2}$ . The situation where $R_{S}=0$ may happen as well for sets $\langle\mathcal{S}\rangle$ where $\mathcal{S}$ is not downward-closed, but we leave these cases open.
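To make the notion of simple permutation concrete, the following sketch counts simple permutations of small size by brute force. It assumes the bare interval definition (no block of consecutive positions of length between $2$ and $n-1$ is mapped to a set of consecutive values); the function names are hypothetical. The example call starts at size $4$ , since conventions differ on whether $1$ , $12$ and $21$ count as simple. Such small sizes are far from the asymptotic regime $e^{-2}n!$ and only illustrate the definition.

```python
from itertools import permutations


def is_simple(perm):
    """Check simplicity of a permutation given in one-line notation.

    Returns False as soon as some block of consecutive positions of
    length 2..n-1 is mapped to a set of consecutive values.
    """
    n = len(perm)
    for start in range(n):
        lo = hi = perm[start]
        for end in range(start + 1, n):
            lo = min(lo, perm[end])
            hi = max(hi, perm[end])
            length = end - start + 1
            # Values in the window are consecutive iff max - min = length - 1.
            if length < n and hi - lo == length - 1:
                return False
    return True


def count_simples(n):
    """Number of simple permutations of size n (exhaustive enumeration)."""
    return sum(1 for p in permutations(range(1, n + 1)) if is_simple(p))


print([count_simples(n) for n in range(4, 7)])  # [2, 6, 46]
```

The two simple permutations of size $4$ found this way are $2413$ and $3142$ , which appear in the finite set $\mathcal{S}$ used for Fig. 3.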

1.6. Limits of proportions of pattern occurrences

Let us change our approach and discuss in this section the asymptotic behavior (as $n\to\infty$ ) of the proportion $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ of occurrences of a fixed pattern $\pi$ in $\bm{\sigma}_{n}$ (as done in [18, 19, 24, 35, 38, 39, 55] for uniform random permutations in various classes). Since most examples fit in that regime, we focus here on the standard case (when (H1) is satisfied).

As mentioned in Section 1.2 and explained in more detail in Section 2, the convergence of $\mu_{\bm{\sigma}_{n}}$ towards $\bm{\mu}^{(p)}$ implies the (joint) convergence in distribution

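$$\big(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\big)_{\pi\in\mathfrak{S}}\stackrel{d}{\to}\big(\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})\big)_{\pi\in\mathfrak{S}}.\tag{2}$$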

The limiting random variables $\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})$ have been studied in [12, Section 9] (for $p=0.5$ ): in particular, $\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})$ is non-deterministic if and only if $\pi$ is separable of size at least $2$ , and it is possible to compute its moments algorithmically. These results are easily extended to the general case $p\in(0,1)$ . Therefore, for separable patterns $\pi$ , (2) establishes the convergence of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ to a non-deterministic limit. Since these are bounded variables, their (joint) moments also converge to the (joint) moments of the limiting vector, which can be computed algorithmically (even if in practice only low order moments can be effectively computed; see the discussion in [12, Section 9]). Note that all these limiting moments are trivially nonzero, since these are moments of nondeterministic nonnegative random variables.

For nonseparable patterns however, the situation is different: (2) only entails the convergence of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ to $0$ . Indeed, if $\pi$ is nonseparable, the limiting quantity $\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})$ is identically $0$ (this is a consequence of [12, Proposition 9.1] when $p=0.5$ , the result being easily extended to $p\in(0,1)$ ).

We can go further and ask whether $\big(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\big)_{n}$ has a limit in distribution with some appropriate normalization. We therefore investigate the moments of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ . In Section 5.6, we define some permutation statistics $\operatorname{db}(\pi)$ (see Eq. 33) and show that under the hypothesis (H1) we have the following asymptotic behavior. (We say that the sequence $(a_{n})$ behaves as $\Theta(b_{n})$ if there are $c,C>0$ such that $c|b_{n}|\leq|a_{n}|\leq C|b_{n}|$ for every $n\geq 1$ .)

Proposition 5.13. For each $\pi\in\mathcal{C}$ and $m\geq 1$ , we have $\mathbb{E}[(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))^{m}]=\Theta(n^{-\operatorname{db}(\pi)/2})$ .

Proposition 5.13 also holds for separable patterns $\pi$ : in that case $\operatorname{db}(\pi)=0$ and we have $\mathbb{E}[(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))^{m}]=\Theta(1)$ . This is not new, since the moments of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ have nonzero limits, as previously explained. For nonseparable patterns, $\operatorname{db}(\pi)$ is positive and measures in some sense how nonseparable $\pi$ is. Note that the order of magnitude of $\mathbb{E}[(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))^{m}]$ is independent of $m$ , which implies that there is an event of probability $\Theta(n^{-\operatorname{db}(\pi)/2})$ on which the variable $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ stays bounded away from $0$ (see Corollary 5.15). This event of small probability contributes to the asymptotic behavior of moments, and thus the method of moments is inappropriate to find a limiting distribution for some appropriate normalization of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ . Finding such a limiting distribution is therefore left as an open question.

1.7. Outline of the proof

In our previous paper [12] (i.e. when the family of simple permutations is $\mathcal{S}=\emptyset$ ), the proof of the convergence to the Brownian separable permuton strongly relied on a connection to Galton-Watson trees conditioned on having a given number of leaves. This allowed us to use fine results by Kortchemski [41] or Pitman and Rizzolo [52] on such conditioned random tree models.

For a general family $\mathcal{S}$ , generalizing this approach would require delicate results on the asymptotic behavior of conditioned multitype Galton-Watson trees. Moreover, there are several other steps in the main proofs of [12], in particular the subtree exchangeability argument, that are not easily adapted.

The strategy developed in the present paper is different. We strongly use the framework of permutons. Indeed, we first show that to establish the convergence in distribution of $(\mu_{\bm{\sigma}_{n}})_{n}$ to some random permuton $\bm{\mu}$ , it is enough to prove the convergence of $\Big{(}\mathbb{E}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\right]\Big{)}_{n}$ for every pattern $\pi$ (see Theorem 2.5). By definition, if $\pi\in\mathfrak{S}_{k}$ and $n\geq k$ ,

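$$\mathbb{E}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\right]=\frac{\#\left\{(\sigma,I)\,:\,\sigma\in\langle\mathcal{S}\rangle_{n},\ I\subset[n],\ \operatorname{pat}_{I}(\sigma)=\pi\right\}}{\binom{n}{k}\,\#\langle\mathcal{S}\rangle_{n}}.\tag{3}$$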

The asymptotic behavior of the numerator and denominator is then obtained with analytic combinatorics, which allows us to transfer from the behavior of a generating series near its singularity to the asymptotic behavior of its coefficients. This goes in three steps.

Step 1: Enumeration. We compute (or characterize by an implicit equation) some generating series. For instance to estimate the denominator of (3) we consider $\sum_{n\geq 1}\#\langle\mathcal{S}\rangle_{n}\ z^{n}$ . We readily use the size-preserving bijection between $\langle\mathcal{S}\rangle$ and the class $\mathcal{T}$ of $\mathcal{S}$ -canonical trees, counted by the number of leaves. Hence the generating function we want to compute is the same as that of $\mathcal{T}$ , denoted $T$ .

Using again the encoding of permutations by trees, the numerator can be described as a number of trees with marked leaves and some conditions on the tree induced by these marked leaves. Obtaining generating functions for such combinatorial classes is possible, and needs the introduction of several intermediary functions which count trees with various constraints, and possibly one marked leaf. This is detailed in Section 4.

Step 2: Singularity analysis. Then we want to know the singular behavior of the generating functions we computed so far. As it turns out, the singular behavior of some intermediate function $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ drives the singular behavior of all the other series. The function $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is characterized by the implicit equation

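$$T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}=z+\Lambda(T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}),\tag{4}$$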

where $\Lambda$ is a known analytic function with radius of convergence $R_{\Lambda}$ that involves $S$ and some rational functions (see Eq. 23 p. 23). Hence the behavior of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ depends on whether there is a point inside the disk of convergence $D(0,R_{\Lambda})$ where $\Lambda^{\prime}=1$ , because around such a critical point, the equation (4) is not invertible. Since $\Lambda$ is a series with positive integer coefficients, it suffices to check the sign of $\Lambda^{\prime}(R_{\Lambda})-1$ , which can easily be translated in terms of the function $S$ . This is where the sign of $S^{\prime}(R_{S})-2/(1+R_{S})^{2}+1$ appears, leading to the three different cases. More precisely (in this informal description, we leave out some conditions on the singularity of $S$ that appear in the critical and degenerate cases):

•

The standard case $S^{\prime}(R_{S})>2/(1+R_{S})^{2}-1$ is equivalent to $\Lambda^{\prime}(R_{\Lambda})>1$ . In this case there is a unique critical point $\tau\in(0,R_{\Lambda})$ . As a result, the radius of convergence of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is $\rho=\tau-\Lambda(\tau)$ , and the analyticity of $\Lambda$ around $\tau=T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}(\rho)$ implies that $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ has a singularity of exponent $1/2$ (Proposition 5.8). Such a behavior is sometimes called branch point in the literature: $\Lambda$ is analytic at $\tau$ but the equation (4) has two solutions (called branches) near $\rho$ and one cannot find an analytic solution in a neighbourhood of $\rho$ . The solution $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is therefore singular at $\rho$ .

•

The degenerate case $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$ is equivalent to $\Lambda^{\prime}(R_{\Lambda})<1$ . In this case there is no critical point in the disk $D(0,R_{\Lambda})$ nor at its boundary. As a result, the unique dominant singularity of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is the point $\rho=R_{\Lambda}-\Lambda(R_{\Lambda})$ where $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}(\rho)$ reaches the singularity $R_{\Lambda}$ of $\Lambda$ . Moreover, $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ has a bounded derivative at its singularity, and so has exponent $\delta>1$ , which is the same as the exponent of $S$ (Lemma 6.3).

•

The critical case $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$ is equivalent to $\Lambda^{\prime}(R_{\Lambda})=1$ . In this case there is no critical point inside the disk $D(0,R_{\Lambda})$ , but the singularity $R_{\Lambda}$ of $\Lambda$ is a critical point. Once again the radius of convergence of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is $\rho=R_{\Lambda}-\Lambda(R_{\Lambda})$ , but $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ has no first derivative at its singularity. Here the exponent of the singularity of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ depends on that of the singularity of $S$ , and belongs to $[1/2,1)$ (see Lemma 7.3).
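The branch-point mechanism of the standard case can be illustrated numerically. The sketch below is not the paper's computation: it uses a stand-in function $\Lambda(u)=u^{2}/(1-u)$ chosen only for illustration (the actual $\Lambda$ is the one of Eq. 23, which is not reproduced here), locates the critical point $\tau$ with $\Lambda^{\prime}(\tau)=1$ by bisection, and then evaluates $\rho=\tau-\Lambda(\tau)$ . All function names are hypothetical.

```python
import math


def lambda_toy(u):
    """Stand-in for Lambda (an assumption for illustration): u^2 / (1 - u)."""
    return u * u / (1.0 - u)


def lambda_toy_prime(u):
    """Derivative of the stand-in: (2u - u^2) / (1 - u)^2."""
    return (2.0 * u - u * u) / (1.0 - u) ** 2


def critical_point(deriv, upper, tol=1e-13):
    """Bisection for tau in (0, upper) with deriv(tau) = 1.

    Assumes deriv is increasing on (0, upper), deriv(0) < 1 < deriv(upper),
    which holds for a power series with nonnegative coefficients in the
    standard case.
    """
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if deriv(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


tau = critical_point(lambda_toy_prime, upper=0.999)
rho = tau - lambda_toy(tau)
print(tau, rho)
```

For this stand-in, the values can be checked by hand: $\tau=1-1/\sqrt{2}$ and $\rho=3-2\sqrt{2}$ .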

Once we have found the asymptotic behavior of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , we should analyze the tree series found in Step 1. It is purely routine from an analytic point of view, but involves some combinatorial arguments, regarding the encoding of permutations by substitution trees.

Step 3: Transfer. Finally we use a transfer theorem of analytic combinatorics (Theorem A.3) to translate the singularity exponents we found in Step 2 into a limiting behavior for (3). Informally, a square-root singularity, which is the same as in "usual" families of trees, will lead to the Brownian separable permuton. A singularity of exponent in $(1/2,1)$ will lead to the $\delta$ -stable tree, where $\delta\in(1,2)$ is the inverse of the exponent. A singularity of exponent $\delta>1$ will invariably lead to the degenerate case.

1.8. Organization of the paper

The paper is organized as follows (see also Fig. 5).

•

Section 2 is devoted to proving useful results on the convergence of random permutons. The proofs heavily rely on previous estimates for deterministic permutons [36]. We believe that these general results regarding random permutons are interesting on their own, therefore these are presented in a self-contained way.

•

In Sections 3 and 4, we prove nonasymptotic enumeration results for the number of permutations encoded by some given families of (decorated) trees. The main result is Proposition 4.5, which is the first step towards the estimation of $\mathbb{E}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\right]$ .

•

In Sections 5, 6, 7 we prove our main results: the convergence of the sequence $(\mu_{\bm{\sigma}_{n}})_{n}$ of permutons. As already mentioned, the quantitative behavior depends on the family $\mathcal{S}$ , more precisely on the sign of $S^{\prime}(R_{S})-2/(1+R_{S})^{2}+1$ :

–

Section 5 is devoted to the standard case $S^{\prime}(R_{S})>2/(1+R_{S})^{2}-1$ . We show in Theorem 5.2 the convergence to the biased Brownian separable permuton.

–

In Section 6, we consider the degenerate case $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$ .

–

In Section 7, we consider the critical case $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$ . This case itself is divided into two subcases, according to whether the exponent $\delta$ (defined in Definition 6.6) is smaller (Section 7.1) or greater (Section 7.2) than $2$ .

•

We postpone to Appendix A many useful results of complex analysis.

•

Finally, Appendix B discusses how Figs. 1, 3 and 4 have been obtained.

2. Convergence of random permutons

In this section, we first recall the terminology of (deterministic) permutons, introduced in [36]. We also adapt their results to obtain criteria for the convergence in distribution of random permutons.

Notation. Since this section involves many different probability spaces, we use a superscript on $\mathbb{P}$ (and similarly on expectation symbols $\mathbb{E}$ ) to record the source of randomness. In the case where the event $A(\bm{u},\bm{v})$ (or the function $H(\bm{u},\bm{v})$ ) depends on two random variables $\bm{u}$ and $\bm{v}$ , we interpret $\mathbb{P}^{\bm{u}}(A(\bm{u},\bm{v}))$ (or $\mathbb{E}^{\bm{u}}[H(\bm{u},\bm{v})]$ ) as the conditional probability (expectation) given $\bm{v}$ .

2.1. Deterministic permutons and extracted permutations

Recall from Section 1.2 that a permuton is a probability measure on the unit square with uniform marginals. To a permutation $\sigma$ of size $n$ , we can associate the permuton $\mu_{\sigma}$ which is essentially the (normalized) diagram of $\sigma$ , where each dot has been replaced with a small square of dimension $1/n\times 1/n$ carrying a mass $1/n$ .
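As an illustration of this construction, the following sketch computes the mass that $\mu_{\sigma}$ gives to a rectangle $[0,a]\times[0,b]$ , under the assumption that the $i$ -th dot of $\sigma$ is replaced by the square $[(i-1)/n,i/n]\times[(\sigma(i)-1)/n,\sigma(i)/n]$ with constant density $n$ . The function name `permuton_mass` is hypothetical. Taking $b=1$ recovers the uniform marginal: the mass of $[0,a]\times[0,1]$ is $a$ .

```python
def permuton_mass(perm, a, b):
    """Mass given by mu_sigma to the rectangle [0, a] x [0, b].

    `perm` is in one-line notation with values 1..n. Each dot carries a
    total mass 1/n, spread uniformly over a square of side 1/n, so the
    density on that square is n.
    """
    n = len(perm)
    total = 0.0
    for i, value in enumerate(perm):  # i is the 0-based position
        # Side lengths of the overlap between the dot's square and the rectangle.
        width = min(max(a - i / n, 0.0), 1.0 / n)
        height = min(max(b - (value - 1) / n, 0.0), 1.0 / n)
        total += n * width * height
    return total


sigma = (2, 4, 1, 3)
print(permuton_mass(sigma, 0.3, 1.0))  # uniform marginal: close to 0.3
print(permuton_mass(sigma, 0.5, 0.5))  # only the first dot lies in this quadrant
```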

Let $\mathcal{M}$ be the set of permutons. We need to equip $\mathcal{M}$ with a topology. We say that a sequence of (deterministic) permutons $(\mu_{n})_{n}$ converges weakly to $\mu$ (simply denoted $\mu_{n}\to\mu$ ) if

$$\int_{[0,1]^{2}}f\,d\mu_{n}\ \longrightarrow\ \int_{[0,1]^{2}}f\,d\mu$$

for every bounded and continuous function $f:[0,1]^{2}\to\mathbb{R}$ . With this topology, $\mathcal{M}$ is compact and metrizable by a metric $d_{\square}$ which has been introduced in [36] (see Lemmas 2.5 and 5.3 in [36]):

[TABLE]

Since $\mathcal{M}$ is compact, Prokhorov’s theorem ensures that the space of probability distributions on $\mathcal{M}$ is compact (for convergences of measure, we refer to [16]).

Recall from Section 1.1 that for $\sigma\in\mathfrak{S}_{n}$ and $\pi\in\mathfrak{S}_{k}$ , we have

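$$\operatorname{\widetilde{occ}}(\pi,\sigma)=\mathbb{P}\left(\operatorname{pat}_{{\bm{I}}_{n,k}}(\sigma)=\pi\right),$$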

where ${\bm{I}}_{n,k}$ is randomly and uniformly chosen among the $\binom{n}{k}$ subsets of $[n]$ with $k$ elements. The random permutation $\operatorname{pat}_{{\bm{I}}_{n,k}}(\sigma)$ is called the induced subpermutation (of size $k$ ) in $\sigma$ . We will define the pattern density $\operatorname{\widetilde{occ}}(\pi,\mu)$ of a pattern $\pi\in\mathfrak{S}_{k}$ in a permuton $\mu$ by analogy with this formula.
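The pattern density of a permutation can be computed by exhaustive enumeration of the $\binom{n}{k}$ subsets, which is only practical for small sizes. The sketch below is a direct transcription of the definition; the function names `pattern` and `occ_density` are hypothetical, permutations are written in one-line notation, positions are 1-based, and $k\leq n$ is assumed. It reproduces the example $\operatorname{pat}_{\{2,5,7\}}(65831247)=312$ .

```python
from fractions import Fraction
from itertools import combinations


def pattern(sigma, positions):
    """pat_I(sigma): relative order of the values of sigma at the 1-based
    positions in I, returned as a permutation of 1..|I|."""
    values = [sigma[i - 1] for i in sorted(positions)]
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)


def occ_density(pi, sigma):
    """Proportion of k-subsets I of [n] with pat_I(sigma) = pi (requires k <= n)."""
    k, n = len(pi), len(sigma)
    subsets = list(combinations(range(1, n + 1), k))
    hits = sum(1 for subset in subsets if pattern(sigma, subset) == tuple(pi))
    return Fraction(hits, len(subsets))


sigma = (6, 5, 8, 3, 1, 2, 4, 7)
print(pattern(sigma, {2, 5, 7}))        # (3, 1, 2)
print(occ_density((2, 1), (3, 1, 2)))   # 2/3
```

Exact fractions are used so that densities over all patterns of a given size sum to exactly $1$ .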

Take a sequence of $k$ random points $(\vec{\mathbf{x}},\vec{\mathbf{y}})=((\bm{x}_{1},\bm{y}_{1}),\dots,(\bm{x}_{k},\bm{y}_{k}))$ in $[0,1]^{2}$ , independently with common distribution $\mu$ . Because $\mu$ has uniform marginals and the $\bm{x}_{i}$ ’s (resp. $\bm{y}_{i}$ ’s) are independent, it holds that the $\bm{x}_{i}$ ’s (resp. $\bm{y}_{i}$ ’s) are almost surely distinct. We denote by $(\bm{x}_{(1)},\bm{y}_{(1)}),\dots,(\bm{x}_{(k)},\bm{y}_{(k)})$ the $x$ -ordered sample of $(\vec{\mathbf{x}},\vec{\mathbf{y}})$ , i.e. the unique reordering of the sequence $((\bm{x}_{1},\bm{y}_{1}),\dots,(\bm{x}_{k},\bm{y}_{k}))$ such that $\bm{x}_{(1)}<\cdots<\bm{x}_{(k)}$ . Then the values $(\bm{y}_{(1)},\cdots,\bm{y}_{(k)})$ are in the same relative order as the values of a unique permutation, that we denote $\operatorname{Perm}(\vec{\mathbf{x}},\vec{\mathbf{y}})$ . Since the points are taken at random, $\operatorname{Perm}(\vec{\mathbf{x}},\vec{\mathbf{y}})$ is a random permutation of size $k$ . We call it the induced subpermutation (of size $k$ ) in $\mu$ . Then we set

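$$\operatorname{\widetilde{occ}}(\pi,\mu)=\mathbb{P}\left(\operatorname{Perm}(\vec{\mathbf{x}},\vec{\mathbf{y}})=\pi\right).$$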

Rewriting this probability in an integral form, we get immediately:

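$$\operatorname{\widetilde{occ}}(\pi,\mu)=\int_{([0,1]^{2})^{k}}\bm{1}_{\operatorname{Perm}(\vec{x},\vec{y})=\pi}\ \mu(dx_{1}dy_{1})\cdots\mu(dx_{k}dy_{k}),$$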

which identifies $\operatorname{\widetilde{occ}}(\pi,\cdot)$ as a measurable function on the space of permutons.
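The definition of $\operatorname{Perm}(\vec{\mathbf{x}},\vec{\mathbf{y}})$ translates directly into code, and the pattern density in a permuton can then be estimated by Monte Carlo. The sketch below assumes that a sampler for $\mu$ is available; as an example it uses the uniform measure on $[0,1]^{2}$ , for which every pattern of size $k$ has density $1/k!$ . The names `perm_from_points`, `estimate_density` and `uniform_point` are hypothetical.

```python
import random


def perm_from_points(points):
    """Perm(x, y): sort the points by x-coordinate and return the relative
    order of their y-coordinates (coordinates are assumed distinct)."""
    ordered = sorted(points)  # tuples sort by x first
    ys = [y for _, y in ordered]
    ranks = {y: r for r, y in enumerate(sorted(ys), start=1)}
    return tuple(ranks[y] for y in ys)


def estimate_density(pi, sample_point, trials=20000, seed=0):
    """Monte Carlo estimate of the density of pi in the permuton whose
    i.i.d. samples are produced by sample_point(rng)."""
    rng = random.Random(seed)
    k = len(pi)
    hits = 0
    for _ in range(trials):
        points = [sample_point(rng) for _ in range(k)]
        if perm_from_points(points) == tuple(pi):
            hits += 1
    return hits / trials


def uniform_point(rng):
    """One sample from the uniform measure on the unit square."""
    return (rng.random(), rng.random())


print(perm_from_points([(0.7, 0.2), (0.1, 0.9), (0.4, 0.5)]))  # (3, 2, 1)
print(estimate_density((1, 2), uniform_point))  # close to 1/2
```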

In the following, as we consider a random permuton $\bm{\mu}$ , we need to construct a finite sequence of points $(\bm{x}_{1},\bm{y}_{1}),\dots,(\bm{x}_{k},\bm{y}_{k})$ , which are independent with common distribution $\bm{\mu}$ conditionally on $\bm{\mu}$ . This is possible up to considering a new probability space where the joint distribution of $(\bm{\mu},(\bm{x}_{1},\bm{y}_{1}),\ldots,(\bm{x}_{k},\bm{y}_{k}))$ is characterized as follows: for every positive measurable functional $H:\mathcal{M}\times([0,1]^{2})^{k}\to\mathbb{R}$ ,

[TABLE]

In this new probability space, we call ${\vec{\mathbf{m}}_{k}}$ the vector $(\vec{\mathbf{x}},\vec{\mathbf{y}})=(\bm{x}_{i},\bm{y}_{i})_{1\leq i\leq k}$ , and we use the notation $\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})=\operatorname{Perm}(\vec{\mathbf{x}},\vec{\mathbf{y}})$ , to highlight the two levels of randomness.

We end this section by the following two estimates, proved in [36].

Lemma 2.1 (Occurrences in a permutation and its associated permuton [36, Lemma 3.5]).

If $\pi\in\mathfrak{S}_{k}$ and $\sigma\in\mathfrak{S}_{n}$ , then

[TABLE]

Lemma 2.2 (Approximation of a permuton by a permutation [36, Lemma 4.2]).

There is a $k_{0}$ such that if $k>k_{0}$ , for any permuton $\nu$ ,

[TABLE]

2.2. Random permutons and convergence in distribution

We now consider a sequence of random permutations $(\bm{\sigma}_{n})$ (with $\bm{\sigma}_{n}$ of size $n$ ). An example of interest for the present paper is when, for each $n\geq 1$ , $\bm{\sigma}_{n}$ is a uniform random permutation of size $n$ in a given class $\mathcal{C}$ . Another example is given by the random permutations $(\bm{\sigma}_{n})_{n\geq 1}=(\operatorname{Perm}({\vec{\mathbf{m}}_{n}},\bm{\mu}))_{n\geq 1}$ constructed above from a given random permuton $\bm{\mu}$ . In the case where $\bm{\mu}$ is deterministic, these correspond to the $Z$ -random permutations from [36], used to prove that each permuton is the limit of some permutation sequence.

Taking ${\bm{I}}_{n,k}$ independently from $(\bm{\sigma}_{n})$ , we have for every $\pi$ of size $k$ :

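$$\mathbb{E}^{\bm{\sigma}_{n}}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\right]=\mathbb{P}^{\bm{\sigma}_{n},{\bm{I}}_{n,k}}\left(\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\sigma}_{n})=\pi\right).\tag{7}$$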

Similarly, for a random permuton $\bm{\mu}$ , we have

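$$\mathbb{E}^{\bm{\mu}}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\mu})\right]=\mathbb{P}^{\bm{\mu},{\vec{\mathbf{m}}_{k}}}\left(\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})=\pi\right).$$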

This is a consequence of (6) above, applied to $H(\mu,(x_{1},y_{1}),\ldots,(x_{k},y_{k}))=\bm{1}_{\operatorname{Perm}(\vec{x},\vec{y})=\pi}$ and combined with (5). The same argument may be applied to

[TABLE]

yielding a randomized version of Lemma 2.2.

Lemma 2.3 (Approximation of a random permuton by a random permutation).

There is a $k_{0}$ such that if $k>k_{0}$ , for any random permuton $\bm{\nu}$ ,

[TABLE]

This result has an important consequence for the distribution of random permutons.

Proposition 2.4 (Subpermutations characterize the distribution of $\bm{\mu}$ ).

Let $\bm{\mu}$ , $\bm{\mu}^{\prime}$ be two random permutons. If there exists $k_{1}$ such that for $k\geq k_{1}$ and every $\pi$ of size $k$ we have

[TABLE]

then $\bm{\mu}\stackrel{{\scriptstyle d}}{{=}}\bm{\mu}^{\prime}$ .

Proof.

We need to prove that $\mathbb{E}^{\bm{\mu}}[\phi(\bm{\mu})]=\mathbb{E}^{\bm{\mu}^{\prime}}[\phi(\bm{\mu}^{\prime})]$ for every bounded and continuous function $\phi:\mathcal{M}\to\mathbb{R}$ . Fix $k\geq k_{1}$ . It holds that

[TABLE]

where ${\vec{\mathbf{m}}_{k}}^{\prime}$ denotes a sequence of $k$ independent points with common distribution $\bm{\mu}^{\prime}$ , conditionally on $\bm{\mu}^{\prime}$ . The second term in the above display is zero by assumption. Moreover, from Lemma 2.3 the first and third terms go to zero when $k\to+\infty$ . ∎

Our main theorem in this section deals with the convergence of sequences of random permutations to a random permuton. It generalizes the result of [36] which states that deterministic permuton convergence is characterized by convergence of pattern densities. We extend their proof to the case of random sequences, where permuton convergence in distribution is characterized by convergence of average pattern densities, or equivalently of the induced subpermutations of any (fixed) size.

Theorem 2.5.

For any $n$ , let $\bm{\sigma}_{n}$ be a random permutation of size $n$ . Moreover, for any fixed $k$ , let ${\bm{I}}_{n,k}$ be a uniform random subset of $[n]$ with $k$ elements, independent of $\bm{\sigma}_{n}$ . The following assertions are equivalent.

(a)

$(\mu_{\bm{\sigma}_{n}})_{n}$ converges in distribution for the weak topology to some random permuton $\bm{\mu}$ .

(b)

The random infinite vector $\big(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\big)_{\pi\in\mathfrak{S}}$ converges in distribution in the product topology to some random infinite vector $(\bm{\Lambda}_{\pi})_{\pi\in\mathfrak{S}}$ .

(c)

For every $\pi$ in $\mathfrak{S}$ , there is a $\Delta_{\pi}\geq 0$ such that

$$\mathbb{E}\left[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\right]\xrightarrow{n\to\infty}\Delta_{\pi}.$$

(d)

For every $k$ , the sequence $\big(\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\sigma}_{n})\big)_{n}$ of random permutations converges in distribution to some random permutation $\bm{\rho}_{k}$ .

Whenever these assertions are verified, we have $(\bm{\Lambda}_{\pi})_{\pi}\stackrel{{\scriptstyle d}}{{=}}(\operatorname{\widetilde{occ}}(\pi,\bm{\mu}))_{\pi}$ and for every $\pi\in\mathfrak{S}_{k}$ ,

[TABLE]

Observation 2.6.

In item (c) above, it is enough to consider all $\pi$ of size at least $2$ . Indeed, for $\pi=1$ , the statement is trivial, since $\operatorname{\widetilde{occ}}(\pi,\cdot)$ is identically $1$ .
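The quantities in Theorem 2.5 can be made concrete on small examples. The sketch below is only illustrative: it assumes that $\operatorname{\widetilde{occ}}(\pi,\sigma)$ is the proportion of $k$-subsets $I$ of positions with $\operatorname{pat}_{I}(\sigma)=\pi$ (which is how it is used in the proof of (c) $\Rightarrow$ (d)), and the function names `pat` and `occ_density` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def pat(I, sigma):
    """Pattern of sigma induced by the set I of 1-based positions."""
    values = [sigma[i - 1] for i in sorted(I)]
    rank = {v: r + 1 for r, v in enumerate(sorted(values))}
    return tuple(rank[v] for v in values)


def occ_density(pi, sigma):
    """Assumed definition: proportion of k-subsets I with pat(I, sigma) == pi."""
    k, n = len(pi), len(sigma)
    hits = sum(1 for I in combinations(range(1, n + 1), k)
               if pat(I, sigma) == tuple(pi))
    return Fraction(hits, comb(n, k))


sigma = (6, 5, 8, 3, 1, 2, 4, 7)
print(pat({2, 5, 7}, sigma))     # (3, 1, 2)
print(occ_density((1,), sigma))  # 1, as in Observation 2.6
# For fixed k, the densities of all patterns of size k sum to 1.
print(sum(occ_density(p, sigma) for p in permutations((1, 2, 3))))  # 1
```

With this definition, $\operatorname{\widetilde{occ}}(\pi,\sigma)$ is exactly the probability that a uniform random $k$-subset of positions induces $\pi$ in $\sigma$.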

**Proof of (a) $\Rightarrow$ (b).** Let $\pi_{1},\ldots,\pi_{r}$ be a finite sequence of patterns. By [36, Lemma 5.3], the map $\mu\mapsto(\operatorname{\widetilde{occ}}(\pi_{i},\mu))_{1\leq i\leq r}$ is continuous. Therefore, $\mu_{\bm{\sigma}_{n}}\stackrel{{\scriptstyle d}}{{\to}}\bm{\mu}$ implies

[TABLE]

Using Lemma 2.1, one can replace each $\operatorname{\widetilde{occ}}(\pi_{i},\mu_{\bm{\sigma}_{n}})$ by $\operatorname{\widetilde{occ}}(\pi_{i},\bm{\sigma}_{n})$ in the above convergence. This proves the convergence in distribution of all finite-dimensional marginals $\big{(}\operatorname{\widetilde{occ}}(\pi_{i},{\bm{\sigma}_{n}})\big{)}_{1\leq i\leq r}$ , and hence of $\big{(}\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\big{)}_{\pi\in\mathfrak{S}}$ in the product topology (see for instance [16, ex. 2.4 p. 19]).

**Proof of (b) $\Rightarrow$ (c).** If $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})\stackrel{{\scriptstyle d}}{{\to}}\bm{\Lambda}_{\pi}$ , as $\operatorname{\widetilde{occ}}$ takes values in $[0,1]$ , we have

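$$\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]\xrightarrow[n\to\infty]{}\mathbb{E}[\bm{\Lambda}_{\pi}].$$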

**Proof of (c) $\Rightarrow$ (d).** Fix $\pi\in\mathfrak{S}_{k}$ and consider the sequence

[TABLE]

which converges if (c) holds (the equality comes from Eq. 7). Since $\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\sigma}_{n})$ is a random variable taking its values in the finite set $\mathfrak{S}_{k}$ , this says exactly that the sequence $\big{(}\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\sigma}_{n})\big{)}_{n}$ converges in distribution.

**Proof of (d) $\Rightarrow$ (a).** Consider a sequence of random permutations $(\bm{\sigma}_{n})$ satisfying (d), i.e. for every $k$ , there is a random permutation $\bm{\rho}_{k}$ such that $\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\sigma}_{n})\stackrel{{\scriptstyle d}}{{\to}}\bm{\rho}_{k}$ . Put differently, for every pattern $\pi$ of size $k$ , we have

[TABLE]

From Lemma 2.1 and Eq. 7, we get

[TABLE]

Set $\bm{\theta}_{k,n}=\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\mu_{\bm{\sigma}_{n}})$ . Then, using Eq. 8, for every $\pi\in\mathfrak{S}_{k}$ , we have

[TABLE]

In other words, $\bm{\theta}_{k,n}\stackrel{{\scriptstyle d}}{{\to}}\bm{\rho}_{k}$ . Since $\mu_{\bm{\rho}_{k}}$ takes its values in a finite set of permutons, this implies

[TABLE]

Let $H:(\mathcal{M},d_{\square})\to\mathbb{R}$ be a bounded continuous functional. It holds that

[TABLE]

The first term can be bounded by introducing the modulus of continuity of $H$ , which is defined as $\omega(\varepsilon)=\sup_{d_{\square}(\xi,\zeta)\leq\varepsilon}|H(\xi)-H(\zeta)|$ . Since $\mathcal{M}$ is compact, $H$ is uniformly continuous, so $\omega(\varepsilon)$ goes to $0$ when $\varepsilon$ goes to $0$ . Hence,

[TABLE]

As for the second term, for $k$ large enough, Lemma 2.3 yields

[TABLE]

Putting things together, we obtain

[TABLE]

Assume that $(\mu_{\bm{\sigma}_{n}})_{n}$ has a subsequence converging in distribution to a random permuton $\bm{\mu}^{\prime}$ . Taking the limit when $n\to\infty$ of (10) along this subsequence, we get

[TABLE]

(Recall indeed that $({\bm{\theta}_{k,n}})_{n}$ converges to ${\bm{\rho}_{k}}$ in distribution.) The right-hand side tends to $0$ when $k$ tends to infinity, which proves that $(\mu_{\bm{\rho}_{k}})_{k}$ converges to $\bm{\mu}^{\prime}$ in distribution as well.

Therefore, all converging subsequences of $(\mu_{\bm{\sigma}_{n}})_{n}$ converge to the same limit $\bm{\mu}^{\prime}$ , which is the limit of $(\mu_{\bm{\rho}_{k}})_{k\geq 1}$ . Thanks to the compactness of the space of probability distributions on $\mathcal{M}$ , this is enough to conclude that $(\mu_{\bm{\sigma}_{n}})$ has indeed a limit. Item (a) is proved.

**Proof of additional statements.** Assume that (a)–(d) hold. That $(\bm{\Lambda}_{\pi})_{\pi}\stackrel{{\scriptstyle d}}{{=}}(\operatorname{\widetilde{occ}}(\pi,\bm{\mu}))_{\pi}$ follows from the proof of (a) $\Rightarrow$ (b). Fix any integer $k$ , and any permutation $\pi$ of size $k$ . The above equality in distribution implies $\mathbb{E}[\bm{\Lambda}_{\pi}]=\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\mu})]$ . That $\Delta_{\pi}=\mathbb{E}[\bm{\Lambda}_{\pi}]$ is clear from the proof of (b) $\Rightarrow$ (c). The equality $\mathbb{P}(\bm{\rho}_{k}=\pi)=\Delta_{\pi}$ follows from the proof of (c) $\Rightarrow$ (d). Finally, $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\mu})]=\mathbb{P}(\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})=\pi)$ comes from Eq. 8. ∎

Remark 2.7.

In some sense, Theorem 2.5 can be seen as an analogue of a theorem of Aldous for random trees [9, Theorem 18]. Both in permutations and trees, there is a natural way to construct a smaller structure from $k$ elements of a big structure (induced subpermutations or subtrees). The goal is then to reduce the convergence of the big structure to the convergence, for each $k$ , of the induced substructures. For trees, we need an extra tightness assumption (that the family of trees is “leaf-tight” in Aldous’ terminology). In our case, since the space of permutons is compact, we do not need such an assumption.

We finish this section with a comment on the existence of random permutons with prescribed induced subpermutations.

Definition 2.8.

A family of random permutations $(\bm{\rho}_{n})_{n}$ is consistent if

i)

for every $n\geq 1$ , $\bm{\rho}_{n}\in\mathfrak{S}_{n}$ ,

ii)

for every $n\geq k\geq 1$ , if $\bm{I}_{n,k}$ is a uniform subset of $[n]$ of size $k$ , independent of $\bm{\rho}_{n}$ , then $\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\rho}_{n})\stackrel{{\scriptstyle d}}{{=}}\bm{\rho}_{k}$ .
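As an illustration of Definition 2.8 (not taken from the paper), uniform random permutations form a consistent family: the pattern induced by a uniform $k$-subset in a uniform permutation of size $n$ is uniform in $\mathfrak{S}_{k}$. The sketch below checks this by exhaustive enumeration for small sizes; the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations
from math import comb, factorial


def pat(I, sigma):
    """Pattern of sigma induced by the set I of 1-based positions."""
    values = [sigma[i - 1] for i in sorted(I)]
    rank = {v: r + 1 for r, v in enumerate(sorted(values))}
    return tuple(rank[v] for v in values)


def induced_pattern_counts(n, k):
    """Count the pairs (sigma, I), with sigma in S_n and I a k-subset of [n],
    according to the induced pattern pat(I, sigma)."""
    counts = Counter()
    for sigma in permutations(range(1, n + 1)):
        for I in combinations(range(1, n + 1), k):
            counts[pat(I, sigma)] += 1
    return counts


counts = induced_pattern_counts(5, 3)
# Each of the 3! patterns is hit the same number of times, n! * C(n, k) / k!.
print(set(counts.values()) == {factorial(5) * comb(5, 3) // factorial(3)})
```

Since all pairs $(\sigma,I)$ are equally likely, equal counts mean that $\operatorname{pat}_{\bm{I}_{5,3}}(\bm{\rho}_{5})$ is uniform on $\mathfrak{S}_{3}$, which is condition ii) for $n=5$, $k=3$.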

It turns out that consistent families of random permutations and random permutons are essentially equivalent:

Proposition 2.9.

If $\bm{\mu}$ is a random permuton, then the family defined by $\bm{\rho}_{k}\stackrel{{\scriptstyle d}}{{=}}\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})$ is consistent. Conversely, for every consistent family of random permutations $(\bm{\rho}_{k})_{k\geq 1}$ , there exists a random permuton $\bm{\mu}$ whose distribution is uniquely determined, such that $\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})\stackrel{{\scriptstyle d}}{{=}}\bm{\rho}_{k}$ . In that case, $\mu_{\bm{\rho}_{n}}\xrightarrow[n\to\infty]{d}\bm{\mu}$ .

Proof.

Set $n\geq k\geq 1$ . The first assertion follows from the following coupled construction of ${\vec{\mathbf{m}}_{n}}$ and ${\vec{\mathbf{m}}_{k}}$ : ${\vec{\mathbf{m}}_{k}}$ is a uniform random subset of ${\vec{\mathbf{m}}_{n}}$ , chosen independently of it. It follows that $\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu})=\operatorname{pat}_{{\bm{I}}_{n,k}}(\operatorname{Perm}({\vec{\mathbf{m}}_{n}},\bm{\mu}))$ , for some random subset ${\bm{I}}_{n,k}$ of $[n]$ . By construction, the distribution of ${\bm{I}}_{n,k}$ is uniform and independent of $\operatorname{Perm}({\vec{\mathbf{m}}_{n}},\bm{\mu})$ . Hence the consistency follows.

The converse is immediate, by applying the implication (d) $\Rightarrow$ (a) and the last assertion of Theorem 2.5 to the sequence $(\bm{\rho}_{k})_{k\geq 1}$ . Consistency ensures that we get the prescribed induced subpermutations, and uniqueness in distribution follows by Proposition 2.4. ∎

3. Coding permutations by trees

3.1. Substitution trees

As seen in Section 1.3 (Theorem 1.5), any permutation $\sigma$ can be recursively decomposed using substitutions in a canonical way and this decomposition can be encoded in a canonical tree. However, if we do not impose conditions on $\theta$ and the $\pi^{(i)}$ ’s (as done in Theorem 1.5), a permutation $\sigma$ may be represented in many ways as a substitution $\sigma=\theta[\pi^{(1)},\dots,\pi^{(d)}]$ , where the $\pi^{(i)}$ ’s themselves may be further decomposed using substitutions. Such decompositions can be recorded in substitution trees.

Definition 3.1.

A rooted planar tree is either a leaf, or consists of a root node $\varnothing$ with an ordered $k$ -tuple of subtrees attached to the root, which are themselves rooted planar trees.

In our context, the size of a tree $t$ is its number of leaves. It is denoted $|t|$ , whereas $\#t$ denotes the number of nodes of $t$ (including both leaves and internal nodes).

Internal vertices of all trees considered in this paper have degree at least $2$ . It is natural (and also convenient for counting purposes in Section 4) to consider that the single leaf of the tree of size $1$ is also its root (and is therefore also denoted $\varnothing$ ).

Since we work with planar trees, we can label their leaves canonically with the integers from $1$ to $|t|$ : the leaf labeled by $i$ is the $i$ th leaf met in the depth-first traversal of $t$ which chooses left before right. A subset of the set of leaves of a tree $t$ is therefore canonically represented by a subset $I$ of $[|t|]$ .

Definition 3.2.

A substitution tree of size $n$ is a labeled rooted planar tree with $n$ leaves, where any internal node with $k\geq 2$ children is labeled by a permutation of size $k$ . Internal nodes with only one child are forbidden.

Internal nodes labeled by the ascending permutation $12\cdots r$ or the descending permutation $r\cdots 21$ (for some $r\geq 2$ ) will play a particular role. Therefore we replace every such label with a $\oplus$ (for ascending permutations) or a $\ominus$ (for descending permutations). Since the size of a label corresponds to the number of children and since there is exactly one ascending (resp. descending) permutation of each size, there is no loss of information in this replacement. Internal nodes labeled $\oplus$ or $\ominus$ are called linear nodes, the other nodes being called nonlinear. Among nonlinear nodes, the ones labeled by simple permutations are called simple nodes.

An example of a substitution tree is shown in Fig. 6, left.

Definition 3.3.

Let $t$ be a substitution tree. We define inductively the permutation $\operatorname{perm}(t)$ associated with $t$ :

•

if $t$ is just a leaf, then $\operatorname{perm}(t)=1$ ;

•

if the root of $t$ has $r\geq 2$ children with corresponding subtrees $t_{1},\ldots,t_{r}$ (from left to right), and is labeled with the permutation $\theta$ , then $\operatorname{perm}(t)$ is the permutation obtained as the substitution of $\operatorname{perm}(t_{1}),\dots,\operatorname{perm}(t_{r})$ in $\theta$ :

$$\operatorname{perm}(t)=\theta[\operatorname{perm}(t_{1}),\dots,\operatorname{perm}(t_{r})].$$

Fig. 6 illustrates this construction. When $\operatorname{perm}(t)=\sigma$ , we say that $t$ is a tree that encodes $\sigma$ , or a tree associated with $\sigma$ . Nonsimple permutations $\sigma$ are encoded by several trees $t$ . However, if we restrict ourselves to canonical trees (which are particular cases of substitution trees; see Definition 1.6), we have uniqueness. Indeed, from Theorem 1.5, to any permutation $\sigma$ we can associate uniquely a canonical tree $t$ such that $\operatorname{perm}(t)=\sigma$ .
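The inductive definition of $\operatorname{perm}(t)$ can be sketched in a few lines. The encoding below is an assumption made for illustration only: a leaf is the string `"leaf"`, an internal node is a pair `(label, children)`, and linear nodes carry their label as an explicit increasing or decreasing permutation instead of $\oplus$ or $\ominus$.

```python
LEAF = "leaf"  # hypothetical encoding; an internal node is (label, [children])


def substitute(theta, blocks):
    """Substitution theta[blocks[0], ..., blocks[d-1]]."""
    sizes = [len(b) for b in blocks]
    result = []
    for i, block in enumerate(blocks):
        # Block i sits above every block j whose letter theta[j] is smaller.
        shift = sum(sizes[j] for j in range(len(theta)) if theta[j] < theta[i])
        result.extend(v + shift for v in block)
    return tuple(result)


def perm(t):
    """Permutation encoded by the substitution tree t."""
    if t == LEAF:
        return (1,)
    theta, children = t
    assert len(theta) == len(children) >= 2
    return substitute(theta, [perm(c) for c in children])


t1 = ((2, 4, 1, 3), [LEAF, ((2, 1), [LEAF, LEAF]), LEAF, LEAF])
print(perm(t1))  # (2, 5, 4, 1, 3)

# A nonsimple permutation is encoded by several substitution trees.
flat = ((1, 2, 3), [LEAF, LEAF, LEAF])
nested = ((1, 2), [((1, 2), [LEAF, LEAF]), LEAF])
print(perm(flat) == perm(nested) == (1, 2, 3))  # True
```

In this toy example, `flat` is the canonical tree of $123$, while `nested` is another substitution tree of the same permutation.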

The remainder of Section 3.1 is devoted to the proof of simple combinatorial lemmas on the structure of the set of substitution trees associated with a given permutation $\sigma$ . These lemmas are useful in Section 5.

We first make the following observation. Take a substitution tree $\tau$ of some permutation $\pi$ with a marked node $v$ labeled by $\theta$ . Consider also a substitution tree $\tau^{\prime}$ of $\theta$ . Then replacing $v$ by the tree $\tau^{\prime}$ yields a new substitution tree $\tau^{\prime\prime}$ of the same permutation $\pi$ . (When doing this replacement the $|\theta|$ subtrees attached to $v$ are glued on the leaves of $\tau^{\prime}$ , respecting their order, see Fig. 7.) This operation will be referred to as the inflation of $v$ with $\tau^{\prime}$ .

Conversely, consider a connected set $A$ of internal nodes in a substitution tree $\tau^{\prime\prime}$ of $\pi$ . From this set we build a substitution tree $\tau^{\prime}$ whose set of internal nodes is $A$ , the ancestor-descendant relation in $\tau^{\prime}$ is inherited from the one in $\tau^{\prime\prime}$ , and we add leaves so that the degree of each node of $A$ is the same in $\tau^{\prime}$ as in $\tau^{\prime\prime}$ . We denote $\theta=\operatorname{perm}(\tau^{\prime})$ . Then merging all nodes in $A$ into a single node labeled by $\theta$ turns $\tau^{\prime\prime}$ into a new substitution tree $\tau$ of the same permutation $\pi$ . We call this a merge operation. For example, the tree $\tau$ of Fig. 7 can be obtained from the tree $\tau^{\prime\prime}$ of the same figure by merging the nodes labeled $132$ and $\ominus$ .

We now consider a last family of substitution trees. An expanded tree is a substitution tree where nonlinear nodes are labeled by simple permutations, while linear nodes are required to be binary.

Lemma 3.4.

Any expanded tree of $\pi$ is obtained from its canonical tree by inflating all nodes labeled by $\oplus$ (resp. $\ominus$ ) with binary trees whose internal nodes are all labeled by $\oplus$ (resp. $\ominus$ ).

Proof.

Let $\tau$ be an expanded tree of $\pi$ . Consider, if any, two adjacent linear nodes of $\tau$ with the same label (either both $\oplus$ or both $\ominus$ ) and merge them. Note that the resulting node will still have label $\oplus$ or $\ominus$ . We repeat this operation until there are no adjacent linear nodes with the same label. Nonlinear nodes in the resulting tree $\tau^{\prime}$ are all labeled by simple permutations: it is the case in $\tau$ (by definition of expanded trees) and we did not create any new nonlinear nodes. Therefore $\tau^{\prime}$ satisfies all conditions of canonical trees (see Definition 1.6). By uniqueness, $\tau^{\prime}$ is the canonical tree of $\pi$ . Reversing the merge operations, $\tau$ can be obtained from $\tau^{\prime}$ by inflating its linear nodes, which proves the lemma. ∎

We recall a fact well-known to combinatorialists: the number of complete binary trees (i.e. plane rooted trees whose internal vertices all have degree $2$ ) with $d$ leaves is $\operatorname{Cat}_{d-1}$ . Therefore each linear node of degree $d$ of the canonical tree can be inflated with a binary tree in $\operatorname{Cat}_{d-1}$ ways. We therefore get the following interesting corollary, regarding the number and properties of expanded trees.

Corollary 3.5.

Let $\pi$ be a permutation and $d_{1},\cdots,d_{r}$ (resp. $e_{1},\cdots,e_{s}$ ) be the degrees of the nodes labeled $\oplus$ (resp. $\ominus$ ) in the canonical tree of $\pi$ . Then

•

the number $\widetilde{N_{\pi}}$ of expanded trees of $\pi$ is $\prod_{i=1}^{r}\operatorname{Cat}_{d_{i}-1}\,\prod_{j=1}^{s}\operatorname{Cat}_{e_{j}-1}$ , where we denote by $\operatorname{Cat}_{k}:=\frac{1}{k+1}\binom{2k}{k}$ the $k$ -th Catalan number, which counts complete binary trees with $k+1$ leaves.

•

each expanded tree of $\pi$ has $\sum_{i=1}^{r}(d_{i}-1)$ nodes labeled $\oplus$ and $\sum_{j=1}^{s}(e_{j}-1)$ nodes labeled $\ominus$ .

•

the labels of the nonlinear nodes in any expanded tree of $\pi$ are the same as in its canonical tree.
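The counts in Corollary 3.5 are easy to evaluate numerically. The following sketch (with hypothetical function names) computes the number of expanded trees from the degrees of the linear nodes of the canonical tree, and checks the Catalan count of complete binary trees against a direct recursion on the left/right split of the leaves.

```python
from functools import lru_cache
from math import comb, prod


def catalan(k):
    """k-th Catalan number."""
    return comb(2 * k, k) // (k + 1)


@lru_cache(maxsize=None)
def complete_binary_trees(d):
    """Number of complete binary plane trees with d leaves, counted by
    splitting the leaves between the left and right subtrees of the root."""
    if d == 1:
        return 1
    return sum(complete_binary_trees(i) * complete_binary_trees(d - i)
               for i in range(1, d))


def count_expanded_trees(plus_degrees, minus_degrees):
    """Product of Cat_{d-1} over the linear nodes of the canonical tree."""
    return prod(catalan(d - 1) for d in list(plus_degrees) + list(minus_degrees))


def linear_node_counts(plus_degrees, minus_degrees):
    """Numbers of nodes labeled plus and minus in any expanded tree."""
    return (sum(d - 1 for d in plus_degrees), sum(e - 1 for e in minus_degrees))


print(all(complete_binary_trees(d) == catalan(d - 1) for d in range(1, 11)))
print(count_expanded_trees([3], [2, 4]))  # Cat_2 * Cat_1 * Cat_3 = 10
print(linear_node_counts([3], [2, 4]))    # (2, 4)
```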

Lemma 3.6.

Any substitution tree of $\pi$ can be obtained from some expanded tree of $\pi$ by merge operations.

Proof.

The proof is similar to that of Lemma 3.4. Starting from any substitution tree of $\pi$ and inflating every node that is neither simple nor binary by an expanded tree encoding its label, we get an expanded tree. Reversing these inflation operations, we can obtain any substitution tree from some expanded tree of $\pi$ , using only merge operations. ∎

3.2. Induced trees

Since permutations are encoded by trees and since we are interested in patterns in permutations, we consider an analogue of patterns in trees: this leads to the notion of induced trees.

Definition 3.7 (First common ancestor).

Let $t$ be a tree, and $u$ and $v$ be two nodes (internal nodes or leaves) of $t$ . The first common ancestor of $u$ and $v$ is the node furthest away from the root $\varnothing$ that appears on both paths from $\varnothing$ to $u$ and from $\varnothing$ to $v$ in $t$ .

The following simple observation allows one to read the relative order of $\sigma_{i}$ and $\sigma_{j}$ in any substitution tree encoding $\sigma$ .

Observation 3.8.

Let $i\neq j$ be two leaves of a substitution tree $t$ and $\sigma=\operatorname{perm}(t)$ . Let $v$ be the first common ancestor of $i,j$ in $t$ and $\theta$ be the permutation labeling $v$ . We define $k$ (resp. $\ell$ ) such that the $k$ -th (resp. $\ell$ -th) child of $v$ is an ancestor of $i$ (resp. $j$ ).

Then $\sigma_{i}>\sigma_{j}$ if and only if $\theta_{k}>\theta_{\ell}$ .

Definition 3.9 (Induced tree).

Let $t$ be a substitution tree, and let $I$ be a subset of the leaves of $t$ . The tree $t_{I}$ induced by $I$ is the substitution tree of size $|I|$ defined as follows. The tree structure of $t_{I}$ is given by:

•

the leaves of $t_{I}$ are the leaves of $t$ labeled by elements of $I$ ;

•

the internal nodes of $t_{I}$ are the nodes of $t$ that are first common ancestors of two (or more) leaves in $I$ ;

•

the ancestor-descendant relation in $t_{I}$ is inherited from the one in $t$ ;

•

the order between the children of an internal node of $t_{I}$ is inherited from $t$ .

The label of an internal node $v$ of $t_{I}$ is defined as follows:

•

if $v$ is labeled by a permutation $\theta$ in $t$ , the label of $v$ in $t_{I}$ is given by the pattern of $\theta$ induced by the children of $v$ having a descendant that belongs to $t_{I}$ (or equivalently, to $I$ ).

A detailed example of the induced tree construction is given in Fig. 8.

Note that if $v$ has label $\oplus$ in $t$ , it also has label $\oplus$ in $t_{I}$ . Indeed, $\oplus$ nodes correspond to increasing permutations and all patterns of increasing permutations are increasing permutations. The same holds with $\ominus$ . The converse is however not true: a node can be linear in $t_{I}$ but nonlinear in $t$ (e.g. the bottommost green node in Fig. 8).

Observation 3.10.

By definition, for any substitution tree $t$ with $k$ leaves and subset $I$ of $[k]$ , $t_{I}$ is a substitution tree. However, if $t$ is a canonical tree, $t_{I}$ is a substitution tree which is not necessarily canonical (see for example Fig. 8).

An important feature of induced trees is the following, which follows from Observation 3.8 and is illustrated in Fig. 9.

Lemma 3.11.

Let $t$ be a substitution tree with $k$ leaves, and $I$ be a subset of $[k]$ . We have

$$\operatorname{pat}_{I}(\operatorname{perm}(t))=\operatorname{perm}(t_{I}).$$

As a consequence of this formula, counting the total number of occurrences of a given pattern in some family of permutations can be reduced to counting the total number of induced trees equal to a given $t_{0}$ in the corresponding family of canonical trees. This is precisely the goal of the next section.
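The induced tree construction of Definition 3.9 and the compatibility between induced trees and patterns can be checked on a small example. The sketch below uses a hypothetical encoding (a leaf is the string `"leaf"`, an internal node is a pair `(label, children)` with an explicit permutation as label) and verifies, for every nonempty subset $I$ of leaves of one tree, that the pattern induced by $I$ in $\operatorname{perm}(t)$ equals $\operatorname{perm}(t_{I})$.

```python
from itertools import combinations

LEAF = "leaf"  # hypothetical encoding; an internal node is (label, [children])


def pattern(values):
    """Standardization of a sequence of distinct integers."""
    rank = {v: r + 1 for r, v in enumerate(sorted(values))}
    return tuple(rank[v] for v in values)


def substitute(theta, blocks):
    sizes = [len(b) for b in blocks]
    out = []
    for i, block in enumerate(blocks):
        shift = sum(sizes[j] for j in range(len(theta)) if theta[j] < theta[i])
        out.extend(v + shift for v in block)
    return tuple(out)


def perm(t):
    if t == LEAF:
        return (1,)
    theta, children = t
    return substitute(theta, [perm(c) for c in children])


def induced(t, I, first=1):
    """Return (t_I, number of leaves of t). The leaves of t are numbered
    first, first + 1, ... from left to right; t_I is None if no leaf is in I."""
    if t == LEAF:
        return (LEAF if first in I else None), 1
    theta, children = t
    kept, kept_pos, n = [], [], 0
    for pos, child in enumerate(children):
        sub, size = induced(child, I, first + n)
        n += size
        if sub is not None:
            kept.append(sub)
            kept_pos.append(pos)
    if not kept:
        return None, n
    if len(kept) == 1:  # not a first common ancestor of leaves in I
        return kept[0], n
    return (pattern([theta[p] for p in kept_pos]), kept), n


t = ((1, 2), [((2, 4, 1, 3), [LEAF, ((2, 1), [LEAF, LEAF]), LEAF, LEAF]), LEAF])
sigma = perm(t)  # (2, 5, 4, 1, 3, 6)
t_I, _ = induced(t, {2, 3, 5})
print(t_I)  # ((2, 1), [((2, 1), ['leaf', 'leaf']), 'leaf'])
ok = all(pattern([sigma[i - 1] for i in I]) == perm(induced(t, set(I))[0])
         for k in range(1, 7) for I in combinations(range(1, 7), k))
print(ok)  # True
```

For $I=\{2,3,5\}$, the node labeled $2413$ in `t` becomes a node labeled $21$ in the induced tree: a node can be linear in $t_{I}$ while being nonlinear in $t$.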

4. Exact enumeration of various families of trees

Let $\mathcal{S}$ be a fixed family of simple permutations. Recall that its generating function is

$$S(z)=\sum_{\alpha\in\mathcal{S}}z^{|\alpha|}=\sum_{n}s_{n}z^{n},$$

where $s_{n}$ is the number of permutations of size $n$ in $\mathcal{S}$ . An $\mathcal{S}$ -canonical tree is any canonical tree whose simple nodes carry labels in $\mathcal{S}$ . We denote by $\mathcal{T}$ the combinatorial class of $\mathcal{S}$ -canonical trees, the size $|t|$ of a tree $t$ being its number of leaves. Recall that $\langle\mathcal{S}\rangle$ is by definition the set of permutations whose canonical tree is in $\mathcal{T}$ . Since canonical trees encode permutations in a unique way, $\operatorname{perm}$ defines a size-preserving bijection between $\mathcal{T}$ and $\langle\mathcal{S}\rangle$ . Both have therefore the same generating function which we denote by

$$T(z)=\sum_{t\in\mathcal{T}}z^{|t|}=\sum_{\sigma\in\langle\mathcal{S}\rangle}z^{|\sigma|}.$$

In Section 4.1 below, we explain how to compute $T(z)$ starting from the datum $S(z)$ . We then study families of $\mathcal{S}$ -canonical trees with one marked leaf, with constraints on the root and/or on the marked leaf. These are building blocks for Section 4.2, where we consider the family of $\mathcal{S}$ -canonical trees with $k$ marked leaves, inducing a given tree $t_{0}$ .

4.1. Generating functions of $\mathcal{S}$ -canonical trees (possibly with marked leaves)

In order to compute $T(z)$ in terms of $S(z)$ , we need to introduce the auxiliary family $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ (resp. $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ ) of $\mathcal{S}$ -canonical trees with a root (always denoted $\varnothing$ ) that is not labeled $\oplus$ (resp. $\ominus$ ), and its generating function $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ (resp. $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ ):

[TABLE]

Note that replacing all labels $\ominus$ by $\oplus$ and $\oplus$ by $\ominus$ defines an involution on $\mathcal{S}$ -canonical trees. This implies in particular $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}=T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ and will be used to get other similar identities below.

Proposition 4.1.

Together with the condition $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(0)=0$ , the generating function $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is determined by the following implicit equation

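$$T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}=z+\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{2}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}+S\left(\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}\right).\tag{11}$$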

The main series $T$ is then simply given in terms of $\,T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ by

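$$T=\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}.\tag{12}$$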

Proof.

A tree of $\mathcal{T}$ is either a leaf, or a root labeled $\oplus$ and a sequence of at least two trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , or a root labeled $\ominus$ and a sequence of at least two trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ , or a root labeled by $\alpha\in\mathcal{S}$ and a sequence of $|\alpha|$ unconstrained trees. Therefore

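$$T=z+\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{2}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}+\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{2}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}}+S(T).$$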

Similarly,

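$$T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}=z+\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{2}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}}+S(T).\tag{13}$$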

By combining these two equations we get $T=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}+\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{2}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}=\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}$ , that is Eq. 12. Substituting it back in Eq. 13 gives Eq. 11.

Observe that, under the assumption $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(0)=0$ , Eq. 11 allows one to compute inductively the coefficients of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ . Hence $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is uniquely determined by $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(0)=0$ and Eq. 11, as claimed. ∎
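The inductive computation of coefficients mentioned in the proof can be sketched with truncated power series. This is only an illustration under an assumption: writing $u=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$, the implicit equation is taken to be $u=z+\frac{u^{2}}{1-u}+S\big(\frac{u}{1-u}\big)$, which is what the decomposition in the proof yields. The function names are hypothetical, and series are plain lists of integer coefficients.

```python
def mul(a, b, N):
    """Product of two series truncated at order N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c


def geom(u, N):
    """1 / (1 - u) truncated at order N, for a series u with u[0] == 0."""
    res = [1] + [0] * N
    power = [1] + [0] * N
    for _ in range(N):
        power = mul(power, u, N)
        res = [x + y for x, y in zip(res, power)]
    return res


def compose(s, t, N):
    """S(t) truncated at order N, with s[n] the coefficient of z^n in S."""
    res = [0] * (N + 1)
    power = [1] + [0] * N
    for n in range(N + 1):
        if n < len(s) and s[n]:
            res = [x + s[n] * y for x, y in zip(res, power)]
        power = mul(power, t, N)
    return res


def tree_series(s, N):
    """Coefficients up to order N of (T_not_plus, T), by fixed-point iteration;
    each pass determines at least one more coefficient."""
    u = [0] * (N + 1)
    for _ in range(N + 1):
        inv = geom(u, N)
        T = mul(u, inv, N)
        rhs = [x + y for x, y in zip(mul(mul(u, u, N), inv, N), compose(s, T, N))]
        rhs[1] += 1  # the term z
        u = rhs
    return u, mul(u, geom(u, N), N)


# No simple permutations: the class of separable permutations.
print(tree_series([0], 6)[1][1:])                    # [1, 2, 6, 22, 90, 394]
# All simple permutations of size 4, 5, 6 (there are 2, 6 and 46 of them).
print(tree_series([0, 0, 0, 0, 2, 6, 46], 6)[1][1:])  # [1, 2, 6, 24, 120, 720]
```

The second call recovers $n!$ up to size $6$, as expected when $\mathcal{S}$ contains all simple permutations and $\langle\mathcal{S}\rangle$ is the set of all permutations.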

We now consider trees with a marked leaf. As before, subscripts indicate a constraint on the root. The generating function of trees with a marked leaf counted by their number of unmarked leaves is obtained by differentiating the generating function of trees without marked leaf: $T^{\prime}$ , $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ . Indeed,

[TABLE]

Accordingly, we denote by $\mathcal{T}^{\prime}$ , $\mathcal{T}^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and $\mathcal{T}^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ the families of trees counted by $T^{\prime}$ , $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ .

Consistently, we use superscripts when we consider families of trees with a marked leaf that satisfies an additional constraint, and similarly for their generating function. We say that a leaf is $\oplus$ -replaceable (resp. $\ominus$ -replaceable) if it may be replaced by a tree whose root is labeled $\oplus$ (resp. $\ominus$ ) without violating the definition of canonical trees (see the third item in Definition 1.6). In other words, its parent (if it exists) should be labeled by $\ominus$ or by a simple permutation (resp. by $\oplus$ or by a simple permutation). We then denote $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ (resp. $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{-}$ ) the families of trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ with a $\oplus$ -replaceable marked leaf (resp. a $\ominus$ -replaceable marked leaf). Similar definitions hold for $\mathcal{T}^{+}$ , $\mathcal{T}^{-}$ , $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ and $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{-}$ .

As for $T^{\prime}$ , we take the convention that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ and all generating functions with superscript count trees according to the number of unmarked leaves. By definition, $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{-}$ has constant coefficient $1$ (corresponding to the tree consisting of a single leaf). We however take the convention that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ has constant coefficient $0$ : in other words, the single leaf is excluded from the family $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ (intuitively, a single leaf cannot be replaced by a tree with root labeled $\oplus$ , since the trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ should not have a root labeled $\oplus$ ).

Proposition 4.2.

The generating functions $T^{+}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ are given by the following formulas:

[TABLE]

where $T$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ are given by Eq. 12 and $W=(\tfrac{1}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}})^{2}-1$ .

Quantities with a minus superscript are obtained by symmetry: $T^{-}=T^{+}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{-}=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{-}=T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ .

Proof.

Consider a tree $t$ in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ . As explained above, $|t|\neq 1$ and we distinguish cases according to the label of the root of $t$ , which may be either $\ominus$ or a simple permutation.

i)

The root of $t$ is labeled $\ominus$ (see left of Fig. 10). Then $t$ can be decomposed as a tree in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ (which may be a single leaf) and a nonempty pair of sequences of unmarked trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ .

ii)

The root of $t$ is labeled by a simple permutation $\alpha\in\mathcal{S}$ of size $d$ (see right of Fig. 10). Then $t$ can be decomposed as a $d$ -tuple of unconstrained trees, with one of them having a $\oplus$ -replaceable marked leaf.

Therefore we have

[TABLE]

where $W=(\tfrac{1}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}})^{2}-1$ counts nonempty pairs of sequences of unmarked trees in $\mathcal{T}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ (since $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ ).

Similarly, we have

[TABLE]

The above three equations form a system with three indeterminates: $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ and $T^{+}$ ( $W$ and $T$ are known thanks to Eq. 12). Solving this system gives Eqs. 14, 15 and 16.

The symmetry argument giving $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{-}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{-}$ and $T^{-}$ consists as before in exchanging $\ominus$ and $\oplus$ labels in $\mathcal{S}$ -canonical trees. ∎
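For the reader's convenience, here is a short worked expansion of the series $W$ used in the proof above. It only restates why $W$ counts nonempty pairs of sequences of trees counted by $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$: a pair of sequences of lengths $a$ and $b$ contributes $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{a+b}$, and the pair of empty sequences is removed.

```latex
\begin{align*}
W &= \sum_{\substack{a,b\geq 0\\ (a,b)\neq(0,0)}}
      T_{\mathrm{not}\oplus}^{\,a}\,T_{\mathrm{not}\oplus}^{\,b}
   = \Big(\sum_{a\geq 0}T_{\mathrm{not}\oplus}^{\,a}\Big)^{2}-1 \\
  &= \Big(\frac{1}{1-T_{\mathrm{not}\oplus}}\Big)^{2}-1
   = \sum_{m\geq 1}(m+1)\,T_{\mathrm{not}\oplus}^{\,m}.
\end{align*}
```

The coefficient $m+1$ is the number of ways to split $m$ trees into a left and a right sequence.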

4.2. Generating function counting trees with marked leaves inducing a given tree

To enumerate trees with marked leaves inducing a given tree, we introduce another kind of generating functions. Recall that for any permutations $\alpha$ and $\theta$ , $\mathrm{occ}(\theta,\alpha)$ is the number of occurrences of $\theta$ in $\alpha$ . For a permutation $\theta$ , we set

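$$\operatorname{Occ}_{\theta}(z)=\sum_{\alpha\in\mathcal{S}}\mathrm{occ}(\theta,\alpha)\,z^{|\alpha|-|\theta|}.$$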

Observation 4.3.

For $d\geq 1$ and any fixed $\alpha$ , $\sum_{\theta\in\mathfrak{S}_{d}}\mathrm{occ}(\theta,\alpha)=\binom{|\alpha|}{d}$ . Therefore $\sum_{\theta\in\mathfrak{S}_{d}}\operatorname{Occ}_{\theta}$ is related to the $d$ -th derivative of $S$ by $\sum_{\theta\in\mathfrak{S}_{d}}\operatorname{Occ}_{\theta}=\tfrac{S^{(d)}}{d!}$ . This implies that the radius of convergence of each $\operatorname{Occ}_{\theta}$ is at least $R_{S}$ , the radius of convergence of $S$ .
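The identity of Observation 4.3 can be checked on a finite family of permutations. The sketch below is an illustration under an assumption: $\operatorname{Occ}_{\theta}$ is taken to be $\sum_{\alpha\in\mathcal{S}}\mathrm{occ}(\theta,\alpha)\,z^{|\alpha|-|\theta|}$, which is the normalization compatible with the stated relation to $S^{(d)}/d!$. Series are stored as dictionaries mapping exponents to coefficients, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations
from math import comb


def occ(theta, alpha):
    """Number of occurrences of the pattern theta in alpha."""
    count = 0
    for positions in combinations(range(len(alpha)), len(theta)):
        values = [alpha[p] for p in positions]
        rank = {v: r + 1 for r, v in enumerate(sorted(values))}
        if tuple(rank[v] for v in values) == tuple(theta):
            count += 1
    return count


def occ_series(theta, simples):
    """Coefficients of Occ_theta (assumed exponent |alpha| - |theta|)."""
    series = Counter()
    for alpha in simples:
        c = occ(theta, alpha)
        if c:
            series[len(alpha) - len(theta)] += c
    return series


def scaled_derivative(simples, d):
    """Coefficients of S^(d) / d! for S(z) = sum over alpha of z^|alpha|."""
    series = Counter()
    for alpha in simples:
        if len(alpha) >= d:
            series[len(alpha) - d] += comb(len(alpha), d)
    return series


def total_occ_series(simples, d):
    """Sum of Occ_theta over all theta of size d."""
    total = Counter()
    for theta in permutations(range(1, d + 1)):
        total.update(occ_series(theta, simples))
    return total


simples = [(2, 4, 1, 3), (3, 1, 4, 2), (2, 4, 1, 5, 3)]
print(dict(total_occ_series(simples, 2)) == dict(scaled_derivative(simples, 2)))
```

Here $S(z)=2z^{4}+z^{5}$ for the chosen family, so for $d=2$ both sides equal $12z^{2}+10z^{3}$.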

Fix a substitution tree $t_{0}$ with $k$ leaves. Let us call $\mathcal{T}_{t_{0}}$ the family of $\mathcal{S}$ -canonical trees $t$ with $k$ marked leaves $\vec{\ell}=(\ell_{1},\dots,\ell_{k})$ such that these leaves induce $t_{0}$ :

[TABLE]

We define the size of an object $(t,\vec{\ell})$ as the number of leaves in $t$ (both marked and unmarked). The corresponding generating series is denoted $T_{t_{0}}(z)$ .

Let $(t,\vec{\ell})\in\mathcal{T}_{t_{0}}$ . As noted after the definition of induced trees, a nonlinear node of $t_{\vec{\ell}}$ has to come from a nonlinear node of $t$ , whereas a linear node of $t_{\vec{\ell}}$ may come from a linear or a nonlinear node of $t$ . In order to ease the enumeration, we partition $\mathcal{T}_{t_{0}}$ according to the set of nodes of $t_{\vec{\ell}}=t_{0}$ coming from nonlinear nodes of $t$ (that is, simple nodes of $t$ since $t$ is canonical).

More formally, let $\mathrm{Int}(t)$ be the set of internal nodes of a tree $t$ . With each $(t,\vec{\ell})\in\mathcal{T}_{t_{0}}$ , we associate the set $\mathrm{FCA}(\vec{\ell})\subseteq\mathrm{Int}(t)$ of the first common ancestors of $\vec{\ell}$ in $t$ . From the definition of induced tree, a node $v$ in $t_{0}$ corresponds to a unique node in $\mathrm{FCA}(\vec{\ell})$ , that we denote $\varphi(v)$ . For $V_{s}\subseteq\mathrm{Int}(t_{0})$ , let

[TABLE]

Clearly $\mathcal{T}_{t_{0},V_{s}}$ is nonempty if and only if $V_{s}$ contains every nonlinear node of $t_{0}$ . An example of a marked tree $(t,\vec{\ell})$ with the corresponding pair $(t_{0},V_{s})$ is shown in Fig. 11. In pictures, we will always circle nodes $v$ in $V_{s}$ and the corresponding nodes $\varphi(v)$ in $t$ .

Definition 4.4.

A decorated tree is a pair $(t_{0},V_{s})$ where

•

$t_{0}$ is a substitution tree;

•

$V_{s}$ is a subset of $\mathrm{Int}(t_{0})$ that contains all nonlinear nodes.

Therefore we have the following decomposition:

[TABLE]

Let $(t_{0},V_{s})$ be a decorated tree. We consider the generating function $T_{t_{0},V_{s}}$ of $\mathcal{T}_{t_{0},V_{s}}$ , the size of $(t,\vec{\ell})$ being its number of leaves (both marked and unmarked):

[TABLE]

To compute $T_{t_{0},V_{s}}$ , we introduce some notation. For every internal node $v$ of $t_{0}$ , let

•

$\theta_{v}$ be the permutation labeling $v$ ,

•

$d^{\prime}_{v}$ be its number of children which are leaves or in $V_{s}$ ,

•

$d^{+}_{v}$ be its number of children which are not in $V_{s}$ and are labeled by $\oplus$ ,

•

$d^{-}_{v}$ be its number of children which are not in $V_{s}$ and are labeled by $\ominus$ ,

•

$d_{v}=d^{\prime}_{v}+d^{+}_{v}+d^{-}_{v}$ be its total number of children.

We also set the type of root to be $\prime$ if the root of $t_{0}$ is in $V_{s}$ , and $+$ (resp. $-$ ) if the root is not in $V_{s}$ and labeled $\oplus$ (resp. $\ominus$ ).
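To fix ideas, the quantities $d^{\prime}_{v}$, $d^{+}_{v}$, $d^{-}_{v}$ and the type of the root can be computed mechanically from a decorated tree. The data layout below is purely hypothetical (a leaf is the string `"leaf"`, an internal node is a dictionary recording its label, its membership in $V_{s}$ and its children, with `"+"` and `"-"` standing for $\oplus$ and $\ominus$); it is a sketch of the definitions, not part of the paper.

```python
LEAF = "leaf"


def node(label, in_vs, children):
    """Hypothetical record for an internal node of a decorated tree (t0, Vs)."""
    return {"label": label, "in_vs": in_vs, "children": children}


def child_counts(v):
    """Return (d'_v, d+_v, d-_v) for an internal node v."""
    d_prime = d_plus = d_minus = 0
    for c in v["children"]:
        if c == LEAF or c["in_vs"]:
            d_prime += 1   # child is a leaf or belongs to Vs
        elif c["label"] == "+":
            d_plus += 1    # child not in Vs, labeled plus
        elif c["label"] == "-":
            d_minus += 1   # child not in Vs, labeled minus
        else:
            raise ValueError("a nonlinear node must belong to V_s")
    return d_prime, d_plus, d_minus


def root_type(root):
    """Type of the root: prime if it is in Vs, otherwise its linear label."""
    return "'" if root["in_vs"] else root["label"]


root = node("-", True, [
    LEAF,
    node("+", False, [LEAF, LEAF]),
    node("+", True, [LEAF, LEAF, LEAF]),
    node("-", False, [LEAF, LEAF]),
])
print(child_counts(root), root_type(root))  # (2, 1, 1) '
```

In this example $d_{v}=d^{\prime}_{v}+d^{+}_{v}+d^{-}_{v}=4$ for the root, which is its total number of children.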

Proposition 4.5 (Enumeration of trees with marked leaves inducing a given decorated tree).

Let $(t_{0},V_{s})$ be a decorated tree and $k$ be its number of leaves. Then

[TABLE]

where

[TABLE]

Proof.

The proof is based on a decomposition of marked $\mathcal{S}$ -canonical trees $(t,\vec{\ell})$ of $\mathcal{T}_{t_{0},V_{s}}$ followed by a study of the series $A_{v}$ depending on the type of the node $\varphi(v)$ in $t$ .

**First step: Decomposing a tree in $\mathcal{T}_{t_{0},V_{s}}$ .**

We fix a decorated tree $(t_{0},V_{s})$ with $k$ leaves and a marked $\mathcal{S}$ -canonical tree $(t,\vec{\ell})\in\mathcal{T}_{t_{0},V_{s}}$ . We want to decompose $t$ into subtrees, one for each internal node of $t_{0}$ plus one attached to the root of $t$ . Recall that $\varphi:\mathrm{Int}(t_{0})\to\mathrm{FCA}(\vec{\ell})\subseteq\mathrm{Int}(t)$ is the correspondence between the internal nodes of $t_{0}$ and the set of first common ancestors of leaves $\vec{\ell}$ in $t$ .

For every internal node $v$ of $t_{0}$ , let $t_{v}$ be the subtree of $t$ defined as follows.

•

The root of $t_{v}$ is $\varphi(v)$ .

•

The nodes of $t_{v}$ are descendants of $\varphi(v)$ .

•

A descendant of $\varphi(v)$ in $t$ belongs to $t_{v}$ if and only if its first proper ancestor in $\mathrm{FCA}(\vec{\ell})$ is $\varphi(v)$ (proper meaning different from the node itself).

Moreover we define $t_{B}$ as the subtree of $t$ rooted at the root of $t$ and containing the nodes of $t$ having no proper ancestor in $\mathrm{FCA}(\vec{\ell})$ ; $B$ stands for “bottom” and is used here as a symbol, not as a variable. (If $\varphi$ maps the root of $t_{0}$ to the root of $t$ , then $t_{B}$ is reduced to a leaf.)

A schematic representation of the trees $t_{v}$ and $t_{B}$ is given in Fig. 12.

By definition, a node $u$ of $t$ that is not in $\mathrm{FCA}(\vec{\ell})$ belongs to exactly one $t_{v}$ . On the contrary, if $u$ is in $\mathrm{FCA}(\vec{\ell})$ , then $u$ is the root of $t_{\varphi^{-1}(u)}$ and is a leaf of another $t_{v}$ , where $v$ is the parent of $\varphi^{-1}(u)$ . (If $\varphi^{-1}(u)$ is the root of $t_{0}$ , then there is no such $v$ , and $u$ is a leaf of $t_{B}$ .)

By construction of $t_{v}$ and $t_{B}$ , their leaves are either leaves of $t$ or belong to $\mathrm{FCA}(\vec{\ell})$ . We mark the leaves that belong to $\mathrm{FCA}(\vec{\ell})$ or that are marked leaves of $t$ . In this way, the trees $t_{v}$ and $t_{B}$ that we have constructed are marked trees.

The following properties are straightforward to check.

i)

The tree $t_{v}$ is an $\mathcal{S}$-canonical tree with $d_{v}$ marked leaves.

ii)

The root of $t_{v}$ is nonlinear if and only if $v\in V_{s}$.

iii)

The root of $t_{v}$ is $\oplus$ if and only if $v\notin V_{s}$ and $v$ is labeled $\oplus$.

iv)

The root of $t_{v}$ is $\ominus$ if and only if $v\notin V_{s}$ and $v$ is labeled $\ominus$.

v)

The $d_{v}$ marked leaves of $t_{v}$ belong to $d_{v}$ subtrees coming from $d_{v}$ distinct children of the root of $t_{v}$. The pattern induced by the positions of those $d_{v}$ children on the permutation labeling the root of $t_{v}$ is $\theta_{v}$. (For example, in Fig. 11, four marked leaves are branched on the node labeled $362514$ at positions $1,2,5,6$. This implies that the corresponding node in $t_{0}$ is labeled with $\theta_{v}=2413$.)

vi)

Let $w$ be the $i$-th child of $v$ in $t_{0}$. If $w\in\mathrm{Int}(t_{0})\setminus V_{s}$ and its label is $\oplus$ (resp. $\ominus$), then the $i$-th marked leaf of $t_{v}$ must be $\oplus$-replaceable (resp. $\ominus$-replaceable).

The combinatorial class of trees satisfying properties i) to vi) will be denoted $\mathcal{A}_{v}$ .
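Property v) relies on extracting the pattern that a permutation induces on a set of positions. The following minimal sketch illustrates this operation on the example from Fig. 11; the function name `pattern` and the list-based encoding of permutations are illustrative choices, not notation from the paper.

```python
def pattern(perm, positions):
    """Pattern induced by `perm` (one-line notation, values 1..n)
    on the given 1-based positions (hypothetical helper)."""
    # Read the values of perm at the chosen positions, from left to right.
    values = [perm[i - 1] for i in sorted(positions)]
    # Replace each value by its rank among the selected values.
    rank = {v: r for r, v in enumerate(sorted(values), start=1)}
    return [rank[v] for v in values]


# Example of property v): the node labeled 362514 with marked children
# at positions 1, 2, 5, 6 induces the pattern 2413.
print(pattern([3, 6, 2, 5, 1, 4], [1, 2, 5, 6]))  # [2, 4, 1, 3]
```

The selected values are $3,6,1,4$, whose relative order is $2413$, in agreement with the label $\theta_{v}$ given in the text.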

In addition, we observe that $t_{B}$ is an $\mathcal{S}$ -canonical tree with one marked leaf; moreover, if the root of $t_{0}$ is not in $V_{s}$ and is labeled by $\oplus$ (resp. $\ominus$ ), then the marked leaf of $t_{B}$ must be $\oplus$ -replaceable (resp. $\ominus$ -replaceable).

This yields a map

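$$\bm{T}:\ \mathcal{T}_{t_{0},V_{s}}\ \longrightarrow\ \mathcal{T}^{\,\text{type of root}}\times\prod_{v\in\mathrm{Int}(t_{0})}\mathcal{A}_{v},\qquad(t,\vec{\ell})\ \longmapsto\ \bigl(t_{B},(t_{v})_{v\in\mathrm{Int}(t_{0})}\bigr).$$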

We claim that this map is a bijection, and that the inverse map is obtained as follows. Let us be given $t_{B}$ and a collection of trees $t_{v}$ , one for each internal node of $t_{0}$ . We first take $t_{B}$ and glue $t_{\text{root of }t_{0}}$ on it, the root of $t_{\text{root of }t_{0}}$ replacing the marked leaf of $t_{B}$ . We then proceed inductively: if $t_{v}$ has already been glued and $w$ is the $i$ -th child of $v$ , we glue $t_{w}$ on $t_{v}$ , by replacing the $i$ -th marked leaf of $t_{v}$ with the root of $t_{w}$ . This yields a tree $t$ with $k$ marked leaves, denoted $\vec{\ell}$ . This tree is $\mathcal{S}$ -canonical because of items i), iii), iv) and vi) of the definition of $\mathcal{A}_{v}$ : we only glue trees with root $\oplus$ (resp. $\ominus$ ) on $\oplus$ -replaceable leaves (resp. $\ominus$ -replaceable leaves). By construction, the $k$ marked leaves $\vec{\ell}$ induce a tree having the same structure as $t_{0}$ and item v) of the definition of $\mathcal{A}_{v}$ ensures that the labels in the induced tree and in $t_{0}$ do match. Because of item ii), the tree $(t,\vec{\ell})$ is indeed in $\mathcal{T}_{t_{0},V_{s}}$ . We have therefore constructed a map from $\mathcal{T}^{\,\text{type of root}}\times\prod_{v\in\mathrm{Int}(t_{0})}\mathcal{A}_{v}$ to $\mathcal{T}_{t_{0},V_{s}}$ . By construction, this map indeed inverts $\bm{T}$ , and $\bm{T}$ is a bijection.

Let $A_{v}$ be the generating function of the combinatorial class $\mathcal{A}_{v}$, counted by the number of unmarked leaves. If $A_{v}$ satisfies (19), then (18) follows from the fact that $\bm{T}$ is a bijection. Note indeed that the factor $z^{k}$ in (18) comes from the fact that we count marked leaves in the series on the left-hand side but not in the series on the right-hand side (the bijection $\bm{T}$ leaves the number of unmarked leaves invariant).

We are left to show that the generating function $A_{v}$ verifies (19).

**Second step (i): Computing $A_{v}$ when $v\in V_{s}$.**

Recall that the $d_{v}$ marked leaves of any tree $t\in\mathcal{A}_{v}$ belong to $d_{v}$ subtrees coming from $d_{v}$ distinct children of the root of $t$. Since $v\in V_{s}$, the elements $t$ of $\mathcal{A}_{v}$ can be uniquely decomposed as follows (this decomposition is illustrated in Fig. 13).

i)

The root of $t$ should be labeled by a simple permutation $\alpha$ in $\mathcal{S}$; among the $|\alpha|$ children of the root, $d_{v}$ are marked (corresponding to the subtrees containing a marked leaf) and the pattern of $\alpha$ corresponding to the positions of these marked children should be $\theta_{v}$. (In Fig. 13, $\alpha=3142$, the marked leaves are in the first, third and fourth subtrees, and the pattern of $3142$ corresponding to positions $\{1,3,4\}$ is indeed $\theta_{v}=231$.)

ii)

We glue $|\alpha|-d_{v}$ unmarked $\mathcal{S}$-canonical trees with arbitrary roots on the unmarked children of $\alpha$. (In Fig. 13, we have only one such tree, which is glued on the second child of the root.)

iii)

We glue $d_{v}$ $\mathcal{S}$-canonical trees with one leaf marked and an arbitrary root on the marked children of $\alpha$. In addition,

•

for $d^{\prime}_{v}$ of these trees, there is no constraint on the marked leaf. (In Fig. 13, the trees glued on the third and fourth children of the root are unconstrained trees with a marked leaf.)

•

for $d^{+}_{v}$ (resp. $d^{-}_{v}$) of these trees, the marked leaf must be $\oplus$-replaceable (resp. $\ominus$-replaceable). (In Fig. 13, we must glue a tree with a $\ominus$-replaceable marked leaf on the first child of the root.)

The generating functions of the first two steps can be computed as

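$$\sum_{\alpha\in\mathcal{S}}\mathrm{occ}(\theta_{v},\alpha)\,T(z)^{|\alpha|-d_{v}}=\operatorname{Occ}_{\theta_{v}}(T(z)),$$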

where $\operatorname{Occ}_{\theta_{v}}$ is defined in Eq. 17. Indeed, once the label $\alpha$ of the root is chosen, $\mathrm{occ}(\theta_{v},\alpha)$ counts the number of ways to mark children of the root in step i), and $T(z)^{|\alpha|-d_{v}}$ comes from step ii). Step iii) yields an additional factor $(T^{\prime})^{d^{\prime}_{v}}(T^{+})^{d^{+}_{v}}(T^{-})^{d^{-}_{v}}$. This proves the formula (19) in the case where $v$ is in $V_{s}$.
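To make the counting in step i) concrete, here is a small brute-force sketch of $\mathrm{occ}(\theta,\alpha)$ for a finite set of simple permutations. It assumes that $\operatorname{Occ}_{\theta}(z)=\sum_{\alpha\in\mathcal{S}}\mathrm{occ}(\theta,\alpha)\,z^{|\alpha|-|\theta|}$, which is the form suggested by the argument above (Eq. 17 itself is not reproduced in this section); the names `occ` and `occ_series` are hypothetical.

```python
from itertools import combinations


def occ(theta, alpha):
    """Number of occurrences of the pattern `theta` in `alpha`
    (both given as tuples in one-line notation)."""
    k = len(theta)
    count = 0
    for positions in combinations(range(len(alpha)), k):
        values = [alpha[i] for i in positions]
        order = sorted(values)
        # Standardize the selected values to a permutation of 1..k.
        if tuple(order.index(v) + 1 for v in values) == tuple(theta):
            count += 1
    return count


def occ_series(theta, simples, z):
    """Assumed form of Occ_theta(z) for a finite set `simples`:
    sum over alpha of occ(theta, alpha) * z^(|alpha| - |theta|)."""
    return sum(occ(theta, a) * z ** (len(a) - len(theta)) for a in simples)


# Example of Fig. 13: the pattern 231 occurs once in 3142 (positions 1, 3, 4).
print(occ((2, 3, 1), (3, 1, 4, 2)))  # 1
# For alpha = 2413, ascents and inversions are balanced: 3 pairs each.
print(occ((1, 2), (2, 4, 1, 3)), occ((2, 1), (2, 4, 1, 3)))  # 3 3
```

Since every pair of positions is either an occurrence of $12$ or of $21$, the two counts always add up to $\binom{|\alpha|}{2}$.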

**Second step (ii): Computing $A_{v}$ when $v\notin V_{s}$.**

When $v$ is not in $V_{s}$ and labeled by $\oplus$, the elements of the class $\mathcal{A}_{v}$ can be uniquely decomposed as follows (this decomposition is illustrated in Fig. 14).

i)

The root is labeled by $\oplus$.

ii)

We attach to the root $d_{v}$ $\mathcal{S}$-canonical trees whose root is not labeled by $\oplus$, each with one marked leaf. In addition,

•

for $d^{\prime}_{v}$ of these trees, there is no constraint on the marked leaf. (In Fig. 14, the two right-most nonhatched trees attached to the root are trees with an unconstrained marked leaf.)

•

for $d^{+}_{v}$ (resp. $d^{-}_{v}$) of these trees, the marked leaf must be $\oplus$-replaceable (resp. $\ominus$-replaceable). (In Fig. 14, the left-most nonhatched tree attached to the root should have a $\ominus$-replaceable marked leaf.)

iii)

Between and around these $d_{v}$ trees, we attach $d_{v}+1$ possibly empty sequences of unmarked $\mathcal{S}$-canonical trees whose root is not labeled by $\oplus$. (In Fig. 14, each of these sequences is represented by a hatched blob.)

Item i) does not involve any choice. Choices in item ii) are counted by $(T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{\prime})^{d^{\prime}_{v}}(T^{+}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}})^{d^{+}_{v}}(T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}})^{d^{-}_{v}}$ , while item iii) yields a factor $\left(\tfrac{1}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}\right)^{d_{v}+1}$ . This proves the formula (19) in the case where $v$ is not in $V_{s}$ and labeled by $\oplus$ .

The case when $v$ is not in $V_{s}$ and is labeled by $\ominus$ follows by symmetry. This ends the proof of the combinatorial identity (19) and therefore of Proposition 4.5. ∎

5. Asymptotic analysis: The standard case $S^{\prime}(R_{S})>2/(1+R_{S})^{2}-1$

Let $\mathcal{S}$ be a set of simple permutations. The goal of this section is to precisely state, and then prove, Theorem 1.10: the convergence to the biased Brownian separable permuton of uniform random permutations in $\langle\mathcal{S}\rangle$ when $\mathcal{S}$ satisfies Condition (H1).

5.1. Definition of the biased Brownian separable permuton and statement of the theorem

The (unbiased) Brownian separable permuton was defined in [12]. Because the biased Brownian separable permuton is a one-parameter deformation of it, it is useful to first recall some facts about the substitution trees encoding separable permutations and the (unbiased) Brownian separable permuton.

As noted in Section 1.3, the canonical trees (called decomposition trees in [12]) of separable permutations are those whose internal nodes are all labeled $\oplus$ or $\ominus$ . If we consider more generally substitution trees, the following implication still holds: if $\tau$ is a substitution tree whose nodes are labeled $\oplus$ or $\ominus$ , then $\operatorname{perm}(\tau)$ is a separable permutation.

Recall from Section 3.1 that an expanded tree is a substitution tree where nonlinear nodes are labeled by simple permutations, while linear nodes are required to be binary. In the case of separable permutations, we do not have simple nodes, so that expanded trees are binary trees labeled with $\oplus$ and $\ominus$. These are also referred to as separation trees in the literature. Fig. 15 shows a separable permutation together with two separation trees associated with it.

For any separable permutation $\pi$ , we denote by $N_{\pi}$ its number of separation trees. If $\pi$ is not separable, we set $N_{\pi}=0$ . It is shown in [12, Prop. 9.1] that the Brownian separable permuton $\bm{\mu}$ satisfies the following property: for any $k\geq 2$ and any $\pi\in\mathfrak{S}_{k}$ ,

[TABLE]

where, as before, we denote by $\operatorname{Cat}_{k}:=\frac{1}{k}\binom{2k-2}{k-1}$ the $k$-th Catalan number, which counts complete binary trees with $k$ leaves. In other words, the random permutation of size $k$ extracted from $\bm{\mu}$ is distributed like the permutation encoded by a uniform complete binary tree with $k$ leaves, and hence $k-1$ internal nodes, whose signs are chosen uniformly and independently in $\{\oplus,\ominus\}$. In light of Proposition 2.4 and Eq. 8, this characterizes the law of the Brownian separable permuton $\bm{\mu}$ among random permutons.

The biased Brownian separable permuton of parameter $p\in(0,1)$ has a similar characterization, except that the signs are now chosen with a bias. For a separable $\pi$, let $r_{+}(\pi)$ (resp. $r_{-}(\pi)$) be the number of internal nodes labeled $\oplus$ (resp. $\ominus$) in a separation tree of $\pi$. Even if this is not relevant for the present paper, let us observe that $r_{+}(\pi)$ (resp. $r_{-}(\pi)$) is simply the number of ascents (resp. descents) of $\pi$. (To see this, observe that each internal node $v$ of a separation tree is the first common ancestor of exactly one pair of consecutive leaves, namely the right-most leaf of its left subtree and the left-most leaf of its right subtree. These two consecutive leaves, corresponding to consecutive elements of the permutation, form an ascent (resp. a descent) if and only if $v$ is labeled by $\oplus$ (resp. $\ominus$).) In particular, $r_{+}(\pi)$ and $r_{-}(\pi)$ do not depend on the choice of a separation tree (this is also a particular case of Corollary 3.5).
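For small sizes, the quantities $N_{\pi}$, $r_{+}(\pi)$ and $r_{-}(\pi)$ can be checked by exhaustive enumeration. The sketch below is only an illustration: the encoding of a separation tree as the string `"leaf"` or a triple `(sign, left, right)`, and all function names, are assumptions made for this example.

```python
from collections import Counter


def signed_trees(k):
    """All complete binary plane trees with k leaves whose internal
    nodes carry a sign '+' (for oplus) or '-' (for ominus)."""
    if k == 1:
        yield "leaf"
        return
    for i in range(1, k):
        for left in signed_trees(i):
            for right in signed_trees(k - i):
                for sign in "+-":
                    yield (sign, left, right)


def perm(tree):
    """Permutation encoded by a signed binary tree."""
    if tree == "leaf":
        return (1,)
    sign, left, right = tree
    a, b = perm(left), perm(right)
    if sign == "+":  # direct sum: b is placed above a
        return a + tuple(x + len(a) for x in b)
    return tuple(x + len(b) for x in a) + b  # skew sum: a is placed above b


def plus_nodes(tree):
    """Number of internal nodes labeled '+'."""
    if tree == "leaf":
        return 0
    sign, left, right = tree
    return (sign == "+") + plus_nodes(left) + plus_nodes(right)


def ascents(p):
    return sum(p[i] < p[i + 1] for i in range(len(p) - 1))


# N[pi] = number of separation trees of pi, for all pi of size 4.
N = Counter(perm(t) for t in signed_trees(4))
print(len(N), sum(N.values()), N[(1, 2, 3, 4)])  # 22 40 5
print(all(ascents(perm(t)) == plus_nodes(t) for t in signed_trees(4)))  # True
```

For $k=4$ there are $5$ complete binary trees and $2^{3}$ sign choices, hence $40$ signed trees; they encode the $22$ separable permutations of size $4$, and the identity permutation is obtained from each of the $5$ tree shapes with all signs equal to $\oplus$.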

Definition 5.1.

The biased Brownian separable permuton of parameter $p\in(0,1)$ is the random permuton $\bm{\mu}^{(p)}$ characterized by the following relations: for all $k\geq 2$ and all $\pi\in\mathfrak{S}_{k}$ ,

[TABLE]

(Note that the right-hand side is zero if $\pi$ is not separable.)

Several remarks are in order.

•

For $p=1/2$ , we get the unbiased Brownian separable permuton.

•

This characterization of $\bm{\mu}^{(p)}$ is equivalent to the following: for every $k\geq 1$ ,

[TABLE]

where $\bm{b}_{k}^{(p)}$ is a uniform binary planar tree with $k$ leaves, where each internal node is labeled $\oplus$ (resp. $\ominus$ ) with probability $p$ (resp. $1-p$ ), independently from each other.

•

The existence of $\bm{\mu}^{(p)}$ is not immediate from this definition, but according to Proposition 2.9, it suffices to show that $\operatorname{perm}(\bm{b}_{k}^{(p)})$ forms a consistent family of random permutations. This is indeed the case, and follows from the fact that a uniform induced subtree of $\bm{b}_{n}^{(p)}$ of size $k$ is distributed like $\bm{b}_{k}^{(p)}$ (this is, e.g., a consequence of Rémy’s algorithm to generate uniform random binary trees [54]).

•

The definition of $\bm{\mu}^{(p)}$ , and the above argument justifying its existence, are not constructive. For an explicit construction of $\bm{\mu}^{(p)}$ starting from a Brownian excursion, see [42].

•

Knowing a priori that such a permuton exists is not necessary for the proof of our main theorem. Indeed, we will prove that the quantity $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]$ converges to the right-hand side of Eq. 20 (for all patterns $\pi$ ), where $\bm{\sigma}_{n}$ is a uniform permutation in $\langle\mathcal{S}\rangle_{n}$ , and the parameter $p$ depends on $S$ . From Theorem 2.5, this implies the existence of a random permuton $\bm{\mu}^{(p)}$ satisfying (20) (only for the relevant value of $p$ ; not for all $p$ ) and the convergence in distribution of $(\mu_{\bm{\sigma}_{n}})_{n}$ to $\bm{\mu}^{(p)}$ .

Now, we have all the necessary definitions to make explicit the parameter $p$ in the statement of Theorem 1.10, which we restate in full. (Recall also the definition of $\operatorname{Occ}_{\theta}(z)$ from Eq. 17.)

Theorem 5.2.

Let $\mathcal{S}$ be a set of simple permutations such that

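$$R_{S}>0\quad\text{and}\quad\lim_{r\rightarrow R_{S}\atop r<R_{S}}S^{\prime}(r)>\frac{2}{(1+R_{S})^{2}}-1.\tag{H1}$$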

For every $n\geq 1$ , let $\bm{\sigma}_{n}$ be a uniform permutation in $\langle\mathcal{S}\rangle_{n}$ , and let $\mu_{\bm{\sigma}_{n}}$ be the random permuton associated with $\bm{\sigma}_{n}$ . The sequence $(\mu_{\bm{\sigma}_{n}})_{n}$ tends in distribution in the weak convergence topology to the biased Brownian separable permuton $\bm{\mu}^{(p)}$ of parameter $p$ , where

[TABLE]

and $\kappa$ is the unique point in $(0,R_{S})$ such that $S^{\prime}(\kappa)=\tfrac{2}{(1+\kappa)^{2}}-1$ .

Since $S$ is a power series with nonnegative coefficients, $t\mapsto S^{\prime}(t)-\tfrac{2}{(1+t)^{2}}+1$ is increasing and continuous on $[0,R_{S})$ (as the sum of two increasing and continuous functions). It therefore takes all values from $-1$ to some positive number (possibly $+\infty$ ) exactly once. This entails the existence and uniqueness of $\kappa$ .
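Because the function above is increasing and changes sign exactly once, $\kappa$ can be located numerically by bisection. The following sketch assumes that the caller supplies $S^{\prime}$ as a Python function and an upper bound `upper` below $R_{S}$ at which the function is already positive; both parameters, and the name `kappa`, are hypothetical conveniences for this example.

```python
def kappa(S_prime, upper, tol=1e-12):
    """Solve S'(t) = 2/(1+t)^2 - 1 on (0, upper) by bisection.

    `S_prime` is the derivative of the generating series of simples;
    `upper` must satisfy f(upper) > 0 (assumption of this sketch).
    """
    def f(t):
        return S_prime(t) - 2.0 / (1.0 + t) ** 2 + 1.0

    lo, hi = 0.0, upper
    if f(hi) <= 0:
        raise ValueError("f(upper) must be positive")
    # f(0) = -1 < 0 < f(upper) and f is increasing: bisect.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Separable permutations: S = 0, so 2/(1+kappa)^2 = 1 and kappa = sqrt(2) - 1.
k_sep = kappa(lambda t: 0.0, 1.0)
# S = {2413}: S(z) = z^4, hence S'(z) = 4 z^3.
k_2413 = kappa(lambda t: 4 * t ** 3, 1.0)
print(k_sep, k_2413, k_sep / (1 + k_sep))
```

The last printed value is $\kappa/(1+\kappa)$ for separable permutations. Adding simple permutations can only increase $S^{\prime}$, so the root for $\mathcal{S}=\{2413\}$ is smaller than the one for $\mathcal{S}=\emptyset$.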

Example 5.3.

In many cases $\operatorname{Occ}_{12}=\operatorname{Occ}_{21}$ , and then $p=1/2$ and $\bm{\mu}^{(p)}$ is the unbiased Brownian separable permuton. This is the case with separable permutations ( $\mathcal{S}=\emptyset$ ), with $\mathcal{S}=\{2413\}$ or $\mathcal{S}=\{3142\}$ , and with any set of simple permutations stable by taking reverse or complement, like the one considered in the introduction $\mathcal{S}=\{2413,3142,24153,42513\}$ .

Example 5.4.

When $\mathcal{S}$ is the family of increasing oscillations (see for instance [14]), we can compute

[TABLE]

We get through numerical approximation $\kappa\approx 0.2709$ and deduce $p\approx 0.5353$ .

Example 5.5.

Taking $\mathcal{S}$ to be the family of simple permutations in $\mathrm{Av}(321)$ , we are interested in the class $\mathcal{C}=\langle\mathcal{S}\rangle$ which is the substitution-closure of $\mathrm{Av}(321)$ . In this case, [8] gives

[TABLE]

We get through numerical approximation $\kappa\approx 0.2486$. It seems hard to compute the generating series $\operatorname{Occ}_{12}$, but we can locate its value at $\kappa$ by exhaustively computing the number of inversions of each permutation in $\mathcal{S}$ up to a certain order $N$, and controlling the rest of the series using the fact that a permutation of size $n$ in $\mathrm{Av}(321)$ cannot have more than $n^{2}/4$ inversions. (Permutations avoiding $321$ consist of two increasing subsequences. The number of inversions of $\sigma\in\mathrm{Av}(321)$ of size $n$ is therefore at most $\max_{0\leq k\leq n}k(n-k)\leq\tfrac{n^{2}}{4}$, and the claim follows.) Performing this with $N=12$ yields $p\in[0.577,0.622]$.

The remainder of Section 5 is devoted to the proof of Theorem 5.2, using generating functions from Section 4 and methods of analytic combinatorics. More precisely, using Theorem 2.5 we are interested in the limit of $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]$ , which we will express in terms of probability that a tree with marked leaves induces a given subtree. This probability itself will be expressed as the ratio of the coefficients of the generating function of trees with marked leaves inducing a given subtree and of the generating function of trees without marked leaves. We begin with the study of the asymptotics of these generating functions.

Notation: throughout the article, the class $\mathcal{C}$ , or equivalently its set of simple permutations $\mathcal{S}$ , are considered as fixed, and so is the pattern $\pi$ or the tree $t_{0}$ of which we are studying the proportion of occurrences (and therefore their size $k$ ). Constants in asymptotic expansions, including the ones in $o$ , $\mathcal{O}$ and $\Theta$ symbols, may therefore depend on these objects.

5.2. Asymptotics of the generating function of trees with no or one marked leaf

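From Eq. 11, we have

$$T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)=z+\Lambda\bigl(T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)\bigr),\tag{22}$$

where

$$\Lambda(u)=\frac{u^{2}}{1-u}+S\!\left(\frac{u}{1-u}\right).\tag{23}$$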

We denote by $R_{\Lambda}$ the radius of convergence of $\Lambda$ . Note that $R_{\Lambda}=\tfrac{R_{S}}{1+R_{S}}\leq 1$ . We will also use repeatedly the inverse equation $R_{S}=\tfrac{R_{\Lambda}}{1-R_{\Lambda}}$ . In the following, to lighten the notation, we write $\Lambda^{\prime}(R_{\Lambda}):=\lim_{r\rightarrow R_{\Lambda}\atop r<R_{\Lambda}}\Lambda^{\prime}(r)$ . Note that $\Lambda^{\prime}(R_{\Lambda})$ may be $\infty$ .

Observation 5.6.

Differentiating Eq. 23, we get

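$$\Lambda^{\prime}(u)=\frac{1}{(1-u)^{2}}\left(1+S^{\prime}\!\left(\frac{u}{1-u}\right)\right)-1.\tag{24}$$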

In particular, it follows that the condition (H1) is equivalent to $R_{\Lambda}>0$ and $\Lambda^{\prime}(R_{\Lambda})>1$ .

Observation 5.7.

Since $S$ is analytic at $0$ with nonnegative coefficients, the same holds for $\Lambda$. Moreover, the series expansion of $\Lambda$ is

[TABLE]

In particular it is aperiodic, in the sense given in Section A.1.

Proposition 5.8 (Asymptotics of the generating function of $\mathcal{S}$ -canonical trees with no marked leaf).

Assume that (H1) holds, and recall that $\kappa$ is defined by $S^{\prime}(\kappa)=\tfrac{2}{(1+\kappa)^{2}}-1$. There is a unique $\tau\in(0,R_{\Lambda})$ such that $\Lambda^{\prime}(\tau)=1$, and we have $\tau=\frac{\kappa}{1+\kappa}$. The generating functions $T$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ have the same radius of convergence $\rho=\tau-\Lambda(\tau)\in(0,\tau)$ and have a unique dominant singularity in $\rho$. (For the reader who is not familiar with complex analysis, all useful definitions and results are given in Appendix A. In particular, "near $\rho$" means "in a $\Delta$-neighborhood of $\rho$", where "$\Delta$-neighborhood" is defined in Definition A.2. The formal definition of (unique) dominant singularity is given in Eq. 57.) Their asymptotic expansions near $\rho$ are:

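$$T(z)=\frac{\tau}{1-\tau}-\beta\lambda\sqrt{1-\tfrac{z}{\rho}}+\mathcal{O}(1-\tfrac{z}{\rho}),\tag{25}$$

$$T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)=\tau-\beta\sqrt{1-\tfrac{z}{\rho}}+\mathcal{O}(1-\tfrac{z}{\rho}),\tag{26}$$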

where $\displaystyle{\beta=\sqrt{\frac{2\rho}{\Lambda^{\prime\prime}(\tau)}}}$ and $\lambda=\displaystyle{\frac{1}{(1-\tau)^{2}}}$ . In particular, $T$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ are convergent at $z=\rho$ and

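$$T(\rho)=\frac{\tau}{1-\tau}\qquad\text{and}\qquad T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=\tau.$$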

This type of behavior with a square-root dominant singularity is classical for series defined by an implicit equation (such as $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , which is characterized by $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}(z)=z+\Lambda(T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}(z))$ ), that belong to the smooth implicit function schema [30, Def. VII.4]. This schema is defined by the existence of a solution to some characteristic equation, which in our case reduces to hypothesis (H1), as explained in Section 1.7. Our result is a special case of [21, Th. 1], a general result for equations of the form $U(z)=z+\Lambda(U(z))$ . Implicit equations of this form characterize generating functions of weighted trees counted by their number of leaves, and are also considered in [52, Prop. 8].

Proof.

From Observation 5.7, $\Lambda^{\prime}$ is strictly increasing on the real interval $(0,R_{\Lambda})$. Together with the fact that $\Lambda^{\prime}(0)=0$ and the assumption $\Lambda^{\prime}(R_{\Lambda})>1$ (see Observation 5.6), this proves the existence and uniqueness of $\tau>0$ such that $\Lambda^{\prime}(\tau)=1$.

Setting $v=\tfrac{u}{1-u}$ in $\Lambda^{\prime}$ , which is given by (24), we have $\Lambda^{\prime}\left(\tfrac{v}{1+v}\right)=(1+v)^{2}(1+S^{\prime}(v))-1$ . It follows that $\Lambda^{\prime}(\frac{\kappa}{1+\kappa})=1$ . By uniqueness of $\tau$ , we conclude $\tau=\frac{\kappa}{1+\kappa}$ .
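The substitution can be spelled out as follows. This is a verification sketch which assumes the closed form $\Lambda(u)=\tfrac{u^{2}}{1-u}+S\bigl(\tfrac{u}{1-u}\bigr)$; this form is the one compatible with the identity for $\Lambda^{\prime}$ used above together with $\Lambda(0)=0$.

```latex
\begin{align*}
\Lambda'(u)
  &= \frac{2u-u^{2}}{(1-u)^{2}}
     + \frac{1}{(1-u)^{2}}\,S'\!\left(\frac{u}{1-u}\right)
   = \frac{1+S'\!\left(\frac{u}{1-u}\right)}{(1-u)^{2}} - 1, \\
u=\frac{v}{1+v}
  &\ \Longrightarrow\ \frac{u}{1-u}=v,\qquad \frac{1}{1-u}=1+v,\qquad
   \Lambda'\!\left(\frac{v}{1+v}\right)=(1+v)^{2}\bigl(1+S'(v)\bigr)-1, \\
\Lambda'\!\left(\frac{\kappa}{1+\kappa}\right)
  &= (1+\kappa)^{2}\cdot\frac{2}{(1+\kappa)^{2}}-1 = 1 .
\end{align*}
```

The last line uses the defining relation $1+S^{\prime}(\kappa)=\tfrac{2}{(1+\kappa)^{2}}$ of $\kappa$.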

We now consider the expansion of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and deduce afterwards the one of $T$. From Eq. 22, we have $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)=z+\Lambda(T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z))$. Then Theorem 1 in [21] gives that $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is analytic at $0$ and has a unique dominant singularity of exponent $\tfrac{1}{2}$ in $\rho=\tau-\Lambda(\tau)$, with the expansion given in Eq. 26: $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)=\tau-\beta\sqrt{1-\tfrac{z}{\rho}}+\mathcal{O}(1-\tfrac{z}{\rho})$.

The next step is to justify that $\rho\in(0,\tau)$ . That $\rho<\tau$ follows from $\Lambda(\tau)>0$ (since $\tau>0$ ). Moreover, since $\Lambda$ has nonnegative coefficients and no constant term, we have $\Lambda(\tau)<\tau\Lambda^{\prime}(\tau)=\tau$ , so that $\rho>0$ .

Finally, we look at the series $T=\tfrac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}$ (see Eq. 12). Observe that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=\tau<R_{\Lambda}\leq 1$. Consequently, the dominant singularity of $T$ is the same as that of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$, i.e. $\rho$, and is still unique. Indeed, this singularity is reached before the denominator vanishes; more formally, this is a particular case of subcritical composition (Lemma A.6). The asymptotic expansion of $T$ near $\rho$ is obtained through the following computation:

[TABLE]
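As a sanity check of Proposition 5.8 in the simplest case, one can take $\mathcal{S}=\emptyset$ (separable permutations). The sketch below assumes that $\Lambda(u)=\tfrac{u^{2}}{1-u}$ in this case, so that $U=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ satisfies $U=z+\tfrac{U^{2}}{1-U}$, that is $2U^{2}=(1+z)U-z$; the function name and the recurrences extracted from these equations are illustrative.

```python
from math import sqrt


def schroder_series(n_max):
    """Coefficients u[n] of T_not_plus and t[n] of T (n = 0..n_max)
    in the separable case S = empty set (assumption of this sketch)."""
    u = [0] * (n_max + 1)
    t = [0] * (n_max + 1)
    u[1] = t[1] = 1
    for n in range(2, n_max + 1):
        # Coefficient of z^n in 2 U^2 = (1 + z) U - z.
        u[n] = 2 * sum(u[i] * u[n - i] for i in range(1, n)) - u[n - 1]
        # Coefficient of z^n in T = U + U * T, i.e. T = U / (1 - U).
        t[n] = u[n] + sum(u[i] * t[n - i] for i in range(1, n))
    return u, t


u, t = schroder_series(200)

# kappa = sqrt(2) - 1 when S is empty, so tau = kappa / (1 + kappa).
tau = 1 - 1 / sqrt(2)
rho = tau - tau ** 2 / (1 - tau)  # rho = tau - Lambda(tau)

print(u[1:6], t[1:6])          # [1, 1, 3, 11, 45] [1, 2, 6, 22, 90]
print(rho, 3 - 2 * sqrt(2))    # both close to 0.1716
print(t[200] / t[199] * rho)   # close to 1, as the growth rate is 1/rho
```

The coefficients of $T$ are the numbers of separable permutations (the large Schröder numbers), and the ratio of consecutive coefficients approaches $1/\rho=3+2\sqrt{2}$, in agreement with a dominant singularity at $\rho=\tau-\Lambda(\tau)$.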

Proposition 5.9 (Asymptotics of the generating function of $\mathcal{S}$ -canonical trees with marked leaves).

All generating functions $T^{\prime}$ , $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ , $T^{+}$ , $T^{+}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , $T^{+}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ , $T^{-}$ , $T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and $T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ have a unique dominant singularity in $\rho$ . They diverge at the singularity $z=\rho$ and behave as $K(1-\tfrac{z}{\rho})^{-1/2}$ , where the constant $K$ is given in the table below:

[TABLE]

with $\gamma=\frac{\beta(1-\tau)^{2}}{2\rho}$ (recall that $\lambda=\frac{1}{(1-\tau)^{2}}$ was defined in Proposition 5.8).

Note that the table is in fact a rank $1$ matrix. Namely, passing from a root different from $\oplus$ (resp. $\ominus$ ) to a nonconditioned root always adds a factor $\lambda$ , independently of the condition on the marked leaf. Similarly, removing the leaf condition always yields the same factor $\lambda$ independently of the conditions on roots.

Proof.

By singular differentiation (see Theorem A.4) of Eqs. 25 and 26, we have (near $\rho$ )

[TABLE]

Since $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$, we have obtained the constants in the first line of the table. We now turn to the last two lines. In order to use Eqs. 14, 15 and 16, we first compute the expansions of all intermediate quantities appearing in these formulas. From Eq. 26, we obtain the following expansion near $\rho$:

[TABLE]

We turn to the expansion of $S^{\prime}(T)$ . Putting $u=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ in $\Lambda^{\prime}(u)$ (which is given by (24)) and using Eq. 12, we have

[TABLE]

Recall that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=\tau<R_{\Lambda}$ . Therefore, the composition $\Lambda^{\prime}\circ T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ is subcritical (see Lemma A.6). This implies that $\Lambda^{\prime}\circ T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ has a unique dominant singularity at $\rho$ , and plugging in the asymptotic expansion (26) of $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ at $\rho$ , we obtain

[TABLE]

where we used the equalities $\Lambda^{\prime}(\tau)=1$ (by definition of $\tau$) and $\Lambda^{\prime\prime}(\tau)=\tfrac{2\rho}{\beta^{2}}$ (by definition of $\beta$). Combining Eq. 26 and the above expansion of $\Lambda^{\prime}\circ T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ into Eq. 28, we obtain, after simplification:

[TABLE]

The expansion of $WS^{\prime}(T)+W+S^{\prime}(T)$ then follows from Eq. 27 and Eq. 30:

[TABLE]

We can now derive the expansions of our generating functions, using Eqs. 14, 15 and 16. First,

[TABLE]

Then,

[TABLE]

Since $WS^{\prime}(T)+W+S^{\prime}(T)$ takes value $1$ at $\rho$ , the series $T^{+}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has the same first-order expansion:

[TABLE]

By symmetry, we have $T^{-}=T^{+}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{-}=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{+}$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{-}=T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{+}$ , and this completes the proof of the proposition. ∎

5.3. Asymptotics of the generating function of marked trees with a given induced tree

We recall some notation introduced in Section 4.2. Let $t_{0}$ be a substitution tree with $k\geq 2$ leaves and $e$ edges.

Let $V_{*}$ (resp. $V_{+}$ , $V_{-}$ ) be the set of nonlinear nodes (resp. nodes labeled $\oplus$ , $\ominus$ ) in $\mathrm{Int}(t_{0})$ . Recall that, for $v\in\mathrm{Int}(t_{0})$ , $d_{v}$ is the degree of $v$ and $\theta_{v}$ the permutation labeling $v$ , and that $\mathcal{T}_{t_{0}}$ is the set of $\mathcal{S}$ -canonical trees $t$ with $k$ marked leaves such that these leaves induce $t_{0}$ . Denote by $T_{t_{0}}$ the generating function of $\mathcal{T}_{t_{0}}$ (where the size is the number of leaves, both marked and unmarked).

Proposition 5.10.

The series $T_{t_{0}}$ has a unique dominant singularity in $\rho$ , with the asymptotic expansion $T_{t_{0}}=B_{t_{0}}(1-\tfrac{z}{\rho})^{-(e+1)/2}(1+o(1))$ , where the constant $B_{t_{0}}$ is

[TABLE]

Proof.

By definition, $T_{t_{0}}=\sum_{V_{s}}T_{t_{0},V_{s}}$, where the sum runs over sets $V_{s}$ such that $(t_{0},V_{s})$ is a decorated tree. We start from the formula for $T_{t_{0},V_{s}}$, which is given by Proposition 4.5. From Proposition 5.9, the nine series $T^{\prime},\ldots,T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ of trees with one marked leaf all have unique dominant singularities in $\rho$. This is also the case for the functions $\operatorname{Occ}_{\theta_{v}}(T)$ and $\left(\tfrac{1}{1-T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}}\right)^{d_{v}+1}$ by subcritical composition (see Lemma A.6). Indeed, from 4.3 the radius of convergence of $\operatorname{Occ}_{\theta}$ is at least $R_{S}$ and from Proposition 5.8, $T$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ are convergent at $\rho$ with $T(\rho)=\tfrac{\tau}{1-\tau}<\tfrac{R_{\Lambda}}{1-R_{\Lambda}}=R_{S}$ and $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}(\rho)=\tau<1$. As a consequence, $T_{t_{0}}$ has a unique dominant singularity in $\rho$ (see Lemma A.5).

For exact asymptotics near $\rho$ , note that $\operatorname{Occ}_{\theta_{v}}(T)$ and $\tfrac{1}{1-T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ converge respectively to $\operatorname{Occ}_{\theta_{v}}(\tfrac{\tau}{1-\tau})$ and $\tfrac{1}{1-\tau}$ at $\rho$ (see Proposition 5.8), while the nine series $T^{\prime},\ldots,T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ behave as $\text{cst}(1-\tfrac{z}{\rho})^{-1/2}$ , where the constants are given in Proposition 5.9. We thus get using the notation of Proposition 4.5:

[TABLE]

The asymptotic behavior of $T_{t_{0},V_{s}}$ near $\rho$ is then obtained by multiplying the above expressions. The formula can be simplified by observing that $\sum_{v\in\mathrm{Int}(t_{0})}(d^{\prime}_{v}+d_{v}^{+}+d_{v}^{-})=\sum_{v\in\mathrm{Int}(t_{0})}d_{v}=e$ and $\sum_{v\in\mathrm{Int}(t_{0})}d^{\prime}_{v}+\bm{1}_{\text{\scriptsize root}\in V_{s}}=|V_{s}|+k$ , and we obtain:

[TABLE]

To write the second line, we have used that

[TABLE]

Now we have that $T_{t_{0}}$ is the sum of $T_{t_{0},V_{s}}$ over sets $V_{s}$ such that $(t_{0},V_{s})$ is a decorated tree. By definition, such $V_{s}$ can be written as $V_{*}\cup\widetilde{V_{s}}$ for some $\widetilde{V_{s}}\subset V_{+}\cup V_{-}$ (the notation $V_{*}$ , introduced right before the proposition, is the set of nonlinear nodes of $t_{0}$ ). This change of variables leads to

[TABLE]

We first observe that since $\lambda=(1-\tau)^{-2}$ , the last factor simplifies as $(1-\tau)^{d_{v}+1}$ . The proposition then follows by writing the sum of products on the second line as a product of sums. ∎

5.4. Probability of tree patterns

Recall that $\mathcal{T}$ is the set of $\mathcal{S}$-canonical trees (i.e., canonical trees of permutations in $\langle\mathcal{S}\rangle$). We take a uniform random tree with $n$ leaves in $\mathcal{T}$ and mark $k$ of its leaves, also chosen uniformly at random. We denote by $\mathbf{t}^{(n)}_{k}$ the tree induced by the $k$ marked leaves.

Proposition 5.11.

Let $k\geq 2$ , and let $t_{0}$ be any substitution tree with $k$ leaves. Then

[TABLE]

where $B_{t_{0}}$ is given by Eq. 32, and $e(t_{0})$ is the number of edges of $t_{0}$ .

Proof.

Directly from the definition, we have:

[TABLE]

The Transfer Theorem (Theorem A.3) gives us the asymptotic behavior of $[z^{n}]T(z)$ and $[z^{n}]T_{t_{0}}(z)$ from the asymptotic expansions in Proposition 5.8 (Eq. 25) and Proposition 5.10. Deriving the result from there is a routine exercise. ∎
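The routine computation can be sketched as follows. This is only an outline under two assumptions: that the probability is the ratio $[z^{n}]T_{t_{0}}(z)\big/\bigl(\binom{n}{k}[z^{n}]T(z)\bigr)$, and that the transfer theorem is used in its standard form $[z^{n}](1-\tfrac{z}{\rho})^{-a}\sim\tfrac{n^{a-1}}{\Gamma(a)}\rho^{-n}$ for $a\notin\{0,-1,-2,\ldots\}$; here $e=e(t_{0})$.

```latex
\begin{align*}
[z^{n}]\,T_{t_{0}}(z)
  &\sim \frac{B_{t_{0}}}{\Gamma\!\left(\frac{e+1}{2}\right)}\,
        n^{\frac{e-1}{2}}\,\rho^{-n},
  &
[z^{n}]\,T(z)
  &\sim \frac{\beta\lambda}{2\sqrt{\pi}}\,n^{-\frac{3}{2}}\,\rho^{-n},
  &
\binom{n}{k} &\sim \frac{n^{k}}{k!}, \\[4pt]
\frac{[z^{n}]\,T_{t_{0}}(z)}{\binom{n}{k}\,[z^{n}]\,T(z)}
  &\sim \frac{2\sqrt{\pi}\;k!\;B_{t_{0}}}
             {\beta\lambda\,\Gamma\!\left(\frac{e+1}{2}\right)}\;
        n^{\frac{e}{2}+1-k}.
\end{align*}
```

The second estimate uses $-\tfrac{1}{\Gamma(-1/2)}=\tfrac{1}{2\sqrt{\pi}}$. The resulting exponent $\tfrac{e}{2}+1-k=-\tfrac{2k-2-e}{2}$ vanishes exactly when $t_{0}$ has $2k-2$ edges, that is, when $t_{0}$ is binary.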

5.5. Back to permutations

Let $\pi$ be a permutation of size $k$. Recall from Section 3.1 that an expanded tree is a substitution tree where nonlinear nodes are labeled by simple permutations, while linear nodes are required to be binary. As in Corollary 3.5, we denote by $\widetilde{N_{\pi}}$ the number of expanded trees of $\pi$. We know (see Corollary 3.5) that they all have the same number of linear nodes labeled $\oplus$ (resp. $\ominus$), this number being denoted by $r_{+}$ (resp. $r_{-}$), and that they all contain the same $r_{*}$ simple nodes, whose labels will be denoted by $\theta_{1},\ldots,\theta_{r_{*}}$.

We introduce the default of binarity of the permutation $\pi$ :

[TABLE]

Observe that $\operatorname{db}(\pi)=0$ if and only if $\pi$ is separable.

Finally, to state the next proposition, we also need to introduce the quantities

[TABLE]

Note that this is the same $p$ as in Theorem 5.2.

Proposition 5.12.

Let $\pi\in\mathfrak{S}_{k}$ with $k\geq 2$ and let $\bm{\sigma}_{n}$ be a uniform random permutation in $\langle\mathcal{S}\rangle_{n}$ . With notation as above, we have

[TABLE]

Proof.

We denote by $\bm{I}$ a uniform random $k$ -element subset of $[n]$ and by $\bm{t}^{(n)}$ a uniform random $\mathcal{S}$ -canonical tree with $n$ leaves. It holds that $\bm{\sigma}_{n}\stackrel{{\scriptstyle d}}{{=}}\operatorname{perm}(\bm{t}^{(n)})$ . As a consequence of Eqs. 7 and 3.11, we have

[TABLE]

After plugging in the estimate of Proposition 5.11, we get

[TABLE]

where $\operatorname{db}(t_{0})=2k-2-e(t_{0})$ is the default of binarity of the tree $t_{0}$ .

We claim that if $t_{0}$ is a substitution tree of $\pi$ , then $\operatorname{db}(t_{0})\geq\operatorname{db}(\pi)$ with equality if and only if $t_{0}$ is an expanded tree. Indeed,

[TABLE]

Moreover, by Lemma 3.6, any substitution tree of $\pi$ can be obtained from an expanded tree of $\pi$ by merging some internal nodes along the edges connecting them, and such merges always strictly increase the considered sum, which proves the claim.
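The effect of merging on the default of binarity $\operatorname{db}(t_{0})=2k-2-e(t_{0})$ can be seen on a small example. The sketch below is only an illustration: it ignores node labels, and the nested-tuple encoding of trees (a leaf is the empty tuple) and the function names are assumptions, not notation from the paper.

```python
LEAF = ()  # hypothetical encoding: a leaf is an empty tuple, an internal node is a tuple of children

def count_leaves(tree) -> int:
    """Number of leaves of a nested-tuple tree."""
    return 1 if not tree else sum(count_leaves(child) for child in tree)

def count_edges(tree) -> int:
    """Number of edges: one per child, plus the edges inside each child."""
    return sum(1 + count_edges(child) for child in tree)

def default_of_binarity(tree) -> int:
    """db(t) = 2k - 2 - e(t), with k leaves and e(t) edges."""
    return 2 * count_leaves(tree) - 2 - count_edges(tree)

binary = ((LEAF, LEAF), (LEAF, LEAF))  # binary tree: 4 leaves, 6 edges
merged = (LEAF, LEAF, (LEAF, LEAF))    # left child merged into the root: 5 edges
star = (LEAF, LEAF, LEAF, LEAF)        # both children merged into the root: 4 edges

print([default_of_binarity(t) for t in (binary, merged, star)])  # [0, 1, 2]
```

Each merge removes one edge while keeping the number of leaves fixed, so the default of binarity grows by one per merge.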

It follows that, in the sum of Eq. 34, only expanded trees appear asymptotically. Moreover, $e(t_{0})$ and the constant $B_{t_{0}}$ do not depend on the choice of an expanded tree $t_{0}$ of $\pi$ . As a result, we get $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]=(1+o(1))B_{\pi}\,n^{-\operatorname{db}(\pi)/2}$ where

[TABLE]

Differentiating Eq. 24 and using $\Lambda^{\prime}(\tau)=1$ yields the identity $\Lambda^{\prime\prime}(\tau)=\tfrac{4}{1-\tau}+\tfrac{1}{(1-\tau)^{4}}S^{\prime\prime}(\tfrac{\tau}{1-\tau})$ . Moreover, since $\operatorname{Occ}_{12}+\operatorname{Occ}_{21}=\tfrac{S^{\prime\prime}}{2}$ (see 4.3), this gives us

[TABLE]

Finally, after collecting everything together, we get

[TABLE]

where the last equality above has been obtained using that, for any expanded tree $t_{0}$ of $\pi$ , we have $r_{+}+r_{-}+r_{*}=|\mathrm{Int}(t_{0})|=e-k+1$ and $\operatorname{db}(\pi)=2k-2-e$ . This allows us to simplify Eq. 35 and yields the desired value of $B_{\pi}$ . ∎

We can now conclude the proof of Theorem 5.2. Let $\bm{\sigma}_{n}$ be a uniform random permutation in $\langle\mathcal{S}\rangle_{n}$ . Our goal is to show that $\mu_{\bm{\sigma}_{n}}$ converges to the biased Brownian separable permuton of parameter $p$ . Let $\pi$ be any permutation of size $k\geq 2$ . As a consequence of Theorem 2.5 (with 2.6) and Eq. 20, we just have to show that

[TABLE]

Assume first that $\pi$ is not separable. In this case, we have $N_{\pi}=0$ . It also holds that $\operatorname{db}(\pi)>0$ , and Proposition 5.12 implies that $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]\to 0$ .

Assume on the contrary that $\pi$ is separable. In this case, $\operatorname{db}(\pi)=0$ , $\widetilde{N_{\pi}}=N_{\pi}$ and $r_{*}=0$ . Therefore, from Proposition 5.12 we get that

[TABLE]

where we have used the identity $\Gamma(k-\tfrac{1}{2})=\frac{2^{3-2k}\sqrt{\pi}\,\Gamma(2k-2)}{\Gamma(k-1)}=\frac{2^{3-2k}\sqrt{\pi}(2k-3)!}{(k-2)!}$ coming from the duplication formula of the Gamma function. This concludes the proof.∎
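The Gamma function identity used in the last step can be checked numerically. The sketch below compares both sides for small values of $k$; the function names and the range of $k$ are illustrative assumptions.

```python
import math

def gamma_half_shift(k: int) -> float:
    """Left-hand side: Gamma(k - 1/2)."""
    return math.gamma(k - 0.5)

def duplication_rhs(k: int) -> float:
    """Right-hand side: 2^(3-2k) * sqrt(pi) * (2k-3)! / (k-2)!, for k >= 2."""
    return 2 ** (3 - 2 * k) * math.sqrt(math.pi) * math.factorial(2 * k - 3) / math.factorial(k - 2)

for k in range(2, 12):
    print(k, gamma_half_shift(k), duplication_rhs(k))
```

Both columns agree up to floating-point error, as expected from the duplication formula $\Gamma(z)\Gamma(z+\tfrac{1}{2})=2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ applied at $z=k-1$.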

5.6. Occurrences of nonseparable patterns

Since $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]$ tends to $0$ whenever $\pi$ is a nonseparable pattern and the random variable takes only nonnegative values, $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ tends to $0$ in probability.

Here, we discuss more precisely the asymptotic behavior of $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ in this case. The first result gives the order of magnitude of its moments; we then present a consequence for the random variable itself.

Proposition 5.13.

For $\pi\in\langle\mathcal{S}\rangle$ and $m\geq 1$ , $\mathbb{E}[(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))^{m}]=\Theta(n^{-\operatorname{db}(\pi)/2})$ .

Remark 5.14.

This result also covers separable patterns $\pi$ , but in this case it is a direct consequence of our main theorem. Indeed, according to Theorem 2.5, Theorem 5.2 entails convergence in distribution of $(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n}))_{n}$ to $\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})$ , jointly for all $\pi\in\mathfrak{S}$ , and hence convergence of all moments and mixed moments (since those random variables are bounded by $1$ ). Namely, we have $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})^{m}]\xrightarrow[n\to\infty]{}\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\mu}^{(p)})^{m}]$ . This limiting value is positive if and only if $\pi$ is separable, and can be computed exactly by adapting the method presented in [12, Section 9.1].

Proof.

By definition, $\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})=\tbinom{n}{k}^{-1}\sum_{I\subset[n],|I|=k}\bm{1}_{\operatorname{pat}_{I}(\bm{\sigma}_{n})=\pi}$ , where we use $k$ for the size of the pattern $\pi$ , as usual. Consequently,

[TABLE]

We split the sum according to the different possible values of $K=\bigcup_{i}I_{i}$ and $j=|K|$ . Denoting by $B^{K}_{k,m}$ the set of possible ordered covers of $K$ by $m$ sets of size $k$ , this gives

[TABLE]

Let us now remark that the unique increasing bijection between $K$ and $[j]$ induces a bijection between $B^{K}_{k,m}$ and $B^{[j]}_{k,m}$ . Let $(J_{i})_{1\leq i\leq m}$ denote the image of $(I_{i})_{1\leq i\leq m}$ by this bijection. Clearly,

[TABLE]

The sum can now be decomposed according to the different values of $\rho=\operatorname{pat}_{K}(\bm{\sigma}_{n})$ yielding

[TABLE]

Since the summation index sets do not depend on $n$ , it is enough to consider each summand separately to get the asymptotics. From Proposition 5.12, the summand $\tbinom{n}{k}^{-m}\tbinom{n}{j}\,\mathbb{E}[\operatorname{\widetilde{occ}}(\rho,\bm{\sigma}_{n})]$ is of order $n^{j-km-\operatorname{db}(\rho)/2}$ .

Whenever Eq. 36 holds, $\pi$ is a pattern of $\rho=\operatorname{pat}_{K}(\bm{\sigma}_{n})$ . As a consequence, an expanded tree of $\rho$ must have a substitution tree of $\pi$ as an induced tree. Since the default of binarity may only decrease when taking induced trees, this implies that $\operatorname{db}(\rho)\geq\operatorname{db}(\pi)$ . Since additionally $j\leq km$ , we deduce that $j-km-\operatorname{db}(\rho)/2\leq-\operatorname{db}(\pi)/2$ which gives $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})^{m}]=\mathcal{O}(n^{-\operatorname{db}(\pi)/2})$ .

To prove that $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})^{m}]=\Theta(n^{-\operatorname{db}(\pi)/2})$ , it is then enough to find one summand of order $n^{-\operatorname{db}(\pi)/2}$ for large $n$ . This is achieved by considering the summand indexed by

[TABLE]

Indeed in this case, $\operatorname{db}(\rho)=\operatorname{db}(\pi)$ , so that $j-km-\operatorname{db}(\rho)/2=-\operatorname{db}(\pi)/2$ , which concludes the proof of the proposition. ∎
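For concreteness, the pattern density $\operatorname{\widetilde{occ}}(\pi,\sigma)$ used throughout this proof can be computed by brute force on small permutations. The following sketch follows the definition $\tbinom{n}{k}^{-1}\sum_{I}\bm{1}_{\operatorname{pat}_{I}(\sigma)=\pi}$ directly; the function names are assumptions, and the enumeration over all $k$-subsets is only practical for small sizes.

```python
from itertools import combinations
from math import comb

def pattern(sigma, positions):
    """Pattern pat_I(sigma) induced by sigma on the 1-based positions in I."""
    values = [sigma[i - 1] for i in sorted(positions)]
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)

def occ(pi, sigma) -> int:
    """Number of k-subsets I of positions with pat_I(sigma) = pi."""
    k, n = len(pi), len(sigma)
    return sum(1 for I in combinations(range(1, n + 1), k)
               if pattern(sigma, I) == tuple(pi))

def occ_density(pi, sigma) -> float:
    """Normalized count occ(pi, sigma) / binom(n, k)."""
    return occ(pi, sigma) / comb(len(sigma), len(pi))

sigma = (6, 5, 8, 3, 1, 2, 4, 7)
print(pattern(sigma, {2, 5, 7}))        # (3, 1, 2)
print(occ_density((2, 1), (3, 1, 2)))   # 2 inversions among 3 pairs
```

The first printed value reproduces the example $\operatorname{pat}_{\{2,5,7\}}(65831247)=312$.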

Corollary 5.15.

For $\pi\in\langle\mathcal{S}\rangle$ and $\varepsilon>0$ small enough, $\mathbb{P}(\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})>\varepsilon)=\Theta(n^{-\operatorname{db}(\pi)/2})$ , where the constant in the $\Theta$ symbol depends on $\varepsilon$ .

Proof.

The upper bound is an immediate consequence of Markov’s inequality. For the lower bound, if $X$ is a random variable with values in $[0,1]$ , we have

[TABLE]

The corollary follows by taking $X=\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})$ and $\varepsilon$ small enough. ∎

6. Asymptotic analysis: The degenerate case $S^{\prime}(R_{S})<2/(1+R_{S})^{2}-1$

In this section, we are interested in the case where the generating function $S$ of simple permutations in $\mathcal{S}$ satisfies the following condition.

Definition 6.1 (Hypothesis $(H2)$ ).

The generating function $S$ of a family $\mathcal{S}$ of simple permutations is said to satisfy hypothesis $(H2)$ if $S$ meets the following conditions at its radius of convergence $R_{S}>0$ :

i)

$S^{\prime}$ is convergent at $R_{S}$ and

$$S^{\prime}(R_{S})<\frac{2}{(1+R_{S})^{2}}-1; \tag{37}$$

ii)

$S$ has a dominant singularity of exponent $\delta>1$ in $R_{S}$ .

Item ii means that, around the singularity $R_{S}$ , one has

[TABLE]

for some analytic function $g_{S}$ and constant $C_{S}\neq 0$ . We refer to Section A.4 for a precise definition. Clearly, under $(H2)$ , it holds that $R_{S}<\infty$ . Note also that the assumption $\delta>1$ is redundant with the convergence of $S^{\prime}$ at $R_{S}$ .

6.1. Asymptotic behavior of the main series

As in Section 5.2, the first step is to derive the asymptotic behavior of all generating functions for marked trees around their common dominant singularity. In this section, we will not compute constants explicitly, but only focus on the singularity exponent. Indeed, keeping track only of singularity exponents is here sufficient to determine the limiting permuton.

The function $\Lambda$ is defined in (23) by:

[TABLE]

Lemma 6.2.

Assume that $S$ satisfies hypothesis $(H2)$ . Then $\Lambda$ has a unique dominant singularity of exponent $\delta$ in $R_{\Lambda}:=\tfrac{R_{S}}{1+R_{S}}<1$ . Moreover, $\Lambda^{\prime}$ is convergent at $R_{\Lambda}$ and $\Lambda^{\prime}(R_{\Lambda})<1$ .

Proof.

The first assertion follows from Lemma A.6 (Supercritical case), using also that $R_{S}<\infty$ . The convergence of $\Lambda^{\prime}$ at $R_{\Lambda}$ follows from that of $S^{\prime}$ at $R_{S}$ . Finally, the inequality $\Lambda^{\prime}(R_{\Lambda})<1$ is a straightforward computation from (37) (recall that $\Lambda^{\prime}$ is given in (24)). ∎

Recall from Eq. 11 that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is implicitly defined by the equation

[TABLE]

As explained in Section 1.7, the condition $\Lambda^{\prime}(R_{\Lambda})<1$ implies that the singularity of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ is not a branch point, but is inferred from the singularity of $\Lambda$ .

Lemma 6.3.

Assume that $S$ satisfies hypothesis $(H2)$ . Then there is a unique $\rho>0$ such that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=R_{\Lambda}$ . Moreover, $\rho$ is the radius of convergence of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has a unique dominant singularity of exponent $\delta$ in $\rho$ .

The proof is rather technical and postponed to Section A.6.

Now recall from Eq. 12 that $T=\tfrac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}}$ . At the singularity $\rho$ of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ , the denominator is $1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=1-R_{\Lambda}>0$ , hence the singularity of $T$ is inherited from that of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ . More precisely, from Lemma A.6 (Subcritical case), we have the following corollary.

Corollary 6.4.

Assume that $S$ satisfies hypothesis $(H2)$ . Then $T$ has a unique dominant singularity of exponent $\delta$ in $\rho$ , with $T(\rho)=\tfrac{R_{\Lambda}}{1-R_{\Lambda}}=R_{S}$ .
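The value $T(\rho)=R_{S}$ stated here is just the inversion of the map $R_{S}\mapsto R_{\Lambda}=\tfrac{R_{S}}{1+R_{S}}$. A minimal sketch with exact rational arithmetic makes this explicit; the sample values of $R_{S}$ and the function names are arbitrary assumptions chosen for illustration.

```python
from fractions import Fraction

def r_lambda(r_s: Fraction) -> Fraction:
    """R_Lambda = R_S / (1 + R_S), as in Lemma 6.2."""
    return r_s / (1 + r_s)

def t_at_rho(r_s: Fraction) -> Fraction:
    """T(rho) = R_Lambda / (1 - R_Lambda), using T = T_not_plus / (1 - T_not_plus) at rho."""
    r_l = r_lambda(r_s)
    return r_l / (1 - r_l)

# Arbitrary sample radii of convergence (illustrative values only).
for r_s in (Fraction(1, 10), Fraction(1, 3), Fraction(5, 2)):
    print(r_s, r_lambda(r_s), t_at_rho(r_s))
```

In each case $R_{\Lambda}<1$ and the last column equals the first, as claimed.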

We now turn to the behavior of the generating functions of trees with one marked leaf.

Lemma 6.5.

Assume that $S$ satisfies hypothesis $(H2)$ . Then each of the nine generating functions $T^{\prime}$ , $T^{+}$ , …, $T_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}^{-}$ has a unique dominant singularity of exponent $\delta-1$ in $\rho$ .

Proof.

For $T^{\prime}$ (resp. $T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}=T^{\prime}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ ) this follows immediately from Corollary 6.4 (resp. Lemma 6.3) by singular differentiation (Lemma A.7).

Recall that $T^{+}$ is explicitly given in Proposition 4.2 as

$$T^{+}=\frac{1}{1-WS^{\prime}(T)-W-S^{\prime}(T)},$$

where $W=(\tfrac{1}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}})^{2}-1$ . We examine $W$ and $S^{\prime}(T)$ to determine the exponent of the singularity of $T^{+}$ at its radius of convergence (which we will prove to be $\rho$ ).

Since $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=R_{\Lambda}<1$ , again by subcritical composition, Lemma 6.3 gives that $W$ has a unique dominant singularity of exponent $\delta$ in $\rho$ . Moreover, $W(\rho)=(\tfrac{1}{1-R_{\Lambda}})^{2}-1$ .

As for $S^{\prime}(T)$ we need to analyse $S^{\prime}$ and $T$ and determine whether the composition is critical, supercritical or subcritical (see Lemma A.6).

•

By singular differentiation (Lemma A.7), it follows from $(H2)$ that $S^{\prime}$ has a dominant singularity of exponent $\delta-1$ in $R_{S}$ . In addition, $S^{\prime}(R_{S})<\frac{2}{(1+R_{S})^{2}}-1$ by $(H2)$ .

•

Recall that from Corollary 6.4 $T$ has a unique dominant singularity of exponent $\delta$ in $\rho$ , with $T(\rho)=\tfrac{R_{\Lambda}}{1-R_{\Lambda}}=R_{S}$ . The composition $S^{\prime}\circ T$ is therefore critical.

•

Moreover, $T$ is aperiodic since it counts a superset of the separable permutations.

From Lemma A.6 (Critical case-A), we obtain that $S^{\prime}(T)$ has a unique dominant singularity of exponent $\delta-1$ in $\rho$ , and therefore so does $WS^{\prime}(T)+W+S^{\prime}(T)$ (using Lemma A.5).

In $\rho$ , the value of the series $WS^{\prime}(T)+W+S^{\prime}(T)=(W+1)(S^{\prime}(T)+1)-1$ is less than $(\tfrac{1}{1-R_{\Lambda}})^{2}\tfrac{2}{(1+R_{S})^{2}}-1=1$ . Therefore, by subcritical composition, $T^{+}=\frac{1}{1-WS^{\prime}(T)-W-S^{\prime}(T)}$ has a unique dominant singularity of exponent $\delta-1$ in $\rho$ . Since $\frac{1}{1+W}$ and $(WS^{\prime}(T)+W+S^{\prime}(T))$ have unique dominant singularities in $\rho$ of respective exponents $\delta$ and $\delta-1$ , the other cases follow by Lemma A.5, using the formulas given in Proposition 4.2. ∎
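The bound by $1$ used at the start of the last paragraph can be unpacked as follows. This is only a worked restatement, using $R_{\Lambda}=\tfrac{R_{S}}{1+R_{S}}$, the value of $W(\rho)$ computed above, $T(\rho)=R_{S}$ and the inequality in $(H2)$:

```latex
\begin{align*}
1-R_{\Lambda} &= 1-\frac{R_{S}}{1+R_{S}} = \frac{1}{1+R_{S}},
\qquad\text{so}\qquad
W(\rho)+1 = \Big(\frac{1}{1-R_{\Lambda}}\Big)^{2} = (1+R_{S})^{2},\\
S'(T(\rho))+1 &= S'(R_{S})+1 < \frac{2}{(1+R_{S})^{2}},\\
\big(W(\rho)+1\big)\big(S'(T(\rho))+1\big)-1 &< (1+R_{S})^{2}\cdot\frac{2}{(1+R_{S})^{2}}-1 = 1.
\end{align*}
```

In particular the denominator $1-WS^{\prime}(T)-W-S^{\prime}(T)$ stays positive at $\rho$, which is what makes the final composition subcritical.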

6.2. Probability of given patterns

Recall that the function $\operatorname{Occ}_{\theta}$ was defined in Eq. 17 by $\operatorname{Occ}_{\theta}(z)=\sum_{\alpha\in\mathcal{S}}\mathrm{occ}(\theta,\alpha)z^{|\alpha|-|\theta|}$ .

Unlike in the previous section, the functions $\operatorname{Occ}_{\theta}(z)$ will appear in the asymptotic behaviors, thus we need some additional assumptions on them. First, as noticed in 4.3, we have

$$\sum_{\theta\in\mathfrak{S}_{k}}\operatorname{Occ}_{\theta}(z)=\frac{1}{k!}S^{(k)}(z),$$

which, under $(H2)$ , has a dominant singularity of exponent $\delta-k$ in $R_{S}$ (see Lemma A.7, about singular differentiation). The following hypothesis is thus reasonable.

Definition 6.6 (Hypothesis $(CS)$ ).

Let $S$ have a dominant singularity of exponent $\delta>1$ in $R_{S}$ . The family of simple permutations $\mathcal{S}$ satisfies the hypothesis $(CS)$ if, for each pattern $\theta$ of size $k$ , the corresponding series $\operatorname{Occ}_{\theta}(z)$ has a dominant singularity of exponent at least $\delta-k$ in $R_{S}$ .

Let $t_{0}$ be a substitution tree with $k$ leaves. Recall that $\mathrm{Int}(t_{0})$ is the set of internal nodes of $t_{0}$ , and that for any node $v\in\mathrm{Int}(t_{0})$ , $d_{v}$ denotes its number of children. Recall also from Section 4.2 that $\mathcal{T}_{t_{0}}$ is the family of canonical trees with $k$ marked leaves which induce a tree equal to $t_{0}$ , and that $T_{t_{0}}$ is its generating function.

Combining Proposition 4.5 and the above results, we obtain the following.

Proposition 6.7.

For any substitution tree $t_{0}$ of size $k\geq 2$ , assuming $(H2)$ and $(CS)$ , the series $T_{t_{0}}$ has a unique dominant singularity of exponent at least $\widetilde{e}_{t_{0}}$ in $\rho$ , where

•

$\widetilde{e}_{t_{0}}=\sum_{v:d_{v}>\delta}(\delta-d_{v})$ , if there is at least one node $v$ such that $d_{v}>\delta$ ;

•

$\widetilde{e}_{t_{0}}=\delta-\max_{v\in\mathrm{Int}(t_{0})}d_{v}$ otherwise. (Footnote 8: Note for future reference that the two expressions for $\widetilde{e}_{t_{0}}$ are equal when there is a unique $v$ such that $d_{v}>\delta$ .)

Proof.

First recall that $\mathcal{T}_{t_{0}}$ can be decomposed as a union $\bigcup_{V_{s}}\mathcal{T}_{t_{0},V_{s}}$ , where $V_{s}$ are subsets of $\mathrm{Int}(t_{0})$ which contain all nonlinear nodes (see Section 4.1). It is therefore enough to prove that each series $T_{t_{0},V_{s}}$ has a unique dominant singularity in $\rho$ with at least the desired exponent.

Recall from Proposition 4.5 the following formula for $T_{t_{0},V_{s}}(z)$ :

[TABLE]

where each $A_{v}$ is given by (19) and depends on the type of $v$ . Using hypothesis $(CS)$ , Corollary 6.4 and Lemma A.6 (Critical case-A), we know that $\operatorname{Occ}_{\theta_{v}}(T)$ has a dominant singularity of exponent at least $\delta-d_{v}$ in $\rho$ , and, from Lemma 6.5, that it is the term with the lowest exponent and the only possibly divergent term arising in $A_{v}$ . It then follows from Lemma A.5 that $A_{v}$ has a dominant singularity of exponent at least $\delta-d_{v}$ in $\rho$ .

The series $T_{t_{0},V_{s}}(z)$ is then the product of the $A_{v}$ ’s and of some convergent series. The result of the proposition is obtained from Lemma A.5: $T_{t_{0},V_{s}}(z)$ has a dominant singularity of exponent at least $\sum_{v:d_{v}>\delta}(\delta-d_{v})$ in $\rho$ , if this sum is nonempty (i.e., when one of the series is divergent) and of the smallest singularity exponent among them, that is $\delta-\max_{v\in\mathrm{Int}(t_{0})}d_{v}$ otherwise, i.e., if all factors are convergent. ∎
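The case distinction defining $\widetilde{e}_{t_{0}}$ only involves the list of degrees $(d_{v})_{v\in\mathrm{Int}(t_{0})}$ and $\delta$. A minimal sketch, in which the function name and the sample degree lists are assumptions for illustration:

```python
def exponent_lower_bound(degrees, delta):
    """Return the exponent e~_{t0} of Proposition 6.7.

    degrees: numbers of children d_v of the internal nodes of t0 (each >= 2).
    delta:   singularity exponent of S (delta > 1 under (H2)).
    """
    divergent = [d for d in degrees if d > delta]
    if divergent:
        # Every node with d_v > delta contributes a divergent factor of exponent delta - d_v.
        return sum(delta - d for d in divergent)
    # All factors converge: the smallest exponent among them dominates.
    return delta - max(degrees)

print(exponent_lower_bound([2, 2], 2.5))  # 0.5  (no divergent factor)
print(exponent_lower_bound([2, 3], 2.5))  # -0.5 (unique node with d_v > delta)
print(exponent_lower_bound([3, 4], 1.5))  # -4.0 (two divergent factors)
```

In the second call there is a unique node with $d_{v}>\delta$, and the result coincides with $\delta-\max_{v}d_{v}$, in line with Footnote 8.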

As in Section 5.4, we take a uniform random tree with $n$ leaves in $\mathcal{T}$ with $k$ marked leaves (chosen also uniformly at random). Denote as before by $\mathbf{t}^{(n)}_{k}$ the tree induced by the $k$ marked leaves.

Corollary 6.8.

Assume $(H2)$ and $(CS)$ . For any substitution tree $t_{0}$ with $k\geq 2$ leaves, the probability $\mathbb{P}({\mathbf{t}^{(n)}_{k}}=t_{0})$ tends to $0$ , unless $t_{0}$ has only one internal node.

Proof.

As in Section 5.4, we use the formula

[TABLE]

From Theorem A.3 and Proposition 6.7 (and using the notation $\widetilde{e}_{t_{0}}$ herein defined), we get that

[TABLE]

for some constant $\tilde{C}_{t_{0}}$ , possibly equal to $0$ . On the other hand, Theorem A.3 and Corollary 6.4 imply that

[TABLE]

for some constant $C\neq 0$ . Putting everything together, we obtain

[TABLE]

where $e_{t_{0}}=\delta-k-\widetilde{e}_{t_{0}}$ .

For any subset $A$ of the internal nodes of a tree $t$ with $k$ leaves, we claim that the following inequality holds: $\sum_{v\in A}d_{v}\leq|A|+k-1$ . It is clear when $k=1$ . For $k>1$ , we decompose $t$ as a root $\varnothing$ with $d\geq 2$ subtrees $t_{1},\dots,t_{d}$ . The chosen set $A$ of nodes of $t$ determines a set $A_{i}$ of nodes in each tree $t_{i}$ that has $k_{i}$ leaves. Assume that $\sum_{v\in A_{i}}d_{v}\leq|A_{i}|+k_{i}-1$ for all $i$ . Then, we have

[TABLE]

and with the observation that $(d-1)(\bm{1}_{\varnothing\in A}-1)\leq 0$ , this proves our claim. Set $A=\{v\in\mathrm{Int}(t_{0}):d_{v}>\delta\}$ ; if $|A|\geq 1$ , one has

[TABLE]

which is negative for $|A|\geq 2$ (indeed, $\delta>1$ by $(H2)$ ). When $|A|=0$ or $|A|=1$ , we have $\widetilde{e}_{t_{0}}=\delta-\max_{v\in\mathrm{Int}(t_{0})}d_{v}$ (see also Footnote 8) and, therefore, $e_{t_{0}}=\max_{v\in\mathrm{Int}(t_{0})}d_{v}-k$ is negative unless $t_{0}$ has exactly one internal node (which is then of degree $k$ ). ∎
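The inequality $\sum_{v\in A}d_{v}\leq|A|+k-1$ used in this proof can be checked exhaustively on small trees. The sketch below does so by enumerating all subsets $A$ of internal nodes; the nested-tuple encoding of trees and the function names are assumptions made for illustration, and the check is of course no substitute for the induction above.

```python
from itertools import combinations

LEAF = ()  # hypothetical encoding: a leaf is an empty tuple, an internal node is a tuple of children

def count_leaves(tree) -> int:
    """Number of leaves of a nested-tuple tree."""
    return 1 if not tree else sum(count_leaves(child) for child in tree)

def internal_degrees(tree) -> list:
    """Numbers of children d_v of all internal nodes, root first."""
    if not tree:
        return []
    degrees = [len(tree)]
    for child in tree:
        degrees.extend(internal_degrees(child))
    return degrees

def inequality_holds(tree) -> bool:
    """Check sum_{v in A} d_v <= |A| + k - 1 for every subset A of internal nodes."""
    k = count_leaves(tree)
    degrees = internal_degrees(tree)
    return all(sum(subset) <= len(subset) + k - 1
               for r in range(len(degrees) + 1)
               for subset in combinations(degrees, r))

trees = [
    (LEAF, LEAF),
    ((LEAF, LEAF), LEAF, LEAF),
    ((LEAF, LEAF, LEAF), (LEAF, (LEAF, LEAF))),
]
print([inequality_holds(t) for t in trees])  # [True, True, True]
```

Equality is attained when $A$ consists of all internal nodes, since the degrees then sum to the number of edges, which is $|\mathrm{Int}(t)|+k-1$.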

It is now straightforward to translate this result in terms of the probability to find a given pattern in a random permutation in the set $\mathcal{C}:=\langle\mathcal{S}\rangle$ . As recalled in Section A.4, the hypothesis $(CS)$ is equivalent to the following: for every $k\geq 1$ and every permutation $\theta$ of size $k$ , there exists an analytic function $g_{\theta}$ and a constant $C_{\theta}$ (possibly equal to $0$ ) such that, on a $\Delta$ -neighborhood of $R_{S}$ , it holds that

[TABLE]

The quantities $C_{\theta}$ are involved in the statement of the following theorem.

Theorem 6.9.

Let $\bm{\sigma}_{n}$ be a uniform random permutation in $\langle\mathcal{S}\rangle_{n}$ . We assume hypotheses $(H2)$ and $(CS)$ . Then, for any $k\geq 2$ and for any $\pi\in\mathfrak{S}_{k}$ ,

[TABLE]

Consequently, there exists a random permuton $\bm{\mu}_{\mathcal{C}}$ with

[TABLE]

such that $\left(\mu_{\bm{\sigma}_{n}}\right)_{n}$ tends to $\bm{\mu}_{\mathcal{C}}$ in distribution.

Proof.

The starting point is the same as in the standard case (Proposition 5.12). As before we denote by $\bm{I}$ a random uniform $k$ -element subset of $[n]$ , and ${\mathbf{t}^{(n)}_{k}}$ is the tree of size $k$ induced by $k$ uniform leaves in a uniform canonical tree of size $n$ in $\mathcal{T}$ . From Lemma 3.11, we have

[TABLE]

where the sum runs over all substitution trees encoding $\pi$ . Denote by $t_{0}^{\pi}$ the substitution tree with only one internal node labeled by $\pi$ . When $n$ tends to infinity, Corollary 6.8 shows that every term in the above sum vanishes except the one corresponding to $t_{0}^{\pi}$ :

[TABLE]

Now we can compute directly from Proposition 4.5 that

[TABLE]

The first term has a dominant singularity of exponent at least $\delta-1$ , while the dominant singularity exponent of the second term is at least $\delta-k$ . We therefore focus on the second term and get by an easy computation that, around $z=\rho$ , we have

[TABLE]

and thus

[TABLE]

where $g$ and $h$ are analytic functions. Then we can apply the Transfer Theorem (Theorem A.3) to $T$ and $T_{t_{0}^{\pi}}$ (as in the proof of Corollary 6.8) and obtain that $\lim_{n\to\infty}\mathbb{P}({\mathbf{t}^{(n)}_{k}}=t_{0}^{\pi})$ is proportional to $C_{\pi}$ for $\pi\in\mathfrak{S}_{k}$ . Since the left-hand side of Eq. 43 sums to one (when summed over $\pi\in\mathfrak{S}_{k}$ , for a fixed $k$ ), this proves Eq. 41.

The rest of the statement follows immediately, using Theorem 2.5 (with 2.6). ∎

6.3. Hypothesis $(CS)$ and convergence of uniform random simple permutations

We may wish to replace the hypothesis $(CS)$ with a less technical hypothesis, such as the convergence of a random simple permutation in our set $\mathcal{S}$ to the random permuton $\bm{\mu}_{\mathcal{S}}$ uniquely determined by Eq. 42. We show here that, though not equivalent, these hypotheses are strongly related. Note that we do not assume $(H2)$ here, so that the existence of $\bm{\mu}_{\mathcal{S}}$ cannot be inferred from Theorem 6.9 and needs a separate proof.

Proposition 6.10.

Suppose that $S$ has a dominant singularity of exponent $\delta>1$ and assume condition $(CS)$ . Then there exists a permuton $\bm{\mu}_{\mathcal{S}}$ such that

[TABLE]

where the $C_{\pi}$ are given by Eq. 40 (which holds under hypothesis $(CS)$ ).

Let $\bm{\alpha}_{n}$ be a uniform random permutation of size $n$ in $\mathcal{S}$ . If $\left(\mu_{\bm{\alpha}_{n}}\right)$ converges in distribution, then its limit is $\bm{\mu}_{\mathcal{S}}$ . Conversely, if we assume that $S$ and all series $\operatorname{Occ}_{\theta}$ have a unique dominant singularity, then $\left(\mu_{\bm{\alpha}_{n}}\right)$ converges in distribution (and the limit must be $\bm{\mu}_{\mathcal{S}}$ , using the first part of the proposition).

Before giving the proof, let us make the following observation. If both $(H2)$ and $(CS)$ are satisfied, then we can apply both Theorem 6.9 and Proposition 6.10. By comparing Eqs. 42 and 44, we have $\bm{\mu}_{\mathcal{C}}=\bm{\mu}_{\mathcal{S}}$ in distribution (recall that the distribution of a random permuton is determined by its expected pattern densities, see Proposition 2.4). In particular, assuming that a uniform random simple permutation in the class converges in distribution to some random permuton, then this random permuton is $\bm{\mu}_{\mathcal{S}}=\bm{\mu}_{\mathcal{C}}$ , that is, the limit of a uniform random permutation in the class. This justifies a claim in the introduction.

Proof.

We start with the existence of $\bm{\mu}_{\mathcal{S}}$ . For every $k\geq 1$ , let $\bm{\rho}_{k}$ be a random permutation in $\mathfrak{S}_{k}$ such that $\mathbb{P}(\bm{\rho}_{k}=\pi)=C_{\pi}/({\sum_{\theta\in\mathfrak{S}_{k}}C_{\theta}})$ . By Proposition 2.9, we only need to show that $(\bm{\rho}_{k})_{k}$ forms a consistent family. Let $1\leq k\leq n$ , then for $\pi\in\mathfrak{S}_{k}$ ,

[TABLE]

On the other hand, the following combinatorial identity can be derived from the definition of the $(\operatorname{Occ}_{\theta})_{\theta\in\mathfrak{S}}$ :

[TABLE]

Indeed, the left-hand side is the series of simple permutations in $\mathcal{S}$ whose entries are partitioned into a set of $k$ marked entries forming a pattern $\pi$ and a set of $n-k$ marked entries. The right-hand side counts the same object, according to the pattern $\sigma$ formed by all the $n$ marked entries. To distinguish the marked entries of the first set from the ones of the second set, we need to specify a subpattern $\pi$ inside the pattern $\sigma$ , which explains the factor $\mathrm{occ}(\pi,\sigma)$ .
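The double counting described above can be tested on individual permutations. Read coefficientwise, and under our reading of the argument, it says that for any permutation $\alpha$ of size $N$ one has $\mathrm{occ}(\pi,\alpha)\binom{N-k}{n-k}=\sum_{\sigma\in\mathfrak{S}_{n}}\mathrm{occ}(\pi,\sigma)\,\mathrm{occ}(\sigma,\alpha)$. The sketch below checks this pointwise statement by brute force; it does not reproduce Eq. 46 itself, and the function names and test permutations are assumptions for illustration.

```python
from itertools import combinations, permutations
from math import comb

def pattern(values):
    """Standardize a sequence of distinct values to a permutation of 1..len(values)."""
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)

def occ(pi, sigma) -> int:
    """Number of subsequences of sigma whose pattern is pi."""
    return sum(1 for sub in combinations(sigma, len(pi)) if pattern(sub) == tuple(pi))

def double_count_sides(pi, alpha, n):
    """Both counts of pairs (I, K), I subset of K, |I| = k, |K| = n, pat_I(alpha) = pi."""
    k, size = len(pi), len(alpha)
    by_small_set = occ(pi, alpha) * comb(size - k, n - k)
    by_large_set = sum(occ(pi, sigma) * occ(sigma, alpha)
                       for sigma in permutations(range(1, n + 1)))
    return by_small_set, by_large_set

print(double_count_sides((2, 1), (2, 4, 1, 3), 3))  # (6, 6)
```

The two entries of the returned pair count the same marked objects, first according to the $k$ entries forming $\pi$ and then according to the pattern $\sigma$ of all $n$ marked entries.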

We now differentiate both sides of Eq. 46 $m$ times, with $m$ chosen so that $\delta-n-m<0$ , and replace all series with their asymptotic estimates, obtained from hypothesis $(CS)$ , Eq. 40 and singular differentiation (Theorem A.4). (Footnote 9: For $x\in\mathbb{C}$ and $r\in\mathbb{N}$ , we denote by $(x)_{r}$ the falling factorial $x(x-1)\cdots(x-r+1)$ .) This gives

[TABLE]

As only the singular parts diverge, taking the limit $z\to R_{S}$ allows us to identify the constants, yielding $\sum_{\sigma\in\mathfrak{S}_{n}}\mathrm{occ}(\pi,\sigma)C_{\sigma}=(-1)^{n-k}\binom{\delta-k}{n-k}C_{\pi}$ . Plugging this back in Eq. 45 yields

[TABLE]

As probabilities sum to $1$ , we get $\mathbb{P}(\operatorname{pat}_{{\bm{I}}_{n,k}}(\bm{\rho}_{n})=\pi)=\mathbb{P}(\bm{\rho}_{k}=\pi)$ , proving the consistency of $(\bm{\rho}_{k})_{k}$ .

As noticed in the proof of Theorem 6.9, the convergence of $\left(\mu_{\bm{\alpha}_{n}}\right)$ to $\bm{\mu}_{\mathcal{S}}$ is equivalent to the following: for any fixed $k\geq 2$ and any $\pi\in\mathfrak{S}_{k}$ , the limit $\lim_{n\to\infty}\mathbb{E}\big{[}\operatorname{\widetilde{occ}}(\pi,\bm{\alpha}_{n})\big{]}$ exists and is proportional to $C_{\pi}$ for $\pi\in\mathfrak{S}_{k}$ . Furthermore, by consistency, we only need to show it for large $k$ . Directly from the definitions, we have

[TABLE]

For the proof of the direct implication, we assume that $\mu_{\bm{\alpha}_{n}}$ converges in distribution. From Theorem 2.5, this means that $\mathbb{E}\big{[}\operatorname{\widetilde{occ}}(\pi,\bm{\alpha}_{n})\big{]}$ has a limit $\Delta_{\pi}$ for every $\pi$ of size $k\geq 2$ . Then when $n$ goes to infinity,

[TABLE]

As a consequence, for any fixed $\pi$ and $\varepsilon>0$ , there exist polynomials $g_{-}$ , $g_{+}$ such that for any real $z$ in $[0,R_{S})$ ,

[TABLE]

Hypothesis $(CS)$ implies that in $R_{S}$ we have $\operatorname{Occ}_{\pi}(z)=g_{\pi}(z)+(C_{\pi}+o(1))(R_{S}-z)^{\delta-k}$ for some analytic function $g_{\pi}$ . Also $S^{(k)}$ has a dominant singularity of exponent $\delta-k$ in $R_{S}$ so $S^{(k)}(z)=g_{S^{(k)}}(z)+(C_{S^{(k)}}+o(1))(R_{S}-z)^{\delta-k}$ for some analytic function $g_{S^{(k)}}$ and constant $C_{S^{(k)}}>0$ . Plugging these asymptotic estimates into (48) yields

[TABLE]

Let $k$ be such that $\delta-k<0$ , so that the singular parts are the only diverging quantities when $z\to R_{S}$ . After taking the limit we get $\left|C_{\pi}-\frac{C_{S^{(k)}}}{k!}\Delta_{\pi}\right|\leq\varepsilon$ for every $\varepsilon$ and hence equality. We have proven that $(\Delta_{\pi})_{\pi\in\mathfrak{S}_{k}}$ is proportional to $(C_{\pi})_{\pi\in\mathfrak{S}_{k}}$ for large $k$ , as desired.

For the converse, we start from Eq. 47. Theorem A.3 (which we can apply because of the hypotheses on $S$ and $\operatorname{Occ}_{\theta}$ ) gives the following asymptotic behavior when $n\to\infty$ :

[TABLE]

For fixed $k$ , the limit of the right-hand side is proportional to $C_{\pi}$ , which concludes the proof of the proposition. ∎

7. Asymptotic analysis: The critical case $S^{\prime}(R_{S})=2/(1+R_{S})^{2}-1$

The goal of this section is to describe the limiting permuton of a uniform permutation in a substitution-closed class $\mathcal{C}$ , whose set of simple permutations satisfies the following hypothesis.

Definition 7.1 (Hypothesis (H3)).

A family $\mathcal{S}$ of simple permutations is said to satisfy hypothesis $(H3)$ if the generating function $S$ meets the following conditions at its radius of convergence $R_{S}>0$ :

•

$S$ has a dominant singularity of exponent $\delta>1$ in $R_{S}$ ;

•

$S^{\prime}$ is convergent at $R_{S}$ and

$$S^{\prime}(R_{S})=\frac{2}{(1+R_{S})^{2}}-1.$$

This hypothesis implies the following behavior of $\Lambda$ near its singularity.

Lemma 7.2.

Assume that $S$ satisfies hypothesis $(H3)$ and set, as before,

[TABLE]

Then $\Lambda$ has a unique dominant singularity of exponent $\delta$ in $R_{\Lambda}:=\tfrac{R_{S}}{1+R_{S}}<1$ . Moreover, $\Lambda^{\prime}$ is convergent at $R_{\Lambda}$ and $\Lambda^{\prime}(R_{\Lambda})=1$ .

Proof.

The statement on the singularity exponent follows from Lemma A.6 (Supercritical case). The statement on $\Lambda^{\prime}$ is a simple computation from (H3) and Eq. 24. ∎

In the following, we denote $\delta_{*}=\min(\delta,2)$ . The behavior of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is given in the following lemma, whose proof is postponed to Section A.7. Note that the exponent is different from the one observed in Section 6 (the singularity comes here from a mixture of a branch point and of the singularity of $\Lambda$ , as explained in Section 1.7).

Lemma 7.3.

Assume that $S$ satisfies hypothesis $(H3)$ . Then there is a unique $\rho>0$ such that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=R_{\Lambda}$ . Moreover, $\rho$ is the radius of convergence of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ and $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has a unique dominant singularity of exponent $1/\delta_{*}$ in $\rho$ .

From here, the strategy is similar to that of the previous sections; we therefore skip unnecessary details. As in the previous section, we deduce immediately the asymptotic behavior of all generating functions of marked trees:

Corollary 7.4.

Assume that $S$ satisfies hypothesis $(H3)$ and define $\rho$ as above. Then $T$ has a unique dominant singularity of exponent $1/\delta_{*}$ in $\rho$ , with $T(\rho)=R_{S}$ . Moreover each of the generating functions $T^{\prime}$ , $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}^{\prime}$ , …, $T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ has a unique dominant singularity of exponent $1/\delta_{*}-1$ in $\rho$ .

To go further, we have to assume hypothesis $(CS)$ , i.e. that, for any $\theta$ of size $k$ , the series $\operatorname{Occ}_{\theta}(z)$ has a unique dominant singularity of exponent at least $\delta-k$ in $R_{S}$ . The composition $\operatorname{Occ}_{\theta}\circ T$ is then critical, and from Lemma A.6 (Critical case-B), $\operatorname{Occ}_{\theta}(T(z))$ has a unique singularity of exponent at least $\tfrac{1}{\delta_{*}}\min(\delta-k,1)$ in $\rho$ .

By a straightforward computation from Proposition 4.5, we get the following singularity for the series $T_{t_{0},V_{s}}$ .

Lemma 7.5.

Fix a tree $t_{0}$ and a subset $V_{s}$ of its internal nodes. Then the series $T_{t_{0},V_{s}}$ has a singularity of exponent at least

[TABLE]

The behavior of the uniform permutation in $\langle\mathcal{S}\rangle_{n}$ now depends on whether $\delta$ is smaller or larger than $2$ .

7.1. The case $\delta\in(1,2)$ .

Theorem 7.6.

Let $\mathcal{S}$ be a family of simple permutations verifying hypothesis $(H3)$ and $(CS)$ , with $\delta\in(1,2)$ . We consider the permuton $\bm{\mu}_{\mathcal{S}}$ as in Proposition 6.10 and denote for $\pi\in\mathfrak{S}_{k}$

[TABLE]

Let also

[TABLE]

be the probability distribution of the induced subtree with $k$ leaves in the $\delta$ -stable tree (see [28, Thm 3.3.3]).

If $\bm{\sigma}_{n}$ is a uniform permutation in $\langle\mathcal{S}\rangle_{n}$ , then

[TABLE]

As a consequence, $\mu_{\bm{\sigma}_{n}}$ converges in distribution to a random permuton, whose average pattern densities are determined by Eq. 49 and depend only on $\delta$ and $\bm{\mu}_{\mathcal{S}}$ . We call it the $\delta$ -stable permuton driven by $\bm{\mu}_{\mathcal{S}}$ .

A construction of the $\delta$ -stable permuton driven by $\bm{\nu}$ for every $\delta\in(1,2)$ and random permuton $\bm{\nu}$ is given in Lemma B.2.

Remark 7.7.

In this case, all possible patterns, in particular nonseparable ones, appear with positive probability in the limit (as long as they appear with positive probability in a uniform simple permutation in the class). More precisely, the proof will show the following: $k$ random leaves in a uniform canonical tree induce substitution trees with arbitrarily large node degrees, and the first common ancestors of those leaves are all simple permutations with probability tending to 1.

Proof.

We start from the estimate of Lemma 7.5. In the present case $\delta_{*}=\delta$ and $\delta-d_{v}$ is always negative. If we fix $(t_{0},V_{s})$ and apply Theorem A.3 to $T$ and $T_{t_{0},V_{s}}$ , we get

[TABLE]

with

[TABLE]

where the last identity uses $|E|-|\mathrm{Int}(t_{0})|-k+1=0$ . Therefore $e_{(t_{0},V_{s})}<0$ unless $V_{s}=\mathrm{Int}(t_{0})$ , i.e. all internal nodes of $t_{0}$ are in $V_{s}$ . As a consequence, if $\bm{t}^{(n)}$ is a uniform canonical tree of size $n$ and $\bm{I}$ is a uniform subset of the leaves of size $k$ ,

[TABLE]

Now we focus on the case $V_{s}=\mathrm{Int}(t_{0})$ . The series $T_{t_{0},\mathrm{Int}(t_{0})}$ has a dominant singularity of exponent $\widetilde{e}_{(t_{0},\mathrm{Int}(t_{0}))}=1/\delta-k$ (see Eq. 50). We are left with identifying the constant in its singular expansion. In what follows, we will always denote by $C_{A}$ the constant in front of the singular part of the expansion of the analytic function $A$ , i.e. $A(z)=g_{A}(z)+C_{A}(R_{A}-z)^{\delta_{A}}$ , with $R_{A}$ the radius of convergence of $A$ and $g_{A}$ analytic at $R_{A}$ . In particular, given that $\delta\in(1,2)$ and $k\geq 2$ , we have the following expansions (recall $T(\rho)=R_{S}$ ):

[TABLE]

Using Proposition 4.5, we can find the singular expansion of $T_{t_{0},\mathrm{Int}(t_{0})}$ :

[TABLE]

From the Transfer Theorem (Theorem A.3) applied to $T$ and $T_{t_{0}}$ we deduce

[TABLE]

The recursive property of the Gamma function gives $\frac{-k!\Gamma(-1/\delta)}{\Gamma(k-1/\delta)\delta^{k}}=\frac{k!}{(\delta-1)\cdots((k-1)\delta-1)}$ . Furthermore, by definition of $\Delta_{\theta_{v}}$ , relation (39) and singular differentiation of $S$ , we get

[TABLE]

This allows us to rewrite (51) as

[TABLE]

Now the proof of Lemma 7.3 (see Eq. 74) yields $-C_{T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}}=C_{\Lambda}^{-1/\delta}$ . We have also the following relations between the various constants (see Eqs. 65 and 64 in the appendix):

[TABLE]

From there we deduce $-C_{T}=C_{S}^{-1/\delta}$ . Therefore Eq. 52 rewrites

[TABLE]

Summing over trees $t_{0}$ with $\operatorname{perm}(t_{0})=\pi$ gives the theorem. ∎
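The Gamma function identity used in the proof above can be checked numerically. The following is a minimal sketch, assuming only the identity as stated in the text; the function names `gamma_side` and `product_side` are illustrative and not taken from the paper.

```python
from math import gamma, factorial, isclose

def gamma_side(k, delta):
    """Left-hand side: -k! * Gamma(-1/delta) / (Gamma(k - 1/delta) * delta^k)."""
    return -factorial(k) * gamma(-1 / delta) / (gamma(k - 1 / delta) * delta ** k)

def product_side(k, delta):
    """Right-hand side: k! / ((delta - 1)(2*delta - 1)...((k-1)*delta - 1))."""
    denominator = 1.0
    for j in range(1, k):
        denominator *= j * delta - 1
    return factorial(k) / denominator

# Compare both sides for a few values of delta in (1, 2) and small k.
for delta in (1.1, 1.5, 1.9):
    for k in range(2, 9):
        assert isclose(gamma_side(k, delta), product_side(k, delta), rel_tol=1e-9)
```

The check relies on the recursion $\Gamma(x+1)=x\,\Gamma(x)$ applied $k$ times starting from $x=-1/\delta$, which is the "recursive property" invoked in the proof.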

7.2. The case $\delta>2$ .

Theorem 7.8.

Let $\mathcal{S}$ be a family of simple permutations verifying hypotheses $(H3)$ and $(CS)$ , with $\delta>2$ . If $\bm{\sigma}_{n}$ is a uniform permutation in $\langle\mathcal{S}\rangle_{n}$ , then $\bm{\sigma}_{n}$ converges in distribution to the biased Brownian separable permuton of parameter $p$ , where

[TABLE]

Remark 7.9.

While the limiting permuton in this case is independent of $\delta>2$ and is the same as in the standard case, the fine details of this convergence might be different. In particular, if $\pi$ is a nonseparable pattern, the order of magnitude of $\mathbb{E}[\operatorname{\widetilde{occ}}(\pi,\bm{\sigma}_{n})]$ depends on $\delta$ and is in general bigger than in the standard case – compare Eq. 55 to Proposition 5.13.

Proof.

Let $(t_{0},V_{s})$ be a decorated tree with $k$ leaves. Once again, applying Theorem A.3 to $T$ and $T_{t_{0},V_{s}}$ leads to

[TABLE]

But in this case,

[TABLE]

The above inequality is justified as follows:

[TABLE]

The first inequality is an equality if and only if $d_{v}\leq\delta$ for all $v\in V_{s}$ (recall that $\delta>2$ ). In the second part, the equality case occurs when for all $v$ in $\mathrm{Int}(t_{0})$ , either $d_{v}>\delta$ or $d_{v}=2$ . This implies that if $t_{0}$ is not binary, Eq. 55 is a strict inequality (regardless of $V_{s}$ ) and

[TABLE]

We can show that, up to replacing $\tau$ with $R_{\Lambda}$ and $\kappa$ with $R_{S}$ , the estimates of the singular parts of $T$ , $T_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ , $T^{\prime}$ , $T^{+}$ , …, $T^{-}_{{\scriptscriptstyle{\mathrm{not}}{\ominus}}}$ in Propositions 5.8 and 5.9 still hold. Indeed, the proofs can be transposed verbatim up to replacing some calls to the standard case of Lemma A.6 with the critical case.

However, Proposition 5.10 no longer holds in full generality, because $\operatorname{Occ}_{\theta}$ is not necessarily convergent at $\tfrac{\tau}{1-\tau}$ for every $\theta$ . It is nonetheless convergent when $|\theta|=2$ , since by hypothesis $(CS)$ the singularity of $\operatorname{Occ}_{\theta}$ then has exponent at least $\delta-2>0$ . This is enough to show that Proposition 5.10 still holds for binary trees. Moreover nonbinary trees still disappear in the limit according to Eq. 56. This allows us to conclude as in Section 5. ∎

Appendix A Complex analysis toolbox

A.1. Aperiodicity and Daffodil Lemma

To study the asymptotic behavior of combinatorial generating functions, it is important to locate dominant singularities. The following lemma is useful to this purpose.

Recall that a function $A$ analytic at $0$ is aperiodic if there do not exist two integers $r\geq 0$ and $d\geq 2$ and a function $B$ analytic at $0$ such that $A(z)=z^{r}B(z^{d})$ .

Lemma A.1 (Daffodil Lemma).

Let $A$ be a generating function (with nonnegative coefficients) analytic in $|z|<R_{A}$ . If $A$ is aperiodic, then $|A(z)|<A(|z|)\leq A(R_{A})$ for $|z|\leq R_{A}$ and $z\neq|z|$ . (The case $|z|=R_{A}$ can only be considered if $A(R_{A})<\infty$ .)

This lemma can be found in [30, Lemma IV.1, p. 266]. Note that this reference does not consider the case of $z$ on the circle of convergence, i.e. $|z|=R_{A}$ (although this case is used later in the book, e.g. in the proof of Theorem VI.6, p. 405); the proof of the lemma in this case is similar to the case $|z|<R_{A}$ . The complete statement of the Daffodil Lemma in [30] also deals with cases where the function $A$ is periodic, but we do not need these cases in our work.
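To illustrate the role of aperiodicity, here is a small numerical sketch. It compares the aperiodic polynomial $A(z)=z+z^{2}$ with the periodic polynomial $z^{2}+z^{4}=B(z^{2})$; both examples and the helper names are assumptions chosen for illustration only and do not appear in the paper.

```python
import cmath

def series_value(coeffs, z):
    """Evaluate sum_n coeffs[n] * z**n by Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = value * z + c
    return value

def daffodil_gap(coeffs, radius, samples=360):
    """Smallest value of A(r) - |A(r e^{i theta})| over sampled angles theta != 0."""
    peak = series_value(coeffs, radius)
    gaps = []
    for j in range(1, samples):
        z = radius * cmath.exp(2j * cmath.pi * j / samples)
        gaps.append(peak - abs(series_value(coeffs, z)))
    return min(gaps)

aperiodic = [0, 1, 1]        # A(z) = z + z^2
periodic = [0, 0, 1, 0, 1]   # A(z) = z^2 + z^4 = B(z^2) with B(w) = w + w^2

gap_aperiodic = daffodil_gap(aperiodic, 0.5)  # strictly positive on the sampled circle
gap_periodic = daffodil_gap(periodic, 0.5)    # vanishes (up to rounding) at z = -0.5
```

In the periodic example the modulus reaches $A(|z|)$ again at $z=-|z|$, which is exactly the situation the aperiodicity assumption rules out.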

A.2. Transfer theorem

We start by defining the notion of $\Delta$ -domain. We use $\operatorname{Arg}(z)$ for the principal determination of the argument of $z$ in $\mathbb{C}\setminus\mathbb{R}^{-}$ taking its values in $(-\pi,\pi)$ .

Definition A.2 ( $\Delta$ -domain and $\Delta$ -neighborhood).

A domain $\Delta$ is a $\Delta$ -domain at $1$ if there exist two real numbers $R>1$ and $\pi/2<\phi<\pi$ such that

$$\Delta=\{z\in\mathbb{C},\ |z|<R,\ z\neq 1,\ |\operatorname{Arg}(1-z)|<\phi\}.$$

By extension, for a complex number $\rho\neq 0$ , a domain is a $\Delta$ -domain at $\rho$ if it is the image by the mapping $z\rightarrow\rho z$ of a $\Delta$ -domain at $1$ . A $\Delta$ -neighborhood of $\rho$ is the intersection of a neighborhood of $\rho$ and a $\Delta$ -domain at $\rho$ .

We will make use of the following family of $\Delta$ -neighborhoods: for $\rho\neq 0\in\mathbb{C}$ , $0<r<|\rho|$ , $\varphi>\pi/2$ , set $\Delta(\varphi,r,\rho)=\{z\in\mathbb{C},|\rho-z|<r,|\operatorname{Arg}(\rho-z)|<\varphi\}$ .
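As a concrete reading of the definition of $\Delta(\varphi,r,\rho)$, the following sketch tests membership of a point. The function name is hypothetical, and excluding the point $z=\rho$ itself is an assumption made here because $\operatorname{Arg}$ is not defined at $0$.

```python
import cmath
from math import pi

def in_delta_neighborhood(z, phi, r, rho):
    """Return True if z lies in Delta(phi, r, rho) = {|rho - z| < r, |Arg(rho - z)| < phi}."""
    diff = rho - z
    if diff == 0:
        # Arg is undefined at 0, so the singularity rho itself is excluded (assumption).
        return False
    # cmath.phase returns the principal argument in [-pi, pi].
    return abs(diff) < r and abs(cmath.phase(diff)) < phi

# Example with rho = 1, r = 0.5 and opening angle phi = 3*pi/4 > pi/2.
inside_left = in_delta_neighborhood(0.9, 3 * pi / 4, 0.5, 1)        # real point below rho
inside_above = in_delta_neighborhood(1 + 0.1j, 3 * pi / 4, 0.5, 1)  # point straight above rho
outside_right = in_delta_neighborhood(1.1, 3 * pi / 4, 0.5, 1)      # real point beyond rho
```

Because $\varphi>\pi/2$, the region contains points with real part slightly larger than $\rho$, provided they are far enough from the real axis; only a wedge around the ray $[\rho,+\infty)$ is removed.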

When a function $A$ is analytic on a $\Delta$ -domain at some $\rho$ , the asymptotic behavior of its coefficients is closely related to the behavior of the function near the singularity $\rho$ . The following theorem is a corollary of [30, Theorem VI.3 p. 390].

Theorem A.3 (Transfer Theorem).

Let $A$ be a function analytic on a $\Delta$ -domain $\Delta$ at $R_{A}$ , $\delta$ be an arbitrary real number in $\mathbb{R}\setminus\mathbb{Z}_{\geq 0}$ and $C_{A}$ a constant possibly equal to $0$ .

Suppose $A(z)=(C_{A}+o(1))(1-\tfrac{z}{R_{A}})^{\delta}$ when $z$ tends to $R_{A}$ in $\Delta$ . Then the coefficient of $z^{n}$ in $A$ satisfies

$$[z^{n}]A(z)=(C_{A}+o(1))\,\frac{1}{R_{A}^{n}}\,\frac{n^{-\delta-1}}{\Gamma(-\delta)}.$$
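The transfer theorem can be sanity-checked on the model function $A(z)=(1-z)^{\delta}$, whose coefficients are known exactly. The sketch below assumes the standard form of the estimate from [30, Theorem VI.3], namely $[z^{n}]A(z)\sim C_{A}\,R_{A}^{-n}\,n^{-\delta-1}/\Gamma(-\delta)$; the function names are illustrative.

```python
from math import gamma

def binomial_series_coefficient(delta, n):
    """Coefficient of z^n in (1 - z)^delta, via c_n = c_{n-1} * (n - 1 - delta) / n."""
    c = 1.0
    for m in range(1, n + 1):
        c *= (m - 1 - delta) / m
    return c

def transfer_estimate(delta, n, C=1.0, R=1.0):
    """Assumed leading-order estimate C * R^(-n) * n^(-delta-1) / Gamma(-delta)."""
    return C * R ** (-n) * n ** (-delta - 1) / gamma(-delta)

# Ratio exact / estimate for a few non-integer exponents; it approaches 1 as n grows.
ratios = {
    delta: binomial_series_coefficient(delta, 2000) / transfer_estimate(delta, 2000)
    for delta in (0.5, 1.5, -1.5)
}
```

For these exponents the relative error at $n=2000$ is of order $1/n$, in line with the $o(1)$ term of the theorem.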

A.3. Singular differentiation

The next result is also useful to us.

Theorem A.4 (Singular differentiation).

Let $A$ be an analytic function in a $\Delta$ -neighborhood of $R_{A}$ with the following singular expansion near its singularity $R_{A}$

[TABLE]

where $\delta_{j},\delta\in\mathbb{C}$ .

Then, for each $k>0$ , the $k$ -th derivative $A^{(k)}$ is analytic in some $\Delta$ -domain at $R_{A}$ and

[TABLE]

We refer the reader to [30, Theorem VI.8 p. 419] for a proof of this theorem (this reference considers functions defined on a $\Delta$ -domain, but the proof still works with functions defined on a $\Delta$ -neighborhood).

A.4. Exponents of dominant singularity

In this section, we introduce some compact terminology and easy lemmas to keep track of the exponent $\delta$ of the singularities and of the shape of the domain of analyticity without computing the functions explicitly.

Recall that the radius of convergence $R_{A}$ of an analytic function $A$ is the modulus of the singularities closest to the origin, called dominant singularities. Recall also that for series with positive real coefficients, by Pringsheim’s theorem [30, Th. IV.6 p. 240], $R_{A}$ is necessarily a dominant singularity. This justifies the following definition:

Let $\delta$ be a real, which is not an integer. We say that a series $A$ with radius of convergence $R_{A}$ has a dominant singularity of exponent $\delta$ in $R_{A}$ (resp. of exponent at least $\delta$ ) if $A$ has an analytic continuation on a $\Delta$ -neighborhood $\Delta_{A}$ of $R_{A}$ and, on $\Delta_{A}$ , we have

$$A(z)=g_{A}(z)+(C_{A}+o(1))\,(R_{A}-z)^{\delta},\tag{57}$$

where $g_{A}(z)$ is an analytic function on a neighbourhood of $R_{A}$ (called the analytic part), and $C_{A}$ a nonzero constant (resp. any constant); $(C_{A}+o(1))\,(R_{A}-z)^{\delta}$ is sometimes referred to as the singular part.

If furthermore, $A$ has no other singularity on the disk of convergence, we say that it has a unique dominant singularity of exponent $\delta$ (resp. at least $\delta$ ) in $R_{A}$ . Since we assumed that $A$ has an analytic continuation on a $\Delta$ -neighborhood $\Delta_{A}$ of $R_{A}$ , by a standard compactness argument, this is equivalent to saying that $A$ can be extended to a $\Delta$ -domain in $R_{A}$ .

We make the following observation. According to the value of $\delta$ , we may move (part of) $g_{A}(z)$ in the error term and write Eq. 57 in a simpler form, still on a $\Delta$ -neighborhood of $R_{A}$ .

•

For $\delta<0$ , $g_{A}(z)=o((R_{A}-z)^{\delta})$ so $A(z)=(C_{A}+o(1))\,(R_{A}-z)^{\delta}$ .

•

For $0<\delta<1$ , considering the constant term in the Taylor series expansion of $g_{A}(z)$ we find that $A(z)=A(R_{A})+(C_{A}+o(1))\,(R_{A}-z)^{\delta}$ .

•

Similarly, for $\delta>1$ , we obtain

[TABLE]

in which the third dominant term (after the constant and the linear term) depends on how $\delta$ compares with $2$ . But in each case, we have

[TABLE]

where $\delta_{*}=\min(\delta,2)$ .

We now record a few easy lemmas to manipulate these notions. First consider the stability by product.

Lemma A.5.

Let $F$ and $G$ be series with nonnegative coefficients and the same radius of convergence $R=R_{F}=R_{G}\in(0,\infty)$ . Assume they have each a dominant singularity of exponent $\delta_{F}$ and $\delta_{G}$ respectively in $R$ . Then $F\cdot G$ has a dominant singularity in $R$ of exponent $\delta$ defined by

•

$\delta=\delta_{F}+\delta_{G}$ if both $\delta_{F}$ and $\delta_{G}$ are negative;

•

$\delta=\min(\delta_{F},\delta_{G})$ otherwise.

Moreover, if both $F$ and $G$ have unique dominant singularities, so has $F\cdot G$ .

Proof.

The proof is easy. The analytic function $F\cdot G$ can be extended to the intersection of the domains of $F$ and $G$ . The exponent of the singular expansion around $R$ is obtained by multiplying the singular expansions of $F$ and $G$ : note that, if $\delta_{F}$ is negative, the series $F$ is divergent and the singular part is the dominant part around $R$ . Conversely, when $\delta_{F}$ is positive, the dominant part of the expansion is the value $F(R)$ of the analytic part at point $R$ , which is always positive, since the series has nonnegative coefficients. The same holds of course for $G$ , which explains the case distinction in the lemma. ∎
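The case distinction of Lemma A.5 is simple enough to be written as a one-line rule. The following sketch only encodes the statement of the lemma; the function name `product_exponent` is hypothetical.

```python
def product_exponent(delta_f, delta_g):
    """Exponent of the dominant singularity of F*G according to Lemma A.5.

    Both series are assumed to have nonnegative coefficients, the same finite
    radius of convergence, and non-integer exponents delta_f and delta_g.
    """
    if delta_f < 0 and delta_g < 0:
        # Both factors diverge: the singular parts multiply.
        return delta_f + delta_g
    # Otherwise a factor with positive exponent contributes a positive constant.
    return min(delta_f, delta_g)

both_divergent = product_exponent(-0.5, -1.5)  # -2.0
mixed = product_exponent(-0.5, 1.5)            # -0.5
both_convergent = product_exponent(0.5, 1.5)   # 0.5
```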

We now consider the composition $F\circ G$ . We should differentiate cases where $G(R_{G})>R_{F}$ , $G(R_{G})<R_{F}$ or $G(R_{G})=R_{F}$ (sometimes called supercritical, subcritical and critical cases [30, Sec.VI.9]).

Lemma A.6 (Dominant singularity of $F\circ G$ ).

Let $F$ and $G$ be series with nonnegative coefficients with radii of convergence $R_{F},R_{G}$ in $(0,\infty)$ .

Supercritical case: Assume that $G(0)<R_{F}<G(R_{G})$ ( $G(R_{G})$ may be finite or infinite).

Call $\rho<R_{G}$ the unique positive number with $G(\rho)=R_{F}$ .

We assume that $F$ has a dominant singularity of exponent $\delta_{F}$ in $R_{F}$ . Then:

i)

$F\circ G$ has also a dominant singularity of exponent $\delta_{F}$ in $\rho$ .

ii)

Moreover, if $G$ is aperiodic, then the dominant singularity of $F\circ G$ is unique.

Subcritical case: Assume that $G(R_{G})<R_{F}$ .

We assume that $G$ has a dominant singularity of exponent $\delta_{G}$ in $R_{G}$ . Then:

i)

$F\circ G$ has also a dominant singularity of exponent $\delta_{G}$ in $R_{G}$ .

ii)

Moreover, if the dominant singularity of $G$ is unique, then the dominant singularity of $F\circ G$ is unique.

Critical case-A: Assume that $G(R_{G})=R_{F}$ .

We assume that $F$ and $G$ both have a dominant singularity of respective exponents $\delta_{F}$ and $\delta_{G}$ . Suppose furthermore $\delta_{G}>1$ . Then:

i)

$F\circ G$ has also a dominant singularity of exponent $\min(\delta_{G},\delta_{F})$ in $R_{G}$ .

ii)

Moreover, if $G$ is aperiodic, then the dominant singularity of $F\circ G$ is unique.

Critical case-B: Assume again that $G(R_{G})=R_{F}$ . We assume that $F$ and $G$ both have a dominant singularity of respective exponents $\delta_{F}$ and $\delta_{G}$ . Suppose furthermore $\delta_{G}\in(0,1)$ . Then:

i)

$F\circ G$ has a dominant singularity of exponent $\min(\delta_{F},1)\delta_{G}$ in $R_{G}$ .

ii)

Moreover, if $G$ is aperiodic, then the singularity is unique.

Proof.

Supercritical case. It is clear that $F\circ G$ is analytic around any $r\in[0,\rho)$ and has nonnegative coefficients, hence it has radius of convergence at least $\rho$ .

To show that $F\circ G$ is defined in a $\Delta$ -neighborhood $\Delta$ of $\rho$ , we show that $G(\Delta)$ is included in $\Delta_{F}$ . This follows easily from the fact that $G$ is analytic in $\rho$ and has a derivative $G^{\prime}(\rho)$ which is a positive real number.

When $z$ is close to $\rho$ , plugging $G(z)$ in the expansion (57) of $F$ we obtain

[TABLE]

The first term $g_{F}(G(z))$ is analytic at $\rho$ . Since $G(\rho)=R_{F}$ and $G$ is differentiable at $\rho$ we have

[TABLE]

Combining these two expansions yields

[TABLE]

which proves i).

Item ii) is also easy. In the case where we assume $G$ aperiodic, we need Lemma A.1, which ensures that $|G(\zeta)|<R_{F}$ for $|\zeta|\leq|\rho|$ , $\zeta\neq\rho$ .

Subcritical case. Most arguments are similar to the ones of the supercritical case. Therefore we only explain the differences in the singular expansion of $F(G(z))$ . Using the singular expansion (57) of $G$ , we have

[TABLE]

Since $G(R_{G})<R_{F}<+\infty$ , the exponent $\delta_{G}$ is positive and the term $(R_{G}-z)^{\delta_{G}}$ tends to $0$ at $R_{G}$ . Both $G(z)$ and $g_{G}(z)$ tend to $G(R_{G})$ as $z\to R_{G}$ , so that, by standard calculus arguments, we have

[TABLE]

Since $F$ and $g_{G}$ are analytic at $G(R_{G})$ and $R_{G}$ respectively, this expansion is of the desired form.

Critical case-A. As above, we focus on the expansion of $F(G(z))$ . Since $\delta_{G}>1$ , $G$ is differentiable at $\rho=R_{G}$ and Eq. 61 still holds. The difference is that $g_{F}(G(z))$ is not analytic anymore. Namely, when $z$ is close to $\rho$ ,

[TABLE]

Then

[TABLE]

Since $g_{F}(g_{G}(z))$ is analytic at $\rho$ , the exponent of the dominant singularity of $F\circ G$ is $\min(\delta_{F},\delta_{G})$ . Note that the singular terms cannot cancel each other since when $\delta_{F}=\delta_{G}$ the constants have the same sign.

Critical case-B. Again, we focus on the singular expansion of $F(G(z))$ . Now, since $\delta_{G}<1$ , $G$ is not differentiable at $\rho=R_{G}$ . Instead of (60) we have

[TABLE]

Eq. (61) becomes

[TABLE]

(In this case, $C_{G}$ must be negative, as can be observed by writing the transfer theorem for the coefficients of $G$ which are non-negative by assumption.) As for $g_{F}(G(z))$ , (63) still holds. We obtain

[TABLE]

We conclude that the exponent of the dominant singularity is $\min(\delta_{F},1)\delta_{G}$ . ∎
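The four cases of Lemma A.6 can be summarized as a small lookup rule for the exponent of $F\circ G$. This is only a bookkeeping sketch of item i) in each case; the function name and the string labels for the cases are hypothetical.

```python
from math import isclose

def composition_exponent(case, delta_f, delta_g):
    """Exponent of the dominant singularity of F o G according to Lemma A.6, item i)."""
    if case == "supercritical":
        return delta_f
    if case == "subcritical":
        return delta_g
    if case == "critical":
        if delta_g > 1:                      # Critical case-A
            return min(delta_f, delta_g)
        if 0 < delta_g < 1:                  # Critical case-B
            return min(delta_f, 1) * delta_g
        raise ValueError("Lemma A.6 requires delta_g > 1 or delta_g in (0, 1)")
    raise ValueError("unknown case: " + case)

# Example from Section 7: Occ_theta has exponent (at least) delta - k and T has
# exponent 1/delta_* with delta_* = min(delta, 2); the composition is critical.
delta, k = 1.5, 3
delta_star = min(delta, 2)
occ_of_T = composition_exponent("critical", delta - k, 1 / delta_star)
```

With $\delta=1.5$ and $k=3$ this returns $\tfrac{1}{\delta_{*}}\min(\delta-k,1)=-1$, matching the exponent quoted in Section 7 for $\operatorname{Occ}_{\theta}(T(z))$.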

We note that the above proof also yields the constant $C_{F\circ G}$ appearing in the singular expansion of $F\circ G$ . The two following particular cases were used in Section 7 (in particular, we assume in the discussion below that hypothesis $(H3)$ holds).

•

We take $F(z)=S(z)$ and $G(u)=\frac{u}{1-u}$ . The composition is $F\circ G(u)=\Lambda(u)-\tfrac{u^{2}}{1-u}$ , and $\tfrac{u^{2}}{1-u}$ is analytic at $R_{\Lambda}<1$ . Since $\frac{u}{1-u}$ diverges at its singularity and $S$ has a finite radius of convergence, the composition is supercritical. Extracting the constant from (61), we get

[TABLE]

since in this case $C_{F}=C_{S}$ , $\delta_{F}=\delta$ and $R_{F\circ G}=R_{\Lambda}$ .

•

We take $F(u)=\frac{u}{1-u}$ and $G(z)=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ . The composition is $T(z)=\frac{T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)}{1-T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)}$ (Eq. 12). Since $G(R_{G})=T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=R_{\Lambda}<1=R_{F}$ (here we use Hypothesis (H3), and $\rho$ is defined in Lemma 7.3), the composition is subcritical. Extracting the constant from (62), we get

[TABLE]

Finally, we state the following result, which follows from Theorem A.4.

Lemma A.7 (Singular differentiation).

If $F$ has a (unique) dominant singularity of exponent (at least) $\delta$ in $\rho$ , then its $k$ -th derivative $F^{(k)}$ has a (unique) dominant singularity of exponent (at least) $\delta-k$ in $\rho$ .

A.5. An analytic implicit function theorem

The following result allows us to locate the dominant singularity of series defined by an implicit equation.

Lemma A.8 (Analytic Implicit Functions).

Let $F(z,w)$ be a bivariate function analytic at $(z_{0},w_{0})$ , we denote $F_{w}=\tfrac{\partial F}{\partial w}$ . If $F(z_{0},w_{0})=0$ and $F_{w}(z_{0},w_{0})\neq 0$ , then there exists a unique function $\phi(z)$ analytic in a neighbourhood of $z_{0}$ such that $\phi(z_{0})=w_{0}$ and $F(z,\phi(z))=0$ .

We refer the reader to [30, Lemma VII.2, p. 469] for a proof of this result.

A.6. Proof of Lemma 6.3

Let $R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ be the radius of convergence of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ . If $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}})\geq R_{\Lambda}$ , then by the intermediate value theorem, we know that there exists $\rho$ as in the lemma.

We prove this inequality by contradiction. Assume $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}})<R_{\Lambda}$ . We apply Lemma A.8. The bivariate function we consider is $(z,w)\mapsto z-w+\Lambda(w)$ . It vanishes at $(R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}},T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}))$ and the derivative with respect to $w$ at that point is nonzero since

[TABLE]

(Recall that this last inequality is equivalent to (37) in Hypothesis (H2).)

Therefore, $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has an analytic continuation on a neighborhood of $R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ . Since it has positive coefficients, by Pringsheim’s theorem [30, Th. IV.6 p. 240], this is in contradiction with the fact that $R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is the radius of convergence of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ .

We have therefore proved that there exists $\rho\leq R_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ such that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)=R_{\Lambda}$ . Note that it implies the relation $\rho=R_{\Lambda}-\Lambda(R_{\Lambda})$ .

We now consider $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ around $z=\rho$ . Equation (38) defining $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ can be rewritten as $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)=G(z,T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z))$ , where

[TABLE]

Since $\Lambda$ has a dominant singularity of exponent $\delta>1$ in $R_{\Lambda}$ , Equation (58), together with elementary computations, yield the following: for $w$ in a $\Delta$ -neighborhood $D_{\Lambda}$ of $R_{\Lambda}$ ,

[TABLE]

We now use Picard’s method of successive approximants to show the existence and analyticity of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ on a $\Delta$ -neighborhood $D_{T}$ of $\rho$ . We refer to [30, Appendix B.5 p. 753] for a synthetic description of the method in the case where $\Lambda$ is analytic in $R_{\Lambda}$ ; we have to adapt it carefully to our setting.

Define $\phi_{0}(z)=R_{\Lambda}$ and $\phi_{j+1}(z)=G(z,\phi_{j}(z))$ whenever $\phi_{j}(z)$ is in $D_{\Lambda}$ . We have $\phi_{1}(z)-\phi_{0}(z)=\tfrac{z-\rho}{1-\Lambda^{\prime}(R_{\Lambda})}$ . Also, Theorem A.4 of singular differentiation applied to Eq. 66 implies that

[TABLE]

Therefore, for $j\geq 1$ , if $\phi_{j}(z)$ and $\phi_{j+1}(z)$ are defined and lie in $D_{\Lambda}$ , the estimate displayed below holds. (Footnote 10: There is a slight subtlety here: we would like to apply the classical inequality $|f(w)-f(w^{\prime})|\leq\|f^{\prime}\|_{\infty}|w-w^{\prime}|$ , but this is not possible since the domain $D_{\Lambda}$ is not convex. Note however that a $\Delta$ -neighborhood $D$ is always a quasi-convex set, in the sense that we can always find a path between $w$ and $w^{\prime}$ whose length is smaller than $K|w-w^{\prime}|$ , where $K$ depends on the angle defining $D$ but not on $w$ and $w^{\prime}$ . Therefore the following weaker inequality holds: $|f(w)-f(w^{\prime})|\leq K\|f^{\prime}\|_{\infty}|w-w^{\prime}|$ , which is good enough for our purpose, since the constant $K$ disappears in the $\mathcal{O}$ symbol.)

[TABLE]

where $\eta=\sup_{w\in D_{\Lambda}}|R_{\Lambda}-w|$ . Fix $\varepsilon>0$ . Up to reducing the radius of $D_{\Lambda}$ , we can therefore assume that

[TABLE]

Thus, if $\phi_{j}(z)$ is in $D_{\Lambda}$ for every $j<M$ , then $\phi_{M}(z)$ is defined and we have

[TABLE]

If we take $\varepsilon$ small enough, the argument of $\phi_{M}(z)-R_{\Lambda}$ is close to the one of $z-\rho$ . Furthermore if the modulus of $z-\rho$ is small so is the one of $\phi_{M}(z)-R_{\Lambda}$ . This ensures the existence of a $\Delta$ -neighborhood $D_{T}$ of $\rho$ (not depending on $M$ and $z$ ), such that for $z\in D_{T}$ and $M\geq 1$ , $\phi_{M}(z)$ is in $D_{\Lambda}$ as long as it is defined. In particular, $\phi_{M+1}(z)$ is also defined and by immediate induction, all $\phi_{j}$ are defined and analytic on $D_{T}$ .

Eq. 67 also implies that $\phi_{j}$ converges locally uniformly on $D_{T}$ . The limit is the unique solution $w$ in $D_{\Lambda}$ of the fixed point equation $w=G(z,w)$ (the uniqueness of the solution comes from the fact that for every $z\in D_{T}$ , $w\mapsto G(z,w)$ is a contraction for $w$ in $D_{\Lambda}$ ). This limit is therefore an analytic continuation of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ to $D_{T}$ . Note also that from Eq. 68, the following estimate holds on $D_{T}$ :

[TABLE]

Using the expansion given in Eq. 57 of $\Lambda$ around $R_{\Lambda}$ , we have for $z\in D_{T}$ ,

[TABLE]

As $\operatorname{id}-g_{\Lambda}$ is analytic at $R_{\Lambda}$ with a nonzero derivative $1-\Lambda^{\prime}(R_{\Lambda})$ , it can be inverted analytically around $R_{\Lambda}$ by an analytic function $h_{\Lambda}$ and hence

[TABLE]

As $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)-R_{\Lambda}=\frac{1+o(1)}{1-\Lambda^{\prime}(R_{\Lambda})}(z-\rho)$ , it follows from the Taylor expansion of $h_{\Lambda}$ up to exponent $\lceil\delta\rceil$ that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has a singularity of exponent exactly $\delta$ in $\rho$ . In particular $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is singular at $\rho$ , and hence $\rho=R_{\scriptscriptstyle{\mathrm{not}}{\oplus}}$ .

We now prove that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has no singularity $\zeta$ with $|\zeta|\leq\rho$ , except $\zeta=\rho$ . By a classical compactness argument (see e.g. [26, end of proof of Theorem 2.19]), this implies that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is analytic on a $\Delta$ -domain at $\rho$ .

Take such a singularity. Since $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has nonnegative coefficients, the triangle inequality gives $|T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\zeta)|\leq T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\rho)$ and since $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ is aperiodic, from Lemma A.1 we have a strict inequality unless $\zeta=\rho$ . Therefore, if $|\zeta|\leq\rho$ and $\zeta\neq\rho$ , we have $|\Lambda^{\prime}(T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(\zeta))|<\Lambda^{\prime}(R_{\Lambda})<1$ and we can apply Lemma A.8 as in the second paragraph of this proof to argue that $\zeta$ cannot be a singularity. ∎
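To give a feeling for Picard's method of successive approximants on the equation $w=z+\Lambda(w)$, here is a toy numerical sketch. It is not the scheme of the proof: it starts from $\phi_{0}=0$ rather than $R_{\Lambda}$, iterates the unmodified map $w\mapsto z+\Lambda(w)$, and uses the illustrative choice $\Lambda(w)=w^{2}$, which is an assumption made only so that the fixed point has the closed form $\tfrac{1}{2}(1-\sqrt{1-4z})$.

```python
import cmath

def picard_solve(z, Lambda, iterations=200):
    """Iterate w <- z + Lambda(w) starting from w = 0 and return the last iterate."""
    w = 0.0
    for _ in range(iterations):
        w = z + Lambda(w)
    return w

def toy_lambda(w):
    """Illustrative stand-in for Lambda; not a series coming from the paper."""
    return w * w

def exact_toy_solution(z):
    """Closed-form solution of w = z + w^2 that vanishes at z = 0."""
    return (1 - cmath.sqrt(1 - 4 * z)) / 2

# Inside the disk |z| < 1/4 the map is a contraction near the fixed point.
real_error = abs(picard_solve(0.2, toy_lambda) - exact_toy_solution(0.2))
complex_error = abs(picard_solve(0.1 + 0.1j, toy_lambda) - exact_toy_solution(0.1 + 0.1j))
```

In this toy case the iterates converge geometrically because $|\Lambda^{\prime}|<1$ near the fixed point. The difficulty handled in the proof is that the analogous control must be obtained on a $\Delta$-neighborhood of the singularity, where $\Lambda$ is not analytic.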

A.7. Proof of Lemma 7.3

As in the proof of Lemma 6.3, the existence of $\rho$ and the fact that the convergence radius of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ is at least $\rho$ are straightforward. The key point is to prove that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}$ has an analytic continuation to a $\Delta$ -neighborhood of $\rho$ .

By assumption, $\Lambda$ is analytic on a $\Delta$ -neighborhood $D_{\Lambda}=\Delta(\varphi_{\Lambda},r_{\Lambda},R_{\Lambda})$ of $R_{\Lambda}$ , and the following approximation holds:

[TABLE]

where as before, $\delta_{*}=\min(\delta,2)$ ; $C^{\prime}_{\Lambda}$ is $C_{\Lambda}$ or $\tfrac{1}{2}\Lambda^{\prime\prime}(R_{\Lambda})$ depending on whether $\delta$ is smaller or bigger than $2$ ; and $\varepsilon(w)$ is an analytic function on $D_{\Lambda}$ tending to $0$ in $R_{\Lambda}$ .

Fix $z$ in a $\Delta$ -neighborhood $D_{T}$ of $\rho$ , whose parameters $r_{T}$ and $\varphi_{T}$ will be made precise later. The equation $w=z+\Lambda(w)$ then rewrites as

[TABLE]

or, as a fixed point equation $w=G(z,w)$ for

[TABLE]

We again use Picard’s method of successive approximants to find an analytic solution $w(z)$ for (69), which will be the analytic continuation of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ that we are looking for. For $z\in D_{T}$ , set $\phi_{0}(z)=R_{\Lambda}$ and, whenever $\phi_{i}(z)$ lies in $D_{\Lambda}\cup\{R_{\Lambda}\}$ , set $\phi_{i+1}(z)=G(z,\phi_{i}(z))$ . In particular,

[TABLE]

Since $1/\delta_{*}<1$ , we have $\operatorname{Arg}(R_{\Lambda}-\phi_{1}(z))=\tfrac{1}{\delta_{*}}\operatorname{Arg}(\rho-z)$ . We choose the parameters defining the $\Delta$ -neighborhood $D_{T}$ of $\rho$ to be $\varphi_{T}=\varphi_{\Lambda}$ and $r_{T}=C^{\prime}_{\Lambda}\,(\tfrac{r_{\Lambda}}{2})^{\delta_{*}}$ . In this way, if $z$ is in $D_{T}$ , then $\phi_{1}(z)$ lies in $\widetilde{D_{\Lambda}}=\Delta(\widetilde{\varphi_{\Lambda}},\tfrac{r_{\Lambda}}{2},R_{\Lambda})$ , for some $\widetilde{\varphi_{\Lambda}}<\varphi_{\Lambda}$ .

We define an intermediate $\Delta$ -neighborhood $D^{\prime}_{\Lambda}=\Delta(\frac{\varphi_{\Lambda}+\widetilde{\varphi_{\Lambda}}}{2},\tfrac{3r_{\Lambda}}{4},R_{\Lambda})$ . This ensures that we have a constant $0<r_{0}<1$ , depending only on $\varphi_{\Lambda}$ and $\widetilde{\varphi_{\Lambda}}$ , such that the circle $\gamma_{w}$ of center $w$ and radius $r_{0}\,|R_{\Lambda}-w|$ is contained in $D_{\Lambda}$ for every $w\in D^{\prime}_{\Lambda}$ and in $D^{\prime}_{\Lambda}$ for every $w\in\widetilde{D_{\Lambda}}$ (cf. Fig. 16).

Consider the partial derivative

[TABLE]

We take $w$ in the domain $D^{\prime}_{\Lambda}$ . The quantity $\varepsilon^{\prime}(w)$ can now be evaluated through a contour integral on $\gamma_{w}\subset D_{\Lambda}$ :

[TABLE]

This yields the inequality

[TABLE]

Plugging this back in Eq. 70, we get, for $w$ in $D^{\prime}_{\Lambda}$

[TABLE]

We now look for a domain where we have enough control on $|\tfrac{\partial G}{\partial w}(z,w)|$ to guarantee the stability of the iterates. A subtlety here is that this control is impossible near $\phi_{0}(z)=R_{\Lambda}$ . So we need to consider a domain around $\phi_{1}(z)$ , which therefore depends on $z$ . For every $z\in D_{T}$ , we have $\phi_{1}(z)\in\widetilde{D_{\Lambda}}$ , so the disk

[TABLE]

is included in $D^{\prime}_{\Lambda}$ . For $w$ in $\Gamma_{z}$ , we have

[TABLE]

which implies after plugging back into Eq. 71

[TABLE]

By possibly reducing the radius $r_{\Lambda}$ of $D_{\Lambda}$ , we can make $\sup_{u\in D_{\Lambda}}|\varepsilon(u)|$ as small as wanted: for any $w$ in $\Gamma_{z}$ ,

[TABLE]

Similarly,

[TABLE]

can be made smaller than $\tfrac{1}{r_{0}+1}|\phi_{1}(z)-R_{\Lambda}|$ by reducing $r_{\Lambda}$ . In particular, $\phi_{2}(z)$ is in $\Gamma_{z}$ .

For $m\geq 2$ , assume that $\phi_{1}(z),\cdots,\phi_{m}(z)$ lie in $\Gamma_{z}$ . Then for each $i\leq m$ , using (72),

[TABLE]

Since $\phi_{m}(z)$ lies in $\Gamma_{z}\subset D_{\Lambda}$ , the next term $\phi_{m+1}(z)$ is defined and

[TABLE]

In particular, $\phi_{m+1}(z)$ also lies in $\Gamma_{z}$ and an immediate induction shows that this is indeed the case for all $m\geq 1$ .

By (73), the series $\sum_{i\geq 0}(\phi_{i+1}(z)-\phi_{i}(z))$ is uniformly bounded by a geometric series and converges towards an analytic function $\phi$ on $D_{T}$ . The limit $\phi(z)$ is a solution of $\phi(z)=z+\Lambda(\phi(z))$ and is the analytic continuation of $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ that we were looking for.

A small modification of the above argument shows that, when $r_{T}$ , or equivalently $r_{\Lambda}$ , tends to $0$ , the quotient

[TABLE]

also tends to $0$ . This proves that

[TABLE]

The proof that $T_{{\scriptscriptstyle{\mathrm{not}}{\oplus}}}(z)$ has no other singularities than $\rho$ on the circle of convergence is similar to that of Lemma 6.3. ∎

Appendix B On the simulations given in the introduction

In this appendix, we explain how the simulations in Figs. 1 and 4 have been obtained.

B.1. Biased Brownian permuton

Fix $p$ in $(0,1)$ and consider a uniform binary planar tree $\bm{b}_{n}^{(p)}$ with $n$ leaves, where each internal node is labeled $\oplus$ (resp. $\ominus$ ) with probability $p$ (resp. $1-p$ ), independently from each other. As mentioned in Section 5.1, $\bm{\tau}_{n}^{(p)}:=\operatorname{perm}(\bm{b}_{n}^{(p)})$ forms a consistent family of random permutations. Therefore, from Proposition 2.9, we have the following lemma (using the notation $\operatorname{Perm}(.,.)$ defined in Section 2.1).

Lemma B.1.

There exists a random permuton $\bm{\mu}^{(p)}$ , whose induced subpermutations are the $\bm{\tau}_{n}^{(p)}$ ( i.e. for all $n$ , $\bm{\tau}_{n}^{(p)}\stackrel{{\scriptstyle d}}{{=}}\operatorname{Perm}({\vec{\mathbf{m}}_{n}},\bm{\mu}^{(p)})$ ) and we have the convergence in distribution

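$$\mu_{\bm{\tau}_{n}^{(p)}}\stackrel{{\scriptstyle d}}{{\longrightarrow}}\bm{\mu}^{(p)}\quad\text{as }n\to\infty.$$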

By definition, $\bm{\mu}^{(p)}$ is the biased Brownian separable permuton with parameter $p$ (indeed $\bm{\tau}_{k}^{(p)}\stackrel{{\scriptstyle d}}{{=}}\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\mu}^{(p)})$ implies Eq. 20). This lemma is not needed to prove the results of this paper, but was used in the simulations. The three pictures in Fig. 1 are obtained by drawing the diagram of a random permutation distributed as $\bm{\tau}_{n}^{(p)}$ , for $p=0.2$ and $n=11\,629$ , for $p=0.45$ and $n=12\,666$ , and for $p=0.5$ and $n=17\,705$ , respectively.
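The construction of $\bm{\tau}_{n}^{(p)}$ can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the code used for Fig. 1: the function name `sample_biased_separable` is hypothetical, the uniform binary plane tree is drawn by recursive Catalan-weighted splitting (one standard option among several), and $\oplus$ is read with the usual convention that the left block receives the smaller values.

```python
import random
from math import comb


def catalan(m):
    """m-th Catalan number; binary plane trees with m + 1 leaves are counted by it."""
    return comb(2 * m, m) // (m + 1)


def sample_biased_separable(n, p, rng):
    """Permutation of 1..n read from a uniform binary plane tree with n leaves,
    whose internal nodes are independently labeled ⊕ (prob. p) or ⊖ (prob. 1 - p)."""

    def build(size, offset):
        # Returns the values (in position order) of a block of `size` points
        # occupying the value range offset+1 .. offset+size.
        if size == 1:
            return [offset + 1]
        # The left subtree has k leaves with probability
        # C_{k-1} * C_{size-k-1} / C_{size-1}, which makes the tree uniform.
        ks = range(1, size)
        weights = [catalan(k - 1) * catalan(size - k - 1) for k in ks]
        k = rng.choices(ks, weights=weights)[0]
        if rng.random() < p:
            # ⊕ node: left block takes the smaller values.
            return build(k, offset) + build(size - k, offset + k)
        # ⊖ node: left block takes the larger values.
        return build(k, offset + size - k) + build(size - k, offset)

    return build(n, 0)


tau = sample_biased_separable(30, 0.45, random.Random(0))
print(tau)
```

This recursive version is quadratic in $n$ and converts Catalan numbers to floating-point weights, so it is only meant for small $n$ ; for sizes of the order of $10^{4}$ a linear-time tree sampler such as Rémy's algorithm would be a more natural choice.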

B.2. Stable permutons

Fix $\delta\in(1,2)$ . For every $k$ the following probability distribution on unlabeled plane trees with $k$ leaves was introduced in [28, Thm 3.3.3] and is the distribution of the induced subtree with $k$ leaves in the $\delta$ -stable tree:

[TABLE]

Now if we fix the distribution of a random permuton $\bm{\nu}$ , for every $n\geq 1$ , we build a random substitution tree $\bm{t}^{(\delta,\bm{\nu})}_{n}$ as follows: the tree is chosen according to $\rho_{\delta,n}$ , and conditional on that choice, all internal nodes $v$ are independently labeled by a permutation distributed like $\operatorname{Perm}(\vec{\mathbf{m}}_{d_{v}},\bm{\nu})$ (the notation $\operatorname{Perm}(.,.)$ is defined in Section 2.1).
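To make the node labels concrete, here is a minimal sketch of how a permutation distributed like $\operatorname{Perm}(\vec{\mathbf{m}}_{k},\bm{\nu})$ can be drawn when $\bm{\nu}$ is deterministic: sample $k$ i.i.d. points from the permuton, sort them by first coordinate and record the relative order of the second coordinates. The names `induced_permutation` and `uniform_square` are hypothetical, and the permuton is assumed to be given as a function returning one random point.

```python
import random


def induced_permutation(k, sample_point, rng):
    """Permutation induced by k i.i.d. points drawn with sample_point(rng)."""
    points = [sample_point(rng) for _ in range(k)]
    points.sort(key=lambda xy: xy[0])          # read the points from left to right
    ys = [y for _, y in points]
    ranks = [0] * k
    # The point with the r-th smallest y-coordinate receives the value r.
    for r, i in enumerate(sorted(range(k), key=lambda i: ys[i]), start=1):
        ranks[i] = r
    return ranks


def uniform_square(rng):
    """One point of the uniform measure on [0, 1]^2."""
    return (rng.random(), rng.random())


sigma = induced_permutation(8, uniform_square, random.Random(0))
print(sigma)
```

For the uniform measure on $[0,1]^{2}$ the output is a uniform random permutation of size $k$ , since the two coordinates are independent.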

Now we can define the permutations $\bm{\tau}^{(\delta,\bm{\nu})}_{n}=\operatorname{perm}(\bm{t}^{(\delta,\bm{\nu})}_{n})$ . This family of permutations is consistent: we omit the proof of this fact, which follows from the consistency of the family $(\operatorname{Perm}({\vec{\mathbf{m}}_{k}},\bm{\nu}))_{k}$ (Proposition 2.9) and from Marchal’s algorithm [47] to generate trees of distribution $\nu_{\delta,k}$ . We deduce the following lemma.

Lemma B.2.

For every $\delta\in(1,2)$ and random permuton $\bm{\nu}$ , there exists a random permuton $\bm{\mu}^{(\delta,\bm{\nu})}$ , whose induced subpermutations are the $\bm{\tau}_{n}^{(\delta,\bm{\nu})}$ ( i.e. for all $n$ , $\bm{\tau}_{n}^{(\delta,\bm{\nu})}\stackrel{{\scriptstyle d}}{{=}}\operatorname{Perm}({\vec{\mathbf{m}}_{n}},\bm{\mu}^{(\delta,\bm{\nu})})$ ) and we have the convergence in distribution

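$$\mu_{\bm{\tau}_{n}^{(\delta,\bm{\nu})}}\stackrel{{\scriptstyle d}}{{\longrightarrow}}\bm{\mu}^{(\delta,\bm{\nu})}\quad\text{as }n\to\infty.$$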

We call $\bm{\mu}^{(\delta,\bm{\nu})}$ the $\delta$ -stable permuton driven by $\bm{\nu}$ .

Once again, this lemma is used in our simulations. The pictures in Fig. 4 are the rescaled diagrams of realizations of $\bm{\tau}_{n}^{(\delta,\bm{\nu})}$ for $n=20\,000$ and $\delta\in\{1.1,1.5\}$ , where we have taken $\bm{\nu}$ to be the (nonrandom) uniform measure on $[0,1]^{2}$ .
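The map $\operatorname{perm}$ from substitution trees to permutations can be sketched as follows. This is an illustration only: the tree encoding (a leaf is `None`, an internal node is a pair of a permutation and a list of children), the helper names, and the fixed tree shape are all assumptions. In particular the shape is hard-coded rather than drawn from the stable-tree distribution above. The labels are uniform random permutations, which corresponds to taking $\bm{\nu}$ uniform on $[0,1]^{2}$ .

```python
import random

LEAF = None  # encoding choice for this sketch


def inflate(theta, blocks):
    """Substitution theta[blocks[0], ..., blocks[d-1]] as a list of values."""
    d = len(theta)
    offset = [0] * d
    acc = 0
    # Blocks are stacked vertically in the order given by the values of theta.
    for i in sorted(range(d), key=lambda j: theta[j]):
        offset[i] = acc
        acc += len(blocks[i])
    return [offset[i] + v for i in range(d) for v in blocks[i]]


def perm_of_tree(tree):
    """Permutation encoded by a substitution tree."""
    if tree is LEAF:
        return [1]
    theta, children = tree
    return inflate(theta, [perm_of_tree(c) for c in children])


def label_uniformly(shape, rng):
    """Label each internal node with d children by a uniform permutation of size d."""
    if shape is LEAF:
        return LEAF
    theta = list(range(1, len(shape) + 1))
    rng.shuffle(theta)
    return (theta, [label_uniformly(c, rng) for c in shape])


# Hypothetical tree shape with 5 leaves (nested lists of children).
shape = [LEAF, [LEAF, LEAF, LEAF], LEAF]
tau = perm_of_tree(label_uniformly(shape, random.Random(0)))
print(tau)
```

For instance, `inflate([2, 1], [[1, 2], [1]])` returns `[2, 3, 1]`, that is $21[12,1]=231$ .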

B.3. Simulations of permutations in classes

The uniform random permutations in substitution-closed classes shown in Fig. 3 have been obtained through a Boltzmann sampler. Obtaining such a sampler is routine from the equations on the generating series given in Eq. 12 [27]. The only input that we need is a Boltzmann sampler for the set $\mathcal{S}$ of simple permutations in the class. In the case of a finite set $\mathcal{S}$ , this is trivial. For the set of simple permutations in $\mathrm{Av}(321)$ , we built a Boltzmann sampler for the whole class $\mathrm{Av}(321)$ , which has an easy recursive structure, and we ran it until the output was simple (this happens with probability greater than $1/4$ when the parameter of the Boltzmann sampler is close to the radius of convergence of $\mathcal{C}=\langle\mathrm{Av}(321)\rangle$ ). Code is available on request.
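The rejection step described above (run a sampler until its output is simple) can be sketched as follows. This is not the authors' code: the function names are hypothetical, the simplicity test is a straightforward quadratic check, permutations of size less than 4 are treated as non-simple by convention, and a uniform permutation of size 6 stands in for the Boltzmann sampler of $\mathrm{Av}(321)$ , which is not reproduced here.

```python
import random


def is_simple(perm):
    """True if no window of consecutive positions of length 2..n-1
    is mapped onto a set of consecutive values."""
    n = len(perm)
    if n < 4:
        return False  # convention of this sketch: 1, 12, 21 and size 3 are excluded
    for i in range(n):
        lo = hi = perm[i]
        for j in range(i + 1, n):
            lo = min(lo, perm[j])
            hi = max(hi, perm[j])
            # Window i..j has j - i + 1 entries; it is an interval
            # exactly when its values span j - i.
            if j - i + 1 < n and hi - lo == j - i:
                return False
    return True


def sample_until_simple(sampler, rng, max_tries=10_000):
    """Call sampler(rng) repeatedly and return the first simple output."""
    for _ in range(max_tries):
        perm = sampler(rng)
        if is_simple(perm):
            return perm
    raise RuntimeError("no simple permutation found within max_tries")


def stand_in_sampler(rng):
    """Stand-in for a class sampler: a uniform permutation of size 6."""
    perm = list(range(1, 7))
    rng.shuffle(perm)
    return perm


sigma = sample_until_simple(stand_in_sampler, random.Random(0))
print(sigma)
```

Rejection preserves the Boltzmann distribution restricted to simple permutations, and the expected number of attempts is the inverse of the acceptance probability, hence less than 4 in the regime quoted above.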

Acknowledgements

The authors are grateful to the referees for suggestions which improved the presentation of the paper.

MB and VF are partially supported by the Swiss National Science Foundation, under grants number 200021-172536 and 200020-172515. LG’s research is supported by ANR grants GRAAL (ANR-14-CE25-0014) and PPPP (ANR-16-CE40-0016).

Bibliography

[1] M. H. Albert, M. D. Atkinson. Simple permutations and pattern restricted permutations. Discrete Mathematics, 300(1):1–15, 2005.
[2] M. H. Albert, M. D. Atkinson, R. Brignall. The enumeration of three pattern classes using monotone grid classes. Electronic Journal of Combinatorics, 19(3), #P20, 2012.
[3] M. H. Albert, M. D. Atkinson, R. Brignall. The enumeration of permutations avoiding 2143 and 4231. Pure Mathematics and Applications, 22(2):87–98, 2011.
[4] M. H. Albert, M. D. Atkinson, M. Klazar. The enumeration of simple permutations. Journal of Integer Sequences, 6, Article 03.4.4, 2003.
[5] M. H. Albert, M. D. Atkinson, V. Vatter. Inflations of geometric grid classes: three case studies. Australasian Journal of Combinatorics, 58:27–47, 2014.
[6] M. H. Albert, M. D. Atkinson, V. Vatter. Counting 1324, 4231-Avoiding Permutations. Electronic Journal of Combinatorics, 16(1), #R136, 2009.
[7] M. H. Albert, R. Brignall. Enumerating indices of Schubert varieties defined by inclusions. Journal of Combinatorial Theory, Series A, 123(1):154–168, 2014.
[8] M. H. Albert, V. Vatter. Generating and Enumerating 321-Avoiding and Skew-Merged Simple Permutations. Electronic Journal of Combinatorics, 20(2), #P44, 2013.

