On the gap between deterministic and probabilistic joint spectral radii for discrete-time linear systems

Yacine Chitour; Guilherme Mazanti; Mario Sigalotti

arXiv:1812.08399·math.OC·November 17, 2021

TL;DR

This paper explores the relationship between deterministic and probabilistic measures of asymptotic behavior in discrete-time linear switched systems, aiming to characterize when these measures are equal.

Contribution

It provides a characterization of the conditions under which the deterministic joint spectral radius equals certain probabilistic spectral radii.

Findings

1. Identifies conditions for equality between deterministic and probabilistic spectral radii.

2. Analyzes the sets of matrices where such equalities hold.

3. Provides insights into the structure of systems with matching spectral radii.

Equations

$$\Sigma(\mathcal{A}):\quad x_{k+1}=A_{\sigma(k)}x_{k},\qquad\sigma\in\mathfrak{S},\ k\in\mathbb{N},$$

$$\rho(\sigma)=\limsup_{n\to\infty}\left\lVert A_{\sigma(n)}\dotsm A_{\sigma(1)}\right\rVert^{1/n}.$$

$$\rho_{\mathrm{p}}(\mu,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A}).$$

$$\rho_{\mathrm{d}}(\mathcal{A})=\limsup_{n\to\infty}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}.$$

$$\rho_{\mathrm{d}}(\mathcal{A})=\lim_{n\to\infty}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}=\inf_{n\in\mathbb{N}}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}.$$

$$\rho\left(A_{i_{n}}\dotsm A_{i_{1}}\right)^{1/n}\leq\rho_{\mathrm{d}}(\mathcal{A}),$$

$$P=\begin{pmatrix}P_{1}&0&\cdots&0&0\\ 0&P_{2}&\ddots&\vdots&\vdots\\ \vdots&\ddots&\ddots&0&0\\ 0&\cdots&0&P_{R}&0\\ *&\cdots&*&*&Q\end{pmatrix},$$

$$\nu=\sum_{i=1}^{R}\alpha_{i}\nu^{[i]},$$

$$\mathcal{I}_{i}=\left\llbracket 1+\sum_{j=1}^{i-1}n_{j},\sum_{j=1}^{i}n_{j}\right\rrbracket,$$

$$C(P,s)=\{A(i_{1},\dotsc,i_{k})\mid(i_{1},\dotsc,i_{k})\text{ is a }P\text{-cycle and }i_{1}=s\}.$$

$$C(P)=\bigcup_{s\in\llbracket 1,N\rrbracket}C(P,s).$$

$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\limsup_{n\to\infty}\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right],$$

$$\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right]=\sum_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\nu_{i_{1}}p_{i_{1}i_{2}}\dotsm p_{i_{n-1}i_{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}.$$

$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\lim_{n\to\infty}\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right]=\inf_{n\in\mathbb{N}}\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right].$$

$$\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})=\limsup_{n\to\infty}\frac{1}{n}\mathbb{E}_{(\nu,P)}\left[\log\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert\right].$$

$$\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})=\lim_{n\to\infty}\frac{1}{n}\log\left\lVert A_{\sigma(n)}\dotsm A_{\sigma(1)}\right\rVert\qquad\text{for $\mathbb{P}_{(\nu,P)}$-almost every $\sigma\in\mathfrak{S}$},$$

$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\lim_{n\to\infty}\left\lVert A_{\sigma(n)}\dotsm A_{\sigma(1)}\right\rVert^{1/n}\qquad\text{for $\mathbb{P}_{(\nu,P)}$-almost every $\sigma\in\mathfrak{S}$}.$$

$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})\leq\sup_{\nu'}\rho_{\mathrm{p}}(\nu',P,\mathcal{A})\leq\sup_{(\nu',P')}\rho_{\mathrm{p}}(\nu',P',\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A}),$$

$$\rho_{\mathrm{p}}(P,\mathcal{A})=\sup_{\nu'}\rho_{\mathrm{p}}(\nu',P,\mathcal{A}),\qquad\rho_{\mathrm{p}}(\mathcal{A})=\sup_{(\nu',P')}\rho_{\mathrm{p}}(\nu',P',\mathcal{A}).$$

$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\sum_{j=1}^{R}\alpha_{j}\rho_{\mathrm{p}}(\nu^{[j]},P,\mathcal{A}).$$

$$\limsup_{n\to\infty}\rho_{\mathrm{d}}(\mathcal{A})^{-n}\rho\left(A_{\sigma(n)}\dotsm A_{\sigma(1)}\right)=1.$$

$$\rho_{\mathrm{d}}(\mathcal{A})^{-n}\rho\left(A_{\sigma(n)}\dotsm A_{\sigma(1)}\right)=\rho_{\mathrm{d}}(\mathcal{A})^{-j}\rho\left(A_{i_{j}}\dotsm A_{i_{1}}M^{\ell}\right).$$

$$\rho\left(A_{i_{r}}\dotsm A_{i_{1}}\right)=\rho_{\mathrm{d}}(\mathcal{A})^{r}.$$

$$\rho_{\mathrm{d}}(\mathcal{A})^{r}\leq\left\lVert A_{i_{r}}\dotsm A_{i_{1}}\right\rVert_{\mathrm{B}}\leq\left\lVert A_{i_{r}}\dotsm A_{i_{k+1}}\right\rVert_{\mathrm{B}}\left\lVert A_{i_{k}}\dotsm A_{i_{1}}\right\rVert_{\mathrm{B}}.$$

$$\left\lVert A_{i_{r}}\dotsm A_{i_{k+1}}\right\rVert_{\mathrm{B}}\left\lVert A_{i_{k}}\dotsm A_{i_{1}}\right\rVert_{\mathrm{B}}\leq\rho_{\mathrm{d}}(\mathcal{A})^{r-k}\rho_{\mathrm{d}}(\mathcal{A})^{k}=\rho_{\mathrm{d}}(\mathcal{A})^{r}.$$

$$A_{j}=\begin{pmatrix}A_{j}^{(1)}&*&*&\cdots&*\\ 0&A_{j}^{(2)}&*&\cdots&*\\ 0&0&A_{j}^{(3)}&\cdots&*\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&\cdots&A_{j}^{(R)}\end{pmatrix},\qquad j\in\llbracket 1,N\rrbracket,$$

$$\rho_{\mathrm{d}}(\mathcal{A})\geq\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\max_{r\in\llbracket 1,R\rrbracket}\rho\left(A_{i_{k}}^{(r)}\dotsm A_{i_{1}}^{(r)}\right)^{1/k},$$

$$A(j^{R})A(i^{R})^{n}\dotsm A(j^{2})A(i^{2})^{n}A(j^{1})A(i^{1})^{n}\in C(P).$$

$$\rho\left(A^{(r_{n})}(j^{R})A^{(r_{n})}(i^{R})^{n}\dotsm A^{(r_{n})}(j^{2})A^{(r_{n})}(i^{2})^{n}A^{(r_{n})}(j^{1})A^{(r_{n})}(i^{1})^{n}\right)=\rho\left(A(j^{R})A(i^{R})^{n}\dotsm A(j^{2})A(i^{2})^{n}A(j^{1})A(i^{1})^{n}\right)=1.$$

Full text

On the gap between deterministic and probabilistic joint spectral radii for discrete-time linear systems

This research was partially supported by the iCODE Institute, research project of the IDEX Paris-Saclay. The second author was partially supported by the public grant number ANR-10-CAMP-0151-02-FMJH as part of the “Programme des Investissements d’Avenir”.

Yacine Chitour, Guilherme Mazanti, Mario Sigalotti

Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des signaux et systèmes, 91190, Gif-sur-Yvette, France. Inria, France. Sorbonne Université, Université de Paris, CNRS, Inria, Laboratoire Jacques-Louis Lions, 75005 Paris, France.

Abstract

Given a discrete-time linear switched system $\Sigma(\mathcal{A})$ associated with a finite set $\mathcal{A}$ of matrices, we consider the measures of its asymptotic behavior given by, on the one hand, its deterministic joint spectral radius $\rho_{\mathrm{d}}(\mathcal{A})$ and, on the other hand, its probabilistic joint spectral radii $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ for Markov random switching signals with transition matrix $P$ and a corresponding invariant probability $\nu$ . Note that $\rho_{\mathrm{d}}(\mathcal{A})$ is larger than or equal to $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ for every pair $(\nu,P)$ . In this paper, we investigate the cases of equality of $\rho_{\mathrm{d}}(\mathcal{A})$ with either a single $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ or with the supremum of $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ over $(\nu,P)$ and we aim at characterizing the sets $\mathcal{A}$ for which such equalities may occur.

Keywords. Linear switched systems, discrete time, joint spectral radius, Markov process.

2020 Mathematics Subject Classification. 93C30, 93C55, 37H15.

This paper was first published in Linear Algebra and its Applications, 613:24–45, 2021. With respect to the published version, this version provides an additional remark (Remark 2.10) and a more precise proof of Lemma 3.2. All modifications with respect to the published version are given in blue. The authors are very grateful to Matteo Della Rossa for pointing out the imprecisions in the previous version of the proof of Lemma 3.2.

1 Introduction

In this paper, we consider discrete-time switched linear systems of the form

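$$\Sigma(\mathcal{A}):\quad x_{k+1}=A_{\sigma(k)}x_{k},\qquad\sigma\in\mathfrak{S},\ k\in\mathbb{N}, \tag{1.1}$$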

where $d$ and $N$ are positive integers, $x_{k}\in\mathbb{R}^{d}$, $\mathfrak{S}$ is the set of all maps $\sigma:\mathbb{N}\to\{1,\dotsc,N\}$, and $\mathcal{A}=(A_{1},\dotsc,A_{N})$ is an $N$-tuple of $d\times d$ matrices with real coefficients.

Switched systems model the behavior of a continuous variable $x$ whose dynamics may change over time according to the value of a discrete variable $\sigma$. These models are useful for several applications, ranging from air traffic control, electronic circuits, and automotive engines to chemical processes and population models in biology. This wide field of applications, together with the interesting mathematical questions arising from their analysis, justifies the extensive literature on switched systems, which have been studied from the point of view of both deterministic and random switching [28, 22, 29, 23, 7, 6]. A commonly used point of view on the switching signal $\sigma$, which we adopt in this paper, is to consider it as an uncertainty or perturbation acting on the system, the goal being thus to provide properties of the system independent of a particular choice of $\sigma$.

We are interested in describing the asymptotic behavior of $\Sigma(\mathcal{A})$ . For a given $\sigma\in\mathfrak{S}$ , the asymptotic behavior of the corresponding non-autonomous linear system is measured by the quantity $\rho(\sigma)$ defined by

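$$\rho(\sigma)=\limsup_{n\to\infty}\left\lVert A_{\sigma(n)}\dotsm A_{\sigma(1)}\right\rVert^{1/n}. \tag{1.2}$$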

Indeed, $\rho(\sigma)<1$ if and only if all trajectories of the non-autonomous system $x_{k+1}=A_{\sigma(k)}x_{k}$ converge exponentially to the origin.
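The quantity $\rho(\sigma)$ can be approximated numerically by truncating the limsup in (1.2) at a finite $n$. The following Python lines are a minimal sketch under stated assumptions: the helper names (`matmul`, `norm_inf`, `growth_rate`), the choice of the induced $\infty$-norm, and the two example matrices are illustrative and are not taken from the paper.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    d = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def norm_inf(M):
    """Induced infinity-norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

def growth_rate(matrices, sigma, n):
    """Return ||A_{sigma(n)} ... A_{sigma(1)}||^(1/n) for a 1-based signal sigma.

    The running product is renormalized at each step and the logarithms of
    the scaling factors are accumulated, which avoids overflow or underflow.
    """
    d = len(matrices[0])
    prod = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    log_norm = 0.0
    for k in range(1, n + 1):
        prod = matmul(matrices[sigma(k) - 1], prod)  # left-multiply by A_{sigma(k)}
        scale = norm_inf(prod)
        if scale == 0.0:
            return 0.0  # the product vanished
        prod = [[v / scale for v in row] for row in prod]
        log_norm += math.log(scale)
    return math.exp(log_norm / n)

# Hypothetical example: each matrix alone has spectral radius 0.5.
A = [[[0.5, 1.0], [0.0, 0.5]],
     [[0.5, 0.0], [1.0, 0.5]]]

constant = growth_rate(A, lambda k: 1, 2000)                   # sigma = 1, 1, 1, ...
alternating = growth_rate(A, lambda k: 1 if k % 2 else 2, 400)  # sigma = 1, 2, 1, 2, ...
```

For the constant signal the estimate approaches $0.5$, whereas for the alternating signal it approaches $(1+\sqrt{2})/2\approx 1.207$, the square root of the spectral radius of $A_{2}A_{1}$. In this example, switching between two individually stable matrices produces an unstable trajectory, which is why a condition independent of $\sigma$ is needed.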

In order to capture the asymptotic behavior of $\Sigma(\mathcal{A})$ , we must formulate some condition which is independent of the choice of $\sigma\in\mathfrak{S}$ . There exist two main approaches to proceed. The first one is deterministic and consists in considering the joint spectral radius $\rho_{\mathrm{d}}(\mathcal{A})$ of $\mathcal{A}$ , defined as the supremum of $\rho(\sigma)$ over all $\sigma\in\mathfrak{S}$ . Since its introduction in [26] and after the seminal paper [13], it has been extensively studied in the computer science and control theory communities (see, e.g., the monograph [19]).

The other approach to handle the asymptotic behavior of $\Sigma(\mathcal{A})$ is probabilistic and amounts to considering a probability measure $\mu$ on $\mathfrak{S}$ and hence $\sigma\mapsto\rho(\sigma)$ as a random variable. We may then consider as a probabilistic joint spectral radius the expected value of $\rho(\sigma)$ with respect to the probability law $\mu$, which we denote by $\rho_{\mathrm{p}}(\mu,\mathcal{A})$. There exists a vast literature devoted to the properties of products of random matrices, and we refer the reader to [1, 5, 8] for more details. A major result in this field has been obtained in [16] and provides general conditions on $\mu$ under which $\rho(\sigma)=\rho_{\mathrm{p}}(\mu,\mathcal{A})$ on a set of $\mu$-probability $1$.

The interest in considering $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\mu,\mathcal{A})$ comes from the stability analysis of (1.1). Indeed, $\rho_{\mathrm{d}}(\mathcal{A})<1$ if and only if (1.1) is uniformly exponentially stable [19], whereas, under the conditions of [16], $\rho_{\mathrm{p}}(\mu,\mathcal{A})<1$ if and only if $\mu$ -almost every trajectory of (1.1) converges exponentially to the origin.

In this paper, we aim at understanding the relations between the deterministic and the probabilistic approaches. The deterministic measure of stability $\rho_{\mathrm{d}}(\mathcal{A})$ characterizes the worst possible behavior over all $\sigma\in\mathfrak{S}$ , while the probabilistic counterpart $\rho_{\mathrm{p}}(\mu,\mathcal{A})$ provides the average behavior for $\sigma\in\mathfrak{S}$ corresponding to the probability measure $\mu$ . As a consequence, the deterministic approach provides a more conservative estimate of the asymptotic behavior of the system than the probabilistic one, in the sense that

$$\rho_{\mathrm{p}}(\mu,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A}). \tag{1.3}$$

A natural question is then to understand under which conditions on $\mathcal{A}$ and $\mu$ the inequality in (1.3) is strict. Furthermore, for practical and modeling purposes, it is important to understand whether, given a family of probability measures $\{\mu_{\ell}\}_{\ell\in\mathcal{I}}$ , the strict inequality $\sup_{\ell\in\mathcal{I}}\rho_{\mathrm{p}}(\mu_{\ell},\mathcal{A})<\rho_{\mathrm{d}}(\mathcal{A})$ holds true. Regarding the first question, it is known that there always exists a measure $\mu$ such that equality holds in (1.3) (see, for instance, [24], where such measures are referred to as maximizing measures). At such a level of generality, a handy characterization of maximizing measures cannot be expected. This is why we restrict our attention to the family $\mathfrak{M}$ of probability measures on $\mathfrak{S}$ obtained from discrete-time shift-invariant Markov chains and reformulate the previous two questions as follows: under which conditions on $\mathcal{A}$ do we have

(Q1) equality between $\rho_{\mathrm{p}}(\mu,\mathcal{A})$ and $\rho_{\mathrm{d}}(\mathcal{A})$ for a given $\mu\in\mathfrak{M}$?

(Q2) equality between $\sup_{\mu\in\mathfrak{M}}\rho_{\mathrm{p}}(\mu,\mathcal{A})$ and $\rho_{\mathrm{d}}(\mathcal{A})$?

Notice that the condition $\sup_{\mu\in\mathfrak{M}}\rho_{\mathrm{p}}(\mu,\mathcal{A})<1$ is related to the almost sure stability of the system uniformly with respect to the Markov process, a stability property first considered in [18] in the case of Markov chains with positive transition probabilities. Other stability notions have also been considered for (1.1), such as periodic stability, meaning stability for all periodic signals $\sigma\in\mathfrak{S}$ , or mean square stability. Several works explore relations between these different notions, see, e.g., [18, 14, 15, 12, 6, 9]. In particular, [12] establishes a probabilistic version of the finiteness conjecture, i.e., if (1.1) is periodically stable, then $\rho_{\mathrm{p}}(\mu,\mathcal{A})<1$ for every $\mu\in\mathfrak{M}$ .

Another interesting fact is that the quantities $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\mu,\mathcal{A})$ for $\mu\in\mathfrak{M}$ can be equivalently computed by replacing the norm $\lVert\cdot\rVert$ in (1.2) by the spectral radius. In the deterministic case, this result is known as the Berger–Wang formula or as the Joint Spectral Radius Theorem [19], and it has been extended to the Markovian setting in [20, 10].

In order to describe the main results of our paper, let us identify a measure $\mu\in\mathfrak{M}$ with the pair $(\nu,P)$, where $P$ is the transition matrix of the Markov chain corresponding to $\mu$ and $\nu$ is its (invariant) initial probability. In particular, we write $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ for $\rho_{\mathrm{p}}(\mu,\mathcal{A})$. Our main result concerning (Q1) (see Theorem 3.1) establishes that a necessary and sufficient condition for equality is that $\rho_{\mathrm{d}}(\mathcal{A})=\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}$ for every $(i_{1},\dotsc,i_{k})$ that corresponds to a cycle in the directed weighted graph determined by $P$ such that $\nu_{i_{1}}>0$. The necessity follows from results provided in [24], whereas, for sufficiency, we first consider the particular case where $\mathcal{A}$ is irreducible and $P$ is strongly connected (see Lemma 3.3). Irreducibility implies in particular the existence of a Barabanov norm for $\mathcal{A}$ (see Definition 2.1), which is an important tool in our proof. We then generalize the result to the case of reducible $\mathcal{A}$ (see Lemma 3.5) by a suitable block decomposition of the matrices in $\mathcal{A}$ and the fact that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ and $\rho_{\mathrm{d}}(\mathcal{A})$ can be read off the diagonal blocks of the decomposed matrix (cf. [19, 17]). Finally, the general case for $P$ can be obtained by using a classical block decomposition of stochastic matrices.

The equivalence established in Theorem 3.1 can be further characterized in terms of simultaneous similarity of the matrices $\rho_{\mathrm{d}}(\mathcal{A})^{-1}A_{i}$ , $i\in\{1,\dotsc,N\}$ , to orthogonal matrices, under some additional assumptions on $\mathcal{A}$ and $P$ (Proposition 3.9). The latter characterization is based on the description of matrix semigroups with constant spectral radius from [25].

Our next main result, Theorem 3.13, concerns (Q2) and states that equality is equivalent to the existence of a family of pairwise distinct indices $i_{1},\dotsc,i_{k}\in\{1,\dotsc,N\}$ such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho(A_{i_{1}}\dotsm A_{i_{k}})^{1/k}$. This corresponds to the case where the worst behavior of the system is attained by a periodic $\sigma$ with no repetition of indices on a period. This property is reminiscent of the finiteness property, except for the fact that, in the finiteness property, repetition of indices is allowed. We recall that the finiteness property is known to hold only for a proper subclass of $N$-tuples $\mathcal{A}$ [3, 4], contrary to what had been conjectured earlier [21]. By applying a standard lifting argument of Markov chains of higher order to Markov chains of order one, we generalize the equivalence stated in Theorem 3.13 by providing the following characterization of the finiteness property: an $N$-tuple $\mathcal{A}$ satisfies the finiteness property if and only if there exist $m\geq 1$ and a Markov chain of order $m$ whose corresponding probabilistic Lyapunov exponent is equal to $\rho_{\mathrm{d}}(\mathcal{A})$ (see Corollary 4.4). This, in turn, is equivalent to saying that the finiteness property holds if and only if the set of maximizing measures contains the measure induced by some Markov chain of arbitrary order.

Acknowledgements. The authors are indebted to D. Chafaï for helpful discussions. They are also grateful to the anonymous reviewers of a preceding version of the manuscript for providing helpful comments and pointing out relevant literature.

2 Definitions, notations, and basic facts

Throughout the paper, $d$ and $N$ belong to $\mathbb{N}$ , which is used to denote the set of positive integers. If $a$ and $b$ are positive integers, $\llbracket a,b\rrbracket$ denotes the set of integers $j$ such that $a\leq j\leq b$ . For $x\in\mathbb{R}$ , $\lceil x\rceil$ denotes the smallest integer greater than or equal to $x$ , and we extend this notation componentwise to vectors and matrices. We use $\lVert\cdot\rVert$ to denote a norm in $\mathbb{R}^{d}$ as well as the corresponding induced norm on the space $\mathcal{M}_{d}(\mathbb{R})$ of $d\times d$ matrices with real coefficients. We only consider in this paper norms in $\mathcal{M}_{d}(\mathbb{R})$ obtained as induced norms from $\mathbb{R}^{d}$ . An $N$ -tuple $\mathcal{A}=(A_{1},\dotsc,A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ is said to be irreducible if the only subspaces of $\mathbb{R}^{d}$ invariant under all the matrices $A_{1},\dotsc,A_{N}$ are $\{0\}$ and $\mathbb{R}^{d}$ .

2.1 Deterministic joint spectral radius

Let $\Sigma(\mathcal{A})$ be the discrete-time switched system defined in (1.1). The deterministic joint spectral radius $\rho_{\mathrm{d}}(\mathcal{A})$ of $\Sigma(\mathcal{A})$ , introduced in [26], is defined by

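$$\rho_{\mathrm{d}}(\mathcal{A})=\limsup_{n\to\infty}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}. \tag{2.1}$$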

Since all norms in $\mathcal{M}_{d}(\mathbb{R})$ induced by norms in $\mathbb{R}^{d}$ are submultiplicative and equivalent to each other, it immediately follows that $\rho_{\mathrm{d}}(\mathcal{A})$ does not depend on a specific choice of such a norm and that

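$$\rho_{\mathrm{d}}(\mathcal{A})=\lim_{n\to\infty}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}=\inf_{n\in\mathbb{N}}\max_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}.$$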

Notice that, for every $n\in\mathbb{N}$ and $(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}$ , we have

$$\rho\left(A_{i_{n}}\dotsm A_{i_{1}}\right)^{1/n}\leq\rho_{\mathrm{d}}(\mathcal{A}),$$

where we use the definition of $\rho_{\mathrm{d}}(\mathcal{A})$ and the fact that $\rho(M)=\rho(M^{k})^{1/k}\leq\lVert M^{k}\rVert^{1/k}$ for every square matrix $M$ and $k\in\mathbb{N}$ .
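The two facts above give computable two-sided bounds on $\rho_{\mathrm{d}}(\mathcal{A})$ for each fixed $n$: the largest $\rho(A_{i_{n}}\dotsm A_{i_{1}})^{1/n}$ is a lower bound and the largest $\lVert A_{i_{n}}\dotsm A_{i_{1}}\rVert^{1/n}$ is an upper bound. The Python sketch below enumerates all words of length $n$ by brute force. It is an illustration only: the function names, the restriction to $2\times 2$ matrices (so that the spectral radius has a closed form), the induced $\infty$-norm, and the example pair of matrices are assumptions not taken from the paper.

```python
import cmath
from itertools import product

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    d = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def norm_inf(M):
    """Induced infinity-norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

def spectral_radius_2x2(M):
    """Spectral radius of a 2x2 matrix from its trace and determinant."""
    t = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(t * t - 4 * det)
    return max(abs((t + root) / 2), abs((t - root) / 2))

def jsr_bounds(matrices, n):
    """Return (lower, upper) bounds on rho_d from all products of length n."""
    lower, upper = 0.0, 0.0
    for word in product(range(len(matrices)), repeat=n):
        M = [[1.0, 0.0], [0.0, 1.0]]
        for i in word:
            M = matmul(matrices[i], M)  # builds A_{i_n} ... A_{i_1}
        lower = max(lower, spectral_radius_2x2(M) ** (1.0 / n))
        upper = max(upper, norm_inf(M) ** (1.0 / n))
    return lower, upper

# Hypothetical example pair (not from the paper).
A = [[[1.0, 1.0], [0.0, 1.0]],
     [[1.0, 0.0], [1.0, 1.0]]]

low2, up2 = jsr_bounds(A, 2)
low4, up4 = jsr_bounds(A, 4)
```

For this pair, `jsr_bounds(A, 2)` brackets $\rho_{\mathrm{d}}(\mathcal{A})$ between $(1+\sqrt{5})/2\approx 1.618$ and $\sqrt{3}\approx 1.732$. The enumeration visits $N^{n}$ words, so it is only practical for small $n$.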

Definition 2.1 (Barabanov norm).

Let $\mathcal{A}=(A_{1},\dotsc,A_{N})$ be an $N$ -tuple of $d\times d$ matrices with real coefficients. A norm $\lVert\cdot\rVert_{\mathrm{B}}$ on $\mathbb{R}^{d}$ is said to be a Barabanov norm for $\mathcal{A}$ if the following two conditions hold.

(a) For every $i\in\llbracket 1,N\rrbracket$, $\lVert A_{i}\rVert_{\mathrm{B}}\leq\rho_{\mathrm{d}}(\mathcal{A})$.

(b) For every $x\in\mathbb{R}^{d}$ and $k\in\mathbb{N}$, there exists $\sigma\in\mathfrak{S}$ such that $\lVert A_{\sigma(k)}\dotsm A_{\sigma(1)}x\rVert_{\mathrm{B}}=\rho_{\mathrm{d}}(\mathcal{A})^{k}\lVert x\rVert_{\mathrm{B}}$.

The following basic result on Barabanov norms was proved in [2].

Proposition 2.2.

Let $\mathcal{A}$ be an $N$ -tuple of $d\times d$ matrices with real coefficients. If $\mathcal{A}$ is irreducible, then it admits a Barabanov norm.

2.2 Probabilistic joint spectral radius

We now provide a probabilistic counterpart to $\rho_{\mathrm{d}}(\mathcal{A})$ . For that purpose, we collect some basic notions concerning transition matrices of Markov chains.

Definition 2.3.

Let $P=(p_{ij})_{1\leq i,j\leq N}$ be an $N\times N$ matrix with nonnegative coefficients.

(a) $P$ is said to be stochastic if, for every $i\in\llbracket 1,N\rrbracket$, $\sum_{j=1}^{N}p_{ij}=1$.

(b) $P$ is said to be strongly connected if it is not similar via a permutation to an upper block triangular matrix.

(c) For $k\in\mathbb{N}$ and $i_{1},\dotsc,i_{k}\in\llbracket 1,N\rrbracket$, we say that $(i_{1},\dotsc,i_{k})$ is a $P$-word if $p_{i_{1}i_{2}}p_{i_{2}i_{3}}\dotsm p_{i_{k-1}i_{k}}>0$. The integer $k$ is called the length of the $P$-word $(i_{1},\dotsc,i_{k})$. We say that $(i_{1},\dotsc,i_{k})$ is a $P$-cycle if $p_{i_{1}i_{2}}p_{i_{2}i_{3}}\dotsm p_{i_{k-1}i_{k}}p_{i_{k}i_{1}}>0$. The index $i_{1}$ is called the starting index of the $P$-cycle $(i_{1},\dotsc,i_{k})$.

(d) Let $\nu$ be a vector in $\mathbb{R}^{N}$ with nonnegative coefficients. We say that $(i_{1},\dotsc,i_{k})$ is a $(\nu,P)$-word (respectively, $(\nu,P)$-cycle) if it is a $P$-word (respectively, $P$-cycle) and $\nu_{i_{1}}>0$.

(e) If $P$ is stochastic, a row vector $\nu=(\nu_{1},\dotsc,\nu_{N})\in\mathbb{R}^{N}$ is said to be an invariant probability for $P$ if $\nu_{i}\geq 0$ for every $i\in\llbracket 1,N\rrbracket$, $\sum_{i=1}^{N}\nu_{i}=1$, and $\nu=\nu P$.
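The notions in Definition 2.3 translate directly into simple checks. The Python sketch below is one possible rendering: the function names, the tolerance `tol`, and the $2\times 2$ example matrix are hypothetical choices, and words are represented as tuples of 1-based indices to match the notation $(i_{1},\dotsc,i_{k})$.

```python
def is_stochastic(P, tol=1e-12):
    """Nonnegative entries and every row summing to 1."""
    return all(all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
               for row in P)

def is_p_word(P, word):
    """True if p_{i1 i2} p_{i2 i3} ... p_{i(k-1) ik} > 0 (empty product for k = 1)."""
    return all(P[a - 1][b - 1] > 0 for a, b in zip(word, word[1:]))

def is_p_cycle(P, word):
    """A P-word whose last index can also transition back to the first one."""
    return is_p_word(P, word) and P[word[-1] - 1][word[0] - 1] > 0

def is_nu_p_cycle(nu, P, word):
    """A P-cycle whose starting index has positive nu-weight."""
    return is_p_cycle(P, word) and nu[word[0] - 1] > 0

def is_invariant_probability(nu, P, tol=1e-12):
    """Check nu >= 0, sum(nu) = 1, and nu = nu P for a row vector nu."""
    N = len(P)
    if any(v < 0 for v in nu) or abs(sum(nu) - 1.0) > tol:
        return False
    return all(abs(sum(nu[i] * P[i][j] for i in range(N)) - nu[j]) <= tol
               for j in range(N))

# Hypothetical example: state 2 always returns to state 1.
P = [[0.5, 0.5],
     [1.0, 0.0]]
nu = [2.0 / 3.0, 1.0 / 3.0]
```

With this $P$, the tuple `(1, 1, 2)` is both a $P$-word and a $P$-cycle, `(2, 2)` is not a $P$-word because $p_{22}=0$, and `nu` satisfies $\nu=\nu P$.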

Remark 2.4.

In the context of discrete-time Markov chains in a finite state space with $N$ states, the transition matrix is the stochastic matrix $P=(p_{ij})_{1\leq i,j\leq N}$ where $p_{ij}$ represents the probability to switch from the state $i$ to the state $j$. Notice that $P$ is strongly connected if and only if its associated oriented graph $G_{P}$ is strongly connected. In the stochastic processes literature, the strong connectedness of $P$ is more often referred to as irreducibility. We choose to stick with the former to avoid ambiguities with the homonymous notion for $N$-tuples of matrices. Notice also that the notions of strong connectedness, $P$-cycles, and $P$-words only depend on the adjacency matrix $\lceil P\rceil$ of the graph $G_{P}$, while $(\nu,P)$-cycles and $(\nu,P)$-words depend on $\lceil P\rceil$ and $\lceil\nu\rceil$.

Remark 2.5.

Recall that, by the Perron–Frobenius Theorem, a stochastic matrix $P$ always admits an invariant probability, which is unique and has positive entries if $P$ is strongly connected. In the latter case, the definitions of $P$ -word and $(\nu,P)$ -word coincide, as well as those of $P$ -cycle and $(\nu,P)$ -cycle.

We have the following classical decomposition result for stochastic matrices [27, §§1.2 and 4.2].

Proposition 2.6.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic matrix. Then, up to a permutation in the set of indices $\llbracket 1,N\rrbracket$ , $P$ is given by

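$$P=\begin{pmatrix}P_{1}&0&\cdots&0&0\\ 0&P_{2}&\ddots&\vdots&\vdots\\ \vdots&\ddots&\ddots&0&0\\ 0&\cdots&0&P_{R}&0\\ *&\cdots&*&*&Q\end{pmatrix}, \tag{2.3}$$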

where $\rho(Q)<1$ and, for $i\in\llbracket 1,R\rrbracket$ , $P_{i}\in\mathcal{M}_{n_{i}}(\mathbb{R})$ is a stochastic and strongly connected matrix.

Moreover, for $i\in\llbracket 1,R\rrbracket$ , let $\nu^{[i]}$ be the unique invariant probability for $P_{i}$ and denote by the same symbol its canonical extension as a vector in $\mathbb{R}^{N}$ according to the decomposition (2.3). Then every invariant probability $\nu\in\mathbb{R}^{N}$ can be uniquely decomposed as

$$\nu=\sum_{i=1}^{R}\alpha_{i}\nu^{[i]}, \tag{2.4}$$

where $\alpha_{1},\dotsc,\alpha_{R}\in[0,1]$ and $\sum_{i=1}^{R}\alpha_{i}=1$ .
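As a small illustration of Proposition 2.6, consider the following worked example. The matrix is a hypothetical choice made here for illustration and does not appear in the paper. It has $N=3$, two absorbing states, and one transient state that moves to either absorbing state with probability $\frac{1}{2}$.

```latex
P=\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ \tfrac12 & \tfrac12 & 0 \end{pmatrix},
\qquad R=2,\quad P_{1}=P_{2}=(1),\quad Q=(0),\quad \rho(Q)=0<1.
% Invariant probabilities: solve \nu=\nu P for a row vector \nu=(\nu_1,\nu_2,\nu_3).
\nu P=\bigl(\nu_{1}+\tfrac12\nu_{3},\ \nu_{2}+\tfrac12\nu_{3},\ 0\bigr)=\nu
\iff \nu_{3}=0,
\qquad\text{hence}\qquad
\nu=\alpha_{1}(1,0,0)+\alpha_{2}(0,1,0),\quad \alpha_{1},\alpha_{2}\geq 0,\ \alpha_{1}+\alpha_{2}=1.
```

Here $\nu^{[1]}=(1,0,0)$ and $\nu^{[2]}=(0,1,0)$ are the canonical extensions of the invariant probabilities of $P_{1}$ and $P_{2}$, and every invariant probability vanishes on the transient index $3$.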

The next lemma, useful in the proof of some of our results, uses the previous decomposition to obtain that any $(\nu,P)$ -cycle has all its indices corresponding to a same diagonal block $P_{i}$ in (2.3).

Lemma 2.7.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic matrix decomposed according to Proposition 2.6. For $i\in\llbracket 1,R\rrbracket$ , let

$$\mathcal{I}_{i}=\left\llbracket 1+\sum_{j=1}^{i-1}n_{j},\sum_{j=1}^{i}n_{j}\right\rrbracket,$$

i.e., $\mathcal{I}_{i}$ is the set of indices corresponding to the diagonal block $P_{i}$ in (2.3). Let $\nu$ be an invariant probability for $P$ . Then, for every $(\nu,P)$ -cycle $(i_{1},\dotsc,i_{n})$ , there exists $j\in\llbracket 1,R\rrbracket$ such that $i_{1},\dotsc,i_{n}$ are in $\mathcal{I}_{j}$ .

Proof.

Notice that, by (2.4), $\nu_{i}=0$ if $i\notin\bigcup_{j\in\llbracket 1,R\rrbracket}\mathcal{I}_{j}$ . Hence, since $\nu_{i_{1}}>0$ , there exists $j\in\llbracket 1,R\rrbracket$ such that $i_{1}\in\mathcal{I}_{j}$ . Since $p_{i_{1}i_{2}}>0$ , it follows by the block decomposition (2.3) that $i_{2}\in\mathcal{I}_{j}$ . The conclusion follows by an immediate inductive argument. ∎

We also introduce the following notation.

Definition 2.8.

Let $P$ be a stochastic matrix and $\mathcal{A}=(A_{1},\dotsc,A_{N})$ be an $N$ -tuple of $d\times d$ matrices with real coefficients.

(a) For every $P$-word $(i_{1},\dotsc,i_{k})$, we use $A(i_{1},\dotsc,i_{k})$ to denote the matrix product $A_{i_{k}}\dotsm A_{i_{1}}$.

(b) For every $s\in\llbracket 1,N\rrbracket$, let $C(P,s)$ be the matrix semigroup made of all matrix products associated with $P$-cycles with starting index $s$, i.e.,

$$C(P,s)=\{A(i_{1},\dotsc,i_{k})\mid(i_{1},\dotsc,i_{k})\text{ is a }P\text{-cycle and }i_{1}=s\}.$$

We also set

$$C(P)=\bigcup_{s\in\llbracket 1,N\rrbracket}C(P,s).$$

We finally provide the definition of the probabilistic counterpart of $\rho_{\mathrm{d}}(\mathcal{A})$ for $\Sigma(\mathcal{A})$ . Let $P=(p_{ij})_{1\leq i,j\leq N}$ be a stochastic matrix, $\nu=(\nu_{1},\dotsc,\nu_{N})$ be an invariant probability for $P$ , and $\mathcal{A}=(A_{1},\dotsc,A_{N})$ an $N$ -tuple in $\mathcal{M}_{d}(\mathbb{R})$ . The probabilistic joint spectral radius $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ is defined as

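$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\limsup_{n\to\infty}\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right], \tag{2.5}$$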

where

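$$\mathbb{E}_{(\nu,P)}\left[\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}\right]=\sum_{(i_{1},\dotsc,i_{n})\in\llbracket 1,N\rrbracket^{n}}\nu_{i_{1}}p_{i_{1}i_{2}}\dotsm p_{i_{n-1}i_{n}}\left\lVert A_{i_{n}}\dotsm A_{i_{1}}\right\rVert^{1/n}. \tag{2.6}$$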

As in the deterministic case, $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ does not depend on the specific choice of the norm $\lVert\cdot\rVert$ and we have

[TABLE]

Remark 2.9.

The expectation in (2.5) is taken with respect to the random variable $(i_{1},\dotsc,\allowbreak i_{n})\in\llbracket 1,N\rrbracket^{n}$ . The definition of probabilistic joint spectral radius provided here is a particular instance of a more general and comprehensive formulation based on symbolic dynamics; see, for instance, [12, 11, 24]. Notice also that it follows from the definition of $(\nu,P)$ -word that the summation in (2.6) can be restricted to $(\nu,P)$ -words of length $n$ .

Remark 2.10.

When dealing with probabilistic switching phenomena in discrete time, several works, such as [24, 17, 16, 11, 8, 1], deal with the probabilistic Lyapunov exponent $\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})$ defined by

[TABLE]

Our choice to use $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ instead is motivated by the fact that the main goal of our paper is to compare the probabilistic behavior of (1.1) with the worst deterministic behavior provided by the classical joint spectral radius $\rho_{\mathrm{d}}(\mathcal{A})$ , whose definition in discrete time (2.1) does not involve taking the logarithm of the norm of the matrix product. Working with $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ also has the additional advantage of being able to handle the case $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=0$ without dealing with the singularity of the logarithm at $0$ .

Clearly, by Jensen’s inequality, we have $e^{\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})}\leq\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ , but this inequality may be strict in some cases. Indeed, for $d=1$ , $N=2$ , $P=\operatorname{Id}_{2}$ , $\nu=(\frac{1}{2},\frac{1}{2})$ , and $\mathcal{A}=(A_{1},A_{2})\in\mathcal{M}_{1}(\mathbb{R})^{2}\simeq\mathbb{R}^{2}$ , we easily compute that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\frac{1}{2}(A_{1}+A_{2})$ and $e^{\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})}=\sqrt{A_{1}A_{2}}$ .
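The gap in this scalar example can be checked numerically. The following Python sketch is only an illustration: it assumes, as the two values quoted above suggest, that the expectation in (2.5) averages the $n$ -th root of the norm of the product, and it takes positive scalars $A_{1},A_{2}$ so that the logarithm is defined. The function name `word_statistics` and the chosen values are hypothetical, and indices are 0-based.

```python
import itertools
import math

import numpy as np


def word_statistics(nu, P, mats, n):
    """Average, over words of length n weighted by the Markov chain (nu, P),
    of ||A_{i_n} ... A_{i_1}||^(1/n), together with exp of the average of
    (1/n) log ||A_{i_n} ... A_{i_1}||. Assumes all products are nonzero."""
    d = mats[0].shape[0]
    mean_root = 0.0
    mean_log = 0.0
    for word in itertools.product(range(len(mats)), repeat=n):
        weight = nu[word[0]]
        for a, b in zip(word, word[1:]):
            weight *= P[a, b]
        if weight == 0.0:
            continue  # only words of positive probability contribute
        prod = np.eye(d)
        for i in word:
            prod = mats[i] @ prod  # builds A_{i_n} ... A_{i_1}
        norm = np.linalg.norm(prod, 2)
        mean_root += weight * norm ** (1.0 / n)
        mean_log += weight * math.log(norm) / n
    return mean_root, math.exp(mean_log)


# Hypothetical values A_1 = 4, A_2 = 1 with P = Id_2 and nu = (1/2, 1/2).
nu = np.array([0.5, 0.5])
P = np.eye(2)
mats = [np.array([[4.0]]), np.array([[1.0]])]
arithmetic, geometric = word_statistics(nu, P, mats, n=6)
```

With these values the first quantity equals $\frac{1}{2}(A_{1}+A_{2})=2.5$ and the second equals $\sqrt{A_{1}A_{2}}=2$ for every $n$ , since the chain never leaves its initial state.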

We do have equality between $e^{\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})}$ and $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ , however, under the assumption that $(\nu,P)$ defines an ergodic Markov chain, i.e., $\nu=\nu^{[i]}$ for some $i\in\llbracket 1,R\rrbracket$ in the decompositions (2.3) and (2.4) in Proposition 2.6. Indeed, in this case, the main result of [16] implies that

[TABLE]

where $\mathbb{P}_{(\nu,P)}$ denotes the probability measure on $\mathfrak{S}$ associated canonically with the transition matrix $P$ and the invariant probability $\nu$ . Using this fact, one deduces that $e^{\lambda_{\mathrm{p}}(\nu,P,\mathcal{A})}=\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ , and, in addition, we also have the equality

[TABLE]

Notice that, in particular, $(\nu,P)$ is ergodic when $P$ is strongly connected and $\nu$ is its unique invariant measure.

Remark 2.11.

The deterministic joint spectral radius $\rho_{\mathrm{d}}(\mathcal{A})$ provides the worst asymptotic behavior of $\Sigma(\mathcal{A})$ with respect to $\sigma\in\mathfrak{S}$ . By introducing the probability measure $\mathbb{P}_{(\nu,P)}$ on $\mathfrak{S}$ associated canonically with the transition matrix $P$ and the invariant probability $\nu$ , the quantity $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ defined in (2.5) can be interpreted as an asymptotic behavior averaged by $\mathbb{P}_{(\nu,P)}$ . When $(\nu,P)$ is ergodic, thanks to (2.8), we have the stronger interpretation of $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ as the $\mathbb{P}_{(\nu,P)}$ -almost sure asymptotic behavior of $\Sigma(\mathcal{A})$ .

It is immediate to see that, for every $(\nu,P,\mathcal{A})$ as above, we have $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A})$ , and then

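$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})\leq\sup_{\nu^{\prime}}\rho_{\mathrm{p}}(\nu^{\prime},P,\mathcal{A})\leq\sup_{(\nu^{\prime},P^{\prime})}\rho_{\mathrm{p}}(\nu^{\prime},P^{\prime},\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A}),\tag{2.9}$$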

where the first supremum is taken over all invariant probabilities $\nu^{\prime}$ for $P$ and the second one over the pairs $(\nu^{\prime},P^{\prime})$ made of an $N\times N$ stochastic matrix $P^{\prime}$ and an invariant probability $\nu^{\prime}$ for $P^{\prime}$ . We find it useful to introduce the notation

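$$\rho_{\mathrm{p}}(P,\mathcal{A})=\sup_{\nu^{\prime}}\rho_{\mathrm{p}}(\nu^{\prime},P,\mathcal{A}),\qquad\rho_{\mathrm{p}}(\mathcal{A})=\sup_{(\nu^{\prime},P^{\prime})}\rho_{\mathrm{p}}(\nu^{\prime},P^{\prime},\mathcal{A}).\tag{2.10}$$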

Remark 2.12.

It follows from (2.7) that $(\nu^{\prime},P^{\prime})\mapsto\rho_{\mathrm{p}}(\nu^{\prime},P^{\prime},\mathcal{A})$ is upper semicontinuous. Moreover, the set of pairs $(\nu^{\prime},P^{\prime})$ consisting of an $N\times N$ stochastic matrix $P^{\prime}$ and an invariant probability $\nu^{\prime}$ for $P^{\prime}$ is compact. As a consequence, the suprema in (2.10) can be replaced by maxima.

3 Equality between deterministic and probabilistic joint spectral radii

3.1 Equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$

The goal of this section is to prove the following result characterizing equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ .

Theorem 3.1.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic matrix, $\nu\in\mathbb{R}^{N}$ be an invariant probability measure for $P$ , and $\mathcal{A}=(A_{1},\allowbreak\dotsc,\allowbreak A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ .

(b)

$\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $(\nu,P)$ -cycle $(i_{1},\dotsc,i_{k})$ .
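Condition (b) can be probed numerically on short cycles. The Python sketch below is an illustration under stated assumptions, not a decision procedure. It assumes that a $(\nu,P)$ -cycle is a tuple $(i_{1},\dotsc,i_{k})$ with $\nu_{i_{1}}>0$ , $p_{i_{j}i_{j+1}}>0$ for consecutive indices, and $p_{i_{k}i_{1}}>0$ (the definition is given earlier in the paper), and that the value of $\rho_{\mathrm{d}}(\mathcal{A})$ is supplied by the user. Only cycles up to a chosen length are inspected, so a positive answer is evidence for (b), not a proof. All function names are hypothetical and indices are 0-based.

```python
import itertools

import numpy as np


def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))


def nu_P_cycles(nu, P, max_len):
    """Enumerate the (nu, P)-cycles of length at most max_len (assumed definition)."""
    N = P.shape[0]
    for k in range(1, max_len + 1):
        for word in itertools.product(range(N), repeat=k):
            if nu[word[0]] <= 0:
                continue
            # consecutive steps, including the closing step i_k -> i_1
            steps = zip(word, word[1:] + word[:1])
            if all(P[a, b] > 0 for a, b in steps):
                yield word


def first_violation(nu, P, mats, rho_d, max_len, tol=1e-9):
    """Return the first cycle whose normalized spectral radius differs from rho_d,
    or None if no such cycle of length at most max_len exists."""
    d = mats[0].shape[0]
    for word in nu_P_cycles(nu, P, max_len):
        prod = np.eye(d)
        for i in word:
            prod = mats[i] @ prod  # builds A_{i_k} ... A_{i_1}
        if abs(spectral_radius(prod) ** (1.0 / len(word)) - rho_d) > tol:
            return word
    return None


def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


# Hypothetical data: the chain alternates deterministically between two states.
nu = np.array([0.5, 0.5])
P = np.array([[0.0, 1.0], [1.0, 0.0]])
ok = first_violation(nu, P, [rotation(np.pi / 2), rotation(np.pi / 4)], 1.0, 4)
bad = first_violation(nu, P, [rotation(np.pi / 2), 0.5 * rotation(np.pi / 4)], 1.0, 4)
```

In the first call both matrices are rotations, so every product has spectral radius $1$ and no violation is found. In the second call the deterministic joint spectral radius is still $1$ (it is attained by the constant signal on the first matrix), but the cycle `(0, 1)` has normalized spectral radius $\sqrt{1/2}$ and is reported.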

The fact that (a) implies (b) follows from the results in [24], as detailed in the following lemma.

Lemma 3.2.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic matrix, $\nu\in\mathbb{R}^{N}$ be an invariant probability measure for $P$ , and $\mathcal{A}=(A_{1},\allowbreak\dotsc,\allowbreak A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . If $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ , then $\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $(\nu,P)$ -cycle $(i_{1},\dotsc,i_{k})$ .

Proof.

If $\rho_{\mathrm{d}}(\mathcal{A})=0$ , the result follows trivially from (2.2). We then assume $\rho_{\mathrm{d}}(\mathcal{A})>0$ , we decompose $P$ and $\nu$ according to Proposition 2.6, and we use in the sequel the same notations as in its statement. We also let $\mathcal{I}_{1},\dotsc,\mathcal{I}_{R}$ be defined as in the statement of Lemma 2.7. Thanks to (2.4), (2.5), and (2.6), we have

[TABLE]

By (2.9), we have $\rho_{\mathrm{p}}(\nu^{[j]},P,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A})$ for every $j\in\llbracket 1,R\rrbracket$ and, since $\alpha_{j}\in[0,1]$ for every $j\in\llbracket 1,R\rrbracket$ and $\sum_{j=1}^{R}\alpha_{j}=1$ , we deduce from (3.1) and the equality $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ that $\rho_{\mathrm{p}}(\nu^{[j]},P,\mathcal{A})=\rho_{\mathrm{d}}(\mathcal{A})$ for every $j\in\llbracket 1,R\rrbracket$ such that $\alpha_{j}>0$ .

Let $(i_{1},\dotsc,i_{k})$ be a $(\nu,P)$ -cycle and note that, by Lemma 2.7, there exists $r\in\llbracket 1,R\rrbracket$ such that $i_{1},\dotsc,i_{k}$ are in $\mathcal{I}_{r}$ , and thus, in particular, $(i_{1},\dotsc,i_{k})$ is also a $(\nu^{[r]},P)$ -cycle. Moreover, such an $r$ necessarily satisfies $\alpha_{r}>0$ .

Consider the $k$ -periodic switching signal $\sigma\in\mathfrak{S}$ corresponding to $(i_{1},\dotsc,i_{k})$ , defined by $\sigma(j+\ell k)=i_{j}$ for all integers $j\in\llbracket 1,k\rrbracket$ and $\ell\geq 0$ . Endow $\mathfrak{S}$ with its usual product topology and denote by $\mathbb{P}_{r}$ the Borel probability measure on $\mathfrak{S}$ corresponding to the Markov chain defined by $(\nu^{[r]},P)$ . Note that, since $(i_{1},\dotsc,i_{k})$ is a $(\nu^{[r]},P)$ -cycle, for every $n\in\mathbb{N}$ the set $\{\widetilde{\sigma}\in\mathfrak{S}\mid\widetilde{\sigma}(i)=\sigma(i)\text{ for every }i\in\llbracket 1,n\rrbracket\}$ has positive $\mathbb{P}_{r}$ -measure, and thus $\sigma$ is in the support of $\mathbb{P}_{r}$ . Moreover, using also Remark 2.10, we have $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\nu^{[r]},P,\mathcal{A})=e^{\lambda_{\mathrm{p}}(\nu^{[r]},P,\mathcal{A})}$ , and hence $\mathbb{P}_{r}$ is a maximizing measure of $\mathcal{A}$ in the sense of [24], where the set of maximizing measures of $\mathcal{A}$ is defined as the set of all Borel probability measures on $\mathfrak{S}$ invariant under the usual time shift and such that the corresponding probabilistic Lyapunov exponent coincides with $\log\rho_{\mathrm{d}}(\mathcal{A})$ . Hence $\sigma$ belongs to the Mather set of $\mathcal{A}$ (see [24, Theorem 2.3], where the Mather set of $\mathcal{A}$ is defined as the union of the supports of all maximizing measures of $\mathcal{A}$ ), and thus, by [24, Theorem 2.3(3)], we get

[TABLE]

Set $M=\rho_{\mathrm{d}}(\mathcal{A})^{-k}A_{i_{k}}\dotsm A_{i_{1}}$ . By (2.2), we have that $\rho(M)\leq 1$ . For every $n\geq 1$ , there exist integers $\ell\geq 0$ and $j\in\llbracket 0,k-1\rrbracket$ such that $n=j+\ell k$ . Since $\sigma$ is $k$ -periodic, we have that

[TABLE]

If $\rho(M)<1$ , then the right-hand side of the above inequality tends to $0$ as $\ell\to\infty$ , contradicting (3.2). Hence, we have necessarily $\rho(M)=1$ . ∎

The proof that (b) implies (a) in Theorem 3.1 is decomposed into three steps. We first establish the result under the extra assumptions that $\mathcal{A}$ is irreducible and $P$ is strongly connected (Lemma 3.3). We then obtain the conclusion under the sole additional assumption that $P$ is strongly connected (Lemma 3.5). Finally, we consider the general case in the third step.

Lemma 3.3.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic strongly connected matrix, $\mathcal{A}=(A_{1},\dotsc,\allowbreak A_{N})\allowbreak\in\mathcal{M}_{d}(\mathbb{R})^{N}$ be irreducible, and $\lVert\cdot\rVert_{\mathrm{B}}$ be a Barabanov norm for $\mathcal{A}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})$ .

(b)

$\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ .

(c)

$\lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $P$ -word $(i_{1},\dotsc,i_{k})$ .

Proof.

The fact that (a) implies (b) is a particular case of Lemma 3.2. Moreover, it is immediate that (c) implies (a) thanks to (2.5), (2.6), and Remark 2.9. We are then left to prove that (b) implies (c).

Assume that (b) holds. Fix a $P$ -word $(i_{1},\dotsc,i_{k})$ . Since $P$ is strongly connected, there exist $r\in\mathbb{N}$ and $i_{k+1},\dotsc,i_{r}\in\llbracket 1,N\rrbracket$ (obtained by connecting $i_{k}$ to $i_{1}$ ) such that $(i_{1},\dotsc,i_{r})$ is a $P$ -cycle. Then, by (b),

[TABLE]

Since the spectral radius is a lower bound for any induced norm of a matrix, we obtain that

[TABLE]

Using the fact that $\lVert\cdot\rVert_{\mathrm{B}}$ is a Barabanov norm, we also have that

[TABLE]

By combining the previous inequalities, it follows that $\lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}=\rho_{\mathrm{d}}(\mathcal{A})^{k}$ . ∎
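For the reader's convenience, the combination step in the proof that (b) implies (c) can be written as a single chain. The display below is a sketch assembled from the surrounding text, not the original typesetting. It assumes only that the extremal norm satisfies $\lVert A_{i}x\rVert_{\mathrm{B}}\leq\rho_{\mathrm{d}}(\mathcal{A})\lVert x\rVert_{\mathrm{B}}$ for every $i$ and $x$ , and it uses the $P$ -cycle $(i_{1},\dotsc,i_{r})$ extending the $P$ -word $(i_{1},\dotsc,i_{k})$ .

```latex
\rho_{\mathrm{d}}(\mathcal{A})^{r}
  = \rho(A_{i_{r}}\dotsm A_{i_{1}})
  \leq \lVert A_{i_{r}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}
  \leq \lVert A_{i_{r}}\dotsm A_{i_{k+1}}\rVert_{\mathrm{B}}\,
       \lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}
  \leq \rho_{\mathrm{d}}(\mathcal{A})^{r-k}\,
       \lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}
  \leq \rho_{\mathrm{d}}(\mathcal{A})^{r}.
```

Since the two ends coincide, every inequality is an equality. When $\rho_{\mathrm{d}}(\mathcal{A})>0$ , dividing the last equality by $\rho_{\mathrm{d}}(\mathcal{A})^{r-k}$ gives $\lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert_{\mathrm{B}}=\rho_{\mathrm{d}}(\mathcal{A})^{k}$ .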

Remark 3.4.

The proof of Lemma 3.3 only uses that $\lVert\cdot\rVert_{\mathrm{B}}$ is an extremal norm, i.e., it satisfies (a) in Definition 2.1. The irreducibility assumption on $\mathcal{A}$ could then be replaced by its nondefectiveness (we refer the reader to [19, Section 2.1.2] for details). However, we prefer to state Lemma 3.3 in terms of irreducibility since this condition is easier to handle: it can be checked more directly and, up to a linear change of coordinates, a reducible $\mathcal{A}$ can be put into block-triangular form with irreducible diagonal blocks. This block decomposition is a key argument in the proof of Lemma 3.5.

We now consider the case where $\mathcal{A}$ is not necessarily irreducible. Here, a Barabanov norm for $\mathcal{A}$ in general does not exist, and hence item (c) from Lemma 3.3 cannot be expected.

Lemma 3.5.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic strongly connected matrix and $\mathcal{A}=(A_{1},\allowbreak\dotsc,\allowbreak A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})$ .

(b)

$\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ .

Proof.

Before giving the core of the argument, we start with a set of remarks. First, up to a linear change of coordinates, $A_{1},\dotsc,A_{N}$ can be presented in block-triangular form as

[TABLE]

with $\mathcal{A}^{(r)}=(A_{1}^{(r)},\dotsc,A_{N}^{(r)})$ irreducible for every $r\in\llbracket 1,R\rrbracket$ . Remark that, on the one hand, according to [19, Proposition 1.5], we have $\rho_{\mathrm{d}}(\mathcal{A})=\max_{r\in\llbracket 1,R\rrbracket}\rho_{\mathrm{d}}(\mathcal{A}^{(r)})$ and, on the other hand, it follows from [17, Theorem 1.1] and the strong connectedness of $P$ that $\rho_{\mathrm{p}}(P,\mathcal{A})=\max_{r\in\llbracket 1,R\rrbracket}\allowbreak\rho_{\mathrm{p}}(\allowbreak P,\allowbreak\mathcal{A}^{(r)})$ . Moreover, for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ , we have

[TABLE]

where the inequality comes from (2.2) and the equality results from the simple fact that the spectral radius of a block-triangular matrix is equal to the maximum of the spectral radii over the diagonal blocks.

Since (a) implies (b) by Lemma 3.2, we are left to prove the converse implication. Assume that (b) holds true. Then (a) holds trivially if $\rho_{\mathrm{d}}(\mathcal{A})=0$ . Otherwise, we can assume, with no loss of generality, that $\rho_{\mathrm{d}}(\mathcal{A})=1$ up to replacing $\mathcal{A}$ by $\rho_{\mathrm{d}}(\mathcal{A})^{-1}\mathcal{A}$ . By assumption and (3.3), for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ , there exists $r\in\llbracket 1,R\rrbracket$ such that $\rho\left(A_{i_{k}}^{(r)}\dotsm A_{i_{1}}^{(r)}\right)=1$ .

We claim that $r$ can be chosen independently of the $P$ -cycle. We argue by contradiction, i.e., we assume that, for every $r\in\llbracket 1,R\rrbracket$ , there exists a $P$ -cycle $i^{r}=(i_{1}^{r},\dotsc,i_{\ell_{r}}^{r})$ such that $\rho(A^{(r)}(i^{r}))<1$ . Let $j^{r}=(j_{1}^{r},\dotsc,j_{k_{r}}^{r})$ be a $P$ -word such that $j_{1}^{r}=i_{1}^{r}$ and $p_{j_{k_{r}}^{r}i_{1}^{r+1}}>0$ (with the convention that $i_{1}^{R+1}=i_{1}^{1}$ ). Then, for every $n\in\mathbb{N}$ ,

[TABLE]

For every $n$ , we apply (b) to the above product, and we deduce from (3.3) that there exists $r_{n}\in\llbracket 1,R\rrbracket$ such that

[TABLE]

Let $(n_{q})_{q\in\mathbb{N}}$ be an increasing sequence such that there exists $\overline{r}\in\llbracket 1,R\rrbracket$ with $r_{n_{q}}=\overline{r}$ for every $q\in\mathbb{N}$ . Since $\mathcal{A}^{(\overline{r})}$ is irreducible, there exists a Barabanov norm $\lVert\cdot\rVert_{\overline{r}}$ for $\mathcal{A}^{(\overline{r})}$ . Then, for every $q\in\mathbb{N}$ , we have

[TABLE]

where the last inequality follows from the fact that $\lVert\cdot\rVert_{\overline{r}}$ is a Barabanov norm. Since $\rho(A^{(\overline{r})}(i^{\overline{r}}))<1$ , we have that $\left\lVert A^{(\overline{r})}(i^{\overline{r}})^{n_{q}}\right\rVert_{\overline{r}}\xrightarrow[q\to\infty]{}0$ , which yields the desired contradiction.

We thus have proved that there exists $\overline{r}\in\llbracket 1,R\rrbracket$ such that, for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ ,

[TABLE]

On the other hand, by (2.2), we have $\rho\left(A_{i_{k}}^{(\overline{r})}\dotsm A_{i_{1}}^{(\overline{r})}\right)\leq\rho_{\mathrm{d}}(\mathcal{A}^{(\overline{r})})$ . Since $\rho_{\mathrm{d}}(\mathcal{A}^{(\overline{r})})\leq\rho_{\mathrm{d}}(\mathcal{A})$ , we deduce that

[TABLE]

for every $P$ -cycle $(i_{1},\dotsc,i_{k})$ . Then, using Lemma 3.3, we obtain that

[TABLE]

and then (a) holds thanks to (2.9). ∎
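The proof above relies on the fact that the spectral radius of a block-triangular matrix equals the maximum of the spectral radii of its diagonal blocks. A quick numerical check of this fact is sketched below in Python. The blocks are random and purely illustrative, the block sizes are arbitrary, and the upper-triangular layout is an assumption made for the example (the same identity holds for the lower-triangular layout).

```python
import numpy as np


def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))


rng = np.random.default_rng(1)
B1 = rng.standard_normal((2, 2))  # first diagonal block
B2 = rng.standard_normal((3, 3))  # second diagonal block
C = rng.standard_normal((2, 3))   # off-diagonal coupling block

# Upper block-triangular matrix with diagonal blocks B1 and B2.
M = np.block([[B1, C], [np.zeros((3, 2)), B2]])

lhs = spectral_radius(M)
rhs = max(spectral_radius(B1), spectral_radius(B2))
```

The coupling block `C` has no influence on the eigenvalues, so `lhs` and `rhs` agree up to rounding.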

We can now conclude the proof of Theorem 3.1.

Proof of Theorem 3.1.

Recall that, thanks to Lemma 3.2, we are only left to prove that (b) implies (a). We first decompose $P$ and $\nu$ according to Proposition 2.6 and use in the sequel the same notations as in its statement. Thanks to (2.5) and (2.6), we have

[TABLE]

For $j\in\llbracket 1,R\rrbracket$ , let $\mathcal{A}^{[j]}$ be the ordered $n_{j}$ -tuple made of the matrices $A_{\ell}$ such that $\nu^{[j]}_{\ell}>0$ . Notice that $\rho_{\mathrm{p}}(\nu^{[j]},P_{j},\mathcal{A}^{[j]})=\rho_{\mathrm{p}}(\nu^{[j]},P,\mathcal{A})$ for every $j\in\llbracket 1,R\rrbracket$ . Using (2.9) and the fact that $\mathcal{A}^{[j]}$ is made of matrices from $\mathcal{A}$ , we obtain that, for every $j\in\llbracket 1,R\rrbracket$ ,

[TABLE]

Let $\mathcal{I}_{i}$ be defined for $i\in\llbracket 1,R\rrbracket$ as in Lemma 2.7 and let $j\in\llbracket 1,R\rrbracket$ be such that $\alpha_{j}>0$ . Thanks to Lemma 2.7, there exists a $(\nu,P)$ -cycle $(i_{1},\dotsc,i_{k})$ with $i_{1},\dotsc,i_{k}$ in $\mathcal{I}_{j}$ . Then, by (2.2), (3.5), and (b), we have

[TABLE]

In particular, $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{d}}(\mathcal{A}^{[j]})$ and $\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A}^{[j]})$ . Lemma 3.5 applied to $P_{j}$ and $\mathcal{A}^{[j]}$ yields that $\rho_{\mathrm{p}}(\nu^{[j]},P_{j},\mathcal{A}^{[j]})=\rho_{\mathrm{d}}(\mathcal{A}^{[j]})$ . Hence $\rho_{\mathrm{p}}(\nu^{[j]},P_{j},\allowbreak\mathcal{A}^{[j]})=\rho_{\mathrm{d}}(\mathcal{A})$ , and, since this holds for every $j\in\llbracket 1,R\rrbracket$ such that $\alpha_{j}>0$ , it follows from (3.4) that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{d}}(\mathcal{A})$ , as required. ∎

Remark 3.6.

Theorem 3.1 and Lemmas 3.3 and 3.5 characterize equality between deterministic and probabilistic joint spectral radii in terms of $P$ -cycles and $(\nu,P)$ -cycles only, and hence this equality depends only on $\lceil P\rceil$ and $\lceil\nu\rceil$ (see Remark 2.4). In other words, equality in Theorem 3.1 (a) depends only on the graph associated with the Markov chain and the possible choices of initial states, but not on the precise values of the non-zero initial and transition probabilities.

3.2 Geometric characterization of equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(P,\allowbreak\mathcal{A})$

It is clear from Theorem 3.1 that equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(P,\mathcal{A})$ is possible only for restricted choices of $\mathcal{A}$ . The goal of this section is to provide a more precise description of such choices of $\mathcal{A}$ using results from [25], where the authors classify matrix semigroups of constant spectral radius. We start with the following proposition.

Proposition 3.7.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic strongly connected matrix and $\mathcal{A}=(A_{1},\dotsc,\allowbreak A_{N})\allowbreak\in\mathcal{M}_{d}(\mathbb{R})^{N}$ be such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})$ . Assume that there exists $s\in\llbracket 1,N\rrbracket$ such that $C(P,s)$ is irreducible. Then there exists an invertible matrix $G\in\mathcal{M}_{d}(\mathbb{R})$ such that, for every $P$ -cycle $i$ starting at $s$ , either $A(i)$ is singular or $\rho_{\mathrm{d}}(\mathcal{A})^{-k}GA(i)G^{-1}$ is orthogonal, where $k$ is the length of $i$ .

Proof.

We only have to provide an argument if there exists a $P$ -cycle $i_{\ast}$ starting at $s$ such that $A(i_{\ast})$ is invertible. In that case, from (2.2), $\rho_{\mathrm{d}}(\mathcal{A})\geq\rho(A(i_{\ast}))^{1/k_{\ast}}>0$ , where $k_{\ast}$ is the length of $i_{\ast}$ . From Lemma 3.5, the set

[TABLE]

is a matrix semigroup with constant spectral radius. Since, moreover, this semigroup is also irreducible, the conclusion follows from [25, Theorem 2]. ∎

Remark 3.8.

As remarked in [25], the problem of classifying matrix semigroups with constant spectral radius is highly nontrivial when the semigroup contains singular matrices. By using additional results from [25], we may obtain, under the assumptions of Proposition 3.7, properties on $\rho_{\mathrm{d}}(\mathcal{A})^{-k}GA(i)G^{-1}$ that are weaker than orthogonality but apply to all matrices $A(i)\in C(P,s)$ , and not only nonsingular ones. We refer the interested reader to [25, Theorem 3 and Corollary 6].

A limitation of Proposition 3.7 lies in the fact that, in general, given a stochastic and strongly connected matrix $P$ , it is a nontrivial task to verify the existence of an index $s$ such that $C(P,s)$ is irreducible, even if $\mathcal{A}$ is itself irreducible. However, this is true if one assumes in addition that $\mathcal{A}$ contains only invertible matrices and that all diagonal elements of $P$ are positive, in which case we have the following proposition.

Proposition 3.9.

Let $P\in\mathcal{M}_{N}(\mathbb{R})$ be a stochastic strongly connected matrix with positive diagonal entries and $\mathcal{A}=(A_{1},\dotsc,\allowbreak A_{N})\allowbreak\in\mathcal{M}_{d}(\mathbb{R})^{N}$ be irreducible with $A_{1},\dotsc,A_{N}$ invertible. Then, for every $s\in\llbracket 1,N\rrbracket$ , $C(P,s)$ is irreducible. Moreover, $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})$ if and only if there exists an invertible matrix $G\in\mathcal{M}_{d}(\mathbb{R})$ such that, for every $i\in\llbracket 1,N\rrbracket$ , $\rho_{\mathrm{d}}(\mathcal{A})^{-1}GA_{i}G^{-1}$ is orthogonal.

Proof.

Let $s\in\llbracket 1,N\rrbracket$ and consider the group $\widetilde{C}(P,s)$ generated by $C(P,s)$ . We claim that $A_{1},\dotsc,A_{N}\in\widetilde{C}(P,s)$ . Indeed, since $P$ is strongly connected, there exists a $P$ -cycle $i=(i_{1},\dotsc,i_{k})$ starting at $s$ such that $\{i_{1},\dotsc,i_{k}\}=\llbracket 1,N\rrbracket$ . Since $p_{i_{k}i_{k}}>0$ , then $A_{i_{k}}^{2}A_{i_{k-1}}\dotsm A_{i_{1}}\in C(P,s)$ and

[TABLE]

Similarly, since $p_{i_{k-1}i_{k-1}}>0$ , then $A_{i_{k}}A_{i_{k-1}}^{2}A_{i_{k-2}}\dotsm A_{i_{1}}\in C(P,s)$ and

[TABLE]

An inductive reasoning based on the identity

[TABLE]

yields that $A_{i_{j}}\in\widetilde{C}(P,s)$ for $j\in\llbracket 1,k\rrbracket$ , as required.

To prove that $C(P,s)$ is irreducible for every $s$ , assume by contradiction that there exists $s\in\llbracket 1,N\rrbracket$ such that $C(P,s)$ is reducible. Then the group $\widetilde{C}(P,s)$ is also reducible. However, since it contains $A_{1},\dotsc,A_{N}$ , this contradicts the irreducibility of $\mathcal{A}$ .

Since $A_{1},\dotsc,A_{N}$ are invertible matrices, $\rho_{\mathrm{d}}(\mathcal{A})$ is positive and, with no loss of generality, we can assume that $\rho_{\mathrm{d}}(\mathcal{A})=1$ . If $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})$ , then, applying Proposition 3.7 to $C(P,1)$ , there exists a basis in which every $M\in C(P,1)$ is orthogonal. Hence, in this same basis, $\widetilde{C}(P,1)$ is also made of orthogonal matrices, yielding the conclusion. On the other hand, if there exists a basis in which $A_{1},\dotsc,A_{N}$ are orthogonal, then $\rho(A(i))=1$ for every $P$ -word $i$ , and the conclusion follows by Lemma 3.5. ∎
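The easy direction of the equivalence can be illustrated numerically: if the matrices are simultaneously similar to multiples of orthogonal matrices, then every product has the same normalized spectral radius. The Python sketch below builds such a family in dimension $2$ and checks all words up to a given length. The change of basis `G`, the scale factor, and the rotation angles are hypothetical choices made for the example. For this family the deterministic joint spectral radius equals `scale`, because each matrix has norm `scale` in the norm $x\mapsto\lVert Gx\rVert$ and spectral radius `scale`.

```python
import itertools

import numpy as np


def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def normalized_radii(mats, scale, max_len):
    """Return rho(A_{i_k} ... A_{i_1})^(1/k) / scale for all words of length <= max_len."""
    d = mats[0].shape[0]
    out = []
    for k in range(1, max_len + 1):
        for word in itertools.product(range(len(mats)), repeat=k):
            prod = np.eye(d)
            for i in word:
                prod = mats[i] @ prod  # builds A_{i_k} ... A_{i_1}
            rho = max(abs(np.linalg.eigvals(prod)))
            out.append(rho ** (1.0 / k) / scale)
    return out


G = np.array([[2.0, 1.0], [0.0, 1.0]])  # hypothetical invertible change of basis
scale = 3.0                             # plays the role of rho_d(A) in this example
mats = [scale * np.linalg.inv(G) @ rotation(t) @ G for t in (0.4, 1.3, 2.0)]
radii = normalized_radii(mats, scale, max_len=4)
```

Every entry of `radii` equals $1$ up to rounding, so condition (b) of Lemma 3.5 holds on all inspected words, whatever the strongly connected matrix $P$ .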

Remark 3.10.

Notice that, to obtain the second part of the conclusion of Proposition 3.9, it is enough that there exists $s\in\llbracket 1,N\rrbracket$ such that $C(P,s)$ is irreducible and the generated group $\widetilde{C}(P,s)$ contains all matrices $A_{1},\dotsc,A_{N}$ . The assumption that $P$ has positive diagonal entries is used to guarantee the latter, and therefore it can be replaced by any other condition ensuring that $A_{1},\dotsc,A_{N}$ belong to $\widetilde{C}(P,s)$ for some $s\in\llbracket 1,N\rrbracket$ . For instance, assume that $p_{11}=0$ and $p_{jj}>0$ for $j\in\llbracket 2,N\rrbracket$ . For every $P$ -cycle $(i_{1},\dotsc,i_{k})$ with $i_{1}=1$ and $i_{j}\neq 1$ for every $j\in\llbracket 2,k\rrbracket$ , we can proceed as in the proof of the proposition to obtain that $A_{i_{j}}\in\widetilde{C}(P,1)$ for every $j\in\llbracket 2,k\rrbracket$ and use the identity

[TABLE]

to obtain that $A_{i_{1}}\in\widetilde{C}(P,1)$ . Since $P$ is strongly connected, every matrix $A_{i}$ , $i\in\llbracket 1,N\rrbracket$ , belongs to such a $P$ -cycle, hence the conclusion.

In light of Remark 3.10, we may wonder whether the second part of the conclusion of Proposition 3.9 can be obtained under even weaker assumptions on the matrix $P$ , allowing for instance the presence of more than one diagonal element equal to zero, but requiring at least one non-zero element in the diagonal. The example given below shows that this is not possible.

Example 3.11.

Consider the case $d=2$ , $N=3$ ,

[TABLE]

Note that, in this case,

[TABLE]

The matrix $P$ is stochastic, strongly connected, and its unique invariant probability is $\nu=(\frac{1}{2},\frac{1}{4},\frac{1}{4})$ . Moreover, $\mathcal{A}=(A_{1},A_{2},A_{3})$ is irreducible and $A_{1},A_{2},A_{3}$ are invertible. Denoting by $\lVert\cdot\rVert$ the Euclidean norm in $\mathbb{R}^{2}$ , we have $\lVert A_{1}\rVert=\lVert A_{2}\rVert=\lVert A_{3}\rVert=1$ , yielding that $\rho_{\mathrm{d}}(\mathcal{A})\leq 1$ , and we easily check that $\rho_{\mathrm{d}}(\mathcal{A})=1$ by considering $\sigma\in\mathfrak{S}$ given by $\sigma(i)=1$ for every $i\in\mathbb{N}$ . Moreover, for any $(\nu,P)$ -word $(i_{1},\dotsc,i_{k})$ , there exist an integer $\ell\geq 0$ and $a,b\in\{0,1\}$ such that $\lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert=\lVert A_{2}^{b}(A_{3}A_{2})^{\ell}A_{3}^{a}\rVert$ . Setting $x=\begin{pmatrix}1\\ 0\end{pmatrix}$ in the case $a=1$ and $x=\begin{pmatrix}0\\ 1\end{pmatrix}$ in the case $a=0$ , it is immediate to verify that $\lVert A_{2}^{b}(A_{3}A_{2})^{\ell}A_{3}^{a}x\rVert=1$ , yielding that $\lVert A_{i_{k}}\dotsm A_{i_{1}}\rVert=1$ for every $(\nu,P)$ -word $(i_{1},\dotsc,i_{k})$ , and thus $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{p}}(P,\mathcal{A})=1$ . However, $A_{3}$ is not similar to an orthogonal matrix, and hence the second conclusion of Proposition 3.9 does not hold. Notice moreover that, in this case, $C(P,1)=\{(A_{3}A_{2})^{n}\mid n\in\mathbb{N}\cup\{0\}\}$ , $C(P,2)=\{(A_{3}A_{2})^{n}\mid n\in\mathbb{N}\}$ , and $C(P,3)=\{(A_{2}A_{3})^{n}\mid n\in\mathbb{N}\}$ , and thus $C(P,s)$ is reducible for every $s\in\{1,2,3\}$ .

Remark 3.12.

We now provide a description of all cases where equality holds between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ under the assumption that $\mathcal{A}$ is irreducible and made of two invertible matrices.

(a)

If $P=\begin{pmatrix}p&1-p\\ 1-q&q\end{pmatrix}$ for $p,q\in[0,1)$ with $p+q>0$ , by Remark 3.10, equality occurs if and only if there exists an invertible matrix $G\in\mathcal{M}_{d}(\mathbb{R})$ such that $\rho_{\mathrm{d}}(\mathcal{A})^{-1}GA_{1}G^{-1}$ and $\rho_{\mathrm{d}}(\mathcal{A})^{-1}GA_{2}G^{-1}$ are orthogonal.

(b)

If $P=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}$ , equality occurs if and only if $\rho(A_{1}A_{2})=\rho(A_{2}A_{1})=\rho_{\mathrm{d}}(\mathcal{A})^{2}$ .

(c)

If $P=\operatorname{Id}_{2}$ , equality occurs if and only if $\rho(A_{i})=\rho_{\mathrm{d}}(\mathcal{A})$ whenever $\nu_{i}>0$ , $i\in\{1,2\}$ .

(d)

If $P=\begin{pmatrix}1&0\\ 1-p&p\end{pmatrix}$ for some $p\in[0,1)$ , then equality is equivalent to $\rho(A_{1})=\rho_{\mathrm{d}}(\mathcal{A})$ .

(e)

If $P=\begin{pmatrix}p&1-p\\ 0&1\end{pmatrix}$ for some $p\in[0,1)$ , then equality is equivalent to $\rho(A_{2})=\rho_{\mathrm{d}}(\mathcal{A})$ .

3.3 Equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\mathcal{A})$

Based on the results obtained previously, we can now address the issue of characterizing the equality between $\rho_{\mathrm{d}}(\mathcal{A})$ and $\rho_{\mathrm{p}}(\mathcal{A})$ . Recall that the latter is defined as the maximum of $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ over all pairs $(\nu,P)$ .

Theorem 3.13.

Let $\mathcal{A}=(A_{1},\dotsc,A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\mathcal{A})$ .

(b)

There exist $i_{1},\dotsc,i_{k}\in\llbracket 1,N\rrbracket$ pairwise distinct such that

[TABLE]

Proof.

We start by proving that (a) implies (b). Recall that, by Remark 2.12, there exist a stochastic matrix $P$ and an invariant probability $\nu$ for $P$ such that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{p}}(\mathcal{A})$ . Using (a), we deduce that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{d}}(\mathcal{A})$ . It is clear that there exists a $(\nu,P)$ -cycle $(i_{1},\dotsc,i_{k})$ such that $i_{1},\dotsc,i_{k}$ are pairwise distinct, and the conclusion follows from Theorem 3.1.

To prove that (b) implies (a), let $P=(p_{ij})$ be a stochastic matrix with $p_{i_{j-1}i_{j}}=1$ for $j\in\llbracket 2,k\rrbracket$ and $p_{i_{k}i_{1}}=1$ . Set $\nu\in\mathbb{R}^{N}$ as the probability vector such that $\nu_{i_{j}}=\frac{1}{k}$ for $j\in\llbracket 1,k\rrbracket$ . Then $\nu$ is invariant under $P$ and the set of $(\nu,P)$ -cycles is made of the shifts of $(i_{1},\dotsc,i_{k})$ and their powers. Moreover, for every such $(\nu,P)$ -cycle $(j_{1},\dotsc,j_{s})$ , we have

[TABLE]

Indeed, this follows from the fact that $\rho(M_{1}M_{2})=\rho(M_{2}M_{1})$ for every $M_{1},M_{2}\in\mathcal{M}_{d}(\mathbb{R})$ . Then Theorem 3.1 (b) holds, hence $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{d}}(\mathcal{A})$ , and the conclusion follows from (2.9). ∎
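The construction used in the proof that (b) implies (a) is easy to reproduce. The Python sketch below builds the deterministic chain that follows a given tuple of pairwise distinct indices and checks the invariance of $\nu$ . It is an illustration only: the function name `cyclic_chain` is hypothetical, indices are 0-based, and the rows of $P$ attached to states outside the cycle are filled arbitrarily (they carry no mass under $\nu$ , so the choice is immaterial). The last lines check numerically, on random matrices, the identity $\rho(M_{1}M_{2})=\rho(M_{2}M_{1})$ used for the shifts of the cycle.

```python
import numpy as np


def cyclic_chain(N, cycle):
    """Return (nu, P) for the chain that deterministically follows `cycle`,
    a list of pairwise distinct 0-based indices in range(N)."""
    k = len(cycle)
    P = np.zeros((N, N))
    nu = np.zeros(N)
    for j in range(k):
        P[cycle[j], cycle[(j + 1) % k]] = 1.0  # step i_j -> i_{j+1}, closing at i_1
        nu[cycle[j]] = 1.0 / k
    for i in range(N):
        if i not in cycle:
            P[i, cycle[0]] = 1.0  # arbitrary choice for states of zero nu-mass
    return nu, P


nu, P = cyclic_chain(5, [3, 0, 2])

# Spectral radius is invariant under cyclic permutation of a product.
rng = np.random.default_rng(0)
M1, M2 = rng.standard_normal((2, 3, 3))
rho_12 = max(abs(np.linalg.eigvals(M1 @ M2)))
rho_21 = max(abs(np.linalg.eigvals(M2 @ M1)))
```

Here `P` is stochastic, `nu @ P` equals `nu`, and the two spectral radii coincide up to rounding.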

Remark 3.14.

It follows from (2.9) that, if $\rho_{\mathrm{d}}(\mathcal{A})>0$ , the ratio $\frac{\rho_{\mathrm{p}}(\mathcal{A})}{\rho_{\mathrm{d}}(\mathcal{A})}$ belongs to $[0,1]$ and Theorem 3.13 addresses the case where it is equal to $1$ . We provide next an example where it is equal to $0$ , proving that it is not possible to find a uniform positive lower bound for this ratio. Indeed, considering

[TABLE]

an immediate computation yields

[TABLE]

and $A_{1}^{3}=A_{1}A_{2}^{2}=A_{2}A_{1}A_{2}=A_{2}^{2}A_{1}=A_{2}^{3}=0$ . Let $\lVert\cdot\rVert_{1}$ denote the matrix norm induced by the $\ell^{1}$ norm in $\mathbb{R}^{3}$ . Define

[TABLE]

and, for $k\in\mathbb{N}$ , let $\mathcal{E}_{k}$ be the set made of the three words of length $k$ obtained by taking the first $k$ entries of each element of $\mathcal{E}$ . By an easy computation, we get that, for every $k\in\mathbb{N}$ and $(i_{1},\dotsc,i_{k})\in\llbracket 1,N\rrbracket^{k}$ ,

[TABLE]

We then obtain that $\rho_{\mathrm{d}}(\mathcal{A})=1$ . On the other hand, for every stochastic matrix $P\in\mathcal{M}_{2}(\mathbb{R})$ and every invariant probability vector $\nu$ for $P$ , we have $\mathbb{P}_{(\nu,P)}(\mathcal{E})=0$ . Hence

[TABLE]

proving that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=0$ . Then $\rho_{\mathrm{p}}(\mathcal{A})=0$ .

4 Markov chains of higher order

In this section, we extend the previous results to probability measures on $\mathfrak{S}$ obtained from discrete-time shift-invariant Markov chains of order $m\geq 1$ . Any such probability measure $\mu$ can be described by a pair $(\nu,P)$ of tensors of orders $m$ and $m+1$ , respectively, where the non-negative scalar $P_{i_{1}\dotso i_{m}i_{m+1}}$ represents the probability of switching from the state $i_{m}$ to the state $i_{m+1}$ when the previous $m$ states of the chain are $(i_{1},\dotsc,i_{m})$ , and $\nu_{i_{1}\dotso i_{m}}$ represents the probability of the first $m$ states being $(i_{1},\dotsc,i_{m})$ . In particular, for every $(i_{1},\dotsc,i_{m})\in\llbracket 1,N\rrbracket^{m}$ , we have that

[TABLE]

and $\nu$ satisfies

[TABLE]

We refer to such $\nu$ and $P$ as a probability tensor of order $m$ and a stochastic tensor of order $m+1$ , respectively. The shift-invariance property now reads

[TABLE]

and any probability tensor $\nu$ satisfying the above shift-invariant property is said to be invariant under $P$ . The probabilistic joint spectral radius $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ associated with $(\nu,P)$ is still defined by (2.5), where the expectation $\mathbb{E}_{(\nu,P)}$ corresponds to the probability measure on $\mathfrak{S}$ defined above.

Markov chains of order $m\geq 1$ can be canonically transformed into Markov chains of order $1$ by considering as state space the set $\llbracket 1,N\rrbracket^{m}$ and defining a pair $(\widehat{\nu},\widehat{P})$ from $(\nu,P)$ by $\widehat{\nu}_{(i_{1},\dotsc,i_{m})}=\nu_{i_{1}\dotso i_{m}}$ and

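$$\widehat{P}_{(i_{1},\dotsc,i_{m}),(j_{1},\dotsc,j_{m})}=\begin{cases}P_{i_{1}\dotso i_{m}j_{m}}&\text{if }(i_{2},\dotsc,i_{m})=(j_{1},\dotsc,j_{m-1}),\\ 0&\text{otherwise,}\end{cases}$$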

for every $(i_{1},\dotsc,i_{m})$ and $(j_{1},\dotsc,j_{m})$ in $\llbracket 1,N\rrbracket^{m}$ . It is immediate from the definitions and the shift-invariance property that

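$$\rho_{\mathrm{p}}(\nu,P,\mathcal{A})=\rho_{\mathrm{p}}(\widehat{\nu},\widehat{P},\widehat{\mathcal{A}}),$$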

where $\widehat{\mathcal{A}}=(\widehat{A}_{i_{1}\dotso i_{m}})_{(i_{1},\dotsc,i_{m})\in\llbracket 1,N\rrbracket^{m}}$ and $\widehat{A}_{i_{1}\dotso i_{m}}=A_{i_{m}}$ for every $(i_{1},\dotsc,i_{m})\in\llbracket 1,N\rrbracket^{m}$ .
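The reduction to order $1$ can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: indices are zero-based, the lifted states are the $m$-tuples in lexicographic order, and a transition from $(i_{1},\dotsc,i_{m})$ to $(j_{1},\dotsc,j_{m})$ is allowed only when the two tuples overlap, that is, $(i_{2},\dotsc,i_{m})=(j_{1},\dotsc,j_{m-1})$. The names `lift_to_order_one` and `lift_matrices` are hypothetical.

```python
import itertools

import numpy as np


def lift_to_order_one(nu, P):
    """Turn an order-m pair (nu, P) into an order-1 pair (hat_nu, hat_P).

    nu has shape (N,)*m and P has shape (N,)*(m+1). The lifted state
    space is the set of m-tuples, enumerated in lexicographic order.
    """
    nu = np.asarray(nu, dtype=float)
    P = np.asarray(P, dtype=float)
    N, m = P.shape[0], P.ndim - 1
    states = list(itertools.product(range(N), repeat=m))
    index = {s: a for a, s in enumerate(states)}
    hat_P = np.zeros((len(states), len(states)))
    for s in states:
        for j in range(N):
            # Drop the oldest symbol of s and append the new symbol j.
            t = s[1:] + (j,)
            hat_P[index[s], index[t]] = P[s + (j,)]
    # C-order flattening matches the lexicographic enumeration above.
    hat_nu = nu.reshape(-1)
    return hat_nu, hat_P, states


def lift_matrices(mats, states):
    """hat_A indexed by (i_1, ..., i_m) is the matrix A_{i_m}."""
    return [mats[s[-1]] for s in states]


# Order m = 2 with N = 2 symbols and a non-uniform stochastic tensor.
P_example = np.empty((2, 2, 2))
P_example[..., 0] = [[0.2, 0.7], [0.5, 0.9]]
P_example[..., 1] = 1.0 - P_example[..., 0]
nu_example = np.full((2, 2), 0.25)  # used for shape only, not invariant here
hat_nu_example, hat_P_example, states_example = lift_to_order_one(
    nu_example, P_example
)
```

Each row of `hat_P_example` has at most $N$ nonzero entries, one per admissible next symbol, and every row sums to one because $P$ is stochastic in its last index.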

For every positive integer $k$ , we say that $(i_{1},\dotsc,i_{k})$ is a $(\nu,P)$ -cycle if

$$\bigl((i_{1},\dotsc,i_{m}),(i_{2},\dotsc,i_{m+1}),\dotsc,(i_{k},\dotsc,i_{k+m-1})\bigr)$$

is a $(\widehat{\nu},\widehat{P})$ -cycle, where $z\mapsto i_{z}$ is extended to $\mathbb{Z}$ by $k$ -periodicity.

Applying Theorem 3.1 to $(\widehat{\nu},\widehat{P})$ and $\widehat{\mathcal{A}}$ , we deduce at once the following.

Theorem 4.1.

Let $m$ be a positive integer, $P$ be a stochastic tensor of order $m+1$ , $\nu$ be an invariant probability tensor for $P$ , and $\mathcal{A}=(A_{1},\allowbreak\dotsc,\allowbreak A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$.

(b)

$\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho_{\mathrm{d}}(\mathcal{A})$ for every $(\nu,P)$-cycle $(i_{1},\dotsc,i_{k})$.

Recall that (1.1) is said to be periodically stable if $\rho(\sigma)<1$ for all periodic signals $\sigma\in\mathfrak{S}$ . It has been shown in [12] that this property implies $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})<1$ for every strongly connected stochastic matrix $P\in\mathcal{M}_{N}(\mathbb{R})$ , where $\nu\in\mathbb{R}^{N}$ is the unique invariant probability vector for $P$ . A slightly improved version of this result can be obtained as a consequence of Theorem 4.1 as stated in the following corollary.

Corollary 4.2.

Assume that (1.1) is periodically stable. Then, for every $m\in\mathbb{N}$, every stochastic tensor $P$ of order $m+1$, and every invariant probability tensor $\nu$ for $P$, we have $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})<1$.

Proof.

By the Joint Spectral Radius Theorem (see, e.g., [19, Theorem 2.3]), periodic stability implies that $\rho_{\mathrm{d}}(\mathcal{A})\leq 1$. In the case $\rho_{\mathrm{d}}(\mathcal{A})<1$, the conclusion follows immediately from the inequality $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A})$. Otherwise, when $\rho_{\mathrm{d}}(\mathcal{A})=1$, periodic stability gives $\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}<1=\rho_{\mathrm{d}}(\mathcal{A})$ for every word $(i_{1},\dotsc,i_{k})$, so assertion (b) from Theorem 4.1 does not hold. This proves that $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})<\rho_{\mathrm{d}}(\mathcal{A})=1$, yielding the conclusion. ∎

Similarly as for Theorem 4.1, we deduce by applying Theorem 3.13 to $(\widehat{\nu},\widehat{P})$ and $\widehat{\mathcal{A}}$ the following.

Theorem 4.3.

Let $m$ be a positive integer and $\mathcal{A}=(A_{1},\dotsc,A_{N})\in\mathcal{M}_{d}(\mathbb{R})^{N}$ . Then the following statements are equivalent:

(a)

$\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(m,\mathcal{A})$, where $\rho_{\mathrm{p}}(m,\mathcal{A})$ is the supremum of $\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$ over all pairs $(\nu,P)$ with $P$ a stochastic tensor of order $m+1$ and $\nu$ an invariant probability tensor for $P$.

(b)

There exist $i_{1},\dotsc,i_{k}\in\llbracket 1,N\rrbracket$ such that

$$\rho_{\mathrm{d}}(\mathcal{A})=\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}$$

and $(i_{j_{1}},\dotsc,i_{j_{1}+m-1})\neq(i_{j_{2}},\dotsc,i_{j_{2}+m-1})$ whenever $j_{1},j_{2}\in\llbracket 1,k\rrbracket$ with $j_{1}\neq j_{2}$ , where $z\mapsto i_{z}$ is extended to $\mathbb{Z}$ by $k$ -periodicity.

As a consequence of Theorem 4.3, we have the following corollary. To state it, recall that $\mathcal{A}$ is said to have the finiteness property if there exist $i_{1},\dotsc,i_{k}\in\llbracket 1,N\rrbracket$ such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}$ .

Corollary 4.4.

Let $\mathcal{A}=(A_{1},\dotsc,A_{N})$ . Then $\mathcal{A}$ has the finiteness property if and only if there exists $m\in\mathbb{N}$ such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(m,\mathcal{A})$ .

Proof.

If there exists $m$ such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho_{\mathrm{p}}(m,\mathcal{A})$, then the finiteness property of $\mathcal{A}$ follows immediately from Theorem 4.3. Assume now that $\mathcal{A}$ has the finiteness property and let $i_{1},\dotsc,i_{k}\in\llbracket 1,N\rrbracket$ be such that $\rho_{\mathrm{d}}(\mathcal{A})=\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}$. Extend $z\mapsto i_{z}$ over $\mathbb{Z}$ by $k$-periodicity and let $k^{\prime}$ be the minimal period of $z\mapsto i_{z}$. Since $(i_{1},\dotsc,i_{k})$ is the word $(i_{1},\dotsc,i_{k^{\prime}})$ repeated $k/k^{\prime}$ times, we have $\rho(A_{i_{k}}\dotsm A_{i_{1}})^{1/k}=\rho(A_{i_{k^{\prime}}}\dotsm A_{i_{1}})^{1/k^{\prime}}$, so, without loss of generality, we can assume that $k=k^{\prime}$. We claim that property (b) of Theorem 4.3 holds with $m=k$. Indeed, let $j_{1},j_{2}\in\llbracket 1,k\rrbracket$ be such that $(i_{j_{1}},\dotsc,i_{j_{1}+k-1})=(i_{j_{2}},\dotsc,i_{j_{2}+k-1})$ and assume, to obtain a contradiction, that $j_{1}\neq j_{2}$. Without loss of generality, $j_{1}<j_{2}$. Set $k^{\prime\prime}=j_{2}-j_{1}$ and notice that $0<k^{\prime\prime}<k$ and $i_{j_{1}+\ell}=i_{j_{1}+k^{\prime\prime}+\ell}$ for every $\ell\in\llbracket 0,k-1\rrbracket$. Since $z\mapsto i_{z}$ is $k$-periodic, the previous equality holds for every $\ell\in\mathbb{Z}$, proving that $z\mapsto i_{z}$ is $k^{\prime\prime}$-periodic and contradicting the minimality of $k$ as a period of $z\mapsto i_{z}$. Hence $j_{1}=j_{2}$, so property (b) of Theorem 4.3 holds with $m=k$, and therefore so does property (a), as required. ∎
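The combinatorial step of this proof is easy to test on small words. The sketch below, with hypothetical helper names, computes the minimal period of the periodic extension of a word and checks whether its $k$ cyclic windows of length $m$ are pairwise distinct, which is the second condition in property (b) of Theorem 4.3.

```python
import itertools


def minimal_period(word):
    """Smallest p >= 1 such that the periodic extension of word is p-periodic."""
    k = len(word)
    for p in range(1, k + 1):
        # The minimal period divides every period, in particular k.
        if k % p == 0 and all(word[j] == word[(j + p) % k] for j in range(k)):
            return p
    return k


def cyclic_windows(word, m):
    """The k windows (i_j, ..., i_{j+m-1}) of the k-periodic extension."""
    k = len(word)
    return [tuple(word[(j + t) % k] for t in range(m)) for j in range(k)]


def windows_distinct(word, m):
    """True if the k cyclic windows of length m are pairwise distinct."""
    windows = cyclic_windows(word, m)
    return len(set(windows)) == len(windows)


def all_primitive_words_ok(n_symbols, max_len):
    """Check the claim of the proof for every word of minimal period len(word)."""
    for k in range(1, max_len + 1):
        for word in itertools.product(range(n_symbols), repeat=k):
            if minimal_period(word) == k and not windows_distinct(word, k):
                return False
    return True
```

For instance, the word `(1, 1, 2)` has the repeated window `(1,)` for $m=1$ but three distinct windows for $m=2$, so the smallest admissible $m$ can be strictly less than $k$. The proof only needs that $m=k$ always works.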

Remark 4.5.

Given $\mathcal{A}=(A_{1},\dotsc,A_{N})$ , $\ell\in\mathbb{N}$ , and a word $w=(i_{1},\dotsc,i_{\ell})\in\llbracket 1,N\rrbracket^{\ell}$ , set $A(w)=A_{i_{\ell}}\dotsm A_{i_{1}}$ and let $\lvert w\rvert=\ell$ be the length of $w$ . Notice that, by proceeding similarly to the second part of the proof of Theorem 3.13, we can construct, for every word $w$ of finite length, a Markov chain of order $\lvert w\rvert$ with tensors $\nu_{w}$ , $P_{w}$ such that $\rho(A(w))^{1/\lvert w\rvert}=\rho_{\mathrm{p}}(\nu_{w},P_{w},\mathcal{A})$ . We deduce that

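$$\rho_{\mathrm{d}}(\mathcal{A})=\sup_{w}\rho(A(w))^{1/\lvert w\rvert}\leq\sup_{m\in\mathbb{N}}\rho_{\mathrm{p}}(m,\mathcal{A}),$$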

where the equality is a consequence of the Joint Spectral Radius Theorem (see, e.g., [19]). Since, moreover, $\rho_{\mathrm{p}}(m,\mathcal{A})\leq\rho_{\mathrm{d}}(\mathcal{A})$ for every $m$ , it follows that $\rho_{\mathrm{d}}(\mathcal{A})=\sup_{m\in\mathbb{N}}\rho_{\mathrm{p}}(m,\mathcal{A})$ .
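The quantity $\rho(A(w))^{1/\lvert w\rvert}$ can be explored by brute force on short words. The following sketch enumerates all words up to a given length and returns the largest rate found. By the Joint Spectral Radius Theorem this is only a lower bound on $\rho_{\mathrm{d}}(\mathcal{A})$, not a computation of it. The pair of matrices is an arbitrary illustrative choice, indices are zero-based, and the function names are hypothetical.

```python
import itertools

import numpy as np


def spectral_radius(M):
    """Largest modulus of an eigenvalue of M."""
    return float(max(abs(np.linalg.eigvals(M))))


def word_product(mats, word):
    """A(w) = A_{i_l} ... A_{i_1} for w = (i_1, ..., i_l)."""
    prod = np.eye(mats[0].shape[0])
    for i in word:
        # Later letters of the word multiply on the left.
        prod = mats[i] @ prod
    return prod


def best_periodic_rate(mats, max_len):
    """Maximize rho(A(w))^(1/|w|) over all words of length at most max_len."""
    best, best_word = 0.0, None
    for length in range(1, max_len + 1):
        for word in itertools.product(range(len(mats)), repeat=length):
            rate = spectral_radius(word_product(mats, word)) ** (1.0 / length)
            # Small tolerance so that ties keep the shortest, earliest word.
            if rate > best + 1e-12:
                best, best_word = rate, word
    return best, best_word


# Illustrative pair of 2x2 matrices (an assumption, not taken from the paper).
A1 = np.array([[1.0, 1.0], [0.0, 1.0]])
A2 = np.array([[1.0, 0.0], [1.0, 1.0]])
rate, word = best_periodic_rate([A1, A2], max_len=3)
```

For this pair, each matrix alone has spectral radius $1$, while the product $A_{2}A_{1}$ has spectral radius $(3+\sqrt{5})/2$, so the best rate over words of length at most $3$ is $(1+\sqrt{5})/2$, attained by a word of length $2$.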

A further characterization of the equivalence in Corollary 4.4 can then be stated as follows: an $N$ -tuple of matrices $\mathcal{A}=(A_{1},\dotsc,A_{N})$ satisfies the finiteness property if and only if

$$\sup_{(m,\nu,P)}\rho_{\mathrm{p}}(\nu,P,\mathcal{A})$$

is attained at some $(m,\nu,P)$ , where the supremum is taken over all $(m,\nu,P)$ with $m\in\mathbb{N}$ , $P$ a stochastic tensor of order $m+1$ , and $\nu$ an invariant probability tensor for $P$ .

Bibliography

[1] L. Arnold. Random dynamical systems. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 1998.
[2] N. E. Barabanov. Lyapunov indicator of discrete inclusions. I–III. Autom. Remote Control, 49:152–157, 283–287, 558–565, 1988.
[3] V. D. Blondel, J. Theys, and A. A. Vladimirov. An elementary counterexample to the finiteness conjecture. SIAM J. Matrix Anal. Appl., 24(4):963–970, 2003.
[4] T. Bousch and J. Mairesse. Asymptotic height optimization for topical IFS, Tetris heaps, and the finiteness conjecture. J. Amer. Math. Soc., 15(1):77–111, 2002.
[5] F. Colonius and W. Kliemann. Dynamical systems and linear algebra, volume 158 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2014.
[6] O. L. V. Costa, M. D. Fragoso, and R. P. Marques. Discrete-time Markov jump linear systems. Probability and its Applications (New York). Springer-Verlag London, Ltd., London, 2005.
[7] O. L. V. Costa, M. D. Fragoso, and M. G. Todorov. Continuous-time Markov jump linear systems. Probability and its Applications (New York). Springer, Heidelberg, 2013.
[8] A. Crisanti, G. Paladin, and A. Vulpiani. Products of random matrices in statistical physics, volume 104 of Springer Series in Solid-State Sciences. Springer-Verlag, Berlin, 1993. With a foreword by Giorgio Parisi.
