A panorama of positivity
Alexander Belton, Dominique Guillot, Apoorva Khare, Mihai Putinar

TL;DR
This survey explores the concept of positive semi-definiteness across various mathematical and applied contexts, emphasizing positivity-preserving operations and their applications in high-dimensional data analysis.
Contribution
It provides a comprehensive overview of positivity-preserving techniques, connecting classical theories with modern applications in covariance estimation and regularization.
Findings
Highlights classical and modern positivity-preserving methods
Connects harmonic analysis, operator theory, and combinatorics
Includes applications to high-dimensional covariance estimation
Abstract
This survey contains a selection of topics unified by the concept of positive semi-definiteness (of matrices or kernels), reflecting natural constraints imposed on discrete data (graphs or networks) or continuous objects (probability or mass distributions). We put emphasis on entrywise operations which preserve positivity, in a variety of guises. Techniques from harmonic analysis, function theory, operator theory, statistics, combinatorics, and group representations are invoked. Some partially forgotten classical roots in metric geometry and distance transforms are presented with comments and full bibliographical references. Modern applications to high-dimensional covariance estimation and regularization are included.
Alexander Belton (Department of Mathematics and Statistics, Lancaster University, Lancaster, UK),
Dominique Guillot (University of Delaware, Newark, DE, USA),
Apoorva Khare (Indian Institute of Science; Analysis and Probability Research Group; Bangalore, India),
and Mihai Putinar (University of California at Santa Barbara, CA, USA and Newcastle University, Newcastle upon Tyne, UK)
Key words and phrases:
metric geometry, positive semidefinite matrix, Toeplitz matrix, Hankel matrix, positive definite function, completely monotone functions, absolutely monotonic functions, entrywise calculus, generalized Vandermonde matrix, Schur polynomials, symmetric function identities, totally positive matrices, totally non-negative matrices, totally positive completion problem, sample covariance, covariance estimation, hard / soft thresholding, sparsity pattern, critical exponent of a graph, chordal graph, Loewner monotonicity, convexity, and super-additivity
2010 Mathematics Subject Classification:
15-02, 26-02, 15B48, 51F99, 15B05, 05E05, 44A60, 15A24, 15A15, 15A45, 15A83, 47B35, 05C50, 30E05, 62J10
D.G. is partially supported by a University of Delaware Research Foundation grant, by a Simons Foundation collaboration grant for mathematicians, and by a University of Delaware Research Foundation Strategic Initiative grant. A.K. is partially supported by Ramanujan Fellowship SB/S2/RJN-121/2017 and MATRICS grant MTR/2017/000295 from SERB (Govt. of India), by grant F.510/25/CAS-II/2018(SAP-I) from UGC (Govt. of India), and by a Young Investigator Award from the Infosys Foundation.
Contents
- 3 Entrywise functions preserving positivity in all dimensions
- 4 Entrywise polynomials preserving positivity in fixed dimension
- 4.2 Schur polynomials; the sharp threshold bound for a single matrix
- 4.3 The threshold for all rank-one matrices: a Schur positivity result
- 4.6 Digression: Schur polynomials from smooth functions, and new symmetric function identities
- 4.7 Further applications: linear matrix inequalities, Rayleigh quotients, and the cube problem
- 5.2 Entrywise preservers of totally non-negative Hankel matrices
1. Introduction
Matrix positivity, or positive semidefiniteness, is one of the most wide-reaching concepts in mathematics, old and new. Positivity of a matrix is as natural as positivity of mass in statics or positivity of a probability distribution. It is a notion which has attracted the attention of many great minds. Yet, after at least two centuries of research, positive matrices still hide enigmas and raise challenges for the working mathematician.
The vitality of matrix positivity comes from its breadth, having many theoretical facets and also deep links to mathematical modelling. It is not our aim here to pay homage to matrix positivity in the large. Rather, the present survey, split for technical reasons into two parts, has a limited but carefully chosen scope.
Our panorama focuses on entrywise transforms of matrices which preserve their positive character. In itself, this is a rather bold departure from the dogma that canonical transformations of matrices are not those that operate entry by entry. Still, this apparently esoteric topic reveals a fascinating history, abundant characteristic phenomena and numerous open problems. Each class of positive matrices or kernels (regarding the latter as continuous matrices) carries a specific toolbox of internal transforms. Positive Hankel forms or Toeplitz kernels, totally positive matrices, and group-invariant positive definite functions all possess specific positivity preservers. As we see below, these have been thoroughly studied for at least a century.
One conclusion of our survey is that the classification of positivity preservers is accessible in the dimension-free setting, that is, when the sizes of matrices are unconstrained. In stark contrast, precise descriptions of positivity preservers in fixed dimension are elusive, if not unattainable with the techniques of modern mathematics. Furthermore, the world of applications cares much more about matrices of fixed size than about the dimension-free case. The accessibility of the latter did not come from a sequence of isolated, simple observations. Rather, it grew organically out of distance geometry, and spread rapidly through harmonic analysis on groups, special functions, and probability theory. The more recent and highly challenging path through fixed dimensions requires novel methods of algebraic combinatorics and symmetric functions, group representations, and function theory.
As well as its beautiful theoretical aspects, our interest in these topics is also motivated by the statistics of big data. In this setting, functions are often applied entrywise to covariance matrices, in order to induce sparsity and improve the quality of statistical estimators (see [72, 73, 114]). Entrywise techniques have recently increased in popularity in this area, largely because of their low computational complexity, which makes them ideal to handle the ultra high-dimensional datasets arising in modern applications. In this context, the dimensions of the matrices are fixed, and correspond to the number of underlying random variables. Ensuring that positivity is preserved by these entrywise methods is critical, as covariance matrices must be positive semidefinite. Thus, there is a clear need to produce characterizations of entrywise preservers, so that these techniques are widely applicable and mathematically justified. We elaborate further on this in the second part of the survey.
We conclude by remarking that, while we have tried to be comprehensive in our coverage of the field of matrix positivity and the entrywise calculus, our panorama is far from being complete. We apologize for any omissions.
2. From metric geometry to matrix positivity
2.1. Distance geometry
During the first decade of the 20th century, the concept of a metric space emerged from the works of Fréchet and Hausdorff, each having different and well-anchored roots, in function spaces and in set theory and measure theory. We cannot think today of modern mathematics and physics without referring to metric spaces, which touch areas as diverse as economics, statistics, and computer science. Distance geometry is one of the early and ever-lasting by-products of metric-space theory. One of the key figures of the Vienna Circle, Karl Menger, started a systematic study in the 1920s of the geometric and topological features of spaces that are intrinsic solely to the distance they carry. Menger published his findings in a series of articles having the generic name “Untersuchungen über allgemeine Metrik,” the first one being [99]; see also his synthesis [100]. His work was very influential in the decades to come [23], and by a surprising and fortunate stroke not often encountered in mathematics, Menger’s distance geometry has been resurrected in recent times by practitioners of convex optimization and network analysis [39, 95].
Let $(X,\rho)$ be a metric space. One of the naive, yet unavoidable, questions arising from the very beginning concerns the nature of operations $\phi(\rho)$ which may be performed on the metric and which enhance various properties of the topological space $X$. We all know that $\rho/(\rho+1)$ and $\rho^{\gamma}$, if $\gamma\in(0,1)$, also satisfy the axioms of a metric, with the former making it bounded. Less well known is an observation due to Blumenthal, that the new metric space $(X,\rho^{\gamma})$ has the four-point property if $\gamma\in(0,1/2]$: every four-point subset of $X$ can be embedded isometrically into Euclidean space [23, Section 49].
Metric spaces which can be embedded isometrically into Euclidean space, or into infinite-dimensional Hilbert space, are, of course, distinguished and desirable for many reasons. We owe to Menger a definitive characterization of this class of metric spaces. The core of Menger’s theorem, stated in terms of certain matrices built from the distance function (known as Cayley–Menger matrices) was slightly reformulated by Fréchet and cast in the following simple form by Schoenberg.
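Theorem 2.1 (Schoenberg [120]).
Let $d\geq 1$ be an integer and let $(X,\rho)$ be a metric space. An $(n+1)$-tuple of points $x_0$, $x_1$, …, $x_n$ in $X$ can be isometrically embedded into Euclidean space $\mathbb{R}^d$, but not into $\mathbb{R}^{d-1}$, if and only if the matrix
\[
\bigl[\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2\bigr]_{j,k=1}^{n}
\]
is positive semidefinite with rank equal to $d$.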
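Proof.
This is surprisingly simple. Necessity is immediate, since the Euclidean norm and scalar product in $\mathbb{R}^d$ give that
\[
\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2
=\|x_0-x_j\|^2+\|x_0-x_k\|^2-\|(x_0-x_j)-(x_0-x_k)\|^2
=2\langle x_0-x_j,x_0-x_k\rangle,
\]
and the latter are the entries of a positive semidefinite Gram matrix of rank less than or equal to $d$.
For the other implication, we consider first a full-rank $d\times d$ matrix associated with a $(d+1)$-tuple. The corresponding quadratic form
\[
Q(\lambda)=\frac{1}{2}\sum_{j,k=1}^{d}\bigl(\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2\bigr)\lambda_j\lambda_k
\]
is positive definite. Hence there exists a linear change of variables
\[
\lambda_k=\sum_{j=1}^{d}a_{jk}\mu_j\qquad(1\leq k\leq d)
\]
such that
\[
Q(\lambda)=\mu_1^2+\mu_2^2+\cdots+\mu_d^2.
\]
Interpreting $(\mu_1,\ldots,\mu_d)$ as coordinates in $\mathbb{R}^d$, the standard simplex with vertices
\[
e_0=(0,\ldots,0),\quad e_1=(1,0,\ldots,0),\quad\ldots,\quad e_d=(0,\ldots,0,1)
\]
has the corresponding quadratic form (of distances) equal to $\mu_1^2+\mu_2^2+\cdots+\mu_d^2$. Now we perform the coordinate change $\mu_j\mapsto\lambda_j$. Specifically, set $P_0=0$ and let $P_j\in\mathbb{R}^d$ be the point with coordinates $\lambda_j=1$ and $\lambda_k=0$ if $k\neq j$. Then one identifies distances:
\[
\|P_0-P_j\|=\sqrt{Q(e_j)}=\rho(x_0,x_j)\quad\text{and}\quad\|P_j-P_k\|=\sqrt{Q(e_j-e_k)}=\rho(x_j,x_k)\qquad(1\leq j,k\leq d).
\]
The remaining case with $n>d$ can be analyzed in a similar way, after taking an appropriate projection. ∎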
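The criterion of Theorem 2.1 is easy to test numerically. The following is a minimal sketch, not taken from the survey, assuming the matrix in the theorem has entries $\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2$. The helper names `schoenberg_matrix` and `embedding_dimension`, the tolerance, and the two sample metrics are illustrative choices only.

```python
import numpy as np

def schoenberg_matrix(D):
    """Entries D[0,j]^2 + D[0,k]^2 - D[j,k]^2 for j, k = 1..n, where D is the
    (n+1) x (n+1) matrix of pairwise distances and index 0 is the base point."""
    D2 = np.asarray(D, dtype=float) ** 2
    base = D2[0, 1:]
    return base[:, None] + base[None, :] - D2[1:, 1:]

def embedding_dimension(D, tol=1e-9):
    """Return the rank of the Schoenberg matrix if it is positive semidefinite,
    or None if it has a genuinely negative eigenvalue (no Euclidean embedding)."""
    eigvals = np.linalg.eigvalsh(schoenberg_matrix(D))
    cutoff = tol * max(1.0, float(np.abs(eigvals).max()))
    if eigvals.min() < -cutoff:
        return None
    return int(np.sum(eigvals > cutoff))

# Six random points in R^3 with their Euclidean distances (hypothetical test data).
rng = np.random.default_rng(0)
points = rng.standard_normal((6, 3))
D_cloud = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# Path metric of a star graph: a centre at distance 1 from three leaves,
# and the leaves at distance 2 from each other.
star = np.array([[0, 1, 1, 1],
                 [1, 0, 2, 2],
                 [1, 2, 0, 2],
                 [1, 2, 2, 0]], dtype=float)

dim_cloud = embedding_dimension(D_cloud)            # generic points in R^3
dim_star = embedding_dimension(star)                # the star metric itself
dim_sqrt_star = embedding_dimension(np.sqrt(star))  # the transformed metric rho^(1/2)
print(dim_cloud, dim_star, dim_sqrt_star)
```

Under these assumptions the random configuration reports dimension 3, the star metric fails the positivity test, and its square-root transform passes, which illustrates how a distance transform can make a four-point space Euclidean.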
In the conditions of the theorem, fixing a “frame” of $d$ points and letting the $(d+1)$-th point float, one obtains an embedding of the full metric space $(X,\rho)$ into $\mathbb{R}^d$. This idea goes back to Menger, and it led, with Schoenberg’s touch, to the following definitive statement. Here and below, all Hilbert spaces are assumed to be separable.
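Corollary 2.2 (Schoenberg [120], following Menger).
A separable metric space $(X,\rho)$ can be isometrically embedded into Hilbert space if and only if, for every $(n+1)$-tuple of points $(x_0,x_1,\ldots,x_n)$ in $X$, where $n\geq 2$, the matrix
\[
\bigl[\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2\bigr]_{j,k=1}^{n}
\]
is positive semidefinite.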
The notable aspect of the two previous results is the interplay between purely geometric concepts and matrix positivity. This will be a recurrent theme of our survey.
2.2. Spherical distance geometry
One can specialize the embedding question discussed in the previous section to submanifolds of Euclidean space. A natural choice is the sphere.
For two points $x$ and $y$ on the unit sphere $S^{d-1}\subset\mathbb{R}^d$, the rotationally invariant distance between them is
\[
\rho(x,y)=\sphericalangle(x,y)=\arccos\langle x,y\rangle,
\]
where the angle between the two vectors is measured on a great circle and is always less than or equal to $\pi$.
A straightforward application of the simple, but central, Theorem 2.1 yields the following result.
Theorem 2.3 (Schoenberg [120]).
Let $(X,\rho)$ be a metric space and let $(x_1,\ldots,x_n)$ be an $n$-tuple of points in $X$. For any integer $d\geq 2$, there exists an isometric embedding of $(x_1,\ldots,x_n)$ into $S^{d-1}$ endowed with the geodesic distance, but not into $S^{d-2}$, if and only if
\[
\rho(x_j,x_k)\leq\pi\qquad(1\leq j,k\leq n)
\]
and the matrix $\bigl[\cos\rho(x_j,x_k)\bigr]_{j,k=1}^{n}$ is positive semidefinite of rank $d$.
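Indeed, the necessity is assured by choosing $x_0$ to be the origin in $\mathbb{R}^d$. In this case,
\[
\|x_0-x_j\|^2+\|x_0-x_k\|^2-\|x_j-x_k\|^2=2-\|x_j-x_k\|^2=2\langle x_j,x_k\rangle=2\cos\rho(x_j,x_k).
\]
The condition is also sufficient, by possibly adding an external point $x_0$ to the metric space, subject to the constraints that $\rho(x_0,x_j)=1$ for all $j$. The details can be found in [120] (see Footnote 1).
Footnote 1. An alternate proof of sufficiency is to note that $A:=\bigl[\cos\rho(x_j,x_k)\bigr]_{j,k=1}^{n}$ is a Gram matrix of rank $d$, hence equal to $B^{T}B$ for some $d\times n$ matrix $B$ with unit columns. Denoting these columns by $y_1$, …, $y_n\in S^{d-1}$, the map $x_j\mapsto y_j$ is an isometry since $\cos\sphericalangle(y_j,y_k)=\langle y_j,y_k\rangle=\cos\rho(x_j,x_k)$ and $\rho(x_j,x_k)\in[0,\pi]$. Moreover, since $A$ has rank $d$, the $y_j$ cannot all lie in a smaller-dimensional sphere.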
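The Gram-matrix argument for sufficiency can be traced step by step in code. The sketch below is an illustration under stated assumptions rather than part of the original text: it starts from geodesic distances between hypothetical random points on $S^2$, forms the matrix of cosines, factors it through its eigendecomposition, and checks that the recovered unit vectors reproduce the distances. The eigenvalue threshold `1e-9` is an arbitrary numerical choice.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.standard_normal((5, 3))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)    # five points on the unit sphere S^2
rho = np.arccos(np.clip(Y @ Y.T, -1.0, 1.0))     # geodesic distances, all in [0, pi]

A = np.cos(rho)                                  # candidate Gram matrix
w, V = np.linalg.eigh(A)                         # eigenvalues in ascending order
rank = int(np.sum(w > 1e-9))                     # numerical rank of A

# Factor A = B^T B with B of size rank x n; the columns of B are the recovered points.
B = (V[:, -rank:] * np.sqrt(w[-rank:])).T
rho_recovered = np.arccos(np.clip(B.T @ B, -1.0, 1.0))

print(rank)
print(np.linalg.norm(B, axis=0))                 # column norms of B
print(np.abs(rho_recovered - rho).max())         # reconstruction error
```

The recovered points generally differ from the original ones by an orthogonal transformation, which is all that an isometric embedding can determine.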
2.3. Distance transforms
A notable step forward in the study of the existence of isometric embeddings of a metric space into Euclidean or Hilbert space was made by Schoenberg. In a series of articles [121, 123, 124, 136], he changed the set-theoretic lens of Menger, by initiating a harmonic-analysis interpretation of this embedding problem. This was a major turning point, with long-lasting, unifying, and unexpected consequences.
We return to a separable metric space $(X,\rho)$ and seek distance-function transforms $\rho\mapsto\phi(\rho)$ which enhance the geometry of $X$, to the extent that the new metric space $\bigl(X,\phi(\rho)\bigr)$ is isometrically equivalent to a subspace of Hilbert space. Schoenberg launched this whole new chapter from the observation that the Euclidean norm is such that the matrix
\[
\bigl[\exp(-\|x_j-x_k\|^2)\bigr]_{j,k=1}^{N}
\]
is positive semidefinite for any choice of points $x_1$, …, $x_N$ in the ambient space. Once again, we see the presence of matrix positivity. While this claim may not be obvious at first sight, it is accessible once we recall a key property of Fourier transforms.
An even function $f:\mathbb{R}^d\to\mathbb{C}$ is said to be positive definite if the complex matrix $\bigl[f(x_j-x_k)\bigr]_{j,k=1}^{N}$ is positive semidefinite for any $N\geq 1$ and any choice of points $x_1$, …, $x_N\in\mathbb{R}^d$. We will call $f(x-y)$ a positive semidefinite kernel on $\mathbb{R}^d\times\mathbb{R}^d$ in this case. (See [132] for a comprehensive survey of this class of maps.)
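Bochner’s theorem [25] characterizes positive definite functions on $\mathbb{R}^d$ as Fourier transforms of even positive measures of finite mass:
\[
f(\xi)=\int_{\mathbb{R}^d}e^{-ix\cdot\xi}\,d\mu(x).
\]
Indeed,
\[
f(\xi-\eta)=\int_{\mathbb{R}^d}e^{-ix\cdot\xi}\,e^{ix\cdot\eta}\,d\mu(x)
\]
is a positive semidefinite kernel because it is the average over $\mu$ of the positive kernel $(\xi,\eta)\mapsto e^{-ix\cdot\xi}e^{ix\cdot\eta}$. Since the Gaussian $e^{-x^2}$ is the Fourier transform of itself (modulo constants), it turns out that it is a positive definite function on $\mathbb{R}$, whence $\exp(-\|x\|^2)$ has the same property as a function on $\mathbb{R}^d$. Taking one step further, the function $x\mapsto\exp(-\|x\|^2)$ is positive definite on any Hilbert space.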
With this preparation we are ready for a second characterization of metric subspaces of Hilbert space.
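Theorem 2.4 (Schoenberg [123]).
A separable metric space $(X,\rho)$ can be embedded isometrically into Hilbert space if and only if the kernel
\[
X\times X\to(0,\infty);\quad(x,y)\mapsto\exp\bigl(-\lambda^2\rho(x,y)^2\bigr)
\]
is positive semidefinite for all $\lambda\in\mathbb{R}$.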
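Proof.
Necessity follows from the positive definiteness of the Gaussian discussed above. (We also provide an elementary proof below; see Lemma 5.7 and the subsequent discussion.) To prove sufficiency, we recall the Menger–Schoenberg characterization of isometric subspaces of Hilbert space. We have to derive, from the positivity assumption, the positivity of the matrix
\[
\bigl[\rho(x_0,x_j)^2+\rho(x_0,x_k)^2-\rho(x_j,x_k)^2\bigr]_{j,k=1}^{n}.
\]
Elementary algebra transforms this constraint into the requirement that
\[
\sum_{j,k=0}^{n}\rho(x_j,x_k)^2c_jc_k\leq 0\quad\text{whenever}\quad\sum_{j=0}^{n}c_j=0.
\]
By expanding $\exp\bigl(-\lambda^2\rho(x_j,x_k)^2\bigr)$ as a power series in $\lambda^2$, and invoking the positivity of the exponential kernel, we see that
\[
0\leq\sum_{j,k=0}^{n}\exp\bigl(-\lambda^2\rho(x_j,x_k)^2\bigr)c_jc_k
=-\lambda^2\sum_{j,k=0}^{n}\rho(x_j,x_k)^2c_jc_k+\frac{\lambda^4}{2}\sum_{j,k=0}^{n}\rho(x_j,x_k)^4c_jc_k-\cdots
\]
for all $\lambda>0$; the constant term vanishes because $\sum_{j}c_j=0$. Dividing by $\lambda^2$ and letting $\lambda\to 0$ shows that the coefficient of $-\lambda^2$ is non-positive. ∎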
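As a numerical companion to Theorem 2.4, the sketch below compares a Euclidean point cloud with a metric that admits no Hilbert-space embedding. It assumes the kernel in the theorem is $\exp(-\lambda^2\rho(x,y)^2)$. The function name `gaussian_kernel_min_eig`, the grid of values of $\lambda$, the tolerance, and the star-graph metric are all illustrative assumptions, not taken from the survey.

```python
import numpy as np

def gaussian_kernel_min_eig(D, lam):
    """Smallest eigenvalue of the matrix [exp(-lam^2 * D[j,k]^2)]."""
    K = np.exp(-(lam * np.asarray(D, dtype=float)) ** 2)
    return float(np.linalg.eigvalsh(K).min())

# Euclidean distances between seven hypothetical random points in the plane.
rng = np.random.default_rng(2)
P = rng.standard_normal((7, 2))
D_euclid = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)

# Path metric of a star graph with three leaves, which is not Euclidean.
star = np.array([[0, 1, 1, 1],
                 [1, 0, 2, 2],
                 [1, 2, 0, 2],
                 [1, 2, 2, 0]], dtype=float)

lams = np.linspace(0.1, 3.0, 30)
euclid_ok = all(gaussian_kernel_min_eig(D_euclid, lam) > -1e-10 for lam in lams)
star_ok = all(gaussian_kernel_min_eig(star, lam) > -1e-10 for lam in lams)
print(euclid_ok, star_ok)
print(gaussian_kernel_min_eig(star, 0.5))   # a value of lambda where positivity fails
```

For the star metric the kernel fails to be positive semidefinite only for small $\lambda$, which matches the proof: the obstruction is detected by the behaviour of the kernel as $\lambda\to 0$.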
The flexibility of the Fourier-transform approach is illustrated by the following application, also due to Schoenberg [123].
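Corollary 2.5.
Let $H$ be a Hilbert space with norm $\|\cdot\|$. For every $\delta\in(0,1)$, the metric space $(H,\|\cdot\|^{\delta})$ is isometric to a subspace of a Hilbert space.
Proof.
Note first the identity
\[
\xi^{\alpha}=c_{\alpha}\int_{0}^{\infty}\bigl(1-e^{-s^2\xi^2}\bigr)s^{-1-\alpha}\,ds\qquad(\xi>0,\ 0<\alpha<2),
\]
where $c_{\alpha}>0$ is a normalization constant. Consequently,
\[
\|x-y\|^{\alpha}=c_{\alpha}\int_{0}^{\infty}\bigl(1-e^{-s^2\|x-y\|^2}\bigr)s^{-1-\alpha}\,ds.
\]
Let $\delta=\alpha/2$. For points $x_0$, $x_1$, …, $x_n$ in $H$ and weights $c_0$, $c_1$, …, $c_n$ satisfying
\[
c_0+c_1+\cdots+c_n=0,
\]
it holds that
\[
\sum_{j,k=0}^{n}\|x_j-x_k\|^{2\delta}c_jc_k=-c_{\alpha}\int_{0}^{\infty}\sum_{j,k=0}^{n}e^{-s^2\|x_j-x_k\|^2}c_jc_k\,s^{-1-\alpha}\,ds\leq 0,
\]
and the proof is complete. ∎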
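The inequality at the end of this proof can be checked directly. The following sketch, an illustration under stated assumptions, treats the squared transformed distance as $\|x-y\|^{2\delta}$ and computes the largest value of the quadratic form $\sum_{j,k}\|x_j-x_k\|^{2\delta}c_jc_k$ over unit vectors $c$ whose entries sum to zero. The helper `max_on_zero_sum` and the sample points are hypothetical.

```python
import numpy as np

def max_on_zero_sum(M):
    """Largest value of c^T M c over unit vectors c with sum(c) = 0
    (never below 0, since the constant vector contributes the eigenvalue 0)."""
    n = M.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n      # orthogonal projection onto {c : sum(c) = 0}
    S = P @ M @ P
    return float(np.linalg.eigvalsh((S + S.T) / 2).max())

# Eight hypothetical random points in R^4 and their Euclidean distances.
rng = np.random.default_rng(3)
X = rng.standard_normal((8, 4))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# Exponents delta in (0, 1): the form should be non-positive on zero-sum vectors.
results = {delta: max_on_zero_sum(D ** (2 * delta)) for delta in (0.25, 0.5, 0.75)}

# An exponent outside the range, delta = 1.5, on three collinear points 0, 1, 2.
line = np.array([0.0, 1.0, 2.0])
D_line = np.abs(line[:, None] - line[None, :])
bad = max_on_zero_sum(D_line ** 3.0)
print(results, bad)
```

The collinear example fails because the transformed distances $1$, $1$, $2^{3/2}$ violate the triangle inequality, so no embedding of any kind is possible for that exponent.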
Several similar consequences of the Fourier-transform approach are within reach. For instance, Schoenberg observed in the same article that if the $L^p$ norm is raised to the power $\gamma$, where $1\leq p\leq 2$ and $0<\gamma\leq p/2$, then $L^p(0,1)$ is isometrically embeddable into Hilbert space.
2.4. Altering Euclidean distance
By specializing the theme of the previous section to Euclidean space, Schoenberg and von Neumann discovered an arsenal of powerful tools from harmonic analysis that were able to settle the question of whether Euclidean space equipped with the altered distance $\phi\bigl(\|x-y\|\bigr)$ may be isometrically embedded into Hilbert space [122, 136]. The key ingredients are characterizations of Laplace and Fourier transforms of positive measures, that is, Bernstein’s completely monotone functions [17] and Bochner’s positive definite functions [25].
Here we present some highlights of the Schoenberg–von Neumann framework. First, we focus on an auxiliary class of distance transforms. A real continuous function $f$ is called positive definite in Euclidean space $\mathbb{R}^d$ if the kernel
\[
(x,y)\mapsto f\bigl(\|x-y\|\bigr)
\]
is positive semidefinite. Bochner’s theorem and the rotation-invariance of this kernel prove that such a function $f$ is characterized by the representation
\[
f(t)=\int_{0}^{\infty}\Omega_d(tu)\,d\mu(u),
\]
where $\mu$ is a positive measure and
\[
\Omega_d\bigl(\|x\|\bigr)=\int_{\|\xi\|=1}e^{ix\cdot\xi}\,d\sigma(\xi),
\]
with $\sigma$ the normalized area measure on the unit sphere in $\mathbb{R}^d$; see [122, Theorem 1]. By letting $d$ tend to infinity, one finds that positive definite functions on infinite-dimensional Hilbert space are precisely of the form
\[
f(t)=\int_{0}^{\infty}e^{-t^2u^2}\,d\mu(u),
\]
with $\mu$ a positive measure on the semi-axis. Notice that positive definite functions in $\mathbb{R}^d$ are not necessarily differentiable more than $(d-1)/2$ times, while those which are positive definite in Hilbert space are smooth and even complex analytic in the sector $|\arg t|<\pi/4$.
The class of functions $f$ which are continuous on $[0,\infty)$, smooth on the open semi-axis $(0,\infty)$, and such that
\[
(-1)^kf^{(k)}(t)\geq 0\qquad\text{for all }t>0\text{ and all }k\geq 0
\]
was studied by S. Bernstein, who proved that they coincide with Laplace transforms of positive measures on $[0,\infty)$:
\[
f(t)=\int_{0}^{\infty}e^{-tu}\,d\mu(u).
\]
Such functions are called completely monotonic and have proved highly relevant for probability theory and approximation theory; see [17] for the foundational reference. Thus we have obtained a valuable equivalence.
Theorem 2.6 (Schoenberg).
A function $f$ is completely monotone if and only if $t\mapsto f(t^2)$ is positive definite on Hilbert space.
The direct consequences of this apparently innocent observation are quite deep. For example, the isometric-embedding question for altered Euclidean distances is completely answered via this route. The following results are from [122] and [136].
Theorem 2.7 (Schoenberg–von Neumann).
Let $H$ be a separable Hilbert space with norm $\|\cdot\|$.
(1) For any integers $n\geq d>1$, the metric space $\bigl(\mathbb{R}^{d},\phi(\|\cdot\|)\bigr)$ may be isometrically embedded into $(\mathbb{R}^{n},\|\cdot\|)$ if and only if $\phi(t)=ct$ for some $c>0$.
(2) The metric space $\bigl(\mathbb{R}^{d},\phi(\|\cdot\|)\bigr)$ may be isometrically embedded into $H$ if and only if
\[
\phi(t)^2=\int_{0}^{\infty}\frac{1-\Omega_d(tu)}{u^2}\,d\mu(u),
\]
where $\mu$ is a positive measure on the semi-axis such that
\[
\int_{1}^{\infty}\frac{d\mu(u)}{u^2}<\infty.
\]
(3) The metric space $\bigl(H,\phi(\|\cdot\|)\bigr)$ may be isometrically embedded into $H$ if and only if
\[
\phi(t)^2=\int_{0}^{\infty}\frac{1-e^{-t^2u}}{u}\,d\mu(u),
\]
where $\mu$ is a positive measure on the semi-axis such that
\[
\int_{1}^{\infty}\frac{d\mu(u)}{u}<\infty.
\]
In von Neumann and Schoenberg’s article [136], special attention is paid to the case of embedding a modified distance on the line into Hilbert space. This amounts to characterizing all screw lines in a Hilbert space $H$: the continuous functions
\[
f:\mathbb{R}\to H;\quad t\mapsto f_t
\]
with the translation-invariance property
\[
\|f_s-f_t\|=\|f_{s+r}-f_{t+r}\|\qquad\text{for all }s,t,r\in\mathbb{R}.
\]
In this case, the gauge function $\phi$ is such that $\phi(t-s)=\|f_s-f_t\|$ and $t\mapsto f_t$ provides the isometric embedding of $\bigl(\mathbb{R},\phi(|\cdot|)\bigr)$ into $H$. Von Neumann seized the opportunity to use Stone’s theorem on one-parameter unitary groups, together with the spectral decomposition of their unbounded self-adjoint generators, to produce a purely operator-theoretic proof of the following result.
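Corollary 2.8.
The metric space $\bigl(\mathbb{R},\phi(|\cdot|)\bigr)$ isometrically embeds into Hilbert space if and only if
\[
\phi(t)^2=\int_{0}^{\infty}\frac{\sin^2(tu)}{u^2}\,d\mu(u)\qquad(t\in\mathbb{R}),
\]
where $\mu$ is a positive measure on $[0,\infty)$ satisfying
\[
\int_{1}^{\infty}\frac{d\mu(u)}{u^2}<\infty.
\]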
Moreover, in the conditions of the corollary, the space $\bigl(\mathbb{R},\phi(|\cdot|)\bigr)$ embeds isometrically into $\mathbb{R}^d$ if and only if the measure $\mu$ consists of finitely many point masses, whose number is roughly $d/2$; see [136, Theorem 2] for the precise statement. To give a simple example, consider the function
\[
\phi(t)=\sqrt{t^2+\sin^2t}.
\]
This is indeed a screw function, because
\[
\phi(t-s)^2=(t-s)^2+\sin^2(t-s)=(t-s)^2+\tfrac{1}{4}(\cos 2t-\cos 2s)^2+\tfrac{1}{4}(\sin 2t-\sin 2s)^2.
\]
Note that a screw line is periodic if and only if it is not injective. Furthermore, one may identify screw lines with period $\tau>0$ by the geometry of the support of the representing measure: this support must be contained in the lattice $(\pi/\tau)\mathbb{Z}_+$, where $\mathbb{Z}_+:=\{0,1,2,\ldots\}$. Consequently, all periodic screw lines in Hilbert space have a gauge function $\phi$ such that
\[
\phi(t)^2=\sum_{k=1}^{\infty}c_k\sin^2\Bigl(\frac{k\pi t}{\tau}\Bigr),\tag{2.2}
\]
where $c_k\geq 0$ and $\sum_{k=1}^{\infty}c_k<\infty$; see [136, Theorem 5].
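A concrete screw line can be written down and tested. The sketch below assumes that the example gauge function is $\phi(t)=\sqrt{t^2+\sin^2t}$ and realizes it by the helix $t\mapsto\bigl(t,\tfrac12\cos 2t,\tfrac12\sin 2t\bigr)$ in $\mathbb{R}^3$. Both the formula and the function names `screw_line` and `gauge` are assumptions made for illustration.

```python
import numpy as np

def screw_line(t):
    """Helix t -> (t, cos(2t)/2, sin(2t)/2) in R^3, a candidate screw line."""
    t = np.asarray(t, dtype=float)
    return np.stack([t, 0.5 * np.cos(2 * t), 0.5 * np.sin(2 * t)], axis=-1)

def gauge(t):
    """Assumed gauge function phi(t) = sqrt(t^2 + sin(t)^2)."""
    t = np.asarray(t, dtype=float)
    return np.sqrt(t ** 2 + np.sin(t) ** 2)

rng = np.random.default_rng(4)
s, t = rng.uniform(-5.0, 5.0, size=(2, 100))     # random pairs of parameters
dist = np.linalg.norm(screw_line(s) - screw_line(t), axis=-1)

# Translation invariance: shifting both parameters by r leaves the distance unchanged.
r = 1.2345
shifted = np.linalg.norm(screw_line(s + r) - screw_line(t + r), axis=-1)

print(np.abs(dist - gauge(t - s)).max())
print(np.abs(shifted - dist).max())
```

The helix is injective and lies in $\mathbb{R}^3$, so it is a non-periodic screw line in a finite-dimensional space.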
2.5. Positive definite functions on homogeneous spaces
Having resolved the question of isometrically embedding Euclidean space into Hilbert space, a natural desire was to extend the analysis to other special manifolds with symmetry. This was done almost simultaneously by Schoenberg on spheres [125] and by Bochner on compact homogeneous spaces [26].
Let $X$ be a compact space endowed with a transitive action of a group $G$ and an invariant measure. We seek $G$-invariant distance functions, and particularly those which identify $X$ with a subspace of a Hilbert space. To simplify terminology, we call the latter Hilbert distances.
The first observation of Bochner is that a $G$-invariant symmetric kernel $f:X\times X\to\mathbb{R}$ satisfies the Hilbert-space embeddability condition,
\[
\sum_{j,k=0}^{n}f(x_j,x_k)c_jc_k\leq 0\quad\text{whenever}\quad\sum_{j=0}^{n}c_j=0,
\]
for all choices of weights $c_j$ and points $x_j\in X$, if and only if $f$ is of the form
\[
f(x,y)=h(x_0,x_0)-h(x,y),
\]
where $h$ is a $G$-invariant positive definite kernel and $x_0$ is a point of $X$. One implication is clear. For the other, we start with a $G$-invariant function $f$ subject to the above constraint and prove, using $G$-invariance and integration over $X$, the existence of a constant $c$ such that $c-f$ is a positive semidefinite kernel. This gives the following result.
Theorem 2.9 (Bochner [26]).
Let $X$ be a compact homogeneous space. A continuous invariant function $\rho$ on $X\times X$ is a Hilbert distance if and only if there exists a continuous, real-valued, invariant, positive definite kernel $h$ on $X$ and a point $x_0\in X$, such that
\[
\rho(x,y)=\sqrt{h(x_0,x_0)-h(x,y)}\qquad(x,y\in X).
\]
Privileged orthonormal bases of $G$-invariant functions, in the $L^2$ space associated with the invariant measure, provide canonical decompositions of positive definite kernels. These generalized spherical harmonics were already studied by E. Cartan, H. Weyl and J. von Neumann; see, for instance, [138]. We elaborate on two important particular cases.
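Let $X=\mathbb{T}:=\{e^{i\theta}:\theta\in\mathbb{R}\}$ be the unit torus, endowed with the invariant arc-length measure. A continuous positive definite function $h:X\times X\to\mathbb{R}$ admits a Fourier decomposition
\[
h(e^{i\theta},e^{i\varphi})=\sum_{j,k\in\mathbb{Z}}a_{jk}\,e^{ij\theta}e^{-ik\varphi}.
\]
If $h$ is further required to be rotation invariant, we find that
\[
h(e^{i\theta},e^{i\varphi})=\sum_{k\in\mathbb{Z}}a_k\,e^{ik(\theta-\varphi)},
\]
where $a_k\geq 0$ for all $k\in\mathbb{Z}$ and $a_k=a_{-k}$ because $h$ takes real values. Moreover, the series is Abel summable: $\sum_{k\in\mathbb{Z}}a_k=h(1,1)<\infty$. Therefore, a rotation-invariant Hilbert distance $\rho$ on the torus has the expression (after taking its square):
\[
\rho(e^{i\theta},e^{i\varphi})^2=\sum_{k\in\mathbb{Z}}a_k\bigl(1-e^{ik(\theta-\varphi)}\bigr)=2\sum_{k=1}^{\infty}a_k\bigl(1-\cos k(\theta-\varphi)\bigr)=4\sum_{k=1}^{\infty}a_k\sin^2\Bigl(\frac{k(\theta-\varphi)}{2}\Bigr).
\]
These are the periodic screw lines (2.2) already investigated by von Neumann and Schoenberg.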
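As a second example, we follow Bochner in examining a separable, compact group $G$. A real-valued, continuous, positive definite and $G$-invariant kernel $h$ admits the decomposition
\[
h(x,y)=\sum_{k}c_k\,\frac{\chi_k(yx^{-1})}{\chi_k(1)},
\]
where $c_k\geq 0$ for all $k$, and $\chi_k$ denote the characters of irreducible representations of $G$. In conclusion, an invariant Hilbert distance $\rho$ on $G$ is characterized by the formula
\[
\rho(x,y)^2=\sum_{k}c_k\Bigl(1-\frac{\chi_k(yx^{-1})+\chi_k(xy^{-1})}{2\chi_k(1)}\Bigr),
\]
where $c_k\geq 0$ and $\sum_{k}c_k<\infty$.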
For details and an analysis of similar decompositions on more general homogeneous spaces, we refer the reader to [26].
The above analysis of positive definite functions on homogeneous spaces was carried out separately by Schoenberg in [125]. First, he remarks that a continuous, real-valued, rotationally invariant and positive definite kernel $f$ on the sphere $S^{d-1}$ has a distinguished Fourier-series decomposition with non-negative coefficients. Specifically,
\[
f(\cos\theta)=\sum_{k=0}^{\infty}c_k\,P_k^{(\lambda)}(\cos\theta),
\]
where $\lambda=(d-2)/2$, $P_k^{(\lambda)}$ are the ultraspherical orthogonal polynomials, $c_k\geq 0$ for all $k\geq 0$ and $\sum_{k=0}^{\infty}c_k<\infty$. This decomposition is in accord with Bochner’s general framework, with the difference lying in Schoenberg’s elementary proof, based on induction on dimension. As with all our formulas concerning the sphere, $\theta$ represents the geodesic distance (arc length along a great circle) between two points.
To convince the reader that expressions in the cosine of the geodesic distance are positive definite, let us consider points $x_1$, …, $x_n\in S^{d-1}$. The Gram matrix with entries
\[
\langle x_j,x_k\rangle=\cos\theta(x_j,x_k)
\]
is obviously positive semidefinite, with constant diagonal elements equal to $1$. According to the Schur product theorem [129], all functions of the form $\cos^k\theta$, where $k$ is a non-negative integer, are therefore positive definite on the sphere.
At this stage, Schoenberg makes a leap forward and studies invariant positive definite kernels on $S^{\infty}$, that is, functions $f(\cos\theta)$ which admit representations as above for all $d\geq 2$. His conclusion is remarkable in its simplicity.
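Theorem 2.10 (Schoenberg [125]).
A real-valued function $f(\cos\theta)$ is positive definite on all spheres, independent of their dimension, if and only if
\[
f(\cos\theta)=\sum_{k=0}^{\infty}c_k\cos^k\theta,
\]
where $c_k\geq 0$ for all $k\geq 0$ and $\sum_{k=0}^{\infty}c_k<\infty$.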
This provides a return to the dominant theme, of isometric embedding into Hilbert space.
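Corollary 2.11.
The function $\rho(\theta)$ is a Hilbert distance on $S^{\infty}$ if and only if
\[
\rho(\theta)^2=\sum_{k=0}^{\infty}c_k\bigl(1-\cos^k\theta\bigr),
\]
where $c_k\geq 0$ for all $k\geq 0$ and $\sum_{k=0}^{\infty}c_k<\infty$.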
However, there is much more to derive from Schoenberg’s theorem, once it is freed from the spherical context.
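Theorem 2.12 (Schoenberg [125]).
Let $f:[-1,1]\to\mathbb{R}$ be a continuous function. If the matrix $\bigl[f(a_{jk})\bigr]_{j,k=1}^{n}$ is positive semidefinite for all $n\geq 1$ and all positive semidefinite matrices $[a_{jk}]_{j,k=1}^{n}$ with entries in $[-1,1]$, then, and only then,
\[
f(x)=\sum_{k=0}^{\infty}c_kx^k\qquad(x\in[-1,1]),
\]
where $c_k\geq 0$ for all $k\geq 0$ and $\sum_{k=0}^{\infty}c_k<\infty$.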
Proof.
One implication follows from the Schur product theorem [129], which says that if the $n\times n$ matrices $A$ and $B$ are positive semidefinite, then so is their entrywise product $A\circ B$. Indeed, inductively setting $B=A^{\circ k}=A\circ\cdots\circ A$, the $k$-fold entrywise power, shows that every monomial $x^k$ preserves positivity when applied entrywise. That the same property holds for functions $f(x)=\sum_{k=0}^{\infty}c_kx^k$, with all $c_k\geq 0$, now follows from the fact that the set of positive semidefinite $n\times n$ matrices forms a closed convex cone, for all $n\geq 1$.
For the non-trivial, reverse implication we restrict the test matrices to those with leading diagonal terms all equal to $1$. By interpreting such a matrix $A=[a_{jk}]_{j,k=1}^{n}$ as a Gram matrix, we identify points on the sphere $x_1$, …, $x_n\in S^{n-1}$ satisfying
\[
a_{jk}=\langle x_j,x_k\rangle=\cos\theta(x_j,x_k)\qquad(1\leq j,k\leq n).
\]
Then we infer from Schoenberg’s theorem that $f$ admits a uniformly convergent Taylor series with non-negative coefficients. ∎
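The easy direction of this proof can be observed numerically. The sketch below is a hedged illustration: the random matrices, the tolerance `1e-10`, and the threshold level `0.6` are assumptions, not from the survey. It checks the Schur product theorem, entrywise powers, and the entrywise exponential on a hypothetical correlation matrix. For contrast, it then applies a hard-thresholding map, which is not given by a power series with non-negative coefficients, to an explicit $3\times 3$ positive definite matrix.

```python
import numpy as np

def min_eig(M):
    """Smallest eigenvalue of a real symmetric matrix."""
    return float(np.linalg.eigvalsh(M).min())

rng = np.random.default_rng(5)

# A random correlation matrix: positive semidefinite with entries in [-1, 1].
G = rng.standard_normal((6, 4))
A = G @ G.T
d = np.sqrt(np.diag(A))
A = A / np.outer(d, d)

# A second random positive semidefinite matrix.
H = rng.standard_normal((6, 6))
B = H @ H.T

schur_ok = min_eig(A * B) > -1e-10                          # entrywise product A o B
powers_ok = all(min_eig(A ** k) > -1e-10 for k in range(1, 8))
exp_ok = min_eig(np.exp(A)) > -1e-10                        # exp has non-negative Taylor coefficients

# Hard thresholding x -> x * 1{|x| > 0.6}, applied entrywise.
C = np.array([[1.00, 0.75, 0.50],
              [0.75, 1.00, 0.75],
              [0.50, 0.75, 1.00]])
C_thresholded = np.where(np.abs(C) > 0.6, C, 0.0)

print(schur_ok, powers_ok, exp_ok)
print(min_eig(C), min_eig(C_thresholded))
```

In the last example the input matrix is positive definite, yet zeroing its two smallest entries produces a negative eigenvalue, so this thresholding map does not preserve positivity even in dimension three.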
We conclude this section by mentioning some recent avenues of research that start from Bochner’s theorem (and its generalization in 1940, by Weil, Povzner, and Raikov, to all locally compact abelian groups) and Schoenberg’s classification of positive definite functions on spheres. On the theoretical side, there has been a profusion of recent mathematical activity on classifying positive definite functions (and strictly positive definite functions) in numerous settings, mostly related to spheres [9, 10, 32, 141, 142, 144], two-point homogeneous spaces [7, 8, 28] (see Footnote 2), locally compact abelian groups and homogeneous spaces [45, 64], and products of these [15, 16, 63, 65, 66, 67].
Footnote 2. Recall [137] that a metric space $(X,\rho)$ is $n$-point homogeneous if, given finite sets $X_1$, $X_2\subset X$ of equal size no more than $n$, every isometry from $X_1$ to $X_2$ extends to a self-isometry of $X$. This property was first considered by Birkhoff [21], and of course differs from the more common usage of the terminology of a homogeneous space $G/H$, whose study by Bochner was mentioned above.
Moreover, this line of work directly impacts applied fields. For instance, in climate science and geospatial statistics, one uses positive definite kernels and Schoenberg’s results (and their sequels) to study trends in climate behavior on the Earth, since it can be modelled by a sphere, and positive definite functions on $S^2\times\mathbb{R}$ characterize space-time covariance functions on it. See [62, 101, 108, 109] for more details on these applications. There is a natural connection to probability theory, through the work of Lévy; see e.g. [56]. Other applied fields include genomics and finance, through high-dimensional covariance estimation. We elaborate on this in Chapter 7 below.
There are several other applications of Schoenberg’s work on positive definite functions on spheres (his paper [125] has more than 160 citations) and we mention here just a few of them. Schoenberg’s results were used by Musin [102] to compute the kissing number in four dimensions, by an extension of Delsarte’s linear-programming method. Moreover, the results also apply to obtain new bounds on spherical codes [103], with further applications to sphere packing [35, 36, 37, 38]. There are also applications to approximating functions and interpolating data on spheres, pseudodifferential equations with radial basis functions, and Gaussian random fields.
Remark 2.13.
Another modern-day use of Schoenberg’s results in [125] is in Machine Learning; see [131, 133], for example. Given a real inner-product space $H$ and a function $f:\mathbb{R}\to\mathbb{R}$, an alternative notion of $f$ being positive definite is as follows: for any finite set of vectors $x_1$, …, $x_n\in H$, the matrix
\[
\bigl[f(\langle x_j,x_k\rangle)\bigr]_{j,k=1}^{n}
\]
is positive semidefinite. This is in contrast to the notion promoted by Bochner, Weil, Schoenberg, Pólya, and others, which concerns positivity of the matrix with entries $f\bigl(\|x_j-x_k\|\bigr)$. It turns out that every positive definite kernel on $H$, given by
\[
(x,y)\mapsto f(\langle x,y\rangle)
\]
for a function $f$ which is positive definite in this alternate sense, gives rise to a reproducing-kernel Hilbert space, which is a central concept in Machine Learning. We restrict ourselves here to mentioning that, in this setting, it is desirable for the kernel to be strictly positive definite; see [105] for further clarification and theoretical results along these lines.
2.6. Connections to harmonic analysis
Positivity and sharp continuity bounds for linear transformations between specific normed function spaces go hand in hand, especially when focusing on the kernels of integral transforms. The end of the 1950s marked a fortunate condensation of observations, leading to a quasi-complete classification of preservers of positive or bounded convolution transforms acting on spaces of functions on locally compact abelian groups. In particular, these results can be interpreted as Schoenberg-type theorems for Toeplitz matrices or Toeplitz kernels. We briefly recount the main developments.
A groundbreaking theorem of the 1930s attributed to Wiener and Lévy asserts that the pointwise inverse of a non-vanishing Fourier series with coefficients in $\ell^1$ exhibits the same summability behavior of the coefficient sequence. To be more precise, if $\phi:\mathbb{T}\to\mathbb{C}$ is never zero and has the representation
\[
\phi(e^{i\theta})=\sum_{n=-\infty}^{\infty}c_ne^{in\theta},\qquad\text{where }\sum_{n=-\infty}^{\infty}|c_n|<\infty,
\]
then its reciprocal has a representation of the same form:
\[
\frac{1}{\phi(e^{i\theta})}=\sum_{n=-\infty}^{\infty}d_ne^{in\theta},\qquad\text{where }\sum_{n=-\infty}^{\infty}|d_n|<\infty.
\]
It was Gelfand [61] who in 1941 cast this permanence phenomenon in the general framework of commutative Banach algebras. Gelfand’s theory applied to the Wiener algebra $W:=\widehat{L^1(\mathbb{Z})}$ of Fourier transforms of $L^1$ functions on the dual of the unit torus proves the following theorem.
Theorem 2.14 (Gelfand [61]).
Let $\phi\in W$ and let $f(z)$ be an analytic function defined in a neighborhood of $\phi(\mathbb{T})$. Then $f(\phi)\in W$.
The natural inverse question of deriving smoothness properties of inner transformations of Lebesgue spaces of Fourier transforms was tackled almost simultaneously by several analysts. For example, Rudin proved in 1956 [115] that a coefficient-wise transformation mapping the space into itself implies the analyticity of in a neighborhood of zero. In a similar vein, Rudin and Kahane proved in 1958 [84] that a coefficient-wise transformation which preserves the space of Fourier transforms of finite measures on the torus implies that is an entire function. In the same year, Kahane [83] showed that no quasi-analytic function (in the sense of Denjoy–Carleman) preserves the space and Katznelson [87] refined an inverse to Gelfand’s theorem above, by showing the semi-local analyticity of transformers of elements of subject to some support conditions.
Soon after, the complete picture emerged in full clarity. It was unveiled by Helson, Kahane, Katznelson and Rudin in an Acta Mathematica article [74]. Given a function defined on a subset of the complex plane, we say that operates on the function algebra , if for every with range contained in . The following metatheorem is proved in the cited article.
Theorem 2.15** (Helson–Kahane–Katznelson–Rudin [74]).**
Let be a locally compact abelian group and let denote its dual, and suppose both are endowed with their respective Haar measures. Let be a function satisfying .
- (1) If is discrete and operates on , then is analytic in some neighborhood of the origin.
- (2) If is not discrete and operates on , then is analytic in .
- (3) If is not compact and operates on , then can be extended to an entire function.
Rudin refined the above results to apply in the case of various norms [117, 118], by stressing the lack of continuity assumption for the transformer in all results (similar in nature to the statements in the above theorem). From Rudin’s work we extract a highly relevant observation, à la Schoenberg’s theorem, aligned to the spirit of the present survey.
Theorem 2.16** (Rudin [116]).**
Suppose maps every positive semidefinite Toeplitz kernel with elements in into a positive semidefinite kernel:
[TABLE]
Then is absolutely monotonic, that is, analytic on with a Taylor series having non-negative coefficients:
[TABLE]
The converse is obviously true by the Schur product theorem. The elementary proof, quite independent of the derivation of the metatheorem stated above, is contained in [116]. Notice again the lack of a continuity assumption in the hypotheses.
In fact, Rudin proves more, by restricting the test domain of positive semidefinite Toeplitz kernels to the two-parameter family
[TABLE]
with fixed so that is irrational and , such that . Rudin’s proof commences with a mollifier argument to deduce the continuity of the transformer, then uses a development in spherical harmonics very similar to the original argument of Schoenberg. We will resume this topic in Section 3.3, setting it in a wider context.
With the advances in abstract duality theory for locally convex spaces, it is not surprising that proofs of Schoenberg-type theorems should be accessible with the aid of such versatile tools. We will confine ourselves here to mentioning one pertinent convexity-theoretic proof of Schoenberg’s theorem, due to Christensen and Ressel [33]. (See also [34] for a complex sphere variant.)
Skipping freely over the details, the main observation of these two authors is that the multiplicatively closed convex cone of positivity preservers of positive semidefinite matrices of any size, with entries in , is closed in the product topology of , with a compact base defined by the normalization . The set of extreme points of is readily seen to be closed, and an elementary argument identifies it as the set of all monomials , where , plus the characteristic functions . An application of Choquet’s representation theorem now provides a proof of a generalization of Schoenberg’s theorem, by removing the continuity assumption in the statement.
3. Entrywise functions preserving positivity in all dimensions
3.1. History
With the above history to place the present survey in context, we move to its dominant theme: entrywise positivity preservers. In analysis and in applications in the broader mathematical sciences, one is familiar with applying functions to the spectrum of diagonalizable matrices: then . More formally, one uses the Riesz–Dunford holomorphic functional calculus to define for classes of matrices and functions .
Our focus in this survey will be on the parallel philosophy of entrywise calculus. To differentiate this from the functional calculus, we use the notation .
Definition 3.1**.**
Fix a domain and integers , . Let denote the set of Hermitian positive semidefinite matrices with all entries in .
A function acts entrywise on a matrix
[TABLE]
by setting
[TABLE]
Below, we allow the dimensions and to vary, while keeping the uniform notation .
We also let denote the matrix with each entry equal to one. Note that .
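To make the contrast between the entrywise calculus and the functional calculus concrete, here is a minimal sketch for the function $x \mapsto x^2$ and a single $2\times 2$ matrix. The helper names `entrywise` and `matmul` and the matrix `A` are illustrative choices of ours, not notation from the survey.

```python
from fractions import Fraction


def entrywise(f, A):
    """Apply f to every entry of the matrix A (given as a list of rows)."""
    return [[f(a) for a in row] for row in A]


def matmul(A, B):
    """Ordinary product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(2)]]

# Entrywise calculus: square each entry (the Schur product of A with itself).
entrywise_square = entrywise(lambda x: x * x, A)

# Functional calculus for x^2: the usual matrix product A A.
functional_square = matmul(A, A)

print(entrywise_square)   # entries 4, 1, 1, 4
print(functional_square)  # entries 5, 4, 4, 5
```

The two results differ already in this small case, which is why separate notation for the two calculi is useful.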
In this survey, we explore the following overarching question in several different settings.
Which functions preserve positive semidefiniteness when applied entrywise to a class of positive matrices?
This question was first asked by Pólya and Szegö in their well-known book [107]. The authors observed that Schur’s product theorem, together with the fact that the positive matrices form a closed convex cone, has the following consequence: if is any power series with non-negative Maclaurin coefficients that converges on a domain , then preserves positivity (that is, preserves positive semidefiniteness) when applied entrywise to positive semidefinite matrices with entries in . Pólya and Szegö then asked if there are any other functions that possess this property. As discussed above, Schoenberg’s theorem 2.12 provides a definitive answer to their question (together with the improvements by Rudin or Christensen–Ressel to remove the continuity hypothesis). Thanks to Pólya and Szegö’s observation, Schoenberg’s result may be considered as a rather challenging converse to the Schur product theorem.
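The Pólya–Szegö observation can be checked by hand on small examples. The following sketch uses exact rational arithmetic and tests positive semidefiniteness through principal minors, which is practical only for small matrices. The matrix `A`, the polynomials `f` and `g`, and all helper names are illustrative assumptions of ours.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def is_psd(M):
    """A real symmetric matrix is PSD exactly when every principal minor is >= 0."""
    n = len(M)
    return all(
        det([[M[i][j] for j in idx] for i in idx]) >= 0
        for size in range(1, n + 1)
        for idx in combinations(range(n), size)
    )


def entrywise(f, A):
    """Apply f to every entry of A."""
    return [[f(a) for a in row] for row in A]


# A positive definite matrix with entries in (-1, 1), some of them negative.
q = Fraction(1, 4)
A = [[2 * q, -q, 0 * q],
     [-q, 2 * q, -q],
     [0 * q, -q, 2 * q]]

f = lambda x: 1 + x + x ** 2 / 2 + x ** 3 / 6   # all coefficients non-negative
g = lambda x: x ** 2 - Fraction(1, 4)            # negative constant term

psd_A = is_psd(A)
psd_fA = is_psd(entrywise(f, A))   # guaranteed by the Schur product theorem
psd_gA = is_psd(entrywise(g, A))   # fails: zero diagonal, non-zero off-diagonal

print(psd_A, psd_fA, psd_gA)  # True True False
```

The failure of `g` shows only that a negative coefficient can destroy positivity; whether every preserver has non-negative coefficients is the harder converse discussed in the text.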
In a similar vein, Rudin [116] observed that if one moves to the complex setting, then the conjugation map also preserves positivity when applied entrywise to positive semidefinite complex matrices. Therefore the maps
[TABLE]
preserve positivity when applied entrywise to complex matrices of all dimensions, again by the Schur product theorem. The same property is now satisfied by non-negative linear combinations of these functions. In [116], Rudin made this observation and conjectured, à la Pólya–Szegö, that these are all of the preservers. This was proved by Herz in 1963.
Theorem 3.2** (Herz [77]).**
Let denote the open unit disc in , and suppose . The entrywise map preserves positivity on $\mathcal{P}_{n}\bigl(D(0,1)\bigr)$ for all , if and only if
[TABLE]
where for all , .
Akin to the above results by Schoenberg, Rudin, Christensen and Ressel, and Herz, we mention one more Schoenberg-type theorem, for matrices with positive entries. The following result again demonstrates the rigid principle that analyticity and absolute monotonicity follow from the preservation of positivity in all dimensions.
Theorem 3.3** (Vasudeva [134]).**
Let . Then preserves positivity on $\mathcal{P}_{n}\bigl((0,\infty)\bigr)$ for all , if and only if on , where for all .
3.2. The Horn–Loewner necessary condition in fixed dimension
The previous section contains several variants of a “dimension-free” result: namely, the classification of entrywise maps that preserve positivity on test sets of matrices of all sizes. In the next section, we discuss a dimension-free result that parallels Rudin’s work in [116], by approaching the problem via preservers of moment sequences for positive measures on the real line. In other words, we will work with Hankel instead of Toeplitz matrices.
In the later part of this survey, we focus on entrywise functions that preserve positivity when the test set consists of matrices of a fixed size. For both of these settings, the starting point is an important result first published by R. Horn (who in [80] attributes it to his PhD advisor C. Loewner).
Theorem 3.4** ([80]).**
Let be continuous. Fix a positive integer and suppose preserves positivity on $\mathcal{P}_{n}\bigl((0,\infty)\bigr)$. Then ,
[TABLE]
and is a convex non-decreasing function on . Furthermore, if $f\in C^{n-1}\bigl((0,\infty)\bigr)$, then whenever and .
This result and its variations are the focus of the present section.
Theorem 3.4 is remarkable for several reasons.
- (1) Modulo variations, it remains to this day the only known criterion for a general entrywise function to preserve positivity in a fixed dimension. Later on, we will see more precise conclusions drawn when is a polynomial or a power function, but for a general function there are essentially no other known results.
- (2) While Theorem 3.4 is a fixed-dimension result, it can be used to prove some of the aforementioned dimension-free characterizations. For instance, if preserves positivity on $\mathcal{P}_{n}\bigl((0,\infty)\bigr)$ for all , then, by Theorem 3.4, the function is absolutely monotonic on . A classical result of Bernstein on absolutely monotonic functions now implies that is necessarily given by a power series with non-negative coefficients, which is precisely Vasudeva’s Theorem 3.3.
  In the next section, we will outline an approach to proving a stronger version of Schoenberg’s theorem 2.12 (in the spirit of Theorem 2.16 by Rudin), starting from Theorem 3.3.
- (3) Theorem 3.4 is also significant because there is a sense in which it is sharp. We elaborate on this when studying polynomial and power-function preservers; see Chapters 4 and 6.
Remark 3.5**.**
There are other, rather unexpected consequences of Theorem 3.4 as well. It was recently shown that the key determinant computation underlying Theorem 3.4 can be generalized to yield a new class of symmetric function identities for any formal power series. The only such identities previously known were for the case . This is discussed in Section 4.6.
We next explain the steps behind the proof of the Horn–Loewner theorem 3.4. These also help in proving certain strengthenings of Theorem 3.4, which are mentioned below. In turn, these strengthenings additionally serve to clarify the nature of the Horn–Loewner necessary condition.
Proof of Theorem 3.4.
The proof by Loewner is in two steps. First he assumes to be smooth and shows the result by induction on . The base case of is immediate, and for the induction step one proceeds as follows. Fix , choose any vector with distinct coordinates, and define
[TABLE]
Then Loewner shows that
[TABLE]
(See Remark 3.5 above.)
Returning to the proof of Theorem 3.4 for smooth functions: apply the above treatment not to but to , where . By the Schur product theorem, satisfies the hypotheses, whence for . Taking , by L’Hôpital’s rule we obtain
[TABLE]
Finally, the induction hypothesis implies that , , …, are non-negative at , whence , …, . It follows that for all , and hence, , as desired.
Remark 3.6**.**
The above argument is amenable to proving more refined results. For example, it can be used to prove the positivity of the first non-zero derivatives of a smooth preserver ; see Theorem 3.10.
The second step of Loewner’s proof begins by using mollifiers. Suppose is continuous; approximate it by a mollified family as . Thus is smooth and its first derivatives are non-negative on . By the mean-value theorem for divided differences, this implies that the divided differences of each , of orders up to are non-negative. Since is continuous, the same holds for .
Now one invokes a rather remarkable result by Boas and Widder [24], which can be viewed as a converse to the mean-value theorem for divided differences. It asserts that given an integer and an open interval , if all th order “equi-spaced” forward differences (whence divided differences) of a continuous function are non-negative on , then is times differentiable on ; moreover, is continuous and convex on , with non-decreasing left- and right-hand derivatives. Applying this result for each concludes the proof of Theorem 3.4. ∎
Note that this proof only uses matrices of the form , and the arguments are all local. Thus it is unsurprising that strengthened versions of the Horn–Loewner theorem can be found in the literature; see [12, 71], for example. We present here the stronger of these variants.
Theorem 3.7** (See [12, Section 3]).**
Suppose , , and . Fix and an integer , and define . Suppose for all , and also that for all Hankel matrices , with , such that . Then the conclusions of Theorem 3.4 hold.
Beyond the above strengthenings, the notable feature here is that the continuity hypothesis has been removed, akin to the Rudin and Christensen–Ressel results. We reproduce here an elegant argument to show continuity; this can be found in Vasudeva’s paper [134], and uses only the test set . By considering for with , it follows that is non-negative and non-decreasing on . One also shows that is either identically zero or never zero on . In the latter case, considering for shows that is multiplicatively mid-convex: the function
[TABLE]
is midpoint convex and locally bounded on the interval . Now the following classical result [113, Theorem 71.C] shows that is continuous on , so is continuous on .
Proposition 3.8**.**
Let be a convex open set in a real normed linear space. If is midpoint convex on and bounded above in an open neighborhood of a single point in , then is continuous, so convex, on .
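The $2\times 2$ step of the continuity argument above is easy to test directly. The sketch below assumes that the relevant test matrices are the rank-one matrices with rows $(a, \sqrt{ab})$ and $(\sqrt{ab}, b)$, so that positivity of $f[A]$ amounts to $f(\sqrt{ab})^2 \le f(a) f(b)$. To stay in exact arithmetic we take $a = p^2$ and $b = q^2$ with rational $p$, $q$; the two sample functions and the grid are illustrative choices of ours.

```python
from fractions import Fraction


def two_by_two_det(f, p, q):
    """det f[A] for A = [[p^2, p q], [p q, q^2]], i.e. f(a) f(b) - f(sqrt(a b))^2."""
    return f(p * p) * f(q * q) - f(p * q) ** 2


grid = [Fraction(k, 4) for k in range(1, 13)]

preserver = lambda x: x + x ** 3    # power series with non-negative coefficients
candidate = lambda x: x / (1 + x)   # non-negative and non-decreasing on (0, infinity)

dets_preserver = [two_by_two_det(preserver, p, q) for p in grid for q in grid]
dets_candidate = [two_by_two_det(candidate, p, q) for p in grid for q in grid]

print(min(dets_preserver) >= 0)                               # True
print(two_by_two_det(candidate, Fraction(1), Fraction(2)))    # -2/45
```

The second function is non-negative and non-decreasing but not multiplicatively mid-convex, so it already fails on $2\times 2$ matrices.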
We now move to variants of the Horn–Loewner result. Notice that Theorems 3.4 and 3.7 are results for arbitrary positivity preservers . When more is known about , such as smoothness or even real analyticity, stronger conclusions can be drawn from smaller test sets of matrices. A recent variant is the following lemma, shown by evaluating at matrices and using the invertibility of “generic” generalized Vandermonde matrices.
Lemma 3.9** (Belton–Guillot–Khare–Putinar [11] and Khare–Tao [89]).**
Let and . Suppose is a convergent power series on that is positivity preserving entrywise on rank-one matrices in . Further assume that for some .
- (1) If , then we have for at least values of . (In particular, the first non-zero Maclaurin coefficients of , if they exist, must be positive.)
- (2) If instead , then we have for at least values of and at least values of . (In particular, if is a polynomial, then the first non-zero coefficients and the last non-zero coefficients of , if they exist, are all positive.)
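A small instance of part (1) may help. We assume here that the count in the lemma equals the matrix dimension, so that for $2\times 2$ matrices a negative coefficient must be preceded by at least two positive ones. The polynomial $h(x) = 1 - x$ has only one, and the sketch below confirms that it fails on a rank-one matrix with entries in $[0,1)$; indeed $\det h[uu^T] = -(u_1-u_2)^2$. The vector `u` and the helper names are illustrative.

```python
from fractions import Fraction


def rank_one(u):
    """The rank-one positive semidefinite matrix u u^T."""
    return [[ui * uj for uj in u] for ui in u]


def entrywise(f, A):
    """Apply f to every entry of A."""
    return [[f(a) for a in row] for row in A]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


h = lambda x: 1 - x   # one positive coefficient, then a negative one

u = [Fraction(1, 2), Fraction(1, 4)]
d = det2(entrywise(h, rank_one(u)))

print(d)  # -1/16, which equals -(u[0] - u[1])**2
```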
Notice that this lemma (a) concerns the derivatives of at $0$ and not in ; and moreover, (b) considers not the first few derivatives, but the first few non-zero derivatives. Thus, it is morally different from the preceding two theorems, and one naturally seeks a unification of these three results. This was recently achieved:
Theorem 3.10** (Khare [88]).**
Let , and let be smooth. Fix integers and , with if , and such that has non-zero derivatives at of order at least . Now let
[TABLE]
suppose further that
[TABLE]
are the lowest orders (above ) of the first non-zero derivatives of at .
Also fix distinct scalars , …, , and let . If for all , then the derivative is non-negative whenever .
Notice that varying allows one to control the number of initial derivatives versus the number of subsequent non-zero derivatives of smallest order. In particular, if , then the result implies the “stronger” Horn–Loewner theorem 3.7 (and so Theorem 3.4) pointwise at every . At the other extreme is the special case of (at any ), which strengthens the conclusions of Theorems 3.4 and 3.7 for smooth functions.
Corollary 3.11**.**
Suppose , , , , and are as in Theorem 3.10. If for all , then the first non-zero derivatives of at are positive.
Remark 3.12**.**
Theorem 3.10 further clarifies the nature of the Horn–Loewner result and its proof. The reduction from arbitrary functions, to continuous functions, to smooth functions, requires an open domain , in order to use mollifiers, for example. However, the result for smooth functions actually holds pointwise, as shown by Theorem 3.10.
The proof of Theorem 3.10 combines novel arguments together with the previously mentioned techniques of Loewner. The refinement of the determinant computations (3.1) is of particular note; see Theorem 4.20 and its consequence, Theorem 4.22.
3.3. Schoenberg redux: moment sequences and Hankel matrices
In this section, we outline another approach to proving Schoenberg’s theorem 2.12, which yields a stronger version parallel to the strengthening by Rudin of Theorem 2.16. The present section reveals connections between positivity preservers, totally non-negative Hankel matrices, moment sequences of positive measures on the real line, and also a connection to semi-algebraic geometry.
We begin with Rudin’s Theorem 2.16 and the family (2.5). Notice that the positive definite sequences in (2.5) give rise to the Toeplitz matrices with $(j,k)$ entry equal to $\alpha+\beta\cos\bigl((j-k)\theta\bigr)$. From the elementary identity
[TABLE]
it follows that these Toeplitz matrices have rank at most three:
[TABLE]
where
[TABLE]
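The rank bound can be confirmed numerically. The sketch below builds the Toeplitz matrix with entries $\alpha+\beta\cos((j-k)\theta)$ and compares it with $\alpha\mathbf{1}\mathbf{1}^T + \beta uu^T + \beta vv^T$, where $u_j = \cos(j\theta)$ and $v_j = \sin(j\theta)$; this decomposition follows from the cosine difference formula and is our reading of the display above. The parameter values are arbitrary, and the function names are ours.

```python
import math


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def toeplitz(alpha, beta, theta, n):
    """Matrix with (j, k) entry alpha + beta * cos((j - k) * theta)."""
    return [[alpha + beta * math.cos((j - k) * theta) for k in range(n)]
            for j in range(n)]


def low_rank_form(alpha, beta, theta, n):
    """alpha * 1 1^T + beta * u u^T + beta * v v^T with u = cos, v = sin."""
    u = [math.cos(j * theta) for j in range(n)]
    v = [math.sin(j * theta) for j in range(n)]
    return [[alpha + beta * (u[j] * u[k] + v[j] * v[k]) for k in range(n)]
            for j in range(n)]


alpha, beta, theta, n = 0.3, 0.5, 1.0, 4
T = toeplitz(alpha, beta, theta, n)
R = low_rank_form(alpha, beta, theta, n)

# The two constructions agree up to rounding error.
max_diff = max(abs(T[j][k] - R[j][k]) for j in range(n) for k in range(n))

# A 4x4 matrix of rank at most three has vanishing determinant.
det_T = det(T)

print(max_diff, det_T)
```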
In particular, Rudin’s work (see Theorem 2.16 and the subsequent discussion) implies the following result.
Proposition 3.13**.**
Let such that is irrational. An entrywise map preserves positivity on the set of Toeplitz matrices
[TABLE]
if and only if is a convergent power series on , with for all .
Thus, one can significantly reduce the set of test matrices.
Proof.
Given , let the restriction . Observe from the discussion following Theorem 2.16 that Rudin’s work explicitly shows the result for , whence for any by a change of variables. Thus,
[TABLE]
Given , it follows by the identity theorem that for all . Hence (which was Rudin’s ), now on all of . ∎
In a parallel vein to Rudin’s results and Proposition 3.13, the following strengthening of Schoenberg’s result can be shown, using a different (and perhaps more elementary) approach than those of Schoenberg and Rudin.
Theorem 3.14** (Belton–Guillot–Khare–Putinar [12]).**
Suppose and . Then the following are equivalent for a function .
- (1) The entrywise map preserves positivity on , for all .
- (2) The entrywise map preserves positivity on the Hankel matrices in of rank at most , for all .
- (3) The function is real analytic on and absolutely monotonic on . In other words, on , with .
Remark 3.15**.**
Recall the alternate notion of positive definite functions discussed in Remark 2.13. In [105] and related works, Pinkus and other authors study this alternate notion of positive definite functions on . Notice that such matrices form precisely the set of positive semidefinite symmetric matrices of rank at most . In particular, Theorem 3.14 and the far earlier 1959 paper [116] of Rudin both provide a characterization of these functions, on every Hilbert space of dimension or more.
Parallel to the discussions of the proofs of Schoenberg’s and Rudin’s results (see the previous chapter), we now explain how to prove Theorem 3.14. Clearly, in the theorem. We first outline how to weaken the condition even further and still imply . The key idea is to consider moment sequences of certain non-negative measures on the real line. This parallels Rudin’s considerations of Fourier–Stieltjes coefficients of non-negative measures on the circle.
Definition 3.16**.**
A measure with support in is said to be admissible if on , and all moments of exist and are finite:
[TABLE]
The sequence $\mathbf{s}(\mu):=\bigl(s_{k}(\mu)\bigr)_{k=0}^{\infty}$ is termed the moment sequence of . Corresponding to and this moment sequence is the moment matrix of :
[TABLE]
note that is a semi-infinite Hankel matrix. Finally, a function acts entrywise on moment sequences, to yield real sequences:
[TABLE]
We are interested in understanding which entrywise functions preserve the space of moment sequences of admissible measures. The connection to positive semidefinite matrices is made through Hamburger’s theorem, which says that a real sequence is the moment sequence of an admissible measure on if and only if every (finite) principal submatrix of the moment matrix is positive semidefinite. For simplicity, this last condition will be reformulated below as saying that is positive semidefinite.
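For a finitely supported measure, the moment matrix and its positivity can be computed exactly. The sketch below takes the three-point measure $\tfrac12\delta_{1} + \delta_{1/2} + \tfrac14\delta_{-1}$ (an arbitrary choice), forms truncations of its Hankel moment matrix, and checks that they are positive semidefinite and that the $4\times 4$ truncation is singular, as one expects when the support has three points. All function names are ours.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def is_psd(M):
    """A real symmetric matrix is PSD exactly when every principal minor is >= 0."""
    n = len(M)
    return all(
        det([[M[i][j] for j in idx] for i in idx]) >= 0
        for size in range(1, n + 1)
        for idx in combinations(range(n), size)
    )


def moments(weights, points, count):
    """s_k = sum_i w_i * x_i^k for k = 0, ..., count - 1."""
    return [sum(w * x ** k for w, x in zip(weights, points)) for k in range(count)]


def hankel(s, n):
    """The n x n truncation (s_{j+k}) of the moment matrix."""
    return [[s[j + k] for k in range(n)] for j in range(n)]


weights = [Fraction(1, 2), Fraction(1), Fraction(1, 4)]
points = [Fraction(1), Fraction(1, 2), Fraction(-1)]
s = moments(weights, points, 7)

H3, H4 = hankel(s, 3), hankel(s, 4)
print(det(H3))      # 9/32
print(det(H4))      # 0
print(is_psd(H4))   # True
```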
The weakening of Theorem 3.14(2) is now explained: it suffices to consider the reduced test set of those Hankel matrices, which arise as the moment matrices of admissible measures supported at three points. Henceforth, let denote the Dirac probability measure supported at . It is not hard to verify that the -point measure has Hankel matrix with rank no more than :
[TABLE]
Thus, a further strengthening of Schoenberg’s result is as follows.
Theorem 3.17** (Belton–Guillot–Khare–Putinar [12]).**
In the setting of Theorem 3.14, the three assertions contained therein are also equivalent to
- (4) For each measure
[TABLE]
there exists an admissible measure on such that $f\bigl(s_{k}(\mu)\bigr)=s_{k}(\sigma_{\mu})$ for all .
In fact, we will see in Section 3.4 below that this assertion (4) can be simplified to just assert that is positive semidefinite, and so completely avoid the use of Hamburger’s theorem.
We now discuss the proof of these results, working with for ease of exposition. The first observation is that the strengthening of the Horn–Loewner theorem 3.7, together with the use of Bernstein’s theorem (see remark (2) following Theorem 3.4), implies the following “stronger” form of Vasudeva’s theorem 3.3:
Theorem 3.18** (see [12]).**
Suppose and . Also fix . The following are equivalent:
- (1) The entrywise map preserves positivity on for all .
- (2) The entrywise map preserves positivity on all moment matrices for .
- (3) The function equals a convergent power series for all , with the Maclaurin coefficients for all .
Notice that the test matrices in assertion (2) are all Hankel, and of rank at most two. This severely weakens Vasudeva’s original hypotheses.
Now suppose the assertion in Theorem 3.17(4) holds. By the preceding result, is given on by an absolutely monotonic function . The next step is to show that is continuous. For this, we will crucially use the following “integration trick”. Suppose for each admissible measure as in (3.4), there is a non-negative measure supported on such that $f\bigl(s_{k}(\mu)\bigr)=s_{k}(\sigma_{\mu})$ for all . (Note here that it is not immediate that the support is contained in .)
Now let be a polynomial that takes non-negative values on . Then,
[TABLE]
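The integration trick can be tried out numerically. We assume that, for a polynomial $p(t) = \sum_k c_k t^k$, the inequality reads $\sum_k c_k f\bigl(s_k(\mu)\bigr) = \int p \, d\sigma_\mu \ge 0$. The sketch below evaluates the left-hand sum for $p(t) = (1-t)^2(1+t) = 1 - t - t^2 + t^3$, which is non-negative on $[-1,1]$, and for a three-point measure supported at $1$, $1/2$ and $-1$. The measure, the polynomial, and the two sample functions are illustrative choices of ours: the exponential is absolutely monotonic and entire, while the cosine is not absolutely monotonic.

```python
import math


def moments(weights, points, count):
    """s_k = sum_i w_i * x_i^k for k = 0, ..., count - 1."""
    return [sum(w * x ** k for w, x in zip(weights, points)) for k in range(count)]


def pairing(coeffs, f, s):
    """sum_k c_k * f(s_k), where coeffs lists c_0, c_1, ... of the polynomial p."""
    return sum(c * f(s[k]) for k, c in enumerate(coeffs))


# Moments of (1/2) delta_1 + delta_{1/2} + (1/4) delta_{-1}: 1.75, 0.75, 1.0, 0.375.
s = moments([0.5, 1.0, 0.25], [1.0, 0.5, -1.0], 4)

# p(t) = (1 - t)^2 (1 + t) = 1 - t - t^2 + t^3.
p = [1.0, -1.0, -1.0, 1.0]

value_exp = pairing(p, math.exp, s)
value_cos = pairing(p, math.cos, s)

print(value_exp)  # positive, consistent with the inequality
print(value_cos)  # negative, so cos cannot map these moments to moments on [-1, 1]
```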
Remark 3.19**.**
For example, suppose for some . If , where and , , , then the inequality (3.5) gives that
[TABLE]
It is not clear a priori how to deduce this inequality using the fact that preserves matrix positivity and the Hankel moment matrix of . The explanation, which we provide in Section 3.4 below, connects moment problems, matrix positivity, and real algebraic geometry.
We now outline how (3.5) can be used to prove the continuity of . First note that for as above and all . This fact and the easy observation that is bounded on compact subsets of together imply that all moments of are uniformly bounded. From this we deduce that is necessarily supported on .
The inequality (3.5) now gives the left-continuity of at , for every . Fix , and let
[TABLE]
Applying (3.5) to the polynomials , we deduce that
[TABLE]
Letting , the left-continuity of at follows. Similarly, to show that is right-continuous at , we apply the integration trick to and to instead of .
Having shown continuity, to prove the stronger Schoenberg theorem, we next assume that is smooth on . For all , define the function
[TABLE]
The function satisfies the estimates
[TABLE]
This is shown by another use of the integration trick (3.5), this time for the polynomials for all . In turn, the estimates (3.6) lead to showing that is real analytic on , for all . Now composing for with the function shows that is real analytic on and agrees with on . This concludes the proof for smooth functions.
Finally, to pass from smooth functions to continuous functions, we again use a mollified family as . Each is the restriction of an entire function, say , and the family forms a normal family on each open disc . It follows from results by Montel and Morera that converges uniformly to a function on each closed disc , and is analytic. Since restricts to on , it follows that is necessarily also real analytic on , and we are done.
3.4. The integration trick, and positivity certificates
Observe that the inequality (3.5) can be written more generally as follows.
Given a polynomial which takes non-negative values on , as well as a positive semidefinite Hankel matrix , we have that
[TABLE]
As shown in (3.5), this assertion is clear via an application of Hamburger’s theorem. We now demonstrate how the assertion can instead be derived from first principles, with interesting connections to positivity certificates.
First note that the inequality (3.7) holds if is the square of a polynomial. For instance, if on , then
[TABLE]
where and . The non-negativity of (3.8) now follows immediately from the positivity of the matrix . The same reasoning applies if is a sum of squares of polynomials, or even the limit of a sequence of sums of squares. Thus, one approach to showing the inequality (3.7) for an arbitrary polynomial which is non-negative on is to seek a limiting sum-of-squares representation, which is also known as a positivity certificate, for .
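The identity behind the square case can be verified exactly. If $p = q^2$ with $q(t) = \sum_j q_j t^j$, then $\sum_k c_k s_k = \sum_{j,k} q_j q_k s_{j+k} = v^T H v$, where $v$ is the coefficient vector of $q$ and $H = (s_{j+k})$. The sketch below checks this for $q(t) = 1 - 2t + 3t^2$ and the moments of the three-point measure $\tfrac12\delta_1 + \delta_{1/2} + \tfrac14\delta_{-1}$; both are arbitrary choices, and the function names are ours.

```python
from fractions import Fraction


def poly_square(q):
    """Coefficients of q(t)^2, lowest degree first."""
    p = [0] * (2 * len(q) - 1)
    for j, qj in enumerate(q):
        for k, qk in enumerate(q):
            p[j + k] += qj * qk
    return p


def hankel(s, n):
    """The n x n Hankel matrix (s_{j+k})."""
    return [[s[j + k] for k in range(n)] for j in range(n)]


def quadratic_form(v, H):
    """v^T H v."""
    n = len(v)
    return sum(v[j] * H[j][k] * v[k] for j in range(n) for k in range(n))


# Moments s_0, ..., s_4 of (1/2) delta_1 + delta_{1/2} + (1/4) delta_{-1}.
s = [Fraction(7, 4), Fraction(3, 4), Fraction(1), Fraction(3, 8), Fraction(13, 16)]

q = [1, -2, 3]                      # q(t) = 1 - 2 t + 3 t^2
p = poly_square(q)                  # p(t) = 1 - 4 t + 10 t^2 - 12 t^3 + 9 t^4

lhs = sum(c * s[k] for k, c in enumerate(p))   # sum_k c_k s_k
rhs = quadratic_form(q, hankel(s, 3))          # v^T H v

print(p)         # [1, -4, 10, -12, 9]
print(lhs, rhs)  # 185/16 185/16
```

The equality holds for any sequence; it is the positive semidefiniteness of the Hankel matrix that makes the common value non-negative.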
If a -variate real polynomial is a sum of squares of real polynomials, then it is clearly non-negative on , but the converse is not true for . [Footnote 3: This is connected to semi-algebraic geometry and to Hilbert’s seventeenth problem: recall the famous result of Motzkin that there are non-negative polynomials on that are not sums of squares, such as . Such phenomena have been studied in several settings, including polytopes (by Farkas, Handelman, and Pólya) and more general semi-algebraic sets (by Putinar, Schmüdgen, Stengel, Vasilescu, and others).] Even when , while a sum-of-squares representation is an equivalent characterization for one-variable polynomials that are non-negative on , here we are working on the compact semi-algebraic set . We now give three proofs of the existence of such a positivity certificate in the setting used above.
Proof 1.
A result of Berg, Christensen, and Ressel (see the end of [14]) shows more generally that, for every dimension , any non-negative polynomial on has a limiting sum-of-squares representation. ∎
Proof 2.
The only polynomials used in proving the stronger form of Schoenberg’s theorem, Theorems 3.14 and 3.17, appear following (3.6):
[TABLE]
Each of these polynomials is composed of factors of the form , so it suffices to produce a limiting sum-of-squares representation for these two polynomials on . Note that
[TABLE]
and so on. Adding the first equations shows that is a sum-of-squares polynomial for all . Taking finishes the proof. ∎
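One telescoping family that fits this description is $2^{-j}\bigl(1 - t^{2^{j-1}}\bigr)^2$ for $j = 1, 2, \ldots$; we stress that this particular family is our assumption and may differ from the identities displayed by the authors. Summing the first $n$ terms gives $1 - 2^{-n} - t + 2^{-n} t^{2^n}$, which is a sum of squares and differs from $1 - t$ by at most $2^{-n}$ on $[-1,1]$. Replacing $t$ by $-t$ handles $1 + t$. The sketch below verifies the telescoping identity in exact arithmetic, storing polynomials as dictionaries from exponents to coefficients.

```python
from fractions import Fraction


def scaled_square(j):
    """Coefficients of 2^{-j} * (1 - t^{2^{j-1}})^2 as {exponent: coefficient}."""
    m = 2 ** (j - 1)
    w = Fraction(1, 2 ** j)
    return {0: w, m: -2 * w, 2 * m: w}


def add(p, q):
    """Sum of two polynomials, dropping terms whose coefficients cancel."""
    out = dict(p)
    for exponent, coefficient in q.items():
        out[exponent] = out.get(exponent, 0) + coefficient
    return {e: c for e, c in out.items() if c != 0}


def sos_partial_sum(n):
    """Sum of the first n scaled squares."""
    total = {}
    for j in range(1, n + 1):
        total = add(total, scaled_square(j))
    return total


def expected(n):
    """The claimed closed form 1 - 2^{-n} - t + 2^{-n} t^{2^n}."""
    w = Fraction(1, 2 ** n)
    return {0: 1 - w, 1: Fraction(-1), 2 ** n: w}


print(sos_partial_sum(3))   # constant 7/8, linear -1, degree-8 coefficient 1/8
print(all(sos_partial_sum(n) == expected(n) for n in range(1, 11)))  # True
```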
Proof 3.
In fact, for any and any compact set , if is a non-negative continuous function on , then has a positivity certificate. The Stone–Weierstrass theorem gives a sequence of polynomials which converges to , and the squares of these polynomials then provide the desired limiting representation for . This is a simpler proof than Proof 1 from [14], but the convergence here is uniform, whereas the convergence in [14] is stronger. ∎
Remark 3.20**.**
In (3.5), we used , which was positive semidefinite by assumption. The previous discussion shows that Theorem 3.17(4) can be further weakened, by requiring only that is positive semidefinite, as opposed to being equal to for some admissible measure . Hence we do not require Hamburger’s theorem in order to prove the strengthening of Schoenberg’s theorem that uses the test set of low-rank Hankel matrices.
3.5. Variants of moment-sequence transforms
We now present a trio of results on functions which preserve moment sequences.
For , let denote the set of moment sequences corresponding to admissible measures with support in . We say that maps into , where , , if for every admissible measure with support in there exists an admissible measure with support in such that
[TABLE]
where is the th-power moment of , as in Definition 3.16.
Theorem 3.21**.**
A function maps into itself if and only if is the restriction to of an absolutely monotonic entire function.
Theorem 3.22**.**
A function maps into itself if and only if is absolutely monotonic on and .
Theorem 3.23**.**
A function maps into if and only if there exists an absolutely monotonic entire function such that
[TABLE]
It is striking to observe the possibility of a discontinuity at the origin which may occur in the latter two of these three theorems.
We will content ourselves here with sketching the proof of the second result. For the others, see [12], noting that the first of the results follows from Theorems 3.14 and 3.17 for .
Proof of Theorem 3.22.
Note that the moment matrix corresponding to an element of has a zero entry if and only if for some . This and the Schur product theorem give one implication.
For the converse, suppose preserves . Fix finitely many scalars , and an integer , and set
[TABLE]
where and . If then the integration trick (3.5), but working on , shows that the forward finite differences of alternate in sign:
[TABLE]
so . As this holds for all , and all , it follows that is completely monotonic. The weak density of measures of the form , together with Bernstein’s theorem (2.1), gives that is completely monotonic on for every completely monotonic function . Finally, a theorem of Lorch and Newman [96, Theorem 5] now gives that is absolutely monotonic. ∎
3.6. Multivariable positivity preservers and moment families
We now turn to the multivariable case, and begin with two results of FitzGerald, Micchelli, and Pinkus [52]. We first introduce some notation and a piece of terminology.
Fix and an integer , and let
[TABLE]
For any function , we have the matrix
[TABLE]
We say that is real positivity preserving if
[TABLE]
where, as above, is the collection of positive semidefinite matrices with real entries. Similarly, we say that is positivity preserving if
[TABLE]
where is the collection of positive semidefinite matrices with complex entries. Finally, recall that a function is said to be real entire if there exists an entire function such that . We will also use the multi-index notation
[TABLE]
The following theorems are natural extensions of Schoenberg’s theorem and Herz’s theorem, respectively.
Theorem 3.24** ([52, Theorem 2.1]).**
Let , where . Then is real positivity preserving if and only if it is real entire of the form
[TABLE]
where for all .
Theorem 3.25** ([52, Theorem 3.1]).**
Let , where . Then is positivity preserving if and only if it is of the form
[TABLE]
where for all , and the power series converges absolutely for all .
We now consider the notion of moment family for measures on . As above, a measure on is said to be admissible if it is non-negative and has moments of all orders. Given such a measure , we define the moment family
[TABLE]
In line with the above, we let denote the set of all moment families of admissible measures supported on .
Note that a measure is supported in if and only if its moment family is uniformly bounded:
[TABLE]
Theorem 3.26** ([12, Theorem 8.1]).**
A function maps $\mathcal{M}\bigl([-1,1]^{d}\bigr)$ to itself if and only if is absolutely monotonic and entire.
Proof.
Since can be identified with , the forward implication follows from the one-dimensional result, Theorem 3.21.
For the converse, we use the fact [112] that a collection of real numbers is an element of $\mathcal{M}\bigl([-1,1]^{d}\bigr)$ if and only if the weighted Hankel-type kernels on
[TABLE]
are positive semidefinite, where
[TABLE]
with in the th position. Now suppose is absolutely monotonic and entire; given a family subject to these positivity constraints, we have to verify that the family satisfies them as well.
Theorem 3.14 gives that and are positive semidefinite, so we must show that
[TABLE]
is positive semidefinite for , …, . As is absolutely monotonic and entire, it suffices to show that
[TABLE]
is positive semidefinite for any , but this follows from the Schur product theorem: if , then
[TABLE]
We next consider characterizations of real-valued multivariable functions which map tuples of moment sequences to moment sequences.
Let , …, . A function acts on tuples of moment sequences of (admissible) measures as follows:
[TABLE]
Given , a function is absolutely monotonic if is continuous on , and for all interior points and , the mixed partial derivative exists and is non-negative, where
[TABLE]
With this definition, the multivariable analogue of Bernstein’s theorem is as one would expect; see [27, Theorem 4.2.2].
To proceed further, it is necessary to introduce the notion of a facewise absolutely monotonic function on . Observe that the orthant is a convex polyhedron, and is therefore the disjoint union of the relative interiors of its faces. These faces are in one-to-one correspondence with subsets of :
[TABLE]
note that this face has relative interior .
Definition 3.27.
A function is facewise absolutely monotonic if, for every , there exists an absolutely monotonic function on which agrees with on .
Thus a facewise absolutely monotonic function is piecewise absolutely monotonic, with the pieces being the relative interiors of the faces of the orthant . See [12, Example 8.4] for further discussion. In the special case , this broader class of functions (than absolutely monotonic functions on ) coincides precisely with the maps which are absolutely monotonic on and have a possible discontinuity at the origin, as in Theorem 3.22 above.
This definition allows us to characterize the preservers of -tuples of elements of $\mathcal{M}\bigl([0,1]\bigr)$; the preceding observation shows that Theorem 3.22 is precisely the case.
Theorem 3.28 ([12, Theorem 8.5]).
Let , where the integer . The following are equivalent.
- (1) maps into .
- (2) is facewise absolutely monotonic, and the functions are such that on whenever .
- (3) is such that
[TABLE]
for all , and there exists some such that the products are distinct for all and maps $\mathcal{M}\bigl(\{1,z_{1}\}\bigr)\times\cdots\times\mathcal{M}\bigl(\{1,z_{m}\}\bigr)\cup\mathcal{M}(\{0,1\})^{m}$ to .
The heart of Theorem 3.28 can be deduced from the following result on positivity preservation on tuples of low-rank Hankel matrices. In a sense, it is the multi-dimensional generalization of the ‘stronger Vasudeva theorem’ 3.18.
Fix , an integer and a point with distinct products, as in Theorem 3.28(3). For all , let
[TABLE]
where .
Theorem 3.29 ([12, Theorem 8.6]).
If preserves positivity on $\mathcal{P}_{2}\bigl((0,\rho)\bigr)^{m}$ and for all , then is absolutely monotonic and is the restriction of an analytic function on the polydisc .
The notion of facewise absolute monotonicity emerges from the study of positivity preservers of tuples of moment sequences. If one focuses instead on maps preserving positivity of tuples of all positive semidefinite matrices, or even all Hankel matrices, then this richer class of maps does not appear.
Proposition 3.30.
Suppose and . The following are equivalent.
- (1) preserves positivity on the space of -tuples of Hankel matrices with entries in .
- (2) is absolutely monotonic on .
- (3) preserves positivity on the space of -tuples of all matrices with entries in .
Proof.
Clearly , so suppose (1) holds. It follows from Theorem 3.29 that is absolutely monotonic on the domain and agrees there with an analytic function . To see that on , we use induction on , with the case being left as an exercise (see [12, Proof of Proposition 7.3]).
Now suppose , let and define
[TABLE]
Choosing such that , it follows that
[TABLE]
where the and entries are as claimed by the induction hypothesis. The determinants of the first and last principal minors now give that
[TABLE]
whence . ∎
Having considered functions defined on the positive orthant, we now look at the situation for functions defined over the whole of .
Theorem 3.31 ([12, Theorem 8.9]).
Suppose for some integer . The following are equivalent.
- (1) maps $\mathcal{M}\bigl([-1,1]\bigr)^{m}$ into .
- (2) The function is real positivity preserving.
- (3) The function is absolutely monotonic on and agrees with an entire function on .
As before, the proof reveals that verifying positivity preservation for tuples of low-rank Hankel matrices suffices. The following notation and corollary make this precise.
For all , let $\mathcal{M}_{u}:=\mathcal{M}\bigl(\{-1,u,1\}\bigr)$ and
[TABLE]
Corollary 3.32 ([12, Theorem 8.10]).
The hypotheses in Theorem 3.31 are also equivalent to the following.
- (4) There exist and such that maps
[TABLE]
into .
4. Entrywise polynomials preserving positivity in fixed dimension
Having discussed at length the dimension-free setting, we now turn our attention to functions that preserve positivity in a fixed dimension . This is a natural question from the standpoint of both theory and applications. The connection to applied fields and to high-dimensional covariance estimation is explained below in Chapter 7.
Mathematically, understanding the functions such that for fixed , is a non-trivial and challenging refinement of Schoenberg’s 1942 theorem. A complete characterization was found for by Vasudeva [134]:
Theorem 4.1 (Vasudeva [134]).
Given a function , the entrywise map preserves positivity on $\mathcal{P}_{2}\bigl((0,\infty)\bigr)$ if and only if the function is non-negative, non-decreasing, and multiplicatively mid-convex:
[TABLE]
In particular, is either identically zero or never zero on , and is also continuous.
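As an informal illustration of Theorem 4.1, the following sketch tests sample functions on random $2 \times 2$ matrices. It assumes the usual principal-minor test: a symmetric matrix with diagonal entries $a, c$ and off-diagonal entry $b$ is positive semidefinite exactly when $a, c \geq 0$ and $ac \geq b^2$. The helper names, the sampling range and the sample functions are illustrative choices, not taken from [134], and a random test can only suggest, never prove, that a function is a preserver.

```python
import math
import random

def is_psd_2x2(a, b, c, tol=1e-12):
    """Test [[a, b], [b, c]] for positive semidefiniteness via its principal minors."""
    return a >= 0 and c >= 0 and a * c - b * b >= -tol * max(1.0, a * c)

def preserves_on_samples(f, trials=2000, seed=0):
    """Apply f entrywise to random PSD 2x2 matrices with positive entries."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, c = rng.uniform(0.01, 10.0), rng.uniform(0.01, 10.0)
        # Choosing 0 < b <= sqrt(a*c) makes [[a, b], [b, c]] positive semidefinite.
        b = rng.uniform(0.01, 1.0) * math.sqrt(a * c)
        if not is_psd_2x2(f(a), f(b), f(c)):
            return False
    return True

# sqrt and exp are non-negative, non-decreasing and multiplicatively mid-convex.
print(preserves_on_samples(math.sqrt))
print(preserves_on_samples(math.exp))
# x -> 1/(1+x) is non-negative but decreasing, so it should fail the test.
print(preserves_on_samples(lambda x: 1.0 / (1.0 + x)))
```

The first two calls should report no violation, while the decreasing function is rejected as soon as a sampled matrix has $b < \sqrt{ac}$ or $a \neq c$.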
On the other hand, if , then such a characterization remains open to date. As mentioned above, perhaps the only known result for general entrywise preservers is the Horn–Loewner theorem 3.4 (or its more general variants such as Theorem 3.10).
In light of this challenging scarcity of results in fixed dimension, a strategy adopted in the literature has been to further refine the problem, in one of several ways:
- (1) Restrict the class of functions, while operating entrywise on all of (over some given domain , say or for ). For example, in this survey we consider possibly non-integer power functions, polynomials and power series, and even linear combinations of real powers.
- (2) Restrict the class of matrices and study entrywise functions over this class in a fixed dimension. For instance, popular sub-classes of matrices include positive matrices with rank bounded above, or with a given sparsity pattern (zero entries), or classes such as Hankel or Toeplitz matrices; or intersections of these classes. Indeed, in discussing the Horn–Loewner and Schoenberg–Rudin results, we encountered Toeplitz and Hankel matrices of low rank.
- (3) Study the problem under both of the above restrictions.
In this chapter we begin with the first of these restrictions. Specifically, we will study polynomial maps that preserve positivity, when applied entrywise to . Recall from the Schur product theorem that if the polynomial has only non-negative coefficients then preserves positivity on for every dimension . It is natural to expect that if one reduces the test set, from all dimensions to a fixed dimension, then the class of polynomial preservers should be larger. Remarkably, until 2016 not a single example was known of a polynomial positivity preserver with a negative coefficient. Then, in quick succession, the two papers [11, 89] provided a complete understanding of the sign patterns of entrywise polynomial preservers of . The goal of this chapter is to discuss some of the results in these works.
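The Schur product theorem argument recalled above can be checked by hand on a small example. The sketch below is only an illustration under stated assumptions: the matrix is a Gram matrix of three integer vectors (hence positive semidefinite), the polynomial $1 + 2x + 3x^3$ and the helper names are hypothetical choices, and positive semidefiniteness is tested exactly through all principal minors, which is practical only for small sizes.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_psd(m):
    """A symmetric matrix is PSD iff all of its principal minors are non-negative."""
    n = len(m)
    return all(
        det([[m[i][j] for j in idx] for i in idx]) >= 0
        for k in range(1, n + 1)
        for idx in combinations(range(n), k)
    )

def entrywise(coeffs, m):
    """Apply the polynomial sum_k coeffs[k] * x**k to every entry of m."""
    return [[sum(c * x ** k for k, c in enumerate(coeffs)) for x in row] for row in m]

def gram(vectors):
    """Gram matrix of the given vectors; always positive semidefinite."""
    return [[sum(a * b for a, b in zip(v, w)) for w in vectors] for v in vectors]

A = gram([(1, 0), (1, 1), (2, 1)])      # [[1, 1, 2], [1, 2, 3], [2, 3, 5]]
print(is_psd(A))
print(is_psd(entrywise([1, 2, 0, 3], A)))   # 1 + 2x + 3x^3: non-negative coefficients
print(is_psd(entrywise([-1, 1], A)))        # x - 1: a negative coefficient
```

For this particular matrix the polynomial with non-negative coefficients keeps all principal minors non-negative, whereas $x - 1$ already produces a negative $2 \times 2$ principal minor.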
4.1. Characterizations of sign patterns
Until further notice, we work with entrywise polynomial or power-series maps of the form
[TABLE]
and typically non-zero, which preserve for various . Our goal is to try and understand their sign patterns, that is, which can be negative. The first observation is that as soon as contains the interval for any , by the Horn–Loewner type necessary conditions in Lemma 3.9, the lowest non-zero coefficients of must be positive.
The next observation is that if , then, in general, there is no structured classification of the sign patterns of the power series preservers on . For example, let be a non-negative integer; the polynomials
[TABLE]
do not preserve positivity entrywise on $\mathcal{P}_{N}\bigl((-\rho,\rho)\bigr)$ for any . This may be seen by taking and for some , and noting that
[TABLE]
Similarly, if one allows complex entries and uses higher-order roots of unity, such negative results (vis-a-vis Lemma 3.9) are obtained for complex matrices.
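The obstruction for domains containing negative numbers can be checked exactly on a small example. The displayed formulas are not reproduced here, so the sketch below works under explicit assumptions: the polynomials are taken to be $p_{k,t}(x) = t(1 + x^2 + \cdots + x^{2k}) - x^{2k+1}$, the test vector is $\mathbf{u} = (1, -1, 0, \ldots, 0)^T$, and $A = \eta \mathbf{u}\mathbf{u}^T$ with $0 < \eta < \rho$. Under these assumptions the even part of $p_{k,t}$ cancels in the quadratic form, which equals $-4\eta^{2k+1} < 0$ whatever the value of $t$.

```python
from fractions import Fraction

def p(x, k, t):
    """Assumed form of the polynomial: t * (1 + x^2 + ... + x^(2k)) - x^(2k+1)."""
    return t * sum(x ** (2 * i) for i in range(k + 1)) - x ** (2 * k + 1)

def quadratic_form_value(N, k, t, eta):
    """Return u^T p[A] u for u = (1, -1, 0, ..., 0) and A = eta * u u^T."""
    u = [1, -1] + [0] * (N - 2)
    A = [[eta * ui * uj for uj in u] for ui in u]
    pA = [[p(a, k, t) for a in row] for row in A]
    return sum(u[i] * pA[i][j] * u[j] for i in range(N) for j in range(N))

eta = Fraction(1, 2)
for t in (1, 10, 1000):
    # The value does not depend on t: it is -4 * eta**(2k+1).
    print(t, quadratic_form_value(N=3, k=1, t=t, eta=eta))
```

Since the quadratic form is negative for every $t > 0$, no choice of the scaling parameter rescues positivity on such a matrix.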
Given this, in the rest of the chapter we will focus on for . [Footnote 4: That said, we also briefly discuss the one situation in which our results do apply more generally, even to (an open complex disc).] As mentioned above, if as in (4.1) entrywise preserves positivity even on rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ then its first non-zero Maclaurin coefficients are positive. Our goal is to understand if any other coefficient can be negative (and if so, which of them). This has at least two ramifications:
- (1) It would yield the first example of a polynomial entrywise map (for a fixed dimension) with at least one negative Maclaurin coefficient. Recall the contrast to Schoenberg’s theorem in the dimension-free setting.
- (2) It would also yield the first example of a polynomial (or power series) that entrywise preserves positivity on but not . In particular it would imply that the Horn–Loewner type necessary condition in Lemma 3.9(1) is “sharp”.
These goals are indeed achieved in the particular case , …, in [11], and subsequently, for arbitrary in [89]. (In fact, in the latter work the need not even be integers; this is discussed below.) Here is a ‘first’ result along these lines. Henceforth we assume that ; we will relax this assumption midway through Section 4.5 below.
Theorem 4.2 (Belton–Guillot–Khare–Putinar [11] and Khare–Tao [89]).
Suppose and are non-negative integers, and , , …, are positive scalars. Given for all , there exists a power series
[TABLE]
such that is convergent on , the entrywise map preserves positivity on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ and has the same sign (positive, negative or zero) as for all .
Outline of proof.
The claim is such that it suffices to show the result for exactly one . Indeed, given the claim, for each there exists such that preserves positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ whenever . Now let for all , and define
[TABLE]
Then it may be verified that , and hence has the desired properties. ∎
Thus it suffices to show the existence of a polynomial positivity preserver on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ with precisely one negative Maclaurin coefficient, the leading term. In the next few sections we explain how to achieve this goal. In fact, one can show a more general result, for real powers as well.
Theorem 4.3 (Khare–Tao [89]).
Fix an integer and real exponents in the set . Suppose , , …, as above. Then there exists such that the function
[TABLE]
preserves positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$. [Here and below, we set .]
The restriction of the lying in is a technical one that is explained in a later chapter on the study of entrywise powers preserving positivity on $\mathcal{P}_{N}\bigl((0,\infty)\bigr)$; see Theorem 6.1.
Remark 4.4.
A stronger result, Theorem 4.15, which also applies to real powers, is stated below. We mention numerous ramifications of the results in this chapter following that result.
The proofs of the preceding two theorems crucially use type- representation theory (specifically, a family of symmetric functions) that naturally emerges here via generalized Vandermonde determinants. These symmetric homogeneous polynomials are introduced and used in the next section.
For now, we explain how Theorem 4.3 helps achieve a complete classification of the sign patterns of a family of generalised power series, of the form
[TABLE]
but without the requirement that the exponents are non-decreasing. In this generality, one first notes that the Horn–Loewner-type Lemma 3.9 still applies: if some coefficient , then there must be at least indices such that and . The following result shows that once again, this necessary condition is best possible.
Theorem 4.5 (Classification of sign patterns for real-power series preservers, Khare–Tao [89]).
Fix an integer , and distinct real exponents , , … in . Suppose is a choice of sign for each , such that if then for at least choices of such that . Given any , there exists a choice of coefficients with sign such that
[TABLE]
is convergent on and preserves positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
Notice this result is strictly more general than Theorem 4.2, because the sequence , , can contain an infinite decreasing sequence of positive non-integer powers, for example, all rational elements of . Thus Theorem 4.5 covers a larger class of functions than even Hahn or Puiseux series.
Theorem 4.5 is derived from Theorem 4.3 in a similar fashion to the proof of Theorem 4.2, and we refer the reader to [89, Section 1] for the details.
4.2. Schur polynomials; the sharp threshold bound for a single matrix
We now explain how to prove Theorem 4.3. The present section will discuss the case of integer powers, and end by proving the theorem for a single ‘generic’ rank-one matrix. In the following section we show how to extend the results to all rank-one matrices for integer powers. The subsequent section will complete the proof for real powers, and then for matrices of all ranks.
The key new tool that is indispensable to the following analysis is that of Schur polynomials. These can be defined in a number of equivalent ways; we refer the reader to [30] for more details, including the equivalence of these definitions shown using ideas of Karlin–McGregor, Lindström, and Gessel–Viennot. For our purposes the definition of Cauchy is the most useful:
Definition 4.6.
Given non-negative integers and , let
[TABLE]
and define .
Given a vector and a non-negative integer , let , and let be the matrix with entry .
The Schur polynomial in variables , …, of degree is given by
[TABLE]
Notice that the numerator is a generalized Vandermonde determinant, so a homogeneous and alternating polynomial, while the denominator is the usual Vandermonde determinant in the indeterminates . Hence their ratio is a homogeneous symmetric polynomial in . It follows that Schur polynomials are well defined when working over any commutative unital ring.
Schur polynomials are an extremely well-studied family of symmetric functions. Their appeal lies in the important observation that they are the characters of all irreducible (finite-dimensional) polynomial representations of the complex Lie group (or of the Lie algebra ). In this setting, the definition of Cauchy is a special case of the Weyl character formula. Thus, its specialization yields the corresponding Weyl dimension formula, which will be of use below:
[TABLE]
An alternate proof of (4.3) comes from the principal specialization formula: for a variable , one has that
[TABLE]
this follows from (4.2) because now the numerator is also a standard Vandermonde determinant. We also refer the reader to [98] for many more results and properties of Schur polynomials.
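Cauchy's definition lends itself to a direct computation in exact rational arithmetic. The sketch below is written under assumptions about the notation, since the displayed formulas are not reproduced here: $\mathbf{n} = (n_0 < \cdots < n_{N-1})$, $V(\mathbf{n}) = \prod_{i<j}(n_j - n_i)$, the Schur polynomial is the ratio $\det(u_i^{n_{j-1}}) / \det(u_i^{j-1})$, the Weyl dimension formula reads $s_{\mathbf{n}}(1,\ldots,1) = V(\mathbf{n})/V((0,1,\ldots,N-1))$, and the principal specialization is $\prod_{i<j}(q^{n_j}-q^{n_i})/(q^{j}-q^{i})$. The function names are hypothetical.

```python
from fractions import Fraction
from math import prod

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def schur(n, u):
    """Ratio of a generalized Vandermonde determinant to the usual one (distinct u_i)."""
    numerator = det([[Fraction(x) ** e for e in n] for x in u])
    denominator = det([[Fraction(x) ** e for e in range(len(u))] for x in u])
    return numerator / denominator

def V(n):
    """Product of the differences n_j - n_i over all pairs i < j."""
    return prod(n[j] - n[i] for i in range(len(n)) for j in range(i + 1, len(n)))

def weyl_dimension(n):
    """Assumed value of the Schur polynomial at (1, ..., 1)."""
    return Fraction(V(n), V(tuple(range(len(n)))))

def principal_specialization(n, q):
    """Assumed closed form for the value at (1, q, ..., q^(N-1)), with q != 0, 1."""
    q = Fraction(q)
    N = len(n)
    return prod((q ** n[j] - q ** n[i]) / (q ** j - q ** i)
                for i in range(N) for j in range(i + 1, N))

n = (0, 1, 3)                                # corresponds to u_1 + u_2 + u_3
print(schur(n, (1, 2, 4)))                   # 1 + 2 + 4 = 7
print(principal_specialization(n, 2))        # same point (1, 2, 4), so also 7
print(weyl_dimension(n))                     # value at (1, 1, 1), namely 3
```

The bialternant cannot be evaluated directly at a point with repeated coordinates, which is why the value at $(1, \ldots, 1)$ is obtained from the dimension formula rather than from `schur`.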
Returning to polynomial positivity preservers, we wish to consider functions of the form
[TABLE]
with non-negative integers and positive coefficients , …, . We are interested in characterizing those for which the entrywise map preserves positivity on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$. By the Schur product theorem, this is equivalent to finding the smallest such that is a preserver. We may assume that , so we rescale by and define
[TABLE]
The goal now is to find the smallest such that preserves positivity on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$. We next achieve this goal for a single rank-one matrix.
Proposition 4.7.
With notation as above, define
[TABLE]
for . Given a vector with distinct coordinates, the following are equivalent.
- (1) The matrix is positive semidefinite.
- (2) .
- (3) .
In particular, this shows that for a generic rank-one matrix in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$, there does exist a positivity-preserving polynomial with a negative leading term.
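The single-matrix threshold can be tested exactly on small examples. The sketch below assumes the following reading of the notation, which is not reproduced in the text above: $p_t(x) = t\sum_{j} c_j x^{n_j} - x^M$, $\mathbf{n}_j$ is obtained from $\mathbf{n}$ by deleting $n_j$ and appending $M$, and the threshold is $\sum_j s_{\mathbf{n}_j}(\mathbf{u})^2 / \bigl(c_j\, s_{\mathbf{n}}(\mathbf{u})^2\bigr)$. Because the Vandermonde denominators cancel in each ratio, only generalized Vandermonde determinants are computed. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gen_vandermonde(u, powers):
    """Determinant of the matrix whose (i, j) entry is u_i ** powers[j]."""
    return det([[Fraction(x) ** e for e in powers] for x in u])

def single_matrix_threshold(n, M, c, u):
    """Assumed threshold: sum over j of s_{n_j}(u)^2 / (c_j * s_n(u)^2)."""
    base = gen_vandermonde(u, n)
    total = Fraction(0)
    for j in range(len(n)):
        n_j = n[:j] + n[j + 1:] + (M,)       # drop n_j, append M
        total += gen_vandermonde(u, n_j) ** 2 / (c[j] * base ** 2)
    return total

def p_t_of_rank_one(n, M, c, u, t):
    """Entrywise image of u u^T under p_t(x) = t * sum_j c_j x^{n_j} - x^M."""
    def p(x):
        return t * sum(cj * x ** nj for cj, nj in zip(c, n)) - x ** M
    return [[p(Fraction(a) * b) for b in u] for a in u]

def is_psd(m):
    """A symmetric matrix is PSD iff all of its principal minors are non-negative."""
    size = len(m)
    return all(det([[m[i][j] for j in idx] for i in idx]) >= 0
               for k in range(1, size + 1) for idx in combinations(range(size), k))

n, M, c, u = (0, 1), 2, (1, 1), (1, 2)
t_star = single_matrix_threshold(n, M, c, u)
print(t_star)                                 # 13 for this example
print(is_psd(p_t_of_rank_one(n, M, c, u, t_star)))
print(is_psd(p_t_of_rank_one(n, M, c, u, t_star - Fraction(1, 100))))
```

In this example $p_t[\mathbf{u}\mathbf{u}^T]$ has determinant $t(t - 13)$, so the matrix is positive semidefinite at $t = 13$ and fails to be so just below that value.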
In essence, the equivalences in Proposition 4.7 hold more generally; this is distilled into the following lemma.
Lemma 4.8 (Khare–Tao [90]). [Footnote 5: The work [90] is an extended abstract of the paper [89], but some of the results in it have different proofs from [89].]
Fix and a positive-definite matrix . Fix and define . The following are equivalent.
- (1) is positive semidefinite.
- (2) .
- (3) .
We refer the reader to [90] for the detailed proof of Lemma 4.8, remarking only that the equality in assertion (3) follows by using Schur complements in two different ways to expand the determinant of the matrix .
Now Proposition 4.7 follows directly from Lemma 4.8, by setting
[TABLE]
where is positive definite because of the following general matrix factorization (which is also used below).
Proposition 4.9.
Let be a polynomial with coefficients in a commutative ring . For any integer and any vectors and , it holds that
[TABLE]
where is a multiplicative identity which is adjoined to if necessary.
Now, to apply Lemma 4.8(3), this same equation and the Cauchy–Binet formula allow one to compute in the present situation, and this yields precisely that , as desired.
4.3. The threshold for all rank-one matrices: a Schur positivity result
We continue toward a proof of Theorem 4.3. The next step is to use Proposition 4.7 to achieve an intermediate goal: a threshold bound for that works for all rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$, still working with integer powers. Clearly, to do so one has to understand the supremum of each ratio , as runs over vectors in with distinct coordinates. More precisely, one has to understand the supremum of the weighted sum .
This observation was first made in the work [11] for the case , that is, . It led to the first proof of Theorem 4.3, with all of the denominators being the same: . We now use another equivalent definition of Schur polynomials, by Littlewood, realizing them as sums of monomials corresponding to certain Young tableaux. Every monomial has a non-negative integer coefficient. It follows by the continuity and homogeneity of and the Weyl dimension formula (4.3) that the supremum in the previous paragraph equals the value at , namely
[TABLE]
Since all of these suprema are attained at the same point , the weighted sum in Proposition 4.7(3) also attains its supremum at the same point. Thus, we conclude using Proposition 4.7 that
[TABLE]
preserves positivity entrywise on all rank-one matrices $\mathbf{u}\mathbf{u}^{T}\in\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ if and only if
[TABLE]
In fact, if then the entire argument above goes through even when one changes the domain to the open complex disc , or any intermediate domain . This is precisely the content of the main result in [11].
Theorem 4.10 (Belton–Guillot–Khare–Putinar [11]).
Fix and integers . Let
[TABLE]
and let be the closed disc in the complex plane with centre $0$ and radius . The following are equivalent.
- (1) The entrywise map preserves positivity on .
- (2) The entrywise map preserves positivity on rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
- (3) Either , …, , are all non-negative, or , …, are positive and
[TABLE]
where for .
This theorem provides a complete understanding of which polynomials of degree at most preserve positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ and, more generally, on any subset of $\mathcal{P}_{N}\bigl(\overline{D}(0,\rho)\bigr)$ that contains the rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
Remark 4.11.
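The uniform threshold for the base case can be evaluated numerically. The displayed formulas are not reproduced in the text above, so the sketch below rests on two assumptions recalled from [11]: the threshold is $\sum_{j=0}^{N-1} \binom{M}{j}^2 \binom{M-j-1}{N-j-1}^2 \rho^{M-j}/c_j$, and each binomial product equals the value at $(1,\ldots,1)$ of the Schur polynomial indexed by $(0,\ldots,j-1,j+1,\ldots,N-1,M)$, computed through the Weyl dimension formula $V(\mathbf{n}_j)/V((0,\ldots,N-1))$. The loop cross-checks the second assumption for small parameters; function names are hypothetical.

```python
from fractions import Fraction
from math import comb, prod

def V(n):
    """Product of the differences n_j - n_i over all pairs i < j."""
    return prod(n[j] - n[i] for i in range(len(n)) for j in range(i + 1, len(n)))

def weyl_dimension(n):
    """Assumed value of the Schur polynomial indexed by n at (1, ..., 1)."""
    return Fraction(V(n), V(tuple(range(len(n)))))

def binomial_product(N, M, j):
    """Assumed closed form for the same value when n = (0, ..., N-1) loses j and gains M."""
    return comb(M, j) * comb(M - j - 1, N - j - 1)

def rank_one_threshold(c, M, rho):
    """Assumed threshold for t * (c_0 + ... + c_{N-1} x^{N-1}) - x^M, with M >= N."""
    N = len(c)
    return sum(Fraction(binomial_product(N, M, j) ** 2) * Fraction(rho) ** (M - j) / c[j]
               for j in range(N))

# Cross-check the binomial product against the Weyl dimension formula.
for N in range(1, 5):
    for M in range(N, N + 5):
        for j in range(N):
            n_j = tuple(k for k in range(N) if k != j) + (M,)
            assert weyl_dimension(n_j) == binomial_product(N, M, j)

print(rank_one_threshold((1, 1), 2, 4))      # rho^2 + 4*rho at rho = 4
print(rank_one_threshold((1, 1, 1), 3, 1))
```

For $N = M = 2$ and $c_0 = c_1 = 1$ the formula reduces to $\rho^2 + 4\rho$, which is the limit of the single-matrix quantity $u_1^2 u_2^2 + (u_1 + u_2)^2$ as both coordinates tend to $\sqrt{\rho}$.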
Clearly here, and the proof of was outlined above via Proposition 4.7. We defer mentioning the proof strategy for , because we will later see a similar theorem over for more general powers . The proof of that result, Theorem 4.15, will be outlined in some detail.
Having dealt with the base case of , as well as for any , which holds by the Schur product theorem, we now turn to the general case. In general, is no longer a monomial, and so it is no longer clear if and where the supremum of each ratio , or of their weighted sum, is attained for . The threshold bound for all rank-one matrices itself is not apparent, and the bound for all matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ is even more inaccessible.
By a mathematical miracle, it turns out that the same phenomena as in the base case hold in general. Namely, the ratio of each and attains its supremum at . Hence one can proceed as above to obtain a uniform threshold for , which works for all rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
Example 4.12.
To explain the ideas of the preceding paragraph, we present an example. Suppose
[TABLE]
Then
[TABLE]
The claim is that is coordinatewise non-decreasing for ; the assertion about its supremum on immediately follows from this. It suffices by symmetry to show the claim only for one variable, say . By the quotient rule,
[TABLE]
and this is clearly non-negative on the positive orthant, proving the claim. As we see, the above expression is, in fact, monomial positive, from which numerical positivity follows immediately.
In fact, an even stronger fact holds. Viewed as a polynomial in , every coefficient in the above expression is in fact Schur positive. In other words, the coefficient of each is a non-negative combination of Schur polynomials in and :
[TABLE]
where
[TABLE]
In particular, this implies that each coefficient is monomial positive, whence numerically positive. We recall here that the monomial positivity of Schur polynomials follows from the definition of using Young tableaux.
The miracle to which we alluded above is that the Schur positivity in the preceding example in fact holds in general.
Theorem 4.13 (Khare–Tao [89]).
If and are -tuples of non-negative integers such that for , …, , then the function
[TABLE]
is non-decreasing in each coordinate. Furthermore, if
[TABLE]
is considered as a polynomial in , then the coefficient of every monomial is a Schur-positive polynomial in ,…, .
The second, stronger part of Theorem 4.13 follows from a deep and highly non-trivial result in symmetric function theory (or type- representation theory) by Lam, Postnikov, and Pylyavskyy [92], following earlier results by Skandera. We refer the reader to this paper and to [89] for more details. Notice also that the first assertion in Theorem 4.13 only requires the numerical positivity of the expression (4.7). This is given a separate proof in [89], using the method of condensation due to Charles Lutwidge Dodgson [40]. [Footnote 6: This article by Dodgson immediately follows his better-known 1865 publication, Alice’s Adventures in Wonderland.] In this context, we add for completeness that in [89] the authors also show a log-supermodularity (or FKG, or ) phenomenon for determinants of totally positive matrices.
4.4. Real powers; the threshold works for all matrices
We now return to the proof of Theorem 4.3, which holds for real powers. Our next step is to observe that the first part of Theorem 4.13 now holds for all real powers. Since one can no longer define Schur polynomials in this case, we work with generalized Vandermonde determinants instead:
Corollary 4.14.
Fix -tuples of real powers and , such that for all . Letting as above, the function
[TABLE]
is non-decreasing in each coordinate.
We sketch here one proof. The version for integer powers, Theorem 4.13, gives the version for rational powers, by taking a “common denominator” such that and are all integers, and using a change of variables . The general version for real powers then follows by considering rational approximations and taking limits.
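The monotonicity asserted in Corollary 4.14 can be probed numerically. The sketch below assumes that the function in question is the ratio $\det(u_i^{m_{j-1}}) / \det(u_i^{n_{j-1}})$ of generalized Vandermonde determinants, with real exponents satisfying $n_j \leq m_j$, evaluated at points with distinct positive coordinates. It uses floating-point arithmetic on well-separated coordinates, so it is a sanity check rather than a proof; the exponents, grid and helper names are illustrative choices.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def vandermonde_ratio(u, n, m):
    """Ratio det(u_i ** m_j) / det(u_i ** n_j) for distinct positive u_i."""
    numerator = det([[x ** e for e in m] for x in u])
    denominator = det([[x ** e for e in n] for x in u])
    return numerator / denominator

def values_along_first_coordinate(grid, rest, n, m):
    """Evaluate the ratio while only the first coordinate moves through grid."""
    return [vandermonde_ratio((x,) + rest, n, m) for x in grid]

def is_nondecreasing(values, tol=1e-9):
    return all(b >= a - tol for a, b in zip(values, values[1:]))

# N = 2: here the ratio simplifies to a*b*(a^2 + a*b + b^2)/(a + b), a = sqrt(u1), b = sqrt(u2).
vals2 = values_along_first_coordinate([0.5, 1.0, 1.5], (2.0,), (0.0, 1.0), (0.5, 2.0))
print(vals2, is_nondecreasing(vals2))

# N = 3 with non-integer exponents m dominating n = (0, 1, 2) coordinatewise.
vals3 = values_along_first_coordinate([0.2, 0.6, 1.0, 1.4, 1.8], (2.0, 3.0),
                                      (0.0, 1.0, 2.0), (0.5, 1.5, 3.0))
print(vals3, is_nondecreasing(vals3))
```

The grid deliberately avoids coinciding coordinates, where both determinants vanish and the floating-point quotient becomes unreliable.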
Corollary 4.14 helps prove the real-power version of Theorem 4.3, just as Theorem 4.13 would have shown the integer powers case of Theorem 4.3. Namely, first note that Proposition 4.7 holds even when the are real powers; the only changes are (a) to assume that the coordinates of are distinct, and (b) to rephrase the last assertion (3) to the following:
[TABLE]
These arguments help prove the first part of the following result, which is the culmination of these ideas.
Theorem 4.15 (Khare–Tao [89]).
Fix an integer and real exponents , as well as scalars and , …, , . Let
[TABLE]
The following are equivalent.
- (1) The function preserves positivity entrywise on all rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
- (2) The function preserves positivity entrywise on all Hankel rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
- (3) Either the coefficients , …, and are non-negative, or , …, are positive and
[TABLE]
where and are as defined above.
If, moreover, the exponents all lie in , then these assertions are also equivalent to the following.
- (4) The function preserves positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
Before sketching the proof, we note several ramifications of this result.
- (1) The theorem completely characterizes linear combinations of up to powers that entrywise preserve positivity on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$. The same is true for any subset of $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ that contains all rank-one positive semidefinite Hankel matrices.
- (2) As discussed above, Theorem 4.15 implies Theorem 4.5, which helps in understanding which sign patterns correspond to countable sums of real powers that preserve positivity entrywise on $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$ (or on the subset of rank-one matrices). In particular, the existence of sign patterns which are not all non-negative shows the existence of functions which preserve positivity on but not on .
- (3) Theorem 4.15 bounds in terms of a multiple of . More generally, one can do this for an arbitrary convergent power series instead of a monomial, in the spirit of Theorem 4.2. Even more generally, one may work with Laplace transforms of measures; see Corollary 4.17 below.
For completeness, we also mention two developments related (somewhat more distantly) to the above results.
- A refinement of a conjecture of Cuttler, Greene, and Skandera (2011) and its proof; see [89] for more details. In particular, this approach assists with a novel characterization of weak majorization, using Schur polynomials.
- A related “Schubert cell-type” stratification of the cone ; see [11] for further details.
We conclude this section by outlining the proof of Theorem 4.15.
Proof.
Clearly, . If holds, then, by Corollary 3.11 at , either all the and are non-negative, or is positive for all . Thus, we suppose that .
Note that if for some , then
[TABLE]
is a rank-one Hankel matrix and hence in our test set. Repeating the analysis in Section 4.2, using generalized Vandermonde determinants instead of Schur polynomials and rank-one Hankel matrices of the form ,
[TABLE]
where the equality follows from Corollary 4.14 above. The real-exponent version of (4.4) holds if and the exponents are real and non-decreasing:
[TABLE]
Applying this identity, the above computation yields
[TABLE]
Thus . Conversely, that follows by a similar analysis to that given above, using Corollary 4.14 and the density of matrices , where $\mathbf{u}\in\bigl(0,\sqrt{\rho}\bigr)^{N}$ has distinct entries, in the set of all rank-one matrices in $\mathcal{P}_{N}\bigl((0,\rho)\bigr)$.
It remains to show that if all the exponents . We proceed by induction on . The case is immediate. For the inductive step, we apply the extension principle of the following Proposition 4.16 with , which requires verification that preserves positivity on . This is a straightforward calculation via the induction hypothesis. ∎
The following extension principle was inspired by work of FitzGerald and Horn [51].
Proposition 4.16 (Khare–Tao [89]).
Suppose , and , or the closure of one of these sets. Let be a continuously differentiable function on the interior of . If preserves positivity entrywise on and does so on the rank-one matrices in , then in fact preserves positivity on all of . [Footnote 7: An analogous version of this result holds for or its closure in , with analytic. This is used to prove the corresponding implication in Theorem 4.10 above.]
Proposition 4.16 relies on two arguments found in [51]: (a) every matrix in may be written as the sum of a rank-one matrix in , and a matrix in with its last row and column both zero, and (b) applying the integral identity
[TABLE]
entrywise to this decomposition. See [89, Section 3] for more details. The original use of these arguments was when is a power function; this is explained in Chapter 6 below.
4.5. Power series preservers and beyond; unbounded domains
In the remainder of this chapter, we use Theorem 4.15 to derive several corollaries; thus, we retain and use the notation of that theorem. As discussed following Theorem 4.15, the first consequence extends the theorem from bounding monomials by a multiple of , to bounding for more general power series. Even more generally, one can work with Laplace transforms of real measures on .
Corollary 4.17 (Khare–Tao [89]).
Let the notation be as for Theorem 4.15, with for all . Suppose is a real measure supported on for some , and let
[TABLE]
If is absolutely convergent at , then there exists a finite threshold such that, for all $A\in\mathcal{P}_{N}\bigl((0,\rho)\bigr)$, the matrix
[TABLE]
is positive semidefinite.
Proof.
By Theorem 4.15 and the fact that is a closed convex cone, it suffices to show the finiteness of the quantity
[TABLE]
where is the positive part of . This follows from the hypotheses. ∎
We now turn to the case, which was briefly alluded to above. In other words, the domain is now unbounded: . As in the bounded-domain case, the question of interest is to classify all possible sign patterns of polynomial or power-series preservers on for a fixed integer .
Similar to the above discussion for bounded , the crucial step in classifying sign patterns of power series (or more general functions, as in Theorem 4.5) is to work with integer powers and precisely one coefficient that can be negative. Thus, one first observes that Lemma 3.9(2) holds in the unbounded-domain case . Hence given a polynomial
[TABLE]
where
[TABLE]
if preserves positivity on $\mathcal{P}_{N}\bigl((0,\infty)\bigr)$, then either all the coefficients , …, , are non-negative, or , …, are positive and can be negative. In this case, an explicit threshold is not known as it is in Theorem 4.15, but we now explain why such a threshold exists.
We start from (4.6) and repeat the subsequent analysis via the Cauchy–Binet formula. To find a uniform threshold for that works for all rank-one matrices in $\mathcal{P}_{N}\bigl((0,\infty)\bigr)$, it suffices to bound, uniformly from above, certain ratios of sums of squares of Schur polynomials. This may be done because of the following tight bounds.
Proposition 4.18 (Khare–Tao [89]).
If and , where are non-negative integers and are non-negative real numbers, then
[TABLE]
where . The constants and on each side of (4.9) cannot be improved.
We refer the reader to [89, Section 4] for further details, including how Proposition 4.18 implies the existence of preservers as above for rank-one matrices with . The extension from rank-one matrices to all of \mathcal{P}_{N}\bigl{(}(0,\infty)\bigr{)} is carried out using the extension principle in Proposition 4.16.
In a sense, Proposition 4.18 isolates the ‘leading term’ of every Schur polynomial. This calculation can be generalized to the case of non-integer powers, which helps extend the above results for the unbounded domain to real powers; we refer the reader again to [89, Section 5] for the details, which use additional concepts from type-$A$ representation theory: the Harish-Chandra–Itzykson–Zuber integral and Gelfand–Tsetlin patterns. This yields the desired classification, similar to Theorem 4.5 in the bounded-domain case.
Theorem 4.19** (Khare–Tao [89]).**
Let , and let be a set of distinct real numbers. For each , let be a sign and suppose that, whenever , then for at least choices of such that and also for at least choices of such that . There exists a series with real coefficients,
[TABLE]
which converges on , preserves positivity entrywise on $\mathcal{P}_N\bigl((0,\infty)\bigr)$, and is such that has the same sign as for all .
Note that, in particular, Theorem 4.19 reaffirms that the Horn–Loewner-type conditions in Lemma 3.9(2) are sharp.
4.6. Digression: Schur polynomials from smooth functions, and new symmetric function identities
Before proceeding to additional applications of Theorem 4.15 and related results, we take a brief detour to explain how Schur polynomials arise naturally from any sufficiently differentiable function.
Theorem 4.20** (Khare [88]).**
Fix non-negative integers , as well as scalars and . Let and suppose the function is -times differentiable at . Given vectors , , define for a sufficiently small by setting
[TABLE]
Then,
[TABLE]
where the first factor in the summand is a multinomial coefficient, and we sum over all partitions of with unequal parts, that is, and .
In particular, .
Remark 4.21**.**
As a special case, if is smooth at , and , , then defining gives a function which is smooth at [math], and Theorem 4.20 gives all of these derivatives via the formula (4.10). The general version of Theorem 4.20 is a key ingredient in showing Theorem 3.10, which subsumes all known variants of Horn–Loewner-type necessary conditions in fixed dimension.
The key determinant computation required to prove the original Horn–Loewner necessary condition in fixed dimension (see Theorem 3.4) is the special case of Theorem 4.20 where and for all . In this situation, , so Schur polynomials do not appear. The general version of Theorem 4.20 decouples the vectors and , and holds for all if is smooth (as in Loewner’s setting). Moreover, it reveals the presence of Schur polynomials in every other case than the ones studied by Loewner, that is, when .
While Theorem 4.20 involves derivatives of a smooth function, the result and its proof are, in fact, completely algebraic, and valid over any commutative ring. To show this, an algebraic analogue of the differential operator is required, with more structure than is given by a derivation. The precise statement and its proof may be found in [88, Section 2].
We conclude this section by applying Theorem 4.20 and its algebraic avatar to symmetric function theory. We begin by recalling the famous Cauchy summation identity [98, Example I.4.6]: if is the geometric series, viewed as a formal power series over a commutative unital ring , and , …, , , …, are commuting variables, then
[TABLE]
where the sum runs over all partitions with at most parts. (Usually one uses infinitely many indeterminates in symmetric function theory, but given the connection to the entrywise calculus in a fixed dimension, we will restrict our attention to and for .)
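As a numerical illustration of the Cauchy summation identity, the following minimal sketch works in two variables only. It assumes the identity takes the standard form $\det[(1 - u_j v_k)^{-1}] = V(\mathbf{u}) V(\mathbf{v}) \sum_\lambda s_\lambda(\mathbf{u}) s_\lambda(\mathbf{v})$, with $V$ the Vandermonde determinant and the sum over partitions with at most two parts. The function names and the truncation parameter `max_part` are hypothetical choices made for this example.

```python
import numpy as np

def schur_two_vars(lam1, lam2, x1, x2):
    """Schur polynomial s_(lam1, lam2)(x1, x2) via the bialternant formula (requires x1 != x2)."""
    return (x1 ** (lam1 + 1) * x2 ** lam2 - x2 ** (lam1 + 1) * x1 ** lam2) / (x1 - x2)

def cauchy_determinant(u, v):
    """Determinant of the 2 x 2 matrix with entries 1 / (1 - u_j v_k)."""
    return np.linalg.det(1.0 / (1.0 - np.outer(u, v)))

def truncated_schur_sum(u, v, max_part=60):
    """V(u) V(v) times the sum of s_lam(u) s_lam(v) over lam1 >= lam2 >= 0 with lam1 <= max_part."""
    vandermonde_product = (u[1] - u[0]) * (v[1] - v[0])
    total = 0.0
    for lam1 in range(max_part + 1):
        for lam2 in range(lam1 + 1):
            total += schur_two_vars(lam1, lam2, *u) * schur_two_vars(lam1, lam2, *v)
    return vandermonde_product * total

u, v = (0.3, 0.5), (0.2, 0.4)
print(cauchy_determinant(u, v), truncated_schur_sum(u, v))
```

The truncation is harmless here because all products $u_j v_k$ are well inside the unit interval, so the omitted terms decay geometrically.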
A natural question is whether similar formulae hold when is replaced by other formal power series. Very few such results were known; this includes one due to Frobenius [55], for the function with an scalar. (This is also connected to theta functions and elliptic Frobenius–Stickelberger–Cauchy determinant identities.) For this function,
[TABLE]
A third, obvious identity is if is a ‘fewnomial’ with at most terms. In this case, is a sum of at most rank-one matrices, and so its determinant vanishes.
The following result extends all three of these cases to an arbitrary formal power series over an arbitrary commutative ring , and with an additional -grading.
Theorem 4.22** (Khare [88]).**
Fix a commutative unital ring and let be an indeterminate. Let be an arbitrary formal power series. Given vectors , , where , we have that
[TABLE]
The heart of the proof involves first computing, for each , the coefficient of in , over the “universal ring”
[TABLE]
where , and are algebraically independent over . These coefficients are seen to equal , by the algebraic version of Theorem 4.20. Thus, (4.13) holds over . Then note that both sides of (4.13) lie in the subring , so the identity holds in . Finally, it holds as claimed by specializing from to .
An alternate approach to proving Theorem 4.22 is also provided in [88]. The identity (4.6) is applied, along with the Cauchy–Binet formula, to each truncated Taylor–Maclaurin polynomial of . The result follows by taking limits in the -adic topology, using the -adic continuity of the determinant function.
4.7. Further applications: linear matrix inequalities, Rayleigh quotients, and the cube problem
This chapter ends with further ramifications and applications of the above results. First, notice that Theorem 4.15 implies the following linear matrix inequality version that is ‘sharp’ in more than one sense:
Corollary 4.23**.**
Fix , real exponents for some integer , and scalars for all . Then,
[TABLE]
for all $A \in \mathcal{P}_N\bigl((0,\rho)\bigr)$ of rank one, or of all ranks if , …, . Moreover, the constant is the smallest possible, as is the number of terms on the right-hand side.
Seeking a uniform threshold such as in the preceding inequality can also be achieved (as explained above) by first working with a single positive matrix, then optimizing over all matrices. The first step here can be recast as an extremal problem that involves Rayleigh quotients:
Proposition 4.24** (see [11, 89]).**
Fix an integer and real exponents , where each . Given positive scalars , …, , let
[TABLE]
Then, for and $A \in \mathcal{P}_N\bigl([0,\rho]\bigr)$,
[TABLE]
where and denote the spectral radius and the Moore–Penrose pseudo-inverse of a square matrix , respectively. Moreover, for every non-zero matrix $A \in \mathcal{P}_N\bigl([0,\rho]\bigr)$, the following variational formula holds:
[TABLE]
Proposition 4.24 is shown using the Kronecker normal form for matrix pencils; see the treatment in [57, Section X.6]. When the matrix is a generic rank-one matrix, the above generalized Rayleigh quotient has a closed-form expression, which features Schur polynomials for integer powers. This reveals connections between Rayleigh quotients, spectral radii, and symmetric functions.
Proposition 4.25**.**
Notation as in Proposition 4.24; but now with not necessarily in . If , where has distinct coordinates, then is invertible, and the threshold bound
[TABLE]
In fact, the proof of the final equality in (4.15) is completely algebraic, and reveals new determinantal identities that hold over any field with at least elements.
Proposition 4.26** (Khare–Tao [89]).**
Suppose and are integers, and each have distinct coordinates. Let and define . Then is invertible, and
[TABLE]
The final result is a variant of the matrix-cube problem [104], and connects to spectrahedra [22, 135] and modern optimization theory. Given two or more real symmetric matrices , …, for the corresponding matrix cube of size is
[TABLE]
The matrix-cube problem is to find the largest such that . In the present setting of the entrywise calculus, the above results imply asymptotically matching upper and lower bounds for the size of the matrix cube.
Theorem 4.27** (see [11, 89]).**
Suppose and are integers. Fix positive scalars , , and , and define for each and each matrix $A \in \mathcal{P}_N\bigl([0,\rho]\bigr)$, the cube
[TABLE]
Also define for and :
[TABLE]
where , and
[TABLE]
Then for each fixed , we have the uniform upper and lower bounds:
[TABLE]
Moreover, if the grow linearly, in that
[TABLE]
then the lower and upper bounds for in (4.18) are asymptotically equal as :
[TABLE]
5. Totally non-negative matrices and positivity preservers
In this chapter, we discuss variant notions of matrix positivity that are well studied in the literature, total positivity and total non-negativity, and characterize the maps which preserve these properties.
Definition 5.1**.**
A real matrix is said to be totally non-negative or totally positive if every minor of is non-negative or positive, respectively. We will denote these matrices, as well as the property, by TN and TP.
In older texts, such matrices were called totally positive and strictly totally positive, respectively.
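Definition 5.1 can be tested directly on small matrices by enumerating every minor. The sketch below is a brute-force check, exponential in the matrix size and intended only for illustration; the function names and the tolerance `tol` are assumptions made for this example.

```python
from itertools import combinations

import numpy as np

def minors(A):
    """Yield every minor of A: determinants of all square submatrices of all sizes."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):
            for cols in combinations(range(n), size):
                yield np.linalg.det(A[np.ix_(rows, cols)])

def is_totally_nonnegative(A, tol=1e-12):
    """TN: every minor is non-negative, up to a floating-point tolerance."""
    return all(d >= -tol for d in minors(A))

def is_totally_positive(A, tol=1e-12):
    """TP: every minor is strictly positive, with a small floating-point margin."""
    return all(d > tol for d in minors(A))

print(is_totally_nonnegative(np.eye(3)), is_totally_positive(np.eye(3)))
```

The identity matrix is TN but not TP, since its off-diagonal entries are minors equal to zero.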
To introduce the theory of total positivity, we can do no better than quote from the preface of Karlin’s magisterial book [85]: “Total positivity is a concept of considerable power that plays an important role in various domains of mathematics, statistics and mechanics”. Karlin goes on to list “problems involving convexity, moment spaces, eigenvalues of integral operators, … oscillation properties of solutions of linear differential equations … the theory of approximations … statistical decision procedures … discerning uniformly most powerful tests for hypotheses … ascertaining optimal policy for inventory and production processes … analysis of diffusion-type stochastic processes, and … coupled mechanical systems.”
Perhaps the earliest result on total positivity is due to Fekete, in correspondence with Pólya [50] published in 1912 (see Lemma 5.10). Schoenberg observed the variation-diminishing properties of TP matrices in 1930 [119], and published a series of papers on Pólya frequency functions, which are defined in terms of total positivity, in the 1950s [127, 126, 128]. Independently of Schoenberg, Krein’s investigation of ordinary differential equations led him to the total positivity of Green’s functions for certain differential operators, and in the mid-1930s his works with Gantmacher looked at spectral and other properties of totally positive matrices and kernels; see [58] and [85, Section 10.6].
For more on these four authors, one may consult the afterword of Pinkus’s book on total positivity [106], which also contains a wealth of results on totally positive and totally non-negative matrices. For a modern collection of applications of the theory of total positivity, see the book edited by Gasca and Micchelli [60].
More recently, total positivity has had a major impact on Lie theory. Lusztig extended the theory of total positivity to the setting of linear algebraic groups; see [97] for an exposition of this work. This led Fomin and Zelevinsky to investigate the combinatorics of Lusztig’s theory [53] and resulted in the invention of cluster algebras [54]. These objects have generated an enormous amount of activity in a short period of time, with connections across a wide range of areas within representation theory, combinatorics, geometry, and mathematical physics. For the latter, we will mention only the totally non-negative Grassmannian [110], its connections with scattering amplitudes for quantum field theories [4], and the work by Kodama and Williams on regular soliton solutions of the Kadomtsev–Petviashvili equation [91].
Example 5.2**.**
Perhaps the most well-known class of totally positive matrices consists of the (generalized) Vandermonde matrices: for real numbers and , the matrix
[TABLE]
is totally positive. Indeed, it suffices to show the positivity of any such matrix determinant when . That is non-zero follows from Laguerre’s extension of Descartes’ rule of signs (see [82]) and by fixing the and considering a linear homotopy from to , one obtains a continuous non-vanishing function from the usual Vandermonde determinant (which is positive) to .
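Example 5.2 can be illustrated numerically. The sketch below assumes the usual form of a generalized Vandermonde matrix, with entries $x_j^{\alpha_k}$ for increasing positive nodes $x_j$ and increasing real exponents $\alpha_k$; the sample nodes and exponents are arbitrary choices for this example.

```python
from itertools import combinations

import numpy as np

def generalized_vandermonde(x, alpha):
    """Matrix with (j, k) entry x_j ** alpha_k, for positive nodes x and real exponents alpha."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return x[:, None] ** alpha[None, :]

def all_minors(A):
    """Yield the determinant of every square submatrix of A."""
    m, n = A.shape
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):
            for cols in combinations(range(n), size):
                yield np.linalg.det(A[np.ix_(rows, cols)])

V = generalized_vandermonde([0.5, 1.0, 2.0], [-0.5, 0.5, 2.0])
print(min(all_minors(V)) > 0)
```

If the exponents are not listed in increasing order, positivity of the minors can fail: for nodes $(1, 2)$ and exponents $(1, 0)$ the matrix has determinant $-1$.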
Example 5.3**.**
Another prominent class of symmetric totally positive matrices consists of the Hankel moment matrices corresponding to admissible measures ; see Definition 3.16.
5.1. Totally non-negative and totally positive kernels
An important generalization of TN and TP matrices is given by the following functional form.
Definition 5.4**.**
Let and be totally ordered sets, and let be a kernel.
(1) The kernel is totally positive of order , denoted , if, for any -tuples of points in and in , where , the matrix
[TABLE]
has positive determinant.
(2) The kernel is totally positive if is for all .
(3) Similarly, one defines kernels and totally non-negative kernels by replacing the word “positive” in the above by “non-negative.”
If and , we recover the earlier notions of totally positive and totally non-negative matrices. When and are taken to be real intervals, TN and TP kernels can be thought of as continuous analogues of TN and TP matrices. In fact, one has a continuous analogue of the Cauchy–Binet formula, which generalizes its traditional version.
Theorem 5.5** **(Basic Composition Lemma, see
Suppose , , and let be a non-negative Borel measure on . Suppose and are pointwise Borel measurable with respect to , and let
[TABLE]
If is well defined on the whole of , then
[TABLE]
As an immediate consequence, we have the following corollary.
Corollary 5.6**.**
In the setting of Theorem 5.5, if the kernels and are both or for some , then has the same property. In particular, if and are both TN or TP, then so is .
We conclude this part with an observation of Pólya that connects to a class of well-studied functions, and also implies the positive definiteness of the Gaussian kernel. Recall from the proof of Theorem 2.4 above that this latter property was crucially used by Schoenberg in characterizing metric space embeddings into Hilbert space; however, its proof above was only outlined (via the more sophisticated machinery of Fourier analysis and Bochner’s theorem).
Lemma 5.7** (Pólya).**
The Gaussian kernel given by is totally positive.
Proof.
It suffices to show that every square matrix generated from the kernel has positive determinant. Given real numbers and , we observe the following factorization:
[TABLE]
The proof concludes by observing that all three matrices on the right-hand side have positive determinants, the second because it is a Vandermonde matrix with and . ∎
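The factorization used in this proof can be checked numerically. The sketch below assumes the kernel is normalized as $\exp(-(x-y)^2)$, so that $\exp(-(x_j - y_k)^2) = \exp(-x_j^2) \exp(2 x_j y_k) \exp(-y_k^2)$: two positive diagonal matrices surrounding a generalized Vandermonde matrix with nodes $e^{2 x_j}$ and exponents $y_k$. The function names and sample points are hypothetical.

```python
import numpy as np

def gaussian_matrix(x, y):
    """Matrix with (j, k) entry exp(-(x_j - y_k)^2)."""
    return np.exp(-(x[:, None] - y[None, :]) ** 2)

def gaussian_factors(x, y):
    """Return (left, middle, right) with gaussian_matrix(x, y) = left @ middle @ right."""
    left = np.diag(np.exp(-x ** 2))        # positive diagonal matrix
    middle = np.exp(2.0 * np.outer(x, y))  # entries (e^{2 x_j}) ** y_k
    right = np.diag(np.exp(-y ** 2))       # positive diagonal matrix
    return left, middle, right

x = np.array([-1.0, 0.0, 0.5])   # increasing
y = np.array([-0.5, 0.2, 1.0])   # increasing
left, middle, right = gaussian_factors(x, y)
print(np.allclose(gaussian_matrix(x, y), left @ middle @ right))
print(np.linalg.det(left), np.linalg.det(middle), np.linalg.det(right))
```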
Example 5.8**.**
The Gaussian function is thus an example of a Pólya frequency function, that is, one for which is a TP kernel on . As noted above, these functions were intensively studied by Schoenberg, and continue to be much studied in mathematics and statistics; two of the classic references are [29, 43].
The case of the multivariate Gaussian kernel follows immediately from the one-dimensional version.
Corollary 5.9**.**
For all , the Gaussian kernel
[TABLE]
is positive semidefinite on . In other words, the matrix is positive semidefinite for all , …, .
Proof.
The case is a direct consequence of Lemma 5.7, and the case of general follows from this by using the Schur product theorem. ∎
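The Schur-product step in this proof can be made concrete. The sketch below assumes the multivariate kernel is $\exp(-\lVert \mathbf{x} - \mathbf{y} \rVert^2)$, builds its Gram matrix for a few sample points, and compares it with the entrywise product of the one-dimensional Gaussian Gram matrices taken coordinate by coordinate. The function names and the random sample are assumptions made for this example.

```python
import numpy as np

def gaussian_gram(points):
    """Gram matrix with (j, k) entry exp(-||x_j - x_k||^2)."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=2))

def schur_product_of_coordinates(points):
    """Entrywise product of the one-dimensional Gaussian Gram matrices, one per coordinate."""
    pts = np.asarray(points, dtype=float)
    product = np.ones((len(pts), len(pts)))
    for coord in pts.T:
        product *= np.exp(-(coord[:, None] - coord[None, :]) ** 2)
    return product

pts = np.random.default_rng(0).normal(size=(6, 3))
G = gaussian_gram(pts)
print(np.allclose(G, schur_product_of_coordinates(pts)))
print(np.linalg.eigvalsh(G).min() > -1e-10)
```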
5.2. Entrywise preservers of totally non-negative Hankel matrices
In the recent article [48] by Fallat, Johnson, and Sokal, the authors study when various classes of totally non-negative (TN) matrices are closed under taking sums or Schur products. As they observe, the set of all TN matrices is not closed under these operations; for example, the identity matrix and the all-ones matrix are both TN but their sum is not.
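The failure of closure under sums can be seen from a single minor. The sketch below takes the $3 \times 3$ case (the size is a choice made for this example) and evaluates the minor of the sum lying in rows 1, 2 and columns 2, 3; the helper name `minor` is hypothetical.

```python
import numpy as np

def minor(A, rows, cols):
    """Determinant of the submatrix of A with the given (0-based) rows and columns."""
    return np.linalg.det(A[np.ix_(rows, cols)])

identity = np.eye(3)
all_ones = np.ones((3, 3))
total = identity + all_ones            # 2 on the diagonal, 1 elsewhere

# Rows (0, 1) and columns (1, 2) pick out the submatrix [[1, 1], [2, 1]].
witness = minor(total, (0, 1), (1, 2))
print(witness)
```

The selected submatrix has determinant $1 \cdot 1 - 1 \cdot 2 = -1$, so the sum has a negative minor even though every minor of each summand is $0$ or $1$.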
It is of interest to isolate a class of TN matrices that is a closed convex cone, and is furthermore closed under taking Schur products. Indeed, it is under these conditions that the observation of Pólya–Szegö (see Section 3.1) holds, leading to large classes of TN preservers.
Such a class of matrices has been identified in both the dimension-free as well as fixed-dimension settings. It consists of the TN Hankel matrices. In a fixed dimension, there is the following classical result from 1912.
Lemma 5.10** (Fekete [50]).**
Let be a possibly rectangular real Hankel matrix such that all of its contiguous minors are positive. Then is totally positive.
Recall that a minor is said to be contiguous if it is obtained from successive rows and successive columns of .
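Fekete's lemma reduces the number of minors to inspect considerably. As an illustration, the sketch below enumerates only the contiguous minors and compares with a brute-force enumeration of all minors. The test matrix is the $3 \times 3$ Hilbert matrix with entries $1/(j+k+1)$, a Hankel matrix chosen for this example; all names are hypothetical.

```python
from itertools import combinations

import numpy as np

def contiguous_minors(A):
    """Yield the minors of A formed from successive rows and successive columns."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for size in range(1, min(m, n) + 1):
        for i in range(m - size + 1):
            for j in range(n - size + 1):
                yield np.linalg.det(A[i:i + size, j:j + size])

def all_minors(A):
    """Yield every minor of A (brute force, for comparison)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):
            for cols in combinations(range(n), size):
                yield np.linalg.det(A[np.ix_(rows, cols)])

def hilbert(n):
    """Hankel matrix with (j, k) entry 1 / (j + k + 1), for j, k = 0, ..., n - 1."""
    idx = np.arange(n)
    return 1.0 / (idx[:, None] + idx[None, :] + 1.0)

H = hilbert(3)
print(len(list(contiguous_minors(H))), len(list(all_minors(H))))
print(all(d > 0 for d in contiguous_minors(H)), all(d > 0 for d in all_minors(H)))
```

For a $3 \times 3$ matrix there are 14 contiguous minors against 19 minors in total; the gap widens rapidly with the size.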
If is a square Hankel matrix, let be the square submatrix of obtained by removing the first row and the last column. Notice that every contiguous minor of is a principal minor of either or . Combined with Fekete’s lemma, these observations help show another folklore result.
Theorem 5.11**.**
Let be a square real Hankel matrix. Then is TN or TP if and only if both and are positive semidefinite or positive definite, respectively.
Theorem 5.11 is a very useful bridge between matrix positivity and total non-negativity. A related dimension-free variant (see [2, 59]) concerns the Stieltjes moment problem: a sequence is the moment sequence of an admissible measure on (see Definition 3.16) if and only if the Hankel matrices and (obtained by excising the first row of , or equivalently, the first column) are both positive semidefinite. By Theorem 5.11, this is equivalent to saying that is totally non-negative.
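Theorem 5.11 gives a cheap test for total non-negativity of a Hankel matrix: two eigenvalue computations replace the enumeration of all minors. The sketch below compares the two tests on the moment sequences $s_k = 1 + 2^k$ (of $\delta_1 + \delta_2$) and $s_k = 1 + (-1)^k$ (of $\delta_{-1} + \delta_1$). The function names and the tolerance are assumptions made for this example.

```python
from itertools import combinations

import numpy as np

def hankel_from_moments(s):
    """Square Hankel matrix with (j, k) entry s[j + k], using as many moments as fit."""
    s = np.asarray(s, dtype=float)
    n = (len(s) + 1) // 2
    idx = np.arange(n)
    return s[idx[:, None] + idx[None, :]]

def is_psd(M, tol=1e-9):
    """Positive semidefiniteness of a symmetric matrix, via its smallest eigenvalue."""
    return np.linalg.eigvalsh(M).min() >= -tol

def hankel_tn_via_psd(H, tol=1e-9):
    """Test of Theorem 5.11: H and H^(1) (first row and last column removed) are both PSD."""
    return is_psd(H, tol) and is_psd(H[1:, :-1], tol)

def tn_brute_force(A, tol=1e-9):
    """Check every minor directly, for comparison."""
    m, n = A.shape
    return all(
        np.linalg.det(A[np.ix_(rows, cols)]) >= -tol
        for size in range(1, min(m, n) + 1)
        for rows in combinations(range(m), size)
        for cols in combinations(range(n), size)
    )

H_good = hankel_from_moments([2, 3, 5, 9, 17])   # moments of delta_1 + delta_2
H_bad = hankel_from_moments([2, 0, 2, 0, 2])     # moments of delta_{-1} + delta_1
print(hankel_tn_via_psd(H_good), tn_brute_force(H_good))
print(hankel_tn_via_psd(H_bad), tn_brute_force(H_bad))
```

The second matrix is positive semidefinite, but its truncation $H^{(1)}$ is not, which reflects the fact that the measure is not supported on the non-negative half-line.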
With Theorem 5.11 in hand, one can easily show several basic facts about Hankel TN matrices; we collect these in the following result for convenience.
Lemma 5.12**.**
For an integer and a set , let denote the set of TN Hankel matrices with entries in . For brevity, we let $HTN_N := HTN_N(\mathbb{R}_+)$.
(1) The family is closed under taking sums and non-negative scalar multiples, or more generally, integrals against non-negative measures (as long as these exist).
(2) In particular, if is an admissible measure supported on , then its moment matrix $H_\mu := \bigl(s_{j+k}(\mu)\bigr)_{j,k=0}^{\infty}$ is totally non-negative.
(3) is closed under taking entrywise products.
(4) If the power series is convergent on , with for all , then the entrywise map preserves total non-negativity on , for all .
Given Lemma 5.12(4), which is identical to the start of the story for positivity preservers, it is natural to expect parallels between the two settings. For example, one can ask if a Schoenberg-type phenomenon also holds for preservers of total non-negativity on $\bigcup_{N\geq 1} HTN_N\bigl([0,\rho)\bigr)$ with . As we now explain, this is indeed the case; we will set for ease of exposition. From Theorem 3.14 and the subsequent discussion, it follows via Hamburger’s theorem that the class of functions with all characterizes the entrywise maps preserving the set of moment sequences of admissible measures supported on . By the above discussion, in considering the family of matrices for all , we are studying moment sequences of admissible measures supported on , or the related Hausdorff moment problem for . In this case, one also has a Schoenberg-like characterization, outside of the origin.
Theorem 5.13** (Belton–Guillot–Khare–Putinar [12]).**
Let . The following are equivalent.
(1) Applied entrywise, the map preserves the set for all .
(2) Applied entrywise, the map preserves positive semidefiniteness on for all .
(3) Applied entrywise, the map preserves the set of moment sequences of admissible measures supported on .
(4) Applied entrywise, the map preserves the set of moment sequences of admissible measures supported on .
(5) The function agrees on with an absolutely monotonic entire function, hence is non-decreasing, and .
Remark 5.14**.**
If we work only with , then we are interested in matrices in with positive entries. Since the only matrices in with a zero entry are scalar multiples of the elementary square matrices or (equivalently, the only admissible measures supported in with a zero moment are of the form ), the test set does not really reduce, and hence the preceding theorem still holds in essence: we must replace by $HTN_N\bigl((0,\infty)\bigr)$ in (1) and (2), reduce the class of admissible measures to those that are not of the form in (3) and (4), and end (5) at ‘entire function’. These five modified statements are, once again, equivalent, and provide further equivalent conditions to those of Vasudeva (Theorems 3.3 and 3.18).
In a similar vein, we now present the classification of sign patterns of polynomial or power-series functions that preserve TN entrywise in a fixed dimension on Hankel matrices. This too turns out to be exactly the same as for positivity preservers.
Theorem 5.15** (Khare–Tao [89]).**
Fix and real exponents . For any real coefficients , …, , , let
[TABLE]
The following are equivalent.
(1) The entrywise map preserves TN on the rank-one matrices in $HTN_N\bigl((0,\rho)\bigr)$.
(2) The entrywise map preserves positivity on the rank-one matrices in $HTN_N\bigl((0,\rho)\bigr)$.
(3) Either all the coefficients , …, , are non-negative, or , …, are positive and , where
[TABLE]
If for , …, , then conditions (1), (2) and (3) are further equivalent to the following.
(4) The entrywise map preserves TN on $HTN_N\bigl([0,\rho]\bigr)$.
In particular, this produces further equivalent conditions to Theorem 4.15. Notice that assertion (2) here is valid because the rank-one matrices used in proving Theorem 4.15 are of the form , where , , and , so that $c\mathbf{u}\mathbf{u}^{T} \in HTN_N\bigl((0,\rho)\bigr)$.
The consequences of Theorem 4.15 also carry over for TN preservers. For instance, one can bound Laplace transforms analogously to Corollary 4.17, by replacing the words “positive semidefinite” by “totally non-negative” and the set $\mathcal{P}_N\bigl((0,\rho)\bigr)$ by $HTN_N\bigl((0,\rho)\bigr)$. Similarly, one can completely classify the sign patterns of power series that preserve TN entrywise on Hankel matrices of a fixed size:
Theorem 5.16** (Khare–Tao [89]).**
Theorems 4.5 and 4.19 hold upon replacing the phrase “preserves positivity entrywise on $\mathcal{P}_N\bigl((0,\rho)\bigr)$” with “preserves TN entrywise on $HTN_N\bigl((0,\rho)\bigr)$”, for both and for .
We point the reader to [89, End of Section 9] for details.
To conclude, it is natural to seek a general result that relates the positivity preservers on and TN preservers on the set for domains . Here is one variant which helps prove the above theorems, and which essentially follows from Theorem 5.11.
Proposition 5.17** (Khare–Tao [89]).**
Fix integers and a scalar . Suppose is such that the entrywise map preserves positivity on $\mathcal{P}_N^k\bigl([0,\rho)\bigr)$, the set of matrices in $\mathcal{P}_N\bigl([0,\rho)\bigr)$ with rank no more than . Then preserves total non-negativity on $HTN_N\bigl([0,\rho)\bigr) \cap \mathcal{P}_N^k\bigl([0,\rho)\bigr)$.
5.3. Entrywise preservers of totally non-negative matrices
The TN property is very rigid when it comes to entrywise operations, as the following result makes clear.
Theorem 5.18** ([13, Theorem 2.1]).**
Let be a function and let , where and are positive integers. The following are equivalent.
(1) preserves TN entrywise on matrices.
(2) preserves TN entrywise on matrices.
(3) is either a non-negative constant or
(a) ;
(b) for some and some ;
(c) for some and some ;
(d) for some .
Proof.
That is immediate, as is the equivalence of and when . For larger values of , we sketch the implication .
For , let the totally non-negative matrices
[TABLE]
If the non-constant function preserves TN entrywise for matrices, then the non-negativity of the determinants of and gives that
[TABLE]
It follows that is strictly positive. Applying Vasudeva’s argument, as set out before Proposition 3.8, now implies that is continuous on . Since the identity (5.4) shows that is multiplicative, there exists an exponent such that for all . The final details are left as an exercise.
For , note that the matrix is totally non-negative if and only if the matrix is. Hence the previous working gives that for some and . Looking at for the totally non-negative matrix
[TABLE]
shows that we must have .
The argument to rule out the possibility that when is more involved, but makes use of an example of Fallat, Johnson and Sokal [48, Example 5.8]. Full details are provided in [13]. ∎
If our totally non-negative matrices are also required to be symmetric, and so positive semidefinite, then the classes of preservers are enlarged somewhat, but still fairly restrictive.
Theorem 5.19** ([13, Theorem 2.3]).**
Let and let be a positive integer. The following are equivalent.
(1) preserves TN entrywise on symmetric matrices.
(2) is either a non-negative constant or
(a) ;
(b) is non-negative, non-decreasing, and multiplicatively mid-convex, that is, for all , , so continuous;
(c) for some and some ;
(d) for some and some ;
(e) for some .
5.4. Entrywise preservers of totally positive matrices
In moving from total non-negativity to total positivity, we face two significant technical challenges. Firstly, the idea of realizing totally non-negative matrices as submatrices of totally non-negative matrices, by padding with zeros, does not transfer to the TP setting. Secondly, it is no longer possible to use Vasudeva’s idea to establish multiplicative mid-point convexity, since the test matrices used for this are not always totally positive.
The first issue leads us into the domain of totally positive completion problems [47]. It is possible to do this in generality, using parametrizations of TP matrices [53] or exterior bordering [46, Chapter 9], but the following result has the advantage of providing an explicit embedding into a well-known class of matrices.
Lemma 5.20** ([13, Lemma 3.2]).**
Any totally positive matrix may be realized as the leading principal submatrix of a positive multiple of a rectangular totally positive generalized Vandermonde matrix of any larger size.
Remark 5.21** ([13, Remark 3.4]).**
Lemma 5.20 can be strengthened to the following completion result: given integers , , an arbitrary matrix occurs as a minor in a totally positive matrix at any given position (that is, in a specified pair of rows and pair of columns) if and only if is totally positive.
The other tool which will be vital to our deliberations is the following result of Whitney.
Theorem 5.22** ([139, Theorem 1]).**
The set of totally positive matrices is dense in the set of totally non-negative matrices.
With these tools in hand, we are able to provide a complete classification of the entrywise TP preservers of each fixed size, akin to the results in the preceding section.
Theorem 5.23** ([13, Theorem 3.1]).**
Let be a function and let , where and are positive integers. The following are equivalent.
(1) preserves total positivity entrywise on matrices.
(2) preserves total positivity entrywise on matrices.
(3) The function satisfies
(a) ;
(b) for some and some ;
(c) for some and some ;
(d) for some .
Proof.
We sketch the proof that when and . For the first case, working with the matrix
[TABLE]
shows that takes positive values and is increasing, so is Borel measurable and continuous except on a countable set. We now fix a point of continuity and use the totally positive matrices
[TABLE]
to show that
[TABLE]
for all , . Hence is such that
[TABLE]
so is a measurable solution of the Cauchy functional equation. It follows that for some . As , and so , is increasing, we must have .
Finally, if , then the embedding of Lemma 5.20 and the previous working give positive constants and such that . In particular, the function admits a continuous extension to . The density of TP in TN, that is, Theorem 5.22, implies that preserves TN entrywise on matrices. Theorem 5.18 now establishes the form of , and so of . ∎
We may consider a version of the previous theorem which restricts to the case of totally positive matrices which are symmetric. A moment’s thought leads to the consideration of a symmetric version of the matrix completion problem.
Lemma 5.24** ([13, Lemma 3.7]).**
Any symmetric totally positive matrix occurs as the leading principal submatrix of a totally positive Hankel matrix, where can be taken arbitrarily large.
Proof.
It suffices to embed the matrix
[TABLE]
into such a Hankel matrix. It is an exercise to prove the existence of a continuous function such that
[TABLE]
and then setting
[TABLE]
gives a Hankel matrix as required. The verification of total positivity may be made with the help of Andréief’s identity,
[TABLE]
where and , with
[TABLE]
together with the total positivity of generalized Vandermonde matrices. ∎
We remark here that the preceding result can be further strengthened to have the symmetric TP matrix occur in any “symmetric” position inside a larger square symmetric TP Hankel matrix, in the spirit of Remark 5.21. See [13, Theorem 3.9] for details.
We now state the symmetric version of Theorem 5.23.
Theorem 5.25** ([13, Theorem 3.6]).**
Let and let be a positive integer. The following are equivalent.
(1) preserves total positivity entrywise on symmetric matrices.
(2) The function satisfies
(a) ;
(b) is positive, increasing, and multiplicatively mid-convex, that is, for all , , so continuous;
(c) for some and some ;
(d) for some and some ;
(e) for some .
Although we have developed the key ingredients to prove this theorem, we content ourselves with referring the interested reader to [13].
6. Power functions
A natural approach to tackle the problem of characterizing entrywise preservers in fixed dimension is to examine if some natural simple functions preserve positivity. One such family is the collection of power functions, for . Characterizing which fractional powers preserve positivity entrywise has recently received much attention in the literature. One of the first results in this area reads as follows.
Theorem 6.1** (FitzGerald and Horn [51, Theorem 2.2]).**
Let and let $A = [a_{jk}] \in \mathcal{P}_N(\mathbb{R}_+)$. For any real number , the matrix is positive semidefinite. If and is not an integer, then there exists a matrix $A \in \mathcal{P}_N\bigl((0,\infty)\bigr)$ such that is not positive semidefinite.
Theorem 6.1 shows that every real power entrywise preserves positivity, while no non-integers in do so. This surprising “phase transition” phenomenon at the integer is referred to as the “critical exponent” for preserving positivity. Studying which powers entrywise preserve positivity is a very natural and interesting problem. It also often provides insights to determine which general functions preserve positivity. For example, Theorem 6.1 suggests that functions that entrywise preserve positivity on should have a certain number of non-negative derivatives, which is indeed the case by Theorem 3.4.
Outline of the proof.
The first part of Theorem 6.1 relies on an ingenious idea that we now sketch. The result is obvious for . Let us assume it holds for some , let , and let . Write in block form,
[TABLE]
where has dimension and . Assume without loss of generality that (as the case where follows from the induction hypothesis) and let . Then , where is the Schur complement of in . Hence is positive semidefinite. By the fundamental theorem of calculus, for any , ,
[TABLE]
Using the above expression entrywise, we obtain
[TABLE]
Observe that the entries of the last row and column of the matrix are all zero. Using the induction hypothesis and the Schur product theorem, it follows that the integrand is positive semidefinite, and therefore so is .
The converse implication in Theorem 6.1 is shown by considering a matrix of the form , where , , the coordinates of are distinct, and is small. Recall this is the exact same class of matrices that was useful in proving the Horn–Loewner theorem 3.4 as well as its strengthening in Theorem 3.10. The original proof, by FitzGerald and Horn [51], used , while a later proof by Fallat, Johnson and Sokal [48] used the same argument, now with ; the motivation in [48] was to work with Hankel matrices, and the matrix is indeed Hankel. That said, the argument of FitzGerald and Horn works more generally than both of these proofs, to show that, for any non-integral power , , and vector with distinct coordinates, there exists such that is not positive semidefinite. ∎
In her 2017 paper [81], Jain provided a remarkable strengthening of the result mentioned at the end of the previous proof, which removes the dependence on entirely.
Theorem 6.2** (Jain [81]).**
Let
[TABLE]
where and has distinct entries. Then is positive semidefinite for if and only if .
Jain’s result identifies a family of rank-two positive semidefinite matrices, every one of which encodes the classification of powers preserving positivity over all of $\mathcal{P}_N\bigl((0,\infty)\bigr)$. In a sense, her rank-two family is the culmination of previous work on positivity-preserving powers for $\mathcal{P}_N\bigl((0,\infty)\bigr)$, since for rank-one matrices, every entrywise power preserves positivity: $(uu^T)^{\circ\alpha} = u^{\circ\alpha}(u^{\circ\alpha})^T$.
An immediate consequence of these results is the classification of the entrywise powers preserving positivity on the Hankel TN matrices. Recall from the results in Section 5.2 (including Lemma 5.12(4)) that one expects a strong correlation between this classification and the one in Theorem 6.1.
Corollary 6.3**.**
Given , the following are equivalent for an exponent .
- (1) The entrywise power function preserves total non-negativity on (see Lemma 5.12).
- (2) The entrywise map preserves positivity on .
- (3) The entrywise map preserves positivity on the matrices in $HTN_N\bigl((0,\infty)\bigr)$ of rank at most two.
- (4) The exponent .
Proof.
That and follow from Theorems 6.1 and 5.11, respectively. That and are obvious, and Jain’s theorem 6.2 shows that . ∎
A problem related to the above study of entrywise powers preserving positivity is to characterize infinitely divisible matrices. This problem was also considered by Horn in [80]. Recall that a complex matrix is said to be infinitely divisible if for all . Denote the incidence matrix of by :
[TABLE]
Also, let
[TABLE]
and note that is the kernel of if is positive semidefinite.
Assuming the arguments of the entries are chosen in a consistent way [80], we let
[TABLE]
with the usual convention .
Theorem 6.4** (Horn [80, Theorem 1.4]).**
An matrix is infinitely divisible if and only if (a) is Hermitian, with for all , (b) , and (c) is positive semidefinite on .
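For matrices with positive entries, infinite divisibility can be probed numerically by checking a few entrywise powers. The sketch below assumes NumPy and uses, as an illustration not taken from the text above, the Cauchy matrix $\bigl(1/(x_j + x_k)\bigr)$ with positive nodes, which is a standard example of an infinitely divisible matrix. As a contrast it uses the matrix $\mathbf{1}\mathbf{1}^T + \epsilon u u^T$, which is positive semidefinite but whose entrywise square root is not. A finite check of this kind is only a sanity test, not a proof.

```python
import numpy as np

def is_psd(S, tol=1e-12):
    """Numerical positive semidefiniteness test for a real symmetric matrix."""
    return bool(np.linalg.eigvalsh(S)[0] >= -tol)

# Cauchy matrix with positive nodes (illustrative choice of nodes).
x = np.array([1.0, 2.0, 3.0, 5.0])
C = 1.0 / (x[:, None] + x[None, :])

alphas = [0.05, 0.1, 0.5, 1.0, 2.5]
cauchy_results = {a: is_psd(C ** a) for a in alphas}

# Contrast: a positive semidefinite matrix that is not infinitely divisible.
u = np.array([1.0, 2.0, 3.0])
B = np.ones((3, 3)) + 0.01 * np.outer(u, u)
b_is_psd = is_psd(B)
b_sqrt_is_psd = is_psd(B ** 0.5)

print(cauchy_results, b_is_psd, b_sqrt_is_psd)
```

Here `C ** a` is the entrywise power, not the matrix power, because NumPy arrays apply `**` elementwise.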
6.1. Sparsity constraints
Theorem 6.1 was recently extended to more structured matrices. Given and a graph on the finite vertex set , we define the cone of positive-semidefinite matrices with zeros according to :
[TABLE]
Note that if , then the entry is unconstrained; in particular, it is allowed to be $0$. Consequently, the cone is a closed subset of .
A natural refinement of Theorem 6.1 involves studying powers that entrywise preserve positivity on . In that case, the flavor of the problem changes significantly, with the discrete structure of the graph playing a prominent role.
Definition 6.5** (Guillot–Khare–Rajaratnam [69]).**
Given a simple graph , let
[TABLE]
Define the Hadamard critical exponent of to be
[TABLE]
Notice that, by Theorem 6.1, for every graph $G$ on $n$ vertices, the critical exponent exists and lies in $[\omega(G) - 2, n - 2]$, where $\omega(G)$ is the size of the largest complete subgraph of $G$, that is, the clique number. Computing such critical exponents is natural and highly non-trivial.
FitzGerald and Horn proved that the critical exponent of the complete graph $K_n$ is $n - 2$ for all $n \geq 2$ (Theorem 6.1), while it follows from [70, Proposition 4.2] that the critical exponent of every tree is $1$. For a general graph, it is not a priori clear what the critical exponent is or how to compute it. A natural family of graphs that encompasses both complete graphs and trees is that of chordal graphs. Recall that a graph is chordal if it does not contain an induced cycle of length $4$ or more. Chordal graphs feature extensively in many areas, such as the theory of graphical models [93], and in problems involving positive-definite completions (see [130]). Examples of important chordal graphs include trees, complete graphs, Apollonian graphs, band graphs, and split graphs.
Recently, Guillot, Khare, and Rajaratnam [69] were able to compute the complete set of entrywise powers preserving positivity on for all chordal graphs . Here, the critical exponent can be described purely combinatorially.
Theorem 6.6** (Guillot–Khare–Rajaratnam [69]).**
Let denote the complete graph with one edge removed, and let be a finite simple connected chordal graph. The critical exponent for entrywise powers preserving positivity on is , where is the largest integer such that or is an induced subgraph of . More precisely, the set of entrywise powers preserving is , with as before.
The set of entrywise powers preserving positivity was also computed in [69] for cycles and bipartite graphs.
Theorem 6.7** (Guillot–Khare–Rajaratnam [69]).**
The critical exponent of cycles and bipartite graphs is $1$.
Surprisingly, the critical exponent does not depend on the size of the graph for cycles and bipartite graphs. In particular, it is striking that any power greater than $1$ preserves positivity for families of dense graphs such as bipartite graphs. Such a result is in sharp contrast to the general case, where there is no underlying structure of zeros. That small powers can preserve positivity is important for applications, since such entrywise procedures are often used to regularize positive definite matrices, such as covariance or correlation matrices, where the goal is to minimally modify the entries of the original matrix (see [94, 143] and Chapter 7 below).
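The smallest interesting case can be checked by hand and by machine. The sketch below assumes NumPy and takes the path on three vertices, which is both a tree and a bipartite graph; the matrix and the value $0.7$ are illustrative choices. A symmetric matrix with unit diagonal, off-diagonal entries $a$ and $b$ along the path, and a zero in the remaining position has eigenvalues $1$ and $1 \pm \sqrt{a^2 + b^2}$, so an entrywise power $\alpha$ preserves positivity exactly when $a^{2\alpha} + b^{2\alpha} \leq 1$.

```python
import numpy as np

def min_eigenvalue(S):
    """Smallest eigenvalue of a real symmetric matrix."""
    return float(np.linalg.eigvalsh(S)[0])

# Positive definite matrix with zeros according to the path 0 - 1 - 2
# (vertices are 0-indexed here; a**2 + b**2 = 0.98 < 1).
a = b = 0.7
A = np.array([[1.0, a,   0.0],
              [a,   1.0, b  ],
              [0.0, b,   1.0]])

for alpha in (0.5, 1.0, 1.5):
    # 0 ** alpha == 0 for alpha > 0, so the zero pattern is preserved.
    powered = np.power(A, alpha)
    print(alpha, min_eigenvalue(powered))
```

The power $\alpha = 0.5$ produces a negative eigenvalue, while $\alpha = 1$ and $\alpha = 1.5$ do not, in line with a critical exponent of $1$ for this graph.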
For a general graph, the problem of computing the set or the critical exponent remains open. We now outline some other natural open problems in the area.
Problems.
- (1) In every currently known case (Theorems 6.6, 6.7), is equal to , where is the largest integer such that or is an induced subgraph of . Is the same true for every graph ?
- (2) Is always an integer? Can this be proved without computing explicitly?
- (3) Recall that every chordal graph is perfect. Can the critical exponent be calculated for other broad families of graphs such as the family of perfect graphs?
6.2. Rank constraints and other Loewner properties
Another approach to generalize Theorem 6.1 is to examine other properties of entrywise functions such as monotonicity, convexity, and super-additivity (with respect to the Loewner semidefinite ordering) [78, 68]. Given a set , recall that a function is
- positive on with respect to the Loewner ordering if for all ;
- monotone on with respect to the Loewner ordering if for all , such that ;
- convex on with respect to the Loewner ordering if for all and all , such that ;
- super-additive on with respect to the Loewner ordering if for all , for which is defined.
The following relations between the first three notions were obtained by Hiai.
Theorem 6.8** (Hiai [78, Theorem 3.2]).**
Let for some .
- (1) For each , the function is monotone on if and only if is differentiable on and is positive on .
- (2) For each , the function is convex on if and only if is differentiable on and is monotone on .
Power functions satisfying any of the above four properties have been characterized by various authors. In recent work, Hiai [78] has extended Theorem 6.1 by considering the odd and even extensions of the power functions to . For , the even and odd extensions to of the power function are defined to be and . The first study of powers for which preserves positivity entrywise on was carried out by Bhatia and Elsner [18]. Subsequently, Hiai studied the power functions and that preserve Loewner positivity, monotonicity, and convexity entrywise, and showed for positivity preservers that the same phase transition occurs at for and , as demonstrated in [51]. The work was generalized in [68] to matrices satisfying rank constraints.
Definition 6.9**.**
Fix non-negative integers and , and a set . Let denote the subset of matrices in that have rank at most , and let
[TABLE]
Similarly, let , and denote sets of the entrywise powers preserving Loewner properties on or , where .
The sets of entrywise powers preserving the above notions are given in the table below (see [68, Theorem 1.2]).
7. Motivation from statistics
The study of entrywise functions preserving positivity has recently attracted renewed attention due to its importance in the estimation and regularization of covariance/correlation matrices. Recall that the covariance between two random variables and is given by
[TABLE]
where denotes the expectation of . In particular, , the variance of . The covariance matrix of a random vector , is the matrix . Covariance matrices are a fundamental tool that measure linear dependencies between random variables. In order to discover relations between variables in data, statisticians and applied scientists need to obtain estimates of the covariance matrix from observations , …, of . A traditional estimator of is the sample covariance matrix given by
[TABLE]
where is the average of the observations. In the case where the random vector has a multivariate normal distribution with mean and covariance matrix , one can show that and are the maximum likelihood estimators of and , respectively [3, Chapter 3]. It is not difficult to show that is an unbiased estimator of . More generally, under weak assumptions, one can show that the distribution of is asymptotically normal as . The exact description of the limiting distribution depends on the moments and the cumulants of (see [20, Chapter 6.3]). For example, in the two-dimensional case, we have the following result.
Let denote the -dimensional normal distribution with mean and covariance matrix .
Proposition 7.1** (see [20, Example 6.4]).**
Let , …, be an independent and identically distributed sample from a bivariate vector with mean and finite fourth-order moments, and let be as in Equation (7.1). Then
[TABLE]
where is the symmetric matrix
[TABLE]
and and .
In traditional statistics, one usually assumes the number of samples is large enough for asymptotic results such as the one above to apply. In covariance estimation, one typically requires a sample size at least a few times the number of variables for that to apply. In such a case, the sample covariance matrix provides a good approximation of the true covariance matrix . However, this ideal setting is rarely seen nowadays. Indeed, our systematic and automated way of collecting data today yields datasets where the number of variables is often orders of magnitude larger than the number of instances available for study [41]. Classical statistical methods were not designed and are not suitable to analyze data in such settings. Developing new methodologies that are adapted to modern high-dimensional problems is the object of active research. In the case of covariance estimation, several strategies have been proposed to replace the traditional sample covariance matrix estimator . These approaches typically leverage low-dimensional structures in the data (low rank, sparsity, …) to obtain reasonable covariance estimates, even when the sample size is small compared to the dimension of the problem (see [111] for a detailed description of such techniques). One such approach involves applying functions to the entries of sample covariance matrices to improve their properties (see e.g. [6, 19, 44, 75, 76, 94, 114, 143]). For example, hard thresholding a matrix entails setting to zero the entries of the matrix that are smaller in absolute value than a prescribed value (thinking the corresponding variables are independent, for example). Letting
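\[ f^H_\epsilon(x) := \begin{cases} x & \text{if } |x| \geq \epsilon, \\ 0 & \text{otherwise}, \end{cases} \]
for a threshold $\epsilon > 0$, hard thresholding is equivalent to applying the function $f^H_\epsilon$ entrywise to the entries of the matrix. Another popular example, first studied in the context of wavelet shrinkage [42], is soft thresholding, where $f^H_\epsilon$ is replaced by
\[ f^S_\epsilon(x) := \operatorname{sgn}(x) \, \max(|x| - \epsilon, 0). \]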
Soft thresholding not only sets small entries to zero, it also shrinks all the other entries continuously towards zero. Several other thresholding and shrinkage procedures were also recently proposed in the context of covariance estimation (see [49] and the references therein).
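The two procedures are easy to state in code. The following is a minimal sketch, assuming NumPy; the function names, the convention that entries with absolute value strictly below the threshold are set to zero, and the example matrix are illustrative assumptions rather than notation fixed by the text. The example also shows that hard thresholding applied to a positive definite matrix can produce a matrix with a negative eigenvalue.

```python
import numpy as np

def hard_threshold(x, eps):
    """Set entries with |x| < eps to zero; leave the others unchanged."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < eps, 0.0, x)

def soft_threshold(x, eps):
    """Shrink every entry towards zero by eps, clipping at zero."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

# A positive definite correlation-type matrix (illustrative values).
S = np.array([[1.0, 0.8, 0.5],
              [0.8, 1.0, 0.8],
              [0.5, 0.8, 1.0]])

S_hard = hard_threshold(S, 0.6)   # only the two 0.5 entries are set to zero
print(np.linalg.eigvalsh(S)[0], np.linalg.eigvalsh(S_hard)[0])
print(soft_threshold([-1.0, 0.2, 0.9], 0.5))
```

After thresholding, the matrix is tridiagonal with eigenvalues $1$ and $1 \pm 0.8\sqrt{2}$, so its smallest eigenvalue is negative even though `S` itself is positive definite.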
Compared to other techniques, the above procedure has several advantages. Firstly, the resulting estimators are often significantly more precise than the sample covariance matrices. Secondly, applying a function to the entries of a matrix is very simple and not computationally intensive. The procedure can therefore be performed in very high dimensions and in real-time applications. This is in contrast to several other techniques that require solving optimization problems and often become too intensive to be used in modern applications. A downside of the entrywise calculus, however, is that the positive definiteness of the resulting matrices is not guaranteed. As the parameter space of covariance matrices is the cone of positive definite matrices, it is critical that the resulting matrices be positive definite for the technique to be useful and widely applicable. The problem of characterizing positivity preservers thus has an immediate impact in the area of covariance estimation by providing useful functions that can be applied entrywise to covariance estimates in order to regularize them.
Several characterizations of when thresholding procedures preserve positivity have recently been obtained.
7.1. Thresholding with respect to a graph
In [72], the concept of thresholding with respect to a graph was examined. In this context, the elements to threshold are encoded in a graph with . If is a matrix, we denote by the matrix with entries
[TABLE]
We say that is the matrix obtained by thresholding with respect to the graph . The main result of [72] characterizes the graphs for which the corresponding thresholding procedure preserves positivity. Denote by the set of real symmetric positive definite matrices and by the subset of positive definite matrices contained in (see Equation (6.1)).
Theorem 7.2** (Guillot–Rajaratnam [72, Theorem 3.1]).**
The following are equivalent:
- (1) for all ;
- (2) , where , …, are disconnected and complete components of .
The implication $(2) \Rightarrow (1)$ of the theorem is intuitive and straightforward, since principal submatrices of positive definite matrices are positive definite. That $(1) \Rightarrow (2)$ may come as a surprise, though, and shows that indiscriminate or arbitrary thresholding of a positive definite matrix can quickly lead to loss of positive definiteness.
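Both directions can be illustrated on a $3 \times 3$ example. The sketch below assumes NumPy, 0-indexed vertices, and a hypothetical helper `threshold_by_graph` that keeps the diagonal and the entries indexed by edges and sets all other entries to zero. Thresholding with respect to a disjoint union of complete graphs keeps the matrix positive definite, whereas thresholding the same matrix with respect to a path does not.

```python
import numpy as np

def threshold_by_graph(A, edges):
    """Keep the diagonal and entries (j, k) with {j, k} an edge; zero the rest."""
    A = np.asarray(A, dtype=float)
    mask = np.eye(A.shape[0], dtype=bool)
    for j, k in edges:
        mask[j, k] = mask[k, j] = True
    return np.where(mask, A, 0.0)

def min_eigenvalue(S):
    """Smallest eigenvalue of a real symmetric matrix."""
    return float(np.linalg.eigvalsh(S)[0])

# A positive definite matrix (illustrative values).
A = np.array([[1.0,  0.75, 0.5 ],
              [0.75, 1.0,  0.75],
              [0.5,  0.75, 1.0 ]])

# Disjoint union of complete graphs: the edge {0, 1} and the isolated vertex 2.
A_blocks = threshold_by_graph(A, [(0, 1)])
# Path 0 - 1 - 2: connected but not complete.
A_path = threshold_by_graph(A, [(0, 1), (1, 2)])

print(min_eigenvalue(A), min_eigenvalue(A_blocks), min_eigenvalue(A_path))
```

The block-diagonal result has eigenvalues $0.25$, $1$ and $1.75$, while the path-thresholded matrix has smallest eigenvalue $1 - 0.75\sqrt{2} < 0$.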
Theorem 7.2 also generalizes to matrices that already have zero entries. In that case, the characterization of the positivity preservers remains essentially the same.
Theorem 7.3** (Guillot–Rajaratnam [72, Theorem 3.3]).**
Let be an undirected graph and let be a subgraph of , so that . Then is positive definite for every if and only if , where , …, are disconnected induced subgraphs of .
7.2. Hard and soft thresholding
Theorems 7.2 and 7.3 address the case where positive definite matrices are thresholded with respect to a given pattern of entries, regardless of the magnitude of the entries of the original matrix. The more natural case where the entries are hard or soft-thresholded was studied in [72, 73]. In applications, it is uncommon to threshold the diagonal entries of estimated covariance matrices, as the diagonal contains the variance of the underlying variables. Hence, for a given function and a real matrix , we let the matrix be defined by setting
[TABLE]
Theorem 7.4** (Guillot–Rajaratnam [72, Theorem 3.6]).**
Let be a connected undirected graph with vertices. The following are equivalent.
- (1) There exists such that, for every , we have .
- (2) For every and every , we have .
- (3) $G$ is a tree.
The case of soft-thresholding was considered in [73]. Surprisingly, the characterization of the thresholding levels that preserve positivity is exactly the same as in the case of hard-thresholding.
Theorem 7.5** (Guillot–Rajaratnam [73, Theorem 3.2]).**
Let be a connected graph with vertices. Then the following are equivalent:
- (1) There exists such that for every , we have .
- (2) For every and every , we have .
- (3) $G$ is a tree.
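The tree case of the two theorems above can be sanity-checked numerically. The sketch below assumes NumPy, applies the thresholding functions to off-diagonal entries only (leaving the diagonal untouched, as described at the start of this subsection), and uses a positive definite matrix with zeros according to the path on three vertices. The helper names, the matrix, and the grid of thresholding levels are illustrative assumptions; a grid search is evidence, not a proof.

```python
import numpy as np

def hard(x, eps):
    """Hard thresholding: entries with |x| < eps become zero."""
    return np.where(np.abs(x) < eps, 0.0, x)

def soft(x, eps):
    """Soft thresholding: shrink towards zero by eps, clipping at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def apply_off_diagonal(f, A, eps):
    """Apply f entrywise to the off-diagonal entries and keep the diagonal."""
    A = np.asarray(A, dtype=float)
    out = np.array(f(A, eps), dtype=float)   # fresh, writable copy
    np.fill_diagonal(out, np.diag(A))
    return out

def is_positive_definite(S):
    return bool(np.linalg.eigvalsh(S)[0] > 0)

# Positive definite matrix with zeros according to the path 0 - 1 - 2 (a tree).
A = np.array([[1.0, 0.6, 0.0],
              [0.6, 1.0, 0.5],
              [0.0, 0.5, 1.0]])

levels = np.linspace(0.05, 0.95, 19)
hard_ok = all(is_positive_definite(apply_off_diagonal(hard, A, e)) for e in levels)
soft_ok = all(is_positive_definite(apply_off_diagonal(soft, A, e)) for e in levels)
print(hard_ok, soft_ok)
```

For this matrix the outcome can also be seen directly: thresholding never increases the absolute values of the two off-diagonal entries, so the sum of their squares stays below $1$ and the smallest eigenvalue stays positive.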
An extension of Schoenberg’s theorem (Theorem 2.12) to the case where the function is only applied to the off-diagonal entries of the matrix was also obtained in [73].
Theorem 7.6** (Guillot–Rajaratnam [73, Theorem 4.21]).**
Let and . The matrix is positive semidefinite for all $A \in \mathcal{P}_n\bigl((-\rho,\rho)\bigr)$ and all if and only if , where
- (1) is analytic on the disc ;
- (2) ;
- (3) is absolutely monotonic on .
When , the only functions satisfying the above conditions are the affine functions for .
7.3. Rank and sparsity constraints
An explicit and useful characterization of entrywise functions preserving positivity on for a fixed still remains out of reach as of today. Motivated by applications in statistics, the authors in [70, 71] examined the cases where the matrices in satisfy supplementary rank and sparsity constraints that are common in applications.
Observe that the sample covariance matrix (Equation (7.1)) has rank at most , where is the number of samples used to compute it. Moreover, as explained in Chapter 7, it is common in modern applications that is much smaller than the dimension . Hence, when studying the regularization approach described in Chapter 7, it is natural to consider positive semidefinite matrices with rank bounded above.
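The rank deficiency is easy to observe on simulated data. The sketch below assumes NumPy and draws a small Gaussian sample with many more variables than observations; the sizes, the seed, and the tolerance are illustrative. It uses `np.cov`, which centers the data and divides by the sample size minus one; the normalization may differ from the one in Equation (7.1) but does not affect the rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 20                      # n observations of a p-dimensional vector
X = rng.standard_normal((n, p))   # each row is one observation

# p x p sample covariance matrix (columns are variables).
S = np.cov(X, rowvar=False)

# Count eigenvalues that are clearly non-zero; the tolerance is an assumption.
eigenvalues = np.linalg.eigvalsh(S)
numerical_rank = int(np.sum(eigenvalues > 1e-10))
print(S.shape, numerical_rank)
```

Because the observations are centered before the outer products are summed, the rank is in fact at most the sample size minus one, so a $20 \times 20$ estimate built from five observations has at most four non-zero eigenvalues.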
An immediate application of Schoenberg’s theorem on spheres (see Equation (2.3)) provides a characterization of entrywise positivity preservers of correlation matrices of all dimensions, with rank bounded above by . Recall that a correlation matrix is the covariance matrix of a random vector where each variable has variance , so is a positive semidefinite matrix with diagonal entries equal to . As in Equation (2.3), we denote the ultraspherical orthogonal polynomials by .
Theorem 7.7** (Reformulation of [125, Theorem 1]).**
Let and let . The following are equivalent.
- (1) for all correlation matrices $A \in \mathcal{P}_N\bigl([-1,1]\bigr)$ with rank no more than and all .
- (2) with for all and .
Proof.
The result follows from [125, Theorem 1] and the observation that correlation matrices of rank at most are in correspondence with Gram matrices of vectors in . ∎
In order to approach the case of matrices of a fixed dimension, we introduce some notation.
Definition 7.8**.**
Let . Define to be the set of symmetric matrices with entries in . Let denote the rank of a matrix . We define:
[TABLE]
The main result in [71] provides a characterization of entrywise functions mapping into .
Theorem 7.9** (Guillot–Khare–Rajaratnam [71, Theorem B]).**
Let and or . Fix integers , , and . Suppose . The following are equivalent.
- (1) for all ;
- (2) for some and some such that
[TABLE]
Similarly, if and only if satisfies (2) and for all . Moreover, if and , then the assumption that is not required.
Notice that Theorem 7.9 is a fixed-dimension result with rank constraints. This may be considered a refinement of a similar, dimension-free result with rank constraints shown in [5], in which the authors arrive at the same conclusion as in part (2) above. We compare the two settings: in [5], (a) the hypotheses held for all dimensions rather than in a fixed dimension; (b) the test matrices were a larger set in each dimension, compared to just the positive matrices considered in Theorem 7.9; (c) the test matrices did not consist only of rank-one matrices, similar to Theorem 7.9; and (d) the test functions in the dimension-free case were assumed to be measurable, rather than as in the fixed-dimension case. Thus, Theorem 7.9 is (a refinement of) the fixed-dimension case of the first main result in [5].
Footnote 10. We also point out the second main result in loc. cit., that is, [5, Theorem 2], which classifies all continuous entrywise maps that obey similar rank constraints in all dimensions. Such maps are necessarily of the form , where the exponents and are non-negative integers. This should immediately remind the reader of Rudin’s conjecture in the ‘dimension-free’ case, and its resolution by Herz; see Theorem 3.2.
The implication in Theorem 7.9 is clear. Indeed, let and . Then
[TABLE]
and is a multinomial coefficient. Note that there are exactly terms in the previous summation. Therefore , and so easily follows from . The proof that is much more challenging; see [71] for details.
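The counting argument can be checked on a small case. The sketch below assumes NumPy and relies on the standard fact that the $i$-th entrywise power of a sum of $l$ rank-one matrices expands into one rank-one term per monomial of degree $i$ in $l$ variables, of which there are $\binom{i + l - 1}{l - 1}$. The vectors, the rank $l = 2$, the power $i = 2$, and the rank tolerance are illustrative choices.

```python
import numpy as np
from math import comb

# A rank-two matrix A = u u^T + v v^T (illustrative vectors).
u = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
v = np.array([1.0, -1.0, 2.0, -2.0, 3.0])
A = np.outer(u, u) + np.outer(v, v)
l = 2

# Entrywise square: (u∘u)(u∘u)^T + 2 (u∘v)(u∘v)^T + (v∘v)(v∘v)^T.
i = 2
H = A ** i

bound = comb(i + l - 1, l - 1)                   # number of monomials: 3
rank_A = np.linalg.matrix_rank(A, tol=1e-8)
rank_H = np.linalg.matrix_rank(H, tol=1e-8)
print(rank_A, rank_H, bound)
```

For these vectors the three entrywise products are linearly independent, so the bound is attained: the entrywise square of the rank-two matrix has rank three.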
In [70], the authors focus on the case where sparsity constraints are imposed on the matrices instead of rank constraints. Positive semidefinite matrices with zeros according to graphs arise naturally in many applications. For example, in the theory of Markov random fields in probability theory ([93, 140]), the nodes of a graph represent components of a random vector, and edges represent the dependency structure between nodes. Thus, absence of an edge implies marginal or conditional independence between the corresponding random variables, and leads to zeros in the associated covariance or correlation matrix (or its inverse). Such models therefore yield parsimonious representations of dependency structures. Characterizing entrywise functions preserving positivity for matrices with zeros according to a graph is thus of great interest for modern applications. Obtaining such characterizations is, however, much more involved than the original problem considered by Schoenberg, as one has to enforce and maintain the sparsity constraint. The problem of characterizing functions preserving positivity for sparse matrices is also intimately linked to problems in spectral graph theory and many other problems (see e.g. [79, 1, 105, 31]).
As before, for a given graph on the finite vertex set , we denote by the set of positive-semidefinite matrices with entries in and zeros according to , as in (6.1). Given a function and , denote by the matrix such that
[TABLE]
The first main result in [70] is an explicit characterization of the entrywise positive preservers of for any collection of trees (other than copies of ). Following Vasudeva’s classification for in Theorem 4.1, trees are the only other graphs for which such a classification is currently known.
Theorem 7.10** (Guillot–Khare–Rajaratnam [70, Theorem A]).**
Suppose for some , and . Let be a tree with at least vertices, and let denote the path graph on vertices. The following are equivalent.
- (1) for every ;
- (2) for all trees and all matrices ;
- (3) for every ;
- (4) The function satisfies
[TABLE]
and is super-additive on , that is,
[TABLE]
The implication was further extended to all chordal graphs: it is the following result with and .
Theorem 7.11** (Guillot–Khare–Rajaratnam [69]).**
Let be a chordal graph with a perfect elimination ordering of its vertices . For all , denote by the induced subgraph on formed by , so that the neighbors of in form a clique. Define to be the clique number of , and let
[TABLE]
If is any function such that preserves positivity on and for all and , then preserves positivity on . [Here, denotes the matrices in of rank at most one.]
See [69] for other sufficient conditions for a general entrywise function to preserve positivity on for chordal.
To state the final result in this section, recall that Schoenberg’s theorem (Theorem 2.12) shows that entrywise functions preserving positivity for all matrices (that is, according to the family of complete graphs for ) are absolutely monotonic on the positive axis. It is not clear if functions satisfying (7.4) and (7.5) in Theorem 7.10 are necessarily absolutely monotonic, or even analytic. As shown in [70, Proposition 4.2], the critical exponent (see Definition 6.5) of every tree is . Hence, functions satisfying (7.4) and (7.5) do not need to be analytic. The second main result in [70] demonstrates that even if the function is analytic, it can in fact have arbitrarily long strings of negative Taylor coefficients.
Theorem 7.12** (Guillot–Khare–Rajaratnam [70, Theorem B]).**
There exists an entire function such that
- (1) for every ;
- (2) The sequence contains arbitrarily long strings of negative numbers;
- (3) For every tree , for every $A \in \mathcal{P}_G(\mathbb{R}_+)$.
In particular, if denotes the maximum degree of the vertices of , then there exists a family of graphs and an entire function that is not absolutely monotonic, such that
- (1) ;
- (2) for every .
References
- [1] Jim Agler, J. William Helton, Scott McCullough, and Leiba Rodman. Positive semidefinite matrices with a given sparsity pattern. In Proceedings of the Victoria Conference on Combinatorial Matrix Analysis (Victoria, BC, 1987), volume 107, pages 101–149, 1988.
- [2] Naum Ilyich Akhiezer. The classical moment problem and some related questions in analysis. Translated by N. Kemmer. Hafner Publishing Co., New York, 1965.
- [3] Theodore W. Anderson. An introduction to multivariate statistical analysis. Wiley Series in Probability and Statistics. Wiley-Interscience (John Wiley & Sons), Hoboken, third edition, 2003.
- [4] Nima Arkani-Hamed, Jacob L. Bourjaily, Freddy Cachazo, Alexander B. Goncharov, Alexander Postnikov, and Jaroslav Trnka. Scattering amplitudes and the positive Grassmannian. Preprint, available at http://arxiv.org/abs/1212.5605, 2012.
- [5] Aharon Atzmon and Allan Pinkus. Rank restricting functions. Linear Algebra Appl., 372:305–323, 2003.
- [6] Zhi Dong Bai and Li-Xin Zhang. Semicircle law for Hadamard products. SIAM J. Matrix Anal. Appl., 29(2):473–495, 2007.
- [7] Victor S. Barbosa and Valdir Antonio Menegatto. Strictly positive definite kernels on compact two-point homogeneous spaces. Math. Inequal. Appl., 19(2):743–756, 2016.
- [8] Victor S. Barbosa and Valdir Antonio Menegatto. Strict positive definiteness on products of compact two-point homogeneous spaces. Integral Transforms Spec. Funct., 28(1):56–73, 2017.
