
On local analysis
Felipe Cucker
Dept. of Mathematics
City University of Hong Kong
Partially supported by a GRF grant from the Research Grants Council of the Hong Kong SAR (project number CityU 11302418).
Teresa Krick
Departamento de Matemática & IMAS
Univ. de Buenos Aires & CONICET
ARGENTINA
Corresponding author. Partially supported by grant CONICET-PIP2014-2016-112 20130100073CO.
**Abstract.** We extend to Gaussian distributions a result providing smoothed analysis estimates for condition numbers given as relativized distances to ill-posedness. We also introduce a notion of local analysis meant to capture the behavior of these condition numbers around a point.
**2010 Mathematics Subject Classification:** Primary 65Y20, Secondary 65F35.
**Keywords:** Conic condition number. Smoothed analysis. Local analysis.
1 Introduction
In the 1990s D. Spielman and S.H. Teng introduced the notion of smoothed analysis, in an attempt to give a more realistic analysis of the practical performance of an algorithm than those obtained through the use of worst-case or average-case analyses. In a nutshell, this new paradigm in probabilistic analysis interpolates between worst-case and average-case by considering the worst-case (over the data) of the average value (over possible random perturbations) of the analyzed quantity. See, for instance, [7] for an overview.
An example of this analysis, applied to the quantity $\ln\kappa(A)$, where $A$ is a square matrix and $\kappa(A)=\|A\|\,\|A^{-1}\|$, was provided by M. Wschebor in [10]. Wschebor showed that
[TABLE]
where here, and in what follows, $A\sim N(\overline{A},\sigma^2\mathrm{Id})$ indicates that $A$ is drawn from an isotropic Gaussian distribution centered at $\overline{A}$ with covariance matrix $\sigma^2\mathrm{Id}$. The behavior of the bound in the right-hand side of (1) shows two expected properties of a smoothed analysis:
(SA1)
When $\sigma\to 0$, the bound tends to the worst-case value of the analyzed quantity (there are no random perturbations of the input in this case).
(SA2)
When $\sigma\to\infty$, the bound tends to the average value of the analyzed quantity (the random perturbation is over all the input data in this case).
Indeed, the convergence of the bound to infinity when $\sigma\to 0$ is clear, and with it (SA1). And a result of A. Edelman [6] proves that the average of this quantity over standard Gaussian $n\times n$ matrices is $\ln n+\mathcal{O}(1)$, thus showing (SA2).
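The behavior described in (SA1) can be observed numerically. The following is a minimal Monte Carlo sketch, not taken from [10]: it assumes the analyzed quantity is $\ln\kappa(A)$ with $\kappa$ the ratio of the extreme singular values, and the function name `mean_log_kappa`, the singular center, and the sample size are illustrative choices.

```python
import numpy as np

def mean_log_kappa(center, sigma, samples=20000, seed=0):
    """Monte Carlo estimate of E[ln kappa(A)] for A ~ N(center, sigma^2 Id)."""
    rng = np.random.default_rng(seed)
    n = center.shape[0]
    # Perturb every entry independently with standard deviation sigma.
    A = center + sigma * rng.standard_normal((samples, n, n))
    # Singular values come back in decreasing order along the last axis.
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.mean(np.log(s[:, 0] / s[:, -1])))

# Assumed worst-case style center: a singular matrix of spectral norm 1.
center = np.diag([1.0, 1.0, 0.0])
estimates = {sigma: mean_log_kappa(center, sigma) for sigma in (1.0, 1e-1, 1e-2, 1e-3)}
for sigma, value in estimates.items():
    print(f"sigma = {sigma:g}: E[ln kappa] ~ {value:.3f}")
```

With a singular center the estimate keeps growing as the dispersion shrinks, which is the divergence toward the (infinite) worst-case value that (SA1) refers to.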
The main agenda of this paper is to introduce the notion of local analysis, which studies, locally at a base point, the average value of the analyzed quantity over possible random perturbations, without then taking the worst case over all input data. The benefit of such an analysis is that it provides information depending directly on the base point instead of assuming a worst case, as smoothed analysis does.
We illustrate this notion by developing it for a conic condition number. This is a condition number satisfying a Condition Number Theorem. We next describe more precisely this notion and its context.
In 1936 Eckart and Young [5] proved that for a square matrix $A$, $\kappa(A)=\|A\|/d(A,\Sigma)$, where $\Sigma$ is the set of non-invertible matrices and $d$ denotes distance. This result came to be known as the Condition Number Theorem, even though it was proved more than ten years before the introduction of condition numbers by Turing [8] and von Neumann and Goldstine [9]. In 1987 J. Demmel observed (and proved) that similar Condition Number Theorems hold true for the condition numbers of various problems [3]. More precisely, he showed that these condition numbers were either equal to or closely bounded by the (normalized) inverse of the distance to ill-posedness. That is, for an input data $a$ of the problem at hand, the condition number of $a$ for that problem is either equal to or closely bounded by
[TABLE]
where is an algebraic cone of ill-posed inputs. One year later, Demmel [4] derived general average analysis bounds for those (conic) condition numbers. These bounds depend only on the dimension of the ambient space, the codimension of , and its degree. He carried out this idea for the complex case and stated it for the real case (requiring to be complete intersection) based on an unpublished (and not findable anywhere) result by Ocneanu. The underlying probability distribution is the isotropic Gaussian on but it is easy to observe that the bounds hold as well for the uniform distribution on the unit sphere (or, equivalently, on any half-sphere, due to the equality ).
In [2] Demmel’s idea was extended to perform a smoothed analysis of the conic condition number in the case that is the zero set of a single real homogeneous polynomial in variables. For this analysis one considers the centers of the distributions in (as in (1)) and there are two natural choices for the distribution itself: a Gaussian supported in or a uniform on a spherical cap in . The uniform case is studied in [2], where the following bound is obtained for :
[TABLE]
where is the degree of and is the spherical cap of radius centered at which we endow with the uniform distribution. This bound recovers an average analysis in the particular case that the spherical cap is a half-sphere. That is,
(SA2’)
, is the average value of for , see [4].
A smoothed analysis of the conic condition number in the Gaussian case was still lacking, and it is one of the results we present in this paper, since it is strongly linked with our local analysis as we will see below. Theorem 4.1 shows that
[TABLE]
where is an explicit bound that satisfies (SA1) and (SA2). That is
[TABLE]
With respect to local analysis, the gist is to obtain bounds for the quantities
[TABLE]
where and is either the uniform distribution on the spherical cap or the Gaussian .
These bounds will be expressions where is either or depending on the underlying distribution, which should coincide with smoothed analysis bounds when . More precisely, if we denote by the result of replacing by in then we want the following:
(LA0)
has the same behavior as the smoothed analysis bound .
Furthermore, when we seek the following limiting behavior:
(LA1)
, the local complexity at .
(LA2)
in the Gaussian case, the average complexity.
(LA2’)
in the uniform case, the average complexity.
Indeed, we show that this is the case in Theorem 3.1 (uniform case) and Theorem 4.8 (Gaussian case).
Acknowledgments. We are grateful to Pierre Lairez for many useful discussions and, in particular, for pointing out to us an argument in Proposition 4.2.
2 Notations and preliminaries
In all that follows we consider the space endowed with the standard inner product and its induced norm . Within this space we have the unit sphere , and for we denote by the closed ball centered at with radius , and by
[TABLE]
the spherical cap in centered at with radius , that is the closed ball of radius around in with respect to the Riemannian distance in .
We will also refer to the sine distance in given by . Let denote the closed ball of radius with respect to around . This is the union of with where is such that .
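The two distances on the sphere used here can be sketched in a few lines. This is an illustration under the assumption that the sine distance is the sine of the Riemannian (angular) distance; the function names are hypothetical and not from the text.

```python
import numpy as np

def angle(a, b):
    """Riemannian (angular) distance between the directions of a and b, in [0, pi]."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.arccos(np.clip(a @ b, -1.0, 1.0)))

def sine_distance(a, b):
    """Sine of the angle between a and b, in [0, 1]."""
    return float(np.sin(angle(a, b)))

def in_cap(a, center, radius):
    """Membership in the spherical cap of angular radius `radius` around `center`."""
    return angle(a, center) <= radius

def in_sine_ball(a, center, radius):
    """Membership in the closed ball of the sine distance around `center`."""
    return sine_distance(a, center) <= radius

center = np.array([1.0, 0.0, 0.0])
a = np.array([np.cos(0.1), np.sin(0.1), 0.0])
theta = float(np.arcsin(0.2))  # angular radius whose sine is 0.2
print(in_cap(a, center, theta), in_cap(-a, center, theta), in_cap(-a, -center, theta))
print(in_sine_ball(a, center, 0.2), in_sine_ball(-a, center, 0.2))
```

The point `-a` lies outside the cap around `center` but inside the cap around `-center`, and both `a` and `-a` lie in the sine-distance ball, matching the description of that ball as a union of two antipodal caps.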
We will denote by the volume of . We recall (see [1, Prop. 2.19(a)]) that
[TABLE]
as well as [1, Cor. 2.20]
[TABLE]
and, for and , the bound (see [1, Lem. 2.34])
[TABLE]
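For orientation, the sphere and cap volumes can be evaluated numerically. The sketch below assumes the standard formulas (the volume of the unit sphere in dimension $N$ is $2\pi^{(N+1)/2}/\Gamma(\frac{N+1}{2})$, and the cap of angular radius $\sigma$ has volume proportional to $\int_0^\sigma \sin^{N-1}\theta\,d\theta$). The displayed identities recalled from [1] may be stated in a different form, and the function names are illustrative.

```python
import math

def sphere_volume(N):
    """Volume of the unit sphere S^N inside R^(N+1) (standard formula)."""
    return 2.0 * math.pi ** ((N + 1) / 2) / math.gamma((N + 1) / 2)

def cap_fraction(N, sigma, steps=20000):
    """Fraction of S^N covered by a cap of angular radius sigma (midpoint rule)."""
    def integral(upper):
        h = upper / steps
        return h * sum(math.sin((k + 0.5) * h) ** (N - 1) for k in range(steps))
    return integral(sigma) / integral(math.pi)

print(sphere_volume(1), sphere_volume(2))  # circle length and surface area of S^2
print(cap_fraction(2, math.pi / 3))        # equals (1 - cos(sigma)) / 2 on S^2
```

A half-sphere corresponds to `sigma = pi / 2` and always has fraction one half.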
The main object in this paper is a conic condition number on , i.e. a function given by
[TABLE]
where is the set of ill-posed inputs in , which we assume closed under scalar multiplication. We note that for all since . As is scale invariant we may restrict to data lying in where can also be expressed as
[TABLE]
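A concrete instance may help. The sketch below takes the ill-posed set to be the singular square matrices and uses the Frobenius norm, so that by the Eckart–Young theorem the distance to ill-posedness is the smallest singular value. This choice of example and the name `conic_condition` are assumptions made for illustration.

```python
import numpy as np

def conic_condition(A):
    """||A||_F divided by the Frobenius distance from A to the singular matrices."""
    s = np.linalg.svd(A, compute_uv=False)
    # Eckart-Young: the distance to the nearest singular matrix is sigma_min(A).
    return float(np.linalg.norm(A, "fro") / s[-1])

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
c = conic_condition(A)
c_scaled = conic_condition(3.7 * A)

# On the unit sphere the norm is 1, so only the distance to ill-posedness remains.
a = A / np.linalg.norm(A, "fro")
c_sphere = 1.0 / np.linalg.svd(a, compute_uv=False)[-1]
print(c, c_scaled, c_sphere)
```

The three printed values agree up to rounding, which illustrates scale invariance and the restriction to the sphere, and the value is at least 1 because the zero matrix is ill-posed.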
3 The uniform case
We endow with the uniform probability measure. A smoothed analysis for this measure is given in [1, Th. 21.1]. Assume that is contained in a real algebraic hypersurface, given as the zero set of a homogeneous polynomial of degree . Then, for all and , we have
[TABLE]
and
[TABLE]
where . Here $\ln$ denotes the natural logarithm. We observe that the equality above is due to the fact that for all and that .
The same observation applies to the following result.
Theorem 3.1.
Let be a conic condition number on with set of ill-posed inputs . Assume that is contained in a real algebraic hypersurface, given as the zero set of a homogeneous polynomial of degree . Let and . Then, for ,
[TABLE]
In particular, there is a uniform explicit bound –defined in (10) below– such that
[TABLE]
This bound satisfies (LA0), since as in (3), (LA1) and (LA2’).
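The dependence of a local average on the base point can be observed experimentally. The following sketch is only an illustration under stated assumptions: the ill-posed set is taken to be the singular $2\times 2$ matrices (viewed as points of the unit sphere in dimension 4), the cap is sampled by rejection from the uniform distribution on the sphere, and the names `log_conic_condition` and `local_average` are hypothetical.

```python
import numpy as np

def log_conic_condition(batch):
    """ln(||A||_F / sigma_min(A)) for a batch of square matrices."""
    s = np.linalg.svd(batch, compute_uv=False)
    fro = np.sqrt((batch ** 2).sum(axis=(1, 2)))
    return np.log(fro / s[:, -1])

def local_average(center, radius, draws=400_000, seed=0):
    """Average of ln C over the spherical cap of angular `radius` around `center`."""
    rng = np.random.default_rng(seed)
    n = center.shape[0]
    x = center.ravel() / np.linalg.norm(center)
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    g = rng.standard_normal((draws, n * n))
    points = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Rejection step: keep the points at angle <= radius from the base point.
    in_cap = points @ x >= np.cos(radius)
    values = log_conic_condition(points[in_cap].reshape(-1, n, n))
    return float(values.mean()), int(in_cap.sum())

avg_good, kept_good = local_average(np.eye(2), 0.3)            # well-posed base point
avg_bad, kept_bad = local_average(np.diag([1.0, 0.0]), 0.3)    # ill-posed base point
print(f"well-posed base point: {avg_good:.3f} ({kept_good} samples)")
print(f"ill-posed base point:  {avg_bad:.3f} ({kept_bad} samples)")
```

For the same cap radius, the average around the ill-posed base point is markedly larger than around the well-posed one, which is the kind of base-point information a worst-case supremum discards. Rejection sampling is used only for simplicity and becomes inefficient for small caps or high dimension.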
Proof. Assume first that . In this case, we have
[TABLE]
and we can decompose
[TABLE]
Therefore, by (7),
[TABLE]
We next assume . In this case,
[TABLE]
since . Equivalently,
[TABLE]
We also use here that for all ,
[TABLE]
and therefore
[TABLE]
which implies
[TABLE]
This shows the first statement. We now derive the expression of a bound .
Let be the function defined by
[TABLE]
where the exponent of in the numerator is the logarithm in base of , which, by continuity, we take to be 0 when . We note that is concave, monotonically increasing, and satisfies , , and when , . Moreover, by monotonicity,
[TABLE]
This implies, since
[TABLE]
and using also concavity,
[TABLE]
that
[TABLE]
That is,
[TABLE]
Finally, it is trivial to verify, from the specific values taken by mentioned previously, that satisfies (LA0), (LA1) and (LA2’). ∎
4 The Gaussian case
We keep the same conic condition number but now consider a Gaussian measure in centered at and with covariance matrix for , that is with density function given by
[TABLE]
Since our local analysis will rely on a smoothed analysis in this case, which is not yet known, we begin by studying a general smoothed analysis for the Gaussian case.
4.1 Smoothed analysis
Let . We recall that, for any ,
[TABLE]
and in the particular case we denote
[TABLE]
the open half-sphere centered at .
The main result of this section is the following smoothed analysis for the Gaussian distribution.
Theorem 4.1.
Let be a conic condition number on with set of ill-posed inputs . Assume that is contained in a real algebraic hypersurface, given as the zero set of a homogeneous polynomial of degree , and that . Then, there exists an explicit bound –defined in (13)– such that
[TABLE]
This bound satisfies
(SA1)
, the worst-case value.
(SA2)
, the average value, in remarkable coincidence with (8).
The following map plays a central role in all that follows,
[TABLE]
The main stepping stone towards the proof of Theorem 4.1 is the following.
Proposition 4.2.
Let . There exists a probability density of a random variable , associated to , and , such that for every measurable function satisfying for all , one has
[TABLE]
We begin by proving the following lemma.
Lemma 4.3.
For any measurable function satisfying , one has
[TABLE]
where is a decreasing function of defined by
[TABLE]
Proof. We have
[TABLE]
where the second equality follows from the transformation formula [1, Thm. 2.1] applied to the diffeomorphism
[TABLE]
and
[TABLE]
does not depend on . Now, for ,
[TABLE]
Therefore, where for ,
[TABLE]
which is a continuously differentiable decreasing function of . ∎
Proof of Proposition 4.2. By Lemma 4.3,
[TABLE]
Now, by the Fundamental Theorem of Calculus, for ,
[TABLE]
Replacing this in (12) and changing the order of integration, we obtain
[TABLE]
Now, since
[TABLE]
we obtain
[TABLE]
We now denote
[TABLE]
which is a non-negative function since is decreasing, and rewrite the equality above as
[TABLE]
where
[TABLE]
We now prove that : Changing variables we have
[TABLE]
To estimate the quantity between the square brackets we use the known equality
[TABLE]
together with (4) to obtain
[TABLE]
Therefore
[TABLE]
This implies, by taking , that
[TABLE]
i.e.
[TABLE]
Therefore is a density on , and
[TABLE]
∎
Since for all , we can now focus on .
Proposition 4.4.
With the notation in Proposition 4.2, we have
[TABLE]
Proof. Replacing the expectations in the right-hand side of the equality in Proposition 4.2 by their bound in (7) for and , we obtain
[TABLE]
where . The result follows from the last equality in Proposition 4.2. ∎
Our next goal is to estimate the right-hand side in Proposition 4.4.
Lemma 4.5.
Let . Then
[TABLE]
Proof. Write
[TABLE]
Since for , the second term satisfies
[TABLE]
We analyze the first term. Let $A=\big\{(\theta,r)\in[t,\frac{\pi}{4}]\times[0,\ln\big(\frac{1}{\sin t}\big)]\,:\,r\leq\ln\big(\frac{1}{\sin\theta}\big)\big\}$, and . By Fubini’s Theorem we have both
[TABLE]
and
[TABLE]
since implies $\ln\sqrt{2}\leq\ln\big(\frac{1}{\sin t}\big)$ and when , then . Therefore,
[TABLE]
by taking . Finally,
[TABLE]
since . ∎
Lemma 4.6.
Assume . For all , one has
[TABLE]
Proof. For ,
[TABLE]
for defined in (11). The first inequality holds because for , implies , and the second by Proposition 4.2 applied to $F=\mathbf{1}_{\{\sphericalangle(\Psi(y),\overline{x})\leq t\}}$. It is then enough to bound the right-hand expression.
We observe that for , the set $K=\big\{y\in\mathbb{R}^{N+1}\,:\,\sphericalangle(\Psi(y),\overline{x})\leq t\big\}$ is a pointed cone with vertex at the origin, central axis passing through and angular opening . In addition, one can prove by the cosine theorem that this cone is included in the union of the pointed cone with vertex at , central axis passing through and angular opening with the intersection (see Figure 1). Hence, the measure of (with respect to ) is bounded by the sum of the measures of and .
As the vertex of coincides with the center of , the measure of with respect to equals the proportion of the volume (in ) of the intersection of with within this sphere. That is, the measure of with respect to satisfies
[TABLE]
where, we recall, . Using (6) we deduce that, for ,
[TABLE]
Also,
[TABLE]
Here we used the well-known lower bound $\Gamma\big(\frac{N+1}{2}\big)>\sqrt{2\pi}\,\big(\frac{N-1}{2}\big)^{\frac{N}{2}}e^{-\frac{N-1}{2}}$ (see for instance [1, Eq. 2.14]) for the last inequality. We finish the proof by noting that it can be easily proven by induction, using for instance that , that for all , we have
[TABLE]
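The lower bound on the Gamma function used in this proof is a Stirling-type estimate and is easy to check numerically. The sketch below is only a sanity check over a finite range of $N$ chosen for illustration, not a proof. The helper name `gamma_lower_bound` is hypothetical.

```python
import math

def gamma_lower_bound(N):
    """Right-hand side sqrt(2*pi) * ((N-1)/2)^(N/2) * exp(-(N-1)/2)."""
    x = (N - 1) / 2
    return math.sqrt(2 * math.pi) * x ** (N / 2) * math.exp(-x)

# Triples (N, Gamma((N+1)/2), lower bound) for N = 1, ..., 100.
checks = [(N, math.gamma((N + 1) / 2), gamma_lower_bound(N)) for N in range(1, 101)]
for N, value, bound in checks[:5]:
    print(N, value, bound)
```

Over this range the Gamma value stays strictly above the bound, and the ratio of the two approaches 1 as $N$ grows.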
Lemma 4.7.
Assume . Then,
[TABLE]
Proof. We have by Lemma 4.5 with ,
[TABLE]
where by Lemma 4.6, since for ,
[TABLE]
We have
[TABLE]
where . In addition we observe that for all , since
[TABLE]
Rewriting we get
[TABLE]
∎
Proof of Theorem 4.1. By Proposition 4.4 and Lemma 4.7,
[TABLE]
with . We then define
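\[
H(N,d,\sigma):=\frac{1}{N}\Big(1+\ln\Big(2^{N-1}+\frac{1}{\sigma^{N+1}}\Big)\Big)+\ln(Nd)+2(\ln 2+1). \tag{13}
\]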
We now verify that satisfies (SA1) and (SA2):
(SA1)
\[
\lim_{\sigma\to 0}H(N,d,\sigma)=\lim_{\sigma\to 0}\Big(\frac{1}{N}\Big(1+\ln\Big(2^{N-1}+\frac{1}{\sigma^{N+1}}\Big)\Big)+\ln(Nd)+2(\ln 2+1)\Big)
=\lim_{\sigma\to 0}\Big(\frac{N+1}{N}\ln\frac{Nd}{\sigma}+\mathcal{O}(1)\Big)=\infty.
\]
Note that actually the difference of the formula in the last line compared to (7), with the dispersion parameter replacing , is negligible.
(SA2)
, and we recover the well-known, average-case analysis, bound for (see [4] and [1, Theorem 21.1]).
∎
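The two limiting behaviors can also be checked numerically from the expression that appears in the (SA1) computation above. The sketch assumes that this expression is exactly $H(N,d,\sigma)$. The helper `H_limit` is a hypothetical name for the value obtained by dropping the term $\sigma^{-(N+1)}$.

```python
import math

def H(N, d, sigma):
    """(1/N) * (1 + ln(2^(N-1) + sigma^-(N+1))) + ln(N d) + 2 (ln 2 + 1)."""
    inner = 2.0 ** (N - 1) + sigma ** (-(N + 1))
    return (1 + math.log(inner)) / N + math.log(N * d) + 2 * (math.log(2) + 1)

def H_limit(N, d):
    """Value approached by H(N, d, sigma) as sigma tends to infinity."""
    return (1 + (N - 1) * math.log(2)) / N + math.log(N * d) + 2 * (math.log(2) + 1)

N, d = 10, 3
for sigma in (1e-3, 1e-1, 1.0, 10.0, 1e6):
    print(f"sigma = {sigma:g}: H = {H(N, d, sigma):.4f}")
print(f"limit as sigma -> infinity: {H_limit(N, d):.4f}")
```

The values decrease as the dispersion grows and settle at a quantity of the form $\ln(Nd)+\mathcal{O}(1)$, while for small dispersion they grow like $\frac{N+1}{N}\ln\frac{1}{\sigma}$.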
4.2 Local analysis
The main result of this section is the following.
Theorem 4.8.
Let be a conic condition number on with , with set of ill-posed inputs . Assume that is contained in a real algebraic hypersurface, given as the zero set of a homogeneous polynomial of degree . Let and . Then, there is an explicit bound –defined in (4.2) below– such that
[TABLE]
This bound satisfies (LA0), (LA1) and (LA2).
In order to prove Theorem 4.8 we need the following lemma.
Lemma 4.9.
Assume . For all ,
[TABLE]
Proof. The idea is to apply Markov’s inequality (e.g. [1, Corollary 2.9]) to the density to deduce that
[TABLE]
Therefore we need to bound . We first prove that
[TABLE]
where is given by (11), and then that
[TABLE]
This implies
[TABLE]
To show (14) we apply Proposition 4.2 with and get
[TABLE]
We claim that
[TABLE]
Indeed, for , one has
[TABLE]
Therefore, writing ,
[TABLE]
Now, for , we have
[TABLE]
which implies
[TABLE]
Using (6) twice we have, for ,
[TABLE]
and we deduce that . With this,
[TABLE]
which shows (17). From (16) and (17) it follows that
[TABLE]
which shows (14). We now show (15). We let be the closest point to on the line through the origin and (see Figure 2) and have
[TABLE]
where the last inequality is a consequence of [1, Prop. 2.10 & Lem. 2.15].
This shows (15). Therefore,
[TABLE]
as desired, and hence,
[TABLE]
Proof of Theorem 4.8. Let . Since , and we have . For all and all we have
[TABLE]
which implies .
We apply Proposition 4.2 to and use the previous inequality and the bounds (7) and (8) to obtain
[TABLE]
since
[TABLE]
We next bound each of the first three terms in the right-hand side.
Applying Lemma 4.6 and the inequality we obtain
[TABLE]
This bounds the first term in (4.2) by
[TABLE]
Second, by Lemma 4.5 since , Lemma 4.9 and ,
[TABLE]
Also, as , we have by Lemma 4.7 that
[TABLE]
Putting together this inequality and (4.2) we deduce that the second term in (4.2) is bounded by
[TABLE]
Finally, using again Lemma 4.9 and we obtain
[TABLE]
which bounds the third term in (4.2) by
[TABLE]
Combining (19), (4.2) and (22) with the bound in (4.2), we obtain
[TABLE]
where . We now verify that satisfies (LA0), (LA1) and (LA2).
(LA0)
When we get
[TABLE]
which is that of (13) (with a slightly bigger constant) as required in (LA0).
(LA1)
When , we have
[TABLE]
as required.
(LA2)
Also, when , we get
[TABLE]
and we recover the average-case analysis bound for .
∎
References
- [1] P. Bürgisser and F. Cucker. Condition, volume 349 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin, 2013.
- [2] P. Bürgisser, F. Cucker and M. Lotz. The probability that a slightly perturbed numerical analysis problem is difficult. Mathematics of Computation, 77:1559–1583, 2008.
- [3] J. Demmel. On condition numbers and the distance to the nearest ill-posed problem. Numer. Math., 51:251–289, 1987.
- [4] J. Demmel. The probability that a numerical analysis problem is difficult. Math. Comp., 50:449–480, 1988.
- [5] C. Eckart and G. Young. The approximation of one matrix by another of lower rank. Psychometrika, 1:211–218, 1936.
- [6] A. Edelman. Eigenvalues and condition numbers of random matrices. SIAM J. of Matrix Anal. and Applic., 9:543–556, 1988.
- [7] D.A. Spielman and S.H. Teng. Smoothed analysis: an attempt to explain the behavior of algorithms in practice. Communications of the ACM, 52(10):76–84, 2009.
- [8] A.M. Turing. Rounding-off errors in matrix processes. Quart. J. Mech. Appl. Math., 1:287–308, 1948.
