On convergence of discrete methods of least squares on equidistant nodes
René Goertz

TL;DR
This paper investigates the uniform convergence of least squares polynomial approximation on equidistant nodes, using Hahn polynomials and specific function classes, establishing conditions for convergence and error bounds.
Contribution
It provides new convergence criteria for least squares methods with Hahn polynomial expansions on equidistant grids, under smoothness and growth conditions of functions.
Findings
Convergence occurs for functions with specific smoothness and growth conditions.
Uniform convergence is guaranteed when N/n is sufficiently large and functions satisfy certain smoothness criteria.
Maximum error bounds are characterized for classes of functions with bounded derivatives.
(March 6, 2024)
Abstract
We consider the well-known method of least squares on an equidistant grid with nodes on the interval with the goal to approximate a function by a polynomial of degree . We investigate the following problem: For which ratio N/n and which functions do we have uniform convergence of the least square operator ? We investigate this problem with a discrete weighting of Jacobi type, describing the least square operator by the expansion of a function in Hahn polynomials . Without additional assumptions on the functions, uniform convergence cannot be guaranteed. But with and additional assumptions on and we obtain convergence and prove the following results: For a given let and let be a sequence of natural numbers with . Then the method of least squares converges uniformly on . Beforehand, we determine the maximum error ("worst case") with respect to the sup norm on the classes
.
1 Introduction and statement of the main results
It has been over 200 years since Legendre, Gauß and others started working with the method of least squares (cf., e.g., [14]). Since then, the method has been used in many areas of mathematics and is nowadays a basic tool of applied mathematics (cf., e.g., [1], [3], [4], [10], [19]). Our focus in this paper is the pure approximation property of the method.
The method of least squares is defined as follows (cf., e.g., [10, p. 59], [9, p. 217], [15, p. 291]):
Let be distinct nodes for . Further, let be a weight function which is positive on . For a let be a subspace of with on . The least square operator is uniquely defined by
[TABLE]
In this paper we investigate the standard case:
- is the space of polynomials of degree ,
- is an equidistant grid with nodes on the interval , i.e. for .
This situation often occurs in practice: for centuries, polynomials have been an intensively studied class of functions for approximation. Moreover, they can be handled efficiently on computers, because only the elementary operations of addition and multiplication are needed for every computation. Equidistantly sampled data are common, especially because data are often collected on large (often multidimensional) equidistant grids.
Without loss of generality let the interval be the standard interval in this paper.
We investigate the following problem:
For which functions and which ratio N/n does the sequence converge uniformly?
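The role of the ratio in this question can be explored numerically. The following is a minimal sketch and is not taken from the paper: it assumes the interval [-1, 1], unit weights and Runge's function as a test case, and the helper names `discrete_least_squares` and `sup_error` are hypothetical. It compares the choice N = n, where the fit reduces to interpolation, with a strongly oversampled grid N = 4n^2.

```python
import numpy as np
from numpy.polynomial import legendre as L


def discrete_least_squares(f, n, N):
    """Degree-n least squares fit of f on N+1 equidistant nodes in [-1, 1].

    Assumption: unit weights. The fit is expressed in the Legendre basis
    only to keep the linear system well conditioned.
    """
    x = np.linspace(-1.0, 1.0, N + 1)
    V = L.legvander(x, n)  # V[j, k] = P_k(x_j)
    coef, *_ = np.linalg.lstsq(V, f(x), rcond=None)
    return coef


def sup_error(f, coef, samples=5001):
    """Approximate sup-norm error on [-1, 1] using a fine evaluation grid."""
    t = np.linspace(-1.0, 1.0, samples)
    return float(np.max(np.abs(f(t) - L.legval(t, coef))))


def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)


n = 20
# N = n: as many nodes as coefficients, so the fit is plain interpolation.
err_interp = sup_error(runge, discrete_least_squares(runge, n, n))
# N = 4 n^2: many more nodes than coefficients.
err_over = sup_error(runge, discrete_least_squares(runge, n, 4 * n * n))

print(f"N = n      : sup error {err_interp:.3e}")
print(f"N = 4 n^2  : sup error {err_over:.3e}")
```

On this example the interpolation error is large, whereas the oversampled fit stays close to the function. This is an illustration for a single function and does not replace the convergence statements proved in the paper.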
To investigate the above problem we describe the least square operator by the expansion of a function by Hahn polynomials . The Hahn polynomials are classical discrete orthogonal polynomials on the interval of degree . They are orthogonal on with respect to the inner product
[TABLE]
where is the weight-function given by
[TABLE]
They are normalized by
[TABLE]
(cf., e.g., [13, p. 204]).
It is well-known that the least square operator can be represented by use of Hahn polynomials (cf., e.g., [10, p. 62-63], [15, p. 270], [20, p. 218-232]):
[TABLE]
where .
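The idea behind this representation can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it uses unit weights on an equidistant grid in [-1, 1] (which corresponds to the Hahn weight with both parameters equal to zero, up to an affine change of variable), and it builds an orthonormal discrete polynomial basis by a QR factorization instead of evaluating Hahn polynomials explicitly. The function name `discrete_orthonormal_basis` is hypothetical.

```python
import numpy as np


def discrete_orthonormal_basis(N, n):
    """Return the grid and a matrix Q whose column k holds the values of a
    polynomial of degree k, orthonormal w.r.t. the unit-weight discrete
    inner product on N+1 equidistant nodes in [-1, 1]."""
    x = np.linspace(-1.0, 1.0, N + 1)
    V = np.vander(x, n + 1, increasing=True)  # columns 1, x, x^2, ...
    Q, _ = np.linalg.qr(V)                    # orthonormalise the monomials
    return x, Q


N, n = 200, 8
x, Q = discrete_orthonormal_basis(N, n)
f = np.exp(x)

# Partial sum of the discrete orthogonal expansion, evaluated on the grid.
coeffs = Q.T @ f          # discrete inner products <f, q_k>
expansion_values = Q @ coeffs

# Reference: solve the least squares problem directly.
V = np.vander(x, n + 1, increasing=True)
c, *_ = np.linalg.lstsq(V, f, rcond=None)
direct_values = V @ c

gap = float(np.max(np.abs(expansion_values - direct_values)))
print(f"max difference between both representations: {gap:.2e}")
```

Both computations give the same polynomial on the grid up to rounding, which is the content of the representation: the least squares polynomial is the truncated expansion in polynomials orthogonal with respect to the discrete inner product.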
Without additional assumptions on the functions, the uniform convergence of the sequence cannot be guaranteed (cf., e.g., [18, p. 106, Satz 4.10]). Hence we have to restrict the function class.
The Hahn polynomials can be interpreted as a discretization of the Jacobi polynomials , since for a fixed the following relation between Hahn polynomials and Jacobi polynomials is well known:
[TABLE]
for each (cf., e.g., [16, p. 45]).
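This limit relation can be observed numerically in the simplest case. The sketch below is an assumption-laden illustration: it takes unit weights (both Hahn parameters equal to zero, so the Jacobi polynomials reduce to Legendre polynomials), works on [-1, 1], and normalises the discrete orthogonal polynomial to the value 1 at the right endpoint, mirroring the usual Legendre normalisation. The discrete polynomials are obtained by QR orthogonalisation, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def endpoint_normalized_discrete_poly(N, k):
    """Values on the grid of the degree-k polynomial that is orthogonal to
    all lower degrees w.r.t. unit weights on N+1 equidistant nodes in
    [-1, 1], scaled so that its value at x = 1 equals 1."""
    x = np.linspace(-1.0, 1.0, N + 1)
    Q, _ = np.linalg.qr(np.vander(x, k + 1, increasing=True))
    q = Q[:, k]
    return x, q / q[-1]  # dividing by the endpoint value also fixes the sign


def legendre_values(x, k):
    """Values of the Legendre polynomial P_k at the points x."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return L.legval(x, c)


k = 3
gaps = {}
for N in (10, 100, 1000):
    x, q = endpoint_normalized_discrete_poly(N, k)
    gaps[N] = float(np.max(np.abs(q - legendre_values(x, k))))
    print(f"N = {N:5d}: max deviation from P_{k} on the grid = {gaps[N]:.2e}")
```

The deviation shrinks as the number of nodes grows with the degree held fixed, which is the behaviour the limit relation describes.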
For all approximation results in this paper we consider the important symmetric (so-called ultraspherical) case . The close connection between the series expansion of a function in Jacobi polynomials and the series expansion in Hahn polynomials, cf. (1.1), which was proved in [12], is the motivation for the investigations presented here, which were inspired by Thomas Sonar and Tom Koornwinder. Over the last decades, the series expansion in Jacobi polynomials has become an established modelling tool. However, integrals have to be evaluated to calculate the coefficients, and this is usually done by discretization with the aid of quadrature methods. Since the integral has to be discretized anyway, the obvious question is whether equivalent results can be obtained directly with discrete orthogonal polynomials, that is, without calculating integrals. The main theorem is:
Theorem 1.1.
Let and let for
[TABLE]
Further let
[TABLE]
For each with one has
[TABLE]
This estimate cannot be improved in the sense that the constant in inequality (1.2) cannot be replaced by a smaller value under the above assumptions.
This makes it possible to compare the direct and the classical method by considering the maximum error ("worst case") in the function classes
[TABLE]
According to (1.2), this maximum error is the constant , and this is lower than in the corresponding classical case for for each ratio . This is proved in Section 3.2.
In Section 3 we present further applications, for example the following result:
Let , let
[TABLE]
and let be a sequence with . Then the method of least squares converges uniformly on .
We compare our approximation results with corresponding results for the continuous case in Section 3. In the next section we establish preliminary lemmas and prove our main Theorem 1.1.
2 Preliminaries
The following result of H. Brass is fundamental for our investigations:
Lemma 2.1 (cf. [6]).
Let be a distribution on and let
[TABLE]
be a family of orthogonal polynomials, which are orthogonal with respect to the inner product
[TABLE]
The polynomials are normalized by . Furthermore, let the distribution satisfy the following properties:
- for each ,
- for each .
Let
[TABLE]
Then one has for each
[TABLE]
This estimate cannot be improved in the sense that the constant in inequality (2.1) cannot be replaced by a smaller value under the above assumptions.
To apply this result, we have to define a corresponding distribution and the appropriate family of orthogonal polynomials. First we use the representation of the Hahn polynomials by hypergeometric series: the Hahn polynomials are classical discrete orthogonal polynomials on the interval of degree . They can be defined by the hypergeometric function as follows:
Definition 2.2 (cf., e.g., [13, p. 204]).
Let and let . The polynomials which are defined by
[TABLE]
for each , are said to be Hahn polynomials.
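For orientation, the hypergeometric representation that is standard in the literature on classical discrete orthogonal polynomials is recalled below. This is an assumption about the convention used here, since the defining formula is not reproduced in this text; the symbols k (degree), x (argument), α, β (parameters) and N (grid size) are chosen for illustration, and (a)_j denotes the Pochhammer symbol.

```latex
% Standard convention for Hahn polynomials (assumed, not quoted from this paper)
Q_k(x;\alpha,\beta,N)
  = {}_3F_2\!\left(
      \begin{matrix} -k,\; k+\alpha+\beta+1,\; -x \\ \alpha+1,\; -N \end{matrix}
      ;\, 1 \right)
  = \sum_{j=0}^{k}
      \frac{(-k)_j\,(k+\alpha+\beta+1)_j\,(-x)_j}{(\alpha+1)_j\,(-N)_j\; j!},
  \qquad k = 0,1,\dots,N,
% with the Pochhammer symbol
\qquad (a)_0 = 1, \quad (a)_j = a(a+1)\cdots(a+j-1).
```

In this convention every term with j ≥ 1 vanishes at x = 0, so the polynomials take the value 1 there, and the sum terminates after k + 1 terms because of the factor (-k)_j.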
The first lemma gives a condition on the ratio under which the Hahn polynomials are bounded. Furthermore, we see that the maximum is attained on the boundary.
Lemma 2.3.
Let and let for
[TABLE]
Then for any one has
[TABLE]
Proof.
It follows directly from [7]
[TABLE]
Furthermore one has the following symmetries (cf., e.g., [7]):
[TABLE]
By Definition 2.2 of the Hahn polynomials, the following expression is positive:
[TABLE]
∎
Remark 2.4.
In the following let
[TABLE]
Let and let . Furthermore we consider in this section the distribution on , defined by
[TABLE]
In the next lemma we verify that these polynomials have all the properties required in Lemma 2.1.
Lemma 2.5.
Let and let for
[TABLE]
Let and let be the distribution of Remark 2.4. Furthermore let be the family of polynomials, which is defined in Remark 2.4. Then one has with the following properties:
1. is a family of orthogonal polynomials which is orthogonal with respect to the inner product .
2. .
3. for each .
4. for each .
Proof.
For each one has
[TABLE]
With the definition of one has
[TABLE]
We obtain property .
With one has , and we obtain property .
Furthermore one has for any
[TABLE]
With the index transformation and the equation
[TABLE]
follows
[TABLE]
We obtain property .
With Lemma 2.3 one has for each
[TABLE]
whereby we obtain property . ∎
Now we can apply Lemma 2.1.
Lemma 2.6.
Let and let for
[TABLE]
Let and let be the distribution of Remark 2.4. Furthermore let be the family of polynomials, which is defined in Remark 2.4. Let
[TABLE]
Then one has for each with
[TABLE]
This estimate cannot be improved in the sense that the constant in inequality (2.5) cannot be replaced by a smaller value under the above assumptions.
Proof.
We apply Lemma 2.1 and Lemma 2.5. By Lemma 2.5, the family of orthogonal polynomials satisfies all the assumptions of Lemma 2.1, so the claim follows. ∎
In the following Lemma we determine the factor in equation (2.4) of the previous Lemma 2.6.
Lemma 2.7.
With the assumptions of Lemma 2.6 we have for any
[TABLE]
Proof.
First one has for any and any
[TABLE]
By Lemma 2.3 one has
[TABLE]
Then one has
[TABLE]
From the representation of the Hahn polynomials in Definition 2.2 it follows that
[TABLE]
The term is a polynomial of degree , which we can write in the form
[TABLE]
Here is a polynomial of degree . Hence we have
[TABLE]
With the transformations
[TABLE]
and
[TABLE]
we obtain
[TABLE]
Inserting this into equation (2.7) gives
[TABLE]
and equation (2.6) follows. ∎
Now we can prove the main Theorem 1.1 with the aid of the previous lemmata:
Proof.
Let and let with . Then one has for each
[TABLE]
Now we apply Lemma 2.6 and Lemma 2.7 and we obtain
[TABLE]
By Lemma 2.6, the estimate cannot be improved. ∎
3 Conclusions
In this section we present some results which follow from Theorem 1.1. In particular, we discuss some cases in which we obtain uniform convergence of the method of least squares. First we investigate the factor of Theorem 1.1.
3.1 Uniform convergence of the discrete method of least squares
We begin with the following lemmas. The Γ-function satisfies the following asymptotic property:
Lemma 3.1 (cf., e.g., [2, p. 257]).
For one has
[TABLE]
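A quick numerical check of such an asymptotic property is sketched below. It assumes that the lemma refers to the classical limit n^(b-a) Γ(n+a)/Γ(n+b) → 1 as n → ∞; the formula itself is not reproduced in this text, so this is an assumption, and the parameter values a = 0.5, b = 2 as well as the function name `gamma_ratio_scaled` are chosen only for illustration.

```python
import math


def gamma_ratio_scaled(n, a, b):
    """Evaluate n**(b - a) * Gamma(n + a) / Gamma(n + b).

    The computation is done with log-gamma to avoid overflow for large n.
    """
    log_value = (b - a) * math.log(n) + math.lgamma(n + a) - math.lgamma(n + b)
    return math.exp(log_value)


a, b = 0.5, 2.0  # illustrative parameters, not taken from the paper
values = [gamma_ratio_scaled(n, a, b) for n in (10, 100, 1000, 10000)]
for n, v in zip((10, 100, 1000, 10000), values):
    print(f"n = {n:6d}: scaled ratio = {v:.6f}")
```

The scaled ratio approaches 1 as n grows, with a deviation that decays roughly like 1/n.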
Lemma 3.2.
Let . Then one has
[TABLE]
Proof.
We use Lemma 3.1 and obtain
[TABLE]
∎
Lemma 3.3.
One has
[TABLE]
Proof.
We prove both inequalities successively. For this we use Stirling's formula (cf., e.g., [8, p. 50-53], [17])
[TABLE]
First we show the left inequality. By Stirling's formula we have
[TABLE]
Applying Stirling's formula once more gives
[TABLE]
This yields the left inequality. For the right inequality, Stirling's formula again gives
[TABLE]
Applying Stirling's formula once more gives
[TABLE]
This yields the right inequality. ∎
Remark 3.4.
With Lemma 3.3 we have
[TABLE]
Now we can simplify the estimate in Theorem 1.1.
Corollary 3.5.
Let and let for
[TABLE]
Then one has for each
[TABLE]
with .
Proof.
Let and let with . First we have
[TABLE]
With Lemma 3.2 we obtain
[TABLE]
We apply Remark 3.4 and obtain
[TABLE]
Using Theorem 1.1, we obtain inequality (3.5). ∎
For the important case we add the following estimate.
Corollary 3.6.
Let for
[TABLE]
Furthermore let
[TABLE]
Then one has for each and for any with
[TABLE]
Under the above assumptions, the constant in inequality (3.6) can be improved at most by the factor
[TABLE]
Proof.
Let and let with . For , Theorem 1.1 reduces to
[TABLE]
With the estimate
[TABLE]
we obtain
[TABLE]
For any , this estimate cannot be improved because
[TABLE]
With Lemma 3.3 we have
[TABLE]
and hence the inequality cannot be improved except for the factor
[TABLE]
∎
With Corollary 3.6 we can give a specific and interesting answer to our initial question:
For which classes of functions and which ratio does the method of least squares converge uniformly?
Corollary 3.7.
Let , let
[TABLE]
and let be a sequence with
[TABLE]
Then the method of least squares converges uniformly on the interval .
Proof.
First one has
[TABLE]
Simple transformations give
[TABLE]
We transform again and obtain
[TABLE]
Now we can use Corollary 3.5:
[TABLE]
Because of one has
[TABLE]
∎
Concerning the above question we can easily give a sequence independent of :
Corollary 3.8.
Let , let
[TABLE]
and let be a sequence with . Then the method of least squares converges uniformly on the interval .
Proof.
The sequence fulfils the assumption of Corollary 3.7 independently of , because one has
[TABLE]
∎
3.2 Comparison to the continuous case
In this subsection we compare our approximation results for the discrete method of least squares with the results for the continuous method. The continuous method is the series expansion of a function in Jacobi polynomials ; the least square operator can then be represented by
[TABLE]
This case was investigated by H. Brass in [5]. We first collect some important properties of the Jacobi polynomials:
The Jacobi polynomials are classical orthogonal polynomials on the interval of degree . They can be defined by the hypergeometric function as follows:
Definition 3.9 (cf., e.g., [13, p. 216]).
Let . The polynomials which are defined by
[TABLE]
for each , are said to be Jacobi polynomials.
The Jacobi polynomials are orthogonal on the interval with respect to the inner product
[TABLE]
where is the weight-function given by
[TABLE]
They are normalized by
[TABLE]
(cf., e.g., [13, p. 217]).
For the Jacobi polynomials are bounded on the interval as follows:
[TABLE]
(cf., e.g., [2, p. 786]).
H. Brass proved the following result:
Lemma 3.10 (cf. [5]).
Let . Further let
[TABLE]
Then one has for each
[TABLE]
This estimate cannot be improved in the sense that the constant in inequality (3.11) cannot be replaced by a smaller value under the above assumptions.
This result follows also from Lemma 2.1 (cf. [6]). In the following Lemma we determine the factor in equation (3.10) of the previous Lemma 3.10.
Lemma 3.11.
Let . Then the constant of Lemma 3.10 satisfies
[TABLE]
Proof.
First the Jacobi polynomials are given by Definition 3.9
[TABLE]
We differentiate times and obtain
[TABLE]
With the transformations
[TABLE]
we have
[TABLE]
Then one has
[TABLE]
From equation (3.9) it follows that
[TABLE]
Then we have equation (3.12). ∎
Now we compare the constants of Theorem 1.1 (discrete case) and of Lemma 3.10 (continuous case), neither of which can be improved. For the quotient one has with
[TABLE]
whereby in the discrete case we have the additional assumption . We define a function class by
[TABLE]
then we obtain the following Corollary:
Corollary 3.12.
Let . Further let be the continuous least square operator according to equation (3.7) and let be the discrete least square operator according to equation (1.1) with . Then one has
[TABLE]
Remark 3.13.
For practical use, Corollary 3.12 gives for the following guarantee: the "worst case" with respect to the class is worse in the continuous case than in the corresponding discrete case if the polynomial degree and the number of nodes fulfil the inequality .
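The two methods compared in this remark can be tried side by side on a single function. The sketch below is only an experiment under assumptions that are not taken from the paper: it uses the Legendre case (both Jacobi parameters zero, unit discrete weights), the interval [-1, 1], the test function |x|^3, degree n = 10 and N = 2n^2 nodes. The continuous coefficients are computed with Gauss-Legendre quadrature, and all function names are hypothetical. A single test function says nothing about the worst case over a class, which is what the remark is about.

```python
import numpy as np
from numpy.polynomial import legendre as L


def continuous_legendre_projection(f, n, quad_points=200):
    """Coefficients of the degree-n Legendre partial sum of f on [-1, 1],
    with the integrals approximated by Gauss-Legendre quadrature."""
    t, w = L.leggauss(quad_points)
    P = L.legvander(t, n)               # P[i, k] = P_k(t_i)
    k = np.arange(n + 1)
    return (2 * k + 1) / 2.0 * (P.T @ (w * f(t)))


def discrete_least_squares(f, n, N):
    """Degree-n least squares fit on N+1 equidistant nodes, unit weights."""
    x = np.linspace(-1.0, 1.0, N + 1)
    coef, *_ = np.linalg.lstsq(L.legvander(x, n), f(x), rcond=None)
    return coef


def sup_error(f, coef, samples=5001):
    """Approximate sup-norm error on [-1, 1] using a fine evaluation grid."""
    t = np.linspace(-1.0, 1.0, samples)
    return float(np.max(np.abs(f(t) - L.legval(t, coef))))


def f(x):
    return np.abs(x) ** 3


n = 10
err_cont = sup_error(f, continuous_legendre_projection(f, n))
err_disc = sup_error(f, discrete_least_squares(f, n, 2 * n * n))
print(f"continuous expansion: sup error {err_cont:.3e}")
print(f"discrete, N = 2 n^2 : sup error {err_disc:.3e}")
```

The discrete variant needs only function values on the grid, while the continuous variant needs a quadrature rule for every coefficient, which is the practical difference discussed in the introduction.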
Remark 3.14.
If we consider the "worst case" again
[TABLE]
we obtain:
A ratio with any gives no better approximation in the sense of (3.15) than the ratio .
Further comparisons with polynomial interpolation, the method of least squares on other nodes and the polynomial of best approximation can be found in [11].
References
- [1] Assyr Abdulle and Gerhard Wanner, “200 years of least squares method”, Elemente der Mathematik, Vol. 57, Iss. 2, 2002, pp. 45–60
- [2] Milton Abramowitz and Irene A. Stegun, “Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables”, Dover Publications, Inc., 1964
- [3] Tilo Arens et al., “Mathematik”, Springer-Verlag, 2015
- [4] Åke Björck, “Numerical Methods for Least Squares Problems”, SIAM, 1996
- [5] Helmut Brass, “Approximation durch Teilsummen von Orthogonalpolynomreihen”, in Numerische Methoden der Approximationstheorie 52, Springer-Verlag, 1980, pp. 69–83
- [6] Helmut Brass, “Error estimates for least squares approximation by polynomials”, Journal of Approximation Theory, Vol. 41, Iss. 4, 1984, pp. 345–349
- [7] Holger Dette, “New bounds for Hahn and Krawtchouk polynomials”, SIAM Journal on Mathematical Analysis, Vol. 26, Iss. 6, 1995, pp. 1647–1659
- [8] William Feller, “An Introduction to Probability Theory and Its Applications”, John Wiley & Sons, 1968
