On Mahler's transcendence measure for $e$
Anne-Maria Ernvall-Hytönen, Tapani Matala-aho, Louna Seppälä

TL;DR
This paper provides an explicit transcendence measure for e and its powers, improving previous results by employing Hermite-Padé approximations and detailed analysis of common factors.
Contribution
It introduces a new explicit transcendence measure for e and its powers, advancing the work of Mahler, Borel, and Hata.
Findings
Explicit transcendence measure for e established
Transcendence measure for positive integer powers of e proved
Improved bounds based on Hermite-Padé approximations
Abstract
We present a completely explicit transcendence measure for $e$. This is a continuation and an improvement to the works of Borel, Mahler and Hata on the topic. Furthermore, we also prove a transcendence measure for an arbitrary positive integer power of $e$. The results are based on Hermite-Padé approximations and on careful analysis of common factors in the footsteps of Hata.
Anne-Maria Ernvall-Hytönen
Matematik och Statistik, Åbo Akademi University, Domkyrkotorget 1, 20500 Åbo, Finland
Tapani Matala-aho and Louna Seppälä
Matematiikka, PL 8000, 90014 Oulun yliopisto, Finland
Key words and phrases:
Diophantine approximation, Hermite-Padé approximation, transcendence
2010 Mathematics Subject Classification:
11J82, 11J72, 41A21
The work of the author L.S. was supported by the Magnus Ehrnrooth Foundation. The work of the author A.-M. E.-H. was supported by the Academy of Finland project 303820, and by the Finnish Cultural Foundation.
The published version of this article may be found at https://doi.org/10.1007/s00365-018-9429-3.
1. Introduction
Let be given and define as the infimum of the numbers satisfying the estimate
[TABLE]
for all with . Then any function greater than or equal to may be called a transcendence measure for $e$ (see [6]). The quest to obtain good transcendence measures for $e$ dates back to Borel [3]. He proved that is smaller than for some positive constant depending only on . This was considerably improved by Popken [12, 13], who showed that for some positive constant depending on . Soon afterwards, Mahler [10] was able to make the dependence on explicit:
[TABLE]
with an absolute positive constant. The price he had to pay was that he was only able to prove the validity of the result in some subset of the set consisting of with , unlike the results by Borel and Popken. Finally, in 1991, Khassa and Srinivasan [8] proved that the constant can be chosen to be in the set with for some absolute constant . Soon after, in 1995, Hata [6] proved that the constant can be chosen to be in the set of and with . A broader view about questions concerning transcendence measures can be found for instance in the books of Fel’dman and Nesterenko [4], and Baker [2].
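To make the definition above concrete, here is a minimal numerical sketch for the simplest case of degree one. It assumes that the estimate (1), whose display is missing from this copy, bounds a linear form $|\lambda_0 + \lambda_1 e|$ from below by a negative power of the height $h$. The function names `smallest_linear_form` and `observed_exponent` are hypothetical, and the brute-force search is only an illustration, not the method of the paper.

```python
import math


def smallest_linear_form(h):
    """Brute-force the minimum of |l0 + l1*e| over integer pairs
    (l0, l1) != (0, 0) with max(|l0|, |l1|) <= h.

    Assumption: this is the degree-one instance of the linear form
    appearing in estimate (1); the display itself is not available here.
    """
    best = None
    for l0 in range(-h, h + 1):
        for l1 in range(-h, h + 1):
            if l0 == 0 and l1 == 0:
                continue
            value = abs(l0 + l1 * math.e)
            if best is None or value < best[0]:
                best = (value, l0, l1)
    return best


def observed_exponent(h):
    """Return r with min |l0 + l1*e| = h**(-r) for the given height h."""
    value, _, _ = smallest_linear_form(h)
    return -math.log(value) / math.log(h)


if __name__ == "__main__":
    for h in (10, 50, 100):
        value, l0, l1 = smallest_linear_form(h)
        print(h, (l0, l1), value, observed_exponent(h))
```

For `h = 100` the minimum is attained (up to sign) at $87 - 32e \approx 0.01498$, which comes from the continued fraction convergent $87/32$ of $e$. A transcendence measure has to control such minima for every admissible height simultaneously, which a finite search cannot do.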
In [6] Hata introduced a striking observation of big common factors hiding in the auxiliary numerical approximation forms. These numerical approximation forms are closely related to the classical Hermite-Padé approximations (simultaneous approximations of the second type) of the exponential function used already by Hermite. The impact of the common factors was utilized in an asymptotic manner resulting in Theorem 1.2 in [6]. Hata’s Theorem 1.2 is sharper than Theorem 1.1 in his paper, but it is only valid for in an asymptotic sense: no explicit lower bound is given, instead, the theorem is formulated for a large enough .
In this article we present a more extensive result, Theorem 2.1. The improvements compared to Hata are made visible in its corollary, Theorem 1.1 below. Our Theorem 1.1 improves Hata’s bound for the function in his Theorem 1.1, and extends the set of values of for which the result is valid whenever . In addition, this result makes Hata’s Theorem 1.2 completely explicit, mainly due to our rigorous treatment of the common factors, giving rise to a more complicated behaviour visible in the term
[TABLE]
where for any , and is the set of prime numbers. We also give the exact asymptotic impact in (4), as well as approximations for values of for specific values of .
Theorem 1.1.
Assume , where and for . Now
[TABLE]
Asymptotically, we have
[TABLE]
Throughout his work, Hata assumed with the choice . The bound is considerably larger than our bound at its smallest (excluding the small cases ): . The choice of the function was made as an attempt to balance between the amount of technical details, and the improvement of the function against the size of the set of the values of .
In our main result, Theorem 2.1, we present a completely explicit transcendence measure for $e$, in terms of and . The proof starts with Lemma 3.2, which gives a suitable criterion for studying lower bounds of linear forms in given numbers. Furthermore, we exploit estimates for the exact inverse function of the function , , along the lines suggested in [5]. As an important consequence of using the function , the functional dependence in is improved compared to earlier considerations.
The method displayed in this paper is applicable to proving bounds of the type displayed in (1). As an example, we will consider the case where the polynomial is sparse, namely, where several of the coefficients in (1) are equal to zero. As a corollary of this, we derive a transcendence measure for positive integer powers of $e$.
It should be noted that all our results are actually valid over an imaginary quadratic field .
2. Main result
Let denote the inverse function of the function , , and denote further
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
with given in (2).
Throughout this work, let denote an imaginary quadratic field and its ring of integers.
Theorem 2.1.
Let and . With the above notations, the bound
[TABLE]
where
[TABLE]
holds for all with .
Corollary 2.2.
With the assumptions of Theorem 2.1 we have
[TABLE]
where
[TABLE]
3. Preliminaries, lemmas and notation
Fix now . Assume that we have a sequence of simultaneous linear forms
[TABLE]
where the coefficients
[TABLE]
satisfy the determinant condition
[TABLE]
Further, let and suppose that
[TABLE]
[TABLE]
where
[TABLE]
[TABLE]
for all . Let the above assumptions be valid for all .
Before presenting a criterion for lower bound, Lemma 3.2, we need some properties of the inverse function of the function , , considered in [5].
Lemma 3.1.
[5] The inverse function of the function , , is strictly increasing. Define and for . Suppose , then Thus the inverse function may be given by the infinite nested logarithm fraction
[TABLE]
Further, we denote
[TABLE]
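The nested logarithm fraction of Lemma 3.1 can be explored numerically. The sketch below assumes, as the phrase "nested logarithm fraction" suggests but this copy does not show, that the function in question is $y(z) = z\log z$, so that its inverse satisfies $z = y/\log z$, and that the iterates are $z_0 = y$, $z_n = y/\log z_{n-1}$ for $y > e$. The name `nested_log_iterates` is hypothetical.

```python
import math


def nested_log_iterates(y, steps):
    """Return [z_0, z_1, ..., z_steps] with z_0 = y and
    z_n = y / log(z_{n-1}).

    Assumption: the function inverted in Lemma 3.1 is z*log(z) and y > e,
    so that a fixed point z of the iteration satisfies z*log(z) = y.
    """
    if y <= math.e:
        raise ValueError("this sketch assumes y > e")
    zs = [float(y)]
    for _ in range(steps):
        zs.append(y / math.log(zs[-1]))
    return zs


if __name__ == "__main__":
    y = 100.0
    zs = nested_log_iterates(y, 80)
    z = zs[-1]
    print("first iterates:", zs[:4])
    print("limit:", z, "check z*log(z):", z * math.log(z))
```

Under these assumptions the odd-indexed iterates approach the limit from below and the even-indexed ones from above, which is what makes truncations of the nested fraction usable as explicit two-sided bounds.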
Lemma 3.2.
Let and . Then, under the above assumptions (7)-(9), the bound
[TABLE]
[TABLE]
holds for all with .
Proof.
We use the notation
[TABLE]
for the linear form to be estimated. Using our simultaneous linear forms
[TABLE]
from (6) we get
[TABLE]
where
[TABLE]
If now , then by (11) and (12) we get
[TABLE]
Now we take the largest with
[TABLE]
such that with big enough (to be determined later). Consequently .
According to the non-vanishing of the determinant (7) and the assumption , it follows that for some integer . Hence we get the estimate
[TABLE]
for our linear form , where we need to write in terms of .
Since , we have
[TABLE]
By (13) we have . Thus
[TABLE]
or equivalently . Further, by (13), which implies
[TABLE]
Then, by the properties of the function given in Lemma 3.1, we get
[TABLE]
Now we are ready to estimate as follows:
[TABLE]
By (15) we get
[TABLE]
Substituting (18) into (17) gives
[TABLE]
where we applied (13).
Hence
[TABLE]
where , and are precisely as in the formulation of Lemma 3.2. The claim now follows from (14) and (16). ∎
Let us now formulate a lemma that can be used to bound the function . It is extremely useful when comparing our results with the results of others.
Lemma 3.3.
If , we have . When in addition , the inverse function of the function satisfies
[TABLE]
Proof.
Denote with . Then
[TABLE]
because . ∎
Corollary 3.4.
If , and , then
[TABLE]
Proof.
Since , Lemma 3.3 gives
[TABLE]
when we denote . Lemma 3.2 now implies
[TABLE]
∎
4. Hermite-Padé approximants for the exponential function
Hermite-Padé approximants of the exponential function date back to Hermite’s [7] transcendence proof of $e$; see also [16].
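As a small illustration of what such approximants achieve, the following sketch checks the classical diagonal Padé approximant of $e^x$ in exact rational arithmetic. This is the one-function special case with its well-known closed form, not the simultaneous (type II) construction used in this paper; the function names `pade_exp` and `remainder_coeffs` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def pade_exp(n):
    """Coefficients (lowest degree first) of the diagonal Pade pair
    P_n, Q_n for exp(x), with Q_n(x) = P_n(-x), using the classical
    closed form (2n-j)! n! / ((2n)! j! (n-j)!) for the j-th coefficient.
    """
    denom = factorial(2 * n)
    p = [
        Fraction(factorial(2 * n - j) * factorial(n),
                 denom * factorial(j) * factorial(n - j))
        for j in range(n + 1)
    ]
    q = [(-1) ** j * c for j, c in enumerate(p)]
    return p, q


def remainder_coeffs(n, terms):
    """First `terms` Taylor coefficients of exp(x)*Q_n(x) - P_n(x)."""
    p, q = pade_exp(n)
    exp_series = [Fraction(1, factorial(k)) for k in range(terms)]
    coeffs = []
    for k in range(terms):
        c = sum(exp_series[k - j] * q[j] for j in range(min(k, n) + 1))
        if k <= n:
            c -= p[k]
        coeffs.append(c)
    return coeffs


if __name__ == "__main__":
    n = 3
    print(remainder_coeffs(n, 2 * n + 2))
```

The remainder vanishes to order $2n+1$ at the origin, so the coefficients of degree $0,\dots,2n$ are exactly zero. A high order of vanishing of this kind is the property the remainders in Theorem 4.2 are built to have for several exponentials at once.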
Lemma 4.1.
Let , , and be given and define by
[TABLE]
Then
[TABLE]
and
[TABLE]
for all and .
Theorem 4.2.
Let be distinct complex numbers. Denote and . Put
[TABLE]
Then there exist polynomials and remainders such that
[TABLE]
where
[TABLE]
for .
Proof.
First we have
[TABLE]
since for . Using Laplace transform, we can write this as
[TABLE]
Then
[TABLE]
We have and consequently
[TABLE]
By setting , we get the approximation formula
[TABLE]
where
[TABLE]
and
[TABLE]
Going backwards in (23) with (24) in mind we see that
[TABLE]
Note that the coordinate corresponds to in Lemma 4.1, and consequently we now have in the place of in the definition of (20). Hence for , and . In addition, ord for , since the function
[TABLE]
is analytic at the origin. ∎
Lemma 4.3.
We have for all .
Proof.
In the case we have
[TABLE]
by (21). The claim clearly holds due to the definition of .
Next write
[TABLE]
where
[TABLE]
By (22) it is sufficient to show that for . By (26) we have
[TABLE]
where implies , thus giving the result. ∎
5. Determinant
In order to fulfil the determinant condition (7) we choose
[TABLE]
i.e. for , and . Now . Then we write
[TABLE]
for all .
The non-vanishing of the determinant follows from the next well-known lemma (see for example Mahler [11, p. 232] or Waldschmidt [16, p. 53]).
Lemma 5.1.
There exists a constant such that
[TABLE]
Proof.
According to Theorem 4.2 and the equations in (28), the degrees of the entries of the matrix defining are
[TABLE]
We see that and the leading coefficient is a product of the leading coefficients of , which are non-zero.
On the other hand, column operations yield
[TABLE]
as . By Theorem 4.2, the order of each element in columns is at least . Therefore . ∎
6. Common factors
From now on we set for and denote
[TABLE]
for , and
[TABLE]
for , . Then, by Theorem 4.2, we have a system of linear forms
[TABLE]
where
[TABLE]
for , and
[TABLE]
for , .
Further, by Lemma 4.3 holds for all , . Next we try to find a common factor from the integer coefficients of the new polynomials .
Let in this section. We will also need the -adic valuation and its well-known property
[TABLE]
(for reference, see [9]).
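The display of property (32) is missing from this copy. A standard property of the $p$-adic valuation that is used for extracting common factors from factorials is Legendre's formula $v_p(n!) = \sum_{i \ge 1} \lfloor n/p^i \rfloor$, and the sketch below assumes that (32) is of this kind. The function names are hypothetical.

```python
from math import factorial


def valuation(k, p):
    """p-adic valuation of a non-zero integer k: the largest v with p**v | k."""
    if k == 0:
        raise ValueError("valuation of 0 is infinite")
    count = 0
    while k % p == 0:
        k //= p
        count += 1
    return count


def legendre_valuation(n, p):
    """v_p(n!) computed by Legendre's formula: sum of floor(n / p**i).

    Assumption: this is the 'well-known property' the text refers to.
    """
    total = 0
    power = p
    while power <= n:
        total += n // power
        power *= p
    return total


if __name__ == "__main__":
    for n, p in [(10, 2), (100, 5), (50, 7)]:
        print(n, p, legendre_valuation(n, p), valuation(factorial(n), p))
```

The point of the formula is that the valuation of a factorial, and hence of quotients of factorials such as those appearing in the coefficients, can be read off without computing the factorial itself.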
Theorem 6.1.
For , we have
[TABLE]
Proof.
Let us start by writing the polynomial from (30) in a different way, using the representation (21):
[TABLE]
where and, by (20),
[TABLE]
So
[TABLE]
Here because
[TABLE]
So, we may expect some common factors from the terms .
Let be a prime number. Now, using (32),
[TABLE]
Recall from (27) that while for . Since the result (34) can be written as
[TABLE]
So, there is a factor
[TABLE]
which is a common divisor of all the coefficients of . The proof is complete. ∎
Now we need to find a common factor dividing all .
Theorem 6.2.
Assume . Then there exists a positive integer
[TABLE]
with
[TABLE]
satisfying
[TABLE]
for all .
Proof.
From our assumption , and equations (25) and (33) it follows
[TABLE]
where
[TABLE]
So
[TABLE]
As before, we may expect some common factors from the terms
[TABLE]
Let . With considerations similar to those in (34), we get
[TABLE]
∎
Combining Theorems 6.1 and 6.2 gives us the complete result:
Corollary 6.3.
For all we have
[TABLE]
Theorem 6.4.
Let . Then the common factor satisfies the bound
[TABLE]
where
[TABLE]
and . Further,
[TABLE]
and asymptotically we have
[TABLE]
Proof.
We begin with the estimate of Theorem 6.2:
[TABLE]
Then
[TABLE]
We use the estimate
[TABLE]
since we are estimating a common divisor of all . Next we use the property (32) and the assumption in order to estimate :
[TABLE]
Altogether where
[TABLE]
proving the estimate (35).
Next we study the bound (37). Let be fixed, then
[TABLE]
when . To prove (39) above we differentiate the function :
[TABLE]
since when . Next write
[TABLE]
Then
[TABLE]
Thus, the bound together with (39) verifies the estimate
[TABLE]
Hence, by restricting the sum to primes , we get
[TABLE]
On the other hand,
[TABLE]
This proves the asymptotic behaviour (38). As for the numerical value in (38), see the sequence A138312 in [15]. ∎
With , for instance (36) gives
[TABLE]
Note that to simplify numerical computations for large , the estimate (37) is already rather sharp, where in addition the factor is very close to 1.
Lemma 6.5.
It holds that for all .
Proof.
[TABLE]
We choose, for example, which is equivalent to . Then
[TABLE]
Now
[TABLE]
when . In (42) we have an increasing lower bound for , and therefore
[TABLE]
when . As for , the estimate is quickly verified using Sage [14] and estimate (37).
∎
7. Numerical linear forms
By extracting the common factor from the linear forms (29) we are led to the numerical linear forms
[TABLE]
where
[TABLE]
are integers and
[TABLE]
Note that and .
According to (27), now . We have for , . Because of the condition (13) we have the assumption .
The following two lemmas give the necessary estimates for the coefficients and the remainders of the linear forms (43). In the subsequent estimates we shall use Stirling’s formula (see e.g. [1], formula 6.1.38) in the form
[TABLE]
Then
[TABLE]
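Formula 6.1.38 of [1] states that $n! = \sqrt{2\pi}\, n^{n+1/2} \exp\left(-n + \frac{\theta}{12n}\right)$ with some $0 < \theta < 1$. The displayed form (44) is missing from this copy and may be arranged differently, so the following is only a numerical sanity check of the cited formula itself; the name `stirling_theta` is hypothetical.

```python
import math


def stirling_theta(n):
    """Solve n! = sqrt(2*pi) * n**(n + 1/2) * exp(-n + theta/(12*n))
    for theta, working with logarithms to avoid overflow.
    """
    log_main = 0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n
    # math.lgamma(n + 1) equals log(n!) for positive integers n.
    return 12 * n * (math.lgamma(n + 1) - log_main)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, stirling_theta(n))
```

The computed values of $\theta$ stay strictly between 0 and 1, which is what allows Stirling's formula to be used as an explicit two-sided bound rather than as an asymptotic statement.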
Lemma 7.1.
Let when , and when . We have
[TABLE]
for and . When , we have the bounds
[TABLE]
Proof.
The structure of the proof is the following: First we treat the term by using the formulas given for it to obtain a bound. Then we factor out the common factor of , yielding . Finally, the bound is then the bound for .
By (30) we have
[TABLE]
Let us split the integral into the following pieces:
[TABLE]
When ,
[TABLE]
Hence, we may estimate
[TABLE]
Write . Now , which has a unique zero at . The function is increasing for and decreasing for . Let us now estimate the integrals. The function obtains its maximum at , and we may thus estimate
[TABLE]
On the interval , the function is decreasing. Our aim is to find an upper bound for the integral using a geometric sum. Let us first write
[TABLE]
Notice that , when . It follows that . Hence
[TABLE]
Finally, we have to estimate the first integral. We have
[TABLE]
for . Now
[TABLE]
Hence
[TABLE]
when . When , we can estimate
[TABLE]
We may conclude that
[TABLE]
Next we take into account the common factor estimated by . Remember that will be the expression that is obtained when is divided by the common factor. Now
[TABLE]
Since and , we have
[TABLE]
and
[TABLE]
At last, estimate (46) with (47) and (48) yields
[TABLE]
When , we have and . Now
[TABLE]
When , we have and we need to divide by the common factor to get the correct bound for the term . Hence
[TABLE]
When , it holds and again we have to divide by the common factor to get the correct bound for the term . Hence
[TABLE]
In all three cases, the coefficient of is a decreasing function in . ∎
Lemma 7.2.
Let when , and when . We have
[TABLE]
for , , and . When , we have the bounds
[TABLE]
Proof.
We proceed as in the proof of Lemma 7.1: first we bound the terms , then sum them, and finally factor out the common divisor to obtain the bound . According to equation (31), we have the representation
[TABLE]
The expression attains its maximum in the interval for the first time when , so
[TABLE]
Thus we may estimate
[TABLE]
when . Using the estimate , and summing together the terms , we get
[TABLE]
Again we divide by the common factor . Thus the new values satisfy:
[TABLE]
Now
[TABLE]
[TABLE]
and
[TABLE]
because and . Together these estimates yield
[TABLE]
When , we can bound the terms in the following way:
[TABLE]
We may now move to estimating the sums for small . When , we have and
[TABLE]
When , we have and we need to divide to remove the common factors. Then
[TABLE]
When , we have and again we divide by the common factor. Thus
[TABLE]
Again, in all three cases, the coefficient of is a decreasing function in . ∎
8. Measure
We will apply Lemma 3.2. The determinant condition (7) is certainly satisfied by Lemma 5.1 and (43). According to Lemmas 7.1 and 7.2, we have and , where
[TABLE]
[TABLE]
for all , . Comparing formulas (50) and (51) to (8) and (9), we have
[TABLE]
Now, with , the formulas in (10) give
[TABLE]
for all . Recall also the shorthand notations and .
For the small values , we compare equations (45) and (49) to (8) and (9). Again and , and moreover
[TABLE]
Hence, with and for , we get
[TABLE]
We may thus finally establish our main result, Theorem 2.1:
Proofs of Theorem 2.1 and Corollary 2.2.
The values above have been achieved with the choice . Combining them with Lemma 3.2 leads straight to the result (5). Corollary 2.2 follows likewise by plugging these values into Corollary 3.4. ∎
Estimate (3) still requires a bit more work.
Proof of Theorem 1.1.
Let us first consider the case with . According to Lemma 3.2, we have
[TABLE]
where
[TABLE]
by estimate (19). By recalling our assumption , it is obvious from the expression (54) that the terms corresponding to the parameters and contribute much less than the term corresponding to the parameter . The first task is to bound them in such a way that they only slightly increase the constant term in the expression for the parameter . Let us start with the terms and . We have
[TABLE]
Since , we may estimate
[TABLE]
Hence, the estimate becomes
[TABLE]
We have now derived
[TABLE]
When , the above formulation gives
[TABLE]
When , we proceed as follows. Notice now that roughly estimating we have because and . Furthermore, , when . Since , we have now derived the inequality
[TABLE]
When , let us take a closer look at
[TABLE]
When , define the value using the same formula but with in place of . Before moving any further, notice that the value of the expression can be estimated, and compared against the value of when . The calculations are performed by Sage [14]. The values of both functions are presented in the following table:
[TABLE]
It is evident from these values that when . Actually, when , the coefficient could be replaced by the better coefficient .
We have thus shown when , and the proof is ready for . For the rest of the proof we assume that meaning also that . Let us continue by writing
[TABLE]
where
[TABLE]
and
[TABLE]
First we show that . This claim is equivalent to
[TABLE]
which is true when because and . We still need to prove that
[TABLE]
Let us now look at the second term in the numerator of . First take a look at the ratio
[TABLE]
We have
[TABLE]
since
[TABLE]
Thus, we have
[TABLE]
Let us now prove that
[TABLE]
This is done by showing that
[TABLE]
because then
[TABLE]
Notice first that
[TABLE]
so we have to show that
[TABLE]
This is equivalent to . When , the right hand side of the inequality is at least , since . The inequality
[TABLE]
is true when , and hence for all integer values . The proof is complete for .
Let us now move to the small values of . We use estimate (54) with the values in (52) and (53). When , we have , , and hence
[TABLE]
When , we have , , and hence
[TABLE]
When , we have , , and hence
[TABLE]
∎
9. Sparse polynomials
The method presented in this paper is well suited to obtaining bounds for sparse polynomials of , namely, polynomials which have a considerable number of coefficients equal to zero. Let the pairwise different non-negative integers be the exponents of the sparse polynomial .
Theorem 9.1.
Let be a polynomial with at most non-zero coefficients, and of degree , where . Suppose . Then the bound
[TABLE]
holds for all with , where the constant for all , and when .
Proof.
This boils down to estimating the size of the terms and . We use the polynomial expression . Now the polynomial in question is , where are the exponents of the polynomial, so for all . Furthermore, we know that with the exception of one index, in which case it is . We may assume that the index in question is , namely, that the terms , and correspond to the polynomials with . Furthermore, we assume .
Let us now estimate the size of the polynomial using the same method as earlier. We have
[TABLE]
and we need the value at . First the integral needs to be split into integrals over the intervals , and . Let us start by looking at the first integral. We have
[TABLE]
Next we estimate the integral on the interval . Now
[TABLE]
Let us now look at the function . We have
[TABLE]
when . Hence, the integral can be estimated to be
[TABLE]
Finally, let us estimate the third integral
[TABLE]
Again, we use the function . Since this function obtains its maximum at , it is decreasing when . We also have . Hence, we may estimate
[TABLE]
Let us estimate the ratio between consecutive terms:
[TABLE]
The third integral can thus be estimated as a geometric sum:
[TABLE]
Hence,
[TABLE]
Since , we have
[TABLE]
and since the function peaks at , we have
[TABLE]
Therefore,
[TABLE]
We need to write the estimate as an exponential function. Using (44) we get
[TABLE]
Since and
[TABLE]
we have
[TABLE]
Let us now estimate the terms . They have the following integral representations:
[TABLE]
We obtain
[TABLE]
Now we need to write this as an exponential function:
[TABLE]
Now
[TABLE]
Comparing the above to (8) and (9), we get
[TABLE]
and by (10),
[TABLE]
Next we sum together the terms arising from the terms and . We may estimate (see (54))
[TABLE]
since , , and
[TABLE]
Now we can combine this term with the term coming from the term :
[TABLE]
Let us start by eliminating the last term with the term . Notice that
[TABLE]
Hence, it suffices to show that . This is easy to do. Notice first that the function is increasing when , so we may estimate
[TABLE]
while . Thus
[TABLE]
Finally, is always at most (the biggest value for ) and it is decreasing. When , the value of this expression is at most . Computations are performed by Sage [14]. ∎
As a corollary of the bound obtained for sparse polynomials, we get the following transcendence measure for an arbitrary integer power of $e$:
Corollary 9.2.
Assume and . Then the bound
[TABLE]
where
[TABLE]
holds for all with , and as in the previous theorem.
Proof.
Notice that now and . Substituting these values into the previous theorem immediately yields the result. ∎
Acknowledgements
We are indebted to the anonymous referees for their critical reading and helpful suggestions.
References
- [1] M. Abramowitz, I. A. Stegun: Handbook of mathematical functions with formulas, graphs, and mathematical tables, in: National Bureau of Standards Applied Mathematics Series, vol. 55, Washington, D.C., 1964.
- [2] A. Baker: Transcendental Number Theory, Cambridge University Press, Cambridge, 1975.
- [3] É. Borel: Sur la nature arithmétique du nombre $e$, C. R. Acad. Sci. Paris 128 (1899), 596–599.
- [4] N. I. Fel’dman, Yu. V. Nesterenko: Transcendental Numbers, Number Theory IV, 1–345, in: Encyclopaedia Math. Sci., vol. 44, Springer, Berlin, 1998.
- [5] J. Hančl, M. Leinonen, K. Leppälä, T. Matala-aho: Explicit irrationality measures for continued fractions, J. Number Theory 132 (2012), 1758–1769.
- [6] M. Hata: Remarks on Mahler’s Transcendence Measure for $e$, J. Number Theory 54 (1995), 81–92.
- [7] Ch. Hermite: Sur la fonction exponentielle, C. R. Acad. Sci. 77 (1873), 18–24, 74–79, 226–233, 285–293.
- [8] D. S. Khassa, S. Srinivasan: A transcendence measure for $e$, J. Indian Math. Soc. 56 (1991), 145–152.
