Rational approximations to the zeta function
Keith Ball

TL;DR
This paper introduces a sequence of rational functions that converge to the zeta function, with simple matrix-based numerators and denominators, offering a potential spectral approach to the Riemann hypothesis.
Contribution
It presents a novel rational approximation framework for the zeta function, linking it to spectral problems and enabling potential quantitative analysis.
Findings
Rational functions converge to the zeta function locally uniformly.
Numerators and denominators are characteristic polynomials of simple matrices.
The approach relates to spectral problems similar to Connes' analysis.
Abstract
This article describes a sequence of rational functions which converges locally uniformly to the zeta function. The numerators (and denominators) of these rational functions can be expressed as characteristic polynomials of matrices that are on the face of it very simple. As a consequence, the Riemann hypothesis can be restated as what looks like a rather conventional spectral problem but which is related to the one found by Connes in his analysis of the zeta function. However the point here is that the rational approximations look to be susceptible of quantitative estimation.
Abstract
This article describes a sequence of rational functions which converges locally uniformly to the zeta function. The numerators (and denominators) of these rational functions can be expressed as characteristic polynomials of matrices that are on the face of it very simple. As a consequence, the Riemann hypothesis can be restated as what looks like a rather conventional spectral problem but which is related to the one found by Connes and by Berry and Keating. However the point here is that the rational approximations look to be susceptible of quantitative estimation.
Introduction
This article describes a sequence of rational functions that converges locally uniformly to at least to the right of the line . The sequence begins
[TABLE]
[TABLE]
We define the sequence as follows. For each integer we define
[TABLE]
and the coefficients by
[TABLE]
These coefficients are rescaled Stirling numbers of the first kind but it is more convenient for us to use the different indexation and scaling here. We then set
[TABLE]
where the are the usual Bernoulli numbers given by the generating function
[TABLE]
and
[TABLE]
The rational functions in question are the ratios
[TABLE]
For example
[TABLE]
and
[TABLE]
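The two families of numbers that enter these definitions can be generated exactly. The following Python sketch is an illustration and not part of the article. It assumes the convention z/(e^z - 1) = sum_j B_j z^j / j! for the Bernoulli numbers (so that B_1 = -1/2), and it produces the unsigned Stirling numbers of the first kind in their standard normalisation, not the rescaled coefficients used in this article. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n):
    """Return [B_0, ..., B_n] for z/(e^z - 1) = sum_j B_j z^j / j! (so B_1 = -1/2)."""
    B = [Fraction(0)] * (n + 1)
    B[0] = Fraction(1)
    for m in range(1, n + 1):
        # Standard recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1.
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B

def unsigned_stirling_first(m):
    """Return [c(m,0), ..., c(m,m)] where x(x+1)...(x+m-1) = sum_k c(m,k) x^k."""
    coeffs = [1]  # the empty product
    for i in range(m):
        # Multiply the current polynomial by (x + i).
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += i * c
            new[k + 1] += c
        coeffs = new
    return coeffs

print(bernoulli_numbers(6))
print(unsigned_stirling_first(4))
```

Exact rational arithmetic is used because the Bernoulli numbers alternate in sign and grow quickly, so floating point would lose accuracy in the recurrence.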
The main theorem of the article is as follows.
Theorem 1
For each set
[TABLE]
where the are the usual Bernoulli numbers and
[TABLE]
where
[TABLE]
Then
[TABLE]
locally uniformly on with the obvious convention at .
It is immediate that the ratio interpolates at the points and has a simple pole with residue 1 at . We shall show that the numerators (and denominators) of these rational functions can be expressed as characteristic polynomials of matrices that are on the face of it very simple. One way to state this is that the numerator of the function is the determinant of
[TABLE]
The real content of the Riemann hypothesis is that has no zeroes to the right of the critical line. The customary formulation is equivalent by virtue of the functional equation proved by Riemann. In order to show that a holomorphic function has no zeroes in (say) a half-plane it suffices to express the function as a locally uniform limit of holomorphic functions with no zeroes there.
As a consequence, the Riemann hypothesis can be restated as what looks like a rather conventional spectral problem: to show that the spectra of certain matrices stay to the left of the critical line, at least asymptotically as . Needless to say, I spent some time thinking about this problem without success but am certain that I have not exhausted the possible lines of attack. Even if the simplicity of these matrices is indeed an illusion, as one would expect, there are concrete reasons to think that these rational approximations to zeta might be useful for estimating the size of zeta, in the sense of the Lindelöf hypothesis: see for example the book of Patterson [P].
Pólya suggested that the Riemann hypothesis should be proved by expressing the zeroes of zeta (rotated onto the real line) as eigenvalues of a self-adjoint operator. (The statement is often credited to Hilbert: it was Terry Tao who pointed out the mistake to me.) A number of candidates for such operators have been proposed, coming mainly from quantum theory. The best known of these were found by Connes [C] and by Berry and Keating [BK]. There is a connection between these infinite-dimensional operators and the finite-dimensional ones described here, which will be explained briefly in Section 5. This became apparent to me from the very readable article of Lachaud [L]. My hope is that the finite-dimensional operators, and the rational functions they correspond to, are susceptible of quantitative estimation that would not make sense for the infinite-dimensional operators.
It is well known that the zeroes of zeta should be modelled by the eigenvalues of certain random matrices. This originated in the work of Dyson and Montgomery [M] and was experimentally confirmed by remarkable calculations of Odlyzko [O]. Katz and Sarnak extended the model to other L-functions [KaSa]. In the past two decades a huge amount of work has been done on this connection, in particular by Keating, Snaith and their collaborators [KeSn]. While strictly speaking this is only indirectly related to the results in this article, the random model is clearly a crucial inspiration.
In order to prove the convergence of to we shall show that
[TABLE]
and
[TABLE]
where is the partial sum of the harmonic series. We can get a sense of why this is, quite easily. Using the following variant of Kronecker’s formula
[TABLE]
it is easy to show that for
[TABLE]
If is close to zero then is approximately
[TABLE]
So the sum over in equation (3) is approximately
[TABLE]
Thus for small values of the integrand in equation (3) is approximately
[TABLE]
If the approximation were good for all between [math] and then would be close to
[TABLE]
The last integral plainly converges to
[TABLE]
as provided and the latter is easily seen to be .
Our first aim will be to show that indeed
[TABLE]
locally uniformly for (not just ) as . This looks like a tall order. Crossing the pole at is not the problem. The difficulty is that unless is very close to 0, the expressions
[TABLE]
involve values of at points well outside the interval , where looks nothing like a negative exponential. Indeed is a divided difference of and consequently equal to for some between and that is not easily specified. Since oscillates repeatedly on the interval it would seem that could be very large in size and of more or less random sign. So the following lemma comes as something of a shock.
Lemma 2
If is a non-negative integer and then for each integer and each
[TABLE]
It is trivial to check that
[TABLE]
for all , so the lemma shows that for each the form a partition of unity on and thus automatically controls the sizes of the as well as their signs. Once the lemma is established the convergence proof is fairly straightforward: this will be the content of Section 1 below.
The obvious way to prove the Kronecker formula (2) mentioned above is to use the expansion
[TABLE]
that already appeared in the integral formula for . So it might be logically more reasonable to define the by using the formula
[TABLE]
and simply avoid mention of the Bernoulli numbers. However it seemed a little odd to define a rational function with known poles and residues as the analytic continuation of an integral.
The convergence proof just alluded to relies on the fact that the are defined as sums which we can pass through integral signs. The point of the second section of the article will be to provide a bridge between the definition of the and their representation as characteristic polynomials: in other words to represent the as something more like a product than a sum. The main formula in Section 2 is the following recurrence for the :
Lemma 3
For each non-negative integer
[TABLE]
Thus
[TABLE]
and so on. If we treat the first of these relations as a linear system for the values we can express the fact that by the vanishing of a certain determinant. In Section 3 we shall show that this determinant can be written as
[TABLE]
where as mentioned earlier is the Toeplitz matrix
[TABLE]
and is the matrix
[TABLE]
If we set the determinant becomes (apart from a factor of ). The Riemann hypothesis would follow if this determinant vanishes only at points with or equivalently that the matrix has spectral radius at most 1, where is the identity matrix. For small values of this is true. In the first version of this article I stated that there are good reasons to believe that the zeroes of the do leak across the critical line (and then come back again). Pace Nielsen quickly confirmed that when there is a zero to the right of the critical line. He also informed me that when there is a zero with real part larger than 1 (which I had not expected). The Riemann hypothesis is equivalent to the statement that the spectral radius of is at most as .
In Section 4 of the article I shall explain why I think that the approximations might be useful to estimate the size of zeta. The main point is that whereas approximations to zeta that are sums of powers oscillate wildly all the way up the critical line, a polynomial of degree cannot oscillate too often.
Whenever one has a new sequence of approximations to zeta it is natural to ask whether they can help to prove Diophantine properties (irrationality or transcendence) of values of the zeta function and most especially Euler’s constant
[TABLE]
Since the approximations described here are rational functions (with integer coefficients) they do provide rational approximations to Euler’s constant but for this particular “value” of zeta the approximations are not new.
1 The key lemma and convergence
In the introduction we defined, for each , , and for each
[TABLE]
Note that the sum makes sense and is zero if or . We introduced the function as a rational function,
[TABLE]
and also mentioned the Kronecker formula
[TABLE]
It is a consequence of standard properties of the binomial coefficients that for all between 0 and , the sum on the right is unchanged if the upper limit is increased from to . This implies that
[TABLE]
For the sum over can be written as
[TABLE]
and so for we have
[TABLE]
From now on we use to denote .
The aim of this section is to prove the following theorem.
Theorem 4
[TABLE]
[TABLE]
locally uniformly for (with the obvious convention at ).
In view of the fact that has no zeros this theorem clearly implies the main convergence theorem of the article Theorem 1.
It is easy to check that for
[TABLE]
If we set for , then since ,
[TABLE]
Therefore
[TABLE]
Most of the effort in proving Theorem 4 will go into showing that the truncated functions converge to
[TABLE]
on with the convergence dominated by a negative exponential function. It is clear that for each fixed , and hence that for each fixed and
[TABLE]
as . We need to establish two types of dominance: one to confirm that the sum
[TABLE]
converges pointwise in to and one to check that this convergence is dominated on .
For almost every estimate we make it is essential to have the key lemma stated in the introduction:
Lemma 2
If is a non-negative integer and then for each integer and each
[TABLE]
We also need a simple property of the divided differences that depends only upon the fact that is a polynomial of degree at most .
Lemma 5
If is a non-negative integer then for every ,
[TABLE]
In particular for
[TABLE]
*Proof.* We shall confirm that for any polynomial of degree at most
[TABLE]
and in checking this we may assume that . So our aim is to verify that for each such
[TABLE]
It suffices to check this for each polynomial of the form
[TABLE]
with . The internal sum vanishes if has degree less than and hence it vanishes for if . It also vanishes if because of the form of . The only remaining case is and in that case the internal sum has value . So the double sum is
[TABLE]
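The vanishing used in this proof is an instance of a standard fact about finite differences, which can be checked numerically. The sketch below is an illustration under an assumption: it takes the internal sum to be an alternating binomial sum of the form sum_k (-1)^k C(n,k) q(k), which may differ in indexing from the sum in the article. Such a sum is zero for every polynomial q of degree less than n and equals (-1)^n n! times the leading coefficient when q has degree exactly n.

```python
from math import comb, factorial

def alternating_binomial_sum(q, n):
    """Return sum_{k=0}^{n} (-1)^k C(n, k) q(k), an n-th finite difference of q."""
    return sum((-1) ** k * comb(n, k) * q(k) for k in range(n + 1))

n = 5
# A polynomial of degree 4 < n: the sum vanishes.
low_degree = alternating_binomial_sum(lambda x: 3 * x**4 - 2 * x + 7, n)
# The monomial x^n: the sum is (-1)^n n!.
top_degree = alternating_binomial_sum(lambda x: x**n, n)

print(low_degree, top_degree, (-1) ** n * factorial(n))
```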
The proof of Lemma 2 involves the introduction of an additional parameter as follows. For each define
[TABLE]
and
[TABLE]
Observe that and . So Lemma 2 follows from:
Lemma 6
If is a non-negative integer, is an integer, and
[TABLE]
*Proof.* We use induction on . When , is zero unless in which case it is 1. We claim that for
[TABLE]
Once this is established the inductive step is clear because we can assume that and for the given range of and , the number is also at least 0.
Now for any and
[TABLE]
and so
[TABLE]
[TABLE]
where the last step follows from the fact that for all , and ,
[TABLE]
By combining Lemmas 2 and 5 we can immediately make some estimates for the that will give us part of the dominance we need to get convergence.
Lemma 7
For each , each and each .
[TABLE]
and for each
[TABLE]
*Proof.* For the first one we apply Lemma 5 with and use the positivity of the to deduce that for each
[TABLE]
For the second one we observe that
[TABLE]
As already remarked it is clear that for each fixed , and hence that for each fixed and
[TABLE]
as . From Lemma 7 we have
[TABLE]
so the convergence in (5) is dominated (on the space of non-negative integers with counting measure) by a sequence summable against . Hence for each
[TABLE]
We have that
[TABLE]
In order to use dominated convergence on we need an estimate for which we prove by introducing another extra parameter. For each and define
[TABLE]
and observe that .
Lemma 8
For each and
[TABLE]
*Proof.* As long as we have
[TABLE]
where
[TABLE]
This function is holomorphic on the plane apart from its pole at . Its derivatives at 0 are successively
[TABLE]
[TABLE]
[TABLE]
and so on and therefore by Lemma 5 its power series expansion at 0 is
[TABLE]
Therefore, for
[TABLE]
The outermost identity continues analytically so it holds for all . Therefore
[TABLE]
Now we can estimate as follows. Using the key lemma we have an inequality
[TABLE]
provided . Then by Lemma 8
[TABLE]
So by induction
[TABLE]
and we get the negative exponential dominance
[TABLE]
on the range of integration . This suffices to guarantee that for
[TABLE]
We wish to cross the pole and so we need to modify the integrand. For we have
[TABLE]
The last integrand behaves like near 0 so the integral converges locally uniformly for and represents the holomorphic function on this larger region.
As in the introduction we set
[TABLE]
and observe that for
[TABLE]
So on this half-plane
[TABLE]
Now for each fixed and as long as so for
[TABLE]
Therefore, still only for , we have
[TABLE]
The integrand is dominated as by but also, as by owing to Lemma 7. Moreover, and both have residue 1 at so the difference is holomorphic for . So the integral represents on the larger region and also converges to on this region.
To complete the proof of Theorem 4 it suffices to prove the (easy) second assertion of the theorem:
[TABLE]
locally uniformly for . We have that for ,
[TABLE]
because . The latter integral converges as long as so it represents on the larger region. It suffices to show that
[TABLE]
for and that the convergence is dominated by a negative exponential.
Observe that is decreasing on so is positive for . On the other hand
[TABLE]
Thus for
[TABLE]
giving the required dominance. Also
[TABLE]
The first term is at most and tends to 0 while the second term behaves like and tends to . This establishes Theorem 4.
Finally we have that the ratios
[TABLE]
converge locally uniformly to for . My guess is that they do so on the entire complex plane.
2 The bridge from sum to determinant
The purpose of this section is to establish the recurrence
Lemma 3
For each non-negative integer
[TABLE]
which will enable us to express the numerator of as a determinant. A similar recurrence holds for the functions : if
[TABLE]
A small modification of the proof below actually yields this as well.
We begin with a simple remark.
Lemma 9
For each non-negative integer we have
[TABLE]
Proof
[TABLE]
All the vanish at apart from since they involve only values of at the integers . By Lemma 5 the add up to 1 so . So
[TABLE]
Now for the proof of Lemma 3.
*Proof.* The two sides have the same limits at infinity so it suffices to check that they have the same residues at each of the points . At the residue on the left is while the residue on the right is
[TABLE]
It thus suffices to check that for each ,
[TABLE]
(The case is also obvious.) Multiplying by and summing over it is enough to check that
[TABLE]
Both sides are polynomials of degree so we need only check the values at . Let be one of these integers. and it is easy to check that
[TABLE]
On the other hand
[TABLE]
because vanishes at if . The latter expression is
[TABLE]
[TABLE]
and the beta integral gives the appropriate reciprocal of the binomial coefficient.
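The beta integral identity invoked here can be verified exactly. The sketch below is an illustration, with the exponents k and n - k chosen as an assumption about the form of the integral in the article. It checks that the integral of t^k (1-t)^(n-k) over [0, 1] equals 1/((n+1) C(n,k)), by expanding (1-t)^(n-k) with the binomial theorem and integrating term by term.

```python
from fractions import Fraction
from math import comb

def beta_integral(k, n):
    """Exact value of the integral of t^k (1-t)^(n-k) over [0, 1], for 0 <= k <= n."""
    # (1-t)^(n-k) = sum_i (-1)^i C(n-k, i) t^i, and the integral of t^(k+i) is 1/(k+i+1).
    return sum(
        Fraction((-1) ** i * comb(n - k, i), k + i + 1)
        for i in range(n - k + 1)
    )

for n in range(5):
    print([beta_integral(k, n) * (n + 1) * comb(n, k) for k in range(n + 1)])
```

Each printed entry is the product of the integral with (n+1) C(n,k), so the identity holds exactly when every entry equals 1.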
The recurrence relation given by Lemma 3 describes a dynamical system for the sequence . Numerically this system appears to evolve very slowly and indeed the convergence of the sequence is very slow. This makes the approximations useless for effective calculation of the value but suggests that it might be possible to track the dynamical system: it is almost a continuous-time system.
3 The spectrum
From the previous section we have that for each
[TABLE]
The first of these relations give us the linear system
[TABLE]
So can be written as the ratio of two determinants. The denominator is the determinant of the matrix on the left, namely . The numerator is the determinant of the matrix obtained by replacing the last column of the original matrix with the vector . It will be more convenient to move this vector to the first column of the matrix thus introducing a factor of into the determinant, and to change the sign of all the other columns, thus removing the factor again. Then the numerator can be written
[TABLE]
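The ratio of determinants used above is Cramer's rule applied to the last unknown, and the column move is a cyclic shift that changes the sign by (-1)^(n-1) for an n by n matrix. The following Python sketch illustrates both steps on a small hypothetical lower triangular system. The matrix A and vector b are invented for the example and are not the matrices of this article.

```python
from fractions import Fraction

def det(rows):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]  # a row swap flips the sign
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return sign * result

def last_unknown_by_cramer(A, b):
    """Last component of the solution of A x = b, via Cramer's rule."""
    replaced = [row[:-1] + [bi] for row, bi in zip(A, b)]
    return det(replaced) / det(A)

A = [[2, 0, 0], [1, 3, 0], [4, 1, 5]]  # hypothetical lower triangular system
b = [2, 7, 20]
x_last = last_unknown_by_cramer(A, b)

# Moving the replaced column from last place to first place is a cyclic shift
# of n columns, which multiplies the determinant by (-1)^(n-1).
replaced = [row[:-1] + [bi] for row, bi in zip(A, b)]
moved = [[bi] + row[:-1] for row, bi in zip(A, b)]
print(x_last, det(replaced), det(moved))
```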
Regard the columns as labelled . We leave the zero and 1 columns unchanged as and respectively. We add the 1 column to the 2 column to get
[TABLE]
We add the (new) 2 column to the 3 column and we get
[TABLE]
Continue in this way and after all the additions divide the 2 column by 2, the 3 column by 3 and so on to get
[TABLE]
where
[TABLE]
and
[TABLE]
Notice that does not have poles at because the corresponding Bernoulli numbers vanish. So the determinant in the numerator picks up the trivial zeros of zeta at these numbers: indeed the factor appears as soon as , the factor as soon as and so on.
The zeroes of are thus related in a simple way to the spectrum of . This formulation for is perhaps the most elegant one in terms of a determinant but it is interesting to express in a form in which the problem looks more like a conventional spectral problem.
To begin with we multiply column by for each but leave the zero column unchanged. This absorbs the factor appearing in (7) into the determinant. We now subtract the row from the last, the row from the and so on to produce the matrix
[TABLE]
We now add all the rows below the top one, to the top one, so that the second matrix now has a zero top row. The first matrix now has top row
[TABLE]
Since the variable now appears on the diagonal in all places except the first, our aim is to reduce the dimension by one so as to create a characteristic polynomial proper. We add multiples , and so on of the top row to the successive rows below. Since this eliminates the first column below the first row, the determinant is now the top left entry multiplied by the determinant of the remaining square. So we get
[TABLE]
where and are as follows.
[TABLE]
which is lower triangular with entry in the diagonal place and entry
[TABLE]
in the place, if . is the rank one matrix given by
[TABLE]
We have
[TABLE]
We are thus interested in the spectrum of the matrix where is a certain lower triangular matrix and has rank 1. It will be seen that this formulation is actually somewhat closer to the recurrence relation (6).
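A standard tool for a triangular matrix perturbed by a rank one matrix is the matrix determinant lemma, det(L + u v^T) = det(L) (1 + v^T L^{-1} u). The article does not invoke this lemma, so the sketch below is offered only as an illustration of how such a determinant splits into the triangular part and a scalar correction. The matrix L and the vectors u and v are hypothetical.

```python
from fractions import Fraction

def det(rows):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return sign * result

def forward_substitute(L, u):
    """Solve L y = u for an invertible lower triangular L."""
    y = []
    for i, row in enumerate(L):
        s = sum(row[j] * y[j] for j in range(i))
        y.append((Fraction(u[i]) - s) / row[i])
    return y

L = [[2, 0, 0], [1, 3, 0], [4, 1, 5]]  # hypothetical lower triangular matrix
u = [1, 2, 3]                          # hypothetical rank one factors
v = [1, -1, 2]

perturbed = [[L[i][j] + u[i] * v[j] for j in range(3)] for i in range(3)]
y = forward_substitute(L, u)           # y = L^{-1} u
lhs = det(perturbed)
rhs = det(L) * (1 + sum(vi * yi for vi, yi in zip(v, y)))
print(lhs, rhs)
```

Applied to a shifted matrix L - zI, the same identity shows that the eigenvalues of the perturbed matrix away from the diagonal entries of L are the roots of a single scalar equation in z.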
If we set
[TABLE]
then the determinant in question is
[TABLE]
The complex numbers to the right of the critical line are those for which so a natural way to tackle the spectral problem would be to try to find a norm on with the property that for every
[TABLE]
The obvious choice would be an Hilbertian norm. So we look for a positive definite matrix for which
[TABLE]
is also positive definite or alternatively one for which
[TABLE]
is positive definite.
The form of the matrix makes this very tempting. If we ignore the rank one matrix then we can certainly compute the spectrum of since it is lower triangular. However there is a natural choice of norm which shows that the spectrum is to the left of the critical line and therefore provides a more robust argument that one could try to perturb. If is the diagonal matrix with entries on the diagonal then
[TABLE]
is the matrix
[TABLE]
This is obviously positive definite because it has negative off diagonal entries and row sums that are positive because of the familiar telescoping sum
[TABLE]
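The positivity argument rests on a general fact: a symmetric matrix with non-positive off-diagonal entries and positive row sums is strictly diagonally dominant with positive diagonal, and hence positive definite. The sketch below checks this on a toy example and on a matrix that violates the row sum condition. Both matrices are hypothetical and are not the matrix of this article. Positive definiteness is tested exactly, through the signs of the pivots of Gaussian elimination without row exchanges.

```python
from fractions import Fraction

def is_positive_definite(M):
    """A symmetric M is positive definite iff every elimination pivot is positive."""
    a = [[Fraction(x) for x in row] for row in M]
    n = len(a)
    for c in range(n):
        if a[c][c] <= 0:
            return False
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return True

def has_dominant_sign_pattern(M):
    """Check symmetry, off-diagonal entries <= 0, and positive row sums."""
    n = len(M)
    pattern_ok = all(
        M[i][j] == M[j][i] and (i == j or M[i][j] <= 0)
        for i in range(n) for j in range(n)
    )
    return pattern_ok and all(sum(row) > 0 for row in M)

n = 4
# Diagonal n, off-diagonal -1: every row sum is 1.
dominant = [[n if i == j else -1 for j in range(n)] for i in range(n)]
# Diagonal n - 2, off-diagonal -1: every row sum is -1.
not_dominant = [[n - 2 if i == j else -1 for j in range(n)] for i in range(n)]

print(has_dominant_sign_pattern(dominant), is_positive_definite(dominant))
print(has_dominant_sign_pattern(not_dominant), is_positive_definite(not_dominant))
```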
As remarked earlier there are good reasons to think that the zeroes of the do leak through the critical line so that the best one can hope for is to find matrices with
[TABLE]
positive definite and .
Once one is in possession of the matrices and one could confirm that they yield the in a “direct” way. Diagonalise as and check that when you apply and to the components of you recover the Bernoulli and Stirling numbers. Such an argument would be a bit harsh on the reader since there would be little motivation for introducing these particular matrices. More importantly the dynamical system described by the recurrence relation (6) is of interest in itself.
4 Estimating the size of zeta
Numerical evidence indicates that the function differs from by only about at any point of and so we expect the ratio
[TABLE]
to provide a good approximation to at as long as is as large as . This happens if is at most a bit less than . In fact, numerical evidence and rough calculations indicate that the ratio is not too far from for all the way up to a multiple of . At the same time there are good reasons to think that does not oscillate significantly for larger than . So we have the tantalising possibility that the two regions overlap: the region where tells us about and the region where is smooth enough to be estimated.
This discussion suggests that one should look at the asymptotic expansion for which starts off
[TABLE]
where the coefficient grows logarithmically with , the next coefficient like and so on. However my feeling is that the more promising approach is the “usual” one: to look at an integral (say)
[TABLE]
and move the contour into the region where is very small if with large.
For the genuine integral
[TABLE]
this approach is hopeless because the contour is forced up against the imaginary axis and hence picks up the poles of
[TABLE]
Being a polynomial, has no poles so the issue does not arise. The problem is to estimate off the real line.
5 The connection with the Connes, Berry-Keating operator
The articles by Connes [C] and Berry and Keating [BK] each describe an operator on an infinite-dimensional space whose spectrum shares properties of the zeros of the zeta function. The operators are formally the same but the spaces on which they are considered are different. Connes showed that all Riemann zeros that lie on the critical line correspond to eigenvalues of his operator but was not able to check this for zeros off the line (if they exist). Berry and Keating showed that the mean density of their eigenvalues matches that of the Riemann zeros.
In Connes’ incarnation, the operator can be built from a multiplication operator and an integral operator acting on an infinite-dimensional function space as explained in the article [L]. From this one can see that the finite-dimensional operators considered in this article are sections of the Connes operator as follows.
The Toeplitz matrix given in equation (8) can be thought of as acting on polynomials rather than sequences . It does so by multiplication by the partial sum
[TABLE]
of the series for (followed by truncation back to a polynomial of degree ). In this context the upper triangular matrix in (9) maps the constant function to 0 and for each the monomial to the sum
[TABLE]
Thus for any polynomial of degree the image is
[TABLE]
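The statement that a Toeplitz matrix acts on polynomials by multiplication followed by truncation can be checked directly. The sketch below is an illustration under assumptions: it uses a lower triangular Toeplitz matrix acting on column vectors of coefficients (the orientation in the article may differ), and the series coefficients c are hypothetical.

```python
def toeplitz_lower(c):
    """Lower triangular Toeplitz matrix whose first column is c."""
    n = len(c)
    return [[c[i - j] if i >= j else 0 for j in range(n)] for i in range(n)]

def matvec(T, p):
    """Apply the matrix T to the coefficient vector p."""
    return [sum(t * x for t, x in zip(row, p)) for row in T]

def truncated_product(c, p):
    """Coefficients of c(x) p(x), discarding powers x^n and higher, n = len(p)."""
    n = len(p)
    out = [0] * n
    for i, ci in enumerate(c):
        for j, pj in enumerate(p):
            if i + j < n:
                out[i + j] += ci * pj
    return out

c = [1, 2, 3, 4]   # hypothetical series coefficients c_0, ..., c_3
p = [5, -1, 0, 2]  # the polynomial 5 - x + 2x^3, lowest degree first

print(matvec(toeplitz_lower(c), p))
print(truncated_product(c, p))
```

The two printed vectors agree, which is the finite-dimensional form of "multiply by the partial sum, then truncate".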
The finite-dimensional operators described here have several advantages.
- The determinants yield approximations to the zeta function itself (not just eigenvalues that correspond to zeros).
- Having finite-dimensional operators means that one need not worry about the space on which the matrices act: although of course we would like to find the right norm in order to prove things about the eigenvalues.
- These approximations provably do converge to zeta and so, in the limit, pick up all the Riemann zeros.
To create finite-dimensional sections of an operator by restricting and projecting onto polynomials is normally an extremely unstable thing to do unless one is working with very special normed spaces (such as of the disc). The fact that it works here is rather remarkable and may perhaps indicate that when trying to find a norm in order to check the spectrum one should start with something like .
Acknowledgements
I am extremely grateful to David Preiss for his advice during this work. My thanks also to Terry Tao and Peter Sarnak who made very helpful suggestions concerning the presentation.
References
- [C] A. Connes, Trace formula in noncommutative geometry and the zeros of the Riemann zeta function, Sel. Math., New Ser. 5 (1999).
- [BK] M.V. Berry and J.P. Keating, $H=xp$ and the Riemann zeros, in Supersymmetry and Trace Formulae: Chaos and Disorder, eds. I.V. Lerner, J.P. Keating and D.E. Khmelnitskii, Plenum Press (1999).
- [KaSa] N.M. Katz and P. Sarnak, Random matrices, Frobenius eigenvalues, and monodromy, Colloquium publications 45, American Mathematical Society, Providence, RI (1999).
- [KeSn] J.P. Keating and N.C. Snaith, Random matrix theory and $\zeta(1/2+it)$, Comm. Math. Phys. 214 (2000).
- [L] G. Lachaud, Spectral analysis and the Riemann hypothesis, J. Comp. Appl. Math. 160 (2003).
- [M] H.L. Montgomery, The pair correlation of zeros of the zeta function, Proc. Symp. Pure Math. 24 (1973).
- [O] A.M. Odlyzko, The $10^{22}$-nd zero of the Riemann zeta function, Dynamical, Spectral, and Arithmetic Zeta Functions (M. van Frankenhuysen and M.L. Lapidus, eds.), Contemporary Math., Amer. Math. Soc., Providence, RI (2001).
- [P] S.J. Patterson, An introduction to the theory of the Riemann Zeta-Function, Cambridge studies in advanced mathematics 14, CUP (1988).
