This paper extends the parametric geometry of numbers, a recent theory in Diophantine approximation, to function fields, and applies it to analyze simultaneous approximation of exponential functions.
Contribution
It introduces a novel adaptation of the parametric geometry of numbers to function fields and explores its application to exponential function approximation.
Findings
1. Extended the theory to function fields of rational functions.
2. Provided new insights into simultaneous approximation of exponential functions.
Parametric geometry of numbers is a new theory, recently created by Schmidt and Summerer, which unifies and simplifies many aspects of classical Diophantine approximations, providing a handle on problems which previously seemed out of reach. Our goal is to transpose this theory to fields of rational functions in one variable and to analyze in that context the problem of simultaneous approximation to exponential functions.
Parametric geometry of numbers in function fields
Damien Roy and Michel Waldschmidt
Abstract.
We transpose the parametric geometry of numbers, recently created
by Schmidt and Summerer, to fields of rational
functions in one variable and analyze, in that context, the
problem of simultaneous approximation to exponential
functions.
Key words and phrases:
Simultaneous approximation, parametric geometry of numbers, function fields, Minkowski successive minima, Mahler duality, compound bodies, Schmidt and Summerer n–systems, Padé approximants, perfect systems, exponential function.
Parametric geometry of numbers is a new theory, recently created
by Schmidt and Summerer [12, 13],
which unifies and simplifies many aspects of classical Diophantine
approximation, providing a handle on problems which previously
seemed out of reach (see also [11]).
Our goal is to transpose this theory to fields of rational
functions in one variable and to analyze in that context the
problem of simultaneous approximation to exponential
functions.
Expressed in the setting of [10], the theory deals
with a general family of convex bodies of the form
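\[ \mathcal{C}_{\mathbf{u}}(e^{q}) = \big\{\mathbf{x}\in\mathbb{R}^{n} \,;\, \|\mathbf{x}\|\le 1 \ \text{and}\ |\mathbf{u}\cdot\mathbf{x}|\le e^{-q}\big\} \quad (q\ge 0), \]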
where the norm is the Euclidean norm, u is a fixed unit
vector in Rn, and u⋅x denotes the scalar product
of u and x. For each i=1,…,n, let Lu,i(q)
be the logarithm of the i-th minimum of Cu(eq) with
respect to Zn, that is the minimum of all t∈R such that
etCu(eq) contains at least i linearly independent
elements of Zn. Equivalently, this is the smallest t
for which the solutions x in Zn of
\[ \|\mathbf{x}\|\le e^{t} \quad\text{and}\quad |\mathbf{u}\cdot\mathbf{x}|\le e^{t-q} \tag{1.1} \]
span a subspace of Qn of dimension at least i.
Define
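\[ \mathbf{L}_{\mathbf{u}}\colon[0,\infty)\to\mathbb{R}^{n},\quad q\mapsto\big(L_{\mathbf{u},1}(q),\dots,L_{\mathbf{u},n}(q)\big). \tag{1.2} \]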
Although the behavior of the maps Lu may be complicated
(even for n=2, see [5]), it happens that, modulo the
additive group of bounded functions from [0,∞) to
Rn, their classes are the same as those of simpler
functions called n-systems, defined as follows.
An n-system on [0,∞) is a map
P=(P1,…,Pn):[0,∞)→Rn with the
property that, for each q≥0,
(S1)
we have 0≤P1(q)≤⋯≤Pn(q) and
P1(q)+⋯+Pn(q)=q,
(S2)
there exist ϵ>0 and integers
k,ℓ∈{1,…,n} such that
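\[ \mathbf{P}(x)=\mathbf{P}(q)+(x-q)\mathbf{e}_{\ell}\ \ \text{when}\ \max\{0,q-\epsilon\}\le x\le q, \qquad \mathbf{P}(x)=\mathbf{P}(q)+(x-q)\mathbf{e}_{k}\ \ \text{when}\ q\le x\le q+\epsilon, \]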
where e1=(1,0,…,0),…,en=(0,…,0,1),
(S3)
if q>0 and if the
integers k and ℓ from (S2) satisfy k>ℓ,
then Pℓ(q)=⋯=Pk(q).
By [10, Theorems 8.1 and 8.2], there is
an explicit constant C(n), depending only on n,
such that, for each unit vector u∈Rn, there
exists an n-system P on [0,∞) such that
∥Lu(q)−P(q)∥≤C(n) for each q≥0,
and conversely, for each n-system P on [0,∞),
there exists a unit vector u∈Rn with the same
property.
Instead of Z, we work here with a ring of polynomials
A=F[T] in one variable T over an arbitrary field F.
We denote by K=F(T) its field of quotients equipped with
the absolute value given by
\[ |f/g| = \exp\big(\deg(f)-\deg(g)\big) \]
for any f,g∈A with g≠0 (using the convention
that deg(0)=−∞ and exp(−∞)=0). The role
of R is now played by the completion K∞=F((1/T))
of K with respect to that absolute value. The extension
of this absolute value to K∞ is also denoted ∣∣.
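As a concrete illustration of this absolute value, the following minimal Python sketch computes |f/g| = exp(deg f − deg g) for polynomials given as coefficient lists, and can be used to check the ultrametric inequality on examples. The list representation (lowest degree first) and the helper names `degree`, `log_abs`, `abs_value` and `poly_add` are assumptions made for this sketch, not notation from the paper.

```python
from math import exp, inf

def degree(poly):
    """Degree of a polynomial [c0, c1, ...] (lowest degree first); -inf for 0."""
    for d in range(len(poly) - 1, -1, -1):
        if poly[d] != 0:
            return d
    return -inf

def log_abs(f, g):
    """log|f/g| = deg(f) - deg(g) for polynomials f, g with g != 0."""
    if degree(g) == -inf:
        raise ZeroDivisionError("the denominator g must be non-zero")
    return degree(f) - degree(g)

def abs_value(f, g=(1,)):
    """|f/g| = exp(deg(f) - deg(g)); exp(-inf) = 0 handles f = 0."""
    return exp(log_abs(f, g))

def poly_add(f, g):
    """Coefficientwise sum of two polynomials."""
    size = max(len(f), len(g))
    f = list(f) + [0] * (size - len(f))
    g = list(g) + [0] * (size - len(g))
    return [a + b for a, b in zip(f, g)]
```

For instance, adding two polynomials of degree 2 whose leading terms cancel gives a sum of smaller absolute value, as the ultrametric inequality allows.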
We fix an integer n≥2 and still denote by (e1,…,en)
the canonical basis of K∞n. We endow K∞n with
the maximum norm
\[ \|(x_{1},\dots,x_{n})\| = \max\{|x_{1}|,\dots,|x_{n}|\}. \]
We also use the non-degenerate bilinear form on K∞n×K∞n
mapping a pair (x,y) to
\[ \mathbf{x}\cdot\mathbf{y} = x_{1}y_{1}+\cdots+x_{n}y_{n}. \]
This identifies K∞n with its dual isometrically
in the sense that
[TABLE]
for any x∈K∞n.
For a given u∈K∞n of norm 1,
for each i=1,…,n and each q≥0, we define
Lu,i(q) to be the minimum of all t≥0 for which
the solutions x in An of the inequalities
(1.1), interpreted in K∞n,
span a subspace of Kn of dimension at least i.
This minimum exists as we may restrict to values of t
in Z or in q+Z.
Then we form a map Lu:[0,∞)→Rn
as in (1.2) above. Our first main
result reads as follows.
Theorem A.
The set of maps Lu where u runs through the
elements of K∞n of norm 1 is the same as
the set of n-systems P on [0,∞) with
P(q)∈Zn for each integer q≥0.
As we will see in the next section, when q belongs
to the set N={0,1,2,…} of non-negative
integers, the numbers Lu,1(q),…,Lu,n(q)
are the logarithms of the successive minima of a
convex body Cu(eq) of K∞n with
respect to An, as defined by Mahler in
[7]. However, in terms of the inequalities
(1.1), these functions naturally
extend to all real numbers q≥0.
The proof of Theorem A is similar to that of the
previously mentioned result over Q, but much simpler
in good part because, as Mahler proved in the same paper
[7], the analog of Minkowski’s second convex body
theorem holds with an equality in that setting.
There is also the fact that the group of
isometries of K∞n is an open set in GLn(K∞)
thus in that sense much larger than the orthogonal
group of Rn. In Sections 2 and
3, we give a
complete proof of Theorem A following [10].
The fact that each map Lu is an n-system is
an adaptation of the argument of Schmidt and Summerer
in [13, Section 2]. In Section 4,
we also connect the maps Lu with the analogue
of those considered by these authors in [13].
Because of the condition (S1), an n-system
P=(P1,…,Pn) on [0,∞) mapping
N to Nn satisfies
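\[ P_{1}(q)\le\lfloor q/n\rfloor\le\lceil q/n\rceil\le P_{n}(q) \quad (q\in\mathbb{N}). \]
It happens that there is exactly one such n-system
for which
\[ P_{1}(q)=\lfloor q/n\rfloor \quad\text{and}\quad P_{n}(q)=\lceil q/n\rceil \quad (q\in\mathbb{N}). \tag{1.4} \]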
When q≡0modn, such a system necessarily has
P1(q)=⋯=Pn(q)=q/n. Figure 1
shows the union of the graphs of P1,…,Pn over
an interval of the form [mn,(m+1)n] with m∈N.
Over such an interval, the i-th component Pi
of P is constant equal to m on [mn,mn+n−i], then
increases with slope 1 on [mn+n−i,mn+n−i+1] and finally
is constant equal to m+1 on [mn+n−i+1,mn+n].
One can also characterize that system as the unique one for
which Pn(q)−P1(q)≤1 for each q≥0. Our second main
result is the following.
Theorem B.
Suppose that F has characteristic zero. Let
ω1,…,ωn be distinct elements of F,
and let
\[ \mathbf{u} = \big(e^{\omega_{1}/T},\dots,e^{\omega_{n}/T}\big) \in K_{\infty}^{n}. \]
Then, we have ∥u∥=1 and the n-system
P=Lu is characterized by the property
(1.4).
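The n-system singled out before Theorem B can be evaluated directly from its description: on [mn,(m+1)n], the i-th component equals m up to mn+n−i, rises with slope 1 over the next unit interval, and then stays at m+1. The following Python sketch implements exactly that description with exact rational arithmetic; the function name `balanced_system` is a hypothetical label chosen for this illustration.

```python
from fractions import Fraction

def balanced_system(n, q):
    """Value (P_1(q), ..., P_n(q)) of the n-system described in the text, q >= 0."""
    q = Fraction(q)
    m = q // n           # q lies in [m*n, (m+1)*n)
    r = q - m * n        # offset of q inside that interval, 0 <= r < n
    values = []
    for i in range(1, n + 1):
        # P_i is m on [mn, mn+n-i], has slope 1 on the next unit interval,
        # and is m+1 afterwards; clamp the rise to [0, 1].
        rise = min(max(r - (n - i), 0), 1)
        values.append(m + rise)
    return tuple(values)
```

One can check on examples that the components are in increasing order, add up to q, and satisfy P_n(q) − P_1(q) ≤ 1, which is the characterization quoted above.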
As we will show in Section 5, this result in fact
extends to all perfect systems of series in the sense
of Mahler-Jager [9, 4].
In 1964, A. Baker showed that, in the notation of Theorem B,
the n-tuple
(e^{ω1/T},…,e^{ωn/T})
provides a counterexample to the analogue in C((1/T)) of
a conjecture of Littlewood. In Section 6,
we generalize this result to several places of C(T).
2. Constraints on the successive minima
In this section, we prove that the maps Lu which
appear in Theorem A are n-systems. The argument
is based on the ideas of Schmidt and Summerer
in [13], but follows the presentation in
[10, §2].
2.1. Convex bodies
We fix an integer n≥1 and denote by
\[ \mathcal{O}_{\infty} = \{x\in K_{\infty} \,;\, |x|\le 1\} = F[[1/T]] \]
the ring of integers of K∞. A convex body
of K∞n is simply a free sub-O∞-module of K∞n
of rank n. This seemingly narrow notion, the
analog of a parallelotope, is explained by Mahler
in [7]. For example, the unit ball O∞n
of K∞n for the maximum norm is a convex body.
Let C be an arbitrary convex body of K∞n.
Its volume vol(C) is defined
as the common value ∣det(ψ)∣ attached
to all K∞-linear automorphisms ψ of K∞n
for which ψ(O∞n)=C.
For each i=1,…,n, the i-th minimum of C (with respect
to An) is defined as the smallest number ∣ρ∣
where ρ runs through the elements of K∞× for which
the dilated convex body
\[ \rho\,\mathcal{C} = \{\rho\mathbf{x} \,;\, \mathbf{x}\in\mathcal{C}\} \]
contains at least i linearly independent elements
of An. Since ρC depends only on the class
ρO∞× in K∞×/O∞×, we may restrict to
elements of the form ρ=Ta with a∈Z. In this context,
Mahler’s extension of Minkowski’s convex body theorem
in [7, §9], reads as follows (compare with
the version proved by J. Thunder over an arbitrary function
field in [14]).
Theorem 2.1.
For i=1,…,n, let λi=e^{μi} be the i-th minimum
of C. Then we have
\[ \lambda_{1}\cdots\lambda_{n}\,\mathrm{vol}(\mathcal{C}) = 1. \]
Moreover, there exists a basis (x1,…,xn) of An over A
such that xi∈T^{μi}C for i=1,…,n.
The last property is expressed by saying that x1,…,xn realize the successive minima
λ1≤⋯≤λn of C.
Mahler defines the dual or polar body to C by
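\[ \mathcal{C}^{*} = \big\{\mathbf{y}\in K_{\infty}^{n} \,;\, |\mathbf{x}\cdot\mathbf{y}|\le 1 \ \text{for each}\ \mathbf{x}\in\mathcal{C}\big\}. \]
This is a convex body of K∞n with vol(C∗)=vol(C)^{−1}.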
On the algebraic counterpart, for any basis
(x1,…,xn) of An, there is a dual basis
(x1∗,…,xn∗) of An characterized by
xi∗⋅xj=δi,j (1≤i,j≤n).
In [7, §10], Mahler shows the following.
Theorem 2.2.
In the notation of the previous theorem,
the successive minima of C∗ are
λn^{−1}≤⋯≤λ1^{−1},
realized by the elements of the dual basis
to (x1,…,xn) listed in reverse
order xn∗,…,x1∗.
Mahler’s original theory of compound bodies (over R)
also extends to the present
setting. To state the result, fix m∈{1,…,n} and put
N=\binom{n}{m}. We identify ⋀mK∞n with K∞N
via a linear map sending the N products
ei1∧⋯∧eim with
1≤i1<⋯<im≤n to the elements of the canonical
basis of K∞N in some order. Then, the
sub-A-module ⋀mAn of ⋀mK∞n
generated by the products v1∧⋯∧vm with
v1,…,vm∈An is identified with AN.
The m-th compound body of C, denoted
⋀mC, is the sub-O∞-module
of ⋀mK∞n spanned by the products
v1∧⋯∧vm with
v1,…,vm∈C. This is a convex
body in that space and an adaptation of the argument
of Mahler in [8] yields the following.
Theorem 2.3.
In the notation of the previous theorems,
the successive minima of ⋀mC are
the N products λi1⋯λim
with 1≤i1<⋯<im≤n, listed in monotone
increasing order. They are realized by the products
xi1∧⋯∧xim listed in
the corresponding order.
In particular, if 1≤m<n, the first two minima of
⋀mC are λ1⋯λm and
λ1⋯λmλm+1.
2.2. Isometries and orthogonality
Let n≥1 be an integer. An isometry of K∞n
is a norm-preserving K∞-linear map from K∞n to itself.
We say that subspaces V1,…,Vℓ of K∞n are
(topologically) orthogonal if
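\[ \|\mathbf{v}_{1}+\cdots+\mathbf{v}_{\ell}\| = \max\{\|\mathbf{v}_{1}\|,\dots,\|\mathbf{v}_{\ell}\|\} \]
for any choice of vi∈Vi for i=1,…,ℓ.
We write
\[ K_{\infty}^{n} = V_{1}\perp_{\mathrm{top}}\cdots\perp_{\mathrm{top}}V_{\ell} \]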
when K∞n is the direct sum of such subspaces. We say that a
finite sequence (v1,…,vℓ) of elements of V
is orthogonal if the one-dimensional subspaces
K∞v1,…,K∞vℓ that they span
are orthogonal. We say that it is orthonormal if
moreover ∥vi∥=1 for each i=1,…,ℓ. Thus a basis
(v1,…,vn) of K∞n over K∞ is orthonormal if and
only if it is a basis of O∞n as an O∞-module.
Since O∞ is a principal ideal domain,
any orthonormal sequence in K∞n
can be extended to an orthonormal basis of K∞n.
We recall that Hadamard’s inequality extends naturally
to the present setting and provides a criterion for
orthogonality.
Lemma 2.4.
Let x1,…,xm be non-zero elements of K∞n.
Then, we have
\[ \|\mathbf{x}_{1}\wedge\cdots\wedge\mathbf{x}_{m}\| \le \|\mathbf{x}_{1}\|\cdots\|\mathbf{x}_{m}\| \]
with equality if and only if (x1,…,xm)
is orthogonal.
2.3. The map Lu
Suppose n≥2, and let u∈K∞n with
∥u∥=1. We now adapt the arguments of Schmidt and
Summerer in [13, §2] to show that the
corresponding map Lu:[0,∞)→Rn
defined in the introduction is an n-system.
We first choose an orthonormal basis
(u1,…,un) of K∞n ending
with un=u. Since the dual basis
(u1∗,…,un∗) is orthonormal, we obtain
an orthogonal sum decomposition
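\[ K_{\infty}^{n} = U\perp_{\mathrm{top}}W \quad\text{where}\quad U=\langle\mathbf{u}_{1}^{*},\dots,\mathbf{u}_{n-1}^{*}\rangle_{K_{\infty}} \quad\text{and}\quad W=\langle\mathbf{u}_{n}^{*}\rangle_{K_{\infty}}. \]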
Let projW denote the projection onto W. For
each integer q≥0, we define
[TABLE]
The first equality shows that this is a convex body
of K∞n of volume e−q. The last one implies
that, for each j=1,…,n, its j-th minimum is
exp(Lu,j(q)) where Lu,j(q) is
defined in the introduction.
Now, fix an integer m with 1≤m<n. Put
N=\binom{n}{m} and M=\binom{n-1}{m-1}. We denote
by ω1,…,ωN−M the products
ui1∗∧⋯∧uim∗
with 1≤i1<⋯<im<n in some order
and by ωN−M+1,…,ωN those
with 1≤i1<⋯<im=n. Since
(ω1,…,ωN) is an orthonormal
basis of ⋀mK∞n, we deduce that
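\[ {\textstyle\bigwedge}^{m}\mathcal{C}_{\mathbf{u}}(e^{q}) = \big\{\omega\in{\textstyle\bigwedge}^{m}K_{\infty}^{n} \,;\, \|\omega\|\le 1 \ \text{and}\ \|\mathrm{proj}_{W^{(m)}}(\omega)\|\le e^{-q}\big\} \]
where the projection is taken with respect to the
decomposition
\[ {\textstyle\bigwedge}^{m}K_{\infty}^{n}=U^{(m)}\perp_{\mathrm{top}}W^{(m)} \quad\text{with}\quad U^{(m)}={\textstyle\bigwedge}^{m}U \quad\text{and}\quad W^{(m)}=\big({\textstyle\bigwedge}^{m-1}U\big)\wedge W. \]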
In particular, ⋀mCu(eq) has volume e−Mq.
For each j=1,…,N and each q≥0, we define
Lu,j(m)(q) to be the minimum of all
t≥0 for which the inequalities
\[ \|\omega\|\le e^{t} \quad\text{and}\quad \|\mathrm{proj}_{W^{(m)}}(\omega)\|\le e^{t-q} \tag{2.2} \]
admit at least j linearly independent solutions
x in ⋀mAn. When q∈N, this is
the logarithm of the j-th minimum of ⋀mCu(eq).
In general, the minimum exists because we may restrict
to values of t in Z∪(q+Z).
In the case where m=1, we have N=n and
Lu,j(1)=Lu,j for j=1,…,n.
Note that, for fixed q≥0, the points
ω1,…,ωN satisfy (2.2)
for the choice of t=q, thus
\[ 0\le L_{\mathbf{u},1}^{(m)}(q)\le\cdots\le L_{\mathbf{u},N}^{(m)}(q)\le q. \tag{2.3} \]
We also note that, for each j=1,…,N, we have
[TABLE]
Thus, Lu,1(m),…,Lu,N(m) are
continuous functions on [0,∞). We make
additional observations.
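Lemma 2.5.
For each a>0, the union of the graphs of
Lu,1(m),…,Lu,N(m) over [0,a] is
contained in the union of the graphs of finitely
many functions
\[ L_{\omega}\colon[0,\infty)\to\mathbb{R},\quad q\mapsto\max\big\{\log\|\omega\|,\ q+\log\|\mathrm{proj}_{W^{(m)}}(\omega)\|\big\} \]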
associated to non-zero points ω in ⋀mAn.
For ω∈⋀mAn∖{0} and q≥0,
the number Lω(q) is the smallest real number
t≥0 satisfying (2.2). In particular,
when q∈N, it is the smallest integer t such that
ω∈Tt⋀mCu(eq). As
this measures the distance from ω to
⋀mCu(eq) for varying q, we say that the graph of
Lω is the trajectory of ω.
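In the case m=1, the trajectory of a non-zero point
x in ⋀1An=An is the graph of the map
\[ L_{\mathbf{x}}\colon[0,\infty)\to\mathbb{R},\quad q\mapsto\max\big\{\log\|\mathbf{x}\|,\ q+\log|\mathbf{u}\cdot\mathbf{x}|\big\}. \]
Proof.
Fix a choice of a>0. By (2.3),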
the union of the graphs of Lu,1(m),…,Lu,N(m) over [0,a] is contained
in [0,a]×[0,a].
By construction, it is also contained in the union
of the trajectories of the non-zero points ω
in ⋀mAn. The conclusion follows because,
for such ω, we have
log∥ω∥∈N and
log∥projW(m)(ω)∥∈Z∪{−∞}.
Thus, there are only finitely many possible trajectories
meeting [0,a]×[0,a].
∎
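The trajectories used in this proof are easy to evaluate numerically. The Python sketch below assumes, as the definition of Lω through (2.2) suggests, that Lω(q) is the maximum of log∥ω∥ and q+log∥projW(m)(ω)∥; it takes these two logarithms as inputs. The names `trajectory` and `slope_change` are hypothetical and only serve this illustration.

```python
from math import inf

def trajectory(log_norm, log_proj, q):
    """Assumed value L_omega(q) = max(log||omega||, q + log||proj(omega)||).

    log_norm is a non-negative integer; log_proj is an integer or -inf.
    """
    return max(log_norm, q + log_proj)

def slope_change(log_norm, log_proj):
    """Abscissa where the slope passes from 0 to 1 (inf if the projection is 0)."""
    return log_norm - log_proj
```

Under this assumption, each trajectory is constant up to the point returned by `slope_change` and has slope 1 afterwards, so it changes slope at most once.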
Lemma 2.6.
For j=1,…,N, the map Lu,j(m)
is continuous and piecewise linear with
constant slope 0 or 1 on each interval
of the form [a,a+1] with a∈N.
Moreover, for each q≥0, we have
(i)
Lu,1(m)(q)+⋯+Lu,N(m)(q)=Mq,
(ii)
Lu,1(m)(q)=Lu,1(q)+⋯+Lu,m(q),
(iii)
Lu,2(m)(q)−Lu,1(m)(q)=Lu,m+1(q)−Lu,m(q).
Proof.
The first assertion is a direct consequence of the
previous lemma because the maps Lω with
ω∈⋀mAn∖{0} are piecewise
linear with constant slope 0 or 1 in the
intervals between consecutive integers, and
we already know that the maps Lu,j(m)
are continuous.
When q is an integer, the equality (i) follows
from Theorem 2.1 applied
to the convex body ⋀mCu(eq) of
⋀mK∞n while (ii) and (iii) follow
from Theorem 2.3 together
with the remark stated below that theorem.
The three equalities then extend
to all q≥0 because all the functions
involved have a constant slope
between consecutive integers.
∎
Lemma 2.7.
If Lu,1(m) changes slope
from 1 to 0 at some point q>0, then
q is an integer and we have
Lu,m(q)=Lu,m+1(q).
Proof.
Put a=Lu,1(m)(q). By the preceding lemmas,
the point q is an integer and there exist
α,β∈⋀mAn∖{0} such that
[TABLE]
Since Lβ changes slope at most once on [0,∞),
going from slope 0 to slope 1, we deduce that Lβ
is constant equal to a on [0,q+1]. In particular,
Lβ−Lα is not constant on [q−1,q]. So
α and β are linearly independent, and thus
Lu,2(m)(q)=a=Lu,1(m)(q). The conclusion
then follows from Lemma 2.6 (iii).
∎
Theorem 2.8.
The map Lu=(Lu,1,…,Lu,n):[0,∞)→Rn is an n-system.
Proof.
For the choice of m=1, the inequalities
(2.3) and the identity of
Lemma 2.6 (i) become
[TABLE]
Thus Lu satisfies the condition (S1) in
the definition of an n-system. It also satisfies
(S2) because, by Lemma 2.6,
each Lu,j=Lu,j(1) has constant slope
0 or 1 in each interval [q,q+1] with q∈N
while, by the above, their sum has slope 1 on [q,q+1].
So, for each q∈N, there is an index
k∈{1,…,n} for which Lu,k has slope 1
on [q,q+1] while the other maps Lu,j with
j≠k are constant on that interval. Now, suppose
that q≥1 and that Lu,ℓ has slope 1
on [q−1,q]. Suppose further that ℓ<k. Then,
for each integer m with ℓ≤m<k, the map
Lu,1(m)=Lu,1+⋯+Lu,m changes
slope from 1 to 0 at q. By Lemma 2.7,
this implies that Lu,ℓ(q)=⋯=Lu,k(q).
Thus (S3) holds as well.
∎
3. The inverse problem
Our goal here is to complete the proof of Theorem A by
providing a converse to Theorem 2.8.
To this end, we follow the argument of [10]
taking advantage of the notable simplifications that
arise in the present non-archimedean setting.
3.1. The projective distance
We define the projective distance between two non-zero points
x and y in K∞n by
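\[ \mathrm{dist}(\mathbf{x},\mathbf{y}) = \frac{\|\mathbf{x}\wedge\mathbf{y}\|}{\|\mathbf{x}\|\,\|\mathbf{y}\|}. \]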
Lemma 2.4 implies that dist(x,y)≤1
with equality if and only if the pair (x,y) is
orthogonal. Moreover, the projective distance is
invariant under an isometry of K∞n. The next result
relates it to the distance associated with the norm
on K∞n.
Lemma 3.1.
Let x∈K∞n∖{0}. Then, there exists
u∈K∞n with ∥u∥=1 such that
∥x∥=∣u⋅x∣.
For any such u and any y∈K∞n∖{0} with
dist(x,y)<1, we have
∥y∥=∣u⋅y∣ and
[TABLE]
Proof.
Let (u1,…,un) be an orthonormal basis of K∞n
and let (x1,…,xn) be the dual basis.
Since the latter is also orthonormal, we find
[TABLE]
Thus, there exists an index i such that
∣ui⋅x∣=∥x∥.
Let y∈K∞n∖{0}. We also note that
[TABLE]
If ∣u1⋅x∣=∥x∥, we deduce
that, for each j=1,…,n,
[TABLE]
and thus ∥x∧y∥=∥(u1⋅x)y−(u1⋅y)x∥.
If moreover ∣u1⋅y∣<∥y∥,
then we have ∥(u1⋅x)y∥=∥x∥∥y∥>∥(u1⋅y)x∥
and the previous formula then yields
∥x∧y∥=∥x∥∥y∥, thus
dist(x,y)=1. We conclude that,
if ∣u1⋅x∣=∥x∥
and dist(x,y)<1, then
∣u1⋅y∣=∥y∥ and
[TABLE]
The lemma follows because any element u
of K∞n of norm 1 can be taken as the first
component of an orthonormal basis of K∞n.
∎
This implies in particular that the projective distance
satisfies the ultrametric form of the triangle inequality, namely
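\[ \mathrm{dist}(\mathbf{x},\mathbf{z}) \le \max\{\mathrm{dist}(\mathbf{x},\mathbf{y}),\ \mathrm{dist}(\mathbf{y},\mathbf{z})\} \]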
for any non-zero elements x, y, z of K∞n.
This is clear if dist(x,y)=1 or dist(y,z)=1.
Otherwise, both numbers are <1 and the inequality follows
from the lemma applied to the point y.
3.2. The key lemma
The following is an adaptation of [10, Lemma 5.1]
which will serve to construct recursively a sequence
of bases of An with specific properties.
Note the stronger hypothesis and conclusion.
Lemma 3.2.
Let h,k,ℓ∈{1,…,n} with h≤ℓ and
k<ℓ, let (x1,…,xn) be a basis of An,
let u∈K∞n, and let a∈Z with ea>∥xh∥
and ea≥∥x1∥,…,∥xℓ∥. Suppose
that (x1,…,x̂h,…,xn,u)
is an orthogonal basis of K∞n.
Then, there exists a basis (y1,…,yn) of An
satisfying
4)
(y1,…,ŷk,…,yn,u)
is an orthogonal basis of K∞n,
5)
det(y1,…,ŷk,…,yn,u)
and det(x1,…,x̂h,…,xn,u)
have the same leading coefficients as elements of
K∞=F((1/T)).
Although the basis (y1,…,yn) is in general
not uniquely determined by the conditions 1) to 5), the
argument that we provide below is deterministic in the
sense that, for the given data, it yields a unique basis
with the requested properties.
Proof.
We use 1) as a definition of the vectors
y1,…,ŷℓ,…,yn.
Then, (y1,…,yn) is a basis of An
for any choice of yℓ satisfying 2).
Since k<ℓ, the point yk belongs to the set
[TABLE]
and so ∥yk∥=ea−b for some integer b≥0.
In particular the choice of
[TABLE]
fulfils the condition 2). Since ∥xh∥<ea=∥Tbyk∥, we also have ∥yℓ∥=ea
as requested by condition 3). Moreover,
(y1,…,ŷℓ,…,yn,u)
is an orthogonal basis of K∞n. So, we can write
[TABLE]
with coefficients c1,…,cn∈K∞ such that
∥cℓu∥≤∥xh∥ and ∥cjyj∥≤∥xh∥ for any j≠ℓ. In particular, this yields
∥ckyk∥<ea=∥Tbyk∥, so ∣ck∣<∣Tb∣, and thus ∣Tb+ck∣=eb. Since
[TABLE]
we deduce that
[TABLE]
As (y1,…,ŷℓ,…,yn,u)
is an orthogonal basis of K∞n, Lemma
2.4 then yields
[TABLE]
because e−b∥yℓ∥=ea−b=∥yk∥.
By Lemma 2.4, this in turn implies that
the n-tuple
(y1,…,ŷk,…,yn,u)
is an orthogonal basis of K∞n. Thus the condition 4)
is satisfied as well. Finally, the relation (3.1)
yields
[TABLE]
Since Tb+ck has leading coefficient 1 in F((1/T))
(because ∣ck∣<∣Tb∣), this gives 5).
∎
We will use this lemma in combination with the following
result (cf. [10, Lemma 4.7]).
Lemma 3.3.
Let 1≤k<ℓ≤n be integers, let (y1,…,yn)
be a basis of K∞n, and let (y1∗,…,yn∗)
denote the dual basis of K∞n in the sense that
yi∗⋅yj=δi,j (1≤i,j≤n). Assume that the
(n−1)-tuples (y1,…,ŷℓ,…,yn)
and (y1,…,ŷk,…,yn) are
both orthogonal families in K∞n. Then, we have
[TABLE]
Proof.
Without loss of generality, we may assume that
y1,…,yn all have norm 1. Upon permuting
y1 and yk if k>1, as well as permuting
yn and yℓ if ℓ<n, we may also
assume that k=1 and ℓ=n, so that (y2,…,yn)
and (y1,…,yn−1) are orthonormal families.
We then need to show that dist(y1∗,yn∗)=∥y1∧⋯∧yn∥.
To this end, we first choose u∈K∞n so that
(y1,…,yn−1,u) is an orthonormal basis
of K∞n. Write u=∑j=1ncjyj where
cj=u⋅yj∗∈K∞ for j=1,…,n. Then,
we have cn≠0 and, applying Lemma
2.4 to that family, we find
[TABLE]
Applying the same lemma to (y2,…,yn), we obtain
as well
[TABLE]
where the last equality uses the fact that
y2∧⋯∧yn−1∧u and
y1∧⋯∧yn−1 are orthogonal unit elements
of ⋀n−1K∞n. Combining these results,
we conclude that
[TABLE]
The dual basis to (y1,…,yn−1,u)
in K∞n is
[TABLE]
It is orthonormal because it is dual to an
orthonormal basis of K∞n. Then
the decompositions
[TABLE]
yield
[TABLE]
thus dist(y1∗,yn∗)=max{1,∣c1∣}−1 and
(3.3) yields the conclusion.
∎
3.3. Construction of a point
The last lemma that we need is the following description
of the class of n-systems that are involved in Theorem A
(cf. [10, §1]).
Lemma 3.4.
Let P=(P1,…,Pn):[0,∞)→Rn be
an n-system such that P(q)∈Zn for each
integer q≥0. There exist s∈{1,2,…,∞},
and sequences of integers (qi)0≤i<s, (ki)0≤i<s
and (ℓi)0≤i<s, starting with q0=0,
k0=ℓ0=n, with the following property. Put
qs=∞ if s<∞. Then, for each index i with
0≤i<s, we have
(i)
qi<qi+1,
(ii)
if i>0, then 1≤ki<ℓi≤n and
Pki(qi)<Pℓi(qi),
(iii)
if i+1<s, then ℓi+1≥ki and
Pℓi+1(qi+1)=qi+1−qi+Pki(qi),
(iv)
P(q)=Φn(P1(qi),…,P̂ki(qi),…,Pn(qi),Pki(qi)+q−qi) for each q∈[qi,qi+1),
where Φn:Rn→Δn:={(x1,…,xn)∈Rn;x1≤⋯≤xn} is the map that lists the coordinates
of a point in monotone increasing order.
The properties (iii) and (iv) mean that the
union of the graphs of P1,…,Pn over the interval
[qi,qi+1) (called the combined graph of P over
that interval), consists of horizontal line segments with
ordinates P1(qi),…,P̂ki(qi),…,Pn(qi)
(not necessarily distinct), and a line segment of slope 1
starting on the point (qi,Pki(qi)) and, if i+1<s,
ending on the point (qi+1,Pℓi+1(qi+1)) or else
going to infinity.
Proof.
By hypothesis, the function P satisfies the conditions
(S1) to (S3) stated in the introduction. Let a∈N.
By (S1) the sum of the coordinates of P(a)∈Nn is
a and the sum of those of P(a+1)∈Nn is a+1.
Since, by (S2), each component of P is monotone
increasing on [0,∞), we must have P(a+1)=P(a)+ek for some k∈{1,…,n}. By (S1) again,
this implies that Pk+1(a)≥Pk(a)+1 and that
[TABLE]
Therefore, the half line [0,∞) can be partitioned
in maximal intervals [qi,qi+1) (0≤i<s) on
which (iv) holds for some ki∈{1,…,n}.
The existence of an integer ℓi+1∈{1,…,n}
satisfying (iii) then follows by the continuity of the
map P. Finally, the condition in (ii) expresses
the maximality of those intervals thanks to (S3).
∎
We can now state and prove the following converse to
Theorem 2.8.
Theorem 3.5.
Let P be as in the previous lemma. Then there exists
a point u∈K∞n of norm 1 such that P=Lu.
Proof.
Using the notation of the previous lemma, we first
construct recursively, for each integer i with
0≤i<s, a basis (x1(i),…,xn(i)) of An
with the following properties:
(B1)
(x1(i),…,x̂ki(i),…,xn(i),en)
is an orthogonal basis of K∞n,
(B2)
log∥xj(i)∥=Pj(qi) for j=1,…,n,
(B3)
(x1(i),…,x̂ℓi(i),…,xn(i))=(x1(i−1),…,x̂ki−1(i−1),…,xn(i−1))
if i≥1.
For i=0, we choose (x1(0),…,xn(0))=(e1,…,en).
Then the conditions are fulfilled because k0=n, q0=0 and Pj(0)=0
for j=1,…,n. Suppose now that i≥1 and that appropriate bases
have been constructed for all smaller values of the index. By
Lemma 3.4, we have
[TABLE]
and Pℓi(qi)≥Pℓi(qi−1)=max{P1(qi−1),…,Pℓi(qi−1)}
as well as Pℓi(qi)>Pki−1(qi−1).
In view of the induction hypothesis, this yields
[TABLE]
Since ki−1≤ℓi and ki<ℓi,
Lemma 3.2 then produces a basis
(x1(i),…,xn(i)) of An satisfying
(B1), (B3) and
[TABLE]
Thus it also satisfies (B2) because
of (3.4)
combined with (B3) and the induction hypothesis
that log∥xj(i−1)∥=Pj(qi−1) for j=1,…,n.
For each index i with 0≤i<s, let ui denote
an element of K∞n of norm 1 with
ui⋅xj(i)=0 for each j=1,…,n
with j≠ki. By Lemma 3.3 and (B3),
we have
[TABLE]
Since (x1(i),…,xn(i)) is a basis of
An, its determinant belongs to A×⊂O∞×
and so we obtain that
∥x1(i)∧⋯∧xn(i)∥=1.
Then, using (B2), we conclude that
[TABLE]
Since k0=n and (x1(0),…,xn(0))=(e1,…,en), we may assume that
u0=en. Since dist(ui,ui−1)<1
when 1≤i<s, Lemma 3.1
implies that ∣ui⋅en∣=1
for each of those i. So, upon replacing ui by
(ui⋅en)−1ui, we may assume that
ui⋅en=1. The norm of ui remains equal to 1,
and the same lemma combined with (3.5) gives
[TABLE]
Moreover, (qi)0≤i<s is a strictly increasing sequence
of non-negative integers. So, if s=∞, the sequence
(ui)i≥0 converges in norm to an element u
of K∞n of norm 1 with
[TABLE]
If s<∞, the latter inequalities remain true for the choice
of u=us−1 upon setting qs=∞. We claim that
the vector u has the requested property.
To show this, let q≥0 be an arbitrary non-negative integer,
and let i be the index with 0≤i<s such that qi≤q<qi+1
(with the above convention that qs=∞ if i=s−1<∞).
For each j∈{1,…,n} with j≠ki, we have
ui⋅xj(i)=0, thus
[TABLE]
and so
[TABLE]
If i≥1, we also have ui−1⋅xki(i)=0
because of (B3), and a similar computation gives
[TABLE]
This inequality still holds if i=0 because, in that case, its
right hand side is 1. So, in all cases we find that
[TABLE]
Since (x1(i),…,xn(i)) is a basis of An,
this implies that, for the componentwise partial ordering
on Rn, we have
[TABLE]
Since the components
of Lu(q) and of P(q) both add up to q, this implies that
Lu(q)=P(q) as announced. Moreover, we must have equality
in (3.6).
∎
Like the proof of Lemma 3.2,
the above argument is entirely deterministic in the sense
that it yields a single point u with the requested
properties. Moreover, if F0 denotes the smallest subfield
of F, then each n-tuple (x1(i),…,xn(i))
that it constructs is in fact a basis of F0[T]n over
F0[T], and the corresponding approximation ui of u with
ui⋅en=1 belongs to F0(T)n. So these can be
calculated recursively on a computer for a given n-system P.
We further develop this remark below.
3.4. Universality of the construction
Let P=(P1,…,Pn):[0,∞)→Rn be
an n-system such that P(q)∈Zn for each
integer q≥0. We claim that, when F=Q, the point
u of Q((1/T))n provided by the proof of
Theorem 3.5 belongs in fact to
Z[[1/T]]n and that, for a general field F, the
point that it produces is its image uˉ∈K∞n under
the reduction of coefficients from Z to F.
By induction on i, we first note that, when F=Q,
the n-tuples (x1(i),…,xn(i))
attached to P
are bases of Z[T]n and that, for a general field
F, the corresponding n-tuples are their images
(xˉ1(i),…,xˉn(i))
under the reduction of coefficients from Z to F.
When F=Q, the point ui is the last row in
the inverse transpose of the matrix Mi whose rows are
x1(i),…,x̂ki(i),…,xn(i),en. However, the condition 5) in
Lemma 3.2 implies that
det(Mi) is a monic polynomial of Z[T] for each
index i with 0≤i<s. Thus each ui has coefficients
in Z[[1/T]] and the same is true of the vector u.
In particular, it makes sense to consider their images
uˉi and uˉ under reduction. Clearly
we have uˉi⋅en=1 and
uˉi⋅xˉj(i)=0 for each
j=1,…,n with j≠ki. Thus, we have
Lu=P when working in Q((1/T))n and
Luˉ=P when working in F((1/T)).
Remark.
Although our construction yields a single point u with
Lu=P, such a point u is far from being unique.
Consider for example an arbitrary 2-system P=(P1,P2):[0,∞)→R2 for which P1 is unbounded.
There is a unique sequence of integers
d0=0<d1<d2<⋯ such that, upon putting q0=0 and
qi=di−1+di for each i≥1, we have
[TABLE]
With this notation, one can check that the point u
constructed in the proof of Theorem 3.5 is
u=(−ξ0,1) where ξ0∈O∞ has the continued
fraction expansion
[TABLE]
However, the continued fraction ξ=[a0,a1,a2,…]
has the same property for any sequence
(ai)i≥0 in A=F[T] satisfying a0∈F and
deg(ai)=di−di−1 for each i≥1. Clearly
the point u=(−ξ,1) then has ∥u∥=1. To show that
Lu=P, define recursively y−1=(0,1),
y0=(1,a0) and yi=aiyi−1+yi−2 for each
i≥1. Then the theory of continued fractions shows that,
with respect to u, one has
[TABLE]
So, for a given integer i≥0 and a given
q∈[qi,qi+1], we have Lyi−1(q)=q−di
and Lyi(q)=di. Since yi−1 and yi
form a basis of A2, this implies that, for the
componentwise ordering on R2, we have
Lu(q)≤Φ2(di,q−di)=P(q),
and so Lu(q)=P(q) (because both points have the
sum of their coordinates equal to q).
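The recurrence for the points yi in this remark is easy to run on examples. The following Python sketch, written for polynomials with integer coefficients (so F=Q is assumed), builds y0,y1,… from a list of partial quotients and lets one check that log∥yi∥=di and that consecutive points have determinant ±1, so that they form a basis of A2. The coefficient-list representation and all function names are assumptions of this sketch.

```python
def poly_trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return poly_trim([a + b for a, b in zip(p, q)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return poly_trim(out)

def convergents(partial_quotients):
    """Return [y_0, y_1, ...] with y_{-1}=(0,1), y_0=(1,a_0), y_i=a_i*y_{i-1}+y_{i-2}."""
    prev = ([], [1])
    cur = ([1], poly_trim(partial_quotients[0]))
    ys = [cur]
    for a in partial_quotients[1:]:
        a = poly_trim(a)
        nxt = tuple(poly_add(poly_mul(a, c), p) for c, p in zip(cur, prev))
        prev, cur = cur, nxt
        ys.append(cur)
    return ys

def log_norm(y):
    """log of the maximum norm of a non-zero point: largest coordinate degree."""
    return max(len(c) - 1 for c in y)

def det2(y, z):
    """Determinant of the 2x2 matrix with rows y and z, as a polynomial."""
    return poly_add(poly_mul(y[0], z[1]), [-c for c in poly_mul(y[1], z[0])])
```

With partial quotients of degrees 0, 1, 2, 1, for example, the values of `log_norm` along the sequence are the partial sums 0, 1, 3, 4 of those degrees.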
4. Duality and an alternative normalization
Let u∈K∞n with ∥u∥=1. It can be
shown that, for each q∈N={0,1,2,…}, the dual of
the convex body Cu(eq) defined in §2.3
is
[TABLE]
For each j=1,…,n and each q∈[0,∞), we define
Lu,j∗(q) to be the minimum of all t∈R
for which the inequalities
[TABLE]
admit at least j linearly independent solutions
y in An so that, when q∈N, this is
the logarithm of the j-th minimum of Cu∗(eq).
Then Theorem 2.2 gives
[TABLE]
for each q∈N.
This remains true for all q∈[0,∞) because
a reasoning similar to that in §2.3 shows
that, like Lu, the map Lu∗=(Lu,1∗,…,Lu,n∗) is affine in each
interval between two consecutive integers.
The analogue of the setting of Schmidt and Summerer
in [13] would require instead to work with the
family of convex bodies of volume 1 given by
[TABLE]
Associated to this family is the map L~u=(L~u,1,…,L~u,n):[0,∞)→Rn
where L~u,j(q) is the minimum of all t∈R
for which the inequalities
[TABLE]
admit at least j linearly independent solutions
y in An, and thus L~u,j(q)=q+Lu,j∗(nq).
5. Perfect systems
From now on, we work with several places of K=F(T). So, we
distinguish the corresponding absolute values with subscripts.
For each α∈F, we denote by Kα=F((T−α))
the completion of K for the absolute value
∣f∣α=e−ordα(f) where, for f in K or in
Kα, the quantity ordα(f)∈Z∪{∞}
represents the order of f at α (with the convention
that ordα(0)=∞).
We also write ∣∣∞ for the absolute value on K
and on K∞=F((1/T)) previously denoted without subscript,
so that ∣f∣∞=edeg(f) for any series f∈K∞.
For each α∈F∪{∞} and each integer
n≥1, we equip Kαn with the maximum norm
denoted ∥∥α.
Let f=(f1,…,fn) be an n-tuple of elements of
F[[T]]. A linear algebra argument shows that,
for any non-zero (ϱ1,…,ϱn)∈Nn,
there exists a non-zero point a=(a1,…,an) in
An=F[T]n such that
[TABLE]
Following Mahler [9] and Jager [4], we say
that f is normal for (ϱ1,…,ϱn)
if any non-zero solution a of (5.1)
in An has ord0(a⋅f)=ϱ1+⋯+ϱn−1.
Then, those solutions together with 0 constitute, over F,
a one dimensional subspace of An. We also say that f is a
perfect system if it is normal for any
(ϱ1,…,ϱn)∈Nn∖{0}.
Examples 5.1.
Suppose that F has characteristic zero.
If ω1,…,ωn are distinct elements of F then
\[ \big(e^{\omega_{1}T},\dots,e^{\omega_{n}T}\big) \]
is a perfect system [4, Theorem 1.2.1].
If moreover ω1,…,ωn are pairwise
incongruent modulo Z then
[TABLE]
is also a perfect system [4, Theorem 1.2.2].
Finally the n-tuple
[TABLE]
is normal for each (ϱ1,…,ϱn)∈Nn∖{0} with ϱ1≤⋯≤ϱn
[4, Theorem 1.2.3].
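Normality of the exponential system can be tested on small cases by exact linear algebra. The Python sketch below assumes that (5.1) is the usual Mahler system deg(aj)≤ϱj−1 for each j together with ord0(a⋅f)≥ϱ1+⋯+ϱn−1, as the definition of normality suggests. Under that assumption, f is normal for (ϱ1,…,ϱn) exactly when no non-zero a with those degree bounds has ord0(a⋅f)≥ϱ1+⋯+ϱn, that is, when a certain square matrix of Taylor coefficients is invertible. The function names are hypothetical, and the ωj are taken in Q.

```python
from fractions import Fraction
from math import factorial

def exp_coeff(omega, r):
    """Coefficient of T^r in exp(omega*T), as an exact fraction."""
    return Fraction(omega) ** r / factorial(r)

def normality_matrix(omegas, rhos):
    """Square matrix sending the coefficients of (a_1,...,a_n) to the first
    rho_1+...+rho_n Taylor coefficients of a_1*f_1+...+a_n*f_n at T=0."""
    sigma = sum(rhos)
    cols = [(j, d) for j, rho in enumerate(rhos) for d in range(rho)]
    return [[exp_coeff(omegas[j], r - d) if r >= d else Fraction(0)
             for (j, d) in cols] for r in range(sigma)]

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return result

def is_normal(omegas, rhos):
    """True if (exp(omega_j*T))_j is normal for rhos, under the assumed (5.1)."""
    return det(normality_matrix(omegas, rhos)) != 0
```

For distinct ωj the test succeeds on every small tuple tried, in line with the first example, while it fails as soon as two of the ωj coincide.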
When $F=\mathbb{C}$, the first example of a perfect system is due to Hermite in [3], although it
also follows by duality from his earlier work on the transcendence of $e$ in [2] (see also [6]).
To our knowledge, no perfect $n$-system of series
of $F[[T]]$ with $n\ge 2$ is known when $F$ is a finite field.
A short computation shows that there are none when $F$ has two
or three elements.
In view of the first example above, Theorem B in the
introduction follows from the following result which
also applies to the two other examples as well as to
any perfect system.
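Theorem 5.2.
Let $\mathbf{f}=(f_1(T),\dots,f_n(T))\in F[[T]]^n$ with $n\ge 2$.
Suppose that $\mathbf{f}$ is normal for each diagonal element
$(\varrho,\dots,\varrho)\in\mathbb{N}^n\setminus\{0\}$.
Then the point $\mathbf{u}=(f_1(1/T),\dots,f_n(1/T))\in K_\infty^n$
satisfies $\|\mathbf{u}\|_\infty=1$ and its associated map $L_{\mathbf{u}}$
is the unique $n$-system $\mathbf{P}$ characterized
by the property (1.4).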
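Proof.
Since $\mathbf{f}$ is normal for $(1,\dots,1)$, we have
$\|\mathbf{f}\|_0=1$, thus $\|\mathbf{u}\|_\infty=\|\mathbf{f}\|_0=1$.
Fix $q\in\mathbb{N}$ and let $t=L_{\mathbf{u},1}(q)\in\mathbb{N}$. By definition there
exists a non-zero point $\mathbf{x}=(x_1(T),\dots,x_n(T))$ in $A^n$ such
that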
[TABLE]
Then, for each $i=1,\dots,n$, the polynomial
$a_i(T)=T^tx_i(1/T)$ satisfies $\deg(a_i(T))\le t$ and
we find that
[TABLE]
Since $\mathbf{f}$ is normal for $(t+1,\dots,t+1)$, this implies that
$n(t+1)>q$ or equivalently that
[TABLE]
For $q=mn$ with $m\in\mathbb{N}$, this gives $L_{\mathbf{u},1}(mn)\ge m$
and, since the coordinates of $L_{\mathbf{u}}(mn)$ form a monotone
increasing sequence with sum $mn$, all of these
are equal to $m$, in particular $L_{\mathbf{u},1}(mn)=L_{\mathbf{u},n}(mn)=m$.
Now let $q\ge 0$ be any real number and let $m\in\mathbb{N}$ be such
that $mn\le q\le(m+1)n$. Since $L_{\mathbf{u},1}$ and $L_{\mathbf{u},n}$
are monotone increasing, we find
[TABLE]
As observed in the introduction, this characterizes Lu
as the n-system described in there.
∎
In the case where $\mathbf{f}$ is normal for each
$(\varrho_1,\dots,\varrho_n)\in\mathbb{N}^n\setminus\{0\}$
with $\varrho_1\le\cdots\le\varrho_n$ and $\varrho_n\le\varrho_1+1$,
it is also possible to relate the points which
realize the successive minima to the corresponding solutions
of (5.1). To this end, we note that each integer
$i\ge 1$ can be written as a sum $i=\varrho_{i,1}+\cdots+\varrho_{i,n}$
for a unique such $n$-tuple given by
$\varrho_{i,j}=\lceil(i+j-n)/n\rceil$ for $j=1,\dots,n$. Define
$\mathbf{y}_i=T^{\varrho_{i,n}-1}(a_{i,1}(1/T),\dots,a_{i,n}(1/T))$ where
$\mathbf{a}_i=(a_{i,1},\dots,a_{i,n})$ is a corresponding non-zero solution
of (5.1). Then $\mathbf{y}_i\in A^n$ because
$\deg(a_{i,j})\le\varrho_{i,n}-1$
for $j=1,\dots,n$. Moreover, we have
[TABLE]
because $\|\mathbf{a}_i\|_0=1$ and $|\mathbf{a}_i\cdot\mathbf{f}|_0=e^{-(i-1)}$.
Thus, with respect to the point $\mathbf{u}$, we deduce that
[TABLE]
In particular the trajectory of $\mathbf{y}_i$ changes slope
from $0$ to $1$ at the point $q=i-1$.
The hypothesis also implies that $\deg(a_{i,j})\le\lceil(i+j-2n)/n\rceil$ for each $i\ge 1$ and
each $j=1,\dots,n$, with equality when
$i+j\equiv 1 \bmod n$. This in turn implies that
$\det(\mathbf{a}_i,\dots,\mathbf{a}_{i+n-1})$ is a non-zero
polynomial of degree $i-1$ for each $i\ge 1$.
Thus, the points $\mathbf{y}_i,\mathbf{y}_{i+1},\dots,\mathbf{y}_{i+n-1}$
are linearly independent over $K$ and so, for
each $q\in[i-1,i]$, we obtain
[TABLE]
Since the arguments of $\Phi_n$ in the last expression add
up to $q$, we conclude that the latter is equal to $L_{\mathbf{u}}(q)$.
Therefore
$\mathbf{y}_i,\mathbf{y}_{i+1},\dots,\mathbf{y}_{i+n-1}$ realize the minima
of $\mathcal{C}_{\mathbf{u}}(e^q)$ for $q=i-1$ and for $q=i$, while their
trajectories cover the combined graph of $L_{\mathbf{u}}$ over
the interval $[i-1,i]$.
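The decomposition $i=\varrho_{i,1}+\cdots+\varrho_{i,n}$ used in this paragraph is easy to check by machine. The following short Python sketch (the function names `rho` and `check` are illustrative) computes $\varrho_{i,j}=\lceil(i+j-n)/n\rceil$ in integer arithmetic and tests that the resulting $n$-tuple is non-decreasing, has entries in $\mathbb{N}$, satisfies $\varrho_{i,n}\le\varrho_{i,1}+1$, and sums to $i$.

```python
def rho(i, n):
    """Return (rho_{i,1}, ..., rho_{i,n}) with rho_{i,j} = ceil((i + j - n) / n)."""
    # -(-a // n) is the ceiling of a / n in exact integer arithmetic.
    return [-(-(i + j - n) // n) for j in range(1, n + 1)]

def check(i, n):
    """Verify the properties of the n-tuple claimed in the text."""
    r = rho(i, n)
    return (sum(r) == i
            and all(x <= y for x, y in zip(r, r[1:]))
            and r[-1] <= r[0] + 1
            and min(r) >= 0)

all_ok = all(check(i, n) for n in range(1, 8) for i in range(1, 61))
```

For instance, `rho(5, 3)` returns `[1, 2, 2]`, so the corresponding solution of (5.1) has components of degree at most $0$, $1$ and $1$.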
6. An adelic estimate
In this section we assume that $F=\mathbb{C}$ so that, for each
$\omega$ and $\alpha$ in $\mathbb{C}$, we may define
[TABLE]
We also fix an integer $n\ge 1$ and $n$ distinct complex
numbers $\omega_1,\dots,\omega_n\in\mathbb{C}$. Our last
main result is the following.
Theorem 6.1.
Let $S=\{\alpha_1,\dots,\alpha_s\}$ be a finite
subset of $\mathbb{C}$ of cardinality $s\ge 1$.
Then, for any $n$-tuple of non-zero polynomials
$\mathbf{a}=(a_1(T),\dots,a_n(T))$ in $\mathbb{C}[T]$, we have
[TABLE]
where $\mathbf{f}=(e^{\omega_1T},\dots,e^{\omega_nT})$
and $C(n)=\exp(n(n-1)/2)$.
Proof.
Fix a choice of non-zero polynomials $a_1,\dots,a_n$ in
$\mathbb{C}[T]$. Put $\mathbf{a}=(a_1,\dots,a_n)$ and,
for $i=1,\dots,n$, let $c_iT^{d_i}$ denote
the leading monomial of $a_i(T)$. For each $k\in\mathbb{N}$,
we write
[TABLE]
where $a_{k,i}(T)=(\omega_i+d/dT)^ka_i(T)=\omega_i^kc_iT^{d_i}+(\text{terms of lower degree})$. Define
[TABLE]
and put $\Delta=\det(\mathbf{a}_0,\dots,\mathbf{a}_{n-1})$. Then $\Delta$
is a non-zero polynomial of degree $d=d_1+\cdots+d_n$ whose
coefficient of $T^d$ is the product of $c_1\cdots c_n\neq 0$
with the Vandermonde determinant
$\det(\omega_i^k)\neq 0$ (using the convention that $0^0=1$
if $\omega_i=0$ for some $i$). Thus we have
[TABLE]
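The claim about the degree and leading coefficient of $\Delta$ can be checked on a small example. The sketch below assumes that $\mathbf{a}_k$ denotes the row $(a_{k,1},\dots,a_{k,n})$ with $a_{k,i}=(\omega_i+d/dT)^ka_i$, builds the $n\times n$ matrix of these polynomials with integer coefficients, and expands its determinant by the Leibniz formula. All function names and the sample data are illustrative assumptions.

```python
from itertools import permutations

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists, low degree first."""
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(size)]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, low degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def shift_derive(p, omega):
    """Apply the operator (omega + d/dT) to the polynomial p."""
    derivative = [k * c for k, c in enumerate(p)][1:] or [0]
    return poly_add([omega * c for c in p], derivative)

def sign(perm):
    """Signature of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def delta(polys, omegas):
    """Determinant of the matrix whose k-th row is ((omega_i + d/dT)^k a_i)_i."""
    n = len(polys)
    rows = [list(polys)]
    for _ in range(n - 1):
        rows.append([shift_derive(p, w) for p, w in zip(rows[-1], omegas)])
    total = [0]
    for perm in permutations(range(n)):
        term = [sign(perm)]
        for k in range(n):
            term = poly_mul(term, rows[k][perm[k]])
        total = poly_add(total, term)
    while len(total) > 1 and total[-1] == 0:
        total.pop()  # strip trailing zero coefficients
    return total

# Illustrative data: a = (2T + 1, 3T^2, 5) and omega = (0, 1, 3).
result = delta([[1, 2], [0, 0, 3], [5]], [0, 1, 3])
expected_leading = (2 * 3 * 5) * ((1 - 0) * (3 - 0) * (3 - 1))
```

Here $d=1+2+0=3$, and the leading coefficient of `result` agrees with `expected_leading`, the product of $c_1c_2c_3$ with the Vandermonde determinant $\prod_{i<j}(\omega_j-\omega_i)$.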
Now fix a choice of $j\in\{1,\dots,s\}$. Put
$\alpha=\alpha_j$ and choose $\ell\in\{1,\dots,n\}$
such that $\|\mathbf{a}\|_\alpha=|a_\ell|_\alpha$. Define also
[TABLE]
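Since $|e^{\omega_\ell T}|_\alpha=1$, we have
$|\Delta|_\alpha=|\det(\mathbf{b}_0,\dots,\mathbf{b}_{n-1})|_\alpha$.
On the other hand, since $\mathbf{a}_k\cdot\mathbf{f}$ is the $k$-th
derivative of $\mathbf{a}\cdot\mathbf{f}$, we have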
[TABLE]
and similarly
[TABLE]
From this we deduce that
[TABLE]
and thus
[TABLE]
The conclusion follows because the product
formula yields $1\le|\Delta|_\infty|\Delta|_{\alpha_1}\cdots|\Delta|_{\alpha_s}$.
∎
Remark.
Under the assumptions of Theorem 6.1, the above
argument also yields
[TABLE]
with $C'(n)=\exp(n-1)$. The latter estimate is best possible
for any choice of $n,s\ge 1$ as one sees by expanding $(e^T-1)^{n-1}$
in the form $\mathbf{a}\cdot\mathbf{f}$ with $\omega_j=j-1$ and
$a_j(T)=\binom{n-1}{j-1}(-1)^{n-j}$ for $j=1,\dots,n$ and
by choosing the points $\alpha_j=2\pi j\sqrt{-1}$ for $j=1,\dots,s$.
Then we have $|a_j|_\infty=1$ for $j=1,\dots,n$ and
$|\mathbf{a}\cdot\mathbf{f}|_{\alpha_j}=C'(n)^{-1}$ for $j=1,\dots,s$.
This construction shows that the constant $C(n)$ in
Theorem 6.1 cannot be replaced by a number
less than $\exp(n-1)$.
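The vanishing order used in this remark can be confirmed with exact integer arithmetic. Since each $\omega_j=j-1$ is an integer, $e^{\omega_j\alpha}=1$ at every integer multiple $\alpha$ of $2\pi\sqrt{-1}$, so the expansion of $\mathbf{a}\cdot\mathbf{f}$ in powers of $T-\alpha$ has the same coefficients as its expansion at $0$. The sketch below (function names are illustrative) computes $k!$ times the $k$-th Taylor coefficient, namely $\sum_j a_j\omega_j^k$, and finds the first $k$ for which it is non-zero.

```python
from math import comb, factorial

def moment(n, k):
    """k! times the coefficient of T^k in sum_j a_j exp(omega_j T),
    with omega_j = j - 1 and a_j = C(n-1, j-1) * (-1)^(n-j)."""
    return sum(comb(n - 1, j - 1) * (-1) ** (n - j) * (j - 1) ** k
               for j in range(1, n + 1))

def vanishing_order(n):
    """Smallest k with a non-zero k-th Taylor coefficient."""
    k = 0
    while moment(n, k) == 0:
        k += 1
    return k

orders = {n: vanishing_order(n) for n in range(1, 10)}
```

For each tested $n$ the order equals $n-1$, which corresponds to $|\mathbf{a}\cdot\mathbf{f}|_{\alpha_j}=e^{-(n-1)}=C'(n)^{-1}$, and the first non-zero value `moment(n, n - 1)` equals $(n-1)!$.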
By a change of variables, we deduce from
Theorem 6.1 the following
statement involving the functions $e^{\omega_i/T}$.
Corollary 6.2.
Let $\mathbf{a}=(a_1(T),\dots,a_n(T))$ be an $n$-tuple of
non-zero polynomials in $\mathbb{C}[T]$. Then, we have
[TABLE]
where $\mathbf{u}=(e^{\omega_1/T},\dots,e^{\omega_n/T})$
and where $C(n)$ is as in the theorem.
Proof.
Let $d$ be the largest of the degrees of $a_1,\dots,a_n$.
Set
[TABLE]
Since $x_1,\dots,x_n$ are non-zero polynomials, the
preceding theorem gives
[TABLE]
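The conclusion follows because, for each $i=1,\dots,n$,
we have $\deg(x_i)=d-\mathrm{ord}_0(a_i)$
and $\mathrm{ord}_0(x_i)=d-\deg(a_i)$, thus $|x_i|_\infty|x_i|_0=|a_i|_0|a_i|_\infty$, while $\|\mathbf{x}\|_0=e^{-d}\|\mathbf{a}\|_\infty$
and $|\mathbf{x}\cdot\mathbf{f}|_0=e^{-d}|\mathbf{a}\cdot\mathbf{u}|_\infty$.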
∎
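The degree and order relations used in this proof can be illustrated on coefficient lists. The sketch below assumes that the change of variables is $x_i(T)=T^da_i(1/T)$, which is what the stated relations $\deg(x_i)=d-\mathrm{ord}_0(a_i)$ and $\mathrm{ord}_0(x_i)=d-\deg(a_i)$ suggest. The function names and the sample polynomial are illustrative.

```python
def reverse_poly(coeffs, d):
    """Coefficients (low degree first) of T^d * a(1/T), for deg(a) <= d."""
    padded = list(coeffs) + [0] * (d + 1 - len(coeffs))
    return padded[::-1]

def degree(coeffs):
    """Degree of a non-zero polynomial given low degree first."""
    return max(k for k, c in enumerate(coeffs) if c != 0)

def order_at_zero(coeffs):
    """Order of vanishing at T = 0 of a non-zero polynomial."""
    return min(k for k, c in enumerate(coeffs) if c != 0)

# Illustrative data: a(T) = 3T^2 + T^4 (degree 4, order 2 at 0) and d = 6.
a = [0, 0, 3, 0, 1]
d = 6
x = reverse_poly(a, d)  # coefficients of T^6 * a(1/T) = 3T^4 + T^2
```

With these values, `degree(x)` is $6-2=4$ and `order_at_zero(x)` is $6-4=2$, so that $|x|_\infty|x|_0=e^{4-2}=|a|_\infty|a|_0$.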
We conclude with two sets of inequalities, the second
one being the result announced by Baker in [1]
and proved there in the case n=3, except for the value
of the constant.
Corollary 6.3.
Let $a_1(T),\dots,a_n(T)$ be
non-zero polynomials in $\mathbb{C}[T]$. Then, we have
[TABLE]
Proof.
The first estimate follows directly from the previous
corollary using the facts that $|a_i|_0\le 1$ for each
$i=1,\dots,n$ and that $\|\mathbf{a}\|_\infty\ge|a_1|_\infty$.
It implies that, within $K_\infty=\mathbb{C}((1/T))$, the series
$u_1=e^{\omega_1/T},\dots,u_n=e^{\omega_n/T}$
are linearly independent over $\mathbb{C}(T)$. Consequently,
for each $(g_1,\dots,g_n)\in\mathbb{Z}^n$, the sets
[TABLE]
are dual convex bodies of $K_\infty^n$. Moreover, the same estimate
implies that the first minimum $\lambda_1$ of $\mathcal{C}$ satisfies
$\lambda_1^nV\ge C(n)^{-1}$ where $V=e^{g_1+\cdots+g_n}$
is the volume of $\mathcal{C}$. By Theorems 2.1
and 2.2,
this implies that the first minimum $\lambda_1^*$ of $\mathcal{C}^*$
satisfies
[TABLE]
Upon choosing $g_1,\dots,g_n$ so that $|a_1|_\infty=e^{-g_1}$
and $|a_1u_i-a_iu_1|_\infty=e^{-g_i}$ for $i=2,\dots,n$, we
also have $\lambda_1^*\le 1$, and so we obtain $V\le C(n)^{n-1}$
which yields the second inequality of the corollary.
∎
Bibliography
[1] A. Baker, On an analogue of Littlewood’s Diophantine approximation problem, Michigan Math. J. 11 (1964), 247–250.
[2] Ch. Hermite, Sur la fonction exponentielle, C. R. Acad. Sci., Paris 77 (1873), 18–24, 74–79, 226–233, 285–293; Œuvres tome III, 150–181.
[3] Ch. Hermite, Sur la généralisation des fractions continues algébriques (extrait d’une lettre à M. Pincherle), Annali di Mat. 21 (1893), 289–308; Œuvres tome IV, 357–377.
[4] H. Jager, A multidimensional generalization of the Padé table. I–VI, Nederl. Akad. Wet., Proc., Ser. A 67 (1964), 193–249.
[5] A. Keita, Continued fractions and parametric geometry of numbers, J. Théor. Nombres Bordeaux 29 (2017), 129–135.
[6] K. Mahler, Zur Approximation der Exponentialfunktion und des Logarithmus. I, J. Reine Angew. Math. 166 (1931), 118–136.
[7] K. Mahler, An analogue to Minkowski’s geometry of numbers in a field of series, Ann. of Math. 42 (1941), 488–522.
[8] K. Mahler, On compound convex bodies I, II, Proc. Lond. Math. Soc. 5 (1955), 358–384.