On a theorem of Davenport and Schmidt

Nickolas Andersen; William Duke

arXiv:1905.05236·math.NT·May 15, 2019


TL;DR

This paper generalizes Davenport and Schmidt's work on improving Dirichlet's approximation theorems, using geometry of numbers and semi-regular continued fractions to establish sharp bounds in a generalized norm setting.

Contribution

It introduces a new approach using semi-regular continued fractions to analyze approximation bounds in the geometry of numbers with arbitrary norms.

Findings

01

Established sharp bounds for approximation improvements

02

Extended classical results to general norms in $\mathbb{R}^{2}$

03

Utilized semi-regular continued fractions with a best approximation property

Abstract

This work is motivated by a paper of Davenport and Schmidt, which treats the question of when Dirichlet's theorems on the rational approximation of one or of two irrationals can be improved and if so, by how much. We consider a generalization of this question in the simplest case of a single irrational but in the context of the geometry of numbers in $\mathbb{R}^{2}$, with the sup-norm replaced by a more general one. Results include sharp bounds for how much improvement is possible under various conditions. The proofs use semi-regular continued fractions that are characterized by a certain best approximation property determined by the norm.


Equations

$$\alpha=b_{0}+\frac{1}{b_{1}+}\,\frac{1}{b_{2}+}\cdots\;\overset{\mathrm{def}}{=}\;b_{0}+\cfrac{1}{b_{1}+\cfrac{1}{b_{2}+\cfrac{1}{\ddots}}},$$

$$u_{n}=\frac{1}{b_{n+1}+}\,\frac{1}{b_{n+2}+}\cdots\quad\text{and}\quad v_{n}=\frac{1}{b_{n}+}\,\frac{1}{b_{n-1}+}\,\frac{1}{b_{n-2}+}\cdots\frac{1}{b_{1}}.$$

$$\delta(\alpha)=\limsup_{n\rightarrow\infty}\big(1+u_{n}v_{n}\big)^{-1}.$$

$$\frac{e-1}{e+1}=\frac{1}{2+}\,\frac{1}{6+}\,\frac{1}{10+}\,\frac{1}{14+}\cdots.$$

$$\delta(\alpha)=\tfrac{1}{10}(5+\sqrt{5})=0.723607\dots,$$

$$F_{t}(x,y)=F(t^{-1}x,ty).$$

$$\Delta\,F_{t}^{2}(q,p-\alpha q)\leq 1.$$

$$q\,|p-\alpha q|<\frac{1}{\sqrt{3}},$$

$$\Delta\,F_{t}^{2}(q,p-\alpha q)<c,$$

$$F(x,y)=F(|x|,|y|).$$

$$F(0,\pm 1)=F(\pm 1,0)=1.$$

$$F^{\langle p\rangle}(x,y)=\big(|x|^{p}+|y|^{p}\big)^{\frac{1}{p}},$$

$$\delta_{F}(\alpha)\geq\tfrac{1}{2}.$$

$$\frac{\Delta_{p}}{10}\,(5+\sqrt{5})\Big(\big(\tfrac{1}{2}(\sqrt{5}-1)\big)^{p}+1\Big)^{2/p},$$

$$\delta_{2}\big(\tfrac{1}{2}(-1+\sqrt{3})\big)=1.$$

$$T(u,v)=\Big(\frac{1}{u}-\Big\lfloor\frac{1}{u}\Big\rfloor,\;\frac{1}{v+\lfloor\frac{1}{u}\rfloor}\Big).$$

$$\frac{1}{\log 2}\,\frac{1}{(1+uv)^{2}}.$$

$$a_{0}+\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\,\frac{\varepsilon_{3}}{a_{3}+}\cdots,\qquad\varepsilon_{m}=\pm 1,\;a_{m}\in\mathbb{Z}$$

$$\frac{p_{m}}{q_{m}}=a_{0}+\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\cdots\frac{\varepsilon_{m}}{a_{m}}$$

$$F_{t}(q,p-\alpha q)<F_{t}(s,r-\alpha s)$$

$$|p-\alpha q|<|r-\alpha s|$$

$$\alpha=b_{0}+\frac{1}{b_{1}+}\,\frac{1}{b_{2}+}\,\frac{1}{b_{3}+}\cdots$$

$$\alpha=b_{0}+1+\frac{-1}{b_{2}+1+}\,\frac{1}{b_{3}+}\cdots.$$

$$s_{n}\,|r_{n}-\alpha s_{n}|\leq\big(4^{1/p}\Delta_{p}\big)^{-1}.$$

$$\mu_{m}=\frac{\varepsilon_{m+1}}{a_{m+1}+}\,\frac{\varepsilon_{m+2}}{a_{m+2}+}\cdots\quad\text{and}\quad\nu_{m}=\frac{1}{a_{m}+}\,\frac{\varepsilon_{m}}{a_{m-1}+}\,\frac{\varepsilon_{m-1}}{a_{m-2}+}\cdots\frac{\varepsilon_{2}}{a_{1}}.$$

$$D_{p}(u,v)=\frac{1}{1+uv}\left(\frac{(1-|u|^{p}v^{p})^{2}}{(1-|u|^{p})(1-v^{p})}\right)^{\frac{1}{p}},$$

$$\delta_{p}(\alpha)=\limsup_{m\rightarrow\infty}\Delta_{p}\,D_{p}\big(\mu_{m},\nu_{m}\big),$$

$$\alpha=\cdots\frac{1}{b_{n}+}\,\frac{1}{1+}\,\frac{1}{b_{n+2}+}\cdots\quad\text{into}\quad\alpha=\cdots\frac{1}{(b_{n}+1)+}\,\frac{-1}{(b_{n+2}+1)+}\cdots$$

$$\ref{genl}\to\ref{tcf}\to\ref{t6}\to\ref{sexp}\to\ref{genl2}\to\ref{new2}\to\ref{t4}.$$

$$\mathcal{B}=\{P\in\mathbb{R}^{2};\;F(P)<1\}.$$


Full text

On a theorem of Davenport and Schmidt

Nickolas Andersen

UCLA Mathematics Department, Box 951555, Los Angeles, CA 90095-1555


and

William Duke

UCLA Mathematics Department, Box 951555, Los Angeles, CA 90095-1555



Supported by NSF grant DMS 1701638.

1. Introduction

In 1842 Dirichlet [13] applied the pigeonhole principle to give good approximations of real numbers by rationals. One form of his theorem in one dimension is the following.

Dirichlet Approximation Theorem.

For $\alpha\in\mathbb{R}$ and any $Q\in\mathbb{Z}^{+}$ there are $p,q\in\mathbb{Z}$ such that $1\leq q\leq Q$ and $|p-q\alpha|<\tfrac{1}{Q}.$
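As a quick numerical illustration (a brute-force sketch, not taken from the paper), the following Python function searches for a pair $p,q$ of the kind the theorem guarantees. The name `dirichlet_pair` is hypothetical, and floating-point arithmetic is assumed to be accurate enough for the small values of $Q$ used here.

```python
import math

def dirichlet_pair(alpha, Q):
    """Return (p, q) with 1 <= q <= Q and |p - q*alpha| < 1/Q, by brute force."""
    for q in range(1, Q + 1):
        p = round(q * alpha)  # nearest integer minimizes |p - q*alpha| for this q
        if abs(p - q * alpha) < 1.0 / Q:
            return p, q
    return None  # not expected for irrational alpha, by Dirichlet's theorem

if __name__ == "__main__":
    print(dirichlet_pair(math.pi, 10))
```

The search returns the pair with the smallest admissible $q$; the theorem itself only asserts that some such pair exists.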

Davenport and Schmidt [10] considered those $\alpha$ for which an improvement of this result is possible, at least when we only require that $Q$ be sufficiently large. More precisely, let $\delta(\alpha)$ be the largest number with the property that if $c>\delta(\alpha)$ then for every sufficiently large $Q$ (depending only on $\alpha$ ), there are integers $p,q\in\mathbb{Z}$ with $1\leq q\leq Q$ and $Q|p-\alpha q|<c,$ while if $c<\delta(\alpha)$ there are arbitrarily large $Q$ for which no such $p,q$ exist. If $\delta(\alpha)<1$ then we say that an improvement on Dirichlet’s theorem is possible for this $\alpha$ . Clearly $\delta(\alpha)=0$ for rational $\alpha$ so we only consider irrational $\alpha$ .

An easy direct argument proves the fact, perhaps surprising at first, that any irrational $\alpha$ for which $\delta(\alpha)<1$ must be badly approximable. For $\alpha$ to be badly approximable means that for some $c>0$ we have $|\alpha-\frac{p}{q}|>\frac{c}{q^{2}}$ for all relatively prime integers $p,q$ with $q>0$ . Davenport and Schmidt gave another proof of this that also shows that, conversely, an improvement on Dirichlet’s theorem is possible for every badly approximable number. They deduced this from a formula for $\delta(\alpha)$ given in terms of the regular continued fraction expansion of $\alpha.$ Recall that an irrational $\alpha$ has a unique infinite regular continued fraction expansion

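$$\alpha=b_{0}+\frac{1}{b_{1}+}\,\frac{1}{b_{2}+}\cdots\;\overset{\mathrm{def}}{=}\;b_{0}+\cfrac{1}{b_{1}+\cfrac{1}{b_{2}+\cfrac{1}{\ddots}}},\tag{1.1}$$

where the partial quotients $b_{n}$ satisfy $b_{0}=\lfloor\alpha\rfloor$ and $b_{k}\in\mathbb{Z}^{+}$ for $k\geq 1$. Also define $u_{0}=\alpha-b_{0}$, $v_{0}=0$, while for $n\geq 1$ let

$$u_{n}=\frac{1}{b_{n+1}+}\,\frac{1}{b_{n+2}+}\cdots\quad\text{and}\quad v_{n}=\frac{1}{b_{n}+}\,\frac{1}{b_{n-1}+}\,\frac{1}{b_{n-2}+}\cdots\frac{1}{b_{1}}.\tag{1.2}$$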

Theorem.

(Davenport-Schmidt [10]) For any irrational $\alpha\in\mathbb{R}$ we have that

$$\delta(\alpha)=\limsup_{n\rightarrow\infty}\big(1+u_{n}v_{n}\big)^{-1}.\tag{1.3}$$

An immediate consequence of (1.3) is that the irrationals $\alpha\in\mathbb{R}$ for which Dirichlet’s theorem can be improved are precisely those whose continued fractions have bounded partial quotients. This condition is well-known to be equivalent to $\alpha$ being badly approximable [49, p. 22]. Real quadratic irrationalities are precisely those whose regular continued fraction expansions are eventually periodic, so they are badly approximable. On the other hand, they are the only known examples that are algebraic. A continued fraction discovered by Euler [15] provides an explicit example of an irrational (in fact transcendental) number that is not badly approximable, namely

$$\frac{e-1}{e+1}=\frac{1}{2+}\,\frac{1}{6+}\,\frac{1}{10+}\,\frac{1}{14+}\cdots.\tag{1.4}$$

By a well-known result of Khintchine [24, Thm 29] badly approximable numbers, although uncountable, are rare in the sense of measure theory. Thus we have the following.

Corollary.

The set of real irrationals for which Dirichlet’s theorem can be improved is uncountable and has Lebesgue measure zero.

Another consequence of the formula (1.3) is a bound for how much the Dirichlet theorem can be improved when it can be improved at all. (For further results about the set of values of $\delta(\alpha)$ see [22] and the references therein. See also our §12.)

Corollary.

The smallest value of $\delta(\alpha)$ is given by

$$\delta(\alpha)=\tfrac{1}{10}(5+\sqrt{5})=0.723607\dots,$$

when $\alpha=\frac{1}{2}(1+\sqrt{5})$ .
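The Davenport-Schmidt formula can be checked numerically for the golden ratio, all of whose partial quotients equal $1$. The sketch below is illustrative only: the function name `ds_terms` is hypothetical, and $u_{n}$ is approximated by truncating the continued fraction tail after finitely many terms.

```python
import math

def ds_terms(b, n_values):
    """For partial quotients b = [b_1, ..., b_N] (a truncation of the regular
    continued fraction), return (1 + u_n * v_n)^(-1) for each n in n_values,
    where u_n is approximated by the finite tail b_{n+1}, ..., b_N."""
    out = []
    for n in n_values:
        u = 0.0
        for k in reversed(range(n, len(b))):  # b[k] is b_{k+1}
            u = 1.0 / (b[k] + u)
        v = 0.0
        for k in range(n):                    # builds 1/(b_n + 1/(b_{n-1} + ...))
            v = 1.0 / (b[k] + v)
        out.append(1.0 / (1.0 + u * v))
    return out

if __name__ == "__main__":
    print(ds_terms([1] * 80, [30, 40]), (5 + math.sqrt(5)) / 10)
```

For this constant sequence the terms converge, so the limsup is simply the limit $\tfrac{1}{10}(5+\sqrt{5})$; for a general bounded sequence one would have to inspect the largest accumulation point instead.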

2. Improving the Minkowski approximation theorem

Davenport and Schmidt used their theorem as a starting point to obtain results that pertain to the Dirichlet theorems about approximating two numbers simultaneously and later more generally [11] (see also [48]). In this paper we will consider a different kind of generalization of Dirichlet’s results, one that was conceived of by Hermite and Minkowski.

Let $F:\mathbb{R}^{2}\rightarrow\mathbb{R}$ be a fixed norm on $\mathbb{R}^{2}$ and $\mathcal{B}$ its unit ball. Define the stretched norm $F_{t}$ for $t>0$ by

$$F_{t}(x,y)=F(t^{-1}x,ty).\tag{2.1}$$

The following generalization of Dirichlet’s theorem follows from the work of Minkowski. Although it was not stated directly by him, for the purposes of this paper we will refer to it as the Minkowski approximation theorem (in two dimensions).

Minkowski Approximation Theorem.

For a fixed norm $F$ on $\mathbb{R}^{2}$ let $\Delta=\Delta_{F}$ be the minimal area of a parallelogram with one vertex at the origin and the other three on the boundary of $\mathcal{B}$ . Fix $\alpha\in\mathbb{R}$ . Then for any real $t\geq 1$ there exist integers $p,q$ with $q>0$ such that

$$\Delta\,F_{t}^{2}(q,p-\alpha q)\leq 1.\tag{2.2}$$

Note that for this result we are not restricting $t$ to be an integer. It is not hard to see that for the sup-norm the Minkowski approximation theorem implies Dirichlet’s theorem. In this case $\Delta=1.$

The idea of generalizing Dirichlet’s theorem to other norms goes back at least to Hermite [19]. He applied (2.2) for the Euclidean norm, for which $\Delta=\frac{\sqrt{3}}{2}$ , together with the inequality between arithmetic and geometric means. The resulting inequality implies that for any irrational $\alpha$ there are infinitely many integers $p,q$ with $q>0$ such that

$$q\,|p-\alpha q|<\frac{1}{\sqrt{3}},\tag{2.3}$$

improving upon the corresponding upper bound $1$ given by Dirichlet’s theorem. Later Minkowski [33, 36] showed that (2.2) with the 1-norm given by $F(x,y)=|x|+|y|$ and for which $\Delta=\frac{1}{2}$ , implies (2.3) with $\tfrac{1}{\sqrt{3}}$ replaced by $\frac{1}{2}$ .
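The Minkowski approximation theorem can be sanity-checked numerically for the three norms mentioned so far, using the values $\Delta=\frac{1}{2}$, $\frac{\sqrt{3}}{2}$ and $1$ quoted in the text for the 1-norm, the Euclidean norm and the sup-norm. The following is a minimal sketch under stated assumptions: the names `stretched_norm`, `DELTA` and `best_value` are hypothetical, the search window $q\leq\lceil 2t\rceil$ is a convenience choice, and floating-point arithmetic is used throughout.

```python
import math

def stretched_norm(p, t, x, y):
    """F_t(x, y) = F(x / t, t * y) for the p-norm F; p = math.inf gives the sup-norm."""
    a, b = abs(x) / t, abs(y) * t
    if p == math.inf:
        return max(a, b)
    return (a ** p + b ** p) ** (1.0 / p)

# Values of Delta quoted in the text for the 1-norm, Euclidean norm and sup-norm.
DELTA = {1: 0.5, 2: math.sqrt(3) / 2, math.inf: 1.0}

def best_value(alpha, p, t):
    """Minimum of Delta * F_t(q, p0 - alpha*q)^2 over q = 1, ..., ceil(2t)."""
    best = math.inf
    for q in range(1, math.ceil(2 * t) + 1):
        p0 = round(alpha * q)  # nearest integer to alpha*q
        best = min(best, DELTA[p] * stretched_norm(p, t, q, p0 - alpha * q) ** 2)
    return best

if __name__ == "__main__":
    for p in (1, 2, math.inf):
        print(p, [best_value(math.sqrt(2), p, t) for t in (1.5, 3, 7.3, 20)])
```

Every printed value should be at most $1$, in line with (2.2); how far below $1$ the values stay for large $t$ is the kind of improvement the rest of the paper quantifies.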

Given these results of Hermite and Minkowski, it is natural to study the generalization for any norm of the quantity $\delta(\alpha)$ from the Davenport-Schmidt theorem. We want this generalization to measure to what extent the Minkowski approximation theorem (2.2) can be improved for a particular $\alpha$. Hence for a fixed norm $F$, let $\delta_{F}(\alpha)$ be the largest number with the property that if $c>\delta_{F}(\alpha)$ then for every sufficiently large $t$ there are $p,q\in\mathbb{Z}$ with $q>0$ such that

$$\Delta\,F_{t}^{2}(q,p-\alpha q)<c,$$

while for $c<\delta_{F}(\alpha)$ there are arbitrarily large $t$ for which no such $p,q$ exist. For a given norm we say that the Minkowski approximation theorem can be improved for irrational $\alpha\in\mathbb{R}$ if $\delta_{F}(\alpha)<1.$ A straightforward argument shows that when $F$ is the sup-norm, $\delta_{F}=\delta$ for $\delta$ in the Davenport–Schmidt theorem.

We have only been able to obtain satisfactory results about $\delta_{F}$ if we make the assumption that for all $(x,y)\in\mathbb{R}^{2}$ the norm $F$ satisfies

$$F(x,y)=F(|x|,|y|).\tag{2.4}$$

At least for the study of $\delta_{F}$ , we may assume without any further loss of generality that the norm $F$ also satisfies

$$F(0,\pm 1)=F(\pm 1,0)=1.\tag{2.5}$$

Definition 1.

Say that a norm $F$ is strongly symmetric if it satisfies (2.4) and (2.5).

The most important strongly symmetric norms are the $p$ -norms. For $(x,y)\in\mathbb{R}^{2}$ and a fixed $1\leq p<\infty$ the $p$ -norm is defined by

$$F^{\langle p\rangle}(x,y)=\big(|x|^{p}+|y|^{p}\big)^{\frac{1}{p}},$$

while $F^{\langle\infty\rangle}(x,y)=\sup\{|x|,|y|\}.$ Denote the corresponding $\mathcal{B}$ by $\mathcal{B}^{p}$ , $\Delta$ by $\Delta_{p}$ and $\delta$ by $\delta_{p}.$ Other interesting examples are the two unique strongly symmetric norms whose unit balls are regular octagons: $\mathcal{B}^{\mathrm{oct_{1}}}$ and $\mathcal{B}^{\mathrm{oct_{2}}}$ (see Figure 1).

Our first result generalizes the first corollary of the theorem of Davenport and Schmidt. It shows that for a strongly symmetric norm the set of irrationals for which the Minkowski approximation theorem can be improved, while uncountable, is small in the sense of measure theory.

Theorem 1.

Fix a strongly symmetric norm $F$ . Then the set of all real irrationals for which Minkowski’s approximation theorem can be improved is uncountable and has Lebesgue measure zero.

Next we have a uniform lower bound for $\delta_{F}(\alpha)$ for any strongly symmetric norm and any irrational $\alpha$ .

Theorem 2.

For any strongly symmetric norm $F$ and any irrational $\alpha\in\mathbb{R}$ we have that

$$\delta_{F}(\alpha)\geq\tfrac{1}{2}.\tag{2.6}$$

Equality in (2.6) can hold for the 1-norm. This follows from the next result since $\Delta_{1}=\frac{1}{2}.$ For simplicity say that an irrational $\alpha\in\mathbb{R}$ is well approximable if it is not badly approximable.

Theorem 3.

For any strongly symmetric norm $F$ the smallest value of $\delta_{F}(\alpha)$ for a well approximable $\alpha$ is $\Delta.$

We will see in the proof of Theorem 3 that $\delta_{p}(\alpha)=\Delta$ for any $\alpha$ whose regular continued fraction has partial quotients that are eventually strictly increasing, for example $\alpha=\tfrac{e-1}{e+1}$ from (1.4). For the $p$ -norm we can go further and identify the smallest value of $\delta_{p}(\alpha)$ for any irrational $\alpha$ .

Theorem 4.

For the $p$ -norm the smallest value of $\delta_{p}(\alpha)$ for an irrational $\alpha$ is $\Delta_{p}$ when $1\leq p\leq 2$ and is

$$\frac{\Delta_{p}}{10}\,(5+\sqrt{5})\Big(\big(\tfrac{1}{2}(\sqrt{5}-1)\big)^{p}+1\Big)^{2/p}\tag{2.7}$$

when $2<p\leq\infty.$ The value in (2.7) is attained when $\alpha=\frac{-1+\sqrt{5}}{2}.$

The value of $\Delta_{p}$ is given below in (4.2). See Figure 2 for graphs of $\Delta_{p}$ and the minimum value of $\delta_{p}.$ It is not the case that the Minkowski approximation theorem can always be improved for each badly approximable irrational, not even each real quadratic irrational. For example, we show at the end of §8 that

$$\delta_{2}\big(\tfrac{1}{2}(-1+\sqrt{3})\big)=1.\tag{2.8}$$

Finding the norm or norms with the largest minimum value of $\delta_{F}(\alpha)$ among all strongly symmetric norms seems an interesting problem. The 2-norm has the largest minimum value $\frac{\sqrt{3}}{2}=0.866025\dots$ of $\delta_{p}$ among all $p$ -norms. It can be shown that the minimum value of $\delta_{F}(\alpha)$ for both of the octagonal norms is $\frac{1}{8}\left(3\sqrt{2}+2\right)=0.78033\dots$ (see the end of §10). Among all of the examples we have considered, the 2-norm provides the largest minimum (see Figure 2).

Remarks:

Results like ours involving continuously varying norms belong to the “parametric geometry of numbers,” an area that has recently seen a revival of activity stimulated by Schmidt and Summerer [50, 51]. See also [47] and its references.

The sequence $(u_{n},v_{n})$ from (1.2) represents a trajectory of the dynamical system on $\Omega_{0}=[0,1)\times[0,1]$ determined by the extended continued fraction map $T:\Omega_{0}\rightarrow\Omega_{0}$ given by $T(0,v)=(0,0)$ and for $u>0$ by

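$$T(u,v)=\Big(\frac{1}{u}-\Big\lfloor\frac{1}{u}\Big\rfloor,\;\frac{1}{v+\lfloor\frac{1}{u}\rfloor}\Big).\tag{2.9}$$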

It has an invariant measure $\omega$ with density function

$$\frac{1}{\log 2}\,\frac{1}{(1+uv)^{2}}.\tag{2.10}$$

The ergodicity of this system, which is the natural extension of the usual continued fraction dynamical system, can be used to give a different proof that Dirichlet’s theorem cannot be improved for almost all real irrationals. An argument of [23], given as Lemma 5.3.11 of [9], allows one to conclude an almost all result for the special trajectories (1.2). Our proof of Theorem 1 proceeds along similar lines except that the trajectories of our dynamical system are determined by certain semi-regular continued fractions, which admit $\pm 1$ as partial numerators. The continued fractions we need are examples of $\mathcal{S}$ -expansions, which have a well developed metrical theory again based on the ergodic theorem. For some remarks on the connection between these dynamical systems and the geodesic flow on ${\rm SL}(2,\mathbb{Z})\backslash{\rm SL}(2,\mathbb{R})$ see §12.
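To make the extended continued fraction map concrete, here is a small sketch that iterates $T$ with exact rational arithmetic. The function names `T` and `trajectory` are hypothetical, and a rational starting value is used only so that the arithmetic is exact; the remark in the text concerns irrational $\alpha$, for which the trajectory never reaches $u=0$.

```python
from fractions import Fraction
from math import floor

def T(u, v):
    """Extended continued fraction map on [0,1) x [0,1], with T(0, v) = (0, 0)."""
    if u == 0:
        return Fraction(0), Fraction(0)
    b = floor(1 / u)               # next partial quotient
    return 1 / u - b, 1 / (v + b)

def trajectory(u0, steps):
    """Iterate T starting from (u0, 0), mirroring u_0 = alpha - b_0, v_0 = 0."""
    pts = [(Fraction(u0), Fraction(0))]
    for _ in range(steps):
        pts.append(T(*pts[-1]))
    return pts

if __name__ == "__main__":
    # 17/47 has regular continued fraction [0; 2, 1, 3, 4]
    for u, v in trajectory(Fraction(17, 47), 3):
        print(u, v)
```

Each step reads off one partial quotient from $u$ and appends it to the reversed expansion stored in $v$, which is why the pairs produced agree with $(u_{n},v_{n})$ from (1.2).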

The second corollary of the Theorem of Davenport and Schmidt and our generalization, Theorem 4, require for their proofs information about all, rather than almost all trajectories. Other aspects of the continued fractions we use are needed, including a best approximation property given in terms of the norm, in order to be able to analyze in detail each individual trajectory.

3. The continued fraction associated to a norm

We want to give a generalization of the formula (1.3) of Davenport and Schmidt and for that we require, as previously mentioned, certain infinite semi-regular continued fraction expansions. Such a continued fraction has the form

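$$a_{0}+\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\,\frac{\varepsilon_{3}}{a_{3}+}\cdots,\qquad\varepsilon_{m}=\pm 1,\;a_{m}\in\mathbb{Z}\tag{3.1}$$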

where $a_{m}>0$ and $a_{m}+\varepsilon_{m+1}\geq 1$ for all $m\geq 1$ and $a_{m}+\varepsilon_{m+1}\geq 2$ for infinitely many $m$ . For any $m\geq 0$ the $m^{th}$ convergent of this continued fraction

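$$\frac{p_{m}}{q_{m}}=a_{0}+\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\cdots\frac{\varepsilon_{m}}{a_{m}}$$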

uniquely defines relatively prime integers $p_{m},q_{m}$ with $q_{m}>0$ , where $p_{0}=a_{0}$ and $q_{0}=1.$ Tietze ([55], see also [43, p. 135]) showed that there is an irrational $\alpha$ to which such a continued fraction converges, meaning that $\alpha=\lim_{m\rightarrow\infty}\frac{p_{m}}{q_{m}}.$
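The convergents of a semi-regular continued fraction can be computed with the standard three-term recurrence $p_{m}=a_{m}p_{m-1}+\varepsilon_{m}p_{m-2}$, $q_{m}=a_{m}q_{m-1}+\varepsilon_{m}q_{m-2}$ with $p_{-1}=1$, $q_{-1}=0$. This recurrence is a well-known fact about general continued fractions rather than something stated in the text above, and the function names `convergents` and `value` below are hypothetical.

```python
from fractions import Fraction

def convergents(a, eps):
    """Convergents (p_m, q_m) of a_0 + eps_1/(a_1 + eps_2/(a_2 + ...)).

    a = [a_0, ..., a_N], eps = [eps_1, ..., eps_N] with entries +1 or -1.
    """
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = a[0], 1                 # p_0, q_0
    out = [(p, q)]
    for a_m, e_m in zip(a[1:], eps):
        p, p_prev = a_m * p + e_m * p_prev, p
        q, q_prev = a_m * q + e_m * q_prev, q
        out.append((p, q))
    return out

def value(a, eps):
    """Evaluate the finite continued fraction exactly, from the inside out."""
    x = Fraction(a[-1])
    for a_m, e_m in zip(reversed(a[:-1]), reversed(eps)):
        x = a_m + Fraction(e_m) / x
    return x

if __name__ == "__main__":
    a, eps = [1, 3, 2], [-1, 1]
    print(convergents(a, eps), value(a, eps))
```

Comparing `convergents` against `value` on truncations of the same expansion is a cheap way to check that the index conventions for $\varepsilon_{m}$ and $a_{m}$ line up.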

The continued fraction we need is characterized by a best approximation property stated in terms of the given strongly symmetric norm.

Definition 2.

Say that a rational number $\frac{p}{q}$ where $q>0$ is a best approximation to $\alpha$ with respect to the norm $F$ if there is a $t>1$ depending only on $\frac{p}{q}$ such that

$$F_{t}(q,p-\alpha q)<F_{t}(s,r-\alpha s)$$

for all rational $\frac{r}{s}\neq\frac{p}{q}$ .

In the case of the sup-norm Definition 2 is equivalent to the usual one that states that a rational number $\frac{p}{q}$ with $q>0$ is a best approximation to an irrational $\alpha$ if for all rational numbers $\frac{r}{s}\neq\frac{p}{q}$ with $0<s\leq q$ we have

$$|p-\alpha q|<|r-\alpha s|$$

(see Lemma 6.1 below).

Theorem 5.

Fix a strongly symmetric norm $F$ . Every irrational $\alpha\in\mathbb{R}$ has a unique semi-regular continued fraction expansion whose convergents are precisely the best approximations to $\alpha$ with respect to $F$ .

We will refer to this continued fraction as the $F$ -continued fraction of $\alpha$ and, for the $p$ -norm, as the $p$ -continued fraction of $\alpha.$ For the sup-norm the $\infty$ -continued fraction is closely related to, but not always equal to, the regular continued fraction. Suppose that

$$\alpha=b_{0}+\frac{1}{b_{1}+}\,\frac{1}{b_{2}+}\,\frac{1}{b_{3}+}\cdots\tag{3.2}$$

is the regular continued fraction of an irrational $\alpha$ . Recall that Lagrange showed ([25], see also [43, §15]) that every best approximation in the usual sense is a convergent of the regular continued fraction of $\alpha$ and that every convergent, except possibly $b_{0}$ , is a best approximation to $\alpha$ . In view of Theorem 5, (3.2) coincides with the $\infty$ -continued fraction of $\alpha$ if and only if $b_{1}>1$ . If $b_{1}=1$ the $\infty$ -continued fraction of $\alpha$ is

$$\alpha=b_{0}+1+\frac{-1}{b_{2}+1+}\,\frac{1}{b_{3}+}\cdots.\tag{3.3}$$

This is an example of a singularization, which has the effect of contracting the regular continued fraction by removing $b_{1}=1$ and the convergent $b_{0}$ , which is not a best approximation to $\alpha$ in this case. This well-known exceptional case does not occur for the $\infty$ -continued fraction. With this one possible exception, however, the convergents of the $\infty$ -continued fraction and those of the regular continued fraction coincide.

For any $1\leq p<\infty$ , the inequality between arithmetic and geometric means immediately gives that a necessary condition for a regular convergent $\frac{r_{n}}{s_{n}}$ of an irrational $\alpha$ to be a convergent of the $p$ -continued fraction is that

$$s_{n}\,|r_{n}-\alpha s_{n}|\leq\big(4^{1/p}\Delta_{p}\big)^{-1}.\tag{3.4}$$

For $p=1$ , when the right hand side is $\frac{1}{2}$ , Minkowski [34] showed that (3.4) is also sufficient.

Just as the formula (1.3) is given in terms of the sequence $u_{n},v_{n}$ coming from the regular continued fraction, our generalization will be given in terms of a sequence $\mu_{m},\nu_{m}$ determined by our continued fraction $\alpha=a_{0}+\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\;\frac{\varepsilon_{3}}{a_{3}+}\cdots$ . Namely, for a fixed norm we define $\mu_{0}=\alpha$ and $\nu_{0}=0$ , while for $m\geq 1$ we let

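$$\mu_{m}=\frac{\varepsilon_{m+1}}{a_{m+1}+}\,\frac{\varepsilon_{m+2}}{a_{m+2}+}\cdots\quad\text{and}\quad\nu_{m}=\frac{1}{a_{m}+}\,\frac{\varepsilon_{m}}{a_{m-1}+}\,\frac{\varepsilon_{m-1}}{a_{m-2}+}\cdots\frac{\varepsilon_{2}}{a_{1}}.\tag{3.5}$$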

For a general strongly symmetric norm we will express $\delta_{F}(\alpha)$ in terms of these numbers $\mu_{m},\nu_{m}$ in §7 below. For the $p$ -norm the formula is completely explicit and we give it here. For $p$ with $1\leq p<\infty$ let

$$D_{p}(u,v)=\frac{1}{1+uv}\left(\frac{(1-|u|^{p}v^{p})^{2}}{(1-|u|^{p})(1-v^{p})}\right)^{\frac{1}{p}},$$

while when $p=\infty$ set $D_{\infty}(u,v)=\lim_{p\rightarrow\infty}D_{p}(u,v)=(1+uv)^{-1}.$

Theorem 6.

Fix $1\leq p\leq\infty.$ For any irrational $\alpha$ whose $p$ -continued fraction is (3.1) we have that

$$\delta_{p}(\alpha)=\limsup_{m\rightarrow\infty}\Delta_{p}\,D_{p}\big(\mu_{m},\nu_{m}\big),$$

where $\mu_{m},\nu_{m}$ are given above in (3.5).
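The function $D_{p}$ is easy to evaluate numerically. The sketch below (with hypothetical names `D` and `D_inf`, and assuming $|u|<1$ and $0\leq v<1$ so that the denominators do not vanish) checks that $D_{p}$ approaches $(1+uv)^{-1}$ for large $p$, and that at $u=v=\frac{1}{2}(\sqrt{5}-1)$ it reproduces the factor multiplying $\Delta_{p}$ in (2.7).

```python
def D(p, u, v):
    """D_p(u, v) for 1 <= p < infinity, assuming |u| < 1 and 0 <= v < 1."""
    ratio = (1 - abs(u) ** p * v ** p) ** 2 / ((1 - abs(u) ** p) * (1 - v ** p))
    return ratio ** (1.0 / p) / (1 + u * v)

def D_inf(u, v):
    """Limiting case p = infinity."""
    return 1.0 / (1 + u * v)

if __name__ == "__main__":
    beta = (5 ** 0.5 - 1) / 2
    print(D(200, 0.3, 0.5), D_inf(0.3, 0.5))
    print(D(3, beta, beta), (5 + 5 ** 0.5) / 10 * (beta ** 3 + 1) ** (2 / 3))
```

The second comparison uses the identity $(1-\beta^{2p})^{2}/(1-\beta^{p})^{2}=(1+\beta^{p})^{2}$ together with $(1+\beta^{2})^{-1}=\frac{1}{10}(5+\sqrt{5})$ for $\beta=\frac{1}{2}(\sqrt{5}-1)$.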

The $F$ -continued fraction of an irrational $\alpha$ for any strongly symmetric norm is an example of an $\mathcal{S}$ -expansion. Their theory has been developed by Kraaikamp [28] and others (see also [4], [9] and [26]). Recall the definition of $T$ and $\omega$ from (2.9) and (2.10). A Borel set $\mathcal{S}\subset\Omega_{0}$ is called a singularization area if $\omega(\partial\mathcal{S})=0$ and if

(i) $\mathcal{S}\subseteq[\tfrac{1}{2},1)\times[0,1]$ and

(ii) $T\mathcal{S}\cap\mathcal{S}\subseteq\{(\beta,\beta)\},$ where $\beta=\frac{1}{2}(-1+\sqrt{5}).$

The $\mathcal{S}$ -expansion of an irrational $\alpha$ is obtained from the regular continued fraction (1.1) by changing

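$$\alpha=\cdots\frac{1}{b_{n}+}\,\frac{1}{1+}\,\frac{1}{b_{n+2}+}\cdots\quad\text{into}\quad\alpha=\cdots\frac{1}{(b_{n}+1)+}\,\frac{-1}{(b_{n+2}+1)+}\cdots$$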

for each $n$ such that $(u_{n},v_{n})\in\mathcal{S}$ . Note that $(u_{n},v_{n})\in\mathcal{S}$ implies that $b_{n+1}=1$ by (i). Also (ii) implies that this procedure is unambiguous. The result is a unique semi-regular continued fraction for $\alpha$ whose convergents are precisely those regular convergents $\frac{r_{n}}{s_{n}}$ where $n\geq 0$ is such that $(u_{n},v_{n})\notin\mathcal{S}$ . For example, the $\infty$ -continued fraction discussed above is the $\mathcal{S}$ expansion for $\mathcal{S}=[\frac{1}{2},1)\times\{0\}$ . As usual, we denote $\mathcal{S}$ by $\mathcal{S}_{p}$ in the case of the $p$ -norm (see Figure 3).
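A single singularization step rests on the identity $A+\cfrac{1}{1+\cfrac{1}{B+x}}=(A+1)-\cfrac{1}{(B+1)+x}$. The following sketch applies it to a finite expansion and confirms that the value is unchanged. The names `value` and `singularize` are hypothetical, the index `k` of the partial quotient to remove is supplied by hand rather than derived from a singularization area $\mathcal{S}$, and only the case where the neighbouring partial numerators are $+1$ is handled.

```python
from fractions import Fraction

def value(a, eps):
    """Evaluate a_0 + eps_1/(a_1 + eps_2/(a_2 + ...)) exactly for finite lists."""
    x = Fraction(a[-1])
    for a_m, e_m in zip(reversed(a[:-1]), reversed(eps)):
        x = a_m + Fraction(e_m) / x
    return x

def singularize(a, eps, k):
    """Remove the partial quotient a_k = 1 (1 <= k < len(a) - 1).

    a = [a_0, ..., a_N], eps = [eps_1, ..., eps_N]; eps[k - 1] is eps_k.
    """
    assert a[k] == 1 and eps[k - 1] == 1 and eps[k] == 1
    new_a = a[:k - 1] + [a[k - 1] + 1, a[k + 1] + 1] + a[k + 2:]
    new_eps = eps[:k - 1] + [-1] + eps[k + 1:]
    return new_a, new_eps

if __name__ == "__main__":
    a, eps = [0, 2, 1, 3, 4], [1, 1, 1, 1]
    print(singularize(a, eps, 2), value(a, eps))
```

Taking `k = 1` reproduces the passage from (3.2) to (3.3), where $b_{1}=1$ is removed and $b_{0}$ is replaced by $b_{0}+1$.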

Theorem 7.

Fix a strongly symmetric norm $F$ . There exists a singularization area $\mathcal{S}$ so that the $F$ -continued fraction of any irrational $\alpha$ is the $\mathcal{S}$ -expansion of $\alpha$ .

Remarks: At the beginning of the paper [33], Minkowski states without proof several of the main properties of the $p$ -continued fraction for any $p$ , including the best approximation property. Our proof of Theorem 5, which allows for $F$ to be any strongly symmetric norm, was strongly influenced by his ideas. As previously mentioned, Minkowski [34] also gave the remarkable result that for the 1-continued fraction the necessary condition (3.4) is also sufficient. Unsurprisingly, this also follows from our arguments. In addition, the 2-continued fraction has actually been studied since the time of Hermite [18], especially by Humbert [20, 21]. It is also closely connected to the improper modular billiards studied in [1]. It was shown in [27] that the 1-continued fraction (known as Minkowski’s diagonal continued fraction) is an $\mathcal{S}$ -expansion. For related work on the 1-continued fraction see [40]. That the $p$ -continued fraction for $p\neq 1,\infty$ is also an $\mathcal{S}$ -expansion seems to be new.

In the next section we will review some basic facts from the geometry of numbers in the case we need, namely in two dimensions. Seven sections, each with the proof of one of our theorems, follow afterward. The theorems will be proven in the following order:

[TABLE]

Some concluding remarks are then given. Finally, an appendix contains a number of technical lemmas and their proofs that we will refer to as needed in the main body of the paper.

4. Geometry of numbers

As above let $F$ be a fixed norm on $\mathbb{R}^{2}$ . This means that for $P,P^{\prime}\in\mathbb{R}^{2}$ we have

(i) $F(P)\geq 0$ and $F(P)=0$ if and only if $P=(0,0)$,

(ii) $F(tP)=|t|F(P)$ for $t\in\mathbb{R}$,

(iii) $F(P+P^{\prime})\leq F(P)+F(P^{\prime})$.

The unit ball of the norm is

$$\mathcal{B}=\{P\in\mathbb{R}^{2};\;F(P)<1\}.$$

This $\mathcal{B}$ is open, bounded, convex and symmetric around 0 and every such body arises as the unit ball of some norm (see e.g. [52]). Denote by $\mathrm{area}(\mathcal{B})$ the Lebesgue measure of $\mathcal{B}$ on $\mathbb{R}^{2}$ . It is convenient to define the stretched ball for $t>0$

$$\mathcal{B}_{t}=\{P\in\mathbb{R}^{2};\;F_{t}(P)<1\}.$$

Let $L\subset\mathbb{R}^{2}$ be a (full) lattice. By the determinant of $L$ , denoted $\det{L}$ , we mean $|\det{g}|$ for any $g\in{\rm GL}(2,\mathbb{R})$ whose rows give a $\mathbb{Z}$ -basis for $L$ . The lattice $L$ is admissible for $\mathcal{B}$ if $\mathcal{B}$ contains no other points of $L$ than $(0,0)$ . The following result is fundamental [36]:

Minkowski’s First Convex Body Theorem.

If $L$ is admissible for $\mathcal{B}$ then

$$\mathrm{area}(\mathcal{B})\leq 4\det L.$$

The critical determinant of $\mathcal{B}$ , denoted $\Delta(\mathcal{B})$ or simply $\Delta$ , is the infimum of all determinants of lattices admissible for $\mathcal{B}$ . Building on work of Minkowski [35, 37], Mahler [29] proved that lattices with determinant $\Delta$ actually exist, and these are called critical lattices. Minkowski’s first convex body theorem implies that

$$\Delta\geq\tfrac{1}{4}\,\mathrm{area}(\mathcal{B}).$$

This is sharp for the 1-norm and the sup-norm.

Apparently, if we wish to evaluate $\delta_{F}(\alpha)$ exactly we must also know $\Delta$ exactly. Finding the critical determinant of a given $\mathcal{B}$ is the main problem of the geometry of numbers in $\mathbb{R}^{2}.$ Although the $n$ -dimensional version of this problem is apparently intractable in general, here it is approachable. For a given critical lattice $L$ for $\mathcal{B}$ the boundary of $\mathcal{B}$ must contain a $\mathbb{Z}$ -basis $\{P,P^{\prime}\}$ for $L$ as well as their sum $P+P^{\prime}$ . Furthermore, the lattice generated by any pair of points $P,P^{\prime}$ with $P,P^{\prime},P+P^{\prime}$ on the boundary of $\mathcal{B}$ is admissible for $\mathcal{B}$ (see [7, Thm XI p. 160]). Therefore, as Minkowski already knew, computing $\Delta$ amounts to solving the (generally quite difficult) calculus problem of minimizing the area of a parallelogram with one vertex at the origin and the three others on the boundary of $\mathcal{B}$ . This justifies our definition of $\Delta$ in the statement of the Minkowski approximation theorem.

Next we review what is known about the value of $\Delta_{p}$ for all $p$ . Let

[TABLE]

where $0<\tau_{p}<\frac{1}{2}$ satisfies $\tau_{p}^{p}+1=2(1-\tau_{p})^{p}$ . A modification of a conjecture of Minkowski [37, p. 51–58] made by Davis [12] states that

[TABLE]

Furthermore, there is a unique value $2.57<\rho<2.58$ so that $\Delta_{p}=\Delta_{p}^{(0)}$ when $2\leq p\leq\rho$ , while otherwise $\Delta_{p}=\Delta_{p}^{(1)}$ . Many mathematicians obtained partial results, among them Mordell [39], Davis [12], Cohn [8], Watson [56, 57] and Malyshev [31]. Building on their work, the proof of the full conjecture was finally completed by Glazunov, Golovanov and Malyshev [16].

In the case of the $p$ -norm parallelograms that minimize the area may be given explicitly. For $2\leq p\leq\rho$ we may take the parallelogram with vertices at $0,P,P^{\prime},P+P^{\prime}$ where

[TABLE]

For $1\leq p\leq 2$ or $\rho\leq p\leq\infty$ we may take

[TABLE]

where again $0<\tau_{p}<\frac{1}{2}$ solves $\tau_{p}^{p}+1=2(1-\tau_{p})^{p}.$ Except when $p=1,2$ or $\infty$ these parallelograms are unique up to obvious symmetries. When $p=1,2$ or $\infty$ there are infinitely many essentially different minimizing parallelograms. They are easily parameterized. For example, when $p=2$ all are obtained by rotating the standard hexagonal lattice coming from (4.3).
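The parameter $\tau_{p}$ is defined only implicitly, but it is straightforward to compute. A minimal bisection sketch follows, assuming finite $p\geq 1$; the name `tau` is hypothetical. On $[0,\frac{1}{2}]$ the function $\tau\mapsto\tau^{p}+1-2(1-\tau)^{p}$ is increasing, negative at $0$ and positive at $\frac{1}{2}$, so it has exactly one root there.

```python
def tau(p, iterations=200):
    """Solve tau^p + 1 = 2 * (1 - tau)^p for 0 < tau < 1/2 by bisection."""
    def f(x):
        return x ** p + 1 - 2 * (1 - x) ** p  # increasing on [0, 1/2]

    lo, hi = 0.0, 0.5                         # f(lo) < 0 < f(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(tau(1), tau(2), tau(3))
```

Two closed forms give a check: for $p=1$ the equation is linear with solution $\tau_{1}=\frac{1}{3}$, and for $p=2$ it reduces to $\tau^{2}-4\tau+1=0$, so $\tau_{2}=2-\sqrt{3}$.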

Minkowski’s method can be restated as saying that $3\Delta(\mathcal{B})$ is the minimal area of an affinely regular symmetric hexagon inscribed in $\mathcal{B}$. A useful alternative due to Reinhardt [46] is that $4\Delta(\mathcal{B})$ is the minimum area of a symmetric convex circumscribed hexagon (see also [7, p. 239] or [17, Thm 2 p. 243]). Using this fact, which he also found independently, Mahler [30] computed $\Delta(\mathcal{B}^{\mathrm{oct_{1}}})=\sqrt{2}-\tfrac{1}{2}$ for the regular octagon from Figure 1. Thus we also have $\Delta(\mathcal{B}^{\mathrm{oct_{2}}})=\frac{1}{8}\left(3\sqrt{2}+2\right)$, obtained by scaling.

5. A Minkowski-type algorithm

In this section we will prove Theorem 2. First we give a needed definition. A minimal basis for a lattice $L\subset\mathbb{R}^{2}$ with respect to a norm $F$ is a $\mathbb{Z}$ -basis $\{P,P^{\prime}\}$ for $L$ with the property that

[TABLE]

For $\alpha\in\mathbb{R}$ let

$$L_{\alpha}=\{(q,\,p-\alpha q);\;p,q\in\mathbb{Z}\}.$$

Obviously $L_{\alpha}$ has determinant one.

To prove Theorems 2 and 5 we require an algorithm that constructs a sequence of points $P_{m}\in L_{\alpha}$ and positive numbers $t_{m}$ such that $\{P_{m-1},P_{m}\}$ gives a minimal basis for $L_{\alpha}$ with respect to $F_{t_{m}}.$ We also want $P_{m-1}$ to have the smallest norm $F_{t}(P_{m-1})$ among non-zero points in $L_{\alpha}$ for any $t\in(t_{m-1},t_{m})$. We will start with $P_{-1}=(0,1)$ and $P_{0}=(1,-\alpha)$. Roughly speaking, given $P_{m-1}$, to find the new point $P_{m}$ and the associated $t_{m}$, we simultaneously expand $\mathcal{B}$ in the $x$-direction while shrinking it in the $y$-direction in such a way that $P_{m-1}$ remains on its boundary until we encounter $P_{m}$. We then repeat this procedure starting with $P_{m}$ (see Figure 5). Our algorithm will produce pairs of lattice points in $L_{\alpha}$ that are linearly independent over $\mathbb{R}$ and on the boundary of a ball for which $L_{\alpha}$ is admissible. First we need to know that they give a basis for $L_{\alpha}.$

Lemma 5.1.

Let $\alpha\in\mathbb{R}$ be irrational and $F$ be a fixed strongly symmetric norm. Suppose that $P,P^{\prime}\in L_{\alpha}$ lie on the boundary of $\mathcal{B}_{t}$ for some $t>0$ and are linearly independent over $\mathbb{R}$ . If $L_{\alpha}$ is admissible for $\mathcal{B}_{t}$ then $\{P,P^{\prime}\}$ gives a $\mathbb{Z}$ -basis for $L_{\alpha}$ .

Proof.

Consider the sublattice $P\mathbb{Z}+P^{\prime}\mathbb{Z}$ of $L_{\alpha}$ generated by these lattice points. By Minkowski’s first convex body theorem its index in $L_{\alpha}$ can only be 1 or 2. In the latter case suppose that $P=aQ+bQ^{\prime}$ and $P^{\prime}=cQ+dQ^{\prime}$ where $L_{\alpha}=Q\mathbb{Z}+Q^{\prime}\mathbb{Z}$ , so $|ad-bc|=2.$ If $a$ were even and $c$ odd we would have that $b$ is even and so $\tfrac{1}{2}P=(\frac{a}{2})Q+(\frac{b}{2})Q^{\prime}$ would be a non-zero point in $\mathcal{B}_{t}\cap L_{\alpha}$ . A similar argument disallows $c$ being even and $a$ odd. Thus $a$ and $c$ are either both even or both odd. Similarly $b$ and $d$ are either both even or both odd. In any case

[TABLE]

are distinct points of $L_{\alpha}$ . As $\mathcal{B}_{t}$ is convex they must lie on the boundary of $\mathcal{B}_{t}$ . It follows that $\mathcal{B}_{t}$ must be a parallelogram and strong symmetry implies it is a stretched ball for either the 1-norm or the sup norm. As the corners and midpoints of the sides are lattice points we would have to have that $L_{\alpha}$ contains points of the $x$ -axis, i.e. $\alpha$ would be rational. ∎

In the next lemma we make the whole process precise. Clearly to represent any $L_{\alpha}$ we may assume that $\alpha\in(-\frac{1}{2},\frac{1}{2}].$

Lemma 5.2.

Fix a strongly symmetric norm $F$ and an irrational $\alpha\in(-\frac{1}{2},\frac{1}{2})$ . There is a sequence $1=t_{-1}\leq t_{0}<t_{1}<t_{2}<\ldots$ tending to $\infty$ and for each $m=-1,0,1,\dots$ there is a $P_{m}=(x_{m},y_{m})\in L_{\alpha}$ with the following properties. For each $m\geq 0$

(i) $x_{m}>x_{m-1}$ and $|y_{m}|<|y_{m-1}|$ ,

(ii) $\{P_{m-1},P_{m}\}$ gives a minimal basis for $L_{\alpha}$ with respect to $F_{t_{m}}$ ,

(iii) for any $t\in(t_{m-1},t_{m})$ there is no $P^{\prime}=(x^{\prime},y^{\prime})\in L_{\alpha}$ different from $P_{m-1}$ with $x^{\prime}>0$ and

[TABLE]

Proof.

Consider for $P=(x,y)\in L_{\alpha}$ the ball

[TABLE]

for which $\mathrm{area}\,\mathcal{B}(P,t)=F^{2}_{t}(P)\,\mathrm{area}\,\mathcal{B}.$ Suppose that $L_{\alpha}$ is admissible for $\mathcal{B}(P,t)$ . Now by Lemma A.1

[TABLE]

Thus, as long as $y\neq 0$ , by Minkowski’s first convex body theorem there will be a maximal $t^{\prime}\geq t$ for which $L_{\alpha}$ is admissible for $\mathcal{B}(P,t^{\prime})$ . For any of the resulting $P^{\prime}\neq-P$ with $F_{t^{\prime}}(P^{\prime})=F_{t^{\prime}}(P)$ , we have by Lemma 5.1 that $\{P,P^{\prime}\}$ gives a minimal basis for $L_{\alpha}$ with respect to $F_{t^{\prime}}.$

Let $P_{-1}=(0,1)$ . Then $L_{\alpha}$ is admissible for $\mathcal{B}(P_{-1},1)$ . Let $t_{0}\geq 1$ be maximal for which $L_{\alpha}$ is admissible for $\mathcal{B}(P_{-1},t_{0})$ . Our assumption that $\alpha\in(-\frac{1}{2},\frac{1}{2})$ implies that we can take $P_{0}=(1,-\alpha)$ as a solution to $F_{t_{0}}(P_{0})=F_{t_{0}}(P_{-1})$ .

Now $L_{\alpha}$ is admissible for $\mathcal{B}(P_{0},t_{0})$ and so we find $t_{1}>t_{0}$ maximal so that $L_{\alpha}$ is admissible for $\mathcal{B}(P_{0},t_{1})$ . That $t_{1}>t_{0}$ with strict inequality is assured by our choice of $P_{0}$ . Among the finitely many $P^{\prime}=(x^{\prime},y^{\prime})\in L_{\alpha}$ with $F_{t_{1}}(P^{\prime})=F_{t_{1}}(P_{0})$ there will be a unique one with maximal $x^{\prime}$ since $\alpha$ is irrational. We let $P_{1}=(x_{1},y_{1})$ be this point. Clearly $x_{1}>x_{0}$ and $|y_{1}|<|y_{0}|$ .

We continue this process to construct $t_{m}$ and $P_{m}$ . That we have $t_{m}>t_{m-1}$ is guaranteed by choosing among the new points on the boundary the one with maximal $x$ -coordinate. From the form of $L_{\alpha}$ , where $\alpha$ is irrational, it follows that $x_{m}>x_{m-1}$ and $|y_{m-1}|>|y_{m}|>0$ for each $m\geq 0$ and that this process never terminates.

We will have all the stated properties of $t_{m}$ and $P_{m}$ once we show that $t_{m}\rightarrow\infty$ . We have by Lemma A.1 and Minkowski’s first convex body theorem again that for each $m\geq 0$

[TABLE]

Thus $t_{m}\gg x_{m}\rightarrow\infty$ as $x_{m}>x_{m-1}$ are integers. ∎

Proof of Theorem 2

Observe that $\mathcal{B}(P_{m},t_{m})$ as defined by (5.2), with $P_{m}$ and $t_{m}$ from Lemma 5.2, contains a parallelogram of area 2 since $\{P_{m-1},P_{m}\}$ is a minimal basis for $L_{\alpha}$ with respect to $F_{t_{m}}.$ Therefore

[TABLE]

where to get the second inequality we have applied Minkowski’s bound (4.1). Since $t_{m}\rightarrow\infty$ as $m\rightarrow\infty$ we have that

[TABLE]

thus proving Theorem 2. ∎

6. The continued fraction

Next we relate to each other the two definitions of best approximation given in and below Definition 2.

Lemma 6.1.

If the fraction $\frac{p}{q}$ with $q>0$ is a best approximation of an irrational $\alpha$ with respect to a strongly symmetric norm $F$ then it is a best approximation in the usual sense. Conversely, if $\frac{p}{q}$ with $q>0$ is a best approximation of an irrational $\alpha$ in the usual sense it is a best approximation with respect to the sup-norm.

Proof.

Suppose that there is a $t>1$ such that $\frac{r}{s}\neq\frac{p}{q}$ with $s>0$ implies

[TABLE]

If $s\leq q$ then $|r-\alpha s|>|p-\alpha q|$ by Lemma A.1.

Conversely, suppose that $\frac{r}{s}\neq\frac{p}{q}$ and $0<s\leq q$ implies that $|p-\alpha q|<|r-\alpha s|$ . Choose $t$ such that $t^{-1}q=t|p-\alpha q|$ and note that such a $t$ satisfies $t>1.$

If $0<s\leq q$ and $\frac{r}{s}\neq\frac{p}{q}$ then

[TABLE]

If $s>q$ then

[TABLE]

This finishes the proof. ∎
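The relation between best approximations in the usual sense and regular convergents can be checked numerically. The following is a minimal sketch, not part of the paper's argument: it assumes the sample value $\alpha=\sqrt{2}$, works in floating point, and uses hypothetical helper names. It lists the denominators $q\leq 100$ at which $|q\alpha-p|$ attains a strict new minimum and compares them with the denominators of the regular convergents.

```python
import math
from fractions import Fraction


def regular_partial_quotients(alpha, count):
    """First `count` regular partial quotients b_0, b_1, ... of a float alpha."""
    quotients = []
    x = alpha
    for _ in range(count):
        b = math.floor(x)
        quotients.append(b)
        x = 1.0 / (x - b)  # alpha is assumed irrational, so x - b is nonzero
    return quotients


def regular_convergents(quotients):
    """Convergents r_n/s_n from r_n = b_n r_{n-1} + r_{n-2} (same for s_n)."""
    r_prev, s_prev = 1, 0       # (r_{-1}, s_{-1})
    r, s = quotients[0], 1      # (r_0, s_0)
    result = [Fraction(r, s)]
    for b in quotients[1:]:
        r, r_prev = b * r + r_prev, r
        s, s_prev = b * s + s_prev, s
        result.append(Fraction(r, s))
    return result


def best_approximation_denominators(alpha, q_max):
    """Denominators q <= q_max where min_p |q*alpha - p| is a strict new record."""
    record = math.inf
    denominators = []
    for q in range(1, q_max + 1):
        distance = abs(q * alpha - round(q * alpha))
        if distance < record:
            record = distance
            denominators.append(q)
    return denominators


alpha = math.sqrt(2)  # sample irrational, chosen only for illustration
convergent_denominators = [
    c.denominator for c in regular_convergents(regular_partial_quotients(alpha, 6))
]
best_denominators = best_approximation_denominators(alpha, 100)
```

For this sample value the two lists coincide, in line with Lagrange's theorem; the brute-force search is only meant as a sanity check for small denominators.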

Proof of Theorem 5

For any $\alpha\in\mathbb{R}$ write $\alpha=\alpha^{\prime}+a_{0}$ , where $a_{0}\in\mathbb{Z}$ and $\alpha^{\prime}\in(-\frac{1}{2},\frac{1}{2}].$ Suppose that $\alpha$ is irrational. In the notation of Lemma 5.2 (taking there $\alpha=\alpha^{\prime}$ ) for $m\geq-1$ write $P_{m}=(x_{m},y_{m})$ . For $m\geq 0$ define

[TABLE]

and set $g_{-1}=\left(\begin{smallmatrix}0&1\\ 1&-\alpha\end{smallmatrix}\right)$ . We know by Lemma 5.2 that for each $m\geq 1$ there is a positive integer $a_{m}$ and $\varepsilon_{m}=\pm 1$ so that

[TABLE]

This also holds for $m=0$ if we set $\varepsilon_{0}=1.$ Clearly for $m\geq 0$

[TABLE]

The numerator $p_{m}$ and denominator $q_{m}$ of the convergents

[TABLE]

of our continued fraction are determined recursively for $m\geq 0$ through

[TABLE]

It is easy to see that for $m\geq 0$

[TABLE]
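The recursion for convergents of a continued fraction with signs $\varepsilon_{m}=\pm 1$ can be illustrated in code. The sketch below makes two assumptions that are not taken from the text: it uses the standard recurrence $p_{m}=a_{m}p_{m-1}+\varepsilon_{m}p_{m-2}$, $q_{m}=a_{m}q_{m-1}+\varepsilon_{m}q_{m-2}$ with $(p_{-1},q_{-1})=(1,0)$, $(p_{-2},q_{-2})=(0,1)$ and $\varepsilon_{0}=1$, and it generates the digits by the nearest-integer expansion of a rational number, which serves only as a convenient stand-in for a semi-regular expansion and is not the expansion attached to a norm $F$. All function names are hypothetical.

```python
from fractions import Fraction


def nearest_integer_expansion(alpha):
    """Digits (a_m), signs (eps_m) with alpha = a_0 + eps_1/(a_1 + eps_2/(a_2 + ...)).

    `alpha` must be a Fraction, so that the expansion terminates.
    """
    a = [round(alpha)]
    eps = [1]                    # eps_0 = 1 by convention
    rest = alpha - a[0]          # lies in [-1/2, 1/2]
    while rest != 0:
        eps.append(1 if rest > 0 else -1)
        x = 1 / abs(rest)        # x >= 2
        a.append(round(x))
        rest = x - a[-1]
    return a, eps


def convergents(a, eps):
    """Pairs (p_m, q_m) from p_m = a_m p_{m-1} + eps_m p_{m-2} (same for q_m)."""
    p_prev2, q_prev2 = 0, 1      # (p_{-2}, q_{-2})
    p_prev, q_prev = 1, 0        # (p_{-1}, q_{-1})
    out = []
    for a_m, e_m in zip(a, eps):
        p = a_m * p_prev + e_m * p_prev2
        q = a_m * q_prev + e_m * q_prev2
        out.append((p, q))
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
    return out


# Example: 17/10 = 2 - 1/(3 + 1/3)
a, eps = nearest_integer_expansion(Fraction(17, 10))
cs = convergents(a, eps)
```

In this example the last convergent reproduces $17/10$ exactly, and consecutive convergents satisfy $|p_{m}q_{m-1}-p_{m-1}q_{m}|=1$, which is the unimodularity one expects of the matrices built from consecutive points.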

By (6.1) and (6.5) for each $m\geq 0$ we get that

[TABLE]

By Lemma 5.2 the basis of rows of $g_{m}$ is minimal for the norm $F_{t_{m}}$ . Choose any $t\in(t_{m},t_{m+1})$ . By (iii) of Lemma 5.2 for any $\frac{r}{s}\neq\frac{p_{m}}{q_{m}}$ we have

[TABLE]

Conversely, suppose that $\frac{r}{s}$ is a best approximation to $\alpha$ with respect to $F$ , and write $Q=(s,r-\alpha s)$ . Then for some $t>1$ , we have $F_{t}(Q)\leq F_{t}(P_{m})$ for all $m\geq 0$ . By Lemma 5.2 it cannot happen that $t\in(1,t_{0})$ since in that case we would have to have $Q=(0,1)$ and so $s=0$ . Also we cannot have that $t=t_{m}$ for any $m\geq 0.$ On the other hand, if $m$ is such that $t\in(t_{m},t_{m+1})$ , then by Lemma 5.2 we have $Q=P_{m}$ . It follows that the convergents are precisely the best approximations to $\alpha$ with respect to $F$ .

That the continued fraction converges to $\alpha$ now follows from the first statement of Lemma 6.1 and Lagrange’s theorem mentioned below Theorem 5, since they imply that each convergent of our continued fraction is a convergent of the regular continued fraction. It remains to show that it is semi-regular. Since $\alpha$ is irrational we need only show that $\varepsilon_{m+1}+a_{m}\geq 1$ for all $m\geq 1$ . This will follow once we relate the $\mu_{m},\nu_{m}$ from (3.5) to the points $P_{m}=(x_{m},y_{m})$ , which is also needed to prove our generalization of (1.3).

Lemma 6.2.

For $x_{m},y_{m}$ from (6.6) and $\mu_{m},\nu_{m}$ from (3.5) we have for $m\geq 0$ that

[TABLE]

Proof.

The proof is an adaptation to more general continued fractions of standard arguments used for regular continued fractions (see [49]).

To start with, by (6.6)

[TABLE]

By (6.5) and (6.2) we have

[TABLE]

Together with (6.3) and (6.4), this yields the following formal identity between rational functions with variables $a_{1},\dots,a_{m+1}$ :

[TABLE]

The $m^{th}$ complete quotient $\alpha_{m}$ of the expansion $\alpha=\frac{\varepsilon_{1}}{a_{1}+}\,\frac{\varepsilon_{2}}{a_{2}+}\cdots$ is defined recursively by $\alpha_{0}=\alpha$ and for $m\geq 0$ through

[TABLE]

It follows that for $m\geq 0$ we have

[TABLE]

By (6.11) upon setting the variable $a_{m+1}=\alpha_{m+1}$ and using (6.12) we derive that

[TABLE]

Next solve this equation for $\alpha_{m+1}$ and use (6.10) with $m$ in place of $m+1$ to get

[TABLE]

From (6.12) we have

[TABLE]

so by (3.5)

[TABLE]

The first formula of (6.8) now follows from (6.9) and (6.13).

To prove the second formula of (6.8) start with $\frac{q_{m-1}}{q_{m}}=\frac{x_{m-1}}{x_{m}}$ from (6.6). By (3.5) we have that $\nu_{0}=0$ while for $m\geq 0$

[TABLE]

Using (6.4) we see that $\frac{q_{m-1}}{q_{m}}$ satisfies the same recurrence. ∎

We now finish the proof of Theorem 5 by showing that our expansion is semi-regular. Suppose that we had $\varepsilon_{m+1}=-1$ and $a_{m}=1$ for some $m\geq 1.$ We would then have from (6.14) that $\alpha_{m}<1$ and so from (6.15) that

[TABLE]

which is impossible. This completes the proof of Theorem 5.∎

7. A formula for $\delta_{F}(\alpha)$

We will deduce Theorem 6 from a formula for $\delta_{F}(\alpha)$ for any strongly symmetric norm $F$ given in terms of the quantities $\mu_{m},\nu_{m}$ . As usual, we may identify the space of all lattices of determinant one with $\Gamma\backslash G$ where $G={\rm SL}(2,\mathbb{R})$ and $\Gamma={\rm SL}(2,\mathbb{Z})$ by means of

[TABLE]

Let $\mathcal{D}$ be the set of $g=\left(\begin{smallmatrix}x&y\\ x^{\prime}&y^{\prime}\end{smallmatrix}\right)\in G$ such that

[TABLE]

For $g\in\mathcal{D}$ let $F(g)=F(x,y).$

Lemma 7.1.

The map $\Phi:\mathcal{D}\rightarrow(-1,1)\times[0,1)$ given by

[TABLE]

is a continuous bijection.

Proof.

The inverse of $\Phi$ is given by

[TABLE]

By Lemma A.3 we see that $t=t(u,v)>0$ exists and is uniquely determined by the condition $F_{t}(1,-u)=F_{t}(v,1).$ ∎

The function

[TABLE]

is easily seen to be continuous on $(-1,1)\times[0,1)$ .

The following is our generalization of the formula (1.3).

Lemma 7.2.

Fix a strongly symmetric norm $F$ . For any irrational $\alpha$ whose continued fraction associated to the norm is (3.1) we have that

[TABLE]

where $\mu_{m},\nu_{m}$ are given above in (3.5).

Proof.

Fix an $m$ and write as before $P_{m}=(x_{m},y_{m})$ . Let

[TABLE]

where $0\leq x^{\prime}<x$ and $|y|<y^{\prime}$ . Recall that by (i) of Lemma 5.2 we know that

[TABLE]

Lemmas 7.1 and 6.2 now imply that

[TABLE]

where $\gamma_{m}=\pm 1$ was defined in (6.2). Note that in this case $\gamma_{m}=\operatorname{sgn}{y_{m-1}}.$ By strong symmetry of the norm we have $F_{t_{m}}(x,y)=F_{t_{m}}(x^{\prime},y^{\prime})=F_{t_{m}}(P_{m})$ . Hence

[TABLE]

Now we need to show that

[TABLE]

For $t\geq 1$ let $m(t)$ be such that $t_{m(t)}\leq t\leq t_{m(t)+1}$ . By Lemma 5.2 we have that

[TABLE]

and by Lemma A.5 we have that

[TABLE]

Now apply the first inequality in (5.3) to establish (7.6) and therefore finish the proof of Lemma 7.2. ∎

Proof of Theorem 6

To conclude formula (3.6) from Lemma 7.2, first observe that for the $p$ -norm with $1\leq p<\infty$ we have from (7.4) that for $(u,v)\in(-1,1)\times[0,1)$ the value of $t$ that makes the rows of $\Phi^{-1}(u,v)$ have the same norm $F_{t}$ is given by

[TABLE]

The corresponding value of $D_{F^{\langle p\rangle}}(u,v)$ from (7.5) is

[TABLE]

giving (3.6). The case $p=\infty$ is immediate. This completes the proof of Theorem 6.∎
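The balancing condition $F_{t}(1,-u)=F_{t}(v,1)$ from Lemma 7.1 can be solved in closed form for the $p$-norm and checked numerically. The sketch below assumes the convention $F_{t}(x,y)=F(t^{-1}x,ty)$, which is consistent with the choice $t^{-1}q=t|p-\alpha q|$ made in the proof of Lemma 6.1; under that assumption the condition becomes $t^{2p}=(1-v^{p})/(1-|u|^{p})$. The function names and the sample values of $u,v,p$ are hypothetical.

```python
def p_norm_stretched(x, y, t, p):
    """F_t(x, y) for the p-norm, assuming F_t(x, y) = F(x / t, t * y)."""
    return (abs(x / t) ** p + abs(t * y) ** p) ** (1.0 / p)


def balancing_t(u, v, p):
    """Solve F_t(1, -u) = F_t(v, 1) for t > 0 when F is the p-norm.

    Requires -1 < u < 1, 0 <= v < 1 and 1 <= p < infinity.
    """
    return ((1.0 - v ** p) / (1.0 - abs(u) ** p)) ** (1.0 / (2.0 * p))


# Sample point of (-1, 1) x [0, 1), chosen only for illustration.
u, v, p = -0.3, 0.6, 3.0
t = balancing_t(u, v, p)
left = p_norm_stretched(1.0, -u, t, p)   # F_t(1, -u)
right = p_norm_stretched(v, 1.0, t, p)   # F_t(v, 1)
```

With this value of $t$ the two stretched norms agree up to rounding, and for $|u|=v$ the formula returns $t=1$, as symmetry suggests.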

8. $\mathcal{S}$ -expansions

To prove Theorem 7 we want to characterize in terms of the norm those convergents of the regular continued fraction of an irrational $\alpha$ that are also convergents of the continued fraction of $\alpha$ associated to a strongly symmetric norm $F$ . We will use the notation and results of Lemma 5.2. Write $P_{m}=(q_{m},p_{m}-\alpha q_{m})$ for points coming from this norm with corresponding $t_{m}$ and let $Q_{n}=(s_{n},r_{n}-\alpha s_{n})$ be the points coming from the convergents of the regular continued fraction of $\alpha$ . Furthermore, the partial quotient $b_{n}$ is associated to $Q_{n}$ while $a_{m}$ is associated to $P_{m}$ .

Lemma 8.1.

For a fixed $n\geq 0$ there are integers $c_{\ell}$ and $d_{\ell}$ with $c_{\ell}>0$ and $d_{\ell}\geq 0$ so that for each $\ell\geq 0$

[TABLE]

where $c_{\ell}\geq d_{\ell}$ for all $\ell\geq 0$ , while for $\ell\geq 2$ we have

[TABLE]

Proof.

The integers $r_{n},s_{n}$ are determined recursively for $n\geq 0$ by

[TABLE]

It is easy to check using (8.1) and (8.2) that $c_{\ell}$ and $d_{\ell}$ satisfy for fixed $n$ and $\ell\geq 1$ the recurrence relations

[TABLE]

The claim of the lemma follows from a straightforward inductive argument. ∎

The following result will be used to characterize those convergents of the regular continued fraction that occur as convergents in the continued fraction associated to the norm.

Lemma 8.2.

For $m\geq 1$ let $n$ and $\ell$ be such that $P_{m-1}=Q_{n-1}$ and $P_{m}=Q_{n+\ell}$ . Then

(i) $\ell\in\{0,1\}$ .

(ii) There is a unique $t\geq 1$ such that $F_{t}(Q_{n})=F_{t}(Q_{n-1})$ , and $\ell=1$ if and only if

[TABLE]

If this holds we have that $b_{n+1}=1.$

(iii) $Q_{0}=P_{0}$ if and only if $a_{0}=b_{0}.$

Proof.

We know that $P_{-1}=Q_{-1}$ and that for each $m\geq 1$ we have $P_{m-1}=Q_{n-1}$ for some $n$ and $P_{m}=Q_{n+\ell}$ for some $\ell\geq 0.$ We can check directly that $Q_{0}=P_{0}$ if and only if $a_{0}=b_{0}.$

By Lemma 8.1 for $\ell\geq 0$ we have

[TABLE]

By Lemma 5.2 we have $F_{t_{m}}(Q_{n})\geq F_{t_{m}}(P_{m})=F_{t_{m}}(P_{m-1})$ and hence

[TABLE]

by Lemma A.4. Thus we have

[TABLE]

so by Lemma 8.1 we have that either $\ell=0$ or $\ell=1.$

Now by Lemma A.3 applied to the norm $F_{t_{m}}$ and using that

[TABLE]

there is a $t\geq 1$ (indeed $t\geq t_{m}$ ) so that

[TABLE]

In case $\ell=1$ we have $b_{n+1}=1$ by (8.7) and (8.3)–(8.4). By (8.5) we have that

[TABLE]

so by Lemma 5.2 we must have

[TABLE]

If $\ell=0$ we have $Q_{n-1}=P_{m-1}$ and $Q_{n}=P_{m}$ so that $t=t_{m}$ and

[TABLE]

at least when $m\geq 0$ , since then the $x$ -coordinate of $P_{m}+P_{m+1}$ is strictly larger than that of $P_{m}$ and so by Lemma 5.2 strict inequality must hold. ∎

Proof of Theorem 7

Lemma 8.2 gives instructions for obtaining the sequence of convergents ${p_{m}}/{q_{m}}$ of $\alpha$ associated to the norm $F$ from the sequence of regular convergents ${r_{n}}/{s_{n}}$ of $\alpha$ , namely

[TABLE]

where $t\geq 1$ is such that $F_{t}(Q_{n})=F_{t}(Q_{n-1})$ , and

[TABLE]

We must define a singularization area that encodes both of these instructions. Let $\mathcal{D}$ and $\Phi$ be as in Section 7. For each $g=\left(\begin{smallmatrix}x&y\\ x^{\prime}&y^{\prime}\end{smallmatrix}\right)\in\mathcal{D}$ we write

[TABLE]

and we define

[TABLE]

The portion of $\mathcal{S}$ that lies on the $u$ -axis encodes the rule (8.9). Suppose that $n\geq 1$ and let $\left(\begin{smallmatrix}x&y\\ x^{\prime}&y^{\prime}\end{smallmatrix}\right)=\Phi^{-1}(u_{n},v_{n})$ . Then Lemmas 7.1 and 6.2 imply that $Q_{n}=(tx,t^{-1}y)$ and $Q_{n-1}=(tx^{\prime},t^{-1}y^{\prime})$ , with $t$ defined by $F_{t}(Q_{n})=F_{t}(Q_{n-1})$ . So the condition on the right-hand side of (8.8) is equivalent to $F(P+P^{\prime})\leq F(P)$ . It follows that (8.8)–(8.9) are encoded by the rule

[TABLE]

It is helpful to have some more concrete information about the set $\mathcal{S}$ . For a generic norm it is difficult to describe $\mathcal{S}$ explicitly, so we will relate $\mathcal{S}$ to the set $\mathcal{S}_{1}$ , which is easy to describe. As usual, we denote $\mathcal{S}$ by $\mathcal{S}_{p}$ when $F$ is the $p$ -norm. We have

[TABLE]

as one can see by reducing the system of equations and inequalities

[TABLE]

defining $\mathcal{S}_{1}$ . The interior of the set (8.12) agrees with the $S$ -region given in [28] for Minkowski’s diagonal continued fraction.

Lemma 8.3.

For any strongly symmetric norm $F$ we have $\mathcal{S}\subseteq\mathcal{S}_{1}$ .

Proof.

Since $\mathcal{S}$ and $\mathcal{S}_{1}$ are closed sets in the induced topology on $[\frac{1}{2},1)\times[0,1]$ , it suffices to show that a dense subset of $\mathcal{S}$ is contained in $\mathcal{S}_{1}$ . Suppose that $(u,v)\in\mathcal{S}$ with $u\notin\mathbb{Q}$ and $v\in\mathbb{Q}$ , and write

[TABLE]

for the regular continued fractions of $u$ and $v$ . If we define

[TABLE]

then $(u,v)=(u_{n},v_{n})$ for $\alpha$ . Since $(u,v)\in\mathcal{S}$ we have $Q_{n+1}=Q_{n-1}+Q_{n}$ in the notation of Section 8, and for some $m$ we have $P_{m-1}=Q_{n-1}$ and $P_{m}=Q_{n+1}$ . Thus

[TABLE]

Let $t$ be such that $F^{\langle 1\rangle}_{t}(P_{m-1})=F_{t}^{\langle 1\rangle}(P_{m})$ , where $F^{\langle 1\rangle}$ denotes the $1$ -norm. By convexity of $F$ , the closed stretched ball $\overline{\mathcal{B}(P_{m},t_{m})}$ contains the line segment connecting the points $P_{m-1}$ and $P_{m}$ . This line segment comprises all of the points $P$ in the same quadrant as $Q_{n-1},Q_{n+1}$ with $x$ -coordinate between $x_{m-1}$ and $x_{m}$ , and with $F^{\langle 1\rangle}_{t}(P)=F^{\langle 1\rangle}_{t}(P_{m})$ . Since the $x$ -coordinate of $Q_{n}$ is between $x_{m-1}$ and $x_{m}$ and $Q_{n}$ is outside the ball $\mathcal{B}(P_{m},t_{m})$ , we have

[TABLE]

By (8.8) and (8.11) it follows that $(u,v)\in\mathcal{S}_{1}$ . ∎

Lemma 8.3, together with the explicit description (8.12), shows that the set $\mathcal{S}$ is a singularization area as defined above Theorem 7. This fact and (8.11) together prove Theorem 7.∎

We also immediately obtain the following lemma, which we will use several times in the coming sections.

Lemma 8.4.

For every strongly symmetric norm $F$ , there is a neighborhood of the line segment $u=v$ with $u,v\in(0,1)$ that does not intersect $\mathcal{S}$ .

We finish this section with a quick proof of our claim (2.8) that

[TABLE]

The regular continued fraction expansion of $\alpha=\tfrac{1}{2}(-1+\sqrt{3})$ is

[TABLE]

from which it follows that

[TABLE]

while $v_{2n}\to 2\alpha$ from below and $v_{2n+1}\to\alpha$ from above. The region $\mathcal{S}_{2}$ comprises those points $(u,v)$ for which $u(2+v)>1+2v$ . The points $(u_{n},v_{n})$ are all outside $\mathcal{S}_{2}$ , so the $2$ -continued fraction expansion of $\alpha$ is the same as the regular continued fraction and thus $(\mu_{n},\nu_{n})=(u_{n},v_{n})$ . Since $D_{2}(u,v)=D_{2}(v,u)$ , we have

[TABLE]
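The claim that the points $(u_{n},v_{n})$ for $\alpha=\tfrac{1}{2}(-1+\sqrt{3})$ stay outside $\mathcal{S}_{2}$ can be checked for small $n$. The sketch below is only a numerical illustration under stated assumptions: it takes the expansion $\alpha=[0;2,1,2,1,\dots]$, uses the description of $\mathcal{S}_{2}$ by $u(2+v)>1+2v$ quoted above, and works in floating point for $n\leq 12$. The helper names are hypothetical.

```python
import math
from fractions import Fraction


def finite_cf(quotients):
    """Exact value of 1/(q_1 + 1/(q_2 + ... + 1/q_k)); returns 0 for an empty list."""
    value = Fraction(0)
    for q in reversed(quotients):
        value = 1 / (q + value)
    return value


def b(n):
    """Partial quotient b_n of alpha = [0; 2, 1, 2, 1, ...] for n >= 1."""
    return 2 if n % 2 == 1 else 1


alpha = (math.sqrt(3) - 1) / 2

points = []
for n in range(1, 13):
    # u_n = [0; b_{n+1}, b_{n+2}, ...] equals alpha or 2*alpha by periodicity.
    u_n = alpha if n % 2 == 0 else 2 * alpha
    # v_n = [0; b_n, b_{n-1}, ..., b_1] is a finite continued fraction.
    v_n = float(finite_cf([b(k) for k in range(n, 0, -1)]))
    points.append((u_n, v_n))

# A point lies outside S_2 when u(2 + v) <= 1 + 2v.
outside_s2 = [u * (2 + v) <= 1 + 2 * v for u, v in points]
```

For odd $n$ the points approach the boundary curve $u(2+v)=1+2v$ at $(2\alpha,\alpha)$ but remain on the outer side because $v_{n}>\alpha$, so the range of $n$ is kept small to stay well clear of rounding error.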

9. Values of $\delta_{F}(\alpha)$ for well approximable numbers

We now prove Theorem 3, which gives the smallest value of $\delta_{F}(\alpha)$ for $F$ any strongly symmetric norm and $\alpha$ well approximable.

Lemma 9.1.

Suppose that $\alpha$ is well approximable. Then $\delta_{F}(\alpha)\geq\Delta.$

Proof.

By definition, for any $\varepsilon>0$ there are arbitrarily large $q>0$ so that for some $p\in\mathbb{Z}$

[TABLE]

For such a $q$ let $t=q$ and note that for any $r,s\in\mathbb{Z}$ with $s>0$

[TABLE]

for some $\sigma$ with $|\sigma|\leq\varepsilon.$ By Lemma A.1 if $s\geq q$ we have

[TABLE]

while for $0<s<q$ we have $F(\tfrac{s}{q},sp-rq+\tfrac{\sigma s}{q})\geq F(0,1-\varepsilon),$ since $q\nmid s$ . By the continuity of $F$ , for any $\varepsilon^{\prime}>0$ there is an $\varepsilon>0$ so that $F(0,1-\varepsilon)\geq 1-\varepsilon^{\prime}$ . It follows that $F_{t}(s,r-\alpha s)\geq 1$ and hence that $\delta_{F}(\alpha)\geq\Delta.$ ∎

To finish the proof of Theorem 3, we need to find well approximable $\alpha$ for which $\delta_{F}(\alpha)=\Delta.$

Lemma 9.2.

Suppose that the partial quotients $b_{n}$ of the regular continued fraction expansion of $\alpha$ are eventually strictly increasing with $n$ . Then

[TABLE]

Proof.

If the regular partial quotients $b_{n}$ of $\alpha$ are eventually strictly increasing, then for any $\varepsilon>0$ the points $(u_{n},v_{n})$ all eventually lie within $\varepsilon$ of the point $(0,0)$ . So by Lemma 8.4, the points $(u_{n},v_{n})$ are outside $\mathcal{S}$ for sufficiently large $n$ . Thus

[TABLE]

Finally, $D_{F}(0,0)=F^{2}\left(\left(\begin{smallmatrix}1&0\\ 0&1\end{smallmatrix}\right)\right)=1$ , therefore $\delta_{F}(\alpha)=\Delta$ . ∎
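A concrete instance of Lemma 9.2 is $\alpha=\frac{e-1}{e+1}$, whose regular partial quotients are $b_{n}=4n-2$ (the classical expansion $[0;2,6,10,14,\dots]$ of $\tanh\tfrac{1}{2}$). The sketch below, with hypothetical helper names and an arbitrary truncation depth, evaluates truncations of $u_{n}$ and the exact finite fractions $v_{n}$ to show how the points $(u_{n},v_{n})$ drift toward $(0,0)$.

```python
import math
from fractions import Fraction

DEPTH = 30  # truncation depth for the infinite tails (an arbitrary choice)


def b(n):
    """Partial quotients of (e - 1)/(e + 1) = [0; 2, 6, 10, 14, ...], n >= 1."""
    return 4 * n - 2


def finite_cf(quotients):
    """Exact value of 1/(q_1 + 1/(q_2 + ... + 1/q_k))."""
    value = Fraction(0)
    for q in reversed(quotients):
        value = 1 / (q + value)
    return value


def u(n):
    """Truncation of u_n = [0; b_{n+1}, b_{n+2}, ...] after DEPTH terms."""
    return float(finite_cf([b(k) for k in range(n + 1, n + 1 + DEPTH)]))


def v(n):
    """v_n = [0; b_n, b_{n-1}, ..., b_1]."""
    return float(finite_cf([b(k) for k in range(n, 0, -1)]))


alpha_exact = (math.e - 1) / (math.e + 1)
alpha_approx = float(finite_cf([b(k) for k in range(1, DEPTH + 1)]))
points = [(u(n), v(n)) for n in range(1, 11)]
```

Since $u_{n}<1/b_{n+1}$ and $v_{n}\leq 1/b_{n}$, both coordinates are already below $0.03$ at $n=10$.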

10. Values of $\delta_{p}(\alpha)$ for any irrational $\alpha$

Proof of Theorem 4

Fix $p\in[1,\infty]$ . Throughout the proof let

[TABLE]

denote the $p$ -continued fraction expansion of $\alpha\in(0,1)$ and define $\mu_{m}$ and $\nu_{m}$ as in (3.5). By Theorem 6 it suffices to show that for every $\alpha$ we have

[TABLE]

and that there is at least one $\alpha$ for which equality holds. In both cases the number on the right-hand side of (10.2) is $\leq 1$ . Since $(1-|u|^{p}v^{p})^{2}\geq(1-|u|^{p})(1-v^{p})$ , we have

[TABLE]

It follows that $D_{p}(u,v)\geq 1$ for nonpositive $u$ , so if $\mu_{m}\leq 0$ for infinitely many $m$ , the inequality (10.2) holds trivially. Thus we may assume that the $p$ -continued fraction expansion of $\alpha$ has $\mu_{m}\geq 0$ for all sufficiently large $m$ .
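The elementary inequality $(1-|u|^{p}v^{p})^{2}\geq(1-|u|^{p})(1-v^{p})$ used above can be verified directly. The following short derivation is one way to see it; the abbreviations $a=|u|^{p}$, $b=v^{p}$ and $s=\sqrt{ab}$, all lying in $[0,1)$, are introduced here only for illustration, and the middle step uses $a+b\geq 2\sqrt{ab}$.

```latex
\begin{align*}
(1-ab)^{2}-(1-a)(1-b) &= a+b-3ab+a^{2}b^{2} \\
                      &\geq 2s-3s^{2}+s^{4} \\
                      &= s\,(1-s)^{2}\,(s+2) \;\geq\; 0 .
\end{align*}
```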

Lemma 10.1.

If $0\leq u,v<1$ then

[TABLE]

with equality only when $u=v$ .

Proof.

The inequalities

[TABLE]

both reduce to $(u-v)^{2}\geq 0$ . It remains to show that

[TABLE]

This inequality is implied by the first inequality of (10.4) and

[TABLE]

which follows immediately from Hölder’s inequality. ∎

It is convenient to define

[TABLE]

Then

[TABLE]

If $1\leq p<2$ then $d_{p}(x)$ is strictly increasing, so by Lemma 10.1 we have

[TABLE]

If $p=2$ then $d_{p}(x)=1$ for all $x$ . In either case, we can use Lemma 9.2 to find examples of $\alpha$ for which $\delta_{p}(\alpha)=\Delta_{p}$ .

Suppose that $p>2$ , and let $\beta=\frac{1}{2}(\sqrt{5}-1)$ . The sequence $(u_{n},v_{n})$ associated to the regular continued fraction

[TABLE]

approaches $(\beta,\beta)$ as $n\to\infty$ . By Lemma 8.4 it follows that the sequence $(\mu_{m},\nu_{m})$ associated to the $p$ -continued fraction of $\beta$ also converges to $(\beta,\beta)$ . Thus, for $p>2$ we have $\delta_{p}(\beta)=D_{p}(\beta,\beta)$ , which is the number in (2.7).

It remains to show that for every $\alpha$ with $\mu_{m}\geq 0$ for sufficiently large $m$ , we have $D_{p}(\mu_{m},\nu_{m})\geq D_{p}(\beta,\beta)$ for infinitely many $m$ . Since $p>2$ , the function $d_{p}(x)$ is strictly decreasing, so by Lemma 10.1 it suffices to show that

[TABLE]

for infinitely many $m$ .

If there are infinitely many $m$ such that $a_{m+1}\geq 5$ then for such $m$ we have $\mu_{m}\leq\frac{1}{5}$ and therefore $\mu_{m}+\nu_{m}\leq 1.2$ . So we may suppose that $a_{m}\leq 4$ for all sufficiently large $m$ . The following lemma covers the remaining cases.

Lemma 10.2.

Let $\ell\in\{2,3,4\}$ and suppose that $\varepsilon_{m}=1$ and $a_{m}\leq\ell$ for sufficiently large $m$ . If $a_{m}=\ell$ for infinitely many $m$ , then

[TABLE]

for infinitely many $m$ .

Proof.

Suppose that $\varepsilon_{m}=1$ and $a_{m}\leq\ell$ for $m\geq M$ . Then for any $m\geq M+3$ with $a_{m+1}=\ell$ we have

[TABLE]

The lemma now follows from an easy computation. ∎

This completes the proof of Theorem 4.∎

We remark that it is sometimes possible to compute the minimum value of $\delta_{F}(\alpha)$ for other norms as well. The composition of strongly symmetric norms is, up to scaling, also strongly symmetric (see Lemma A.2). For instance, the norms $F^{\mathrm{oct_{1}}}$ and $F^{\mathrm{oct_{2}}}$ with regular octagonal unit balls mentioned in §1 can be given in terms of compositions of the 1-norm and the sup-norm. Explicitly,

[TABLE]

where $Q=(\tfrac{1}{\sqrt{2}}F^{\langle 1\rangle}(P),F^{\langle\infty\rangle}(P)).$ These formulas, together with Mahler’s computation of the critical determinant of the regular octagon recalled at the end of §4 and Lemma 8.4, lead to the result referred to at the end of §1. The minimum of $\delta_{F}(\alpha)$ for $F=F^{\mathrm{oct_{1}}}$ is $\frac{1}{8}\left(3\sqrt{2}+2\right),$ which is attained when

[TABLE]

For $F=F^{\mathrm{oct_{2}}}$ the minimal value is also $\frac{1}{8}\left(3\sqrt{2}+2\right)$ , but now this is the value of $\Delta$ and is attained when $\alpha=\frac{e-1}{e+1}$ , for instance.

11. The dynamical system

The goal of this section is to prove Theorem 1. We employ the notation of Section 8. Say that $g=\left(\begin{smallmatrix}x&y\\ x^{\prime}&y^{\prime}\end{smallmatrix}\right)\in G$ is reduced with respect to the norm $F$ if $g\in\mathcal{D}$ and

[TABLE]

where the overline denotes the closure and $L(g)$ was defined in (7.1). Let $\mathcal{R}$ be the set of all $g$ that are reduced with respect to $F$ and define $\Omega\subset(-1,1)\times[0,1]$ as

[TABLE]

where

[TABLE]

We will show that $(\mu_{n},\nu_{n})\in\Omega$ for all $n\geq 0$ .

We want to apply the ergodic theory of $\mathcal{S}$ -expansions as developed in [27]. For that we need to show that $\Omega$ defined by (11.2) coincides with the set

[TABLE]

defined in Section 5 of [27], where

[TABLE]

The following equivalent description of reduced matrices is helpful.

Lemma 11.1.

A matrix $g\in\mathcal{D}$ is reduced if and only if

[TABLE]

Proof.

Clearly a matrix $g$ satisfying (11.1) also satisfies (11.3). Suppose $g\in\mathcal{D}$ satisfies (11.3). We will show that $F(aP+bP^{\prime})>F(P)$ for all $(a,b)\in\mathbb{Z}^{2}\setminus\{(0,0),(0,\pm 1),(\pm 1,0)\}$ . If $|a|=|b|$ then

[TABLE]

Otherwise, if $|a|>|b|$ , say, then by the reverse triangle inequality

[TABLE]

This is strictly greater than $F(P)$ if $|a|-|b|\geq 2$ . If $|a|=|b|+1$ then

[TABLE]

This completes the proof since $|b|\geq 1$ so $|a|\geq 2$ . ∎

It follows that the set $(-1,1)\times[0,1]$ decomposes as $\Omega\sqcup\mathcal{S}\sqcup\mathcal{S}^{\prime}\sqcup\mathcal{S}^{\prime\prime}$ , where

[TABLE]

See Figure 6 for the case $p=2$ .

Since the critical lattices for $F$ are among those for which two basis vectors and their sum all have equal norm, we will refer to the set

[TABLE]

as the potentially critical matrices. The next lemma describes the boundary of $\Omega$ in terms of the distinguished subset

[TABLE]

of the potentially critical matrices.

Lemma 11.2.

The part of the boundary of $\Omega$ that lies in $(-1,1)\times[0,1]$ is $\partial\cup\partial^{\prime}\cup\partial^{\prime\prime}\cup\mathcal{A}$ , where

[TABLE]

Proof.

The boundary of $\Omega$ is $\mathcal{A}\cup\mathcal{C}$ , where

[TABLE]

Clearly $\partial$ is the part of $\mathcal{C}$ adjacent to $\mathcal{S}$ . The remaining set, $\mathcal{C}\setminus\partial$ , is the image of the set of $g^{\prime}\in\mathcal{D}$ satisfying

[TABLE]

If the $y$ -coordinate of $Q$ is negative, then $(P,P^{\prime})=(Q^{\prime},Q-Q^{\prime})$ gives an element of $\mathcal{P}$ , and

[TABLE]

Otherwise $(P,P^{\prime})=(Q-Q^{\prime},Q^{\prime})$ yields an element of $\mathcal{P}$ ; in this case

[TABLE]

This completes the proof. ∎

Lemma 11.3.

The function $D(u,v)$ is continuous on $\overline{\Omega}\setminus\{(1,1)\}$ and assumes its maximum value $1/\Delta$ on that set.

Proof.

The continuity statement is clear from the definition of $D(u,v)$ . Say that a point $(u,v)$ in $\overline{\Omega}\setminus\{(1,1)\}$ is a critical point if $D(u,v)=1/\Delta$ . For $(u,v)\in\overline{\Omega}\setminus\{(1,1)\}$ , let the points $P=(x,y)$ and $P^{\prime}=(x^{\prime},y^{\prime})$ be such that $\Phi^{-1}(u,v)=\left(\begin{smallmatrix}x&y\\ x^{\prime}&y^{\prime}\end{smallmatrix}\right)$ . Then

[TABLE]

where $L$ is the lattice generated by the unit vectors $\frac{1}{F(P)}P$ and $\frac{1}{F(P^{\prime})}P^{\prime}$ . Thus $(u,v)$ is a critical point if and only if $L$ is a critical lattice, so by Lemma 11.2 all critical points lie on the boundary of $\Omega$ . If critical points in $\overline{\Omega}\setminus\{(1,1)\}$ exist, then we are done.

Suppose that a critical lattice $L$ corresponds to the point $(1,1)$ in the $uv$ -plane. Then there exists a $t$ such that

[TABLE]

Then the matrix $g=\frac{1}{\sqrt{2}}\left(\begin{smallmatrix}2t^{-1}&0\\ t^{-1}&-t\end{smallmatrix}\right)$ is in $\overline{\mathcal{R}}$ and satisfies $\Phi(g)=(0,\frac{1}{2})$ . So if $(1,1)$ corresponds to a critical lattice $L$ , then the point $(0,\frac{1}{2})\in\overline{\Omega}$ is a critical point. ∎

That $\Omega=\Omega_{\mathcal{S}}$ follows from the next lemma.

Lemma 11.4.

We have

[TABLE]

Proof.

We begin by showing that $\mathcal{S}^{\prime}=T\mathcal{S}$ . For $(u,v)\in\mathcal{S}$ we have $\frac{1}{2}\leq u<1$ , so (2.9) simplifies to $T(u,v)=\left(\frac{1-u}{u},\frac{1}{v+1}\right)$ . We show that the boundary of $\mathcal{S}$ maps to the boundary of $\mathcal{S}^{\prime}$ under $T$ ; the lemma then follows by the continuity of $T$ on $\mathcal{S}$ . Since $T([\frac{1}{2},1]\times\{0\})=[0,1]\times\{1\}$ and $T(\{1\}\times[0,1])=\{0\}\times[\frac{1}{2},1]$ , it suffices to show that $T(\partial)=\partial^{\prime}$ . Suppose that $(u,v)\in\partial$ and that $\Phi(g)=(u,v)$ . Then

[TABLE]

Since this is clearly invertible, we conclude that $T(\partial)=\partial^{\prime}$ .

We prove $\mathcal{S}^{\prime\prime}=\left((-1,0]\times[0,1]\right)\setminus M\mathcal{S}^{\prime}$ similarly. We have $M([0,1]\times\{1\})=[-\frac{1}{2},0]\times\{0\}$ and $M(\{0\}\times[\frac{1}{2},1])=\{0\}\times[0,\frac{1}{2}]$ for the straight line segments, so it suffices to show that $M(\partial^{\prime})=\partial^{\prime\prime}$ . We will show that $(M\circ T)\partial=\partial^{\prime\prime}$ , using that

[TABLE]

Suppose that $(u,v)\in\partial$ and that $\Phi(g)=(u,v)$ . Then

[TABLE]

which completes the proof. ∎

Let $\omega_{\mathcal{S}}=(1-\omega(\mathcal{S}))^{-1}\omega,$ where $\omega$ was defined in (2.10).

Lemma 11.5.

Define $\mathcal{S}$ and $\Omega$ by (8.10) and (11.2). Then for almost all irrational $\alpha$ the sequence $(\mu_{m},\nu_{m})$ is uniformly distributed over $\Omega$ with respect to the measure $\omega_{\mathcal{S}}$ .

Proof.

Since $\Omega=\Omega_{\mathcal{S}}$ , Lemma 11.5 follows from Theorem 5.4.23 of [9] (see also [27]) and Theorem 7. ∎

Proof of Theorem 1

By Lemma 11.3, the function $\Delta D(u,v)$ assumes the value $1$ at some point in $\overline{\Omega}\setminus\{(1,1)\}$ . By Lemma 7.2 it follows that $\delta_{F}(\alpha)=1$ if and only if the sequence $(\mu_{n},\nu_{n})$ is infinitely often arbitrarily close to such a critical point. It follows from Lemma 11.5 that $\delta_{F}(\alpha)=1$ for almost all $\alpha$ .

To finish the proof it suffices to show that there are uncountably many such $\alpha$ . Since we already know Theorem 1 is true in the case of the sup-norm, suppose that $F$ is not the sup-norm. Then the lattice generated by $(1,0)$ and $(0,1)$ is not potentially critical, so we have $\Delta D(0,0)<1$ . Thus any $\alpha$ for which $(\mu_{n},\nu_{n})$ converges to $(0,0)$ has $\delta_{F}(\alpha)<1$ , and there are uncountably many such $\alpha$ (for example, the set of $\alpha$ with strictly increasing partial quotients).∎

12. Concluding remarks

In addition to the proof of Theorem 1, there are other applications of the metric theory of $\mathcal{S}$ -expansions and ergodic theory to quantities related to $\delta_{p}(\alpha).$ For instance we may treat the distribution of the values of

[TABLE]

from Theorem 6. For almost all $\alpha$ the distribution function

[TABLE]

exists for all $z\in[0,1].$ For $p=1,2,\infty$ it can be evaluated explicitly, as was done for $p=\infty$ in Theorem 4 of [5]. In particular, for almost all $\alpha$

[TABLE]

exists where

[TABLE]

It is well known that a close connection exists between dynamical systems associated to various kinds of continued fractions and the geodesic flow on ${\rm SL}(2,\mathbb{Z})\backslash{\rm SL}(2,\mathbb{R})$ . See [14] and a discussion in [2] for more on this connection and for references to the literature. Roughly speaking, the natural extension of a continued fraction transformation can be identified with a cross section for the geodesic flow. For example, the transformation $T$ from (2.9) of the regular continued fraction’s natural extension gives a planar representation of the first return map and $\omega$ corresponds to the Liouville measure. Geodesics can be identified with (proper classes of) indefinite binary quadratic forms and a cross section with a reduction domain. The trajectories we study in this paper correspond to cuspidal geodesics or, equivalently, forms with one rational root.

Of course there is great interest in similar Diophantine problems about general indefinite forms and hence general geodesic trajectories. A prime example is the Markov problem [32] about the minima of such forms and their possible values; these values determine the Markov spectrum (see [3] and its references). The Lagrange spectrum is similarly defined using cuspidal trajectories; it is determined by the values of

[TABLE]

The Dirichlet spectrum is determined by the values of $\delta(\alpha)=\limsup_{t\geq 1}\rho_{t}(\alpha);$ in [22] it is defined to be the set of values of $\frac{\delta(\alpha)}{1-\delta(\alpha)}.$ There is a spectrum that is related to the Dirichlet spectrum in the same way that the Markov spectrum is related to the Lagrange spectrum. Like the Markov problem, its study involves general geodesic trajectories and their associated continued fractions. Again speaking roughly, we replace $\limsup$ over cuspidal geodesics in the definition of $\delta(\alpha)$ by the supremum over all geodesics. Mordell [38] introduced this problem (actually an $n$ -dimensional version), which he posed as a kind of converse to Minkowski’s linear forms theorem. The case of two dimensions was treated in more detail by Szekeres [54], Oppenheim [41] and Burger [6]. This problem in higher dimensions has also attracted a lot of attention (see e.g. [44, 45, 53]).

It should be apparent that a general spectrum of this type can be defined for any strongly symmetric norm $F$ , not just the sup-norm, and that an associated reduction theory for indefinite binary quadratic forms can be developed that uses $F$ -continued fractions. For the 2-norm the problem was introduced by Oppenheim [42] and the relevant reduction theory was already found by Hermite. Minkowski developed the reduction theory for the 1-norm with Hermite’s theory in mind and certainly knew that a version could be based on the $p$ -norm for a general $p$ [36, footnote on p. 166]. However, outside of the sup-norm, only isolated aspects of the spectrum and reduction theory have been considered and only for the $p$ -norm for $p=1,2.$

Appendix A Lemmas about norms

Here we state and prove several simple technical lemmas that are referred to in the body of the paper. Throughout, $F$ is a norm on $\mathbb{R}^{2}$ with unit ball $\mathcal{B}$ and $P=(x,y),P^{\prime}=(x^{\prime},y^{\prime})\in\mathbb{R}^{2}.$ For $t>0$ we define, as above, $F_{t}(x,y)=F(t^{-1}x,ty).$ The lemmas give various properties of norms that satisfy the first condition (2.4) of strong symmetry. Note that if $F$ satisfies (2.4) then so does $F_{t}$ for any $t>0.$ The first result is crucial and is used repeatedly in this paper.

Lemma A.1.

Suppose that $F$ satisfies (2.4). If $|x^{\prime}|\leq|x|$ and $|y^{\prime}|\leq|y|$ then we have that

$$F(x^{\prime},y^{\prime})\leq F(x,y).$$

Proof.

To see this observe that if $F(P)=s$ then $F(\pm x,\pm y)=s$ hence $F(x^{\prime},y^{\prime})\leq s$ by convexity. ∎
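The monotonicity established in this proof can be checked numerically. The following is a minimal sketch, not part of the paper. It assumes that condition (2.4) is the sign symmetry $F(\pm x,\pm y)=F(x,y)$ used in the proof, and it uses the $p$-norms as sample norms with that symmetry. The helper names `p_norm`, `rescale` and `check_monotone` are hypothetical. The check is also run on the rescaled norms $F_{t}$, which satisfy (2.4) whenever $F$ does.

```python
import math
import random

def p_norm(p):
    """Return the p-norm on R^2 as a function of (x, y); p may be math.inf."""
    def F(x, y):
        if p == math.inf:
            return max(abs(x), abs(y))
        return (abs(x) ** p + abs(y) ** p) ** (1.0 / p)
    return F

def rescale(F, t):
    """The rescaled norm F_t(x, y) = F(x / t, t * y)."""
    return lambda x, y: F(x / t, t * y)

def check_monotone(F, trials=10_000, seed=0):
    """Test F(x', y') <= F(x, y) on random points with |x'| <= |x|, |y'| <= |y|."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        # Shrink each coordinate in absolute value; the sign is arbitrary.
        xp = rng.uniform(-1, 1) * x
        yp = rng.uniform(-1, 1) * y
        if F(xp, yp) > F(x, y) + 1e-12:  # small tolerance for rounding
            return False
    return True

# One entry per (p, t) pair; every value is expected to be True.
results = {
    (p, t): check_monotone(rescale(p_norm(p), t))
    for p in (1, 2, 3.5, math.inf)
    for t in (1.0, 0.5, 3.0)
}
```

A random search of this kind can only fail to find a counterexample; it does not replace the convexity argument above.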

Lemma A.2.

If $F,G,H$ satisfy (2.4) then so does $K$ defined by

[TABLE]

Proof.

This follows easily using Lemma A.1. ∎

Lemma A.3.

Suppose that $F$ satisfies (2.4). The following properties hold.

(i)

If $F(P^{\prime})\geq F(P)$ and $|y^{\prime}|<|y|$ then for some unique $t\geq 1$ we have

[TABLE]

(ii)

If $F(P^{\prime})\geq F(P)$ and $|x^{\prime}|<|x|$ then for some unique $t\leq 1$ we have

[TABLE]

Proof.

We only prove (i) as (ii) is a consequence of (i) applied to the norm $G(x,y)=F(y,x).$

Existence: If $F(P^{\prime})=F(P)$ take $t=1.$ Otherwise for any $P\in\mathbb{R}^{2}$ define the continuous function $f_{P}:[1,\infty)\rightarrow\mathbb{R}^{+}$ by $f_{P}(t)=t^{-1}F_{t}(P).$ Now by Lemma A.1

[TABLE]

On the other hand, $f_{P^{\prime}}(t)=F(t^{-2}x^{\prime},y^{\prime})\rightarrow F(0,y^{\prime})=|y^{\prime}|F(0,1)<|y|F(0,1)$ as $t\rightarrow\infty$ . Because $f_{P}(1)<f_{P^{\prime}}(1)$ , the existence of the desired $t$ follows by the intermediate value theorem.

Uniqueness: Suppose that for $t_{1}\neq t_{2}$ with $t_{1},t_{2}\geq 1$ we have

[TABLE]

This implies that $|x|=|x^{\prime}|$ and that $|y|=|y^{\prime}|$ , which contradicts the assumption $|y^{\prime}|<|y|$ . ∎
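The existence argument can be illustrated numerically. The sketch below is not the authors' method. It assumes that the display lost from part (i) asserts $F_{t}(P^{\prime})=F_{t}(P)$, which is the equality the intermediate value argument with $f_{P}(t)=t^{-1}F_{t}(P)=F(t^{-2}x,y)$ produces. It uses a $p$-norm as the sample norm and plain bisection in place of the abstract existence step. The names `p_norm` and `crossing_t` are hypothetical.

```python
import math

def p_norm(p):
    """Return the p-norm on R^2 as a function of (x, y); p may be math.inf."""
    def F(x, y):
        if p == math.inf:
            return max(abs(x), abs(y))
        return (abs(x) ** p + abs(y) ** p) ** (1.0 / p)
    return F

def crossing_t(F, P, Pp, iterations=100):
    """Find t >= 1 with F_t(P') = F_t(P), assuming F(P') >= F(P) and |y'| < |y|."""
    x, y = P
    xp, yp = Pp
    if F(xp, yp) < F(x, y) or abs(yp) >= abs(y):
        raise ValueError("hypotheses of part (i) are not satisfied")

    # g(t) = f_{P'}(t) - f_P(t), where f_P(t) = F(x / t^2, y).
    def g(t):
        return F(xp / t**2, yp) - F(x / t**2, y)

    if g(1.0) == 0.0:
        return 1.0
    # g(1) > 0, and g is eventually negative because |y'| < |y|.
    lo, hi = 1.0, 2.0
    while g(hi) > 0:
        hi *= 2.0
    # Bisection, keeping g(lo) > 0 >= g(hi) throughout.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

F = p_norm(2)
t = crossing_t(F, (1.0, 2.0), (3.0, 1.0))
```

For the 2-norm with $P=(1,2)$ and $P^{\prime}=(3,1)$ the equation $9t^{-4}+1=t^{-4}+4$ gives $t=(8/3)^{1/4}$, which the bisection reproduces to within floating-point accuracy.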

The following result is trivial in case the norm is strictly convex.

Lemma A.4.

Suppose that $F$ satisfies (2.4), that we have $F(P)=F(P^{\prime})$ and that $0<x^{\prime}<x$ and $0<|y|<y^{\prime}$ . Then for any $d\geq 1$

[TABLE]

Proof.

To see this note first that in order for equality to hold in (A.1) we must have that

[TABLE]

which implies that

[TABLE]

hence

[TABLE]

That this is impossible follows by a simple convexity argument using the locations of

[TABLE]

together with (2.4). ∎

Lemma A.5.

Suppose that $F$ satisfies (2.4). For $\sigma,\sigma^{\prime}\in[0,1]$ with $\sigma+\sigma^{\prime}=1$ and $1\leq t_{1}\leq t_{2}$ we have

[TABLE]

Proof.

Using the fact that the function $t\mapsto t^{-1}$ is concave up and applying Lemma A.1 we get that

[TABLE]

By the defining properties of a norm we finish the proof. ∎

Bibliography

[1] Andersen, N. & Duke, W., Markov spectra for modular billiards, to appear in Math. Annalen (2019).
[2] Arnoux, P. & Schmidt, T.A., Cross sections for geodesic flows and $\alpha$-continued fractions. Nonlinearity 26 (2013), no. 3, 711–726.
[3] Bombieri, E., Continued fractions and the Markoff tree. Expo. Math. 25 (2007), no. 3, 187–213.
[4] Bosma, W., Optimal continued fractions. Nederl. Akad. Wetensch. Indag. Math. 49 (1987), no. 4, 353–379.
[5] Bosma, W. & Jager, H. & Wiedijk, F., Some metrical observations on the approximation by continued fractions. Nederl. Akad. Wetensch. Indag. Math. 45 (1983), no. 3, 281–299.
[6] Burger, E. B., On a question of Mordell and a spectrum of linear forms. J. London Math. Soc. (2) 62 (2000), no. 3, 701–715.
[7] Cassels, J. W. S., An introduction to the geometry of numbers. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete, Bd. 99 Springer-Verlag, Berlin-Göttingen-Heidelberg 1959 viii+344 pp.
[8] Cohn, H., Minkowski’s conjecture on critical lattices in the metric $(|\xi|^{p}+|\eta|^{p})^{\frac{1}{p}}$, Ann. of Math. 51 (1950) 734–738.
