Approximation of the Lagrange and Markov spectra
Vincent Delecroix, Carlos Matheus, Carlos Gustavo Moreira

Abstract
The (classical) Lagrange spectrum is a closed subset of the positive real numbers defined in terms of diophantine approximation. Its structure is quite involved. This article describes a polynomial time algorithm to approximate it in Hausdorff distance. The algorithm also extends to the Markov spectrum, which is related to infima of binary quadratic forms.
Vincent Delecroix: Max-Planck Institut Bonn, Germany and Université de Bordeaux CNRS (UMR 5800) Bordeaux, France.
Carlos Matheus: CMLS, École Polytechnique, CNRS (UMR 7640), 91128, Palaiseau, France.
Carlos Gustavo Moreira: School of Mathematical Sciences, Nankai University, Tianjin 300071, P. R. China, and IMPA, Estrada Dona Castorina 110, CEP 22460-320, Rio de Janeiro, Brazil
Abstract.
We describe a polynomial time algorithm providing finite sets arbitrarily close (in Hausdorff topology) to the Lagrange and Markov spectra.
1. Introduction
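Given a positive real number $\alpha$ we define the best constant of Diophantine approximation of $\alpha$ as
$$L(\alpha) := \limsup_{p, q \to \infty} \frac{1}{|q (q \alpha - p)|}$$
where $p$ and $q$ are bound to be positive integers. The Lagrange spectrum is the set $L := \{L(\alpha) : \alpha \in \mathbb{R}_{>0} \setminus \mathbb{Q},\ L(\alpha) < \infty\}$. Perron [Per21] showed that if $\alpha = [a_0; a_1, a_2, \ldots]$ is the continued fraction expansion of $\alpha$ then
$$L(\alpha) = \limsup_{n \to \infty} \left( [a_n; a_{n+1}, a_{n+2}, \ldots] + [0; a_{n-1}, a_{n-2}, \ldots, a_1] \right).$$
In other words, one can equivalently define the Lagrange spectrum on the bi-infinite shift $\Sigma := \{1, 2, 3, \ldots\}^{\mathbb{Z}}$. More concretely, let us define the height function $\lambda_0 : \Sigma \to \mathbb{R}$ by
$$\lambda_0(a) := [a_0; a_1, a_2, \ldots] + [0; a_{-1}, a_{-2}, \ldots]$$
where $a = (a_n)_{n \in \mathbb{Z}}$. We have
$$L = \left\{ \limsup_{n \to +\infty} \lambda_0(\sigma^n(a)) < \infty : a \in \Sigma \right\}$$
where $\sigma : \Sigma \to \Sigma$ is the shift map. The Markov spectrum can be defined similarly as
$$M := \left\{ \sup_{n \in \mathbb{Z}} \lambda_0(\sigma^n(a)) < \infty : a \in \Sigma \right\}$$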
where .
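As a numerical illustration of the height function, the following minimal Python sketch evaluates the sum of the forward and backward continued fractions on a periodic bi-infinite sequence by truncating both expansions. The function names `cf_value` and `height` and the truncation parameter `depth` are hypothetical choices made for this illustration and are not taken from the authors' implementation.

```python
from math import sqrt


def cf_value(digits):
    """Value of the finite continued fraction [d0; d1, ..., dn]."""
    value = float(digits[-1])
    for d in reversed(digits[:-1]):
        value = d + 1.0 / value
    return value


def height(period, shift=0, depth=60):
    """Approximate [a_0; a_1, ...] + [0; a_-1, a_-2, ...] for the bi-infinite
    periodic sequence with the given period, positioned so that a_0 = period[shift].

    Both continued fractions are truncated after `depth` terms, which is an
    assumption of this sketch (the error decays exponentially in `depth`).
    """
    n = len(period)
    forward = [period[(shift + i) % n] for i in range(depth)]
    backward = [period[(shift - i) % n] for i in range(1, depth)]
    return cf_value(forward) + cf_value([0] + backward)


# The constant sequence of 1's gives the golden ratio plus its inverse.
print(height([1]), sqrt(5))
# The sequence of period (1, 2), positioned at the digit 2.
print(height([1, 2], shift=1), 2 * sqrt(3))
```

For a periodic sequence the supremum and the limit superior of the height along the shift orbit coincide, so taking the maximum of `height` over all values of `shift` yields both its Lagrange and its Markov value.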
and are subsets of which were first systematically studied by Markov [Ma1], [Ma2] circa 1879. It is known that the Lagrange, resp. Markov spectrum is the closure of the values of for periodic, resp. ultimately periodic sequences (see [CF89, Chapter 3]): in particular, and are closed sets with . Nevertheless, they do not coincide [Fre68]. Actually, the second and third authors of the present text proved recently [MaMo1] that the Hausdorff dimension of satisfies
[TABLE]
Also, it was shown by Freiman [Fre73] and Schecker [Sch77] that contains the half-line . We recommend consulting Cusick–Flahive book [CF89] for a detailed account of several features of these fascinating spectra describing also the cusp excursions of geodesics on the modular surface.
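Our aim in this article is to approximate the Lagrange and Markov spectra by means of computations. More precisely, we do so by constructing finite sets that are close in Hausdorff distance. Let $A$ and $B$ be sets of real numbers. We say that $A$ and $B$ are $\varepsilon$-close (in Hausdorff distance) if
$$\forall a \in A,\ \exists b \in B,\ |a - b| \le \varepsilon \quad \text{and} \quad \forall b \in B,\ \exists a \in A,\ |a - b| \le \varepsilon.$$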
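For finite sets, closeness in Hausdorff distance can be checked directly. The sketch below is a naive quadratic-time test; the name `is_close` and the use of a non-strict inequality are assumptions of this illustration.

```python
def is_close(A, B, eps):
    """Return True if every point of A lies within eps of some point of B
    and every point of B lies within eps of some point of A."""
    def covered(X, Y):
        return all(any(abs(x - y) <= eps for y in Y) for x in X)
    return covered(A, B) and covered(B, A)


print(is_close([0.0, 1.0], [0.05, 0.95], 0.1))
print(is_close([0.0, 1.0], [0.0], 0.5))
```

The second call fails because the point 1.0 of the first set has no point of the second set within distance 0.5: closeness must hold in both directions.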
Theorem 1.
Let . Then there exists an algorithm that given provides finite sets -close respectively to the Lagrange and Markov spectra in . There exists a constant such that its running time is with the following upper bounds
- for ,
- for ,
- for .
The numbers above are actually Hausdorff dimensions of the sets , , of real numbers whose continued fraction expansions only contain the digits from to .
Remark 2.
As we mentioned already, contains the half line and it does not make any sense to consider values of larger than .
Among our several motivations to get Theorem 1, we would like to mention that an efficient algorithm producing high resolution drawings of and could potentially lead the way to solving some long-standing questions such as Berstein's conjecture [Be73] that . (In fact, although some heuristic arguments from the projection theory of fractal sets are compatible with the presence of intervals in before the so-called Hall's ray, Berstein's conjecture is somewhat surprising to us because it proposes a concrete, relatively large interval before Hall's ray.) As it turns out, the algorithm provided by Theorem 1 is not yet sufficiently powerful to put us in a good position to attack any of these questions. We hope to pursue this discussion in future works.
Let us now describe the main ideas of the algorithm in Theorem 1. Because we consider Lagrange and Markov values in the interval we can restrict to continued fractions with partial quotients . Let . We denote by and respectively the Lagrange and Markov spectra restricted to the shift , that is and . For each , the relation holds.
The first step of the algorithm consists in constructing a subshift of finite type on the alphabet depending on the quality of approximation . is the set of infinite paths on a graph and each edge of corresponds to a certain cylinder set of the shift satisfying
[TABLE]
In the above equation and all along the text we identify a finite pointed word (i.e. with a distinguished origin ) and its associated cylinder set .
We turn the graph into a weighted graph by considering on the edge associated to the weight . The weighted graph provides an approximation of the shift together with the height function . The condition (1) gives an upper bound on the quality of approximation. The rest of the algorithm consists in studying directly on the graph the discrete analogue of the Lagrange and Markov spectra. The overall complexity of the algorithm is governed by the size of . It is not so surprising that this is related to the Hausdorff dimension of the Lagrange and Markov spectra.
The algorithm developed in this article has been implemented by the first author using the computer algebra software SageMath [S*+*09]. Part of the code has also been optimized using the Python-to-C compiler Cython [Cyt] and the figures were generated with Matplotlib (see [Hu07]). The code is publicly available at https://plmlab.math.cnrs.fr/delecroix/lagrange.
Let us describe the organization of the article. In Section 2, we compute some auxiliary intervals containing the sets for . In Section 3, we describe the weighted directed graphs and their basic properties. After that, Section 4 provides the discrete analogue of Lagrange and Markov spectra on a weighted directed graph. In Section 5, we justify that the discrete spectrum provides -approximation of the original Lagrange and Markov spectra. The main ingredient for the running time complexity is contained in Theorem 13 from Section 6. Finally, for the sake of comparison with Theorem 1, we consider in Section 7 the alternative idea of approaching the Lagrange spectrum via periodic orbits in . Unfortunately our complexity bounds for the resulting algorithm are very poor: roughly speaking, we are currently obliged to perform calculations with periodic orbits in order to rigorously ensure that we got a -dense subset of !
The final Section 8 contains some rigorous pictures obtained via the algorithm from Theorem 1.
2. Preliminary bounds on
In this entire article, is an integer except when it is explicitly stated otherwise. Recall that is the set of Lagrange values when is restricted to the shift . It is easy to see that the smallest and largest values of are respectively and where for a finite word we denote the associated periodic biinfinite word.
The points
[TABLE]
and
[TABLE]
allow to determine the intervals containing for , see Figure 1.
The ranges of for which we provide upper bounds on in Theorem 1 are visible on Figure 1
- •
is the maximum of
- •
is the maximum of .
Let us also mention that is the maximum of .
3. Shifts of finite type
In this section we construct the graphs and their associated shifts of finite type . The construction requires intermediate graphs that are associated to the one-sided shift .
3.1. -cylinders for the Gauss map
In this section we consider approximations of by shifts of finite type. The continued fraction embeds the shift into as
[TABLE]
The image of under this map is a Cantor set .
For a finite (one sided) cylinder for the Gauss map we use the notation
[TABLE]
A cylinder projects on on a subset where is the interval with extremities
[TABLE]
where and . The values of and for were computed in Section 2. Note that depending on the parity of the length of one or the other is the left-hand side of the interval. The diameter of is
[TABLE]
Note that it is slightly smaller than the size when we do not restrict to the subshift . Now given we consider the following set of cylinders
[TABLE]
where denotes the prefix of length of and is defined in (2).
First, notice that if then no proper prefix of belongs to and that any proper suffix of is a prefix of some element in . The set can naturally be thought of as the leaves of a tree rooted at the empty word and where the edges correspond to adding a letter to the right. Namely, consider the set to be the union of and all prefixes of elements of . The tree has vertex set and we put an oriented edge from to if is the prefix of length of .
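The tree structure described above can be sketched by growing words letter by letter until the associated cylinder becomes small enough. The Python sketch below relies on one simplifying assumption that differs from the text: it measures a word by the length of its unrestricted cylinder, $1/(q_n(q_n + q_{n-1}))$ where $q_n$ is the denominator of the $n$-th convergent, instead of the slightly smaller diameter of the cylinder restricted to the subshift. The names `cylinder_length` and `stopping_words` and the threshold `1/Q` are hypothetical.

```python
from fractions import Fraction


def cylinder_length(word):
    """Length of the interval of reals in [0, 1] whose continued fraction
    expansion starts with `word`, namely 1 / (q_n (q_n + q_{n-1}))."""
    q_prev, q = 0, 1
    for a in word:
        q_prev, q = q, a * q + q_prev
    return Fraction(1, q * (q + q_prev))


def stopping_words(K, Q):
    """Leaves of the prefix tree on the alphabet {1, ..., K}: a word is kept
    as soon as its cylinder has length at most 1/Q, so that no proper prefix
    of a kept word is itself kept."""
    leaves = []
    stack = [()]
    while stack:
        word = stack.pop()
        if cylinder_length(word) <= Fraction(1, Q):
            leaves.append(word)
        else:
            # Extend the word by one letter on the right.
            stack.extend(word + (a,) for a in range(1, K + 1))
    return sorted(leaves)


for w in stopping_words(2, 10):
    print(w, cylinder_length(w))
```

Exact rational arithmetic is used so that the comparison with the threshold is not affected by rounding. The words have varying lengths, which is the point of using a diameter criterion rather than a fixed combinatorial length.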
We add edges on this tree corresponding to the so called suffix links. For each we add an edge from to its suffix of length . As we already mentioned, this suffix belongs necessarily to . One can visualize the tree and the suffix links of on Figure 2.
Now we define the set as the set of endpoints of the suffix links (in other words the maximal non-trivial suffixes of elements of ). We consider two kinds of edges on the vertex set . First, for each (oriented) path in the tree between pairs of vertices we add an edge in . We call such edge a prolongation edge (they are in black on Figure 2). Secondly, for each suffix link from to in , there is a unique vertex in and a path from to that avoids any other element from . We add an edge from to that we call a shift edge.
Each edge carries a label that is a finite word on (possibly empty). They are directly induced from the tree for which each edge carries a letter. Each prolongation edge in already carries a letter and we keep this letter as a label. Each shift edge is made of the concatenation of a path and a suffix link and we associate to this edge the label of the path in the tree .
Lemma 3.
For any and the graph recognizes the shift : for any biinfinite word there exists a unique biinfinite path in so that can be read along .
Remark 4.
The construction of the graphs from can be generalized to any set of words with the same properties (every word in has a prefix in the set and no proper prefix of an element of the set is contained in the set). If we had used instead of the set of words of given combinatorial length we would have obtained the de Bruijn graph [dBr].
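For comparison, the de Bruijn graph mentioned in this remark is straightforward to build: its edges are the words of a fixed length and each edge joins its prefix to its suffix. The sketch below assumes the alphabet {1, ..., K} and uses the hypothetical name `de_bruijn_edges`.

```python
from itertools import product


def de_bruijn_edges(K, n):
    """Map each word of length n on {1, ..., K} to the pair
    (prefix of length n - 1, suffix of length n - 1) that it connects."""
    edges = {}
    for word in product(range(1, K + 1), repeat=n):
        edges[word] = (word[:-1], word[1:])
    return edges


edges = de_bruijn_edges(2, 3)
print(len(edges))
print(edges[(1, 2, 1)])
```

This graph has $K^n$ edges regardless of how small the corresponding cylinders are, whereas a criterion based on diameters stops refining a word once its cylinder is small enough.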
3.2. From to
Now that we have the graph at hand we explain the construction of . Similarly to , the graph has two kinds of edges: prolongation edges and shift edges. The shift edges are in bijection with . Let . Let and be the source and target of the shift edge associated to in . By construction, if then and for some . The source of the edge corresponding to in is and its target is where is the unique element in that is a prefix of .
We now describe prolongation edges. To each prolongation edge in from to we associate for each in and each in a prolongation edge from to .
For the shift edge corresponding to we associate the weight
[TABLE]
where is the middle of the interval determined by a cylinder given by
[TABLE]
We give weight [math] to each prolongation edge.
As before, the biinfinite paths on edges of define a subshift of finite type.
Lemma 5.
For each , the graph recognizes the subshift . Moreover, for any shift edge in associated to the cylinder we have
[TABLE]
Proof.
Recall that the weights defined in (4) are in between the extremal possible values of in the cylinder . But and are in constructed in Section 3.1 and were chosen so that the corresponding images under the continued fraction map have diameter . Hence for each of and we are off by at most so that is off by at most . ∎
Remark 6.
Since the Gauss map has derivative , we have for any .
In particular, an alternative way of constructing a subshift would have been to pick all possible cylinders of combinatorial length (where is chosen large enough so that all diameters are smaller than ), but this would have led to a larger set of cylinders.
4. Lagrange and Markov edges in weighted directed graphs
Let $G$ be a weighted directed graph. We denote by $V$ and $E$ respectively the vertices and edges of $G$ and by $w : E \to \mathbb{R}$ the weight function. The codomain of the weight function need not be $\mathbb{R}$; any totally ordered set would do.
We call an edge $e$ in $E$ a Lagrange edge if there exists a cycle $c$ in $G$ that passes through $e$ and such that the weight of the edge $e$ is maximal among the weights of edges in $c$. An edge $e$ is called a Markov edge if there exist two cycles $c_-$ and $c_+$ and a path $p$ from $c_-$ to $c_+$ such that $e$ belongs to $c_- \cup p \cup c_+$ and the weight of $e$ is maximal among the weights of edges in $c_- \cup p \cup c_+$. The definition is illustrated with a simple example on Figure 3.
A simple approach for computing these edges is to test for each edge whether it is Lagrange or Markov.
Theorem 7.
Given a weighted directed graph $G$ and an edge $e$ of $G$, determining whether $e$ is Lagrange or Markov has complexity $O(m)$ where $m$ is the number of edges in $G$.
Proof.
Let $e$ be an edge and $u$ and $v$ its source and target. Then $e$ is a Lagrange edge if and only if there is a path from $v$ to $u$ with maximum edge weight at most the weight $w(e)$ of $e$. To compute that, one can simply do a depth-first search on edges with weight not greater than $w(e)$. Hence testing whether a single edge is Lagrange is $O(m)$. Now $e$ is a Markov edge if one can build a path through $e$ connected to a cycle (both backward and forward) using only edges of weight not greater than $w(e)$. Similarly, one can detect such cycles with two depth-first searches. In both cases, the search is bounded by the number of edges in the graph $G$. ∎
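The single-edge test from this proof can be sketched as follows. The graph is assumed to be given as a list of triples `(source, target, weight)`; the name `classify_edge` and this representation are choices made for the illustration. The sketch favours clarity over the exact bookkeeping of the proof: reachability is computed with an explicit stack, and the existence of a reachable cycle is checked with a topological-sort pass (Kahn's algorithm) on the reachable part. Each step remains linear in the number of edges.

```python
def _reachable(start, adj):
    """Set of vertices reachable from `start` following `adj`."""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def _has_cycle(vertices, adj):
    """True if the subgraph induced on `vertices` contains a cycle."""
    indeg = {x: 0 for x in vertices}
    for x in vertices:
        for y in adj.get(x, ()):
            if y in indeg:
                indeg[y] += 1
    queue = [x for x in vertices if indeg[x] == 0]
    removed = 0
    while queue:
        x = queue.pop()
        removed += 1
        for y in adj.get(x, ()):
            if y in indeg:
                indeg[y] -= 1
                if indeg[y] == 0:
                    queue.append(y)
    # Vertices that are never removed lie on or behind a cycle.
    return removed < len(vertices)


def classify_edge(edges, index):
    """Return (is_lagrange, is_markov) for edges[index]."""
    u, v, w = edges[index]
    fwd, bwd = {}, {}
    for a, b, wt in edges:
        if wt <= w:  # only edges that are not heavier than the tested edge
            fwd.setdefault(a, []).append(b)
            bwd.setdefault(b, []).append(a)
    after = _reachable(v, fwd)    # vertices reachable from the target
    before = _reachable(u, bwd)   # vertices from which the source is reachable
    is_lagrange = u in after
    is_markov = _has_cycle(after, fwd) and _has_cycle(before, bwd)
    return is_lagrange, is_markov


# Two 2-cycles joined by a heavy bridge 1 -> 2, plus a dangling edge 3 -> 4.
edges = [(0, 1, 1), (1, 0, 1), (1, 2, 5), (2, 3, 2), (3, 2, 2), (3, 4, 9)]
for i, e in enumerate(edges):
    print(e, classify_edge(edges, i))
```

In this example the bridge is a Markov edge but not a Lagrange edge, while the dangling edge is neither.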
As a consequence of Theorem 7, computing all Lagrange and Markov edges in a given graph $G$ by testing each edge separately has complexity $O(m^2)$ where $m$ is the number of edges in $G$. We now describe a procedure to reduce the computational time for the search of Lagrange and Markov edges based on online cycle detection and strongly connected component maintenance [HKMST08, HKMST12, BFGT16].
Theorem 8.
Computing the set of weights of Lagrange edges or the set of weights of Markov edges in a directed weighted graph can be achieved in where is the number of edges of .
Proof.
Order the edges in by non-decreasing weight. Namely with . We define a sequence of acyclic graphs by considering the graph obtained from the edges , identifying the vertices that belong to a same strongly connected component and removing the loops (edges from a vertex to itself). (Recall that a directed graph is strongly connected whenever there are oriented paths joining any given pair of vertices, and that a strongly connected component of a directed graph is a maximal strongly connected directed subgraph.) The graph is concretely obtained from by adding the -th edge and possibly identifying vertices in a newly appeared strongly connected component. As shown in [HKMST08, HKMST12, BFGT16], maintaining the strongly connected components in a dynamic graph, or equivalently computing the sequence , can be done in .
Now the weights of Lagrange edges are exactly the weights of edges that create a cycle when they are added in . This shows that it comes at no additional cost. For Markov edges however, one needs to make a further traversal of the graph. However this can be reduced to a total cost of from which we obtain the same upper bound . In order to do so, one needs to maintain two flags for each edge: whether it can reach in forward and backward directions a non-trivial strongly connected component. Updating these flags is only performed when a new strongly connected component is detected and each edge is at most traversed once.
Now, having this extra information an edge is Markov if and only if at the time it is added in it is such that its target can reach a strongly connected component in forward direction and its source can reach a strongly connected component in backward direction. ∎
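The characterization of Lagrange weights used in this proof (edges that close a cycle when inserted in order of non-decreasing weight) can be illustrated with a deliberately naive sketch. It does not implement the incremental strongly connected component maintenance of the cited works: it simply reruns a search at every insertion, so it is quadratic in the number of edges in the worst case. The name `lagrange_weights` and the edge representation as triples `(source, target, weight)` are assumptions of this illustration.

```python
def lagrange_weights(edges):
    """Set of weights of Lagrange edges, obtained by inserting the edges in
    non-decreasing weight order and recording the weight of every edge whose
    target can already reach its source when it is inserted."""
    adj = {}
    weights = set()
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Search from the target among the edges inserted so far.
        seen, stack = {v}, [v]
        while stack:
            x = stack.pop()
            for y in adj.get(x, ()):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if u in seen:
            weights.add(w)
        adj.setdefault(u, []).append(v)
    return weights


edges = [(0, 1, 1), (1, 0, 1), (1, 2, 5), (2, 3, 2), (3, 2, 2), (3, 4, 9)]
print(sorted(lagrange_weights(edges)))
```

When several edges share the same weight, only the last one inserted on a given cycle is detected, but the set of weights is unaffected since the maximal weight on the cycle is recorded either way.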
5. Approximation of Lagrange and Markov spectra
Recall that we defined graphs in Section 3 and Lagrange and Markov edges in a weighted directed graph were defined in Section 4. The main aim of this section is to prove the following result.
Theorem 9.
For any the set of weights of respectively Lagrange and Markov edges in is -close to respectively and .
Proof.
We do the proof for the Markov spectrum, the case of Lagrange being similar.
Let and let . We will show that there is a Markov edge whose weight is -close to . Let be the shift edge corresponding to in the graph . This sequence of shift edges determines a biinfinite path . Let be the edge in with maximal weight. We claim that is a Markov edge of the graph and that .
By construction, we have . Hence the weight of the supremum satisfies the required bound. Let be any index such that . Since is biinfinite, there is a smallest such that the path intersects itself. That is, there is with . Since the weights on the edges between indices and are at most we constructed a cycle all of whose weights are at most that can be reached from by a path with weights not larger than . The construction of a path in backward direction is performed similarly and concludes the fact that is a Markov edge in . ∎
6. The size of and algorithm complexity
As we saw in Section 4 the time complexity of detecting Lagrange and Markov edges in a graph is polynomial in the size of the graph. In this section we provide an upper bound on the size of the graphs . It will prove the polynomial bounds in Theorem 1.
Lemma 10.
The number of edges in is bounded by
[TABLE]
To prove the lemma we need two intermediate results that will also be used later. The first one gives a control on the diameter in terms of the combinatorial length.
Lemma 11 ([CF89], Lemma 2 p. 2).
Let and with . Then .
The second provides a lower bound on the diameter for elements in .
Lemma 12.
For any and any we have
[TABLE]
Proof.
On one hand, we have for each . On the other hand, we have that
[TABLE]
and
[TABLE]
Since
[TABLE]
and
[TABLE]
we obtain that
[TABLE]
The estimate follows. ∎
Proof of Lemma 10.
Recall that the graph has two kinds of edges: prolongation edges and shift edges. The shift edges are in bijection with which provides the term in .
Now we need to bound the number of prolongation edges and we claim that their number is at most
[TABLE]
This follows from Lemma 11 and the lower bound estimates on from Lemma 12. ∎
It now remains to estimate the size of the sets . As we will see next, the growth rate in is intimately linked to the Hausdorff dimension of the Lagrange and Markov spectra. To do so we apply the techniques in [Mor18] and Palis–Takens book [PaTa] to make this relation precise.
Let us recall that we defined the sets in Section 3 as the image of the one-sided shift under the continued fraction map. More concretely
[TABLE]
The set is a closed invariant subset of the Gauss map.
It is known that the Hausdorff dimension of is strictly increasing in and as (cf. Hensley [He92] and [He96]). For our purposes, it is useful to know that for small values of we have the following estimates (from [Je04, JePo01, JePo18])
[TABLE]
Theorem 13.
There exist constants and such that for any positive integer we have
[TABLE]
where the constants and can be explicitly computed
[TABLE]
Putting together the estimates from Lemma 10 and Theorem 13 we obtain an estimate on the size.
Corollary 14.
We have for .
Proof of Theorem 13.
Observe that the -th iterate of the Gauss map sends to . Thus, the average of its derivative belongs to the interval
[TABLE]
Here we used the lower bound estimate from Lemma 12.
Since the distortion of the iterates of the Gauss map (i.e., the ratio between its maximal and minimal derivatives) is (see, e.g., Proposition 2 in [Mor18]), the maximal derivative of the -th iterate of the Gauss map on is and the minimal derivative of the -th iterate of the Gauss map on is .
As it is explained in pages 68 to 70 of Palis–Takens book [PaTa], one has
[TABLE]
In particular, it follows that
[TABLE]
This completes the proof of the desired theorem. ∎
7. Approximation by periodic orbits
In this section we consider the following question. How well do the periodic (resp. ultimately periodic) sequences in with period at most approximate the set (resp. )?
Proposition 15.
Let and . Then the subset
[TABLE]
is -dense in . Similarly, the subset
[TABLE]
is -dense in .
Proof.
The main ingredient in our estimates is Lemma 11.
Lagrange spectrum. Fix . Let and put .
By definition, . Hence, given , there are infinitely many such that . Also, there is a sequence as such that for all . Fix . Given an index consider the finite sequence with terms . There is a sequence such that for infinitely many values of , i.e., there are with , . Note that we may (and do) assume that .
For each , consider the subshift of given by . Note that is invariant by transposition operation and by the shift map . Moreover, it is contained (and well approximated) by the following subshift of finite type: let be the set of all factors of size of all elements of , and denote by the set of all infinite words in whose factors of size are all in .
We can describe as a (not necessarily transitive) Markov shift: the allowed transitions are of the type with and belonging to .
By definition, we have , and it is possible to connect to itself in by a sequence of allowed transitions. The minimum number of transitions needed to connect to itself in is trivially bounded by the size of , which is at most .
In other words, for each , there is and a factor of size of an element of with .
In particular, if is the periodic sequence of period , then Lemma 11 ensures that . Since , there is a periodic sequence with period such that for infinitely many and, a fortiori, .
Markov spectrum. Let , consider , and suppose that for all . We may (and do) also assume that .
By the pigeonhole principle, there are and such that and .
By Lemma 11, if is the doubly pre-periodic sequence
[TABLE]
then . Notice that the total size of the block formed by the central block and the periods on both sides is at most . ∎
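The approach by periodic orbits can be tried numerically on small cases. The sketch below enumerates all periodic sequences on the alphabet {1, ..., K} with period at most a given bound and computes, for each of them, the maximum over one period of the sum of the forward and backward continued fractions, which for a periodic sequence is its Lagrange value. The function names, the truncation `depth`, and the rounding used to merge duplicates are assumptions of this illustration. The enumeration is exponential in the period and only meant for very small parameters.

```python
from itertools import product
from math import sqrt


def cf_value(digits):
    """Value of the finite continued fraction [d0; d1, ..., dn]."""
    value = float(digits[-1])
    for d in reversed(digits[:-1]):
        value = d + 1.0 / value
    return value


def lagrange_value(period, depth=60):
    """Maximum over all shifts of [a_0; a_1, ...] + [0; a_-1, ...] for the
    bi-infinite periodic sequence with the given period (truncated)."""
    n = len(period)
    best = 0.0
    for s in range(n):
        forward = [period[(s + i) % n] for i in range(depth)]
        backward = [period[(s - i) % n] for i in range(1, depth)]
        best = max(best, cf_value(forward) + cf_value([0] + backward))
    return best


def periodic_lagrange_values(K, max_period):
    """Sorted Lagrange values of periodic sequences on {1, ..., K} with
    period at most max_period, rounded to 9 decimals to merge duplicates."""
    values = set()
    for n in range(1, max_period + 1):
        for period in product(range(1, K + 1), repeat=n):
            values.add(round(lagrange_value(period), 9))
    return sorted(values)


print(periodic_lagrange_values(2, 2))
print(len(periodic_lagrange_values(2, 8)))
```

With `K = 2` and periods at most 2 one recovers the three values $\sqrt{5}$, $2\sqrt{2}$ and $2\sqrt{3}$, coming from the periods (1), (2) and (1, 2).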
Remark 16.
The Markov spectrum is also characterized by the values of real indefinite binary quadratic forms (see [CF89, Chapter 1]). An attempt to draw some portions of using certain binary quadratic forms of bounded heights was performed by T. Morrison [Mo12], but unfortunately this text does not discuss the quality of the approximation of obtained by this method.
8. High resolution Lagrange spectra and
References
- [BFGT16] M. Bender, J. Fineman, S. Gilbert, R. Tarjan, A new approach to incremental cycle detection and related problems, ACM Transactions on Algorithms, Volume 12, Issue 2, February 2016, Article No. 14.
- [Be73] A. A. Berstein, The connections between the Markov and Lagrange spectra, Number-theoretic studies in the Markov spectrum and in the structural theory of set addition, pp. 16–49, 121–125. Kalinin. Gos. Univ., Moscow, 1973.
- [CF89] T. Cusick and M. Flahive, The Markoff and Lagrange spectra, Mathematical Surveys and Monographs, 30. American Mathematical Society, Providence, RI, 1989. x+97 pp.
- [Cyt] R. Bradshaw, S. Behnel, D. S. Seljebotn, G. Ewing, et al., The Cython compiler, http://cython.org.
- [dBr] N. G. de Bruijn, A combinatorial problem, Nederl. Akad. Wetensch., Proc. 49 (1946), 758–764.
- [Fre68] G. A. Freiman, Noncoincidence of the Markoff and Lagrange spectra, Mat. Zametki 3 (1968), 195–200; English transl., Math. Notes 3 (1968), 125–128.
- [Fre73] G. A. Freiman, The initial point of Hall's ray, Number-theoretic studies in the Markov spectrum and in the structural theory of set addition, pp. 87–120, 121–125. Kalinin. Gos. Univ., Moscow, 1973.
- [HKMST08] B. Haeupler, T. Kavitha, R. Mathew, S. Sen, R. E. Tarjan, Faster algorithms for incremental topological ordering, in Proceedings of ICALP 2008. Springer.
