A robust implementation for solving the $S$-unit equation and several applications
Alejandra Alvarado, Angelos Koutsianas, Beth Malmskog, Christopher Rasmussen, Christelle Vincent, Mckenzie West

TL;DR
This paper introduces a new implementation in SageMath for solving the $S$-unit equation, enabling extensive computations that lead to applications such as an asymptotic Fermat's Last Theorem for certain cubic fields and solutions to Ramanujan-Nagell equations.
Contribution
The paper provides a robust implementation for solving the $S$-unit equation in SageMath, with mathematical foundations and extensive computational results for various number fields.
Findings
Bounded solutions for small degree fields and sets $S$
Proof of an asymptotic Fermat's Last Theorem in specific cubic fields
Complete solutions to certain Ramanujan-Nagell equations
Abstract
Let $K$ be a number field, and $S$ a finite set of places in $K$ containing all infinite places. We present an implementation for solving the $S$-unit equation $x + y = 1$, $x, y \in \mathcal{O}_{K,S}^{\times}$, in the computer algebra package SageMath. This paper outlines the mathematical basis for the implementation. We discuss and reference the results of extensive computations, including exponent bounds for solutions in many fields of small degree for small sets $S$. As an application, we prove an asymptotic version of Fermat's Last Theorem for totally real cubic number fields with bounded discriminant where 2 is totally ramified. In addition, we use the implementation to find all solutions to some cubic Ramanujan-Nagell equations.
| Runtime (in seconds) | ||||
|---|---|---|---|---|
| 16 | 01.16 | |||
| 0 | 02.06 | |||
| 0 | 64.00 |
Alejandra Alvarado, Department of Mathematics and Computer Science, Eastern Illinois University
Angelos Koutsianas, Department of Mathematics, University of British Columbia
Beth Malmskog, Department of Mathematics and Computer Science, Colorado College
Christopher Rasmussen, Department of Mathematics and Computer Science, Wesleyan University
Christelle Vincent, Department of Mathematics and Statistics, University of Vermont
Mckenzie West, Department of Mathematics, University of Wisconsin Eau Claire
1. Introduction
In 1909, Thue proved there are only finitely many integral solutions to what we now call the Thue equation; i.e., that for any $\mathbb{Q}$-irreducible binary form $F(X, Y)$ of degree at least 3, defined over the integers, there are only finitely many solutions $(x, y) \in \mathbb{Z}^2$ to the equation
$$F(x, y) = m,$$
where $m$ is any non-zero integer [39]. Thue accomplished this by formally factoring $F$ into linear terms of the form $X - \alpha Y$, where $\alpha$ is algebraic, then bounding the quality of rational approximations of $\alpha$ in terms of the size of $x$ and $y$. Thus bounds on integer solutions to the Thue equation arose out of the theory of approximating algebraic numbers by rationals. Thue’s theorem was generalized by Siegel [34] (see also the recent translation [17] by Fuchs) and then Mahler [25]. These generalizations gave rise to a central fact of modern computational number theory: if $K$ is a number field, and $S$ a finite list of places of $K$ including all infinite places, then there are only finitely many solutions to the equation
$$x + y = 1, \qquad x, y \in \mathcal{O}_{K,S}^{\times}. \tag{1}$$
Here, $\mathcal{O}_{K,S}^{\times}$ is the unit group of the ring of $S$-integers in $K$. We refer to (1) as the $S$-unit equation. In this paper, we describe an algorithm to determine the complete set of solutions to the $S$-unit equation for general $K$ and $S$. More generally, for fixed $a, b \in K^{\times}$, we can see that the equation $ax + by = 1$ will also have only finitely many solutions by expanding the set $S$ to include all primes dividing $a$ and $b$ and searching for solutions to (1). Thus it suffices to solve (1) to address the more general case, and we focus on (1) here (though it should be remarked that this is not the most efficient way to solve $ax + by = 1$).
The work of Gelfond and Schneider, resolving Hilbert’s seventh problem in the affirmative (all irrational algebraic powers of algebraic numbers are transcendental once trivial cases are ignored), determined lower bounds on the absolute value of a -linear combination of two -linearly independent logarithms of algebraic numbers. Alan Baker’s 1967 theorem [1] generalized these results to the case of many logarithms. Baker, Wüstholz, and many others continued to improve these bounds. Naturally, one should ask if similar results are available over local fields, and indeed such results began to appear quickly. In 1968, Brumer proved the first analogue of Baker’s work for -adic logarithms [8], followed by many improvements and generalizations, such as the results of Yu [44]. Improvements in both the archimedean and nonarchimedean cases continue to appear, such as in [20, 4, 47, 19].
For any choice of $K$ and $S$, $\mathcal{O}_{K,S}^{\times}$ is a finitely generated $\mathbb{Z}$-module. Fixing a basis $\rho_1, \dots, \rho_t$ for the torsion free part, we can express any $x \in \mathcal{O}_{K,S}^{\times}$ as $x = \zeta \prod_{i=1}^{t} \rho_i^{a_i}$ for some root of unity $\zeta$ and some $a_i \in \mathbb{Z}$. Building on the lower bounds for linear combinations of logarithms, Győry [18] determined effectively computable bounds for the exponents $a_i$. This was a great victory for computational number theory, as this provably restricted all solutions to (1) to a finite search space. Unfortunately, the demonstrated bounds were enormous and as a matter of practice, it was computationally infeasible to conduct an exhaustive search for solutions, even in the very simplest cases. Baker and Davenport devised a clever method of reducing the bounds in special cases in [2]. However, in [14], de Weger built on the ideas of Baker-Davenport to develop a powerful general method of algorithmically reducing the bounds to a manageable size, relying on the lattice basis reduction algorithm of Lenstra, Lenstra, and Lovász [24] (henceforth referred to as the “LLL algorithm”). Though it has not been proven that de Weger’s method will always reduce the bounds coming from the results in linear forms of logarithms, this is the rule in practice. In many cases, de Weger’s approach provides sufficient improvements that, with careful sieving (or sometimes even with only brute force), the entire search space can be exhausted and complete lists of solutions can be enumerated.
Beyond the improvements provided by LLL-based reduction, many mathematicians have developed further algorithms for efficiently searching below the “LLL bounds” provided by de Weger’s work. Two powerful examples are reported in [43] and [38]. Increasingly, the theoretical improvements (assisted by technological improvements) have pushed ambitious and interesting computational problems within reach. For example, Smart determined the entire set of all genus $2$ curves over $\mathbb{Q}$ with good reduction away from $2$, based in part on solving (1) for a family of number fields unramified away from $2$ [36].
We have written a package of Python functions for inclusion in the computer algebra system SageMath [32], which solves the -unit equation (1) over any number field and for any finite set of finite places. As experienced readers may expect, the package is not practical when either or is too large, although there is no theoretical obstruction. While this package is the independent creation of the authors, it is based in part on the descriptions of algorithms implemented by Smart [35, 36, 37]. Specifically, we follow Smart’s development in determining initial large bounds, including the numbering of constants, in [35], with some adjustments and small corrections. In reducing the bounds, we follow [37], again with some adjustments. The sieving step is based on ideas cited by Smart [36] as due to others (as noted in Section 6) but has been redeveloped in new notation and style. We include proofs of our versions of results when we made adjustments to versions in the literature. To the authors’ knowledge, our package is the first publicly available implementation for solving the -unit equation over any field other than ; the present article describes the algorithm and its implementation. The implementation was a highly non-trivial undertaking, involving efforts spreading over more than seven years on the parts of individuals and the entire team.
We also provide new results facilitated by our implementation. In particular, we first provide a discussion of and link to explicit exponent bounds for solutions of the -unit equation in all cases where is ramified only at primes above some subset of and
[TABLE]
We improve the best known exponent bounds for solutions of the -unit equation over number fields related to a class of genus curves over with good reduction away from . We solve the -unit equation in the totally real cubic number fields in which is totally ramified and the absolute discriminant of , , satisfies , and we use these results to verify that an asymptotic version of Fermat’s Last Theorem holds over these fields. Finally, we find all solutions to certain cubic Ramanujan-Nagell equations.
1.1. Overview
The organization of the paper proceeds as follows. We introduce certain notations in §2. In §3, we review the relevant work of Baker-Wüstholz and Yu. This is used in §4 to establish a “pre-LLL” exponent bound for each place in . In §5, we explain the process of using LLL to reduce these exponent bounds – the approach is different for archimedean and nonarchimedean places. In §6, we describe the sieve for further constraining the final search space. We devote §7 to a discussion of our experimental observations, having now executed our algorithm in several dozen cases. We highlight a special condition ( contains only one finite place) under which a significant improvement in the search space can be obtained. Although narrow in scope, the special condition is sufficiently natural, and the savings sufficiently nontrivial, as to warrant its discussion. Finally, §8 introduces two applications: an asymptotic version of Fermat’s Last Theorem over totally real cubic fields and a solution to a cubic variant of the Ramanujan-Nagell equation.
Acknowledgments
We are delighted to recognize the Institute for Computational and Experimental Research in Mathematics for both funding and hosting a 2017 collaboration during which a great deal of this project was completed. Part of this work began at the 2014 workshop SageDays 62, and we would like to thank Anna Haensch and Lola Thompson for organizing that workshop and Microsoft Research and The Beatrice Yormark Fund for Women in Mathematics for funding. Some of the work was supported by the van Vleck fund at Wesleyan University. The authors would like to thank many people for helpful conversations that led to improvements in the code and gave direction to this project, including Bjorn Poonen, Andrew Sutherland, and Norman Danner. We would also like especially to thank David Roe for his contributions to refining and reviewing the code for inclusion in SageMath. The third author was partially supported in this work by NSA Grant #H98230-16-1-0300. We are very grateful to the anonymous referees for their careful reading of this work and their many helpful comments which have improved the quality of this paper.
2. Notation
2.1. -units in number fields
Throughout this paper, we let denote the algebraic closure of inside , the field of complex numbers. Unless stated otherwise, we fix the following notation throughout:
[TABLE]
If is a monic and irreducible polynomial, we let denote the number field , where is a root of . Always, denotes the principal branch of the complex logarithm function, with argument in .
2.2. Absolute Values and Completions
Each place of determines an associated value, , which we now describe.
Let denote the usual absolute value on . If is an infinite place, choose , an embedding corresponding to . The associated absolute value depends on whether is a real or complex (meaning non-real) place of :
[TABLE]
Now suppose is a finite place. View as a prime ideal of , and let be the characteristic of the residue field . Let denote the ordinal function for . On this is defined by
[TABLE]
and it extends to in the obvious way. We let denote the usual absolute value of the -adic field . The absolute value associated to on is
[TABLE]
Let be the -adic completion of with respect to ; we also use for the absolute value on .
We fix once and for all an algebraic closure of , and let denote the completion of . We use to denote the natural extension of to all of . We define on to satisfy
[TABLE]
As may split into several prime ideals, the absolute value on may have several inequivalent extensions to , of which is just one; so we must take care when viewing as a subfield of .
For any embedding , we obtain a subfield of as the composite . By the Prolongation Theorem [21, §18.5], there exists a choice of such that is value-isomorphic to . Henceforth, we always use this isomorphism to view as a subfield of . As the isomorphism respects the valuations, we know and satisfy
[TABLE]
2.3. Height functions
Suppose . We let denote the standard logarithmic Weil height on . This is defined as follows: for any ,
[TABLE]
where the sum runs over all places of . It is a consequence of the product formula ([21, Ch. 20, pgs. 326–327]) that is independent of the choice of coordinates for . For any , set . Note that this height is absolute in the sense that it is not dependent on which field extension containing the coordinates of is considered.
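As a concrete illustration of a height defined as a sum over places, the following sketch treats only the special case of a nonzero rational number, where the places are the usual absolute value and the $p$-adic absolute values. It assumes the standard normalization $h(x) = \sum_v \log\max(1, |x|_v)$ over $\mathbb{Q}$; the function names are hypothetical and are not part of the authors' package.

```python
import math
from fractions import Fraction

def prime_factors(n):
    """Return the set of primes dividing the integer n (trial division)."""
    n = abs(n)
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def ord_p(x, p):
    """p-adic valuation of a nonzero rational number x."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def weil_height(x):
    """Logarithmic height of a nonzero rational x, summed place by place."""
    x = Fraction(x)
    # Archimedean place: the usual absolute value.
    total = math.log(max(1.0, float(abs(x))))
    # Finite places: only primes dividing numerator or denominator contribute.
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        total += math.log(max(1.0, float(p) ** (-ord_p(x, p))))
    return total
```

For a rational number $a/b$ in lowest terms, the sum collapses to $\log\max(|a|, |b|)$, which gives a quick sanity check on the place-by-place computation.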
We introduce a modified version of this height function, used in §3. Suppose , and let . For any nonzero element , we define the function by
[TABLE]
The definition of another height function, , is slightly more technical and will be introduced when needed in §3.
2.4. -adic logarithms
Inside , consider the open disk
[TABLE]
On , we define the -adic logarithm by the series
[TABLE]
The series is convergent on ; moreover, on it satisfies the identity
[TABLE]
If we have
[TABLE]
Based on an idea due to Iwasawa, the -adic logarithm can be extended to any such that ; this extension continues to satisfy (4) (see [37, II.2.4]).
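To make the series definition tangible, here is a minimal sketch over $\mathbb{Q}_p$ only (not over an extension, and not the package's log_p function). It assumes the standard series $\log_p(1+z) = \sum_{n \geq 1} (-1)^{n+1} z^n / n$ for an integer $x = 1 + z$ with $p \mid z$ and $p$ odd, and returns the value modulo $p^N$. The function name and the truncation rule are illustrative assumptions.

```python
from fractions import Fraction

def log_p_mod(x, p, N):
    """Approximate log_p(x) modulo p**N for an integer x = 1 (mod p), p an odd prime.

    Terms z**n / n with n > 2N have p-adic valuation at least n - log_p(n) >= N,
    so truncating the series at n = 2N does not change the result modulo p**N.
    """
    if (x - 1) % p != 0:
        raise ValueError("the series is only used here for x congruent to 1 mod p")
    z = Fraction(x - 1)
    total = Fraction(0)
    for n in range(1, 2 * N + 1):
        total += Fraction((-1) ** (n + 1)) * z ** n / n
    modulus = p ** N
    # Every term is a p-adic integer, so the reduced denominator is prime to p.
    return total.numerator * pow(total.denominator, -1, modulus) % modulus
```

With $p = 5$, one can check the additivity of the logarithm on a pair such as $6 \cdot 11 = 66$ by comparing `log_p_mod(6, 5, 8) + log_p_mod(11, 5, 8)` with `log_p_mod(66, 5, 8)` modulo $5^8$.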
2.5. Solutions to the -unit equation
We let denote the additive -module . This is isomorphic to , and the list of generators determines an isomorphism
[TABLE]
We use the shorthand . For obvious reasons, we call the elements of exponent vectors. Much of our discussion will focus on bounds for the entries of an exponent vector. For , we use the notation to signify
[TABLE]
Within , we wish to determine
[TABLE]
Solving the -unit equation is equivalent to determining the set . We let denote the corresponding subset of .
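The role of exponent vectors can be illustrated in the simplest setting, $K = \mathbb{Q}$ with the finite places of $S$ given by a short list of primes. The sketch below enumerates every candidate $x = \pm \prod p_i^{a_i}$ with $|a_i|$ below a chosen bound and keeps those for which $1 - x$ is again an $S$-unit. It is a brute-force toy, not the authors' algorithm; the function names are hypothetical, and the exponent bound is supplied by the caller rather than derived.

```python
from fractions import Fraction
from itertools import product

def is_S_unit(x, primes):
    """True if the rational x is nonzero and supported only on the given primes."""
    if x == 0:
        return False
    num, den = abs(x.numerator), x.denominator
    for p in primes:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1

def brute_force_S_unit_solutions(primes, bound):
    """All (x, y) with x + y = 1, both S-units, and exponents of x in [-bound, bound]."""
    solutions = set()
    exponents = range(-bound, bound + 1)
    for sign in (1, -1):
        for vector in product(exponents, repeat=len(primes)):
            x = Fraction(sign)
            for p, a in zip(primes, vector):
                x *= Fraction(p) ** a
            y = 1 - x
            if is_S_unit(y, primes):
                solutions.add((x, y))
    return solutions
```

For the primes 2 and 3 the search with bound 4 returns 21 ordered pairs, including $(9, -8)$ and $(1/9, 8/9)$. The point of the bounds developed in the following sections is to certify that such a finite search is exhaustive.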
3. The Bounds of Baker-Wüstholz and Yu
Suppose provide a solution to the -unit equation, so that . With respect to the ordered generating set , there are unique vectors such that
[TABLE]
The techniques of lattice reduction discussed in §5 will not produce an absolute bound for on their own; they can only be used to improve a known bound. So in this section, we recall bounds established by Baker-Wüstholz [3] and Kunrui Yu [47]. An excellent treatment of the background material appears in [15].
3.1. Statement of Yu’s Bound
Let be a finite place of , and let denote the rational prime below . We let be the smallest rational prime distinct from (so unless , in which case ). Let . We say satisfies Yu’s auxiliary condition if any of the following hold:
- (i) and ,
- (ii) and ,
- (iii) and .
At the end of this section, we explain how the algorithm finds a bound in cases where does not satisfy Yu’s auxiliary condition.
Theorem 3.1 (Yu, [47, pg. 190]).
Suppose and is a prime of . Suppose is a number field satisfying Yu’s auxiliary condition and are chosen which satisfy
[TABLE]
Suppose and . Finally, suppose satisfies
[TABLE]
Then there exist explicit constants and , given below, such that
[TABLE]
3.2. The constants and
We first discuss the constant , and the variant used in the algorithm. In Theorem 3.1, is roughly a product of the logarithmic heights of the . More precisely, decompose the set into a disjoint union , where is a maximal subset of which is multiplicatively independent. Such a decomposition need not be unique. Because of the possible dependence among the , Yu requires a modified height function:
[TABLE]
(The value is explained in the following subsection.) The constant , which depends on , , as well as the , is then defined by
[TABLE]
As shown in [47], one may choose any maximal independent set for the computation of . If optimization of the bound is critical, one may search over all possible and take the smallest possible bound. This observation is moot in our use, however.
Corollary 3.2.
Keeping the hypotheses of the previous theorem, suppose also that are multiplicatively independent. Set
[TABLE]
Then .
Proof.
In this case, is unique; either or . In either case, and the result follows immediately. ∎
In the algorithm, we are always in the situation of the Corollary. Rather than decide the question of independence between and the other , we just use the constant .
3.3. The constant
The value of is dependent on , , and , as follows. Let , so that is the -part of . Set
[TABLE]
Here, denotes the base of the natural logarithm. The constants , , and are given in Tables 3.1 and 3.2. Finally,
[TABLE]
3.4. A Remark about implementation
For this subsection only, suppose all hypotheses in Theorem 3.1 are satisfied, except does not satisfy Yu’s auxiliary condition. Set
[TABLE]
Let be a prime of above . Let be the ramification index of over . Because
[TABLE]
we see for all . Now Theorem 3.1 applies with and in place of and , respectively.
Corollary 3.3.
Under the conditions of this subsection,
[TABLE]
Note that even if splits as in , the choice of is irrelevant; both give the exact same bound in the Corollary.
3.5. Bound of Baker-Wüstholz
We now give an effective version of Baker’s theorem. (Notations are as in §2.2, 2.3.)
Theorem 3.4 (Baker-Wüstholz, [3, pg. 20]).
Let be a linear form in indeterminates,
[TABLE]
Let , and let . Let be the subfield of generated by the . If and
[TABLE]
then
[TABLE]
where the constant is defined by
[TABLE]
Note that we may be sure if the set is linearly independent over .
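To give a sense of scale, the sketch below evaluates the constant in the form in which the Baker-Wüstholz theorem is usually quoted, $C(n, d) = 18\,(n+1)!\; n^{n+1} (32d)^{n+2} \log(2nd)$, where $n$ is the number of logarithms and $d$ the degree of the field generated by the algebraic numbers. This formula is stated here as an assumption, since the displayed definition is not reproduced above, and the function name is hypothetical.

```python
import math

def baker_wustholz_constant(n, d):
    """C(n, d) = 18 (n+1)! n^(n+1) (32 d)^(n+2) log(2 n d), as commonly quoted."""
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive integers")
    integer_part = 18 * math.factorial(n + 1) * n ** (n + 1) * (32 * d) ** (n + 2)
    return integer_part * math.log(2 * n * d)

# Even two logarithms over the rationals give a constant above 10**9.
small_case = baker_wustholz_constant(2, 1)
```

The rapid growth in both $n$ and $d$ is one reason the initial exponent bounds are far too large for a direct search and must be reduced before sieving.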
3.6. Obtaining the initial bound
The theorems of Baker-Wüstholz and Yu both provide inequalities of the form “a polynomial function of is bounded by a polynomial function of ,” which in turn guarantee an absolute bound on . The analysis to determine such a bound explicitly is standard; we will use the following result of Pethő and de Weger for this purpose.
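Lemma 3.5 (Pethő and de Weger [29, Lemma 2.2]).
Suppose the real numbers $a$, $b$, $h$ satisfy $a \geq 0$, $h \geq 1$, $b > \bigl(\frac{e^{2}}{h}\bigr)^{h}$, and let $x$ be the largest solution to the equation
$$x = a + b(\log x)^{h}.$$
Then
$$x < 2^{h}\bigl(a^{1/h} + b^{1/h}\log(h^{h} b)\bigr)^{h}.$$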
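A numerical check can make the lemma concrete. The sketch below assumes the usual statement of the Pethő and de Weger lemma: for $a \geq 0$, $h \geq 1$, $b > (e^2/h)^h$, the largest solution of $x = a + b(\log x)^h$ satisfies $x < 2^h\bigl(a^{1/h} + b^{1/h}\log(h^h b)\bigr)^h$. The function names and the fixed-point iteration are illustrative choices, not taken from the package.

```python
import math

def petho_de_weger_bound(a, b, h):
    """Upper bound for the largest solution of x = a + b * (log x)**h."""
    if not (a >= 0 and h >= 1 and b > (math.e ** 2 / h) ** h):
        raise ValueError("hypotheses of the lemma are not satisfied")
    return 2 ** h * (a ** (1 / h) + b ** (1 / h) * math.log(h ** h * b)) ** h

def largest_solution(a, b, h, iterations=200):
    """Approximate the largest solution by iterating x -> a + b * (log x)**h.

    Starting from the upper bound, the iterates decrease monotonically
    towards the largest fixed point.
    """
    x = petho_de_weger_bound(a, b, h)
    for _ in range(iterations):
        x = a + b * math.log(x) ** h
    return x
```

For instance, with $a = 10$, $b = 100$, $h = 2$ the iteration settles near $8112$, comfortably below the bound returned by `petho_de_weger_bound(10, 100, 2)`.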
4. Initial Exponent Bounds
4.1. An upper bound at the extremal place
Suppose is a solution to the -unit equation, with specified as in (6). We set , and assume . Relabeling and if necessary, we assume for some . Recall that contains precisely places, . We choose the indices so that
[TABLE]
Remark 4.1.
In the sequel, we number our constants in an effort to stay consistent with the enumeration given in Smart’s paper [35]. There, Smart considers a more general unit equation, and so introduces certain constants , , whose values are trivial in the present application. So while the alert reader may notice gaps in the enumeration of constants, this is intentional. (Adjusting our implementation to the more general setting is not difficult, but we are satisfied to limit the discussion to match the current state of the implementation.)
For any choice of define the matrix
[TABLE]
One may always choose so that is invertible (see [15, §5.1]), and so we assume this is the case. We have
[TABLE]
Let be the row norm of , i.e. , and set
[TABLE]
Note that this differs slightly from Smart’s definition, to ensure that . Then $B \leq c_{1}\bigl|\log|\tau_{1}|_{\mathfrak{p}_{k}}\bigr|$. We define
[TABLE]
By [35, Lemma 2], we have
[TABLE]
We now have an upper bound on in terms of . We next establish a lower bound, also involving , which will force a limit on the size of . The precise argument depends on whether is a finite or infinite place. For the purposes of the algorithm, we must compute this bound on for each possible index ; we have no choice but to take the largest possible bound, i.e., the larger of the two values and determined in the remainder of this section.
4.2. Case I: is finite
If is finite, then let also denote the associated prime ideal in . Let be the prime of lying below , and let and denote the ramification index and inertial degree of over , respectively. From (9) we have
[TABLE]
Setting
[TABLE]
the inequality (10) yields
[TABLE]
and so . We would like to apply Yu’s Theorem to , but unfortunately the generators may have nonzero order with respect to . So we now replace the with a different set of generators, as in [35, pgs. 824–825]. First, set . Necessarily, there exist indices for which . Choose so that
[TABLE]
and now relabel so that . For , define
[TABLE]
so that . Next, for each with , choose integers such that
[TABLE]
Necessarily, . Set .
Lemma 4.1.
We have .
Proof.
Since , we know . Thus,
[TABLE]
proving the claim. ∎
Setting and , we have arranged that
[TABLE]
Since and , there are only finitely many possible values for , and this finite set can be determined without any knowledge of or the . For each , we may apply Corollary 3.2 or 3.3 as appropriate, and obtain a constant such that
[TABLE]
Setting
[TABLE]
we may be sure every -unit solution satisfies
[TABLE]
Combining inequalities (11) and (13), we have
[TABLE]
Since and , it follows that
[TABLE]
Applying Lemma 3.5 with , , and , we may conclude
[TABLE]
Set
[TABLE]
If corresponds to a finite place, then .
In our implementation, the functions mus and possible_mu0s are used to recover the for each finite place . The constants determined from Yu’s Theorem are computed in Yu_bound, while the constant , which may be of independent interest, is computed by K0_func.
4.3. Case II: is infinite
We now assume is infinite. As in §2.3, we let denote the embedding of into such that
[TABLE]
We let denote for any , and we define
[TABLE]
The condition (9) can now be expressed as
[TABLE]
The choices of and guarantee that
[TABLE]
Set . The estimate holds for , and so
[TABLE]
The next step is to view as a linear form in logarithms and apply the theorem of Baker and Wüstholz. Set . Since is a th root of unity, there exists such that . By (6), we have
[TABLE]
where we have introduced to adjust for the principal branch of the logarithm. Certainly , and so . Set
[TABLE]
and . We now have
[TABLE]
Taking , we define
[TABLE]
(Recall that is defined in Theorem 3.4.) We have . Applying Theorem 3.4 to , we obtain
[TABLE]
Combining this inequality with (14), we obtain
[TABLE]
This yields the inequality
[TABLE]
where
[TABLE]
As and , we have and . So by Lemma 3.5, (provided ), where
[TABLE]
Thus, setting
[TABLE]
we may be sure . In our implementation, the constant is computed in the function K1_func.
Combining all the results of this section, we obtain the following.
Lemma 4.2.
The constant satisfies .
5. LLL Reduction
In this section we explain how we can reduce the upper bound we have computed in Section 4. This is necessary, because in practice the size of the initial bound is extremely large and cannot be used for practical computations. The idea of the method we will present here has its origin in de Weger’s thesis [13, 12, 14] where he develops a method based on multi-dimensional approximation lattices of linear forms of $p$-adic numbers to solve (among many other equations) $S$-unit equations over $\mathbb{Q}$. (It is worth mentioning the recent results of von Känel and Matschke [30], who solve $S$-unit equations using modularity.) These ideas of de Weger have been extended by himself and others to apply over any number field $K$, and have also been used for the solution of other exponential Diophantine equations [40, 41, 42, 35].
In the reduction step we use the LLL reduction algorithm on lattices generated by integer matrices. So instead of the classical LLL algorithm [24], we use the algorithm in [12]. If is a lattice in , let . For , we define
[TABLE]
Computing the exact value of is a very challenging problem in general. Instead, the function minimal_vector computes a lower bound using standard properties of a reduced basis of a lattice and the LLL algorithm (see [37, Chapter V]). As in the previous section, we follow Smart’s notation in [35]. Most of the material we present in this section can also be found in [37, 15].
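The "standard properties of a reduced basis" can be illustrated with the classical fact that, for an LLL-reduced basis $b_1, \dots, b_n$ (with parameter $3/4$), every nonzero lattice vector $v$ satisfies $\|v\|^2 \geq 2^{-(n-1)} \|b_1\|^2$. The sketch below is a textbook LLL in exact rational arithmetic applied to a small approximation-style lattice. It is not the package's minimal_vector function: it treats only the distance to the origin, uses rows rather than columns as basis vectors, and the scaling constant `C` and the choice of logarithms are illustrative assumptions.

```python
import math
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """Gram-Schmidt vectors and coefficients mu for the rows of B (exact)."""
    n = len(B)
    Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in B[i]]
        for j in range(i):
            mu[i][j] = Fraction(dot(B[i], Bs[j])) / dot(Bs[j], Bs[j])
            v = [a - mu[i][j] * b for a, b in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu

def lll_reduce(B, delta=Fraction(3, 4)):
    """Textbook LLL on the rows of an integer matrix with independent rows."""
    B = [list(row) for row in B]
    n, k = len(B), 1
    while k < n:
        for j in range(k - 1, -1, -1):  # size-reduce row k against rows j < k
            _, mu = gram_schmidt(B)
            q = round(mu[k][j])
            if q:
                B[k] = [a - q * b for a, b in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        # Lovasz condition: advance if satisfied, otherwise swap and step back.
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            k = max(k - 1, 1)
    return B

def shortest_vector_lower_bound_sq(B):
    """Lower bound for the squared length of the shortest nonzero lattice vector."""
    reduced = lll_reduce(B)
    n = len(reduced)
    return Fraction(dot(reduced[0], reduced[0]), 2 ** (n - 1))

# Illustrative lattice: identity block plus scaled, rounded real logarithms.
C = 10 ** 6
basis = [[1, 0, round(C * math.log(2))],
         [0, 1, round(C * math.log(3))],
         [0, 0, round(C * math.log(5))]]
lower_bound_sq = shortest_vector_lower_bound_sq(basis)
```

Because the bound is read off from the first reduced basis vector, it costs no more than the reduction itself, which is what makes repeated reduction rounds cheap relative to a search for the true shortest vector.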
We preserve the meaning of from §4. When is a finite place, we let denote the prime of lying below . We continue to assume in this section.
5.1. Finite places
Suppose is a finite place. Set
[TABLE]
and suppose that . Define as . Combining (11), (2), and , shows that . Consequently, , and by (5),
[TABLE]
Let be as given in (12), so that we have
[TABLE]
Choose such that , and let denote the discriminant of . Set and , so that . Expressing with respect to the power basis, we obtain such that . Further, we may express
[TABLE]
Using an idea due to Evertse [42, p.257], we have
[TABLE]
Define
[TABLE]
and choose such that .
Should there be some index such that , then , and consequently
[TABLE]
For the remainder, then, we assume
[TABLE]
By the choice of , is a -adic integer for all , and we may rewrite (17) as
[TABLE]
For any and a positive integer , let denote the unique integer between $0$ and such that . For a positive integer , let be the lattice generated by the columns of the matrix
[TABLE]
Define
[TABLE]
Also set
[TABLE]
The following lemma is a restatement of [35, Lemma 5], and provides an opportunity to improve the bound on . (Note that in [35] has the value in our notation.)
Lemma 5.1.
If , then .
In the function p_adic_LLL_bound we have implemented the above analysis. In more detail, the functions log_p and embedding_to_Kp are used to compute the constants up to a given precision. If this precision is , i.e., if the are stored as pairs of integers modulo , we clearly require for the algorithm to be meaningful. However, the shift by requires an additional -adic digits of precision. So the algorithm checks that ; if this fails, then the -adic logarithms are computed to higher precision and the process is repeated.
The function log_p is based on an algorithm of Smart in [37, p. 30]. However, our implementation also resolves a crucial computational problem in the evaluation of that to our knowledge has not been mentioned in the literature. To understand the issue, we must describe carefully what “computing the logarithm” means in a -adic setting. Let us view as a subfield of , and specify , a -basis for . Suppose and set . Necessarily, there exist such that .
As a practical matter, no algorithm can return the true value ; it can only return , an approximation to with very small. In practice however, we require something more specific. We want to find such that is small for each . When splits in , the original algorithm is not guaranteed to do this.
It can happen that at some other prime above , has a negative valuation. Consequently, the sum (3) used to compute will not converge -adically, and the approximations are not guaranteed to be -adically close to the . We resolve this problem by choosing a suitable element with for all , such that also satisfies
[TABLE]
Then it holds that
[TABLE]
By evaluating the difference on the right hand side, these -adic divergence issues are avoided, and we may be sure that the individual coefficients approximate the -adically.
In p_adic_LLL_bound_one_prime, we attempt to find a value such that Lemma 5.1 applies. If successful, we record the improved bound . The improvement offered by Lemma 5.1 depends only on the assumption that is the extremal place, and that some bound on the exponents is known. So we may replace by and attempt to apply Lemma 5.1 again, possibly improving the bound further. Because the application of LLL is very fast compared to the sieving step described in §6, the algorithm repeats this process until no further improvements can be made to . Once each has been optimized in this way, the function p_adic_LLL_bound returns
[TABLE]
5.2. Complex places
We now consider the case where is an infinite complex place. The reduction is quite analogous to the -adic case; again the standard references are [35, 37, 15]. We keep the notations from §4.3. For we define the complex numbers
[TABLE]
As is an infinite place, we have established already that the in
[TABLE]
satisfy the bounds
[TABLE]
We now attempt to use lattice reduction to improve the bound; the choice of lattice and certain constants will depend slightly on whether the are all purely imaginary. So we define
[TABLE]
and define
[TABLE]
If are not all pure imaginary, relabel so that . Now define
[TABLE]
Let be the integer matrix
[TABLE]
(By design, the upper left block of is the identity matrix in case the are all pure imaginary.) Now, let be the lattice generated by the columns of , and suppose is a positive lower bound for . When is an infinite non-real place, we define:
[TABLE]
Similar to [37, Lemma VI.2], we have
Lemma 5.2.
Suppose is a non-real infinite place. With notation as above, suppose is chosen such that . Then .
Proof.
There are two cases to consider, as or . In each case, our goal is to establish the inequality
[TABLE]
for the result follows by isolating in the inequality (19).
If the are not all pure imaginary, we define
[TABLE]
Then note that
[TABLE]
Therefore
[TABLE]
We know from (16) that , so
[TABLE]
Now notice that the vector
[TABLE]
is in the lattice , so . Further,
[TABLE]
which implies (19) and the result follows.
In case the are all pure imaginary, the approach is similar. Set
[TABLE]
Similar to the other case, we have , and therefore . Again applying (16) we obtain
[TABLE]
Now notice that the vector
[TABLE]
is in the lattice , so . Further,
[TABLE]
Again this implies (19). ∎
5.3. Real places
Now suppose that is a real infinite place. Although the arguments in §5.2 apply to , we can obtain a stronger improvement by analyzing this case separately. Replacing by as necessary, we may assume for all with .
The mere existence of a real place forces and . We set and define the real numbers
[TABLE]
As the , we may revisit (15); this time we obtain
[TABLE]
as no adjustments are required to accommodate the branch cut of the logarithm. Set
[TABLE]
and again let be generated by the columns of the matrix in (18). When is an infinite real place, we define:
[TABLE]
Lemma 5.3.
Suppose that is a real infinite place. With notation and definitions as above, suppose is chosen so that . Then .
Proof.
If we define
[TABLE]
then we obtain
[TABLE]
Observing that the vector
[TABLE]
and that for , , the remainder of the proof now follows the logic of Lemma 5.2 exactly. ∎
5.4. Implementation
The function minimal_vector is used in the implementation to compute a value for . In cx_LLL_bound, we have implemented the reduction step for the infinite places applying the above idea. As in the finite case, the parameter is chosen inside the function and changed as necessary to meet the bound (keeping in mind, of course, that the definitions of and depend on the particular place ). Notice that the proofs of Lemmas 5.2 and 5.3 depend on true rounding when computing the coefficients of the matrix . In our implementation, we increase precision until this is assured. Similar to the case where is finite, the improvement of Lemma 5.2 needs only the assumption that is the extremal place and that some bound on the exponents is known. So we apply Lemma 5.2 repeatedly until no further improvement to is possible. Once this has been done for each infinite place, we set
[TABLE]
Consequently, we have the following bound which may be passed to the sieve in the next section.
**Lemma 5.4.**
Assume that for each , a value or exists for which the hypotheses of one of the Lemmas 5.1, 5.2, or 5.3 are met. Then the maximum exponent appearing in any solution of the -unit equation (1) satisfies
[TABLE]
We use the proof as an opportunity to summarize the algorithm, up to the sieving step of the next section.
Proof.
There is nothing to show if the solution set is empty, so let us assume otherwise. We know the -unit equation has only finitely many solutions. Keeping the notation of (6), let be a solution where is maximized. One of the places in , say , is extremal. If is finite, then the work in §4.2 demonstrates by applying one of the corollaries deduced from Yu’s bound. If is infinite, then the work in §4.3 demonstrates by applying the theorem of Baker-Wüstholz. This establishes an absolute bound on .
For each possible , the techniques of this section attempt to replace this absolute bound with a smaller bound. There is no mathematical proof that the lattice reduction techniques will succeed, i.e., that there will exist appropriate values and for which Lemmas 5.1, 5.2, or 5.3 apply. However, when they do exist, the improved bound is provably correct by the same lemmas. Here, such success is presumed for every , and (20) holds. ∎
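The control flow summarized in this proof can be sketched as follows. This is a minimal illustration and not the code of the implementation: `reduce_once` is a hypothetical callable standing in for a single application of Lemma 5.1, 5.2, or 5.3 at a fixed place, the toy reduction steps are invented, and taking the maximum over the places is our reading of the bound (20).

```python
def reduce_until_stable(initial_bound, reduce_once):
    """Apply a reduction step repeatedly until the bound stops improving.

    reduce_once is a hypothetical stand-in for one lattice-reduction step
    at a fixed place; it maps a known bound to a (hopefully smaller) one.
    """
    bound = initial_bound
    while True:
        improved = reduce_once(bound)
        if improved >= bound:  # no further improvement: stop
            return bound
        bound = improved


def final_exponent_bound(initial_bound, reduction_steps):
    """Combine per-place bounds; reduction_steps maps a place label to a step."""
    per_place = {
        place: reduce_until_stable(initial_bound, step)
        for place, step in reduction_steps.items()
    }
    # The extremal place is not known in advance, so the final bound must
    # hold whichever place is extremal: take the maximum (assumed form of (20)).
    return max(per_place.values()), per_place


# Toy reduction steps, invented purely for illustration.
toy_steps = {
    "finite place": lambda b: max(40, b // 10),
    "real place": lambda b: max(7, b // 100),
}
bound, detail = final_exponent_bound(10**30, toy_steps)
```

In this toy run the absolute bound is replaced by a much smaller one at each place, and the weakest of the reduced bounds is the one passed on to the sieve.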
In practice, if the hypotheses of Lemma 5.4 are not established, then one only has the weaker bounds coming from linear forms in logarithms; these are simply too large to allow for a provably complete search. However, the sieve described in the next section can still be used up to any prescribed bound ; it will find all solutions satisfying .
6. Further Reducing the Search Space: Sieving
The approach taken here, for sieving against primes outside of , is based on an algorithm described by Smart in [35]. Smart credits Tzanakis and de Weger with this approach [41]; Tzanakis reports that these ideas date back to Andrew Bremner.
6.1. Setup for the sieve
Recalling the notations of §2.5, we define for any ,
[TABLE]
This finite set will provide a useful search space for exponent vectors in a way we will make more precise below. There is an obvious surjective map . Despite the fact that this map is the identity (and not a reduction map) in the [math]th coordinate, we will refer to this as the reduction modulo map, and call an element an exponent vector modulo .
Let . The exponent vector for (relative to ) is . That is, it is the unique such that . Given any bound for the exponent vector of a , we obtain a finite subset of that contains every solution of the -unit equation. Unfortunately, this is usually still too large a search space to be practical (see §7), so we must sieve this finite set (or rather, the equivalent finite set of exponent vectors) prior to the exhaustive search. The sieve attempts to provide an efficient solution to the following problem:
**Problem 6.1.**
Find a small set satisfying .
If we can find a small enough superset in a fast enough way, the -unit equation solutions can then be found by brute force search over .
Suppose . We call a complement vector for if . If a complement vector exists, it must be unique; the existence of a complement vector is equivalent to , and a pair of complement exponent vectors correspond to a solution of the -unit equation.
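To make the notion of a complement vector concrete, here is a small sketch over the rational numbers. The setting is an assumption chosen for illustration only: the finite places are 2 and 3, an exponent vector (a0, a1, a2) stands for the unit (-1)^a0 * 2^a1 * 3^a2, and all function names are hypothetical.

```python
from fractions import Fraction

S_PRIMES = (2, 3)  # finite places of the toy set S (an assumption)


def unit_from_exponents(vec):
    """Map (a0, a1, a2) to the S-unit (-1)^a0 * 2^a1 * 3^a2."""
    a0, a1, a2 = vec
    return Fraction(-1) ** a0 * Fraction(2) ** a1 * Fraction(3) ** a2


def exponents_of(x):
    """Return the exponent vector of x, or None if x is not an S-unit."""
    if x == 0:
        return None
    exps = [1 if x < 0 else 0]
    num, den = abs(x.numerator), x.denominator
    for p in S_PRIMES:
        e = 0
        while num % p == 0:
            num //= p
            e += 1
        while den % p == 0:
            den //= p
            e -= 1
        exps.append(e)
    # Anything left over is a prime factor outside S.
    return tuple(exps) if num == 1 and den == 1 else None


def complement_vector(vec):
    """Exponent vector of 1 - x if 1 - x is again an S-unit, else None."""
    return exponents_of(1 - unit_from_exponents(vec))
```

For instance, the vector (0, -3, 2) encodes 9/8, whose complement -1/8 has exponent vector (1, -3, 0), while (0, 2, 1) encodes 12 and has no complement vector because 1 - 12 = -11 is not an S-unit.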
Suppose is a prime number. We say avoids if for all ideals . If splits completely in , then there are prime ideals above in , say . We let denote the residue field of . Since is completely split, we of course have for all .
Suppose , and is a rational prime number which splits completely in and which avoids . The residue field vector for (with respect to ) is
[TABLE]
where is the reduction of modulo . The residue field vector depends on the ordering of the primes above ; we fix one ordering once and for all whenever we consider residue field vectors with respect to .
Notice that we have the following commutative diagram, whose horizontal rows are exact.
[TABLE]
Suppose . Since any two lifts , of to differ by a multiple of , we see that and differ by a perfect th power, and so determine the same residue field vector. In other words, the dashed arrow in the diagram corresponds to a well-defined map , and so the notion of a residue field vector for is well-defined. With this in mind, we abuse notation slightly and also write , where is any lift of .
**Lemma 6.1.**
Suppose and set . Then
- (a) .
- (b) .
- (c) no entry of is .
Proof.
Since , it follows that for any , , verifying (a). As avoids , for every . This proves (b). Since (b) holds for both and , (c) follows from (a). ∎
Suppose is an exponent vector modulo ; i.e., . We call a -complement vector for if
[TABLE]
Existence of a -complement vector is a necessary, but not sufficient, condition for to lift to the exponent vector of a unit in a solution to the -unit equation. Further, any particular may have more than one -complement vector associated to it. We set
[TABLE]
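The following sketch illustrates complement vectors modulo a prime in the simplest possible setting. Everything specific here is an assumption: the field is the rationals, the finite places are 2 and 3, the sieving prime is q = 5, exponent vectors (a0, a1, a2) stand for (-1)^a0 * 2^a1 * 3^a2, and the complement condition is taken to be that the two residues sum to 1 modulo q. The function names are hypothetical.

```python
from itertools import product


def residue(vec, q):
    """Residue mod q of (-1)^a0 * 2^a1 * 3^a2; exponents are read mod q - 1."""
    a0, a1, a2 = vec
    sign = -1 if a0 % 2 else 1
    return (sign * pow(2, a1 % (q - 1), q) * pow(3, a2 % (q - 1), q)) % q


def vectors_mod(q):
    """All exponent vectors modulo q - 1 (the sign coordinate is kept mod 2)."""
    return list(product(range(2), range(q - 1), range(q - 1)))


def q_complements(vec, q):
    """All vectors whose residue adds up to 1 with the residue of vec."""
    target = (1 - residue(vec, q)) % q
    return [v for v in vectors_mod(q) if residue(v, q) == target]


def vectors_with_complement(q):
    """Vectors modulo q - 1 that admit at least one complement modulo q."""
    return [v for v in vectors_mod(q) if q_complements(v, q)]
```

With q = 5 there are 32 vectors modulo q - 1. The 8 vectors with residue 1 have no complement, since their complement would need residue 0, and each remaining vector has 8 complements, which shows that complements modulo q are far from unique.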
6.2. Execution of the sieve
The strategy for the sieve is to play the sets off of one another for multiple values of . Choose a finite list of rational prime numbers
[TABLE]
each of which splits completely in and avoids , and such that
[TABLE]
Any true solution to the -unit equation corresponds to exponent vectors found in the set , and such vectors must reduce modulo to vectors in for each . Conversely, given a choice for each , there is at most one vector such that for each , while also satisfying . Define to be the product of the maps :
[TABLE]
Certainly we have
[TABLE]
Because lifts from to are unique when they exist, provides a reasonable proxy for the search space. We seek to replace each with a subset such that we still have
[TABLE]
Suppose are distinct primes in , and suppose , . We say and are compatible if there exists such that and . Notice that for any , an element reduces modulo and to produce a compatible pair of exponent vectors.
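Compatibility can be tested without searching for a common lift. The sketch below assumes exponent vectors whose first coordinate is a torsion exponent (kept unchanged by reduction) and whose remaining coordinates are reduced modulo q - 1; it ignores any size constraint on the lift, and the function names are hypothetical. The criterion used is the standard one for simultaneous congruences: a common lift exists exactly when the two residues agree modulo gcd(q - 1, q' - 1).

```python
from math import gcd


def compatible(a, q, b, r):
    """Decide whether a (mod q - 1) and b (mod r - 1) share an integer lift."""
    g = gcd(q - 1, r - 1)
    same_torsion = a[0] == b[0]
    return same_torsion and all((x - y) % g == 0 for x, y in zip(a[1:], b[1:]))


def common_lift(a, q, b, r):
    """Brute-force search for a common lift, used here only as a cross-check."""
    if a[0] != b[0]:
        return None
    m, n = q - 1, r - 1
    period = m * n // gcd(m, n)  # least common multiple of the two moduli
    lift = [a[0]]
    for x, y in zip(a[1:], b[1:]):
        hits = [z for z in range(period) if z % m == x and z % n == y]
        if not hits:
            return None
        lift.append(hits[0])
    return tuple(lift)
```

For q = 5 and q' = 7 the moduli are 4 and 6, so two vectors are compatible exactly when their torsion coordinates agree and the other coordinates have the same parity.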
When and are compatible, we further call the pair complement compatible if there exist and such that
- is -complementary to ,
- is -complementary to ,
- and are compatible.
**Lemma 6.2.**
Suppose the sets satisfy condition (21). Further, suppose , and set
[TABLE]
If there exists such that contains no vectors which are complement compatible to , then
[TABLE]
In other words, under the given condition, we will lose no true solutions by removing from .
Proof.
Towards a contradiction, suppose satisfies
[TABLE]
There is a unique satisfying . Set
[TABLE]
Then and are compatible by definition. But since and cannot be complement compatible, the vectors and cannot be compatible. This is impossible, since . Thus, no such exists and the claim holds. ∎
The algorithm based on this lemma is the following.
**Algorithm 6.2 (Sieve).**
Assume that , are fixed and a representation of has been computed.
**INPUT:**
**OUTPUT:** satisfying (21).
1. Set for each .
2. Loop over :
   - (a) Loop over :
     - i. If contains no -complement vector for , remove from .
     - ii. Loop over : if there are no which are complement compatible with , then remove from .
3. Did Step 2 remove any elements from any set ?
   - If YES, return to Step 2.
   - If NO, then STOP.
Once the sieve has been completed, we may find all solutions to the -unit equation by doing an exhaustive search over .
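The overall shape of "sieve, then exhaust" can be seen in a deliberately simplified sketch. It is not Algorithm 6.2: it works over the rationals with finite places 2 and 3, uses the sieving primes 5 and 7, assumes the exponent bound 3, and applies only the necessary condition that a complement exists modulo each sieving prime, omitting the complement-compatibility pruning. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

S_PRIMES = (2, 3)      # finite places of the toy set S (an assumption)
SIEVE_PRIMES = (5, 7)  # primes outside S used for sieving (an assumption)
BOUND = 3              # assumed bound on the absolute value of exponents


def unit_from_exponents(vec):
    """Map (a0, a1, a2) to the S-unit (-1)^a0 * 2^a1 * 3^a2."""
    a0, a1, a2 = vec
    return Fraction(-1) ** a0 * Fraction(2) ** a1 * Fraction(3) ** a2


def is_s_unit(x):
    """True if x is nonzero and divisible only by primes in S_PRIMES."""
    if x == 0:
        return False
    num, den = abs(x.numerator), x.denominator
    for p in S_PRIMES:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1


def residue(vec, q):
    """Residue mod q of the unit encoded by vec (exponents read mod q - 1)."""
    a0, a1, a2 = vec
    sign = -1 if a0 % 2 else 1
    return (sign * pow(2, a1 % (q - 1), q) * pow(3, a2 % (q - 1), q)) % q


# Residues mod q attained by some S-unit; 0 is never attained.
ATTAINABLE = {
    q: {residue(v, q) for v in product(range(2), range(q - 1), range(q - 1))}
    for q in SIEVE_PRIMES
}


def passes_sieve(vec):
    """Necessary condition: 1 - x must look like an S-unit modulo every q."""
    return all((1 - residue(vec, q)) % q in ATTAINABLE[q] for q in SIEVE_PRIMES)


box = list(product(range(2), range(-BOUND, BOUND + 1), range(-BOUND, BOUND + 1)))
survivors = [v for v in box if passes_sieve(v)]

# Exhaustive search over the sieved set: keep x whenever 1 - x is an S-unit.
solutions = sorted(
    (x, 1 - x) for x in map(unit_from_exponents, survivors) if is_s_unit(1 - x)
)
```

Even this weak filter discards part of the box before the exact test is run; the pairwise pruning of Algorithm 6.2 is what makes the approach effective for larger bounds and ranks.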
7. Experimental Observations and Computational Choices
In developing this code and in pursuit of applications, we have computed a very large number of examples. Some observations and discussion may be enlightening to a reader who wishes to solve the -unit equation for their own application.
Our implementation provides the function sieve_below_bound(K, S, B), which returns all solutions to the -unit equation in up to a specified bound (the maximum absolute value of an entry in an exponent vector). This may be useful in settings where an exhaustive list of solutions is not needed. For example, in the field with , and , the provable LLL-reduced exponent bound is . However, all solutions actually satisfy the exponent bound , and the command sieve_below_bound(K, S, 5) executes in under seconds.
7.1. Sieving vs. simple exhaustion
Once a bound has been reduced as much as possible by LLL, this search space must be somehow exhausted. This general problem can be solved in multiple ways. Those appearing in the literature can be generally described by the following three ideas:
- (1) simple (non-number theory-based) exhaustion,
- (2) sieve by reducing the problem modulo primes not in , and
- (3) sieve by reducing the problem modulo powers of primes in .
Idea (1) could be looked at through the more general lens of efficient programming, and a good programmer may be able to develop their own code to exhaust the search space effectively. The current implementation uses idea (2), which is inspired by Smart’s earlier exposition in [35] and is described here in Section 6. Idea (3) paraphrases an interesting approach which is first due to de Weger in the case [12] and which was generalized to arbitrary number fields for -unit equations arising from Thue and Thue-Mahler equations by Tzanakis and de Weger [40, 41, 42]. Wildanger [43] and Smart [38] worked out the details of the full generalization, which was later simplified by Evertse and Győry [15]. This is an extremely promising and potentially effective method of reducing the search space, and has been implemented recently in special cases by several people, including Koutsianas [23], Bennett, Gherga, and Rechnitzer [6], von Känel and Matschke [30], and others. Future work will certainly focus on including this sieving technique for our functions.
In all these methods, we begin with the same search space as in (1), and the computational complexity of a brute force search is easy to estimate. Let be a bound for the maximum absolute value of an exponent in a solution to the -unit equation. Since we are searching for a pair , the size of our search space is given by
[TABLE]
Thus a naïve brute force search has complexity . In practice, a simple exhaustive search can be carried out by checking, for each element of , whether is an -unit. Assuming this check has constant time for a fixed and , we get the less extreme complexity of .
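The gap between the two counts is easy to tabulate. The sketch below rests on an assumption about the shape of the search space, since the displayed formula is not reproduced above: an exponent vector is taken to have one torsion coordinate with `w` possible values and `t` free coordinates, each between -B and B. The names `w`, `t`, and both functions are hypothetical.

```python
def count_exponent_vectors(w, t, bound):
    """Number of exponent vectors with t free coordinates in [-bound, bound]."""
    return w * (2 * bound + 1) ** t


def count_pairs(w, t, bound):
    """Number of ordered pairs of exponent vectors (the naive search space)."""
    return count_exponent_vectors(w, t, bound) ** 2


# Growth of both counts for a toy unit group with w = 2 and t = 2.
table = [
    (B, count_exponent_vectors(2, 2, B), count_pairs(2, 2, B))
    for B in (3, 10, 30, 100)
]
```

Under this assumption, searching over single vectors and testing whether the complement is a unit is quadratically cheaper than searching over pairs, which is the saving described above.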
In carrying out computations, we find that the resources required to sieve a search space vary greatly, even for number fields of the same degree and -unit groups of the same rank. For example, we give the run time for three fields , where is the set of primes above in , in Table 7.1. The column gives the total number of distinct solutions found. In each case, the LLL-reduced bound is below , so complete sets of solutions are found in each case. Computations were performed in a paid account on the CoCalc platform in late 2018.
The resources required depend on the size of the search space, but also can vary greatly based on the particular list of primes chosen for the sieve, and even the order of those primes! In many cases, the sieve greatly reduces the time required to exhaust the space. In others, a brute force search of the reduced search space can actually be a better choice, as the sieving computation can take a mysteriously long time. Finding a way to understand and predict these difficulties is a priority for future work. The implementation of idea (3) could also make this unnecessary. In all cases, it is worthwhile to find the smallest reduced bound possible, whether as input for the built-in sieve or for use in a brute force search.
7.2. Finite place vs. infinite place bounds
In general, we find that the LLL-reduced bounds corresponding to infinite are smaller than the bounds for finite. To illustrate this, let be the set of number fields satisfying
[TABLE]
If , we set
[TABLE]
For any choice of and where , we have computed the LLL-reduced bounds under the assumption that is finite and under the assumption that is infinite. Complete bound data is available by email request to authors Malmskog or Rasmussen. Here we will consider only the case . Now, let and be the bounds obtained in §5 under the assumption that is a finite or infinite place, respectively. In Figure 1, we plot both and against the root discriminant of (which ranges from to in .) The bound usually exceeds , on average by a factor of .
Because the disparity between these bounds is so large, we would prefer to use . Generally, we have no control over whether is finite or infinite. However, if contains only one finite place, a small trick allows us to use . If is a solution to the -unit equation, note that and are also -unit equation solutions. We define the solution cycle of to be
[TABLE]
The following result is a restatement of [26, Lemma 6.3].
**Lemma 7.1.**
Let be a number field, and suppose is a finite set of places of containing all infinite places and at most one finite place, (i.e. ). Let be a solution to the -unit equation over . Then at least one element of belongs to a solution with corresponding to an infinite place.
This implies that under the hypothesis of the lemma, some representative of each solution cycle has an exponent vector bounded by ; recovering the entire solution cycle from one representative is trivial. Thus, we can determine all solutions to the -unit equation.
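Recovering a solution cycle from one representative is a one-line computation. The sketch below assumes the cycle consists of the three solutions obtained by dividing the relation x + y = 1 by x or by y; the ordering inside each pair and the function name are our own choices and may differ from the displayed definition.

```python
from fractions import Fraction


def solution_cycle(x, y):
    """Return three solutions of u + v = 1 generated by the solution (x, y)."""
    if x + y != 1:
        raise ValueError("expected a solution of x + y = 1")
    # Dividing x + y = 1 by x gives 1/x + (-y/x) = 1; dividing by y is similar.
    return [(x, y), (1 / x, -y / x), (1 / y, -x / y)]


cycle = solution_cycle(Fraction(9, 8), Fraction(-1, 8))
```

Each entry is again a pair of units for the same set of places, because the new coordinates are quotients of the old ones up to sign.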
It may seem that the hypothesis of Lemma 7.1 – that there is only one finite place in – is a rather specialized condition. However, many interesting arithmetic applications involve searching for objects with “good” behavior away from one prime . In such cases, we take . Should ramify in , the condition is equivalent to being totally ramified, and this is not so uncommon when is small. Here, with , the lemma applies for of the number fields in .
To illustrate the utility of Lemma 7.1, consider the ratio of the sizes of the search spaces for two bounds and , given by
[TABLE]
This quantifies the potential savings when the better bound may be used. For , Figure 2 plots the savings against the root discriminant of for the fields in for which .
8. Applications
A major application of solving -unit equations is in enumerating solutions to Shafarevich-type problems, for example finding complete lists of curves of a given type with particular reduction properties. The blueprint for this implementation came from Smart’s 1997 enumeration of all genus 2 curves over with good reduction away from [36], building off earlier work with Merriman [27]. In 2017, Malmskog and Rasmussen used these methods to determine all Picard curves defined over with good reduction away from [26]. The same year, Koutsianas produced a new algorithm that uses solutions to the -unit equation to find all elliptic curves over an arbitrary number field having good reduction outside [23]. In the remainder of this article, we provide some new applications of the implementation.
8.1. Asymptotic Fermat
Let be a number field. We consider the nontrivial solutions to the Fermat equation:
[TABLE]
For fixed , it follows from the work of Faltings that is finite, but it is reasonable to ask whether is finite or infinite. Finiteness is equivalent to the condition that for sufficiently large . We say satisfies asymptotic Fermat if there exists a bound such that implies .
There are several number fields known to satisfy asymptotic Fermat: Jarvis-Meekin [22] demonstrate that satisfies asymptotic Fermat with . Freitas-Siksek give an explicit family of real quadratic fields of density which satisfy asymptotic Fermat. They also report that the real quartic field, satisfies asymptotic Fermat.
In [16], Freitas and Siksek find a condition on a totally real field which guarantees that satisfies asymptotic Fermat. For the remainder, suppose is totally real. Define
[TABLE]
**Theorem 8.1 (Freitas-Siksek).**
Let be a totally real number field, with either odd or nonempty. Suppose that for every solution to the -unit equation, there is some such that . Then satisfies asymptotic Fermat.
**Remark 8.1.**
We note that Freitas-Siksek’s result is actually stronger, and they provide additional conditions under which must satisfy asymptotic Fermat. Also, more recent work of Şengün-Siksek [33] provides similar criteria for arbitrary number fields. However, the above formulation is sufficient for our application.
The reader may recall that Wiles’s classic proof of Fermat’s Last Theorem proceeds by taking a hypothetical solution and noting that the associated Frey elliptic curve is forced to satisfy an impossible set of constraints (that the curve is not modular). Freitas and Siksek’s approach is similar. Given a solution to over , they produce an elliptic curve (related to, but distinct from, the Frey curve) whose -invariant is arithmetically constrained. However, the -invariant is determined by the -invariants of ; these are guaranteed to arise as solutions to the -unit equation over . The result above follows from a delicate analysis of how these constraints interact.
We report a new list of cubic number fields which satisfy asymptotic Fermat. Using the implementation of the algorithm described in this paper, we find all solutions to the -unit equation ( as above), and verify the condition of Freitas-Siksek (this last step is trivial once all solutions have been determined).
Let denote the set of totally real cubic number fields in which is totally ramified and which have absolute discriminant satisfying . Table 8.1 lists all the fields of for . For each , we solved the appropriate -unit equation, and by applying Theorem 8.1, verified that satisfies asymptotic Fermat. Our results are not effective, as Theorem 8.1 does not provide the bound .
For each , denotes a minimal polynomial for ; is the absolute discriminant of . Because is totally ramified, Lemma 7.1 guarantees that every solution cycle will contain a solution with the extremal place infinite. Consequently, each solution cycle will contain at least one solution satisfying
[TABLE]
(Finding the remaining solutions in the solution cycle is trivial even if they do not satisfy this bound.)
Finally, indicates the number of distinct solutions to the -unit equation found. (These are unordered solutions, so that and are not considered distinct.) The reader should note that the two trivial solutions over , and , are counted in each field .
8.2. Cubic Ramanujan-Nagell equations
In 1913, Ramanujan conjectured that the only solutions of the Diophantine equation over the natural numbers satisfy [31]. This was settled in 1948 by Nagell [28]. The more general family of equations,
[TABLE]
are called Ramanujan-Nagell equations, and the literature for solving such equations is very rich (see for example [10, 9, 11, 7]). Very recently, cubic Ramanujan-Nagell equations have attracted the attention of mathematicians [5]. These are equations of the form
[TABLE]
We consider the particular example
[TABLE]
If , a more general version of (22) is solved in [5]. Here, we prove the following theorem.
**Theorem 8.2.**
Let be a prime with . All integer solutions of the cubic Ramanujan-Nagell equation (22) with are listed in Table 8.2.
Our method also works for the equation , where are different odd primes, and the proof is similar to the case .
Proof.
Let be the splitting field for . We observe is unramified outside . In fact, has class number and is totally ramified at . Let be the unique prime in above . Let be the set of all places of above , , or .
Suppose is a solution to (22). Let be a root of , and let denote a primitive cube root of unity. Define
[TABLE]
Then we must have and . For ,
[TABLE]
Since , we see for each . Also, it follows that the are pairwise coprime. Thus, if , then exactly one is divisible by , and . Now fix so that for at least one . Choose and set
[TABLE]
Then is a solution to the -unit equation and for some . Choose a root of unity and a basis for the torsion-free part of . Choose such that
[TABLE]
There exists such that . Define
[TABLE]
and set . By design,
[TABLE]
With these bounds established, the solutions to (22) may now be determined by exhaustion. ∎
As a final remark, we observe that we may choose so that . Let be the prime ideals in above . As is a PID, we may choose such that . Let generate the torsion-free part of . The choice now gives .
References
- [1] A. Baker. Linear forms in the logarithms of algebraic numbers. I, II, III. Mathematika, 13:204–216, 1966; ibid., 14:102–107, 1967; ibid., 14:220–228, 1967.
- [2] A. Baker and H. Davenport. The equations $3x^{2}-2=y^{2}$ and $8x^{2}-7=z^{2}$. Quart. J. Math. Oxford, 20(2):129–137, 1969.
- [3] A. Baker and G. Wüstholz. Logarithmic forms and group varieties. J. Reine Angew. Math., 442:19–62, 1993.
- [4] A. Baker and G. Wüstholz. Logarithmic forms and Diophantine geometry, volume 9 of New Mathematical Monographs. Cambridge University Press, Cambridge, 2007.
- [5] M. Bauer and M. A. Bennett. Ramanujan-Nagell cubics. Rocky Mountain J. Math., 48(2):385–412, 2018.
- [6] M. A. Bennett, A. Gherga, and A. Rechnitzer. Computing elliptic curves over $\mathbb{Q}$. Math. Comp., 88(317):1341–1390, 2019.
- [7] M. A. Bennett and C. M. Skinner. Ternary Diophantine equations via Galois representations and modular forms. Canad. J. Math., 56(1):23–54, 2004.
- [8] A. Brumer. On the units of algebraic number fields. Mathematika, 14:121–124, 1967.
