A robust implementation for solving the $S$-unit equation and several applications

Alejandra Alvarado; Angelos Koutsianas; Beth Malmskog; Christopher Rasmussen; Christelle Vincent; Mckenzie West

arXiv:1903.00977·math.NT·July 10, 2020

Open Access

TL;DR

This paper introduces a new implementation in SageMath for solving the $S$-unit equation, enabling extensive computations that lead to applications such as an asymptotic Fermat's Last Theorem for certain cubic fields and solutions to Ramanujan-Nagell equations.

Contribution

The paper provides a robust implementation for solving the $S$-unit equation in SageMath, with mathematical foundations and extensive computational results for various number fields.

Findings

01. Exponent bounds for solutions in fields of small degree and small sets $S$

02. Proof of an asymptotic Fermat's Last Theorem in specific cubic fields

03. Complete solutions to certain Ramanujan-Nagell equations

Abstract

Let $K$ be a number field, and $S$ a finite set of places in $K$ containing all infinite places. We present an implementation for solving the $S$-unit equation $x + y = 1$, $x, y \in O_{K, S}^{\times}$ in the computer algebra package SageMath. This paper outlines the mathematical basis for the implementation. We discuss and reference the results of extensive computations, including exponent bounds for solutions in many fields of small degree for small sets $S$. As an application, we prove an asymptotic version of Fermat's Last Theorem for totally real cubic number fields with bounded discriminant where 2 is totally ramified. In addition, we use the implementation to find all solutions to some cubic Ramanujan-Nagell equations.

Tables (5)

Table 1. Table 3.1. The constants $a^{(1)}$ and $\kappa_{1}$

Case	$a^{(1)}$	$κ_{1}$
$p = 2$	$32$	$40$
$p = 3$	$16$	$20$
$p > 3$ and $e_{𝔭} \geq 2$	$16$	$20$
$p > 3$ and $e_{𝔭} = 1$	$\frac{8 (p - 1)}{p - 2}$	$10$

Table 2. Table 3.2. The constant $c^{(1)}$

$p \leq 5$		$p > 5$
Case	$c^{(1)}$	Case	$c^{(1)}$
$p = 2$	$160$	$p \equiv 1 (4)$ and $e_{𝔭} = 1$	$1473$
$p = 3$ and $d_{K} = 1$	$537$	$p \equiv 1 (4)$ and $e_{𝔭} \geq 2$	$1502$
$p = 3$ and $d_{K} \geq 2$	$759$	$p \equiv 3 (4)$ , $e_{𝔭} = 1$ , $d_{K} = 1$	$1288$
$p = 5$ and $e_{𝔭} = 1$	$1473$	$p \equiv 3 (4)$ , $e_{𝔭} = 1$ , $d_{K} \geq 2$	$1282$
$p = 5$ and $e_{𝔭} \geq 2$	$319$	$p \equiv 3 (4)$ , $e_{𝔭} \geq 2$	$2190$

Table 3. Table 7.1. Runtimes for sieve_below_bound(K, S, 40)

$g (x)$	$t$	$w$	$N$	Runtime (in seconds)
$x^{4} - x^{2} + 1$	$2$	$12$	16	01.16
$x^{4} + 9$	$2$	$4$	0	02.06
$x^{4} + 12 x^{2} + 18$	$2$	$2$	0	64.00

Table 4. Table 8.1. Fields in $\mathscr{K}_{2000}$ and number of $S$-unit equation solutions

$f_{K}$	$Δ_{K}$	$K_{1}^{LLL}$	$N (S, K)$
$x^{3} - x^{2} - 3 x + 1$	$2^{2} \cdot 37$	$225$	$53$
$x^{3} - x^{2} - 5 x - 1$	$2^{2} \cdot 101$	$175$	$11$
$x^{3} - x^{2} - 5 x + 3$	$2^{2} \cdot 3 \cdot 47$	$156$	$5$
$x^{3} - 6 x - 2$	$2^{2} \cdot 3^{3} \cdot 7$	$161$	$5$
$x^{3} - x^{2} - 7 x - 3$	$2^{2} \cdot 197$	$156$	$8$
$x^{3} - 8 x - 6$	$2^{2} \cdot 269$	$176$	$8$
$x^{3} - 10 x - 10$	$2^{2} \cdot 5^{2} \cdot 13$	$156$	$8$
$x^{3} - x^{2} - 7 x + 5$	$2^{2} \cdot 349$	$199$	$8$
$x^{3} - x^{2} - 9 x - 5$	$2^{2} \cdot 373$	$162$	$8$
$x^{3} - x^{2} - 7 x + 1$	$2^{2} \cdot 3 \cdot 127$	$180$	$2$
$x^{3} - x^{2} - 9 x + 11$	$2^{2} \cdot 389$	$198$	$8$
$x^{3} - 12 x - 14$	$2^{2} \cdot 3^{4} \cdot 5$	$164$	$2$
$x^{3} - 8 x - 2$	$2^{2} \cdot 5 \cdot 97$	$176$	$5$

Table 5. Table 8.2. Solutions to (22) with $3<q\leq 500$.

$q$	$x$	$k$	$n$	$q$	$x$	$k$	$n$
$11$	$2$	$1$	$1$	$73$	$4$	$2$	$1$
$17$	$- 4$	$4$	$1$	$89$	$2$	$4$	$1$
$17$	$2$	$2$	$1$	$179$	$- 4$	$5$	$1$
$19$	$- 2$	$3$	$1$	$251$	$2$	$5$	$1$
$67$	$4$	$1$	$1$	$307$	$4$	$5$	$1$
$73$	$- 2$	$4$	$1$

Equations (296)

$$F(x,y)=c,$$

$$x+y=1,\qquad x,y\in\mathscr{O}_{K,S}^{\times}.$$

$$[K:\mathbb{Q}]\leq 5,\qquad S\subseteq\{\mathfrak{p}\subseteq\mathscr{O}_{K}:\mathfrak{p}\mid 6\}.$$

$$|\alpha|_{\mathfrak{p}}:=\begin{cases}|\sigma_{\mathfrak{p}}(\alpha)| & \mathfrak{p}\text{ is real,}\\ |\sigma_{\mathfrak{p}}(\alpha)|^{2} & \mathfrak{p}\text{ is complex.}\end{cases}$$

$$\operatorname{ord}_{\mathfrak{p}}(\beta)=m\quad\text{if }\beta\in\mathfrak{p}^{m}-\mathfrak{p}^{m+1},$$

$$|\alpha|_{\mathfrak{p}}:=p^{-f_{\mathfrak{p}}\operatorname{ord}_{\mathfrak{p}}(\alpha)}.$$

$$|\alpha|_{p}=p^{-\operatorname{ord}_{p}\alpha},\qquad\alpha\in\mathbb{C}_{p}^{\times}.$$

$$\operatorname{ord}_{\mathfrak{p}}\beta=e_{\mathfrak{p}}\operatorname{ord}_{p}\beta,\qquad\beta\in K_{\mathfrak{p}}.$$

$$h(\mathbf{x})=\frac{1}{d_{K}}\sum_{\mathfrak{p}}\log\Bigl(\max_{j}\{|x_{j}|_{\mathfrak{p}}\}\Bigr),$$

$$h^{\prime}(\beta)=\frac{1}{d_{K^{\prime}}}\max\{d_{K^{\prime}}\cdot h(\beta),\ |\log\beta|,\ 1\}.$$

$$\Delta_{1}:=\{z\in\mathbb{C}_{p}:|z-1|_{p}<1\}.$$

$$\log_{p}z=-\sum_{n\geq 1}\frac{(1-z)^{n}}{n}.$$

$$\log_{p}(xy)=\log_{p}x+\log_{p}y.$$

$$\operatorname{ord}_{p}(\log_{p}(1+z))=\operatorname{ord}_{p}z.$$

$$\Phi_{\bm{\uprho}}\colon A_{K,S}\longrightarrow\mathscr{O}_{K,S}^{\times},\qquad\mathbf{a}:=(a_{0},a_{1},\dots,a_{t})\mapsto\prod_{i=0}^{t}\rho_{i}^{a_{i}}.$$

$$\max_{0<i\leq t}|a_{i}|\leq B.$$

$$X_{K,S}:=\{\tau\in\mathscr{O}_{K,S}^{\times}:1-\tau\in\mathscr{O}_{K,S}^{\times}\}.$$

$$\tau_{i}={\bm{\uprho}}^{\mathbf{b}_{i}}=\prod_{j=0}^{t}\rho_{j}^{b_{i,j}},\qquad i=1,2.$$

$$\operatorname{ord}_{\mathfrak{p}}\mu_{j}=0,\qquad 0\leq j\leq n-1.$$

$$B\geq\max\{|b_{0}|,\dots,|b_{n-1}|,3\}.$$

$$\operatorname{ord}_{\mathfrak{p}}(\Theta-1)<C_{1}^{*}\,\Omega\,\log B.$$

$$h_{\mathfrak{p}}(\mu):=\max\Bigl\{h(\mu),\ \frac{f_{\mathfrak{p}}}{\kappa_{1}(n+4)d_{K}}\Bigr\}.$$

$$\Omega(n,d_{K},\mathfrak{p}):=\prod_{\mu\in\mathfrak{a}}h(\mu)\cdot\prod_{\mu\in\mathfrak{b}}h_{\mathfrak{p}}(\mu).$$

$$\Omega^{\prime}(n,d_{K},\mathfrak{p}):=h_{\mathfrak{p}}(\mu_{0})\prod_{j=1}^{n-1}h(\mu_{j}).$$

$k_{2}$, $k_{3}$, $k_{4}$

$$C_{1}^{*}(n,d_{K},\mathfrak{p}):=(n+1)\,k_{2}\,k_{3}\,k_{4}.$$

$$K^{\prime}:=\begin{cases}K(\zeta_{4}) & q=2,\\ K(\zeta_{3}) & q=3.\end{cases}$$

$$\operatorname{ord}_{\mathfrak{p}}\alpha=e_{\mathfrak{P}\mid\mathfrak{p}}\operatorname{ord}_{\mathfrak{P}}\alpha,\qquad\alpha\in K^{\times},$$

$$\operatorname{ord}_{\mathfrak{p}}(\Theta-1)<e_{\mathfrak{P}\mid\mathfrak{p}}\cdot C_{1}^{*}(n,d_{K^{\prime}},\mathfrak{P})\cdot\Omega^{\prime}(n,d_{K^{\prime}},\mathfrak{P})\cdot\log B.$$

Taxonomy

Topics: Analytic Number Theory Research · Algebraic Geometry and Number Theory · Advanced Mathematical Identities

Full text

A robust implementation for solving the $S$ -unit equation and several applications

Alejandra Alvarado, Department of Mathematics and Computer Science, Eastern Illinois University

Angelos Koutsianas, Department of Mathematics, University of British Columbia

Beth Malmskog, Department of Mathematics and Computer Science, Colorado College

Christopher Rasmussen, Department of Mathematics and Computer Science, Wesleyan University

Christelle Vincent, Department of Mathematics and Statistics, University of Vermont

Mckenzie West, Department of Mathematics, University of Wisconsin Eau Claire

Abstract.

Let $K$ be a number field, and $S$ a finite set of places in $K$ containing all infinite places. We present an implementation for solving the $S$-unit equation $x+y=1$, $x,y\in\mathscr{O}_{K,S}^{\times}$ in the computer algebra package SageMath. This paper outlines the mathematical basis for the implementation. We discuss and reference the results of extensive computations, including exponent bounds for solutions in many fields of small degree for small sets $S$. As an application, we prove an asymptotic version of Fermat’s Last Theorem for totally real cubic number fields with bounded discriminant where 2 is totally ramified. In addition, we use the implementation to find all solutions to some cubic Ramanujan-Nagell equations.

1. Introduction

In 1909, Thue proved there are only finitely many integral solutions to what we now call the Thue equation; i.e., that for any $\mathbb{Q}$-irreducible binary form $F(X,Y)$ of degree at least 3, defined over the integers, there are only finitely many solutions $(x,y)\in\mathbb{Z}^{2}$ to the equation

$$F(x,y)=c,$$

where $c$ is any non-zero integer [39]. Thue accomplished this by formally factoring $F$ into linear terms of the form $(x-\alpha y)$, where $\alpha$ is algebraic, then bounding the quality of rational approximations of $\alpha$ in terms of the size of $x$ and $y$. Thus bounds on integer solutions to the Thue equation arose out of the theory of approximating algebraic numbers by rationals. Thue’s theorem was generalized by Siegel [34] (see also the recent translation [17] by Fuchs) and then Mahler [25]. These generalizations gave rise to a central fact of modern computational number theory: if $K$ is a number field, and $S$ a finite list of places of $K$ including all infinite places, then there are only finitely many solutions $(x,y)$ to the equation

$$x+y=1,\qquad x,y\in\mathscr{O}_{K,S}^{\times}. \tag{1}$$

Here, $\mathscr{O}_{K,S}^{\times}$ is the unit group of the ring $\mathscr{O}_{K,S}$ of $S$-integers in $K$. We refer to (1) as the $S$-unit equation. In this paper, we describe an algorithm to determine the complete set of solutions to the $S$-unit equation for general $K$ and $S$. More generally, for fixed $a,b\in\mathscr{O}_{K,S}$, we can see that the equation $ax+by=1$ will also have only finitely many solutions by expanding the set $S$ to include all primes dividing $a$ and $b$ and searching for solutions to (1). Thus it suffices to solve (1) to address the more general case, and we focus on (1) here (though it should be remarked that this is not the most efficient way to solve $ax+by=1$).
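To make equation (1) concrete, here is a minimal brute-force sketch for the simplest case $K=\mathbb{Q}$ with finite primes $2$ and $3$ in $S$. It is not the algorithm of this paper: the function names and the hand-picked exponent cutoff `bound` are illustrative assumptions, and a cutoff chosen by hand proves nothing about completeness of the list.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(q, primes):
    """True if the nonzero rational q has numerator and denominator
    divisible only by the given primes."""
    if q == 0:
        return False
    for part in (abs(q.numerator), q.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def s_unit_solutions(primes, bound):
    """Enumerate x = +-prod(p_i^a_i) with |a_i| <= bound such that
    1 - x is also an S-unit. `bound` is an assumed search cutoff."""
    solutions = set()
    for sign in (1, -1):
        for exps in product(range(-bound, bound + 1), repeat=len(primes)):
            x = Fraction(sign)
            for p, e in zip(primes, exps):
                x *= Fraction(p) ** e
            if is_s_unit(1 - x, primes):
                solutions.add(x)
    return sorted(solutions)


if __name__ == "__main__":
    for x in s_unit_solutions([2, 3], bound=4):
        print(f"{x} + {1 - x} = 1")
```

With the primes $2$ and $3$ and `bound=4`, the search finds 21 values of $x$, including $x=9/8$ (from $1+8=9$). The point of the theory that follows is to replace the guessed cutoff with a proven one.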

The work of Gelfond and Schneider, resolving Hilbert’s seventh problem in the affirmative (all irrational algebraic powers of algebraic numbers are transcendental once trivial cases are ignored), determined lower bounds on the absolute value of a $\mathbb{Q}$ -linear combination of two $\mathbb{Q}$ -linearly independent logarithms of algebraic numbers. Alan Baker’s 1967 theorem [1] generalized these results to the case of many logarithms. Baker, Wüstholz, and many others continued to improve these bounds. Naturally, one should ask if similar results are available over local fields, and indeed such results began to appear quickly. In 1968, Brumer proved the first analogue of Baker’s work for $p$ -adic logarithms [8], followed by many improvements and generalizations, such as the results of Yu [44]. Improvements in both the archimedean and nonarchimedean cases continue to appear, such as in [20, 4, 47, 19].

For any choice of $K$ and $S$, $\mathscr{O}_{K,S}^{\times}$ is a finitely generated $\mathbb{Z}$-module. Fixing a basis $\rho_{1},\dots,\rho_{t}$ for the torsion free part, we can express any $x\in\mathscr{O}_{K,S}^{\times}$ as $x=\xi\cdot\prod_{i=1}^{t}\rho_{i}^{a_{i}}$ for some root of unity $\xi\in K$ and some $a_{i}\in\mathbb{Z}$. Building on the lower bounds for linear combinations of logarithms, Győry [18] determined effectively computable bounds for the exponents $a_{i}$. This was a great victory for computational number theory, as this provably restricted all solutions to (1) to a finite search space. Unfortunately, the demonstrated bounds were enormous and as a matter of practice, it was computationally infeasible to conduct an exhaustive search for solutions, even in the very simplest cases. Baker and Davenport devised a clever method of reducing the bounds in special cases in [2]. However, in [14], de Weger built on the ideas of Baker-Davenport to develop a powerful general method of algorithmically reducing the bounds to a manageable size, relying on the lattice basis reduction algorithm of Lenstra, Lenstra, and Lovász [24] (henceforth referred to as the “LLL algorithm”). Though it has not been proven that de Weger’s method will always reduce the bounds coming from the results in linear forms of logarithms, this is the rule in practice. In many cases, de Weger’s approach provides sufficient improvements that, with careful sieving (or sometimes even with only brute force), the entire search space can be exhausted and complete lists of solutions can be enumerated.

Beyond the improvements provided by LLL-based reduction, many mathematicians have developed further algorithms for efficiently searching below the “LLL bounds” provided by de Weger’s work. Two powerful examples are reported in [43] and [38]. Increasingly, the theoretical improvements (assisted by technological improvements) have pushed ambitious and interesting computational problems within reach. For example, Smart determined the entire set of all genus $2$ curves over $\mathbb{Q}$ with good reduction away from $2$ , based in part on solving (1) for a family of number fields unramified away from $2$ [36].

We have written a package of Python functions for inclusion in the computer algebra system SageMath [32], which solves the $S$ -unit equation (1) over any number field $K$ and for any finite set $S$ of finite places. As experienced readers may expect, the package is not practical when either $[K:\mathbb{Q}]$ or $|S|$ is too large, although there is no theoretical obstruction. While this package is the independent creation of the authors, it is based in part on the descriptions of algorithms implemented by Smart [35, 36, 37]. Specifically, we follow Smart’s development in determining initial large bounds, including the numbering of constants, in [35], with some adjustments and small corrections. In reducing the bounds, we follow [37], again with some adjustments. The sieving step is based on ideas cited by Smart [36] as due to others (as noted in Section 6) but has been redeveloped in new notation and style. We include proofs of our versions of results when we made adjustments to versions in the literature. To the authors’ knowledge, our package is the first publicly available implementation for solving the $S$ -unit equation over any field other than $\mathbb{Q}$ ; the present article describes the algorithm and its implementation. The implementation was a highly non-trivial undertaking, involving efforts spreading over more than seven years on the parts of individuals and the entire team.

We also provide new results facilitated by our implementation. In particular, we first provide a discussion of and link to explicit exponent bounds for solutions of the $S$ -unit equation in all cases $(K,S)$ where $K/\mathbb{Q}$ is ramified only at primes above some subset of $\{2,3\}$ and

$$[K:\mathbb{Q}]\leq 5,\qquad S\subseteq\{\mathfrak{p}\subseteq\mathscr{O}_{K}:\mathfrak{p}\mid 6\}.$$

We improve the best known exponent bounds for solutions of the $S$ -unit equation over number fields related to a class of genus $2$ curves over $\mathbb{Q}$ with good reduction away from $3$ . We solve the $S$ -unit equation in the $13$ totally real cubic number fields $K$ in which $2$ is totally ramified and the absolute discriminant of $K$ , $\Delta_{K}$ , satisfies $|\Delta_{K}|\leq 2000$ , and we use these results to verify that an asymptotic version of Fermat’s Last Theorem holds over these fields. Finally, we find all solutions to certain cubic Ramanujan-Nagell equations.

1.1. Overview

The organization of the paper proceeds as follows. We introduce certain notations in §2. In §3, we review the relevant work of Baker-Wüstholz and Yu. This is used in §4 to establish a “pre-LLL” exponent bound for each place in $S$ . In §5, we explain the process of using LLL to reduce these exponent bounds – the approach is different for archimedean and nonarchimedean places. In §6, we describe the sieve for further constraining the final search space. We devote §7 to a discussion of our experimental observations, having now executed our algorithm in several dozen cases. We highlight a special condition ( $S$ contains only one finite place) under which a significant improvement in the search space can be obtained. Although narrow in scope, the special condition is sufficiently natural, and the savings sufficiently nontrivial, as to warrant its discussion. Finally, §8 introduces two applications: an asymptotic version of Fermat’s Last Theorem over totally real cubic fields and a solution to a cubic variant of the Ramanujan-Nagell equation.

Acknowledgments

We are delighted to recognize the Institute for Computational and Experimental Research in Mathematics for both funding and hosting a 2017 collaboration during which a great deal of this project was completed. Part of this work began at the 2014 workshop SageDays 62, and we would like to thank Anna Haensch and Lola Thompson for organizing that workshop and Microsoft Research and The Beatrice Yormark Fund for Women in Mathematics for funding. Some of the work was supported by the van Vleck fund at Wesleyan University. The authors would like to thank many people for helpful conversations that led to improvements in the code and gave direction to this project, including Bjorn Poonen, Andrew Sutherland, and Norman Danner. We would also like especially to thank David Roe for his contributions to refining and reviewing the code for inclusion in SageMath. The third author was partially supported in this work by NSA Grant #H98230-16-1-0300. We are very grateful to the anonymous referees for their careful reading of this work and their many helpful comments which have improved the quality of this paper.

2. Notation

2.1. $S$ -units in number fields

Throughout this paper, we let $\bar{\mathbb{Q}}$ denote the algebraic closure of $\mathbb{Q}$ inside $\mathbb{C}$ , the field of complex numbers. Unless stated otherwise, we fix the following notation throughout:

[TABLE]

If $f(x)\in\mathbb{Z}[x]$ is a monic and irreducible polynomial, we let $K_{f}$ denote the number field $\mathbb{Q}(\xi)$ , where $\xi$ is a root of $f(x)$ . Always, $\log$ denotes the principal branch of the complex logarithm function, with argument in $(-\pi,\pi]$ .

2.2. Absolute Values and Completions

Each place of $K$ determines an associated value, $|\cdot|_{\mathfrak{p}}$ , which we now describe.

Let $|\cdot|$ denote the usual absolute value on $\mathbb{C}$ . If $\mathfrak{p}$ is an infinite place, choose $\sigma_{\mathfrak{p}}\colon K\to\mathbb{C}$ , an embedding corresponding to $\mathfrak{p}$ . The associated absolute value depends on whether $\mathfrak{p}$ is a real or complex (meaning non-real) place of $K$ :

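$$|\alpha|_{\mathfrak{p}}:=\begin{cases}|\sigma_{\mathfrak{p}}(\alpha)| & \mathfrak{p}\text{ is real,}\\ |\sigma_{\mathfrak{p}}(\alpha)|^{2} & \mathfrak{p}\text{ is complex.}\end{cases}$$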

Now suppose $\mathfrak{p}$ is a finite place. View $\mathfrak{p}$ as a prime ideal of $\mathscr{O}_{K}$, and let $p$ be the characteristic of the residue field $\mathscr{O}_{K}/\mathfrak{p}$. Let $\operatorname{ord}_{\mathfrak{p}}$ denote the ordinal function for $\mathfrak{p}$. On the nonzero elements of $\mathscr{O}_{K}$ this is defined by

$$\operatorname{ord}_{\mathfrak{p}}(\beta)=m\quad\text{if }\beta\in\mathfrak{p}^{m}-\mathfrak{p}^{m+1},$$

and it extends to $K^{\times}$ in the obvious way. We let $|\cdot|_{p}$ denote the usual absolute value of the $p$ -adic field $\mathbb{Q}_{p}$ . The absolute value associated to $\mathfrak{p}$ on $K$ is

$$|\alpha|_{\mathfrak{p}}:=p^{-f_{\mathfrak{p}}\operatorname{ord}_{\mathfrak{p}}(\alpha)}.$$

Let $K_{\mathfrak{p}}$ be the $\mathfrak{p}$ -adic completion of $K$ with respect to $|\cdot|_{\mathfrak{p}}$ ; we also use $|\cdot|_{\mathfrak{p}}$ for the absolute value on $K_{\mathfrak{p}}$ .
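As a small illustration of the finite-place absolute value, the sketch below specializes to $K=\mathbb{Q}$ (so the residue degree is $1$ and $\mathfrak{p}=p$) and checks the product formula on one rational number. The function names are illustrative assumptions, and nothing here handles general number fields.

```python
from fractions import Fraction


def ord_p(x, p):
    """p-adic order of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is undefined (infinite)")
    num, den, order = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return order


def abs_p(x, p):
    """Normalized p-adic absolute value on Q: |x|_p = p^(-ord_p x)."""
    return Fraction(p) ** (-ord_p(x, p))


def product_over_places(x, primes):
    """|x| at the real place times the product of |x|_p over `primes`."""
    result = abs(Fraction(x))
    for p in primes:
        result *= abs_p(x, p)
    return result


if __name__ == "__main__":
    x = Fraction(-45, 28)
    print(ord_p(x, 2), abs_p(x, 3), product_over_places(x, [2, 3, 5, 7]))
```

For $x=-45/28$ the primes $2,3,5,7$ are the only finite places where $|x|_{p}\neq 1$, so the product over all places equals $1$.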

We fix once and for all an algebraic closure $\bar{\mathbb{Q}}_{p}$ of $\mathbb{Q}_{p}$ , and let $\mathbb{C}_{p}$ denote the completion of $\bar{\mathbb{Q}}_{p}$ . We use $|\cdot|_{p}$ to denote the natural extension of $|\cdot|_{p}$ to all of $\mathbb{C}_{p}$ . We define $\operatorname{ord}_{p}$ on $\mathbb{C}_{p}^{\times}$ to satisfy

$$|\alpha|_{p}=p^{-\operatorname{ord}_{p}\alpha},\qquad\alpha\in\mathbb{C}_{p}^{\times}.$$

As $p\mathscr{O}_{K}$ may split into several prime ideals, the absolute value $|\cdot|_{p}$ on $\mathbb{Q}$ may have several inequivalent extensions to $K$, of which $|\cdot|_{\mathfrak{p}}$ is just one; so we must take care when viewing $K_{\mathfrak{p}}$ as a subfield of $\bar{\mathbb{Q}}_{p}$.

For any embedding $\vartheta\colon K\to\bar{\mathbb{Q}}_{p}$ , we obtain a subfield of $\bar{\mathbb{Q}}_{p}$ as the composite $\vartheta(K)\cdot\mathbb{Q}_{p}$ . By the Prolongation Theorem [21, §18.5], there exists a choice of $\vartheta$ such that ${(K_{\mathfrak{p}},|\cdot|_{\mathfrak{p}})}$ is value-isomorphic to ${(\vartheta(K)\mathbb{Q}_{p},|\cdot|_{p})}$ . Henceforth, we always use this isomorphism to view $K_{\mathfrak{p}}$ as a subfield of $\bar{\mathbb{Q}}_{p}$ . As the isomorphism respects the valuations, we know $\operatorname{ord}_{p}$ and $\operatorname{ord}_{\mathfrak{p}}$ satisfy

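$$\operatorname{ord}_{\mathfrak{p}}\beta=e_{\mathfrak{p}}\operatorname{ord}_{p}\beta,\qquad\beta\in K_{\mathfrak{p}}.$$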

2.3. Height functions

Suppose $n\geq 1$ . We let $h$ denote the standard logarithmic Weil height on $\mathbb{P}^{n}(K)$ . This is defined as follows: for any ${\mathbf{x}=(x_{0}:\cdots:x_{n})}\in\mathbb{P}^{n}(K)$ ,

$$h(\mathbf{x})=\frac{1}{d_{K}}\sum_{\mathfrak{p}}\log\Bigl(\max_{j}\{|x_{j}|_{\mathfrak{p}}\}\Bigr),$$

where the sum runs over all places of $K$ . It is a consequence of the product formula ([21, Ch. 20, pgs. 326–327]) that $h(\mathbf{x})$ is independent of the choice of coordinates for $\mathbf{x}$ . For any $\alpha\in K$ , set ${h(\alpha)=h((1:\alpha))}$ . Note that this height is absolute in the sense that it is not dependent on which field extension $K$ containing the coordinates of $\mathbf{x}$ is considered.
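The height is easiest to see over $\mathbb{Q}$, where $h(\alpha)=h((1:\alpha))$ becomes a sum of $\log\max(1,|\alpha|_{v})$ over the places $v$ of $\mathbb{Q}$. The sketch below computes this sum place by place; the helper names are illustrative assumptions, and trial division is only suitable for small inputs.

```python
import math
from fractions import Fraction


def prime_divisors(n):
    """Prime divisors of a positive integer by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def weil_height(x):
    """h(x) = sum over places v of Q of log max(1, |x|_v)."""
    x = Fraction(x)
    if x == 0:
        return 0.0
    # Archimedean place.
    total = math.log(max(1.0, float(abs(x))))
    # Finite places: |x|_p > 1 exactly when p divides the reduced denominator.
    for p in prime_divisors(x.denominator):
        den, k = x.denominator, 0
        while den % p == 0:
            den //= p
            k += 1
        total += k * math.log(p)
    return total


if __name__ == "__main__":
    print(weil_height(Fraction(6, 35)), math.log(35))
```

For a reduced fraction $a/b$ the place-by-place sum agrees with the familiar closed form $\log\max(|a|,|b|)$, which the example checks for $6/35$.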

We introduce a modified version of this height function, used in §3. Suppose $\alpha_{1},\dots,\alpha_{n}\in K$ , and let $K^{\prime}=\mathbb{Q}(\alpha_{1},\dots,\alpha_{n})\subseteq K$ . For any nonzero element $\beta\in K^{\prime}$ , we define the function $h^{\prime}$ by

$$h^{\prime}(\beta)=\frac{1}{d_{K^{\prime}}}\max\{d_{K^{\prime}}\cdot h(\beta),\ |\log\beta|,\ 1\}.$$

The definition of another height function, $h_{\mathfrak{p}}$ , is slightly more technical and will be introduced when needed in §3.

2.4. $p$ -adic logarithms

Inside $\mathbb{C}_{p}$ , consider the open disk

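$$\Delta_{1}:=\{z\in\mathbb{C}_{p}:|z-1|_{p}<1\}.$$

On $\Delta_{1}$, we define the $p$-adic logarithm by the series

$$\log_{p}z=-\sum_{n\geq 1}\frac{(1-z)^{n}}{n}.$$

The series is convergent on $\Delta_{1}$; moreover, on $\Delta_{1}$ it satisfies the identity

$$\log_{p}(xy)=\log_{p}x+\log_{p}y. \tag{4}$$

If $|z|_{p}<p^{-\frac{1}{p-1}}$ we have

$$\operatorname{ord}_{p}(\log_{p}(1+z))=\operatorname{ord}_{p}z.$$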

Based on an idea due to Iwasawa, the $p$ -adic logarithm can be extended to any $z\in\mathbb{C}_{p}$ such that $|z|_{p}=1$ ; this extension continues to satisfy (4) (see [37, II.2.4]).
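The two properties of the $p$-adic logarithm used later, additivity and the preservation of $\operatorname{ord}_{p}$ on a small enough disk, can be checked numerically on rational inputs by truncating the defining series. This is a sketch under assumptions: the truncation length `terms` and the helper names are illustrative, and exact rational partial sums stand in for genuine $p$-adic arithmetic.

```python
from fractions import Fraction


def ord_p(x, p):
    """p-adic order of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is undefined (infinite)")
    num, den, order = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return order


def log_p_truncated(z, terms):
    """Partial sum -sum_{n=1}^{terms} (1 - z)^n / n, computed exactly in Q.
    Meaningful p-adically only when ord_p(1 - z) > 0."""
    z = Fraction(z)
    u, power, total = 1 - z, Fraction(1), Fraction(0)
    for n in range(1, terms + 1):
        power *= u
        total -= power / n
    return total


if __name__ == "__main__":
    p, terms = 3, 30
    x, y = 4, 10  # both are congruent to 1 modulo 3
    print(ord_p(log_p_truncated(x, terms), p))  # compare with ord_3(x - 1)
    diff = (log_p_truncated(x * y, terms)
            - log_p_truncated(x, terms) - log_p_truncated(y, terms))
    print(diff == 0 or ord_p(diff, p))  # large order: additivity up to truncation
```

The additivity defect of the truncated sums is not zero as a rational number, but it is divisible by a high power of $p$, which is the $p$-adic meaning of "small".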

2.5. Solutions to the $S$ -unit equation

We let $A_{K,S}$ denote the additive $\mathbb{Z}$ -module $(\mathbb{Z}/w\mathbb{Z})\times\mathbb{Z}^{t}$ . This is isomorphic to $\mathscr{O}_{K,S}^{\times}$ , and the list of generators ${\bm{\uprho}}$ determines an isomorphism

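$$\Phi_{\bm{\uprho}}\colon A_{K,S}\longrightarrow\mathscr{O}_{K,S}^{\times},\qquad\mathbf{a}:=(a_{0},a_{1},\dots,a_{t})\mapsto\prod_{i=0}^{t}\rho_{i}^{a_{i}}.$$

We use the shorthand ${\bm{\uprho}}^{\mathbf{a}}:=\Phi_{\bm{\uprho}}(\mathbf{a})$. For obvious reasons, we call the elements of $A_{K,S}$ exponent vectors. Much of our discussion will focus on bounds for the entries of an exponent vector. For $\mathbf{a}\in A_{K,S}$, we use the notation $|\mathbf{a}|\leq B$ to signify

$$\max_{0<i\leq t}|a_{i}|\leq B.$$

Within $\mathscr{O}_{K,S}^{\times}$, we wish to determine

$$X_{K,S}:=\{\tau\in\mathscr{O}_{K,S}^{\times}:1-\tau\in\mathscr{O}_{K,S}^{\times}\}.$$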

Solving the $S$ -unit equation is equivalent to determining the set $X_{K,S}$ . We let $E_{K,S}$ denote the corresponding subset $\Phi_{\bm{\uprho}}^{-1}(X_{K,S})$ of $A_{K,S}$ .
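The correspondence between $S$-units and exponent vectors can be sketched for $K=\mathbb{Q}$ with finite primes $2$ and $3$ in $S$. The generators below ($\rho_{0}=-1$ for the torsion part with $w=2$, then $\rho_{1}=2$, $\rho_{2}=3$) and all function names are assumptions made for illustration; a general number field needs a computed $S$-unit basis instead.

```python
from fractions import Fraction

# Assumed generators for K = Q: rho_0 = -1 (torsion, w = 2), rho_1 = 2, rho_2 = 3.
RHO = (Fraction(-1), Fraction(2), Fraction(3))


def phi(a):
    """Map an exponent vector (a_0, a_1, a_2) to the S-unit prod rho_i^a_i."""
    result = Fraction(1)
    for rho, e in zip(RHO, a):
        result *= rho ** e
    return result


def exponent_vector(x):
    """Inverse of phi: recover (a_0, a_1, a_2) with a_0 in {0, 1}."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("0 is not an S-unit")
    a0 = 0 if x > 0 else 1
    num, den, exps = abs(x.numerator), x.denominator, []
    for rho in RHO[1:]:
        p, e = rho.numerator, 0
        while num % p == 0:
            num //= p
            e += 1
        while den % p == 0:
            den //= p
            e -= 1
        exps.append(e)
    if num != 1 or den != 1:
        raise ValueError("not an S-unit for the assumed generators")
    return (a0, *exps)


def sup_norm(a):
    """The quantity compared with B: max |a_i| over i > 0 (torsion ignored)."""
    return max(abs(e) for e in a[1:])


if __name__ == "__main__":
    a = exponent_vector(Fraction(-8, 9))
    print(a, phi(a), sup_norm(a))
```

Note that the torsion exponent $a_{0}$ is excluded from the bound, matching the condition $0<i\leq t$ in the definition of $|\mathbf{a}|\leq B$.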

3. The Bounds of Baker-Wüstholz and Yu

Suppose $\tau_{1},\tau_{2}\in\mathscr{O}_{K,S}^{\times}$ provide a solution to the $S$ -unit equation, so that $\tau_{1}+\tau_{2}=1$ . With respect to the ordered generating set ${\bm{\uprho}}$ , there are unique vectors $\mathbf{b}_{i}=(b_{i,0},\dots,b_{i,t})\in A_{K,S}$ such that

$$\tau_{i}={\bm{\uprho}}^{\mathbf{b}_{i}}=\prod_{j=0}^{t}\rho_{j}^{b_{i,j}},\qquad i=1,2. \tag{6}$$

The techniques of lattice reduction discussed in §5 will not produce an absolute bound for $\left|b_{i,j}\right|$ on their own; they can only be used to improve a known bound. So in this section, we recall bounds established by Baker-Wüstholz [3] and Kunrui Yu [47]. An excellent treatment of the background material appears in [15].

3.1. Statement of Yu’s Bound

Let $\mathfrak{p}$ be a finite place of $K$, and let $p$ denote the rational prime below $\mathfrak{p}$. We let $q$ be the smallest rational prime distinct from $p$ (so $q=2$ unless $p=2$, in which case $q=3$). Let $\zeta_{m}:=\exp(2\pi i/m)$. We say $K$ satisfies Yu’s auxiliary condition if any of the following hold:

(i) $q=2$ and $p^{f_{\mathfrak{p}}}\equiv 1\bmod{4}$,

(ii) $q=2$ and $\zeta_{4}\in K$,

(iii) $q=3$ and $\zeta_{3}\in K$.

At the end of this section, we explain how the algorithm finds a bound in cases where $K$ does not satisfy Yu’s auxiliary condition.
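The case distinction in Yu's auxiliary condition can be written as a short predicate. This is a sketch, not the package's code: the function names are hypothetical, and whether $\zeta_{4}$ or $\zeta_{3}$ lies in $K$ is passed in by the caller as a boolean assumption rather than computed.

```python
def auxiliary_prime(p):
    """Smallest rational prime q distinct from p."""
    return 3 if p == 2 else 2


def satisfies_yu_condition(p, f, has_zeta4, has_zeta3):
    """Check conditions (i)-(iii) for a prime above p with residue degree f.

    has_zeta4 / has_zeta3 are caller-supplied facts about the field K.
    """
    q = auxiliary_prime(p)
    if q == 2:
        # (i) p^f = 1 mod 4, or (ii) zeta_4 lies in K
        return pow(p, f, 4) == 1 or has_zeta4
    # q == 3: (iii) zeta_3 lies in K
    return has_zeta3


if __name__ == "__main__":
    print(satisfies_yu_condition(5, 1, False, False))  # 5 = 1 mod 4
    print(satisfies_yu_condition(3, 1, False, False))  # 3 = 3 mod 4, no zeta_4
```

A prime above $3$ with residue degree $2$ passes through condition (i), since $3^{2}\equiv 1 \bmod 4$, even when $\zeta_{4}\notin K$.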

Theorem 3.1 (Yu, [47, pg. 190]).

Suppose $n\geq 1$ and $\mathfrak{p}$ is a prime of $\mathscr{O}_{K}$ . Suppose $K$ is a number field satisfying Yu’s auxiliary condition and $\mu_{0},\mu_{1},\dots,\mu_{n-1}\in K^{\times}$ are chosen which satisfy

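$$\operatorname{ord}_{\mathfrak{p}}\mu_{j}=0,\qquad 0\leq j\leq n-1.$$

Suppose $b_{j}\in\mathbb{Z}$ and $\Theta:=\prod\limits_{j=0}^{n-1}\mu_{j}^{b_{j}}\neq 1$. Finally, suppose $B$ satisfies

$$B\geq\max\{|b_{0}|,\dots,|b_{n-1}|,3\}.$$

Then there exist explicit constants $C_{1}^{*}$ and $\Omega$, given below, such that

$$\operatorname{ord}_{\mathfrak{p}}(\Theta-1)<C_{1}^{*}\,\Omega\,\log B.$$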

3.2. The constants $\Omega$ and $\Omega^{\prime}$

We first discuss the constant $\Omega$ , and the variant $\Omega^{\prime}$ used in the algorithm. In Theorem 3.1, $\Omega$ is roughly a product of the logarithmic heights of the $\mu_{j}$ . More precisely, decompose the set $\{\mu_{j}\}_{j}$ into a disjoint union $\mathfrak{a}\cup\mathfrak{b}$ , where $\mathfrak{a}$ is a maximal subset of $\{\mu_{j}\}_{j}$ which is multiplicatively independent. Such a decomposition need not be unique. Because of the possible dependence among the $\mu_{j}$ , Yu requires a modified height function:

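$$h_{\mathfrak{p}}(\mu):=\max\Bigl\{h(\mu),\ \frac{f_{\mathfrak{p}}}{\kappa_{1}(n+4)d_{K}}\Bigr\}.$$

(The value $\kappa_{1}$ is explained in the following subsection.) The constant $\Omega$, which depends on $n$, $d_{K}$, $\mathfrak{p}$ as well as the $\mu_{j}$, is then defined by

$$\Omega(n,d_{K},\mathfrak{p}):=\prod_{\mu\in\mathfrak{a}}h(\mu)\cdot\prod_{\mu\in\mathfrak{b}}h_{\mathfrak{p}}(\mu).$$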

As shown in [47], one may choose any maximal independent set $\mathfrak{a}$ for the computation of $\Omega$ . If optimization of the bound is critical, one may search over all possible $\mathfrak{a}$ and take the smallest possible bound. This observation is moot in our use, however.

Corollary 3.2.

Keeping the hypotheses of the previous theorem, suppose also that $\mu_{1},\dots,\mu_{n-1}$ are multiplicatively independent. Set

$$\Omega^{\prime}(n,d_{K},\mathfrak{p}):=h_{\mathfrak{p}}(\mu_{0})\prod_{j=1}^{n-1}h(\mu_{j}).$$

Then $\displaystyle\operatorname{ord}_{\mathfrak{p}}\left(\Theta-1\right)<C_{1}^{*}\Omega^{\prime}\log B$ .

Proof.

In this case, $\mathfrak{b}$ is unique; either $\mathfrak{b}=\{\mu_{0}\}$ or $\mathfrak{b}=\varnothing$ . In either case, $\Omega\leq\Omega^{\prime}$ and the result follows immediately. ∎

In the algorithm, we are always in the situation of the Corollary. Rather than decide the question of independence between $\mu_{0}$ and the other $\mu_{j}$ , we just use the constant $\Omega^{\prime}$ .

3.3. The constant $C_{1}^{*}$

The value of $C_{1}^{*}:=C_{1}^{*}(n,d_{K},\mathfrak{p})$ is dependent on $n$ , $d_{K}$ , and $\mathfrak{p}$ , as follows. Let $u:=\operatorname{ord}_{q}w$ , so that $q^{u}$ is the $q$ -part of $w$ . Set

[TABLE]

Here, $e$ denotes the base of the natural logarithm. The constants $a^{(1)}$ , $\kappa_{1}$ , and $c^{(1)}$ are given in Tables 3.1 and 3.2. Finally,

$$C_{1}^{*}(n,d_{K},\mathfrak{p}):=(n+1)\,k_{2}\,k_{3}\,k_{4}.$$
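The case analysis of Table 3.1 translates directly into a lookup. The sketch below covers only the constants $a^{(1)}$ and $\kappa_{1}$ from that table; the function name is hypothetical, and the argument `e` is assumed to be the ramification index $e_{\mathfrak{p}}$ of $\mathfrak{p}$ over $p$. It does not compute $C_{1}^{*}$ itself.

```python
from fractions import Fraction


def table_3_1(p, e):
    """Return (a^(1), kappa_1) for residue characteristic p and
    ramification index e, following the four rows of Table 3.1."""
    if p == 2:
        return Fraction(32), 40
    if p == 3:
        return Fraction(16), 20
    if e >= 2:  # p > 3 and e >= 2
        return Fraction(16), 20
    # p > 3 and e == 1
    return Fraction(8 * (p - 1), p - 2), 10


if __name__ == "__main__":
    print(table_3_1(7, 1))
```

Exact fractions are used because $8(p-1)/(p-2)$ is not an integer for most primes, for example $48/5$ when $p=7$.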

3.4. A Remark about implementation

For this subsection only, suppose all hypotheses in Theorem 3.1 are satisfied, except $K$ does not satisfy Yu’s auxiliary condition. Set

$$K^{\prime}:=\begin{cases}K(\zeta_{4}) & q=2,\\ K(\zeta_{3}) & q=3.\end{cases}$$

Let $\mathfrak{P}$ be a prime of $\mathscr{O}_{K^{\prime}}$ above $\mathfrak{p}$ . Let $e_{\mathfrak{P}\mid\mathfrak{p}}$ be the ramification index of $\mathfrak{P}$ over $\mathfrak{p}$ . Because

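$$\operatorname{ord}_{\mathfrak{p}}\alpha=e_{\mathfrak{P}\mid\mathfrak{p}}\operatorname{ord}_{\mathfrak{P}}\alpha,\qquad\alpha\in K^{\times},$$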

we see $\operatorname{ord}_{\mathfrak{P}}\mu_{j}=0$ for all $j$ . Now Theorem 3.1 applies with $K^{\prime}$ and $\mathfrak{P}$ in place of $K$ and $\mathfrak{p}$ , respectively.

Corollary 3.3.

Under the conditions of this subsection,

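$$\operatorname{ord}_{\mathfrak{p}}(\Theta-1)<e_{\mathfrak{P}\mid\mathfrak{p}}\cdot C_{1}^{*}(n,d_{K^{\prime}},\mathfrak{P})\cdot\Omega^{\prime}(n,d_{K^{\prime}},\mathfrak{P})\cdot\log B.$$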

Note that even if $\mathfrak{p}$ splits as $\mathfrak{P}\mathfrak{P}^{\prime}$ in $K^{\prime}$ , the choice of $\mathfrak{P}$ is irrelevant; both give the exact same bound in the Corollary.

3.5. Bound of Baker-Wüstholz

We now give an effective version of Baker’s theorem. (Notations are as in §2.2, 2.3.)

Theorem 3.4 (Baker-Wüstholz, [3, pg. 20]).

Let $L$ be a linear form in $t+1$ indeterminates,

[TABLE]

Let $B=\max\{|b_{0}|,\dots,|b_{t}|\}$ , and let $\rho_{0},\dots,\rho_{t}\in\overline{\mathbb{Q}}-\{0,1\}$ . Let $K^{\prime}$ be the subfield of $\overline{\mathbb{Q}}$ generated by the $\rho_{i}$ . If $B>3$ and

[TABLE]

then

[TABLE]

where the constant $C(t,d_{K^{\prime}})$ is defined by

[TABLE]

Note that we may be sure $\Lambda\neq 0$ if the set $\{\log\rho_{i}\}$ is linearly independent over $\mathbb{Q}$ .

3.6. Obtaining the initial bound

The theorems of Baker-Wüstholz and Yu both provide inequalities of the form “a polynomial function of $B$ is bounded by a polynomial function of $\log(B)$ ,” which in turn guarantee an absolute bound on $B$ . The analysis to determine such a bound explicitly is standard; we will use the following result of Pethő and de Weger for this purpose.

Lemma 3.5 (Pethő and de Weger [29, Lemma 2.2]).

Suppose the real numbers $a,b,h$ satisfy $a\geq 0$ , $h\geq 1$ , $b>\bigl{(}\frac{e^{2}}{h}\bigr{)}^{h}$ , and let $x\in\mathbb{R}$ be the largest solution to the equation

[TABLE]

Then

[TABLE]
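The principle that a quantity bounded by a power of its own logarithm must itself be bounded can be illustrated numerically. The sketch below assumes an equation of the shape $x=a+b(\log x)^{h}$; this shape, the sample values of $a$, $b$, $h$, and the function name are assumptions made for illustration, and the iteration finds the largest solution numerically rather than applying the lemma's closed-form bound.

```python
import math


def largest_fixed_point(a, b, h, start=1e30, tol=1e-9, max_iter=10_000):
    """Largest solution of x = a + b*(log x)^h, found by iterating
    x -> a + b*(log x)^h downward from a starting value above it."""
    def g(x):
        return a + b * math.log(x) ** h

    x = start
    if g(x) >= x:
        raise ValueError("start must lie above the largest solution")
    for _ in range(max_iter):
        nxt = g(x)
        if abs(nxt - x) <= tol * max(1.0, abs(x)):
            return nxt
        x = nxt
    raise RuntimeError("iteration did not converge")


if __name__ == "__main__":
    x = largest_fixed_point(a=10, b=1000, h=1)
    print(x)  # any B with B <= 10 + 1000*log(B) satisfies B <= x
```

Because the map is increasing and the start lies above every solution, the iterates decrease monotonically to the largest solution, so every $B$ satisfying $B\leq a+b(\log B)^{h}$ is at most the returned value.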

4. Initial Exponent Bounds

4.1. An upper bound at the extremal place

Suppose $(\tau_{1},\tau_{2})$ is a solution to the $S$ -unit equation, with $\tau_{i}$ specified as in (6). We set $B=\max_{i,j}\left|b_{i,j}\right|$ , and assume $B\geq\max\{4,w\}$ . Relabeling $\tau_{1}$ and $\tau_{2}$ if necessary, we assume $B=|b_{1,j}|$ for some $1\leq j\leq t$ . Recall that $S$ contains precisely $t+1$ places, $\mathfrak{p}_{1},\dots,\mathfrak{p}_{t+1}$ . We choose the indices $k,\ell\in\{1,2,\dots,t+1\}$ so that

[TABLE]

Remark 4.1.

In the sequel, we number our constants in an effort to stay consistent with the enumeration given in Smart’s paper [35]. There, Smart considers a more general unit equation, and so introduces certain constants $c_{4}(i)$ , $c_{6}(i)$ , $c_{7}(i),\dots$ whose values are trivial in the present application. So while the alert reader may notice gaps in the enumeration of constants, this is intentional. (Adjusting our implementation to the more general setting is not difficult, but we are satisfied to limit the discussion to match the current state of the implementation.)

For any choice of $U:=\{\mathfrak{u}_{1},\dots,\mathfrak{u}_{t}\}\subseteq S$ define the $t\times t$ matrix

[TABLE]

One may always choose $U$ so that $M$ is invertible (see [15, §5.1]), and so we assume this is the case. We have

[TABLE]

Let $\|M\|$ be the row norm of $M^{-1}$ , i.e. $\|M\|=\max_{i}\sum_{j=1}^{t}|m_{i,j}|$ , and set

[TABLE]

Note that this differs slightly from Smart’s definition, to ensure that $c_{1}\geq 1$ . Then $B\leq c_{1}\bigl{|}\log|\tau_{1}|_{\mathfrak{p}_{k}}\bigr{|}$ . We define

[TABLE]

By [35, Lemma 2], we have

[TABLE]

We now have an upper bound on $|\tau_{1}|_{\mathfrak{p}_{\ell}}$ in terms of $B$ . We next establish a lower bound, also involving $B$ , which will force a limit on the size of $B$ . The precise argument depends on whether $\mathfrak{p}_{\ell}$ is a finite or infinite place. For the purposes of the algorithm, we must compute this bound on $B$ for each possible index $1\leq\ell\leq t$ ; we have no choice but to take the largest possible bound, i.e., the larger of the two values $K_{0}$ and $K_{1}$ determined in the remainder of this section.

4.2. Case I: $\mathfrak{p}_{\ell}$ is finite

If $\mathfrak{p}_{\ell}$ is finite, then let $\mathfrak{p}_{\ell}$ also denote the associated prime ideal in $\mathscr{O}_{K}$ . Let $p$ be the prime of $\mathbb{Z}$ lying below $\mathfrak{p}_{\ell}$ , and let $e_{\ell}$ and $f_{\ell}$ denote the ramification index and inertial degree of $\mathfrak{p}_{\ell}$ over $p$ , respectively. From (9) we have

[TABLE]

Setting

[TABLE]

the inequality (10) yields

[TABLE]

and so $\operatorname{ord}_{\mathfrak{p}_{\ell}}\tau_{2}=0$ . We would like to apply Yu’s Theorem to ${\Theta=\tau_{2}}$ , but unfortunately the generators $\rho_{i}$ may have nonzero order with respect to $\mathfrak{p}_{\ell}$ . So we now replace the $\rho_{i}$ with a different set of generators, as in [35, pgs. 824–825]. First, set $n_{i}:=\operatorname{ord}_{\mathfrak{p}_{\ell}}\rho_{i}$ . Necessarily, there exist indices $i$ for which $n_{i}\neq 0$ . Choose $i_{0}$ so that

[TABLE]

and now relabel so that $i_{0}=t$ . For $1\leq i\leq t-1$ , define

[TABLE]

so that $\operatorname{ord}_{\mathfrak{p}_{\ell}}\mu_{i}=0$ . Next, for each $i$ with $1\leq i\leq t-1$ , choose integers $d_{i},r_{i}$ such that

[TABLE]

Necessarily, $|d_{i}|\leq B$ . Set $N:=\sum_{i=1}^{t-1}n_{i}r_{i}$ .

Lemma 4.1.

We have $N\equiv 0\pmod{n_{t}}$ .

Proof.

Since $\operatorname{ord}_{\mathfrak{p}_{\ell}}(\tau_{2}\rho_{0}^{-b_{2,0}})=0$ , we know $\sum_{i=1}^{t}n_{i}b_{2,i}=0$ . Thus,

[TABLE]

proving the claim. ∎
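The division step and the congruence of Lemma 4.1 are easy to check numerically. The following is a minimal Python sketch, assuming the convention $b_{2,i}=d_{i}n_{t}+r_{i}$ with $0\leq r_{i}<|n_{t}|$; the function names and the sample data are hypothetical and are not taken from the implementation.

```python
def divide_exponent(b, n_t):
    """Return (d, r) with b == d * n_t + r and 0 <= r < |n_t|."""
    r = b % abs(n_t)      # Python's % is non-negative for a positive modulus
    d = (b - r) // n_t    # exact division, since n_t divides b - r
    return d, r


def lemma_4_1_residue(n, b):
    """Given orders n = (n_1, ..., n_t) and exponents b = (b_1, ..., b_t)
    with sum(n_i * b_i) == 0, return N mod |n_t|, where
    N = sum over i < t of n_i * r_i."""
    assert sum(ni * bi for ni, bi in zip(n, b)) == 0
    n_t = n[-1]
    N = sum(ni * divide_exponent(bi, n_t)[1] for ni, bi in zip(n[:-1], b[:-1]))
    return N % abs(n_t)


# Hypothetical data: orders (2, 5, 3) and exponents (2, 1, -3).
print(divide_exponent(-7, 3), lemma_4_1_residue([2, 5, 3], [2, 1, -3]))
```

A residue of zero in the second value is exactly the statement $N\equiv 0\pmod{n_{t}}$ for the sample data.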

Setting $N_{0}=\frac{N}{n_{t}}$ and $\mu_{0}:=\rho_{0}^{b_{2,0}}\rho_{t}^{-N_{0}}\cdot\prod_{i=1}^{t-1}\rho_{i}^{r_{i}}$ , we have arranged that

[TABLE]

Since $0\leq b_{2,0}<w$ and $0\leq r_{i}<|n_{t}|$ , there are only finitely many possible values for $\mu_{0}$ , and this finite set can be determined without any knowledge of $B$ or the $b_{2,i}$ . For each $\mu_{0}$ , we may apply Corollary 3.2 or 3.3 as appropriate, and obtain a constant $c^{\prime}_{8}(\ell,\mu_{0})$ such that

[TABLE]

Setting

[TABLE]

we may be sure every $S$ -unit solution satisfies

[TABLE]

Combining inequalities (11) and (13), we have

[TABLE]

Since $c_{1}\geq 1$ and $c_{8}(\ell)\geq e^{2}(\log 2)^{-1}$ , it follows that

[TABLE]

Applying Lemma 3.5 with $a=0$ , $b=c_{8}(\ell)/e_{\ell}c_{5}(\ell)$ , and $h=1$ , we may conclude

[TABLE]

Set

[TABLE]

If $\ell$ corresponds to a finite place, then $B\leq K_{0}$ .

In our implementation, the functions mus and possible_mu0s are used to recover the $\mu_{i}$ for each finite place $\mathfrak{p}_{\ell}$ . The constants $c_{8}(\ell)$ determined from Yu’s Theorem are computed in Yu_bound, while the constant $K_{0}$ , which may be of independent interest, is computed by K0_func.

4.3. Case II: $\mathfrak{p}_{\ell}$ is infinite

We now assume $\mathfrak{p}_{\ell}$ is infinite. As in §2.3, we let $\sigma_{\mathfrak{p}_{\ell}}$ denote the embedding of $K$ into $\mathbb{C}$ such that

[TABLE]

We let $\alpha^{(\ell)}$ denote $\sigma_{\mathfrak{p}_{\ell}}(\alpha)$ for any $\alpha\in K$ , and we define

[TABLE]

The condition (9) can now be expressed as

[TABLE]

The choices of $c_{11}(\ell)$ and $c_{13}(\ell)$ guarantee that

[TABLE]

Set $\Lambda:=\log\tau_{2}^{(\ell)}$ . The estimate $|\log z|\leq 2|z-1|$ holds for $|z-1|\leq\frac{1}{4}$ , and so

[TABLE]
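The elementary estimate $|\log z|\leq 2|z-1|$ on the disc $|z-1|\leq\frac{1}{4}$ follows from the power series of the logarithm, since $|\log(1+u)|\leq |u|/(1-|u|)\leq\frac{4}{3}|u|$ when $|u|\leq\frac{1}{4}$. As a sanity check only, the following Python sketch samples the disc on a polar grid; the grid sizes are arbitrary choices and the function name is hypothetical.

```python
import cmath
import math


def log_estimate_holds(z):
    """Check |log z| <= 2|z - 1| for the principal branch of the logarithm."""
    return abs(cmath.log(z)) <= 2 * abs(z - 1)


# Sample the closed disc |z - 1| <= 1/4: radii 0, 1/40, ..., 10/40 and 64 angles.
samples = [1 + (r / 40) * cmath.exp(2j * math.pi * k / 64)
           for r in range(0, 11) for k in range(64)]
print(all(log_estimate_holds(z) for z in samples))
```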

The next step is to view $\Lambda$ as a linear form in logarithms and apply the theorem of Baker and Wüstholz. Set $\zeta:=\exp\frac{2\pi\sqrt{-1}}{w}\in\mathbb{C}$ . Since $\rho_{0}$ is a $w$ th root of unity, there exists $0\leq k<w$ such that $(\rho_{0}^{(\ell)})^{b_{2,0}}=\zeta^{k}$ . By (6), we have

[TABLE]

where we have introduced $A\in\mathbb{Z}$ to adjust for the principal branch of the logarithm. Certainly $|A|\leq tB$ , and so $|Aw+k|\leq(t+1)Bw$ . Set

[TABLE]

and $L^{\prime}(z_{0},\dots,z_{t}):=\sum_{j=0}^{t}b^{\prime}_{2,j}z_{j}$ . We now have

[TABLE]

Taking $K^{\prime}=\mathbb{Q}(\rho_{0},\dots,\rho_{t})\cong\mathbb{Q}(\zeta,\rho_{1}^{(\ell)},\dots,\rho_{t}^{(\ell)})$ , we define

[TABLE]

(Recall that $C(t,d_{K^{\prime}})$ is defined in Theorem 3.4.) We have $|b^{\prime}_{2,j}|\leq B^{\prime}:=(t+1)Bw$ . Applying Theorem 3.4 to $\Lambda$ , we obtain

[TABLE]

Combining this inequality with (14), we obtain

[TABLE]

This yields the inequality

[TABLE]

where

[TABLE]

As $c_{13}(\ell)\leq\frac{1}{t}$ and $c_{14}(\ell)\geq 32^{3}$ , we have $a(\ell)\geq 0$ and ${b(\ell)\geq e^{2}}$ . So by Lemma 3.5, $B<c_{15}(\ell)$ (provided $B\geq c_{11}(\ell)$ ), where

[TABLE]

Thus, setting

[TABLE]

we may be sure $B\leq K_{1}$ . In our implementation, the constant $K_{1}$ is computed in the function K1_func.

Combining all the results of this section, we obtain the following.

Lemma 4.2.

The constant $B$ satisfies $B\leq\max\{4,w,K_{0},K_{1}\}$ .

5. LLL Reduction

In this section we explain how to reduce the upper bound computed in Section 4. This is necessary because, in practice, the initial bound is extremely large and cannot be used for practical computations. The idea of the method presented here originates in de Weger’s thesis [13, 12, 14], where he develops a method based on multi-dimensional approximation lattices of linear forms of $p$-adic numbers to solve (among many other equations) $S$-unit equations over $\mathbb{Q}$. (It is worth mentioning the recent results of von Känel and Matschke [30], who solve $S$-unit equations using modularity.) These ideas of de Weger have been extended by himself and others to apply over any number field $K$, and have also been used for the solution of other exponential Diophantine equations [40, 41, 42, 35].

In the reduction step we use the LLL reduction algorithm on lattices generated by integer matrices. So instead of the classical LLL algorithm [24], we use the algorithm in [12]. If $\mathscr{L}$ is a lattice in $\mathbb{R}^{n}$ , let $\mathscr{L}^{*}=\mathscr{L}-\{\mathbf{0}\}$ . For $\mathbf{y}\in\mathbb{R}^{n}$ , we define

[TABLE]

Computing the exact value of $\ell(\mathscr{L},\mathbf{y})$ is a very challenging problem in general. Instead, the function minimal_vector computes a lower bound using standard properties of a reduced basis of a lattice and the LLL algorithm (see [37, Chapter V]). As in the previous section, we follow Smart’s notation in [35]. Most of the material we present in this section can also be found in [37, 15].
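One standard way to bound the shortest nonzero lattice vector from below uses the Gram-Schmidt orthogonalization: for any basis $b_{1},\dots,b_{n}$ with Gram-Schmidt vectors $b_{1}^{*},\dots,b_{n}^{*}$, every nonzero lattice vector has length at least $\min_{i}\|b_{i}^{*}\|$. The sketch below illustrates this fact in exact rational arithmetic. It is not the function minimal_vector: it treats only the case $\mathbf{y}=\mathbf{0}$, performs no LLL reduction (the bound is valid for any basis, though it is usually much sharper for a reduced one), and all names in it are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    """Exact inner product of two vectors."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def gram_schmidt(basis):
    """Exact Gram-Schmidt orthogonalization of linearly independent row vectors."""
    ortho = []
    for b in basis:
        v = [Fraction(x) for x in b]
        for u in ortho:
            mu = dot(b, u) / dot(u, u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return ortho


def shortest_vector_lower_bound_squared(basis):
    """Lower bound for the squared length of the shortest nonzero lattice vector."""
    return min(dot(u, u) for u in gram_schmidt(basis))


# Hypothetical 2-dimensional lattice with basis rows (2, 0) and (1, 3).
print(shortest_vector_lower_bound_squared([[2, 0], [1, 3]]))
```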

We preserve the meaning of $\mathfrak{p}_{\ell}$ from §4. When $\mathfrak{p}_{\ell}$ is a finite place, we let $p$ denote the prime of $\mathbb{Z}$ lying below $\mathfrak{p}_{\ell}$ . We continue to assume $B\geq\max\{4,w\}$ in this section.

5.1. Finite places

Suppose $\mathfrak{p}_{\ell}$ is a finite place. Set

[TABLE]

and suppose that $B\geq c_{16}(\ell)$. Define $\Delta_{2}\in K_{\mathfrak{p}_{\ell}}$ as $\Delta_{2}:=\log_{p}\tau_{2}$. Combining (11), (2), and ${B\geq c_{16}(\ell)}$ shows that $\operatorname{ord}_{p}\tau_{1}>1$. Consequently, $|\tau_{1}|_{p}<p^{-\frac{1}{p-1}}$, and by (5),

[TABLE]

Let $\mu_{i},d_{i}$ be as given in (12), so that we have

[TABLE]

Choose $\theta\in K_{\mathfrak{p}_{\ell}}$ such that $K_{\mathfrak{p}_{\ell}}=\mathbb{Q}_{p}(\theta)$ , and let $\operatorname{Disc}(\theta)$ denote the discriminant of $\theta$ . Set $D_{p}(\theta)=\operatorname{ord}_{p}\operatorname{Disc}(\theta)$ and $n=[K_{\mathfrak{p}_{\ell}}:\mathbb{Q}_{p}]$ , so that $n=e_{\ell}f_{\ell}$ . Expressing $\Delta_{2}$ with respect to the power basis, we obtain $\Delta_{2,k}\in\mathbb{Q}_{p}$ such that $\Delta_{2}=\sum_{k=0}^{n-1}\Delta_{2,k}\theta^{k}$ . Further, we may express

[TABLE]

Using an idea due to Evertse [42, p.257], we have

[TABLE]

Define

[TABLE]

and choose $\lambda\in\mathbb{Q}_{p}$ such that $\operatorname{ord}_{p}\lambda=c_{17}(\ell)$ .

Should there be some index $k$ such that $c_{17}(\ell)>\operatorname{ord}_{p}(a_{0,k})$ , then $\operatorname{ord}_{p}\Delta_{2,k}=\operatorname{ord}_{p}a_{0,k}<c_{17}(\ell)$ , and consequently

[TABLE]

For the remainder, then, we assume

[TABLE]

By the choice of $\lambda$ , $\kappa_{j,k}:=a_{j,k}/\lambda$ is a $p$ -adic integer for all $j,k$ , and we may rewrite (17) as

[TABLE]

For any $a\in\mathbb{Z}_{p}$ and a positive integer $z$, let $a^{(z)}$ denote the unique integer between $0$ and $p^{z}$ such that $a\equiv a^{(z)}\pmod{p^{z}}$. For a positive integer $u$, let $\mathscr{L}$ be the lattice generated by the columns of the matrix

[TABLE]
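The truncation $a\mapsto a^{(z)}$ is straightforward to compute when $a$ is a rational number whose denominator is prime to $p$. The following Python sketch assumes the representative is taken in the range $0\leq a^{(z)}<p^{z}$; the function name is hypothetical, and the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
from fractions import Fraction


def truncate_p_adic(a, p, z):
    """Integer in [0, p**z) congruent to the rational p-adic integer a modulo p**z."""
    a = Fraction(a)
    if a.denominator % p == 0:
        raise ValueError("a is not a p-adic integer")
    modulus = p ** z
    # Multiply the numerator by the inverse of the denominator modulo p**z.
    return (a.numerator * pow(a.denominator, -1, modulus)) % modulus


# Example: the 5-adic integer 1/3 truncated modulo 5**2.
print(truncate_p_adic(Fraction(1, 3), 5, 2))
```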

Define

[TABLE]

Also set

[TABLE]

The following lemma is a restatement of [35, Lemma 5], and provides an opportunity to improve the bound on $B$. (Note that $s_{d}$ in [35] has the value $t-1$ in our notation.)

Lemma 5.1.

If $\ell(\mathscr{L},\mathbf{y})>\sqrt{t-1}\cdot K_{0}$ , then $\displaystyle B<K_{0}^{\mathrm{LLL}}(\ell)$ .

In the function p_adic_LLL_bound we have implemented the above analysis. In more detail, the functions log_p and embedding_to_Kp are used to compute the constants $a_{j,k}\in\mathbb{Q}_{p}$ up to a given precision. If this precision is $M$, i.e., if the $a_{j,k}$ are stored as pairs of integers modulo $p^{M}$, we clearly require $M>u$ for the algorithm to be meaningful. However, the shift by $\lambda$ requires an additional $c_{17}(\ell)$ $p$-adic digits of precision. So the algorithm checks that $M>u+c_{17}(\ell)$; if this fails, then the $p$-adic logarithms are computed to higher precision and the process is repeated.

The function log_p is based on an algorithm of Smart in [37, p. 30]. However, our implementation also resolves a crucial computational problem in the evaluation of $\log_{p}$ that to our knowledge has not been mentioned in the literature. To understand the issue, we must describe carefully what “computing the logarithm” means in a $p$ -adic setting. Let us view $K$ as a subfield of $K_{\mathfrak{p}}$ , and specify $\omega_{1},\dots,\omega_{n}\in K$ , a $\mathbb{Q}_{p}$ -basis for $K_{\mathfrak{p}}$ . Suppose $\alpha\in K$ and set $\beta:=\log_{p}\alpha\in K_{\mathfrak{p}}$ . Necessarily, there exist $b_{j}\in\mathbb{Q}_{p}$ such that $\beta=\sum_{j}b_{j}\omega_{j}$ .

As a practical matter, no algorithm can return the true value $\beta$ ; it can only return $\tilde{\beta}\in K$ , an approximation to $\beta$ with $|\beta-\tilde{\beta}|_{\mathfrak{p}}$ very small. In practice however, we require something more specific. We want to find $\tilde{b}_{j}\in\mathbb{Q}$ such that $|b_{j}-\tilde{b}_{j}|_{p}$ is small for each $j$ . When $p$ splits in $K$ , the original algorithm is not guaranteed to do this.

It can happen that at some other prime $\mathfrak{p}^{\prime}$ above $p$ , $\alpha$ has a negative valuation. Consequently, the sum (3) used to compute $\tilde{\beta}$ will not converge $\mathfrak{p}^{\prime}$ -adically, and the approximations $\tilde{b}_{j}$ are not guaranteed to be $p$ -adically close to the $b_{j}$ . We resolve this problem by choosing a suitable element $\eta\in K$ with $\operatorname{ord}_{\mathfrak{p}^{\prime}}\eta\geq 0$ for all $\mathfrak{p^{\prime}}\mid p$ , such that $\eta$ also satisfies

[TABLE]

Then it holds that

[TABLE]

By evaluating the difference on the right hand side, these $\mathfrak{p}^{\prime}$ -adic divergence issues are avoided, and we may be sure that the individual coefficients $\tilde{b}_{j}$ approximate the $b_{j}$ $p$ -adically.

In p_adic_LLL_bound_one_prime, we attempt to find a value $u$ such that Lemma 5.1 applies. If successful, we record the improved bound $K_{0}^{\mathrm{LLL}}(\ell)$ . The improvement offered by Lemma 5.1 depends only on the assumption that $\mathfrak{p}_{\ell}$ is the extremal place, and that some bound $K_{0}\geq c_{16}(\ell)$ on the exponents is known. So we may replace $K_{0}$ by $K_{0}^{\mathrm{LLL}}(\ell)$ and attempt to apply Lemma 5.1 again, possibly improving the bound further. Because the application of LLL is very fast compared to the sieving step described in §6, the algorithm repeats this process until no further improvements can be made to $K_{0}^{\mathrm{LLL}}(\ell)$ . Once each $K_{0}^{\mathrm{LLL}}(\ell)$ has been optimized in this way, the function p_adic_LLL_bound returns

[TABLE]

5.2. Complex places

We now consider the case where $\mathfrak{p}_{\ell}$ is an infinite complex place. The reduction is quite analogous to the $p$ -adic case; again the standard references are [35, 37, 15]. We keep the notations from §4.3. For $0\leq j\leq t$ we define the complex numbers

[TABLE]

As $\mathfrak{p}_{\ell}$ is an infinite place, we have established already that the $b^{\prime}_{2,j}$ in

[TABLE]

satisfy the bounds

[TABLE]

We now attempt to use lattice reduction to improve the bound; the choice of lattice and certain constants will depend slightly on whether the $\kappa_{j}$ are all purely imaginary. So we define

[TABLE]

and define

[TABLE]

If $\kappa_{1},\dots,\kappa_{t}$ are not all pure imaginary, relabel $\kappa_{1},\dots,\kappa_{t}$ so that $\Re\kappa_{t}\neq 0$ . Now define

[TABLE]

Let $A$ be the $(t+1)\times(t+1)$ integer matrix

[TABLE]

(By design, the upper left $t\times t$ block of $A$ is the identity matrix in case the $\kappa_{j}$ are all pure imaginary.) Now, let $\mathscr{L}$ be the lattice generated by the columns of $A$, and suppose $m_{\mathscr{L}}$ is a positive lower bound for $\ell(\mathscr{L},\mathbf{0})$. When $\mathfrak{p}_{\ell}$ is an infinite non-real place, we define:

[TABLE]

Similar to [37, Lemma VI.2], we have

Lemma 5.2.

Suppose $\mathfrak{p}_{\ell}$ is a non-real infinite place. With notation as above, suppose $C$ is chosen such that $m_{\mathscr{L}}^{2}>T^{2}+S$ . Then $B\leq K_{1}^{\mathrm{LLL}}(\ell)$ .

Proof.

There are two cases to consider, as $\sigma=0$ or $\sigma=1$ . In each case, our goal is to establish the inequality

[TABLE]

for the result follows by isolating $B$ in the inequality (19).

If the $\kappa_{j}$ are not all pure imaginary, we define

[TABLE]

Then note that

[TABLE]

Therefore

[TABLE]

We know from (16) that $2e^{-c_{13}(\ell)B}\geq|\Lambda|$ , so

[TABLE]

Now notice that the vector

[TABLE]

is in the lattice $\mathscr{L}$ , so $|\mathbf{y}|\geq m_{\mathscr{L}}$ . Further,

[TABLE]

which implies (19) and the result follows.

In case the $\kappa_{j}$ are all pure imaginary, the approach is similar. Set

[TABLE]

Similar to the other case, we have $\left|C\Lambda-\left(\Phi\sqrt{-1}\right)\right|\leq T$ , and therefore $|\Phi|\leq T+|C\Lambda|$ . Again applying (16) we obtain

[TABLE]

Now notice that the vector

[TABLE]

is in the lattice $\mathscr{L}$ , so $|\mathbf{y}|\geq m_{\mathscr{L}}$ . Further,

[TABLE]

Again this implies (19). ∎

5.3. Real places

Now suppose that $\mathfrak{p}_{\ell}$ is a real infinite place. Although the arguments in §5.2 apply to $\mathfrak{p}_{\ell}$ , we can obtain a stronger improvement by analyzing this case separately. Replacing $\rho_{j}$ by $-\rho_{j}$ as necessary, we may assume $\rho_{j}^{(\ell)}>0$ for all $j$ with $1\leq j\leq t$ .

The mere existence of a real place forces $w=2$ and $0\leq b_{2,0}\leq 1$ . We set $\kappa_{0}:=\pi\sqrt{-1}$ and define the real numbers

[TABLE]

As the $\kappa_{j}\in\mathbb{R}$ , we may revisit (15); this time we obtain

[TABLE]

as no adjustments are required to accommodate the branch cut of the logarithm. Set

[TABLE]

and again let $\mathscr{L}$ be generated by the columns of the matrix $A$ in (18). When $\mathfrak{p}_{\ell}$ is an infinite real place, we define:

[TABLE]

Lemma 5.3.

Suppose that $\mathfrak{p}_{\ell}$ is a real infinite place. With notation and definitions as above, suppose $C$ is chosen so that $m_{\mathscr{L}}^{2}>T^{2}+S$ . Then $B\leq K_{1}^{\mathrm{LLL}}(\ell)$ .

Proof.

If we define

[TABLE]

then we obtain

[TABLE]

Observing that the vector

[TABLE]

and that $|b_{2,j}|\leq B$ for $j>0$ , $|b_{2,0}|\leq 1$ , the remainder of the proof now follows the logic of Lemma 5.2 exactly. ∎

5.4. Implementation

The function minimal_vector is used in the implementation to compute a value for $m_{\mathscr{L}}^{2}$. In cx_LLL_bound, we have implemented the reduction step for the infinite places applying the above idea. As in the finite case, the parameter $C$ is chosen inside the function and changed as necessary to meet the bound $m_{\mathscr{L}}^{2}>T^{2}+S$ (keeping in mind, of course, that the definitions of $S$ and $T$ depend on the particular place $\mathfrak{p}_{\ell}$). Notice that the proofs of Lemmas 5.2 and 5.3 depend on true rounding when computing the coefficients of the matrix $A$. In our implementation, we increase precision until this is assured. Similar to the case where $\mathfrak{p}_{\ell}$ is finite, the improvement of Lemma 5.2 needs only the assumption that $\mathfrak{p}_{\ell}$ is the extremal place and that some bound $K_{1}$ on the exponents is known. So we apply Lemma 5.2 repeatedly until no further improvement to $K_{1}^{\mathrm{LLL}}(\ell)$ is possible. Once this has been done for each infinite place, we set

[TABLE]

Consequently, we have the following bound which may be passed to the sieve in the next section.

Lemma 5.4.

Assume that for each $\ell$ , a value $u=u(\ell)$ or $C=C(\ell)$ exists for which the hypotheses of one of the Lemmas 5.1, 5.2, or 5.3 are met. Then the maximum exponent $B$ appearing in any solution $(\tau_{1},\tau_{2})$ of the $S$ -unit equation (1) satisfies

[TABLE]

We use the proof as an opportunity to summarize the algorithm, up to the sieving step of the next section.

Proof.

There is nothing to show if the solution set is empty, so let us assume otherwise. We know the $S$ -unit equation has only finitely many solutions. Keeping the notation of (6), let $(\tau_{1},\tau_{2})$ be a solution where $B=|\mathbf{a}_{i}|$ is maximized. One of the places in $S$ , say $\mathfrak{p}_{\ell}$ , is extremal. If $\mathfrak{p}_{\ell}$ is finite, then the work in §4.2 demonstrates ${B\leq\max\{4,w,K_{0}(\ell)\}}$ by applying one of the corollaries deduced from Yu’s bound. If $\mathfrak{p}_{\ell}$ is infinite, then the work in §4.3 demonstrates ${B\leq\max\{4,w,K_{1}(\ell)\}}$ by applying the theorem of Baker-Wüstholz. This establishes an absolute bound on $B$ .

For each possible $\ell$ , the techniques of this section attempt to replace this absolute bound with a smaller bound. There is no mathematical proof that the lattice reduction techniques will succeed, i.e., that there will exist appropriate values $u$ and $C$ for which Lemmas 5.1, 5.2, or 5.3 apply. However, when they do exist, the improved bound is provably correct by the same lemmas. Here, such success is presumed for every $\ell$ , and (20) holds. ∎

In practice, if the hypotheses of Lemma 5.4 are not established, then one only has the weaker bounds coming from linear forms of logarithms – these are simply too large to allow for a provably complete search. However, the sieve described in the next section can still be used up to any prescribed bound $B_{0}$ ; it will find all solutions satisfying $|\mathbf{a}_{i}|\leq B_{0}$ .

6. Further Reducing the Search Space: Sieving

The approach taken here, for sieving against primes outside of $S$ , is based on an algorithm described by Smart in [35]. Smart credits Tzanakis and de Weger with this approach [41]; Tzanakis reports that these ideas date back to Andrew Bremner.

6.1. Setup for the sieve

Recalling the notations of §2.5, we define for any $m>0$ ,

[TABLE]

This finite set will provide a useful search space for exponent vectors in a way we will make more precise below. There is an obvious surjective map $\pi_{m}\colon A_{K,S}\to A_{K,S,m}$. Despite the fact that this map is the identity (and not a reduction map) in the $0$th coordinate, we will refer to this as the reduction modulo $m$ map, and call an element $\mathbf{a}\in A_{K,S,m}$ an exponent vector modulo $m$.

Let $\tau\in\mathscr{O}_{K,S}^{\times}$ . The exponent vector for $\tau$ (relative to ${\bm{\uprho}}$ ) is $\Phi_{\bm{\uprho}}^{-1}(\tau)$ . That is, it is the unique $\mathbf{a}\in A_{K,S}$ such that $\tau={\bm{\uprho}}^{\mathbf{a}}$ . Given any bound $B$ for the exponent vector of a $\tau\in X_{K,S}$ , we obtain a finite subset of $\mathscr{O}_{K,S}^{\times}$ that contains every solution of the $S$ -unit equation. Unfortunately, this is usually still too large of a search space to be practical (see §7), so we must sieve this finite set (or rather, the equivalent finite set of exponent vectors) prior to the exhaustive search. The sieve attempts to provide an efficient solution to the following problem:

Problem 6.1.

Find a small set $Y_{K,S}$ satisfying $E_{K,S}\subseteq Y_{K,S}\subseteq A_{K,S}$ .

If we can find a small enough superset $Y_{K,S}$ in a fast enough way, the $S$ -unit equation solutions can then be found by brute force search over $Y_{K,S}$ .

Suppose $\mathbf{a}\in A_{K,S}$ . We call $\mathbf{b}\in A_{K,S}$ a complement vector for $\mathbf{a}$ if $\uprho^{\mathbf{a}}+\uprho^{\mathbf{b}}=1$ . If a complement vector exists, it must be unique; the existence of a complement vector is equivalent to $\mathbf{a}\in E_{K,S}$ , and a pair of complement exponent vectors correspond to a solution of the $S$ -unit equation.

Suppose $q\in\mathbb{Z}$ is a prime number. We say $q$ avoids $S$ if $q\not\in\mathfrak{p}$ for all ideals $\mathfrak{p}\in S$ . If $q$ splits completely in $\mathscr{O}_{K}$ , then there are $d_{K}$ prime ideals above $q$ in $\mathscr{O}_{K}$ , say $\mathfrak{q}_{0},\dots,\mathfrak{q}_{d_{K}-1}$ . We let $\mathbb{F}_{\mathfrak{q}_{j}}$ denote the residue field of $\mathfrak{q}_{j}$ . Since $q$ is completely split, we of course have $\mathbb{F}_{\mathfrak{q}_{j}}\cong\mathbb{F}_{q}$ for all $j$ .

Suppose $\tau\in\mathscr{O}_{K,S}$ , and $q$ is a rational prime number which splits completely in $\mathscr{O}_{K}$ and which avoids $S$ . The residue field vector for $\tau$ (with respect to $q$ ) is

[TABLE]

where $\tau+\mathfrak{q}_{j}\in\mathbb{F}_{\mathfrak{q}_{j}}$ is the reduction of $\tau$ modulo $\mathfrak{q}_{j}$ . The residue field vector depends on the ordering of the primes $\mathfrak{q}_{j}$ above $q$ ; we fix one ordering once and for all whenever we consider residue field vectors with respect to $q$ .

Notice that we have the following commutative diagram, whose horizontal rows are exact.

[TABLE]

Suppose $\mathbf{a}\in A_{K,S,q-1}$ . Since any two lifts $\mathbf{a}^{\prime}$ , $\mathbf{a}^{\prime\prime}$ of $\mathbf{a}$ to $A_{K,S}$ differ by a multiple of $(q-1)$ , we see that $\Phi_{\uprho}(\mathbf{a}^{\prime})$ and $\Phi_{\uprho}(\mathbf{a}^{\prime\prime})$ differ by a perfect $(q-1)$ th power, and so determine the same residue field vector. In other words, the dashed arrow in the diagram corresponds to a well-defined map $A_{K,S,q-1}\to\prod\mathbb{F}_{\mathfrak{q}_{i}}^{\times}$ , and so the notion of a residue field vector for $\mathbf{a}$ is well-defined. With this in mind, we abuse notation slightly and also write $\operatorname{rfv}_{q}\mathbf{a}:=\operatorname{rfv}_{q}\Phi_{\bm{\uprho}}(\mathbf{a}^{\prime})$ , where $\mathbf{a}^{\prime}\in A_{K,S}$ is any lift of $\mathbf{a}$ .

Lemma 6.1.

Suppose $\tau\in X_{K,S}$ and set $\eta=1-\tau$ . Then

(a) $\operatorname{rfv}_{q}\tau+\operatorname{rfv}_{q}\eta=(1,1,\dots,1)\in\prod_{i}\mathbb{F}_{\mathfrak{q}_{i}}$.

(b) $\operatorname{rfv}_{q}\tau\in\prod_{i}\mathbb{F}_{\mathfrak{q}_{i}}^{\times}$.

(c) no entry of $\operatorname{rfv}_{q}\tau$ is $1$.

Proof.

Since $\tau+\eta=1$ , it follows that for any $j$ , $\tau+\eta\equiv 1\pmod{\mathfrak{q}_{j}}$ , verifying (a). As $q$ avoids $S$ , $\tau\not\in\mathfrak{q}_{j}$ for every $j$ . This proves (b). Since (b) holds for both $\eta$ and $\tau$ , (c) follows from (a). ∎

Suppose $\mathbf{a}$ is an exponent vector modulo $q-1$ ; i.e., $\mathbf{a}\in A_{K,S,q-1}$ . We call $\mathbf{b}\in A_{K,S,q-1}$ a $(q-1)$ -complement vector for $\mathbf{a}$ if

[TABLE]

Existence of a $(q-1)$ -complement vector is a necessary, but not sufficient, condition for $\mathbf{a}$ to lift to the exponent vector of a unit in a solution to the $S$ -unit equation. Further, any particular $\mathbf{a}$ may have more than one $(q-1)$ -complement vector associated to it. We set

[TABLE]
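To make these definitions concrete, consider the toy case $K=\mathbb{Q}$ and $S=\{2,3,\infty\}$, with generators $\rho_{0}=-1$, $\rho_{1}=2$, $\rho_{2}=3$ (so $w=2$ and $t=2$). Here $d_{K}=1$, so every rational prime $q\notin\{2,3\}$ trivially splits completely and avoids $S$, and a residue field vector has a single entry. The Python sketch below lists the exponent vectors modulo $q-1$ admitting at least one $(q-1)$-complement vector. It assumes that an exponent vector modulo $m$ has its first coordinate taken modulo $w$ and the remaining coordinates modulo $m$; all function names are hypothetical and unrelated to the SageMath implementation.

```python
from itertools import product

GENERATORS = (-1, 2, 3)   # rho_0 = -1 (root of unity, w = 2), rho_1 = 2, rho_2 = 3
W = 2


def rfv(a, q):
    """Residue of rho^a modulo q; for K = Q the residue field vector has one entry."""
    value = 1
    for g, e in zip(GENERATORS, a):
        value = value * pow(g % q, e, q) % q
    return value


def exponent_vectors(m):
    """All exponent vectors modulo m: first coordinate mod w, the others mod m."""
    return list(product(range(W), range(m), range(m)))


def E_mod(q):
    """Exponent vectors modulo q - 1 that admit at least one (q-1)-complement vector."""
    vectors = exponent_vectors(q - 1)
    residues = {rfv(b, q) for b in vectors}
    return [a for a in vectors if (1 - rfv(a, q)) % q in residues]


E5 = E_mod(5)
print(len(exponent_vectors(4)), len(E5))
```

For instance, the true solution $3+(-2)=1$ has exponent vectors $(0,0,1)$ and $(1,1,0)$, and both survive reduction modulo $q-1=4$.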

6.2. Execution of the sieve

The strategy for the sieve is to play the sets $E_{K,S}(q-1)$ off of one another for multiple values of $q$ . Choose a finite list $Q$ of rational prime numbers

[TABLE]

each of which splits completely in $K$ and avoids $S$ , and such that

[TABLE]

Any true solution to the $S$ -unit equation corresponds to exponent vectors found in the set $E_{K,S}$ , and such vectors must reduce modulo $(q_{j}-1)$ to vectors in $E_{K,S}(q_{j}-1)$ for each $q_{j}\in Q$ . Conversely, given a choice $\mathbf{a}_{i}\in E_{K,S}(q_{i}-1)$ for each $0\leq i<k$ , there is at most one vector $\mathbf{a}\in A_{K,S}$ such that $\pi_{q_{i}-1}(\mathbf{a})=\mathbf{a}_{i}$ for each $i$ , while also satisfying $|\mathbf{a}|\leq B$ . Define $\pi_{Q}$ to be the product of the maps $\pi_{q_{i}-1}$ :

[TABLE]

Certainly we have

[TABLE]

Because lifts from $\prod_{i}E_{K,S}(q_{i}-1)$ to $E_{K,S}$ are unique when they exist, $\prod_{i}E_{K,S}(q_{i}-1)$ provides a reasonable proxy for the search space. We seek to replace each $E_{K,S}(q_{i}-1)$ with a subset $Y_{i}\subseteq E_{K,S}(q_{i}-1)$ such that we still have

[TABLE]

Suppose $q_{i},q_{j}$ are distinct primes in $Q$ , and suppose $\mathbf{a}_{i}\in Y_{i}$ , $\mathbf{a}_{j}\in Y_{j}$ . We say $\mathbf{a}_{i}$ and $\mathbf{a}_{j}$ are compatible if there exists $\mathbf{a}\in A_{K,S}$ such that $\pi_{q_{i}-1}(\mathbf{a})=\mathbf{a}_{i}$ and $\pi_{q_{j}-1}(\mathbf{a})=\mathbf{a}_{j}$ . Notice that for any $i\neq j$ , an element $\hat{\mathbf{a}}\in E_{K,S}$ reduces modulo $q_{i}-1$ and $q_{j}-1$ to produce a compatible pair of exponent vectors.
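Compatibility can be tested coordinate by coordinate: by the Chinese Remainder Theorem for non-coprime moduli, the congruences $x\equiv a\pmod{m}$ and $x\equiv b\pmod{n}$ have a common integer solution exactly when $a\equiv b\pmod{\gcd(m,n)}$. The following Python sketch applies this criterion; it assumes that the first coordinate (the exponent of the root of unity) is not reduced and so must agree exactly, and the function name is hypothetical.

```python
from math import gcd


def compatible(a_i, a_j, m_i, m_j):
    """True if some integer exponent vector reduces to a_i modulo m_i
    and to a_j modulo m_j (here m_i = q_i - 1 and m_j = q_j - 1)."""
    if a_i[0] != a_j[0]:
        return False
    g = gcd(m_i, m_j)
    return all((x - y) % g == 0 for x, y in zip(a_i[1:], a_j[1:]))


# With q_i = 5 and q_j = 7 the moduli are 4 and 6, whose gcd is 2.
print(compatible((0, 1, 2), (0, 3, 0), 4, 6), compatible((0, 1, 2), (0, 2, 0), 4, 6))
```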

When $\mathbf{a}_{i}$ and $\mathbf{a}_{j}$ are compatible, we further call the pair complement compatible if there exist $\mathbf{b}_{i}\in Y_{i}$ and $\mathbf{b}_{j}\in Y_{j}$ such that

•

$\mathbf{b}_{i}$ is $(q_{i}-1)$ -complementary to $\mathbf{a}_{i}$ ,

•

$\mathbf{b}_{j}$ is $(q_{j}-1)$ -complementary to $\mathbf{a}_{j}$ ,

•

$\mathbf{b}_{i}$ and $\mathbf{b}_{j}$ are compatible.

Lemma 6.2.

Suppose the sets $Y_{i}\subseteq E_{K,S}(q_{i}-1)$ satisfy condition (21). Further, suppose $\mathbf{a}_{i}\in Y_{i}$ , and set

[TABLE]

If there exists $j\neq i$ such that $Y_{j}$ contains no vectors which are complement compatible to $\mathbf{a}_{i}$ , then

[TABLE]

In other words, under the given condition, we will lose no true solutions by removing $\mathbf{a}_{i}$ from $Y_{i}$ .

Proof.

Towards a contradiction, suppose $\mathbf{a}\in E_{K,S}$ satisfies

[TABLE]

There is a unique $\mathbf{b}\in E_{K,S}$ satisfying $\Phi_{\uprho}(\mathbf{a})+\Phi_{\uprho}(\mathbf{b})=1$ . Set

[TABLE]

Then $\mathbf{a}_{i}$ and $\mathbf{a}_{j}$ are compatible by definition. But since $\mathbf{a}_{i}$ and $\mathbf{a}_{j}$ cannot be complement compatible, the vectors $\mathbf{b}_{i}$ and $\mathbf{b}_{j}$ cannot be compatible. This is impossible, since $\mathbf{b}\in E_{K,S}$ . Thus, no such $\mathbf{a}$ exists and the claim holds. ∎

The algorithm based on this lemma is the following.

Algorithm 6.2 (Sieve).

Assume that $K$ , $S$ are fixed and a representation of $\mathscr{O}_{K,S}^{\times}$ has been computed.

**INPUT:** $Q=[q_{0},q_{1},\dots,q_{k-1}]$

**OUTPUT:** $Y_{0},Y_{1},\dots,Y_{k-1}$ satisfying (21).

1. Set $Y_{i}\longleftarrow E_{K,S}(q_{i}-1)$ for each $i$.

2. Loop over $i\in\{0,1,\dots,k-1\}$:

   (a) Loop over $\mathbf{a}_{i}\in Y_{i}$:

      i. If $Y_{i}$ contains no $(q_{i}-1)$-complement vector for $\mathbf{a}_{i}$, remove $\mathbf{a}_{i}$ from $Y_{i}$.

      ii. Loop over $j\in\{0,1,\dots,i-1,i+1,\dots,k-1\}$:

         • If there are no $\mathbf{a}_{j}\in Y_{j}$ which are complement compatible with $\mathbf{a}_{i}$, then remove $\mathbf{a}_{i}$ from $Y_{i}$.

3. Did Step $2$ remove any elements from any set $Y_{i}$?

   • If YES, return to Step 2.

   • If NO, then STOP.

Once the sieve has been completed, we may find all solutions to the $S$ -unit equation by doing an exhaustive search over $\pi_{Q}^{-1}(\prod_{i}Y_{i})$ .
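The pruning loop of Algorithm 6.2 can be sketched in a few lines of Python for the toy case $K=\mathbb{Q}$, $S=\{2,3,\infty\}$, with generators $-1,2,3$. This is an illustration under simplifying assumptions, not the SageMath implementation: every helper name is hypothetical, exponent vectors modulo $m$ are assumed to have first coordinate modulo $w=2$ and remaining coordinates modulo $m$, the list $Q=[5,7]$ is chosen only for illustration, and no claim is made about how much the sieve prunes in this small example.

```python
from itertools import product
from math import gcd

GENERATORS = (-1, 2, 3)   # toy case K = Q, S = {2, 3, infinity}; w = 2, t = 2
W = 2


def rfv(a, q):
    """Residue of rho^a modulo q (a one-entry residue field vector for K = Q)."""
    value = 1
    for g, e in zip(GENERATORS, a):
        value = value * pow(g % q, e, q) % q
    return value


def complements(a, Y, q):
    """Vectors b in Y whose residue adds with that of a to 1 modulo q."""
    return [b for b in Y if (rfv(a, q) + rfv(b, q)) % q == 1]


def compatible(a_i, a_j, q_i, q_j):
    """Existence of a common integer lift, tested via gcd(q_i - 1, q_j - 1)."""
    g = gcd(q_i - 1, q_j - 1)
    return a_i[0] == a_j[0] and all((x - y) % g == 0 for x, y in zip(a_i[1:], a_j[1:]))


def complement_compatible(a_i, a_j, Y_i, Y_j, q_i, q_j):
    if not compatible(a_i, a_j, q_i, q_j):
        return False
    return any(compatible(b_i, b_j, q_i, q_j)
               for b_i in complements(a_i, Y_i, q_i)
               for b_j in complements(a_j, Y_j, q_j))


def sieve(Q):
    # Step 1: start from the vectors modulo q - 1 that have some complement.
    Y = []
    for q in Q:
        vectors = list(product(range(W), range(q - 1), range(q - 1)))
        Y.append({a for a in vectors if complements(a, vectors, q)})
    changed = True
    while changed:                        # Step 3: repeat while Step 2 removes something
        changed = False
        for i, q_i in enumerate(Q):       # Step 2
            for a_i in sorted(Y[i]):
                keep = bool(complements(a_i, Y[i], q_i))
                for j, q_j in enumerate(Q):
                    if keep and j != i:
                        keep = any(complement_compatible(a_i, a_j, Y[i], Y[j], q_i, q_j)
                                   for a_j in Y[j])
                if not keep:
                    Y[i].discard(a_i)
                    changed = True
    return Y


Y5, Y7 = sieve([5, 7])
print(len(Y5), len(Y7))
```

As Lemma 6.2 guarantees, the reductions of the true solution $3+(-2)=1$, namely $(0,0,1)$ and $(1,1,0)$, are never removed.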

7. Experimental Observations and Computational Choices

In developing this code and in pursuit of applications, we have computed a very large number of examples. Some observations and discussion may be enlightening to a reader who wishes to solve the $S$ -unit equation for their own application.

Our implementation provides the function sieve_below_bound(K, S, B), which returns all solutions to the $S$ -unit equation in $\mathscr{O}_{K,S}^{\times}$ up to a specified bound $B$ (the maximum absolute value of an entry in an exponent vector). This may be useful in settings where an exhaustive list of solutions is not needed. For example, in the field $K_{g}$ with ${g(x)=x^{3}-3x+1}$ , and $S_{\mathrm{fin}}=\{\mathfrak{p}:\mathfrak{p}\mid 2\}$ , the provable LLL-reduced exponent bound is $101$ . However, all solutions actually satisfy the exponent bound $5$ , and the command sieve_below_bound(K, S, 5) executes in under $2$ seconds.

7.1. Sieving vs. simple exhaustion

Once a bound has been reduced as much as possible by LLL, this search space must be somehow exhausted. This general problem can be solved in multiple ways. Those appearing in the literature can be generally described by the following three ideas:

(1) simple (non-number theory-based) exhaustion,

(2) sieve by reducing the problem modulo primes not in $S$, and

(3) sieve by reducing the problem modulo powers of primes in $S$.

Idea (1) could be looked at through the more general lens of efficient programming, and a good programmer may be able to develop their own code to exhaust the search space effectively. The current implementation uses idea (2), which is inspired by Smart’s earlier exposition in [35] and is described here in Section 6. Item (3) paraphrases an interesting idea which is first due to de Weger in the case $K=\mathbb{Q}$ [12] and which was generalized to arbitrary number fields $K$ for $S$-unit equations arising from Thue and Thue-Mahler equations by Tzanakis and de Weger [40, 41, 42]. Wildanger [43] and Smart [38] worked out the details of the full generalization, which was later simplified by Evertse and Győry [15]. This is an extremely promising and potentially effective method of reducing the search space, and has been implemented recently in special cases by several people, including Koutsianas [23], Bennett, Gherga, and Rechnitzer [6], von Känel and Matschke [30], and others. Future work will certainly focus on including this sieving technique for our functions.

In all these methods, we begin with the same search space as in (1), and the computational complexity of a brute force search is easy to estimate. Let $B$ be a bound for the maximum absolute value of an exponent in a solution to the $S$ -unit equation. Since we are searching for a pair $(\tau_{1},\tau_{2})\in\left(\mathscr{O}_{K,S}^{\times}\right)^{2}$ , the size of our search space is given by

[TABLE]

Thus a naïve brute force search has complexity $O\left(w^{2}(2B+1)^{2t}\right)$ . In practice, a simple exhaustive search can be carried out by checking, for each element $\tau_{1}$ of $A_{K,S}$ , whether $1-\tau_{1}$ is an $S$ -unit. Assuming this check has constant time for a fixed $K$ and $S$ , we get the less extreme complexity of $O\left(w(2B+1)^{t}\right)$ .
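The gap between the two counts is easy to quantify. The following Python sketch evaluates $w^{2}(2B+1)^{2t}$ and $w(2B+1)^{t}$; the function names are hypothetical, and the values $w=2$, $t=3$ are illustrative assumptions, paired with the two exponent bounds $101$ and $5$ quoted earlier in this section.

```python
def pair_search_size(w, t, B):
    """Number of pairs (tau_1, tau_2) with all exponents bounded by B."""
    return (w * (2 * B + 1) ** t) ** 2


def single_unit_search_size(w, t, B):
    """Number of candidates tau_1 when 1 - tau_1 is tested for being an S-unit."""
    return w * (2 * B + 1) ** t


# Illustrative parameters: w = 2, t = 3, and the bounds B = 101 and B = 5.
for B in (101, 5):
    print(B, single_unit_search_size(2, 3, B), pair_search_size(2, 3, B))
```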

In carrying out computations, we find that the resources required to sieve a search space vary greatly, even for number fields of the same degree and $S$ -unit groups of the same rank. For example, we give the run time for three fields $K_{g}$ , where $S_{f}$ is the set of primes above $3$ in $K_{g}$ , in Table 7.1. The column $N$ gives the total number of distinct solutions found. In each case, the LLL-reduced bound is below $40$ , so complete sets of solutions are found in each case. Computations were performed in a paid account on the CoCalc platform in late 2018.

The resources required depend on the size of the search space, but also can vary greatly based on the particular list of primes $Q$ chosen for the sieve, and even the order of those primes! In many cases, the sieve greatly reduces the time required to exhaust the space. In others, a brute force search of the reduced search space can actually be a better choice, as the sieving computation can take a mysteriously long time. Finding a way to understand and predict these difficulties is a priority for future work. The implementation of idea (3) could also make this unnecessary. In all cases, it is worthwhile to find the smallest reduced bound possible, whether as input for the built-in sieve or for use in a brute force search.

7.2. Finite place vs. infinite place bounds

In general, we find that the LLL-reduced bounds corresponding to $\mathfrak{p}_{\ell}$ infinite are smaller than the bounds for $\mathfrak{p}_{\ell}$ finite. To illustrate this, let $\mathscr{K}$ be the set of $85$ number fields $K$ satisfying

[TABLE]

If $N\in\mathbb{Z}$ , we set

[TABLE]

For any choice of $K\in\mathscr{K}$ and $S=S_{K,N}$ where $N\in\{2,3,6\}$ , we have computed the LLL-reduced bounds under the assumption that $\mathfrak{p}_{\ell}$ is finite and under the assumption that $\mathfrak{p}_{\ell}$ is infinite. Complete bound data is available by email request to authors Malmskog or Rasmussen. Here we will consider only the case $S=S_{K,2}$ . Now, let $B_{1}(K)$ and $B_{2}(K)$ be the bounds obtained in §5 under the assumption that $\mathfrak{p}_{\ell}$ is a finite or infinite place, respectively. In Figure 1, we plot both $B_{1}(K)$ and $B_{2}(K)$ against the root discriminant of $K$ (which ranges from $1.74$ to $26.56$ in $\mathscr{K}$ ). The bound $B_{1}(K)$ usually exceeds $B_{2}(K)$ , on average by a factor of approximately $3.00$ .

Because the disparity between these bounds is so large, we would prefer to use $B_{2}(K)$ . Generally, we have no control over whether $\mathfrak{p}_{\ell}$ is finite or infinite. However, if $S$ contains only one finite place, a small trick allows us to use $B_{2}(K)$ . If $(\tau_{1},\tau_{2})\in\left(\mathscr{O}_{K,S}^{\times}\right)^{2}$ is a solution to the $S$-unit equation, note that $(\frac{1}{\tau_{1}},\frac{-\tau_{2}}{\tau_{1}})$ and $(\frac{1}{\tau_{2}},\frac{-\tau_{1}}{\tau_{2}})$ are also solutions, since $1-\tau_{2}=\tau_{1}$ and $1-\tau_{1}=\tau_{2}$ . We define the solution cycle of $\tau_{1}$ to be

[TABLE]

The following result is a restatement of [26, Lemma 6.3].

Lemma 7.1.

Let $K$ be a number field, and suppose $S$ is a finite set of places of $K$ containing all infinite places and exactly one finite place (i.e., $\left|S_{\mathrm{fin}}\right|=1$ ). Let $(\tau_{1},\tau_{2})$ be a solution to the $S$-unit equation over $K$ . Then at least one element of $C(\tau_{1})$ belongs to a solution with $\mathfrak{p}_{\ell}$ corresponding to an infinite place.

This implies that under the hypothesis of the lemma, some representative of each solution cycle has an exponent vector bounded by $B_{2}(K)$ ; recovering the entire solution cycle from one representative is trivial. Thus, we can determine all solutions to the $S$ -unit equation.
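The recovery step can be sketched as follows, using exact rational arithmetic as a stand-in for arithmetic in $K$. The helper name `related_solutions` is hypothetical; it returns the three solutions named in the text (the given one and the two obtained by dividing through by $\tau_{1}$ or by $\tau_{2}$), and it makes no claim about the exact form of the set $C(\tau_{1})$.

```python
from fractions import Fraction


def related_solutions(tau1, tau2):
    """Given tau1 + tau2 = 1, return the solution together with the two
    solutions obtained by dividing the equation by tau1 and by tau2."""
    if tau1 + tau2 != 1:
        raise ValueError("input is not a solution of x + y = 1")
    return [
        (tau1, tau2),
        (1 / tau1, -tau2 / tau1),  # 1/tau1 - tau2/tau1 = (1 - tau2)/tau1 = 1
        (1 / tau2, -tau1 / tau2),  # 1/tau2 - tau1/tau2 = (1 - tau1)/tau2 = 1
    ]


if __name__ == "__main__":
    for a, b in related_solutions(Fraction(9), Fraction(-8)):
        print(a, "+", b, "= 1")
```

Starting from $(9,-8)$ this yields $(\frac{1}{9},\frac{8}{9})$ and $(-\frac{1}{8},\frac{9}{8})$ ; only inversion and multiplication are needed, which is why the remaining members need not satisfy the smaller bound themselves.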

It may seem that the hypothesis of Lemma 7.1 – that there is only one finite place in $S$ – is a rather specialized condition. However, many interesting arithmetic applications involve searching for objects with “good” behavior away from one prime $p$ . In such cases, we take $S=S_{K,p}$ . Should $p$ ramify in $K$ , the condition $|S_{\mathrm{fin}}|=1$ is equivalent to $p$ being totally ramified, and this is not so uncommon when $[K:\mathbb{Q}]$ is small. Here, with $S=S_{K,2}$ , the lemma applies for $72$ of the $85$ number fields in $\mathscr{K}$ .

To illustrate the utility of Lemma 7.1, consider the ratio of the sizes of the search spaces for two bounds $B_{1}(K)$ and $B_{2}(K)$ , given by

[TABLE]

This quantifies the potential savings when the better bound may be used. For $S=S_{K,2}$ , Figure 2 plots the savings $R(K)$ against the root discriminant of $K$ for the $72$ fields $K$ in $\mathscr{K}$ for which $\left|S_{\mathrm{fin}}\right|=1$ .
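To get a feel for the scale of such savings, the sketch below assumes the search-space count $w(2B+1)^{t}$ from §7.1 and takes the ratio of the two counts; the factor $w$ cancels. This is an assumption about the normalization, and the function names and the sample bounds are hypothetical rather than taken from the computed data.

```python
from fractions import Fraction


def search_space_size(w, B, t):
    """Number of candidates tau1 with exponent bound B: w * (2B + 1)^t."""
    return w * (2 * B + 1) ** t


def savings_ratio(B1, B2, t):
    """Ratio of search-space sizes for bounds B1 and B2 (independent of w)."""
    return Fraction(2 * B1 + 1, 2 * B2 + 1) ** t


if __name__ == "__main__":
    # Hypothetical bounds with B1 = 3 * B2 and rank t = 3.
    print(float(savings_ratio(30, 10, 3)))
```

With these sample values the ratio is $(61/21)^{3}\approx 24.5$ , so a factor of three in the bound already compounds to more than an order of magnitude at rank $3$ .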

8. Applications

A major application of solving $S$ -unit equations is in enumerating solutions to Shafarevich-type problems, for example finding complete lists of curves of a given type with particular reduction properties. The blueprint for this implementation came from Smart’s 1997 enumeration of all genus 2 curves over $\mathbb{Q}$ with good reduction away from $p=2$ [36], building off earlier work with Merriman [27]. In 2017, Malmskog and Rasmussen used these methods to determine all Picard curves defined over $\mathbb{Q}$ with good reduction away from $p=3$ [26]. The same year, Koutsianas produced a new algorithm that uses solutions to the $S$ -unit equation to find all elliptic curves over an arbitrary number field having good reduction outside $S$ [23]. In the remainder of this article, we provide some new applications of the implementation.

8.1. Asymptotic Fermat

Let $K/\mathbb{Q}$ be a number field. We consider the nontrivial solutions $(a,b,c)\in K^{3}$ to the Fermat equation:

[TABLE]

For fixed $p$ , it follows from the work of Faltings that $\mathcal{C}_{p}(K)$ is finite, but it is reasonable to ask whether $\bigcup_{p}\mathcal{C}_{p}(K)$ is finite or infinite. Finiteness is equivalent to the condition that $\mathcal{C}_{p}(K)=\varnothing$ for sufficiently large $p$ . We say $K$ satisfies asymptotic Fermat if there exists a bound $B_{K}$ such that $p>B_{K}$ implies $\mathcal{C}_{p}(K)=\varnothing$ .

There are several number fields $K$ known to satisfy asymptotic Fermat. Jarvis-Meekin [22] demonstrate that $K=\mathbb{Q}(\sqrt{2})$ satisfies asymptotic Fermat with $B_{K}=4$ . Freitas-Siksek give an explicit family of real quadratic fields of density $\geq\frac{5}{6}$ which satisfy asymptotic Fermat. They also report that the real quartic field $K=\mathbb{Q}(\sqrt{2+\sqrt{2}})$ satisfies asymptotic Fermat.

In [16], Freitas and Siksek find a condition on a totally real field $K$ which guarantees that $K$ satisfies asymptotic Fermat. For the remainder, suppose $K$ is totally real. Define

[TABLE]

Theorem 8.1 (Freitas-Siksek).

Let $K/\mathbb{Q}$ be a totally real number field, with either $[K:\mathbb{Q}]$ odd or $T$ nonempty. Suppose that for every solution $(\tau_{1},\tau_{2})$ to the $S$ -unit equation, there is some $\mathfrak{p}\in T$ such that $\max\{|\operatorname{ord}_{\mathfrak{p}}(\tau_{1})|,|\operatorname{ord}_{\mathfrak{p}}(\tau_{2})|\}\leq 4\operatorname{ord}_{\mathfrak{p}}(2)$ . Then $K$ satisfies asymptotic Fermat.

Remark 8.1.

We note that Freitas-Siksek’s result is actually stronger, and they provide additional conditions under which $K$ must satisfy asymptotic Fermat. Also, more recent work of Şengün-Siksek [33] provides similar criteria for arbitrary number fields. However, the above formulation is sufficient for our application.

The reader may recall that Wiles’s classic proof of Fermat’s Last Theorem proceeds by taking a hypothetical solution $(a,b,c)$ and noting that the associated Frey elliptic curve is forced to satisfy an impossible set of constraints (that the curve is not modular). Freitas and Siksek’s approach is similar. Given a solution to $\mathcal{C}_{p}$ over $K$ , they produce an elliptic curve $E/K$ (related to, but distinct from, the Frey curve) whose $j$ -invariant is arithmetically constrained. However, the $j$ -invariant is determined by the $\lambda$ -invariants of $E$ ; these $\lambda$ are guaranteed to arise as solutions to the $S$ -unit equation over $K$ . The result above follows from a delicate analysis of how these constraints interact.

We report a new list of cubic number fields $K/\mathbb{Q}$ which satisfy asymptotic Fermat. Using the implementation of the algorithm described in this paper, we find all solutions to the $S$ -unit equation ( $S$ as above), and verify the condition of Freitas-Siksek (this last step is trivial once all solutions have been determined).
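The verification step can be illustrated with a toy check over $\mathbb{Q}$ , where prime ideals are ordinary primes and valuations are $p$-adic valuations of rationals. The sketch assumes $T=\{2\}$ for this toy case and takes a precomputed list of solutions as input; in the paper's setting $K$ is a cubic field and the valuations are taken at prime ideals, so this is a stand-in and not the code used for Table 8.1. All names are hypothetical.

```python
from fractions import Fraction


def ord_p(x, p):
    """p-adic valuation of a nonzero rational number x."""
    if x == 0:
        raise ValueError("valuation of 0 is undefined here")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def satisfies_criterion(solutions, T):
    """Check that every solution has some p in T with
    max(|ord_p(tau1)|, |ord_p(tau2)|) <= 4 * ord_p(2)."""
    return all(
        any(
            max(abs(ord_p(t1, p)), abs(ord_p(t2, p))) <= 4 * ord_p(Fraction(2), p)
            for p in T
        )
        for t1, t2 in solutions
    )


if __name__ == "__main__":
    # The three solutions of x + y = 1 in units of Z[1/2].
    sols = [
        (Fraction(2), Fraction(-1)),
        (Fraction(-1), Fraction(2)),
        (Fraction(1, 2), Fraction(1, 2)),
    ]
    print(satisfies_criterion(sols, [2]))
```

Because the check is a finite comparison of integers per solution, its cost is negligible next to the cost of finding the solutions.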

Let $\mathscr{K}_{X}$ denote the set of totally real cubic number fields in which $2$ is totally ramified and which have absolute discriminant $\Delta_{K}$ satisfying $|\Delta_{K}|\leq X$ . Table 8.1 lists all the fields of $\mathscr{K}_{X}$ for $X=2000$ . For each $K\in\mathscr{K}_{X}$ , we solved the appropriate $S$ -unit equation, and by applying Theorem 8.1, verified that $K$ satisfies asymptotic Fermat. Our results are not effective, as Theorem 8.1 does not provide the bound $B_{K}$ .

For each $K\in\mathscr{K}_{2000}$ , $f_{K}$ denotes a minimal polynomial for $K/\mathbb{Q}$ ; $\Delta_{K}$ is the absolute discriminant of $K$ . Because $2$ is totally ramified, Lemma 7.1 guarantees that every solution cycle will contain a solution with the extremal place $\mathfrak{p}_{\ell}$ infinite. Consequently, each solution cycle will contain at least one solution $(\tau_{1},\tau_{2})$ satisfying

[TABLE]

(Finding the remaining solutions in the solution cycle is trivial even if they do not satisfy this bound.)

Finally, $N(S,K)$ indicates the number of distinct solutions $(\tau_{1},\tau_{2})$ to the $S$ -unit equation found. (These are unordered solutions, so that $(\tau_{1},\tau_{2})$ and $(\tau_{2},\tau_{1})$ are not considered distinct.) The reader should note that the two trivial solutions over $\mathbb{Q}$ , $(-1,2)$ and $(\frac{1}{2},\frac{1}{2})$ , are counted in each field $K$ .

8.2. Cubic Ramanujan-Nagell equations

In 1913, Ramanujan conjectured that the only solutions of the Diophantine equation $x^{2}+7=2^{n}$ over the natural numbers satisfy $x\in\{1,3,5,11,181\}$ [31]. This was settled in 1948 by Nagell [28]. The more general family of equations,
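A direct search confirms the listed values for small exponents. The sketch below checks, for each $n$ up to a chosen cutoff, whether $2^{n}-7$ is a perfect square. It is only a finite check and proves nothing beyond the cutoff; the function name is illustrative.

```python
from math import isqrt


def ramanujan_nagell_solutions(n_max):
    """Pairs (x, n) with x >= 1, 3 <= n <= n_max and x^2 + 7 = 2^n."""
    hits = []
    for n in range(3, n_max + 1):  # 2^n - 7 is positive only for n >= 3
        m = 2**n - 7
        x = isqrt(m)
        if x * x == m:
            hits.append((x, n))
    return hits


if __name__ == "__main__":
    print(ramanujan_nagell_solutions(200))
```

The hits occur at $n=3,4,5,7,15$ , matching $x=1,3,5,11,181$ .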

[TABLE]

are called Ramanujan-Nagell equations, and the literature on solving such equations is very rich (see for example [10, 9, 11, 7]). Very recently, cubic Ramanujan-Nagell equations have attracted the attention of mathematicians [5]. These are equations of the form

[TABLE]

We consider the particular example

$$x^{3}+3^{k}=q^{n}.\tag{22}$$

If $q=2$ , a more general version of (22) is solved in [5]. Here, we prove the following theorem.

Theorem 8.2.

Let $q$ be a prime with $3<q\leq 500$ . All integer solutions of the cubic Ramanujan-Nagell equation (22) with $k,n>0$ are listed in Table 8.2.

Our method also works for the equation $x^{3}+p^{k}=q^{n}$ , where $p,q$ are different odd primes, and the proof is similar to the case $p=3$ .
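For orientation, small solutions can be found by a naive bounded search. The sketch below takes the equation to be the case $p=3$ of $x^{3}+p^{k}=q^{n}$ and scans a box of values of $x$ , $k$ , and $n$ . It is a sanity check only: it is not the method of the proof, it cannot certify completeness, and the function name and box sizes are hypothetical.

```python
def cubic_rn_search(q, x_max, k_max, n_max):
    """Triples (x, k, n) with |x| <= x_max, 1 <= k <= k_max, 1 <= n <= n_max
    and x^3 + 3^k = q^n."""
    powers_of_q = {q**n: n for n in range(1, n_max + 1)}
    found = []
    for k in range(1, k_max + 1):
        for x in range(-x_max, x_max + 1):
            n = powers_of_q.get(x**3 + 3**k)
            if n is not None:
                found.append((x, k, n))
    return found


if __name__ == "__main__":
    for q in (11, 17, 19):
        print(q, cubic_rn_search(q, 50, 10, 5))
```

For example, $q=11$ admits $(x,k,n)=(2,1,1)$ since $2^{3}+3=11$ . Negative $x$ is included in the scan because the theorem concerns integer solutions.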

Proof.

Let $K$ be the splitting field for $f(x)=x^{3}+3$ . We observe $K$ is unramified outside $\{3,\infty\}$ . In fact, $K$ has class number $1$ and is totally ramified at $3$ . Let $\mathfrak{p}=\pi\mathscr{O}_{K}$ be the unique prime in $K$ above $3$ . Let $S$ be the set of all places of $K$ above $3$ , $q$ , or $\infty$ .

Suppose $(q,x,k,n)$ is a solution to (22). Let $\beta$ be a root of $f(x)$ , and let $\zeta$ denote a primitive cube root of unity. Define

[TABLE]

Then we must have $\alpha_{0}\alpha_{1}\alpha_{2}=q^{n}$ and $\mathfrak{a}_{0}\mathfrak{a}_{1}\mathfrak{a}_{2}=q^{n}\mathscr{O}_{K}$ . For $i\neq j$ ,

[TABLE]

Since $(3,q)=1$ , we see $\operatorname{ord}_{\mathfrak{p}}\alpha_{i}=0$ for each $i$ . Also, it follows that the $\mathfrak{a}_{i}$ are pairwise coprime. Thus, if $\mathfrak{q}\mid q\mathscr{O}_{K}$ , then exactly one $\mathfrak{a}_{i}$ is divisible by $\mathfrak{q}$ , and $\operatorname{ord}_{\mathfrak{q}}\alpha_{i}=n$ . Now fix $i^{\prime}\in\{0,1,2\}$ so that $\operatorname{ord}_{\mathfrak{q}}\alpha_{i^{\prime}}=n$ for at least one $\mathfrak{q}\mid q$ . Choose $j^{\prime}\neq i^{\prime}$ and set

[TABLE]

Then $(\tau_{1},\tau_{2})$ is a solution to the $S$ -unit equation and $\operatorname{ord}_{\mathfrak{q}}\tau_{1}=n$ for some $\mathfrak{q}\mid q$ . Choose a root of unity $\rho_{0}$ and a basis $\rho_{1},\dots,\rho_{t}$ for the torsion-free part of $\mathscr{O}_{K,S}^{\times}$ . Choose $b_{i,j}\in\mathbb{Z}$ such that

[TABLE]

There exists $B$ such that $|b_{i,j}|\leq B$ . Define

[TABLE]

and set $c_{q}:=\max\{c_{\mathfrak{q}}:\mathfrak{q}\mid q\}$ . By design,

[TABLE]

With these bounds established, the solutions to (22) may now be determined by exhaustion. ∎

As a final remark, we observe that we may choose ${\bm{\uprho}}$ so that $c_{3}=c_{q}=1$ . Let $\mathfrak{q}_{1},\dots,\mathfrak{q}_{g}$ be the prime ideals in $K$ above $q$ . As $\mathscr{O}_{K}$ is a PID, we may choose $\lambda_{i}\in\mathscr{O}_{K}$ such that $\mathfrak{q}_{i}=\lambda_{i}\mathscr{O}_{K}$ . Let $\xi_{1},\xi_{2}$ generate the torsion-free part of $\mathscr{O}_{K}^{\times}$ . The choice ${\bm{\uprho}}=[\rho_{0},\xi_{1},\xi_{2},\pi,\lambda_{1},\dots,\lambda_{g}]$ now gives $c_{3}=c_{q}=1$ .

Bibliography

[1] A. Baker. Linear forms in the logarithms of algebraic numbers. I, II, III. Mathematika, 13:204–216, 1966; ibid., 14:102–107, 1967; ibid., 14:220–228, 1967.
[2] A. Baker and H. Davenport. The equations $3x^{2}-2=y^{2}$ and $8x^{2}-7=z^{2}$. Quart. J. Math. Oxford, 20(2):129–137, 1969.
[3] A. Baker and G. Wüstholz. Logarithmic forms and group varieties. J. Reine Angew. Math., 442:19–62, 1993.
[4] A. Baker and G. Wüstholz. Logarithmic forms and Diophantine geometry, volume 9 of New Mathematical Monographs. Cambridge University Press, Cambridge, 2007.
[5] M. Bauer and M. A. Bennett. Ramanujan-Nagell cubics. Rocky Mountain J. Math., 48(2):385–412, 2018.
[6] M. A. Bennett, A. Gherga, and A. Rechnitzer. Computing elliptic curves over $\mathbb{Q}$. Math. Comp., 88(317):1341–1390, 2019.
[7] M. A. Bennett and C. M. Skinner. Ternary Diophantine equations via Galois representations and modular forms. Canad. J. Math., 56(1):23–54, 2004.
[8] A. Brumer. On the units of algebraic number fields. Mathematika, 14:121–124, 1967.
