This paper investigates the action of the Markoff group on solutions to the Markoff equation modulo primes and composites, showing it often acts as the full symmetric or alternating group, with implications for group actions on finite simple groups.
Contribution
It proves that for most primes, the Markoff group acts as the full symmetric or alternating group on solutions, and extends this transitivity to many composite moduli, connecting to automorphism group actions.
Findings
01
The group acts as the full symmetric or alternating group for most primes.
02
Transitivity extends to solutions modulo many composite numbers.
03
Connections established with automorphism groups of free groups and simple groups.
Abstract
The Markoff group of transformations is a group Γ of affine integral morphisms, which is known to act transitively on the set of all positive integer solutions to the equation $x^2+y^2+z^2=xyz$. The fundamental strong approximation conjecture for the Markoff equation states that for every prime $p$, the group Γ acts transitively on the set $X^*(p)$ of non-zero solutions to the same equation over $\mathbb{Z}/p\mathbb{Z}$. Recently, Bourgain, Gamburd and Sarnak proved this conjecture for all primes outside a small exceptional set. In the current paper, we study a group of permutations obtained by the action of Γ on $X^*(p)$, and show that for most primes, it is the full symmetric or alternating group. We use this result to deduce that Γ acts transitively also on the set of non-zero solutions in a big class of composite moduli.…
Table 1: The structure of $\mathrm{rot}_1\in Q_p$ when $p\equiv 1\ (4)$, as follows from Lemma 2.2. In the rightmost column, every set $\{x,-x\}$ is counted once.
type of $x$ | number of $x$'s up to sign | cycle structure of $\mathrm{rot}_1$ on $C_1(\pm x)$
parabolic | $1$ | a single $p$-cycle
hyperbolic (including $0$) | $\frac{p-1}{4}$ | $\frac{p-1}{2d}$ cycles of length $d$ each, with $d$ as in Lemma 2.2
elliptic | $\frac{p-1}{4}$ | $\frac{p+1}{2d}$ cycles of length $d$ each, with $d$ as in Lemma 2.2
Table 2: The structure of $\mathrm{rot}_1\in Q_p$ when $p\equiv 3\ (4)$, as follows from Lemma 2.3.
type of $x$ | number of $x$'s up to sign | eigenvalues $\omega^{\pm 1}$ | cycle structure of $\mathrm{rot}_1$ on $C_1(\pm x)$
hyperbolic | $\frac{p-3}{4}$ | in $\mathbb{F}_p^*$ | $\frac{p-1}{2d}$ cycles of length $d$ each
elliptic (excluding $0$) | $\frac{p-3}{4}$ | in $\mathbb{F}_{p^2}\setminus\mathbb{F}_p$ | $\frac{p+1}{2d}$ cycles of length $d$ each
Full text
The Markoff Group of Transformations
in Prime and Composite Moduli
Chen Meiri and Doron Puder
with an Appendix by Dan Carmon
Abstract
The Markoff group of transformations is a group Γ of affine
integral morphisms, which is known to act transitively on the set
of all positive integer solutions to the equation $x^2+y^2+z^2=xyz$.
The fundamental strong approximation conjecture for the Markoff equation
states that for every prime $p$, the group Γ acts transitively
on the set $X^*(p)$ of non-zero solutions to the same
equation over $\mathbb{Z}/p\mathbb{Z}$. Recently, Bourgain,
Gamburd and Sarnak proved this conjecture for all primes outside a
small exceptional set.
In the current paper, we study a group of permutations obtained by
the action of Γ on X∗(p), and show that for
most primes, it is the full symmetric or alternating group. We use
this result to deduce that Γ acts transitively also on the
set of non-zero solutions in a big class of composite moduli.
Our result is also related to a well-known theorem of Gilman and Evans,
stating that for any finite non-abelian simple group $G$ and $r\ge 3$,
the group $\mathrm{Aut}(F_r)$ acts on at least one
“$T_r$-system” of $G$ as the alternating or symmetric group.
In this language, our main result translates to the statement that for
most primes $p$, the group $\mathrm{Aut}(F_2)$ acts on a particular
$T_2$-system of $\mathrm{PSL}(2,p)$ as the alternating
or symmetric group.
The Markoff surface $X$ is the affine surface in $\mathbb{A}^3$
defined by the equation
$$x^2+y^2+z^2=xyz. \qquad (1)$$
(Footnote 1: Sometimes the Markoff equation is written as $x^2+y^2+z^2=3xyz$.
However, these two equations are equivalent in the sense that their
integer solutions are related bijectively by $(x,y,z)\longleftrightarrow(3x,3y,3z)$.
This bijection holds also for solutions in $\mathbb{Z}/p\mathbb{Z}$
for every prime $p\neq 3$.)
The set $M$ of Markoff triples is the set
of positive integer solutions to Equation (1), such
as $(3,3,3)$. The Markoff group of automorphisms of $X$
is the group Γ generated by permutations
of the coordinates and the Vieta involutions $R_1$, $R_2$ and
$R_3$, where $R_3(x,y,z)=(x,y,xy-z)$
and $R_1$ and $R_2$ are defined analogously. It is easy to see
that $M$ is invariant under Γ and Markoff proved that
Γ acts transitively on $M$ [Mar79, Mar80].
Let Δ be the group generated by Γ and the involutions
that replace two of the coordinates by their negatives. Then the set
$X(\mathbb{Z})$ of integer solutions to (1)
has two Δ-orbits: $\{(0,0,0)\}$ and
its complement $X^*(\mathbb{Z})\overset{\mathrm{def}}{=}X(\mathbb{Z})\setminus\{(0,0,0)\}$.
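The transitivity of Γ on Markoff triples can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the function names and the search bound are illustrative assumptions. It grows the orbit of (3,3,3) under the Vieta involutions and coordinate permutations, staying within a bound, and compares it with a brute-force list of all positive solutions within the same bound.

```python
from itertools import permutations

def vieta_moves(t):
    """Apply the three Vieta involutions R1, R2, R3 to the triple t."""
    x, y, z = t
    return [(y * z - x, y, z), (x, x * z - y, z), (x, y, x * y - z)]

def markoff_orbit(bound):
    """Triples reachable from (3, 3, 3) through positive triples with entries <= bound."""
    start = (3, 3, 3)
    seen = {start}
    stack = [start]
    while stack:
        t = stack.pop()
        for s in vieta_moves(t) + list(permutations(t)):
            if s not in seen and min(s) > 0 and max(s) <= bound:
                seen.add(s)
                stack.append(s)
    return seen

def brute_force(bound):
    """All positive solutions of x^2 + y^2 + z^2 = xyz with entries <= bound."""
    r = range(1, bound + 1)
    return {(x, y, z) for x in r for y in r for z in r
            if x * x + y * y + z * z == x * y * z}
```

Restricting the search to a bound loses nothing here, because every Markoff triple other than (3,3,3) can be moved towards (3,3,3) by a Vieta involution that lowers its largest entry.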
Prime Moduli
If $p$ is a prime number, then $X(\mathbb{Z}/p\mathbb{Z})$
is the finite set of solutions to (1) in $\mathbb{Z}/p\mathbb{Z}$,
and we denote $X^*(p)=X(\mathbb{Z}/p\mathbb{Z})\setminus\{(0,0,0)\}$.
The strong approximation conjecture for the Markoff equation (1)
states that for every prime $p$, the reduction mod $p$ of the set
of Markoff triples $M\to X^*(p)$ is onto. This
is clearly equivalent to Γ acting transitively on $X^*(p)$.
Recently, Bourgain, Gamburd and Sarnak proved this conjecture for
all primes outside of a small exceptional set:
Theorem 1.1 [BGS17].
Let $E$ be the set of primes for which Γ
does not act transitively on $X^*(p)$. For any $\varepsilon>0$,
the number of primes $p\le T$ with $p\in E$ is at most $T^{\varepsilon}$,
for $T$ large enough.
Moreover, for any $\varepsilon>0$, the largest Γ-orbit in
$X^*(p)$ is of size at least $|X^*(p)|-p^{\varepsilon}$,
for $p$ large enough (whereas $|X^*(p)|\sim p^2$).
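For small primes the orbit structure can be computed directly. The sketch below is illustrative only (the function names are assumptions, and the brute-force enumeration is practical only for small p). It lists the non-zero solutions modulo p and splits them into Γ-orbits, using the three Vieta involutions and two transpositions as generators.

```python
def nonzero_solutions(p):
    """The set X*(p) of non-zero solutions of x^2 + y^2 + z^2 = xyz modulo p."""
    r = range(p)
    sols = {(x, y, z) for x in r for y in r for z in r
            if (x * x + y * y + z * z - x * y * z) % p == 0}
    return sols - {(0, 0, 0)}

def generators(p):
    """Generators of the Markoff group acting on triples modulo p."""
    return [
        lambda t: ((t[1] * t[2] - t[0]) % p, t[1], t[2]),  # R1
        lambda t: (t[0], (t[0] * t[2] - t[1]) % p, t[2]),  # R2
        lambda t: (t[0], t[1], (t[0] * t[1] - t[2]) % p),  # R3
        lambda t: (t[1], t[0], t[2]),                      # transposition (1 2)
        lambda t: (t[0], t[2], t[1]),                      # transposition (2 3)
    ]

def orbit_sizes(p):
    """Sorted sizes of the orbits of the Markoff group on X*(p)."""
    remaining = nonzero_solutions(p)
    gens = generators(p)
    sizes = []
    while remaining:
        start = remaining.pop()
        orbit, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for g in gens:
                s = g(t)
                if s not in orbit:
                    orbit.add(s)
                    stack.append(s)
        remaining -= orbit
        sizes.append(len(orbit))
    return sorted(sizes)
```

A single entry in the returned list means the action on X*(p) is transitive for that prime; for p = 5, 7, 11, 13 the single orbit has size p(p+3) when p ≡ 1 (4) and p(p−3) when p ≡ 3 (4).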
Let $\Gamma_p$ be the finite permutation
group induced by the action of Γ on $X^*(p)$.
In the current work we study the nature of this group. The first step
here is to notice that $\Gamma_p$ preserves a block structure as
follows:
For $(x,y,z)\in X^*(p)$ denote by $[x,y,z]$
the block of all solutions obtained from $(x,y,z)$ by
sign changes, so
$$[x,y,z]=\{(x,y,z),\,(x,-y,-z),\,(-x,y,-z),\,(-x,-y,z)\}.$$
Then $\Gamma_p$ preserves this block structure. Let $Y^*(p)$
denote the set of blocks in $X^*(p)$, and $Q_p$
denote the permutation group induced by the action of Γ (or
$\Gamma_p$) on $Y^*(p)$. Simulations suggest the
following conjecture:
Conjecture 1.2.
For every $p\ge 5$, the permutation
group $Q_p$ is the full alternating or symmetric group.
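A small experiment in the spirit of these simulations can be sketched as follows. This is an illustrative sketch, not the authors' code, and all function names are assumptions. It builds the set of blocks Y*(p), where a block consists of a solution together with the solutions obtained by negating two coordinates, and computes the sign of the permutation that each generator induces on the blocks. If some generator is odd, the induced group is not contained in the alternating group.

```python
def block_of(t, p):
    """Canonical representative of the block [x, y, z] (negate any two coordinates)."""
    x, y, z = t
    return min((x, y, z), (x, -y % p, -z % p), (-x % p, y, -z % p), (-x % p, -y % p, z))

def block_set(p):
    """The set Y*(p) of blocks of non-zero solutions modulo p."""
    r = range(p)
    return {block_of((x, y, z), p) for x in r for y in r for z in r
            if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}

def sign_of(perm):
    """Sign of a permutation stored as a dict point -> image."""
    seen, sign = set(), 1
    for start in perm:
        length, t = 0, start
        while t not in seen:
            seen.add(t)
            t = perm[t]
            length += 1
        if length > 0 and length % 2 == 0:  # an even-length cycle is an odd permutation
            sign = -sign
    return sign

def generator_signs(p):
    """Signs of the permutations induced on Y*(p) by generators of the Markoff group."""
    moves = {
        "R1": lambda t: ((t[1] * t[2] - t[0]) % p, t[1], t[2]),
        "R2": lambda t: (t[0], (t[0] * t[2] - t[1]) % p, t[2]),
        "R3": lambda t: (t[0], t[1], (t[0] * t[1] - t[2]) % p),
        "tau12": lambda t: (t[1], t[0], t[2]),
        "tau23": lambda t: (t[0], t[2], t[1]),
    }
    blocks = block_set(p)
    return {name: sign_of({b: block_of(g(b), p) for b in blocks})
            for name, g in moves.items()}
```

The signs alone do not decide the conjecture; they only separate the case where the group lies inside the alternating group from the case where it does not.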
This conjecture was also raised, independently, in [CGMP16, Conjecture 1.3],
where the authors also state precisely for which primes one can expect
the alternating group ($p\equiv 3 \bmod 16$) and for which the full symmetric
group ($p\not\equiv 3 \bmod 16$). If this conjecture holds, then roughly
speaking (we give the precise formulation in Theorem 1.6
below), Γ acts transitively on the solutions of (1)
modulo $n$, for every square-free $n$.
Here we prove this conjecture for most primes. More precisely,
we prove it for every $p\equiv 1\ (4)$ outside the exceptional
set from Theorem 1.1, and for density-1 of the primes $p\equiv 3\ (4)$:
Theorem 1.3.
If $p\equiv 1\ (4)$ and $Q_p$ is transitive,
then $Q_p$ is the full alternating or symmetric group on $Y^*(p)$.
Namely, $Q_p$ is the full alternating or symmetric group for all
$p\equiv 1\ (4)$ outside the exceptional set from Theorem
1.1. In fact, our proof yields that for every $p\equiv 1\ (4)$,
the group Γ acts as the full alternating or symmetric group
on the large component described in Theorem 1.1. In the
case $p\equiv 3\ (4)$, our proof is more involved and requires
one further assumption:
Theorem 1.4.
Let $p$
be a prime. Assume that:
•
$p\equiv 3\ (4)$.
•
$Q_p$ is transitive.
•
The order of $\frac{3+\sqrt{5}}{2}\in\mathbb{F}_{p^2}$ is at least
$32\sqrt{p+1}$.
Then $Q_p$ is the full alternating or symmetric group on $Y^*(p)$.
The number $\frac{3+\sqrt{5}}{2}$ is related to the special solution
$[3,3,3]\in Y^*(p)$: its order inside $\mathbb{F}_{p^2}$
gives the length of the cycle of the transformation $[x,y,z]\mapsto[x,z,xz-y]$
containing the element $[3,3,3]$. For details see Sections
2 and 4.1.
As shown in Appendix A, the condition regarding
the order of $\frac{3+\sqrt{5}}{2}$ is satisfied for density-1 of
the primes (see footnote 2), hence
Corollary 1.5.
For density-1 of all
primes $p\equiv 3\ (4)$, the group $Q_p$ is the full alternating
or symmetric group on $Y^*(p)$.
(Footnote 2: A set of primes $A$ has density 1 if $\lim_{n\to\infty}\frac{|A\cap P_n|}{|P_n|}=1$,
where $P_n=\{1<p\le n \mid p \text{ is prime}\}$.
In fact, the set of primes for which $\frac{3+\sqrt{5}}{2}$ has order
at least $32\sqrt{p+1}$ satisfies something slightly stronger than
density 1; see Appendix A.)
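The link between the order of ω = (3+√5)/2 and the cycle through [3,3,3] can be checked numerically for small primes. The sketch below is an illustration under stated assumptions: the function names are made up, ω is modelled as the class of t in F_p[t]/(t² − 3t + 1) (so that ω + ω⁻¹ = 3), and p = 2, 3, 5 are excluded because for them either [3,3,3] is not a valid non-zero, non-parabolic solution or this quotient ring does not model ω faithfully.

```python
def order_in_quadratic_ring(p, sign=1):
    """Multiplicative order of sign*t in F_p[t]/(t^2 - 3t + 1); t stands for omega."""
    def mul(u, v):
        a, b = u
        c, d = v
        # (a + b t)(c + d t) reduced with t^2 = 3t - 1
        return ((a * c - b * d) % p, (a * d + b * c + 3 * b * d) % p)

    g = (0, sign % p)
    power, n = g, 1
    while power != (1, 0):
        power = mul(power, g)
        n += 1
    return n

def cycle_length_of_333(p):
    """Length of the cycle of [x,y,z] -> [x,z,xz-y] through the block [3,3,3] in Y*(p)."""
    def block_of(t):
        x, y, z = t
        return min((x, y, z), (x, -y % p, -z % p), (-x % p, y, -z % p), (-x % p, -y % p, z))

    start = block_of((3, 3, 3))
    t, n = (3, 3, 3), 0
    while True:
        t = (t[0], t[2], (t[0] * t[2] - t[1]) % p)
        n += 1
        if block_of(t) == start:
            return n
```

For example, for p = 7 both ω and −ω have order 8 and the cycle through [3,3,3] has length 4, in line with the relation d = max(|ω|, |−ω|)/2 from Section 2.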
Composite Moduli
Let $n$ be a positive integer which is square-free, so $n=p_1\cdots p_k$
where $p_1,\ldots,p_k$ are distinct primes. Let $X(n)$
denote the set of solutions to the Markoff equation (1)
in $\mathbb{Z}/n\mathbb{Z}$. By the Chinese Remainder
Theorem, $X(n)=X(p_1)\times\ldots\times X(p_k)$,
and let $X^*(n)=X^*(p_1)\times\ldots\times X^*(p_k)$
be the set of solutions which are non-zero modulo each of the primes
composing $n$. The action of Γ on $X(n)$
is the diagonal action on the $X(p_i)$, and
the subset $X^*(n)$ is invariant under this action.
Denote the corresponding permutation group by $\Gamma_n$.
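The product decomposition given by the Chinese Remainder Theorem is easy to confirm by counting. A minimal sketch, with illustrative function names that are not from the paper and a brute-force enumeration that is only practical for small n:

```python
def solutions(n):
    """All solutions of x^2 + y^2 + z^2 = xyz modulo n, including (0, 0, 0)."""
    r = range(n)
    return {(x, y, z) for x in r for y in r for z in r
            if (x * x + y * y + z * z - x * y * z) % n == 0}

def star_solutions(n, primes):
    """Solutions modulo n that are non-zero modulo each prime in `primes`."""
    return {t for t in solutions(n)
            if all(any(c % q for c in t) for q in primes)}
```

For n = 35 this gives |X(35)| = |X(5)|·|X(7)| = 41·29 and |X*(35)| = 40·28.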
Is the action on X∗(n) transitive? It turns out that
this would follow from Conjecture 1.2 and
indeed holds true for the cases of that conjecture we establish:
Theorem 1.6.
Let $n=p_1\cdots p_k$
be a product of distinct primes. If for every $j=1,\ldots,k$, $Q_{p_j}\ge\mathrm{Alt}(Y^*(p_j))$,
then Γ acts transitively on $X^*(n)$.
In particular, if Conjecture 1.2 holds,
then Γ acts transitively on $X^*(n)$ for every
square-free $n$.
Corollary 1.7.
Let $P$
denote the set of primes that satisfy the assumptions of Theorem 1.3
or of Theorem 1.4.
Then for every set of distinct primes $p_1,\ldots,p_k\in P$,
Γ acts transitively on $X^*(p_1\cdots p_k)$.
Bourgain, Gamburd and Sarnak already proved Corollary 1.7
for primes p≡1(4) for which Γp is transitive.
This result should appear in the series announced in [BGS16].
We stress that our proof is entirely different: while Bourgain, Gamburd
and Sarnak improve their techniques from the proof of Theorem 1.1
so that the argument works for several primes simultaneously, our proof
is group-theoretic and uses Theorem 1.1 as a black box.
Both proofs rely on solutions containing the parabolic elements ±2
– see Figure 1 and Section
2.
For $n=p_1\cdots p_k$ as above, we use the notation $Y^*(n)=Y^*(p_1)\times\ldots\times Y^*(p_k)$
for the set of blocks in $X^*(n)$ and $Q_n$
for the permutation group induced by the action of Γ on $Y^*(n)$.
Note that these blocks are given by sign changes modulo every prime
separately and are usually of size $4^k$ each (if all primes are
odd). It is quite straightforward to prove that under the assumptions
of Theorem 1.6,
Γ acts transitively on $Y^*(n)$, using composition
factors of $Q_n$. It requires some further argument to show that
Γ acts transitively on the full set $X^*(n)$.
We elaborate in Section 5.
Remark 1.8 (Regarding the classification of finite simple groups).
At this point we would like to remark on the dependence of our results
on the Classification of Finite Simple Groups (CFSG).
We use the classification only in the proof of Theorem 1.4:
we first give an elementary proof that for a prime $p$ satisfying
the assumptions in the theorem, $Q_p$ is a primitive permutation
group (recall that a permutation group $G\le\mathrm{Sym}(m)$
is called primitive if it does not preserve any non-trivial block structure;
in particular, if $m\ge 3$, $G$ must be transitive), and then rely on (results depending on) the CFSG to deduce that
Qp is the full alternating or symmetric group. If we rely on
Theorem 1.6
to deduce Corollary 1.7,
the latter also becomes partly dependent on the CFSG. This can be
avoided, however, and to this aim we also give a proof that Γ
acts transitively on X∗(n) assuming only that Qp1,…,Qpk
are primitive permutation groups, without using the CFSG (see Theorem
1.9 below). To sum
up, the only results depending on the CFSG are Theorem 1.4,
Corollary 1.5, and the
part of Theorem 1.11 relating to primes
p≡3(4). In contrast, Theorems 1.3
and 1.6 and
Corollary 1.7
do not depend on the CFSG. We illustrate this in Figure 1.
Indeed, the following result does not depend on the CFSG:
Theorem 1.9.
Let $n=p_1\cdots p_k$
be a product of distinct primes. If $Q_{p_1},\ldots,Q_{p_k}$
are primitive permutation groups, then Γ acts transitively
on $X^*(n)$.
T2-systems
Let $G$ be a finitely generated group and $F_r$ the free group
on $r$ generators. A normal subgroup $N\trianglelefteq F_r$ is
said to be $G$-defining if $F_r/N\cong G$.
Denote by $\Sigma_r(G)$
the set of $G$-defining normal subgroups in $F_r$. Consider the
action of $\mathrm{Aut}(F_r)$ (in fact, of $\mathrm{Out}(F_r)$)
on $\Sigma_r(G)$. The orbits of this action are called
$T_r$-systems of $G$.
The following theorem is due to Gilman (for $r\ge 4$) and Evans (who
extended to $r=3$):
Theorem 1.10 [Gil77, Eva93].
Let $G$ be
a finite non-abelian simple group and $r\ge 3$. Then $\mathrm{Aut}(F_r)$
acts on at least one $T_r$-system of $G$ as the alternating or
symmetric group.
In fact, Gilman and Evans provide more information about the special
Tr-system on which Aut(Fr) acts as
the full alternating or symmetric group, and show it is especially
large. Gilman also showed that for G=PSL(2,p)
with p≥5 prime, there is only one Tr-system for r≥3.
Namely, he proved that Aut(Fr) acts transitively
on Σr(G). Theorem 1.10 says, of
course, that the permutation group in this case is the alternating
or symmetric group. For more details we refer the reader to the beautiful
surveys [Pak01, Lub11].
When r=2, the action of Aut(F2) on Σ2(G)
is not transitive for any finite non-abelian simple group G. In
fact, the number of T2-systems tends to infinity as ∣G∣→∞
[GS09]. The main reason for this phenomenon
is that if $\{a,b\}$ is a set of generators of $F_2$,
and $\varphi\colon F_2\twoheadrightarrow G$ an epimorphism, then
the set of conjugacy classes of
$\varphi([a,b])$ and of $\varphi([a,b])^{-1}$
(here, $[a,b]$ denotes the commutator $aba^{-1}b^{-1}$)
is a well-defined invariant of the $G$-defining subgroup $N=\ker\varphi$,
which is also invariant under $\mathrm{Aut}(F_2)$.
We elaborate more in Section 6.
Our result sheds more light on the case of $T_2$-systems for $G=\mathrm{PSL}(2,p)$.
If $A,B\in\mathrm{SL}(2,p)$ and we denote $x=\mathrm{tr}(A)$,
$y=\mathrm{tr}(B)$ and $z=\mathrm{tr}(AB)$,
then
$$\mathrm{tr}([A,B])=x^2+y^2+z^2-xyz-2.$$
In Section 6 it is explained why the map $(A,B)\mapsto(\mathrm{tr}(A),\mathrm{tr}(B),\mathrm{tr}(AB))$
yields a bijection between the elements in $\Sigma_2(\mathrm{PSL}(2,p))$
with associated trace $-2$ and the elements of $Y^*(p)$.
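The classical trace identity tr([A,B]) = x² + y² + z² − xyz − 2 for matrices of determinant 1 explains why commutator trace −2 corresponds to the Markoff equation. It can be spot-checked on random matrices; the sketch below is illustrative (function names are assumptions) and, for simplicity, only samples matrices whose top-left entry is non-zero.

```python
import random

def trace_identity_holds(p, trials=200, seed=0):
    """Check tr([A,B]) = x^2 + y^2 + z^2 - xyz - 2 on random pairs in SL(2, p)."""
    rng = random.Random(seed)

    def random_sl2():
        # choose a != 0, b, c freely and solve ad - bc = 1 for d
        a = rng.randrange(1, p)
        b, c = rng.randrange(p), rng.randrange(p)
        d = (1 + b * c) * pow(a, -1, p) % p
        return (a, b, c, d)

    def mul(m, n):
        a, b, c, d = m
        e, f, g, h = n
        return ((a * e + b * g) % p, (a * f + b * h) % p,
                (c * e + d * g) % p, (c * f + d * h) % p)

    def inv(m):
        a, b, c, d = m  # inverse of a determinant-1 matrix
        return (d, -b % p, -c % p, a)

    def tr(m):
        return (m[0] + m[3]) % p

    for _ in range(trials):
        A, B = random_sl2(), random_sl2()
        x, y, z = tr(A), tr(B), tr(mul(A, B))
        commutator = mul(mul(A, B), mul(inv(A), inv(B)))
        if tr(commutator) != (x * x + y * y + z * z - x * y * z - 2) % p:
            return False
    return True
```

The modular inverse via `pow(a, -1, p)` requires Python 3.8 or later.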
In this language, the main result of [BGS17] –
Theorem 1.1 above – says that outside the exceptional
set of primes, these elements form a single T2-system. See [MW13]
for an extensive survey of the connection between the Markoff equation
(1) and T2-systems of PSL(2,p).
Through this connection, Theorems 1.3 and 1.4
translate to a result in the spirit of Theorem 1.10:
Theorem 1.11.
Assume that the prime $p$ satisfies
the assumptions of Theorem 1.3 or of Theorem 1.4.
Then $\mathrm{Aut}(F_2)$ acts on the trace-$(-2)$ $T_2$-system of $\mathrm{PSL}(2,p)$ as the full alternating
or symmetric group.
The paper is organized as follows. Section 2
gives some more notation and collects some results from [BGS17]
we use here. In the short Section 3
and longer Section 4 we prove
Theorem 1.3 for p≡1(4) and Theorem
1.4 for p≡3(4),
respectively. Section 5
is dedicated to proving the transitivity of Γ in certain composite
moduli: first assuming the groups Qp contain the alternating
group (in Section 5.1), and then assuming
only that Qp is primitive (Section 5.3).
In Section 6 we give some background on T-systems
and prove Theorem 1.11. Finally, Appendix
A, by Dan Carmon, shows that the assumption in
Theorem 1.4 regarding the order
of $\frac{3+\sqrt{5}}{2}\in\mathbb{F}_{p^2}$ holds for most primes.
Acknowledgments
We are indebted to Peter Sarnak for his encouragement, and for stimulating
discussions, enlightening suggestions and clever advice. We would
also like to thank Zeev Rudnick and Pär Kurlberg for beneficial comments,
and Dan Carmon for writing the useful Appendix A.
We have benefited much from the mathematical open source community,
and in particular from SageMath. Author Meiri was supported by BSF
grant 2014099 and ISF grant 662/15. Author Puder was supported by
the Rothschild fellowship, by the NSF under agreement No. DMS-1128155
and by the ISF grant 1071/16. Author Carmon was supported by the European
Research Council under the European Union’s Seventh Framework Programme
(FP7/2007-2013) / ERC grant agreement no 320755.
2 Preliminaries
Before proving our main results, let us describe some further notation
and collect further results from [BGS17] that we
use below.
Further notation
•
We already introduced above the notation $[x,y,z]$ for
the block of the solution $(x,y,z)$ in $X^*(p)$,
so $[x,y,z]\in Y^*(p)$. We also use this
notation for a composite (square-free) modulus $n$: here $[x,y,z]$
is the element (block) in $Y^*(n)$ containing the solution
$(x,y,z)$.
•
Some elements in Γ are permutations of the three coordinates
of solutions. We denote these elements by $\tau_{(1\,2)}$
for the permutation exchanging the first and second coordinates, by
$\tau_{(1\,2\,3)}$ for the cyclic permutation and so on. By
abuse of notation, we use the same notation for the corresponding
elements in Γ, $\Gamma_p$, $Q_p$, $\Gamma_n$ and
$Q_n$.
•
The analysis in [BGS17], as well as in the current
work, relies heavily on three “rotation” elements
$\mathrm{rot}_1,\mathrm{rot}_2,\mathrm{rot}_3\in\Gamma$.
They are defined by
[TABLE]
(the indices are taken modulo 3). For example, $(x,y,z)\overset{\mathrm{rot}_1}{\mapsto}(x,z,xz-y)$.
The rotation $\mathrm{rot}_j$ fixes the $j$-th coordinate and its action
on $X^*(p)$ and on $Y^*(p)$ is completely
analyzed in [BGS17] – see Lemmas 2.2
and 2.3 below. Again, by abuse
of notation we write $\mathrm{rot}_i$ for the rotation element in the different
groups Γ, $\Gamma_p$, $Q_p$, $\Gamma_n$ and $Q_n$.
•
Following [BGS17], we denote the “conic sections”
by $C_j(a)$, $j=1,2,3$.
These are defined as
$$C_j(a)=\{(x_1,x_2,x_3)\in X^*(p)\,:\,x_j=a\}.$$
When we write $C_j(\pm a)$,
we mean the conic section in $Y^*(p)$:
$$C_j(\pm a)=\{[x_1,x_2,x_3]\in Y^*(p)\,:\,x_j=\pm a\}.$$
•
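For every prime $p$ we let $i$ denote a square root
of $-1$ (in $\mathbb{F}_p$ or in $\mathbb{F}_{p^2}$).
•
For $x\in\mathbb{Z}/p\mathbb{Z}$ we use the standard
Legendre symbol $\left(\frac{x}{p}\right)$
to denote the image of $x$ under the character of order 2. Namely,
$$\left(\frac{x}{p}\right)=\begin{cases}1 & \text{if } x \text{ is a non-zero square},\\ -1 & \text{if } x \text{ is a non-square},\\ 0 & \text{if } x=0.\end{cases}$$
•
The notation $|x|$ is used to denote the order of the
group element $x\in G$ in the group $G$.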
Rotation elements
The action of $\mathrm{rot}_1$ on the conic section $C_1(x)\subseteq X^*(p)$
is a linear map on the last two coordinates given by the matrix
$$\begin{pmatrix}0 & 1\\ -1 & x\end{pmatrix}. \qquad (2)$$
The eigenvalues of this matrix are given by $\frac{x\pm\sqrt{x^2-4}}{2}$.
This leads to the following definitions and lemmas from [BGS17]:
Definition 2.1.
•
An element $x\in\mathbb{F}_p$ is called hyperbolic
if $x^2-4$ is a square in $\mathbb{F}_p^*$.
•
An element $x\in\mathbb{F}_p$ is called elliptic
if $x^2-4$ is a non-square in $\mathbb{F}_p^*$.
•
An element $x\in\mathbb{F}_p$ is called parabolic
if $x^2-4=0$ in $\mathbb{F}_p$, namely, if $x=\pm 2$.
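The three types can be computed with Euler's criterion. A minimal sketch (the function names are illustrative, not from the paper) that classifies an element and counts the types up to sign:

```python
def classify(x, p):
    """Type of x in F_p (p an odd prime), according to whether x^2 - 4 is a square."""
    disc = (x * x - 4) % p
    if disc == 0:
        return "parabolic"
    # Euler's criterion: disc is a non-zero square iff disc^((p-1)/2) = 1 mod p
    return "hyperbolic" if pow(disc, (p - 1) // 2, p) == 1 else "elliptic"

def counts_up_to_sign(p):
    """Number of elements of each type, counting each pair {x, -x} once."""
    counts = {"parabolic": 0, "hyperbolic": 0, "elliptic": 0}
    for x in range((p + 1) // 2):  # 0, 1, ..., (p-1)/2 represent the classes {x, -x}
        counts[classify(x, p)] += 1
    return counts
```

For p = 13 this gives one parabolic, three hyperbolic and three elliptic classes; for p = 7 it gives one parabolic, one hyperbolic and two elliptic classes, the latter including 0.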
Notice that this categorization of the elements is invariant under
sign change x↦−x. The following lemmas are based on Lemmas
3-5 of [BGS17] which describe the action of roti
on X∗(p). We adapt them below in order to describe
the action of roti on Y∗(p) and add some further
details, all of which follow easily from Section 2.1 in [BGS17].
We state the lemmas for $C_1(\pm x)$, but the same statements
hold, evidently, for $C_2(\pm x)$ and for $C_3(\pm x)$.
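Lemma 2.2.
Let $p\equiv 1\ (4)$.
•
$|C_1(\pm 2)|=p$. The permutation induced
by $\mathrm{rot}_1$ on $C_1(\pm 2)$ consists of a single $p$-cycle.
•
There are $\frac{p-1}{4}$ hyperbolic elements up to sign. For $x$
hyperbolic, $|C_1(\pm x)|=\frac{p-1}{2}$.
Let $\omega^{\pm 1}\in\mathbb{F}_p$ be the eigenvalues of the matrix
(2), so $x=\omega+\omega^{-1}$. The permutation
induced by $\mathrm{rot}_1$ on $C_1(\pm x)$ consists of $\frac{p-1}{2d}$
cycles of length $d$ each, where $d=\frac{\max(|\omega|,|-\omega|)}{2}$
and $|\omega|$ is the order of $\omega$ in the multiplicative
group $\mathbb{F}_p^*$. The solutions in $C_1(x)$
have the form $(x,\alpha+\beta,\alpha\omega+\beta\omega^{-1})$
for $\alpha,\beta\in\mathbb{F}_p^*$ with $\alpha\beta=\frac{x^2}{x^2-4}$,
and
[TABLE]
•
There are $\frac{p-1}{4}$ elliptic elements up to sign. For $x$
elliptic, $|C_1(\pm x)|=\frac{p+1}{2}$. Define
$\omega$ as for hyperbolic elements by $x=\omega+\omega^{-1}$, only
now $\omega\in\mathbb{F}_{p^2}\setminus\mathbb{F}_p$. The permutation
induced by $\mathrm{rot}_1$ on $C_1(\pm x)$ consists of $\frac{p+1}{2d}$
cycles of length $d$ each, where $d=\frac{\max(|\omega|,|-\omega|)}{2}$
and $|\omega|$ is the order of $\omega$ in the multiplicative
group $\mathbb{F}_{p^2}^*$. Moreover, $\omega^{p+1}=1$, i.e. $|\omega|$ divides $p+1$.
The solutions in $C_1(x)$ have the form $(x,A+A^p,A\omega+A^p\omega^{-1})$
with $A\in\mathbb{F}_{p^2}^*$ and $A^{p+1}=\frac{x^2}{x^2-4}$,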
and
When $p\equiv 3\ (4)$, our results are somewhat weaker and
the proofs more involved. The main reason for that is the lack of
solutions with the parabolic elements $\pm 2$:
Lemma 2.3.
Let $p\equiv 3\ (4)$.
•
There are no solutions in $Y^*(p)$ involving the parabolic
elements $\pm 2$, nor the elliptic element $0$.
•
There are $\frac{p-3}{4}$ hyperbolic elements up to sign. For $x$
hyperbolic, the size and structure of $C_1(\pm x)$ and
the action of $\mathrm{rot}_1$ on $C_1(\pm x)$ have the same
properties as for $x$ hyperbolic when $p\equiv 1\ (4)$ (see
Lemma 2.2).
•
There are $\frac{p-3}{4}$ non-zero elliptic elements up to sign.
For $x$ elliptic, the size and structure of $C_1(\pm x)$
and the action of $\mathrm{rot}_1$ on $C_1(\pm x)$ have the
same properties as for $x$ elliptic when $p\equiv 1\ (4)$
(see Lemma 2.2).
We denote by $d_p(\pm x)$
the order of $\mathrm{rot}_1\in Q_p$ in its action on $C_1(\pm x)$.
Namely, the solutions with first coordinate $\pm x$ in $Y^*(p)$
belong to cycles of length $d_p(\pm x)$.
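The cycle structure described in these lemmas can be observed directly for small primes. The following sketch is illustrative only (the function name is an assumption, and x is expected as a residue in the range 0 to p−1). It lists the blocks with first coordinate ±x and records the cycle lengths of the rotation [x,y,z] ↦ [x,z,xz−y] on them.

```python
from collections import Counter

def rot1_cycle_lengths(p, x):
    """Multiset of cycle lengths of rot_1 acting on C_1(±x) inside Y*(p)."""
    def block_of(t):
        a, b, c = t
        return min((a, b, c), (a, -b % p, -c % p), (-a % p, b, -c % p), (-a % p, -b % p, c))

    # every block with first coordinate ±x has a member whose first coordinate is x
    conic = {block_of((x, y, z)) for y in range(p) for z in range(p)
             if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}
    lengths, seen = Counter(), set()
    for start in conic:
        if start in seen:
            continue
        t, n = start, 0
        while t not in seen:
            seen.add(t)
            t = block_of((t[0], t[2], (t[0] * t[2] - t[1]) % p))
            n += 1
        lengths[n] += 1
    return lengths
```

For p = 13 the parabolic conic C_1(±2) is a single 13-cycle, the hyperbolic conic C_1(±1) consists of two 3-cycles, and the elliptic conic C_1(±3) is a single 7-cycle.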
3 Alternating Group for p≡1(4)
This section contains the proof of Theorem 1.3, which
states that if p≡1(4) and Qp is transitive,
then Qp contains the entire alternating group Alt(Y∗(p)).
As mentioned above, the existence of parabolic elements when p≡1(4)
allows a rather short argument in this case.
Proof of Theorem 1.3.
Assume $p\equiv 1\ (4)$, and let $\mathrm{rot}_1\in Q_p$ be
the rotation element defined in Section 2. This element
has one $p$-cycle, while all its other cycles have length coprime
to $p$ (see Table 1). Thus its power $\sigma=\mathrm{rot}_1^{\,|\mathrm{rot}_1|/p}\in Q_p$
is a $p$-cycle. As $|Y^*(p)|=\frac{p(p+3)}{4}\ge p+3$,
it is now sufficient to show, by Jordan’s Theorem (Theorem 3.1:
a primitive permutation group of degree $n$ containing a cycle of prime
length $p\le n-3$ contains $\mathrm{Alt}(n)$), that $Q_p$ is primitive in $\mathrm{Sym}(Y^*(p))$.
We need to show that the group $Q_p$ preserves no non-trivial block
structure. Assume there is a block structure $\{B_1,\ldots,B_m\}$
preserved by $Q_p$. So $\bigcup B_i=Y^*(p)$ and
$B_i\cap B_j=\emptyset$ for $i\neq j$, and for every $g\in Q_p$
and every $i$, $g(B_i)=B_j$ for some $j$.
Consider $C_1(\pm 2)\subset Y^*(p)$, the
$p$ elements contained in the cycle of size $p$ in $\sigma$. The
set $C_1(\pm 2)$ must be contained in a block, for otherwise
it has to be the union of several equally-sized blocks, but $p$ is
prime. Say $C_1(\pm 2)\subseteq B_1$. So $B_1$ contains
all solutions with $\pm 2$ in the first coordinate. In particular,
it contains $[2,2,2+2i]$ and $[2,2+2i,2]$.
But the same argument with $\mathrm{rot}_2$ and $\mathrm{rot}_3$ shows that
$B_1$ contains all solutions with $\pm 2$ in any coordinate. So
$B_1$ is invariant under all three rotations and under all permutations
of coordinates, and therefore invariant under the action of the whole
group $Q_p$. By the transitivity of $Q_p$, $B_1=Y^*(p)$.
∎
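The first step of this argument, that a suitable power of the rotation is a single p-cycle, can be checked for small primes p ≡ 1 (4). The sketch below is illustrative and its function names are assumptions. It computes the cycle lengths of the rotation [x,y,z] ↦ [x,z,xz−y] on all of Y*(p), raises the permutation to the power k = (order)/p, and returns the lengths of the non-trivial cycles that survive.

```python
from math import gcd

def rot1_cycle_lengths_on_blocks(p):
    """Cycle lengths of rot_1 acting on all of Y*(p)."""
    def block_of(t):
        a, b, c = t
        return min((a, b, c), (a, -b % p, -c % p), (-a % p, b, -c % p), (-a % p, -b % p, c))

    r = range(p)
    blocks = {block_of((x, y, z)) for x in r for y in r for z in r
              if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}
    lengths, seen = [], set()
    for start in blocks:
        if start in seen:
            continue
        t, n = start, 0
        while t not in seen:
            seen.add(t)
            t = block_of((t[0], t[2], (t[0] * t[2] - t[1]) % p))
            n += 1
        lengths.append(n)
    return lengths

def sigma_cycle_type(p):
    """Non-trivial cycle lengths of sigma = rot_1^(|rot_1| / p)."""
    lengths = rot1_cycle_lengths_on_blocks(p)
    order = 1
    for n in lengths:
        order = order * n // gcd(order, n)  # running least common multiple
    k = order // p
    result = []
    for n in lengths:
        g = gcd(n, k)  # an n-cycle to the k-th power splits into g cycles of length n/g
        if n // g > 1:
            result.extend([n // g] * g)
    return sorted(result)
```

For p = 5, 13, 17 the result is the single cycle length p, so σ is a p-cycle as used in the proof.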
Remark 3.2.
The proof of Theorem 1.1 in [BGS17] shows
that for every prime p, the large component of X∗(p)
contains all solutions with parabolic (±2) coordinates.
Thus, our proof of Theorem 1.3 applies to the general
case: the group Γ acts on the large component of Y∗(p)
as the alternating or symmetric group.
4 Alternating Group for p≡3(4)
In the case where p≡3(4), there are no parabolic
elements, and in Sections 4.1
and 4.2 we establish the primitivity
of Qp for density-1 of these primes rather than for all those
outside the exceptional set from Theorem 1.1. We also rely
on much deeper theorems, involving the classification of finite simple
groups (CFSG), to conclude in Section 4.3
that whenever Qp is primitive, it contains Alt(Y∗(p)).
Throughout this section, we assume that p≡3(4).
4.1 Primitivity of Qp when p≡3(4)
In this subsection we prove that under the assumptions of Theorem
1.4, the permutation
group $Q_p$ is primitive. Namely,
Theorem 4.1.
Let $p$ be prime with $p\equiv 3\ (4)$.
Assume that $Q_p$ is transitive and that the order of $\frac{3+\sqrt{5}}{2}\in\mathbb{F}_{p^2}$
is at least $32\sqrt{p+1}$. Then $Q_p$ is primitive.
To establish primitivity of Qp, one needs to show there are
no non-trivial blocks in the action of Qp on Y∗(p):
a block is a subset B⊆Y∗(p), such that for
every g∈Qp, either g.B=B or g.B∩B=∅. As
Qp is assumed to be transitive, if B is proper (B⫋Y∗(p))
and of size at least two, then the subsets {g.B∣g∈Qp}
constitute a partition of Y∗(p) which is a non-trivial
block structure preserved under the action of Qp. So proving
Qp is primitive is equivalent to showing that every proper block
is a singleton.
The proof of Theorem 4.1 relies
on the following two propositions which contain properties of blocks
in Y∗(p). We defer the proofs of these two propositions
to the next subsection, and complete the proof of Theorem 4.1
in the current subsection, assuming the two propositions.
We say that some coordinate j∈{1,2,3} is homogeneous
in a block B⊆Y∗(p) if the j-th coordinate
of every solution in B is of the same type (either all hyperbolic
or all elliptic).
Proposition 4.2.
Let p≡3(4).
Assume that Qp acts transitively on Y∗(p),
and let B⫋Y∗(p) be a proper Qp-block.
Then at least two of the coordinates {1,2,3} are
homogeneous in B.
The most technical ingredient of the proof of primitivity is the following.
Recall that dp(±x) denotes the length of the cycles
of rot1∈Qp containing elements of C1(±x).
Proposition 4.3.
Assume
that $Q_p$ is transitive and let $x\in\mathbb{F}_p\setminus\{0,\pm 2\}$
satisfy $d_p(\pm x)\ge 16\sqrt{p+1}$. Then, for every
$j\in\{1,2,3\}$, every proper $Q_p$-block $B\subsetneqq Y^*(p)$
contains at most one solution with $j$-th coordinate $\pm x$.
The idea of the proof of this proposition is the following: assume
there are two solutions in the block $B$ with first coordinate $\pm x$.
Say these are $[x,y_0,y_1]$ and $[x,z_0,z_1]$.
Then for every $m\ge 1$, the block $\mathrm{rot}_1^{\,m}(B)$
contains the solutions $[x,y_m,y_{m+1}]$ and $[x,z_m,z_{m+1}]$
with $y_m$ and $z_m$ defined recursively by $y_{m+1}=xy_m-y_{m-1}$
and $z_{m+1}=xz_m-z_{m-1}$. By Proposition 4.2,
at least one of the two coordinates 2,3 in every block is homogeneous,
meaning that for every $m$, either $y_m$ and $z_m$ have the
same type (hyperbolic or elliptic), or $y_{m+1}$ and $z_{m+1}$ have
the same type. Using classical results in number theory, we show such
“high correlation” between two cycles of $\mathrm{rot}_1$ is impossible
whenever these cycles are long enough.
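The recursion above is compatible with the parametrization of solutions recalled in Section 2. As a short worked check, assuming only $x=\omega+\omega^{-1}$ and writing $y_m=\alpha\omega^{m}+\beta\omega^{-m}$ (the symbols $\alpha,\beta,\omega$ are those of Lemma 2.2):

```latex
\begin{aligned}
x\,y_m - y_{m-1}
  &= (\omega+\omega^{-1})\bigl(\alpha\omega^{m}+\beta\omega^{-m}\bigr)
     - \bigl(\alpha\omega^{m-1}+\beta\omega^{-(m-1)}\bigr) \\
  &= \alpha\omega^{m+1}+\beta\omega^{-(m-1)}+\alpha\omega^{m-1}+\beta\omega^{-(m+1)}
     - \alpha\omega^{m-1}-\beta\omega^{-(m-1)} \\
  &= \alpha\omega^{m+1}+\beta\omega^{-(m+1)} \;=\; y_{m+1}.
\end{aligned}
```

The same computation applies to the sequence $z_m$ with its own pair of coefficients in place of $\alpha,\beta$.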
Section 4.2 gives the details of the
proof, and assuming it, we finish the proof of Theorem 4.1.
We need the following corollary showing that elements of high order
in the sense of Proposition 4.3
appear in the same block and the same coordinate only with other elements
of the same type and the same order:
Corollary 4.4.
Assume
that $Q_p$ is transitive and that $x\in\mathbb{F}_p\setminus\{0,\pm 2\}$
satisfies $d_p(\pm x)\ge 16\sqrt{p+1}$. If $B\subsetneqq Y^*(p)$
is a proper $Q_p$-block containing some solution with first coordinate
$\pm x$, and another solution with first coordinate $\pm x'$, then
$d_p(\pm x)=d_p(\pm x')$. In particular,
$x$ and $x'$ are of the same type (both hyperbolic or both elliptic).
Proof.
Note that $\mathrm{rot}_1^{\,d_p(\pm x)}(B)=B$.
By Proposition 4.3,
$\mathrm{rot}_1^{\,m}(B)\neq B$ for $1\le m<d_p(\pm x)$.
Hence, $d_p(\pm x')$ is some multiple of $d_p(\pm x)$.
In particular, the assumption of Proposition 4.3
holds for $x'$, and by symmetry, $d_p(\pm x)$ is a
multiple of $d_p(\pm x')$. Hence $d_p(\pm x')=d_p(\pm x)$.
∎
Proof of Theorem 4.1.
Assume that $Q_p$ is transitive and $\omega=\frac{3+\sqrt{5}}{2}\in\mathbb{F}_{p^2}^*$
has order at least $32\sqrt{p+1}$. We need to show that $Q_p$
is primitive. We use the special symmetric solution $[3,3,3]\in Y^*(p)$.
Whenever $\omega\in\mathbb{F}_{p^2}$ has high order in the multiplicative
group $\mathbb{F}_{p^2}^*$, the cycle of $\mathrm{rot}_1\in Q_p$
containing the solution $[3,3,3]$ is long. More concretely,
$3=\omega+\omega^{-1}$, and by Lemma 2.3
and Table 2, $d_p(\pm 3)$ is either
$|\omega|$ or $\frac{|\omega|}{2}$, where
$|\omega|$ is the order of $\omega$ in the multiplicative
group $\mathbb{F}_{p^2}^*$. So $d_p(\pm 3)\ge 16\sqrt{p+1}$.
Assume that $[a,b,c]$ and $[3,3,3]$ are two
distinct solutions lying in the same proper $Q_p$-block $B\subsetneqq Y^*(p)$.
By Lemma 2.3, $d_p(\pm 3)\ge 16\sqrt{p+1}$,
and by Corollary 4.4,
$d_p(\pm a)=d_p(\pm b)=d_p(\pm c)=d_p(\pm 3)$.
As $[3,3,3]$ is the only solution of the form $[x,x,x]$
or $[x,x,-x]$, we can assume without loss of generality
that $\{\pm b\}\neq\{\pm c\}$. Since $\tau_{(2\,3)}$
stabilizes $[3,3,3]$, we have $\tau_{(2\,3)}(B)=B$,
so the two distinct solutions $[a,b,c]$ and $[a,c,b]$
both belong to $B$. This contradicts Proposition 4.3:
$d_p(\pm a)=d_p(\pm 3)$ is large and thus
$a$ cannot appear twice in the same coordinate in the same block.
∎
As mentioned in Section 1, the assumptions in
Theorem 4.1 hold for density-1
of the primes p≡3(4). Indeed, relying on strong
results of Ford [For08], Dan Carmon proves in
Proposition A.1 in Appendix A
that under some assumptions, the order of a quadratic integer modulo
primes is high for density-1 of the primes. From Proposition A.1
we deduce:
Corollary 4.5.
For density-1 of all primes,
the element $\omega=\frac{3+\sqrt{5}}{2}\in\mathbb{F}_{p^2}$ has
order at least $32\sqrt{p+1}$ in the multiplicative group $\mathbb{F}_{p^2}^*$,
in which case $d_p(\pm 3)\ge 16\sqrt{p+1}$.
Combining Theorem 1.1 with Corollary 4.5
shows why the assumptions in Theorem 4.1
hold for density-1 of all primes p≡3(4), hence:
Corollary 4.6.
For density-1
of all primes p≡3(4), the group Qp is primitive
in its action on Y∗(p).
Remark 4.7.
It is conceivable that there
is a stronger version of Proposition 4.3
which states there cannot be correlation between two long cycles of
$\mathrm{rot}_1\in Q_p$ even with two different first coordinates. Were
we able to prove this, we could omit the condition about the order
of $\frac{3+\sqrt{5}}{2}$ in the statements of Theorems 1.4
and 4.1 and assume only that $Q_p$
is transitive to conclude that it is primitive and, moreover, contains
$\mathrm{Alt}(Y^*(p))$. (This would make
Theorem 1.4
completely parallel to Theorem 1.3 dealing with $p\equiv 1\ (4)$.)
In Remark 4.12
below we explain the obstacle to proving this more general version
of Proposition 4.3.
4.2 Properties of blocks in the action of Qp on Y∗(p)
In the current subsection we prove the two propositions that were
stated without proof in the previous subsection. Proposition 4.2
is proved in Section 4.2.1, and
Proposition 4.3
is proved in Sections 4.2.2 (the hyperbolic
case) and 4.2.3 (the elliptic case).
4.2.1 Homogeneity of coordinates in blocks
Lemma 4.8.
The subgroup $H=\langle\mathrm{rot}_1,\mathrm{rot}_2,\mathrm{rot}_3\rangle\le\Gamma$
has index at most 2 in Γ.
Proof.
By definition, Γ is generated by the three Vieta involutions
and permutations of coordinates. Since $R_3=\mathrm{rot}_1\cdot\tau_{(2\,3)}$
and likewise for $R_1$ and $R_2$, since $\tau_{(1\,3\,2)}=\mathrm{rot}_3\cdot\mathrm{rot}_1$
and since $S_3=\langle(1\,2),(1\,3\,2)\rangle$,
we obtain that $\Gamma=\langle\mathrm{rot}_1,\mathrm{rot}_2,\mathrm{rot}_3,\tau_{(1\,2)}\rangle=\langle H,\tau_{(1\,2)}\rangle$.
It is easy to check that $\tau_{(1\,2)}\,\mathrm{rot}_j\,\tau_{(1\,2)}\in H$
for $j=1,2,3$, so $H\trianglelefteq\Gamma$ and $\Gamma=H\cdot\langle\tau_{(1\,2)}\rangle$.
This finishes the proof.
∎
Recall that Proposition 4.2 says
that if Qp acts transitively on Y∗(p), and
if B⫋Y∗(p) is a proper block of the action
of Qp on Y∗(p), then at least two of the coordinates
{1,2,3} are homogeneous in B.
Proof of Proposition 4.2.
Assume that some coordinate, say $j=1$, is not homogeneous in $B$.
We need to show that the second and third coordinates are homogeneous.
The element $\mathrm{rot}_1^{(p-1)/2}$ fixes every solution
with first coordinate hyperbolic, while $\mathrm{rot}_1^{(p+1)/2}$
fixes every solution with first coordinate elliptic. Since $B$ contains
solutions of both kinds, each of these two elements fixes a point of $B$.
Hence $B$ is invariant under both elements, and thus, as the exponents
$\frac{p-1}{2}$ and $\frac{p+1}{2}$ are coprime, under $\mathrm{rot}_1$.
By the same argument, if all three coordinates are not homogeneous,
$B$ is invariant under $H_p=\langle\mathrm{rot}_1,\mathrm{rot}_2,\mathrm{rot}_3\rangle\le Q_p$.
By Lemma 4.8, $[Q_p:H_p]\le 2$,
and transitivity implies there are at most two blocks in the action:
$B$ and $B'=\gamma(B)$ for some $\gamma\in Q_p$. But
the block containing $[3,3,3]$ is also invariant under
$\tau_{(1\,2)}$, hence is invariant under the whole of
$Q_p$, a contradiction.
Thus at least one coordinate, the second or the third,
is homogeneous. Notice that $\mathrm{rot}_1$, which stabilizes
$B$, moves the third coordinate of the solutions to the second. Hence
both the second and third coordinates must be homogeneous.
∎
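The two facts about powers of the rotation used at the start of this proof can be verified by brute force for small primes p ≡ 3 (4). This is an illustrative sketch with an assumed function name; it relies on the absence of parabolic coordinates for such primes, so it is not meant for p ≡ 1 (4).

```python
def rotation_powers_fix_blocks(p):
    """For p = 3 mod 4, check that rot_1^((p-1)/2) fixes every block whose first
    coordinate is hyperbolic and rot_1^((p+1)/2) fixes every block whose first
    coordinate is elliptic."""
    def block_of(t):
        a, b, c = t
        return min((a, b, c), (a, -b % p, -c % p), (-a % p, b, -c % p), (-a % p, -b % p, c))

    def rot1_power(b, n):
        for _ in range(n):
            b = block_of((b[0], b[2], (b[0] * b[2] - b[1]) % p))
        return b

    r = range(p)
    blocks = {block_of((x, y, z)) for x in r for y in r for z in r
              if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}
    for b in blocks:
        disc = (b[0] * b[0] - 4) % p
        hyperbolic = pow(disc, (p - 1) // 2, p) == 1  # Euler's criterion
        n = (p - 1) // 2 if hyperbolic else (p + 1) // 2
        if rot1_power(b, n) != b:
            return False
    return True
```

Because (p−1)/2 and (p+1)/2 are consecutive integers, some integer combination of them equals 1, which is why invariance under both powers gives invariance under the rotation itself.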
Remark 4.9.
In fact, the proof of the last lemma yields something slightly stronger.
Denote the type of a solution in Y∗(p) by some triple
in {h,e}3, depending on whether every coordinate
is hyperbolic or elliptic. Then, every block B as above contains
either only solutions of the same type (homogeneous in all coordinates),
or only solutions of exactly two types: one type is (h,h,h)
or (e,e,e), and the other differs from the first type
in one coordinate (the sole non-homogeneous coordinate).
4.2.2 No correlation between two long rot1-cycles with the
same first hyperbolic coordinate
We now prove Proposition 4.3
stating that if $Q_p$ is transitive and $d_p(\pm x)\ge 16\sqrt{p+1}$,
then $\pm x$ cannot appear twice in the same coordinate in the same
proper $Q_p$-block $B\subsetneqq Y^*(p)$. What we
actually prove is the lack of correlation between two long enough
cycles of $\mathrm{rot}_j$ with the same $j$-th coordinate (including
the case of two different offsets of the same cycle). The proof of
Proposition 4.3
is split into the case where $x$ is hyperbolic (in the current subsection)
and the case where it is elliptic (given in Section 4.2.3).
We use the following classical number-theoretic result:
Assume that x is hyperbolic with $d_p(\pm x)\ge 16\sqrt{p+1}$,
and that there are two elements in the proper Qp-block B⫋Y∗(p)
with ±x in the first coordinate. The same argument holds, evidently,
for every coordinate j=1,2,3.
Assume that [x,y0,y1] and [x,z0,z1]
belong to B. By Lemma 2.3,
$x=\omega+\omega^{-1}$ with $\omega\in\mathbb{F}_p^*$, and we
can assume $|\omega|=2d\ge 32\sqrt{p-1}$: otherwise, replace
x with −x and ω with −ω. Write $y_0=\alpha+\beta$
with $\alpha,\beta\in\mathbb{F}_p^*$ so that $\alpha\beta=\frac{x^2}{x^2-4}$
and $y_1=\alpha\omega+\beta\omega^{-1}$ (see Lemma 2.3).
The cycle of rot1 containing [x,y0,y1]
is
[TABLE]
with
[TABLE]
The set $\{\omega^j\}_{0\le j\le 2d-1}$ is the same
as the set $\{s^m\}_{s\in\mathbb{F}_p^*}$ where
$m=\frac{p-1}{2d}$ (with every element in $\{\omega^j\}$
covered by $\frac{p-1}{2d}$ different values of s). So as sets,
[TABLE]
The same holds for the cycle of rot1 containing [x,z0,z1]
with $\gamma,\delta\in\mathbb{F}_p^*$ in the role of α,β,
so that $z_j=\gamma\omega^j+\delta\omega^{-j}$. We may assume
that γ≠±α, for otherwise [x,y0,y1]=[x,z0,z1].
Moreover, if $s^m=\omega^j$ then $f_{\alpha,\beta}(s)=y_j$
and $f_{\gamma,\delta}(s)=z_j$.
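The parametrization used above can be checked numerically on a small example. The following Python sketch is only an illustration: it assumes that rot1 acts by (x,y,z)↦(x,z,xz−y) (consistent with the remark that rot1 moves the third coordinate to the second), and the helper names `hyperbolic_cycle`, `is_markoff` and `power_fibres` are hypothetical.

```python
def hyperbolic_cycle(p, omega, alpha):
    """Return x and the triples (x, y_j, y_{j+1}), y_j = alpha*omega^j + beta*omega^(-j)."""
    inv = lambda a: pow(a, p - 2, p)                   # inverse in F_p (a != 0)
    x = (omega + inv(omega)) % p
    # beta is forced by the relation alpha*beta = x^2 / (x^2 - 4)
    beta = x * x * inv((x * x - 4) * alpha % p) % p
    order = next(k for k in range(1, p) if pow(omega, k, p) == 1)
    ys = [(alpha * pow(omega, j, p) + beta * pow(inv(omega), j, p)) % p
          for j in range(order + 1)]
    return x, [(x, ys[j], ys[j + 1]) for j in range(order)]


def is_markoff(t, p):
    """Check x^2 + y^2 + z^2 = xyz modulo p."""
    x, y, z = t
    return (x * x + y * y + z * z - x * y * z) % p == 0


def power_fibres(p, m):
    """Map each value s^m (s in F_p^*) to the number of s producing it."""
    counts = {}
    for s in range(1, p):
        v = pow(s, m, p)
        counts[v] = counts.get(v, 0) + 1
    return counts
```

For instance, with p=19 and ω=8 (of order 2d=6, so m=3), every triple returned by `hyperbolic_cycle(19, 8, 1)` solves the Markoff equation modulo 19, and `power_fibres(19, 3)` shows that each power of ω is hit by exactly m=3 values of s.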
Notice that yj and zj are of different types (one hyperbolic
and the other elliptic) if and only if
[TABLE]
Since $[x,y_j,y_{j+1}]$ and $[x,z_j,z_{j+1}]$
both belong to the block $\mathrm{rot}_1^{\,j}(B)$, we derive
from Proposition 4.2 that (5)
cannot hold for two consecutive values of j. In the parametrization
given by s∈Fp∗, this means that
[TABLE]
cannot hold for any s∈Fp∗.
Write
[TABLE]
and $k_1(s)=g_{\alpha,\beta}(s)\,g_{\gamma,\delta}(s)$
and $k_2(s)=g_{\alpha\omega,\beta\omega^{-1}}(s)\,g_{\gamma\omega,\delta\omega^{-1}}(s)$.
Now (6) is equivalent to
[TABLE]
Denote by N(−1,−1) the number of s∈Fp
for which (7) holds. Our goal
is to show that N(−1,−1)>0, whence (7)
has some solution s≠0, yielding a contradiction (note that s=0
is not a solution to (7)). Note
that $k_1(s),k_2(s)\ne 0$ for every s∈Fp:
indeed, $g_{\alpha,\beta}(0)=\beta^2\ne 0$, and if 0≠s∈Fp
and $g_{\alpha,\beta}(s)=0$ then $f_{\alpha,\beta}(s)=\pm 2$
is $y_j$ for some j, but there are no solutions in X∗(p)
containing ±2 when p≡3(4). Therefore $\left(\frac{k_1(s)}{p}\right),\left(\frac{k_2(s)}{p}\right)\ne 0$
for s∈Fp and
We use Theorem 4.10 to estimate the MB’s. First,
we show that none of k1,k2 and k1k2 are squares
in Fp[x]. The roots of
[TABLE]
satisfy
[TABLE]
As x is hyperbolic and p≡3(4), we have that
$\frac{-4}{x^2-4}$ is not a square in Fp, so 1
and $\sqrt{\frac{-4}{x^2-4}}$ are linearly independent over Fp,
and $\frac{\pm 1\pm\sqrt{\frac{-4}{x^2-4}}}{\alpha}$ are four distinct
values for $s^m$, different from zero. Moreover, the polynomial
$s^m-\xi$ is separable for $0\ne\xi\in\mathbb{F}_{p^2}$ because
$m=\frac{p-1}{2d}<p$. So $g_{\alpha,\beta}(s)$, which
is of degree 4m, has 4m distinct roots in $\overline{\mathbb{F}_p}$,
and in particular is not a square in $\overline{\mathbb{F}_p}[x]$.
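This analysis shows that $g_{\alpha,\beta}$ and $g_{\gamma,\delta}$
have a common root if and only if α=±γ. Since α≠±γ
by assumption, $k_1=g_{\alpha,\beta}g_{\gamma,\delta}$ and $k_2=g_{\alpha\omega,\beta\omega^{-1}}g_{\gamma\omega,\delta\omega^{-1}}$
are both separable of degree 8m. Finally, k1k2, of degree
16m, is also not a square in $\overline{\mathbb{F}_p}[x]$:
for α≠±αω and if α=±γω then
αω≠±γ.
Theorem 4.10 yields that $M_{\{1\}},M_{\{2\}}\le(8m-1)\sqrt{p}$
and $M_{\{1,2\}}\le(16m-1)\sqrt{p}$.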
From (10) we now obtain
[TABLE]
∎
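The counting step behind arguments of this kind is the expansion of the indicator of "both Legendre symbols equal −1" as ¼(1−χ(k1(s)))(1−χ(k2(s))), valid when k1 and k2 never vanish on Fp. Summing over s gives N(−1,−1)=¼(p−M{1}−M{2}+M{1,2}). The following Python sketch checks this identity on a toy example; the polynomials s²+1 and s²+4 are placeholders chosen only because they have no roots modulo 19, not the polynomials k1,k2 of the proof, and the function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) in {-1, 0, 1} for an odd prime p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def character_sums(k1, k2, p):
    """Return (N, M1, M2, M12): the direct count N(-1,-1) and the three character sums."""
    direct = sum(1 for s in range(p)
                 if legendre(k1(s), p) == -1 and legendre(k2(s), p) == -1)
    m1 = sum(legendre(k1(s), p) for s in range(p))
    m2 = sum(legendre(k2(s), p) for s in range(p))
    m12 = sum(legendre(k1(s) * k2(s), p) for s in range(p))
    return direct, m1, m2, m12


# Toy polynomials without roots in F_19 (illustrative assumption, p = 3 mod 4).
N, M1, M2, M12 = character_sums(lambda s: s * s + 1, lambda s: s * s + 4, 19)
```

In this toy case the product has degree 4, so a Weil-type bound gives |M12|≤3√19, and the identity 4N=19−M1−M2+M12 holds exactly.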
4.2.3 No correlation between two long rot1-cycles with the
same first elliptic coordinate
The general proof strategy for the elliptic case is the same as for
the hyperbolic case, albeit with a few extra technical details. In
the hyperbolic case, we used a parametrization of the elements of
a cycle of rot1 as a function over Fp∗, which
allowed us to use Weil’s bound (Theorem 4.10 above). In
the elliptic case, a similar approach requires that we go over the
elements in the cyclic subgroup of size p+1 in Fp2∗.
The following lemma allows us to parametrize this subgroup as a function
over Fp:
Lemma 4.11.
The multiplicative subgroup $H\le\mathbb{F}_{p^2}^*$
of order p+1 satisfies
[TABLE]
(where $i=\sqrt{-1}\in\mathbb{F}_{p^2}$).
Proof.
Note that $(\theta+i\eta)^p=\theta-i\eta$ (recall that
p≡3(4) so $i^p=i^{4k+3}=i^3=-i$). So $(\theta+i\eta)^{p+1}=(\theta+i\eta)(\theta-i\eta)=\theta^2+\eta^2$.
This gives the first equality in (11). A straightforward
computation yields the second equality.
∎
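As a sanity check, the lemma can be tested by brute force for small primes. The sketch below assumes the parametrization h(s)=(2s+i(1−s²))/(1+s²) of H∖{−i} that is quoted later in Section 5.2; elements of F_{p²} are represented as pairs (a,b) standing for a+bi, and all function names are hypothetical.

```python
def fp2_mul(u, v, p):
    """Multiply u = a + b*i and v = c + d*i in F_p[i] with i^2 = -1 (p = 3 mod 4)."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def fp2_pow(u, e, p):
    """Square-and-multiply exponentiation in F_p[i]."""
    result = (1, 0)
    while e:
        if e & 1:
            result = fp2_mul(result, u, p)
        u = fp2_mul(u, u, p)
        e >>= 1
    return result


def h(s, p):
    """h(s) = (2s + i(1 - s^2)) / (1 + s^2); 1 + s^2 != 0 since -1 is a non-residue."""
    d = pow(1 + s * s, p - 2, p)
    return (2 * s * d % p, (1 - s * s) * d % p)


def norm_one_group(p):
    """The points h(s), s in F_p, together with the missing point -i."""
    return {h(s, p) for s in range(p)} | {(0, p - 1)}
```

For p=7 and p=11 this yields exactly p+1 distinct elements θ+iη, each satisfying θ²+η²=1 and each of order dividing p+1.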
We assume that x is elliptic with
$d_p(\pm x)\ge 16\sqrt{p+1}$, and assume that there are
two elements in the proper Qp-block B⫋Y∗(p)
with ±x in the first coordinate. We use the notation H for
the subgroup of order p+1 in $\mathbb{F}_{p^2}^*$, as in
Lemma 4.11. Assume that [x,y0,y1] and
[x,z0,z1] both belong to B. By Table 2,
$x=\omega+\omega^{-1}$ with ω∈H, and we can assume that
$|\omega|=2d\ge 32\sqrt{p+1}$, for otherwise replace ω
by −ω and x by −x. Let $A\in\mathbb{F}_{p^2}$ satisfy
$A^{p+1}=\frac{x^2}{x^2-4}$, $y_0=A+A^p$ and
$y_1=A\omega+A^p\omega^{-1}$ (see Lemma 2.3).
The cycle of rot1 containing [x,y0,y1]
is
[TABLE]
with
[TABLE]
The set $\{\omega^j\}_{0\le j\le 2d-1}$ is the same
as the set $\{h^m\}_{h\in H}$ where $m=\frac{p+1}{2d}$,
with every element in $\{\omega^j\}$ covered by m
different values of h. So as sets,
[TABLE]
The same holds for the cycle of rot1 containing [x,z0,z1]
with $C\in\mathbb{F}_{p^2}$ in the role of A, so that $z_j=C\omega^j+C^p\omega^{-j}$.
We may assume that C≠±A, for otherwise [x,y0,y1]=[x,z0,z1].
Moreover, if $h^m=\omega^j$ then $f_A(h)=y_j$
and $f_C(h)=z_j$.
As in the proof of the hyperbolic case, we derive from Proposition
4.2 that
[TABLE]
cannot hold for any h∈H. To be able to use Theorem 4.10,
we want to reparametrize (12) as polynomials
in s∈Fp, using Lemma 4.11. Denote
[TABLE]
where
[TABLE]
Let also k1=gAgC and k2=gAωgCω.
Then (12) is equivalent to
[TABLE]
As in the proof of the hyperbolic case, denote by N(−1,−1)
the number of s∈Fp for which (13)
holds. Our goal is to get a contradiction by showing that N(−1,−1)>0.
Note that $g_A(s)\ne 0$ for s∈Fp because
$g_A(s)=(1+s^2)^{2m}(y_j^2-4)$
for some yj as above, and s≠±i and yj≠±2.
Thus $k_i(s)\ne 0$ either, and $\left(\frac{k_i(s)}{p}\right)\in\{1,-1\}$.
As in equations (8)-(10)
in the hyperbolic case, we get that
[TABLE]
where for ∅≠B⊆{1,2}, we define $M_B\stackrel{\mathrm{def}}{=}\sum_{s\in\mathbb{F}_p}\left(\frac{\prod_{j\in B}k_j(s)}{p}\right)$.
We use Theorem 4.10 to estimate the MB’s. First,
we show that k1,k2∈Fp[x]. Notice
that
[TABLE]
so
[TABLE]
The last expression shows that gA(s)∈Fp2[s].
Its degree is 4m: indeed, the leading coefficient is
[TABLE]
and for m even this coefficient equals $(A+A^p)^2-4=y_0^2-4$
which is not zero since y0≠±2 (see Lemma 2.3).
For m odd, this coefficient is
[TABLE]
which is not zero because $A^{p+1}-1=\frac{4}{x^2-4}$ is not a
square in Fp when x is elliptic.
As Fp2=Fp+iFp, we can write
gA=gA′+igA′′, where gA′,gA′′∈Fp[s].
By definition, for every s∈Fp, we have h=h(s)∈H,
and
[TABLE]
so gA′′(s)=0 for every s∈Fp. Since
deg(gA′′)≤4m<p, we conclude that gA′′ is
the zero polynomial, hence gA(s)=gA′(s)∈Fp[s]
and so k1,k2∈Fp[x].
Next, we wish to show that k1, k2 and k1k2 are
not squares in Fp[x]. There is
a one-to-one correspondence between the roots of gA in Fp
and the roots of
[TABLE]
in Fp given by
[TABLE]
because ±i is never a root of $g_A$ (recall that $g_A(s)\in\mathbb{F}_p[s]$
has the form from (16)) and −i is never a root
of $r_A$ (because $r_A(h)=h^{2m}(f_A(h)^2-4)$,
−i∈H and thus $f_A(-i)=y_j$ for some yj
as above, and yj≠±2). It is easier to analyze the roots
of rA than those of gA: if h is a root of rA
then
[TABLE]
where $\kappa(x)=A^{p+1}=\frac{x^2}{x^2-4}$. Now note
the following:
•
The four possible values of $h^m$ are distinct and different from
zero (this follows from κ(x)≠0,1).
•
Because (m,p)=1, the four polynomials $h^m-\frac{\pm 1\pm\sqrt{1-\kappa(x)}}{A}$
are separable, so $r_A$ has 4m distinct roots in $\overline{\mathbb{F}_p}$,
and so does $g_A$.
•
If A≠±C, the 4m roots of $r_A$ are distinct from the
4m roots of $r_C$: certainly $\frac{1+\sqrt{1-\kappa(x)}}{A}\ne\pm\frac{1+\sqrt{1-\kappa(x)}}{C}$,
and if $\frac{1+\sqrt{1-\kappa(x)}}{A}=\pm\frac{1-\sqrt{1-\kappa(x)}}{C}$
we obtain
[TABLE]
with $\xi=\frac{1-\sqrt{1-\kappa(x)}}{1+\sqrt{1-\kappa(x)}}\in\mathbb{F}_p$
because $1-\kappa(x)=\frac{-4}{x^2-4}$ is a square in
Fp. Then ξ=±1, that is, C=±A,
a contradiction. Hence $k_1=g_Ag_C$ and $k_2=g_{A\omega}g_{C\omega}$
are separable of degree 8m each.
•
Finally, if C≠±A, the polynomial $k_1k_2=g_Ag_{A\omega}g_Cg_{C\omega}$
is not a square in $\overline{\mathbb{F}_p}[x]$: it
is separable unless A=±Cω or Aω=±C, but the
two cannot hold simultaneously.
We can now apply Theorem 4.10 to obtain the same bounds
on the MB’s as in the hyperbolic case, and from (14)
we now obtain
[TABLE]
∎
Remark 4.12.
As we noted
in Remark 4.7 above, it is conceivable
that a stronger version of Proposition 4.3
holds. Let us point to the phase in the current argument that fails
in this more general setting. The simplest case to consider is that
of x,x′∈Fp both hyperbolic of maximal order, so $d_p(x)=d_p(x')=\frac{p-1}{2}$.
Assume that $x=\omega+\omega^{-1}$ and $x'=\omega'+\omega'^{-1}$,
and that $\omega'=\omega^r$. Then, in the notation of Section 4.2.2,
if $y_j=\alpha s+\beta s^{-1}$, then $y'_j=\alpha's^r+\beta's^{-r}$,
and our goal is to show that $(\alpha s+\beta s^{-1})$
and $(\alpha's^r+\beta's^{-r})$ cannot be of the same
type (hyperbolic/elliptic) for too many values of $s\in\mathbb{F}_p^*$.
The problem is that r can be of any order, and is generically of
order ≥√p. For polynomials of such degree, Weil’s Theorem
4.10 is useless.
4.3 Deducing Alternating group from primitivity
Finally, in this section, we show how to deduce that Qp≥Alt(Y∗(p))
whenever Qp is primitive. Throughout this section we denote
the symmetric group Sym(n) by $S_n$ and
Alt(n) by $A_n$.
Here we use the following result of Guralnick and Magaard, classifying
primitive subgroups of Sn containing an element with at least
n/2 fixed points. This theorem relies heavily on the CFSG. We adjust
the statement of the theorem to our needs – the original
statement in [GM98] is more detailed. In the
statement we use the notation Soc(G) for the
socle of the group G (see Section 5.3
for details), and the standard notation G1≀G2 for the
wreath product of two groups.
Theorem 4.13 ([GM98]).
Let $G\le S_n$ be a primitive
group, and let x∈G have at least n/2 fixed points. Then one
of the following holds:
1.
G=Aff(2,k) is the affine
group acting on $\mathbb{F}_2^k$ and x is a transvection [Footnote 5: To be sure, x is a transvection when Aff(2,k)
is embedded in GL(2,k+1) as the matrices with
bottom row (0,…,0,1).] and is, in particular, an involution. In this case x has exactly
n/2 fixed points.
2.
There are r≥1, m≥5 and
1≤k≤m/4 such that $n=\binom{m}{k}^r$, the group $S_m$
acts on the set Δ of k-subsets of {1,…,m}
in the natural way, $G\le S_m\wr S_r$ acts on $\Delta^r$ and
$\mathrm{Soc}(G)=A_m^r$.
3.
For some r≥1, $n=6^r$, the group
$S_6$ acts on Δ={1,…,6} by applying
an outer automorphism [Footnote 6: Namely, for some fixed $\varphi\in\mathrm{Aut}(S_6)\setminus\mathrm{Inn}(S_6)$,
the permutation $\sigma\in S_6$ acts on Δ by σ.i=φ(σ)(i).], $G\le S_6\wr S_r$ acts on $\Delta^r$ and $\mathrm{Soc}(G)=A_6^r$.
4.
The group G is some variant of an orthogonal
group over the field of two elements acting on some collection of
1-spaces or hyperplanes, and the element x is an involution.
The following lemma helps us rule out Case 2
of the above theorem with r=1.
Lemma 4.14.
Consider the embedding $\iota:S_m\hookrightarrow S_n$
given by the natural action of the symmetric group $S_m$ on the
set Δ of the $n=\binom{m}{k}$ k-subsets of {1,…,m}, for some
$2\le k\le\frac{m}{4}$. If, for some $\pi\in S_m$, the image ι(π)
has a cycle of size divisible by q and a cycle of size divisible
by s for some distinct primes q and s, then ι(π)
also has a cycle of size divisible by qs.
Proof.
Assume that {a1,…,ak}∈Δ belongs
to a cycle α of length divisible by q in ι(π).
Assume that in π, the elements a1,…,ak belong to
t distinct cycles: the elements a1,…,aℓ1 belong
to the cycle σ1, the elements aℓ1+1,…,aℓ2
belong to the cycle σ2, and so on (each σj may
contain additional elements not from {a1,…,ak}).
Let $o_1$ be the smallest positive integer such that $\sigma_1^{o_1}$ maps $\{a_1,\ldots,a_{\ell_1}\}$
to itself. Define $o_2,\ldots,o_t$ analogously. The length of α is $\mathrm{lcm}(o_1,\ldots,o_t)$, so $q\mid\mathrm{lcm}(o_1,\ldots,o_t)$.
In particular, $q\mid o_i$ for some i, and so $q\mid|\sigma_i|$.
Without loss of generality, assume $q\mid|\sigma_1|$,
so that a1 belongs to a cycle σ=σ1 of π
of size divisible by q. Likewise, assume that b1 belongs
to a cycle τ of π of size divisible by s.
Denote A={1,…,m}∖(σ∪τ)
(namely, A consists of the elements belonging neither to the cycle
σ nor to τ). Assume first that σ≠τ. If
|A|≥k−2, then a k-subset containing a1,
b1 and k−2 elements from A belongs to a cycle of ι(π)
of size divisible by qs. If |A|<k−2, then, as $k\le\frac{m}{4}$,
at least one of σ or τ has more than k elements. Assume
without loss of generality it is σ. Consider the k-subset
$\{b_1,a_1,\pi(a_1),\pi^2(a_1),\ldots,\pi^{k-2}(a_1)\}$.
This subset belongs to a cycle of ι(π) of size
lcm(|τ|,|σ|),
which, in particular, is a multiple of qs.
Finally, assume σ=τ. Then $qs\mid|\sigma|$.
If the length of σ is at least k+1, the k-subset $\{a_1,\pi(a_1),\pi^2(a_1),\ldots,\pi^{k-2}(a_1),\pi^{k-1}(a_1)\}$
belongs to a cycle of ι(π) of size divisible by qs.
If |σ|≤k then A contains more than k−1
elements, and the k-subset containing a1 and k−1 elements
from A belongs to a cycle of ι(π) of size divisible by
qs.
∎
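The statement of Lemma 4.14 is easy to test experimentally. The sketch below computes the cycle lengths of ι(π) on k-subsets for a permutation π given as a list (position a holds π(a), with points labelled 0,…,m−1 rather than 1,…,m); the function name and the 0-based labelling are conventions of this illustration only.

```python
from itertools import combinations


def induced_cycle_lengths(perm, k):
    """Cycle lengths of the permutation induced by perm on the k-subsets of range(len(perm))."""
    seen = set()
    lengths = []
    for subset in combinations(range(len(perm)), k):
        start = frozenset(subset)
        if start in seen:
            continue
        length = 0
        cur = start
        while cur not in seen:          # walk the cycle of `start` once
            seen.add(cur)
            cur = frozenset(perm[a] for a in cur)
            length += 1
        lengths.append(length)
    return lengths


# pi = (0 1)(2 3 4) in S_8, acting on 2-subsets (k = 2 = m/4).
example = sorted(induced_cycle_lengths([1, 0, 3, 4, 2, 5, 6, 7], 2))
```

Here π has a 2-cycle and a 3-cycle, so ι(π) has cycles of sizes divisible by q=2 and s=3, and indeed `example` contains a cycle of length 6, the orbit of the subset {0,2}.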
Proposition 4.15.
Let p≡3(4)
be prime. If Qp is primitive, then Qp≥Alt(Y∗(p)).
Proof.
Consider rot1∈Qp. Among the $\frac{p(p-3)}{4}$
elements in Y∗(p), $\frac{(p-1)(p-3)}{8}$
belong to cycles of length at least 3 and dividing $\frac{p-1}{2}$,
and $\frac{(p+1)(p-3)}{8}$ belong to cycles
of length at least 3 and dividing $\frac{p+1}{2}$ (see Table 2).
Since $\gcd\left(\frac{p-1}{2},\frac{p+1}{2}\right)=1$, the
permutation $\sigma=\mathrm{rot}_1^{(p+1)/2}$ fixes exactly
$\frac{(p+1)(p-3)}{8}>\frac{|Y^*(p)|}{2}$
elements of Y∗(p). Thus Qp satisfies the assumptions
in Theorem 4.13. We can now rule out all options
except for Qp=Alt(Y∗(p)) or Qp=Sym(Y∗(p)).
Cases 1 and 4 are immediately
ruled out because the permutation σ∈Qp is not an involution.
Case 2 with r≥2 and Case 3
are immediately ruled out because $|Y^*(p)|=\frac{p(p-3)}{4}$
is not a proper power nor equal to six. It remains to consider Case
2 with r=1.
Let q be some prime factor of $\frac{p-1}{2}$, and let s be
some prime factor of $\frac{p+1}{2}$. By Table 2,
rot1 contains cycles of size divisible by q (indeed, even
of size q exactly), and of size divisible by s. However, it
does not contain any cycle of size divisible by qs. Using Lemma
4.14, this rules out Case 2
from Theorem 4.13 with r=1 and k≥2.
The remaining case, that of Case 2
with r=k=1, is precisely the case that the group in question is
either An or Sn.
∎
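The cycle counts used in this proof can be reproduced by brute force for small primes. The sketch below is illustrative only: it models Y∗(p) as the set of non-zero solutions modulo p up to an even number of sign changes, assumes rot1 acts by (x,y,z)↦(x,z,xz−y), and uses hypothetical helper names. It requires Python 3.8 or later.

```python
def canon(t, p):
    """Canonical representative of the class of t under an even number of sign changes."""
    x, y, z = t
    return min((x, y, z), (x, -y % p, -z % p), (-x % p, y, -z % p), (-x % p, -y % p, z))


def y_star(p):
    """All classes of non-zero solutions of x^2 + y^2 + z^2 = xyz modulo p."""
    return {canon((x, y, z), p)
            for x in range(p) for y in range(p) for z in range(p)
            if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}


def rot1(t, p):
    x, y, z = t
    return canon((x, z, (x * z - y) % p), p)


def rot1_cycle_lengths(p):
    """Sorted list of cycle lengths of rot1 acting on Y*(p)."""
    todo, lengths = y_star(p), []
    while todo:
        start = cur = todo.pop()
        n = 1
        while (cur := rot1(cur, p)) != start:
            todo.discard(cur)
            n += 1
        lengths.append(n)
    return sorted(lengths)
```

For p=11 the cycle lengths are 3, 3, 5, 5, 6: the 12 elements lying in cycles of length dividing (p+1)/2=6 are exactly those fixed by rot1 raised to the power 6, which is more than half of the 22 elements of Y∗(11), and no cycle has length divisible by both 3 and 5.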
This finishes the proofs of Theorem 1.4
and of Corollary 1.5: Theorem
1.4 is now a
consequence of Theorem 4.1 and
Proposition 4.15, while
Corollary 1.5 follows from
Corollary 4.6 and
Proposition 4.15.
5 Strong Approximation for Square Free Composite Moduli
In this section we derive our main application of the results on the
groups Qp and show that Γ acts transitively on X∗(n)
for various square-free composite values n=p1⋯pk. First,
in Section 5.1, we prove that if Qpj≥Alt(Y∗(pj))
for every j=1,…,k, then Γ acts transitively on Y∗(n).
In Section 5.2 we strengthen this
result by showing that, moreover, Γ acts transitively on X∗(n),
namely, that strong approximation for the Markoff equation holds
modulo n, thus proving Theorem 1.6.
At this point, we are able to prove Theorem 1.4
that Qp≥Alt(Y∗(p)) for p≡3(4)
satisfying the assumptions in the statement of Theorem 1.4,
only while relying on the classification of finite simple groups (CFSG)
– see Section 4.3.
However, the CFSG is not necessary for establishing the transitivity
of Γ on X∗(n) when n=p1⋯pk
and p1,…,pk are distinct primes satisfying the assumptions
in Theorems 1.3 or 1.4
(this is Corollary 1.7).
In Section 5.3 we give
an alternative proof for the transitivity of Γ on X∗(n),
which uses only the primitivity of Qp, as in Theorem 4.1,
thus proving Theorem 1.9.
The point is that we want to provide a proof of the transitivity on
X∗(n) which can be potentially understood in full,
from basic principles, by a motivated reader. This is practically
impossible if one relies on the CFSG.
5.1 Transitivity of Γ on Y∗(n)
Here we prove the following lemma:
Lemma 5.1.
Let $n=p_1\cdots p_k$
be a product of distinct primes. If $Q_{p_j}\ge\mathrm{Alt}(Y^*(p_j))$
for j=1,…,k, then Γ acts transitively on Y∗(n).
Moreover, $Q_n$, which is a subgroup of $\mathrm{Sym}(Y^*(p_1))\times\ldots\times\mathrm{Sym}(Y^*(p_k))$,
contains $\mathrm{Alt}(Y^*(p_1))\times\ldots\times\mathrm{Alt}(Y^*(p_k))$.
Proof.
We prove the lemma by induction on k, the case k=1 being
trivial. Assume k≥2. It is enough to show that for every j=1,…,k,
[TABLE]
Recall that Y∗(3)=∅, so we may assume 3∤n.
Without loss of generality we assume that j=k. We first prove (17)
assuming pk≥5. Note that ∣Y∗(pk)∣≥5
(see Lemmas 2.2 and 2.3),
and so Alt(Y∗(pk)) is simple.
This group is never a composition (Jordan-Hölder) factor of $\mathrm{Alt}(Y^*(p_\ell))$
when $p_k\ne p_\ell$, because $|Y^*(p_k)|\ne|Y^*(p_\ell)|$ [Footnote 7: For p odd, the size $|Y^*(p)|$ is $\frac{p(p\pm 3)}{4}$
as given in Section 2, and $|Y^*(2)|=4$.].
Now consider the normal series
[TABLE]
The group $Q_{p_k}$ is a quotient of $Q_n$, and so $\mathrm{Alt}(Y^*(p_k))$ is
a composition factor of $Q_n$, and thus a composition factor of
one of the quotients in (18). But the upper
quotient is isomorphic to $Q_{p_1\cdots p_{k-1}}$, which by the
induction hypothesis has composition factors $\mathrm{Alt}(Y^*(p_\ell))$
for $\ell\ne k$, $p_\ell\ne 2$ and possibly some copies of $\mathbb{Z}/2\mathbb{Z}$
coming from $\mathrm{Sym}(Y^*(p_\ell))/\mathrm{Alt}(Y^*(p_\ell))$
or copies of $\mathbb{Z}/2\mathbb{Z}$ and $\mathbb{Z}/3\mathbb{Z}$
coming from $\mathrm{Sym}(Y^*(2))$. The middle
quotient is either trivial or $\mathbb{Z}/2\mathbb{Z}$.
Thus $\mathrm{Alt}(Y^*(p_k))$ must be a
composition factor of the bottom quotient, so $1\times\ldots\times 1\times\mathrm{Alt}(Y^*(p_k))\le Q_n$.
Finally, if $p_k=2$, note that $|Y^*(2)|=4$.
The composition factors of Alt(4) are one copy
of $\mathbb{Z}/3\mathbb{Z}$ and two copies of $\mathbb{Z}/2\mathbb{Z}$.
By an argument as above, the factor $\mathbb{Z}/3\mathbb{Z}$
must belong to the bottom quotient in (18).
Denote
[TABLE]
It is easy to check that H⊴Qn. For every gk∈Alt(Y∗(2))
there are g1,…,gk−1 with gj∈Sym(Y∗(pj))
such that (g1,…,gk)∈Qn, thus H′⊴Alt(Y∗(2))≅Alt(4).
But the only normal subgroup of Alt(4) containing
the composition factor $\mathbb{Z}/3\mathbb{Z}$ is Alt(4)
itself.
∎
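The proof uses that the sets Y∗(p) have different sizes for different primes. The sketch below checks this for all primes up to 2000 using the closed formulas quoted in the footnote. The assignment of signs (plus for p≡1(4), minus for p≡3(4)) is inferred from the count p(p−3)/4 used for p≡3(4) in Proposition 4.15, and the function names are hypothetical.

```python
def y_star_size(p):
    """|Y*(p)|: 4 for p = 2, p(p+3)/4 for p = 1 (mod 4), p(p-3)/4 for p = 3 (mod 4)."""
    if p == 2:
        return 4
    return p * (p + 3) // 4 if p % 4 == 1 else p * (p - 3) // 4


def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]


sizes = [y_star_size(p) for p in primes_up_to(2000)]
```

All entries of `sizes` are pairwise distinct, so the simple groups Alt(Y∗(p)) for different primes p≥5 are pairwise non-isomorphic.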
5.2 Transitivity of Γ on X∗(n)
We now finish the proof of Theorem 1.6
and prove that if n=p1⋯pk is a product of distinct
primes with Qpj≥Alt(Y∗(pj))
for every 1≤j≤k, then Γ acts transitively on X∗(n).
We want the proof of this section to work in a slightly greater generality
than the assumption that Qpj≥Alt(Y∗(pj)),
so that it applies also for the next section, where we do not rely
on the CFSG. This is part of the motivation for the following notation:
Notation 5.2.
Let n=p1⋯pk
be a product of distinct primes for which Qpj is primitive.
We assume further that
•
The primes are ordered by the order of the rotations roti in
the groups Qpj, which is
[TABLE]
For instance, 7 comes before 5. We break potential ties by putting
the larger prime first: for example, we put 11 before 5.
•
Without loss of generality, 2,5,7,11∣n and so the first four
primes are 2,7,11,5 (in that order). This assumption is possible
because in these four cases, computer simulations indicate that Qp=Sym(Y∗(p))
is the full symmetric group, so our assumptions always hold.
Furthermore, for j=1,…,k,
•
Let $M_j=p_1\cdots p_j$ denote the product
of the first j primes.
•
Let $\Omega_j\trianglelefteq\Gamma$ denote
the kernel of the action of Γ on $Y^*(M_j)$.
Note that $\Omega_{j+1}\trianglelefteq\Omega_j$.
•
Let $\Lambda_j\trianglelefteq\Gamma$
denote the kernel of the action of Γ on $X^*(M_j)$.
Note that $\Lambda_{j+1}\trianglelefteq\Lambda_j\trianglelefteq\Omega_j$.
Finally, for every prime p, we let $\pi_p:\Gamma\to Q_p$
denote the projection.
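The ordering of the primes in Notation 5.2 can be reproduced by brute force for the first few primes. The sketch below computes the order of rot1 as a permutation of Y∗(p) directly, without relying on a closed formula; it assumes rot1 acts by (x,y,z)↦(x,z,xz−y), models Y∗(p) as solutions up to an even number of sign changes, and uses hypothetical function names. It requires Python 3.9 or later for `math.lcm`.

```python
from math import lcm


def rot1_order(p):
    """Order of rot1 acting on Y*(p), computed as the lcm of its cycle lengths."""
    def canon(x, y, z):
        return min((x, y, z), (x, -y % p, -z % p), (-x % p, y, -z % p), (-x % p, -y % p, z))

    def rot(t):
        x, y, z = t
        return canon(x, z, (x * z - y) % p)

    points = {canon(x, y, z)
              for x in range(p) for y in range(p) for z in range(p)
              if (x, y, z) != (0, 0, 0) and (x * x + y * y + z * z - x * y * z) % p == 0}
    order = 1
    for start in points:
        cur, n = rot(start), 1
        while cur != start:
            cur, n = rot(cur), n + 1
        order = lcm(order, n)
    return order


def sort_primes(primes):
    """Smaller rotation order first; ties are broken by putting the larger prime first."""
    return sorted(primes, key=lambda p: (rot1_order(p), -p))
```

With this convention `sort_primes([2, 5, 7, 11])` returns the primes in the order 2, 7, 11, 5: the rotation has order 12 modulo 7 and order 30 modulo both 5 and 11.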
In Section 5.3 we shall
prove the following lemma without relying on the CFSG:
Lemma 5.3.
Let
$n=p_1\cdots p_k$ with $Q_{p_j}$ primitive for j=1,…,k
as in Notation 5.2. Then, for every j=2,…,k,
the image of $\Omega_{j-1}$ in $Q_{p_j}$ contains a subgroup $H_j\le\mathrm{Sym}(Y^*(p_j))$
satisfying:
1.
$H_j$ is transitive on $Y^*(p_j)$.
2.
$H_j$ is isomorphic to a direct product of non-abelian simple groups [Footnote 8: Note that we assume j≥2. Indeed, this does not hold for $p_1=2$:
there are no simple non-abelian subgroups inside $\mathrm{Sym}(Y^*(2))\cong\mathrm{Sym}(4)$.] $T_1\times\ldots\times T_m$ for some $m=m(j)\in\mathbb{Z}_{\ge 1}$.
In particular, $\Omega_{j-1}$ acts transitively on $Y^*(p_j)$
and Γ acts transitively on Y∗(n).
Note that if we assume that Qpj≥Alt(Y∗(pj)),
the conclusion of Lemma 5.3
follows immediately from Lemma 5.1:
indeed, for p≥5, Alt(Y∗(p))
is transitive on Y∗(p) and is itself
a non-abelian simple group. So Lemma 5.3
is already proven relying on the CFSG, or if one assumes that pj≡1(4)
for j=1,…,k. In the remaining part of this subsection we rely
only on the conclusion of Lemma 5.3.
We assume Notation 5.2 throughout.
Lemma 5.4.
For j=2,…,k, the group
Λj−1 acts transitively on Y∗(pj).
Proof.
Consider the normal series
[TABLE]
and its projection on $Q_{p_j}$ via $\pi_j:\Gamma\twoheadrightarrow Q_{p_j}$.
By Lemma 5.3,
$\pi_j(\Omega_{j-1})\ge H_j$ where $H_j$ acts transitively
on $Y^*(p_j)$ and is a direct product of non-abelian
simple groups. As $\Omega_{j-1}$ fixes $Y^*(M_{j-1})=Y^*(p_1)\times\ldots\times Y^*(p_{j-1})$,
its action on $X^*(M_{j-1})$ fixes every 4-block
and only permutes elements inside the 4-blocks, hence the image
of $\Omega_{j-1}$ in $\Gamma_{M_{j-1}}$ is a subgroup of $\mathrm{Sym}(4)^{|Y^*(p_1)|+\ldots+|Y^*(p_{j-1})|}$.
Hence this image is solvable of order $2^\alpha\cdot 3^\beta$
for some $\alpha,\beta\in\mathbb{Z}_{\ge 0}$, so all its composition
factors are either $\mathbb{Z}/2\mathbb{Z}$ or $\mathbb{Z}/3\mathbb{Z}$.
We deduce that the quotient $\Omega_{j-1}/\Lambda_{j-1}$
has only composition factors $\mathbb{Z}/2\mathbb{Z}$
and/or $\mathbb{Z}/3\mathbb{Z}$. Let
[TABLE]
be a normal series with quotients $\mathbb{Z}/2\mathbb{Z}$
and/or $\mathbb{Z}/3\mathbb{Z}$. Note
that the index $[H_j:\pi_j(N_1)\cap H_j]$
is at most 3, but as $H_j$ is a direct product of non-abelian
simple groups, it has no proper subgroups of index ≤3 [Footnote 9: To be sure, the reason that $H=T_1\times\ldots\times T_m$ with
$T_1,\ldots,T_m$ finite non-abelian simple groups has no subgroups
of index 2 or 3 is that the normal subgroups of H are $B_1\times\ldots\times B_m$
where $B_i\in\{1,T_i\}$ for every i (this is
standard: if $N\trianglelefteq H$ and $N\cap T_1\ne 1$, then $1\ne[N,T_1]\trianglelefteq T_1$,
and so $[N,T_1]=T_1$ and $N\ge T_1$). In particular,
since the smallest non-abelian simple group is Alt(5),
any proper normal subgroup of H is of index at least 60. If
K≤H has index 2 or 3, then its core, $\cap_{h\in H}hKh^{-1}$,
is a proper normal subgroup of index at most 6, which is impossible.], hence $\pi_j(N_1)\ge H_j$. By induction,
the same argument shows that $\pi_j(N_\ell)\ge H_j$
for every ℓ, and, in particular, $\pi_j(\Lambda_{j-1})\ge H_j$.
∎
Lemma 5.5.
For j=5,…,k (so pj≥13),
Λj−1 acts transitively on X∗(pj).
Proof.
Our strategy is to find a triple (x,y,z)∈X∗(pj)
and elements in Λj−1 mapping (x,y,z) to
the other elements in its 4-block: (x,−y,−z), (−x,y,−z)
and (−x,−y,z). Together with the transitivity of Λj−1
on $Y^*(p_j)$ established in Lemma 5.4,
this would complete the proof. As in other places in this paper, we
deal separately with the case pj≡1(4) and the
case pj≡3(4), the argument in the former case
being simpler.
Case 1: p=pj≡1(4)
Take some x∈Fp hyperbolic of maximal order
(namely, the rot1-cycles in C1(x) are of length
p−1≥12 each). Since $i=\sqrt{-1}$ has order 4, x≠0 and (0,x,ix)∈X∗(p).
Let (r,s,t)∈X∗(p) be another solution
with r elliptic. As all rot1-cycles in C1(0)
have length 4 and p+1≡2 (mod 4), we get that $\mathrm{rot}_1^{p+1}$
fixes all four elements in [r,s,t] while mapping (0,x,ix)↦(0,−x,−ix).
By Lemma 5.4, there is some $g\in\Lambda_{j-1}$
mapping [0,x,ix]↦[r,s,t]. The element
$h_1=g^{-1}\cdot\mathrm{rot}_1^{-(p+1)}\cdot g\cdot\mathrm{rot}_1^{p+1}$
is in $\Lambda_{j-1}$ (as $\Lambda_{j-1}\trianglelefteq\Gamma$)
and maps (0,x,ix)↦(0,−x,−ix).
Since x is maximal hyperbolic, its order is (p−1)
which is divisible by 4. Hence −x is also maximal hyperbolic.
Let now (r′,s′,t′)∈X∗(p) be a solution
with s′ elliptic. Note that $\frac{p^2-1}{4}\equiv 0\pmod{p+1}$
while $\frac{p^2-1}{4}\equiv\frac{p-1}{2}\pmod{p-1}$.
Thus $\mathrm{rot}_2^{(p^2-1)/4}$ fixes all four elements
in [r′,s′,t′] while mapping (0,x,ix)↦(0,x,−ix)
and (0,−x,−ix)↦(0,−x,ix). By Lemma
5.4, there is some $g'\in\Lambda_{j-1}$
mapping [0,x,ix]↦[r′,s′,t′]. The element
$h_2=(g')^{-1}\cdot\mathrm{rot}_2^{-(p^2-1)/4}\cdot g'\cdot\mathrm{rot}_2^{(p^2-1)/4}$
is in $\Lambda_{j-1}$ and maps (0,x,ix)↦(0,x,−ix)
and (0,−x,−ix)↦(0,−x,ix).
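The arithmetic facts used in Case 1 can be checked on a concrete prime. The sketch below takes p=13, i=5 (a square root of −1 modulo 13) and x=9=2+2⁻¹, where 2 is a primitive root modulo 13. The conventions rot1:(x,y,z)↦(x,z,xz−y) and rot2:(x,y,z)↦(z,y,yz−x) are assumptions of this illustration; any rotation fixing the second coordinate would serve for rot2.

```python
def rot1(t, p):
    x, y, z = t
    return (x, z, (x * z - y) % p)


def rot2(t, p):
    # hypothetical convention: keep the second coordinate, rotate the other two
    x, y, z = t
    return (z, y, (y * z - x) % p)


def iterate(f, t, n, p):
    """Apply f to t exactly n times."""
    for _ in range(n):
        t = f(t, p)
    return t


p, i, x = 13, 5, 9                       # 5^2 = -1 (mod 13); x = 2 + 2^(-1) (mod 13)
start = (0, x, i * x % p)                # the solution (0, x, ix)
after_rot1 = iterate(rot1, start, p + 1, p)
after_rot2 = iterate(rot2, start, (p * p - 1) // 4, p)
```

Here `after_rot1` equals (0,−x,−ix) and `after_rot2` equals (0,x,−ix) modulo 13, in line with the two mappings used to construct h1 and h2.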
Case 2: p=pj≡3(4)
In Proposition 5.6
below, we prove there is a solution (x,y,z)∈X∗(p)
with both x and y elliptic of order divisible by 4. In this
case, −x has the same order as x, say this order is 4m and
note that 4m∣(p+1). Let (r,s,t)∈X∗(p)
be another solution with r hyperbolic. As p−1≡2(4),
there is a number q with q≡2m (mod 4m) and q≡0 (mod p−1).
We get that $\mathrm{rot}_1^{q}$ fixes all four elements in [r,s,t]
while mapping (x,y,z)↦(x,−y,−z) and
(−x,−y,z)↦(−x,y,−z). By Lemma 5.4,
there is some $g\in\Lambda_{j-1}$ mapping [x,y,z]↦[r,s,t].
The element $h_1=g^{-1}\cdot\mathrm{rot}_1^{-q}\cdot g\cdot\mathrm{rot}_1^{q}$
is in Λj−1 and maps (x,y,z)↦(x,−y,−z)
and (−x,−y,z)↦(−x,y,−z). In the same
fashion, we find an element of Λj−1 mapping (x,y,z)↦(−x,y,−z)
and we are done.
∎
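The exponent q used in Case 2 exists because gcd(4m,p−1)=2 divides 2m, so the two congruences are compatible. A minimal sketch of how such a q can be found by direct search follows; the function name `rotation_exponent` is hypothetical.

```python
from math import gcd


def rotation_exponent(p, m):
    """Smallest q > 0 with q = 2m (mod 4m) and q = 0 (mod p - 1).

    Assumes p = 3 (mod 4) and 4m | p + 1, as in Case 2.
    """
    if p % 4 != 3 or (p + 1) % (4 * m) != 0:
        raise ValueError("expected p = 3 (mod 4) and 4m | p + 1")
    # 4m divides p + 1 and gcd(p + 1, p - 1) = 2, so the moduli share only the factor 2
    assert gcd(4 * m, p - 1) == 2
    q = p - 1
    while q % (4 * m) != 2 * m:          # step through the multiples of p - 1
        q += p - 1
    return q
```

For example, `rotation_exponent(7, 2)` returns 12 and `rotation_exponent(19, 5)` returns 90.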
Modulo Proposition 5.6 which
we prove at the end of this subsection, we can now complete the proofs
of Theorem 1.6
and Corollary 1.7:
We use Notation 5.2. We need to show
that Γ acts transitively on X∗(n). We prove
that Γ acts transitively on X∗(Mj) for
j=1,…,k (recall that Mk=n). For j=4 we verified by
computer that Γ is transitive on X∗(2⋅5⋅7⋅11).
For j≥5, we use induction and assume that Γ acts transitively
on X∗(Mj−1). From Lemma 5.5
it follows that Γ is transitive on X∗(Mj).
∎
We complete the subsection with the proposition we use in the proof
of case 2 in Lemma 5.5:
Proposition 5.6.
For every prime p≠3,11
with p≡3(4), there is a solution (x,y,z)∈X∗(p)
with two coordinates elliptic of order divisible by 4.
In the proof of Proposition 5.6
we use notation as in Section 4.2.3.
As 4∣(p+1), if ω∈H is not a square then $4\mid|\omega|$.
Thus, it is enough to find a solution (x,y,z)∈X∗(p)
with x,y elliptic and the corresponding ωx,ωy
not squares in H.
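Lemma 5.7.
Assume $y=\omega+\omega^{-1}$
is elliptic (so ω∈H). Then ω is a square in H
if and only if y+2 is a square in Fp.
Proof.
Note that $y+2=\omega+\omega^{-1}+2=(\omega^{1/2}+\omega^{-1/2})^2$.
If $\omega^{1/2}\in H$ then $\omega^{1/2}+\omega^{-1/2}\in\mathbb{F}_p$.
On the other hand, if $\omega^{1/2}\notin H$, then $\omega^{(p+1)/2}=-1$
and so $\omega^{1/2}+\omega^{-1/2}\notin\mathbb{F}_p$, because
[TABLE]
(the last inequality stems from $(\omega^{1/2}+\omega^{-1/2})^2=y+2\ne 0$).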
∎
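Lemma 5.7 can be confirmed by exhaustive search for small primes p≡3(4). The sketch below represents ω=θ+iη∈H by the pair (θ,η) with θ²+η²=1, uses that ω⁻¹ is the conjugate of ω (so y=2θ), and tests squareness in the cyclic group H via ω^((p+1)/2)=1. The function name is hypothetical.

```python
def check_lemma(p):
    """For each omega in H other than +-1, compare 'omega is a square in H'
    with 'y + 2 is a square in F_p', where y = omega + omega^(-1)."""
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)

    def power(u, e):
        r = (1, 0)
        for bit in bin(e)[2:]:           # left-to-right binary exponentiation
            r = mul(r, r)
            if bit == "1":
                r = mul(r, u)
        return r

    results = []
    for theta in range(p):
        for eta in range(1, p):          # eta != 0 excludes omega = +-1
            if (theta * theta + eta * eta) % p != 1:
                continue
            y = 2 * theta % p            # omega + conjugate(omega)
            square_in_H = power((theta, eta), (p + 1) // 2) == (1, 0)
            square_in_Fp = pow(y + 2, (p - 1) // 2, p) == 1
            results.append(square_in_H == square_in_Fp)
    return results
```

For p=7, 11, 19 and 23 every one of the p−1 comparisons returned by `check_lemma(p)` is true.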
Fix x∈Fp elliptic of maximal order (p+1). So
$4\mid|\omega_x|=p+1$. By Lemma 5.7,
it is enough to find y,z∈Fp such that (x,y,z)∈X∗(p)
is a solution, y is elliptic and y+2 is a non-square. Since
y elliptic means that $y^2-4=(y+2)(y-2)$
is not a square, we need to find y,z with (x,y,z)∈X∗(p)
and y+2 a non-square and y−2 a square.
Imitating the notation from Section 4.2.3,
assume $x=\omega+\omega^{-1}$ with ω∈H, choose some $A\in\mathbb{F}_{p^2}$
for which $A^{p+1}=\frac{x^2}{x^2-4}$, and let $f_A(h)=Ah+A^ph^{-1}$
for h∈H. Then,
[TABLE]
Recall the parametrization of H∖{−i} by
elements from Fp described in Lemma 4.11: $h(s)=\frac{2s+i(1-s^2)}{1+s^2}=\frac{-i(s+i)}{s-i}$.
Define $g_1,g_2\in\mathbb{F}_p[s]$ as follows:
[TABLE]
It is not hard to see that $g_j(s)\in\mathbb{F}_p[s]$:
indeed, $A+A^p,\ i(A-A^p)\in\mathbb{F}_p$. We now
show that for large enough p, there is some s∈Fp
for which
[TABLE]
Denote by N(−1,1) the number of s∈Fp
for which (21) holds. Our goal is to show that for large
enough p, N(−1,1)>0. As in the proof of Proposition
4.3, g1
and g2 have no zeros inside Fp because there
are no solutions in X∗(p) involving ±2. So
[TABLE]
For ∅≠B⊆{1,2}, let $M_B\stackrel{\mathrm{def}}{=}\sum_{s\in\mathbb{F}_p}\left(\frac{\prod_{j\in B}g_j(s)}{p}\right)$
and then (22) becomes
[TABLE]
Note that
[TABLE]
where gA(s) is defined as in Equation (16)
in Section 4.2.3 for m=1. As our analysis
in Section 4.2.3 shows, all roots of
gA, except for ±i, have multiplicity 1. Thus, none
of g1, g2 or g1g2 is a square in Fp[x].
Now g1 and g2 have each at most 4 distinct roots and by
Theorem 4.10, $M_{\{1\}},M_{\{2\}}\le 3\sqrt{p}$.
Their product g1g2 has at most 6 distinct roots, hence
by Theorem 4.10, $M_{\{1,2\}}\le 5\sqrt{p}$.
From (23) we get
[TABLE]
So for $p>11^2=121$ we have N(−1,1)>0 and we are
done.
For all primes p with p≡3(4), p≤121 and
p≠3,11, we verified by a computer there is a solution (x,y,z)∈X∗(p)
with x,y elliptic and of order divisible by 4. For example,
one can take (3,3,3)∈X∗(7), (6,6,8)∈X∗(19),
(3,3,3)∈X∗(23) and (4,4,9)∈X∗(31).
∎
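The four examples listed at the end of the proof can be verified directly. The sketch below checks, for a triple modulo p, that it solves the Markoff equation, that its first two coordinates c are elliptic (c²−4 is a non-residue), and that c+2 is a non-residue. By Lemma 5.7 and the fact that 4∣(p+1), the last condition implies that the corresponding ω has order divisible by 4. The function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) in {-1, 0, 1} for an odd prime p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_good_solution(t, p):
    """Markoff solution mod p whose first two coordinates are elliptic with non-square omega."""
    x, y, z = t
    markoff = (x * x + y * y + z * z - x * y * z) % p == 0
    elliptic = all(legendre(c * c - 4, p) == -1 for c in (x, y))
    non_square_omega = all(legendre(c + 2, p) == -1 for c in (x, y))   # criterion of Lemma 5.7
    return markoff and elliptic and non_square_omega


examples = [((3, 3, 3), 7), ((6, 6, 8), 19), ((3, 3, 3), 23), ((4, 4, 9), 31)]
checks = [is_good_solution(t, p) for t, p in examples]
```

All four entries of `checks` are true, whereas for instance (3,3,3) fails the test modulo 11, where 3 is hyperbolic.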
5.3 Transitivity without the classification
In this section we prove Theorem 1.9
concerning the transitivity of Γ in square free composite
moduli without relying on the CFSG. We are going to use some strong
results from the theory of permutation groups, mostly revolving around
the O’Nan-Scott theorem. Although these results are strong, their proofs are
completely contained in the book [DM96] and
are no more than a few pages long each. We stress that if all primes
in the decomposition of n are 2 or 1 (mod 4), then
already the proof in the previous sections does not rely on the CFSG.
More concretely, let n=p1⋯pk be a product of distinct
primes, and we assume that Qpj is a primitive permutation
group in its action on Y∗(pj) for every j=1,…,k.
Our goal is to show then that Γ acts transitively on X∗(n).
It is enough to prove Lemma 5.3
above, as we already showed in Section 5.2
how it yields the conclusion we seek. Throughout this subsection we
assume Notation 5.2.
The CFSG-free proof of Lemma 5.3
uses the important concept of the socle:
Definition 5.8.
A minimal normal subgroup of a non-trivial group
G is a normal subgroup K≠1 of G which does not contain
properly any other non-trivial normal subgroup of G. The socle
of G, denoted Soc(G),
is the subgroup generated by the set of all minimal normal subgroups
of G. Note that Soc(G) is generated by normal subgroups
of G and thus $\mathrm{Soc}(G)\trianglelefteq G$.
For example, if m≥5 then Soc(Sym(m))=Soc(Alt(m))=Alt(m).
In contrast, Soc(Sym(4))=Soc(Alt(4))={1,(12)(34),(13)(24),(14)(23)}.
Theorem 5.9 (See [DM96, Theorems 4.3B, Corollary 4.3B and Theorem 4.7A]).
Let G≤Sym(n)
be a primitive subgroup. Then exactly one of the following holds:
1.
For some prime p and some integer d, the group G is permutation
isomorphic [Footnote 10: Two permutation groups are permutation isomorphic if they are the
same permutation groups except for, possibly, the labeling of the
points in the sets they act on.] to a subgroup of the affine group Aff(p,d)
acting on $\mathbb{F}_p^d$, so, in particular, $n=p^d$. In
this case, Soc(G) is a regular [Footnote 11: A permutation group H≤Sym(n) is called regular
if it is sharply transitive. Namely, it is transitive and free. In
other words, it is transitive and of order n. The name originates
from the observation that such subgroups are obtained as the (left
or right) regular representation of order-n groups.] elementary abelian subgroup of order $p^d$.
2.
$\mathrm{Soc}(G)=K_1\times K_2$ where $K_1,K_2\trianglelefteq G$
are minimal normal subgroups of G, which are regular, non-abelian
and permutation isomorphic to each other. Moreover [Footnote 12: For G a group and K≤G a subgroup, $C_G(K)=\{g\in G\mid gk=kg\ \forall k\in K\}$
is the centralizer of K in G.], $C_G(K_1)=K_2$ and $C_G(K_2)=K_1$.
In addition, $K_1\cong K_2\cong T^m$ for some finite simple
non-abelian group T and some $m\in\mathbb{Z}_{\ge 1}$.
3.
Soc(G) is a minimal normal subgroup of G. Moreover,
$C_G(\mathrm{Soc}(G))=1$ and $\mathrm{Soc}(G)\cong T^m$
for some finite simple non-abelian group T and some $m\in\mathbb{Z}_{\ge 1}$.
Theorem 5.10.
If G≤Sym(n)
is a primitive permutation group and $1\ne H\trianglelefteq G$
is a non-trivial normal subgroup, then H is transitive.
Corollary 5.11.
If p≥5 is prime and
Qp is primitive, then $\mathrm{Soc}(p)\stackrel{\mathrm{def}}{=}\mathrm{Soc}(Q_p)$
acts transitively on Y∗(p) and is a direct product
of non-abelian simple groups.
Proof.
Transitivity follows from Theorem 5.10
and the fact that the socle is a normal subgroup. Case (1) of Theorem
5.9 is ruled out because $|Y^*(p)|=\frac{p(p\pm 3)}{4}$
is not a prime power (or, alternatively, because Aff(p,d)
has no non-identity elements fixing more than half of the points,
such as $\mathrm{rot}_1^{p(p+1)/2}\in Q_p$). So either Qp
falls into case (2) or it falls into case (3).
∎
We also use the following result giving strong limitations on primitive
groups:
Theorem 5.12 (See [DM96, Theorems 5.3A and 5.5B]).
Let $G\lneq\mathrm{Sym}(n)$,
G≠Alt(n), be a primitive permutation group.
1.
If G is not 2-transitive then $|G|<\exp\{4\sqrt{n}(\ln n)^2\}$.
2.
If n≥216 and G is 2-transitive and contains a section [Footnote 13: A section of a group is some quotient of a subgroup.]
isomorphic to Alt(k), then k<6 ln n.
Lemma 5.13.
Let
p and q be distinct primes with Qp and Qq primitive,
and such that p precedes q in the order defined in Notation
5.2. Then Qpq≥1×Soc(q)
(sitting inside Sym(Y∗(p))×Sym(Y∗(q))).
Proof.
Recall that the primes are sorted by the order of rotation elements.
So if op (oq, respectively) is the order of rot1
in Qp (Qq, respectively) then op≤oq.
Case 1: op<oq
If the inequality is strict, then the image of $g=\mathrm{rot}_1^{o_p}\in\Gamma$
in Qp is the identity whereas its image $\bar{g}$ in Qq
is not. By Corollary 5.11, Soc(q)
falls under one of cases (2) or (3) from
Theorem 5.9.
Assume first that Soc(q) falls under case (3). Since
$C_{Q_q}(\mathrm{Soc}(q))=1$, there is some h∈Soc(q)
not commuting with $\bar{g}\in Q_q$, so $e\ne[\bar{g},h]=\bar{g}h\bar{g}^{-1}h^{-1}\in\mathrm{Soc}(q)\cap\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$.
Since Soc(q) is a minimal normal subgroup of Qq,
it is generated by the conjugates of $[\bar{g},h]$
in Qq, all of which also belong to $\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$.
Thus $\mathrm{Soc}(q)\le\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$.
Now assume that Soc(q) falls under case (2). Since
regular subgroups of Sym(n) are obtained as
the (left or right) regular representation of a group of order n,
every element of a regular permutation group has all its cycles with
equal length. Since rot1∈Qq contains cycles of coprime
lengths, no non-trivial power of it can belong to a regular subgroup,
so $\bar{g}=\mathrm{rot}_1^{o_p}\notin K_1\cup K_2$. So there
are $h_1\in K_1$ and $h_2\in K_2$ not commuting with $\bar{g}$.
Consider $h=h_1h_2\in K_1\times K_2=\mathrm{Soc}(q)$.
Then $[\bar{g},h]=([\bar{g},h_1],[\bar{g},h_2])\in K_1\times K_2=\mathrm{Soc}(q)$
belongs also to $\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$
but not to $K_1\cup K_2$. The only normal subgroups of Qq
which are contained in $K_1\times K_2$ are $1,K_1,K_2$ and
$K_1\times K_2$. Hence $K_1\times K_2$ is generated by the
conjugates in Qq of $[\bar{g},h]$, all of which
belong to $\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$.
Thus $\mathrm{Soc}(q)\le\pi_q(\ker(\Gamma\twoheadrightarrow Q_p))$.
Case 2: op=oq
We are left with the rare case [Footnote 14: In fact, the only such case with p<1,000,000 is p=5.]
that $o_p=o_q$, as in p=11 and q=5. In this case p>q,
p≡3(4), q≡1(4) and $(p^2-1)=q(q^2-1)$.
In particular, as Qq is primitive, it contains the full alternating
group Alt(Y∗(q)) by the CFSG-free
Theorem 1.3. We claim that Qp has no composition
factor isomorphic to Alt(Y∗(q)).
Using this, we can finish as in the proof of Lemma 5.1:
indeed, consider the following normal series of Qpq
[TABLE]
Since Qq is a quotient of Qpq, Alt(Y∗(q))
is a composition factor of Qpq, so it has to be a composition
factor of one of the quotients in (24).
The rightmost quotient is Qp which we show below has no composition
factor isomorphic to Alt(Y∗(q)).
The second quotient is $\mathbb{Z}/2\mathbb{Z}$ or trivial.
Thus, the leftmost quotient contains Alt(Y∗(q))
as a composition factor, namely, Qpq≥1×Alt(Y∗(q)),
and we are done as Soc(q)=Alt(Y∗(q)).
It remains to show that $Q_p$ has no composition factor isomorphic
to $\mathrm{Alt}(Y^*(q))$. This is certainly
the case if $Q_p \geq \mathrm{Alt}(Y^*(p))$
(as in the case $p = 11$, $q = 5$). So assume $Q_p \ngeq \mathrm{Alt}(Y^*(p))$
and proceed using Theorem 5.12.
First, assume that $Q_p$ is not 2-transitive. Asymptotically,
its order is smaller than that of $\mathrm{Alt}(Y^*(q))$:
indeed, if $n = |Y^*(p)| = \frac{p(p-3)}{4}$
then $n \approx q^3$, and so by Theorem 5.12,
[TABLE]
In fact, this asymptotic reasoning starts taking effect for q≥203,897,
but for smaller values of q there are no cases for which op=oq
except for q=5 (this was easily verified by computer).
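A computer check of this kind could be sketched as follows. This is not the authors' code: it assumes that a coincidence $o_p = o_q$ forces the conditions stated at the start of Case 2 ($p \equiv 3 \pmod 4$, $q \equiv 1 \pmod 4$, $p^2 - 1 = q(q^2-1)$), and for the size comparison it treats $p$ as the real root of that equation. The helper names are hypothetical.

```python
from math import isqrt, lgamma, log, sqrt


def is_prime(m):
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def candidate_pairs(q_bound):
    """Prime pairs (p, q), q < q_bound, with p = 3 (mod 4), q = 1 (mod 4)
    and p^2 - 1 = q(q^2 - 1)."""
    pairs = []
    for q in range(5, q_bound, 4):  # runs over q = 1 (mod 4)
        if not is_prime(q):
            continue
        m = q * (q * q - 1) + 1     # must equal p^2
        p = isqrt(m)
        if p * p == m and p % 4 == 3 and is_prime(p):
            pairs.append((p, q))
    return pairs


def log_babai_bound(q):
    """ln of exp{4 sqrt(n) (ln n)^2} for n = p(p-3)/4, p the real root."""
    p = sqrt(q * (q * q - 1) + 1)
    n = p * (p - 3) / 4
    return 4 * sqrt(n) * log(n) ** 2


def log_alt_order(q):
    """ln |Alt(k)| = ln(k!/2) for k = q(q+3)/4."""
    k = q * (q + 3) / 4
    return lgamma(k + 1) - log(2)
```

Under these assumptions, `candidate_pairs(2000)` contains the pair `(11, 5)`, and comparing `log_babai_bound(q)` with `log_alt_order(q)` at $q = 10^5$ and $q = 10^6$ shows the bound from Theorem 5.12 overtaking and then falling below $\ln|\mathrm{Alt}(Y^*(q))|$, consistent with a threshold between these two values.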
Finally, assume that $Q_p$ is 2-transitive. Then, not only does
it not have a composition factor isomorphic to $\mathrm{Alt}(Y^*(q))$,
it does not even have a section isomorphic to it: since $\frac{p(p-3)}{4} \geq 216$,
Theorem 5.12 says that such a section would force $k = \frac{q(q+3)}{4} < 6\ln\frac{p(p-3)}{4}$.
This is impossible when $q \geq 13$.
∎
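The final inequality of the proof is easy to test numerically. The sketch below is an illustration only: it treats $p$ as the real root of $p^2 - 1 = q(q^2-1)$ and checks whether $k = q(q+3)/4$ reaches $6\ln n$ for $n = p(p-3)/4$. The function name is hypothetical.

```python
from math import log, sqrt


def section_bound_violated(q):
    """True if k = q(q+3)/4 is at least 6 ln(n), with n = p(p-3)/4 and
    p the real root of p^2 - 1 = q(q^2 - 1)."""
    p = sqrt(q * (q * q - 1) + 1)
    n = p * (p - 3) / 4
    k = q * (q + 3) / 4
    return k >= 6 * log(n)
```

For $q = 5$ (so $p = 11$) the inequality $k < 6\ln n$ still holds, while from $q = 13$ onwards it fails, in line with the statement that the 2-transitive case is impossible when $q \geq 13$.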
We can now finish our CFSG-free proof of Lemma 5.3.
Assume $n = p_1 \cdots p_k$ is a product of distinct primes with
$Q_{p_1},\ldots,Q_{p_k}$ primitive and $p_1,\ldots,p_k$
ordered as in Notation 5.2. We need to
show that for every $j = 2,\ldots,k$, the image $\pi_{p_j}(\Omega_{j-1})$ of $\Omega_{j-1}$
in $Q_{p_j}$ contains
a subgroup $H_j \leq \mathrm{Sym}(Y^*(p_j))$
which is transitive and isomorphic to a direct product of non-abelian
simple groups. We show that $\pi_{p_j}(\Omega_{j-1}) \geq \mathrm{Soc}(p_j)$,
which is enough by Corollary 5.11.
Without loss of generality, it is enough to prove this when j=k.
As $\mathrm{Soc}(p_k) \cong \prod_{i=1}^{m} T_i$ with $T_1,\ldots,T_m$
non-abelian simple groups, each of them satisfies $[T_i,T_i] = T_i$.
Hence for any $t \in \mathbb{Z}_{\geq 1}$ there is a sequence of elements
$g_1,\ldots,g_t \in \mathrm{Soc}(p_k)$
so that the nested commutator
[TABLE]
has non-trivial projection in each of the Ti’s. Choose such
a sequence of length t=k−1. By Lemma 5.13,
for every $i = 1,\ldots,k-1$, there is an element of $\Gamma$
whose image under $\pi_{p_i}$ is trivial and whose image under $\pi_{p_k}$ is $g_i$.
The element
[TABLE]
then satisfies $\pi_{p_i}(g) = 1$ for all $i = 1,\ldots,k-1$,
whereas $\pi_{p_k}(g) \in \mathrm{Soc}(p_k)$ is not
contained in any proper normal subgroup of $\mathrm{Soc}(p_k)$.
Hence every element of $\mathrm{Soc}(p_k)$ is a product of conjugates
of $\pi_{p_k}(g)$, and we obtain that $\pi_{p_k}(\Omega_{k-1}) \geq \mathrm{Soc}(p_k)$.
∎
6 $T_2$-systems
This section explains why Theorem 1.11 is
equivalent to Theorems 1.3 and 1.4.
Namely, if we let $\Sigma_{2,-2}(p)$ denote the set of
$\mathrm{PSL}(2,p)$-defining subgroups of $F_2$ with
associated trace $-2$, our goal here is to show:
1. A one-to-one correspondence between $Y^*(p)$ and $\Sigma_{2,-2}(p)$, and
2. An isomorphism between $Q_p$, the group of permutations induced
by the action of $\Gamma$ on $Y^*(p)$, and the group
of permutations induced by the action of $\mathrm{Aut}(F_2)$
on $\Sigma_{2,-2}(p)$.
First, let us define Σ2,−2(p) properly. For
A,B∈PSL(2,p), define
[TABLE]
where ∼ is the equivalence of changing the sign of two of the
coordinates (each of A and B is a well-defined matrix in SL(2,p)
up to a sign). Assume ⟨A,B⟩=PSL(2,p),
and let φ:F2↠PSL(2,p)
be the epimorphism mapping the generators a and b of F2
to $A$ and $B$, respectively. The kernel $N = \ker\varphi$ is a $\mathrm{PSL}(2,p)$-defining
subgroup of $F_2$, and we define
[TABLE]
Recall that Σ2(G) denotes the set of G-defining
subgroups of F2.
Claim 6.1.
The map $\mathrm{Tr}\colon \Sigma_2(\mathrm{PSL}(2,p)) \to \mathbb{F}_p^3/{\sim}$
is well-defined.
Proof.
Let G=PSL(2,p). Given N∈Σ2(G),
all epimorphisms F2↠G with kernel N are
obtained one from the other by post-composition with some automorphism
from Aut(G). But every automorphism of G
is obtained by a conjugation by some element from PGL(2,p).
Evidently, such conjugation does not affect the image of $\mathrm{Tr}$ on
the images of the generators $a$ and $b$ of $F_2$.
∎
Recall that $\mathrm{tr}([A,B]) = Q(\mathrm{tr}A, \mathrm{tr}B, \mathrm{tr}AB)$
where $Q(x,y,z) = x^2 + y^2 + z^2 - xyz - 2$. Thus, for $N \in \Sigma_2(\mathrm{PSL}(2,p))$,
the element
[TABLE]
is well-defined, and we denote
[TABLE]
Note that, by definition, for every N∈Σ2,−2(p)
the triple Tr(N) is (an equivalence class up to sign
changes of) a solution to the Markoff equation (1)
over $\mathbb{Z}/p\mathbb{Z}$.
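The trace identity recalled above, and the resulting link between commutator trace $-2$ and the Markoff equation, can be spot-checked over $\mathbb{F}_p$. The following is a minimal sketch with hypothetical helper names; it samples pairs in $\mathrm{SL}(2,p)$ as products of elementary matrices (not uniformly distributed, which does not matter for checking an identity).

```python
import random


def mat_mul(A, B, p):
    """Product of two 2x2 matrices modulo p."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))


def mat_inv(A, p):
    """Inverse of a determinant-1 matrix modulo p (the adjugate)."""
    (a, b), (c, d) = A
    return ((d % p, -b % p), (-c % p, a % p))


def trace(A, p):
    return (A[0][0] + A[1][1]) % p


def random_sl2(p, rng):
    """An element of SL(2,p), built as a product of elementary matrices."""
    A = ((1, 0), (0, 1))
    for _ in range(8):
        s = rng.randrange(p)
        E = ((1, s), (0, 1)) if rng.random() < 0.5 else ((1, 0), (s, 1))
        A = mat_mul(A, E, p)
    return A


def Q(x, y, z, p):
    return (x * x + y * y + z * z - x * y * z - 2) % p


def check_trace_identity(p, trials=200, seed=0):
    """Check tr([A,B]) = Q(tr A, tr B, tr AB) on sampled pairs, and that
    tr([A,B]) = -2 exactly when the trace triple solves x^2+y^2+z^2 = xyz."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_sl2(p, rng), random_sl2(p, rng)
        AB = mat_mul(A, B, p)
        comm = mat_mul(AB, mat_mul(mat_inv(A, p), mat_inv(B, p), p), p)
        x, y, z = trace(A, p), trace(B, p), trace(AB, p)
        if trace(comm, p) != Q(x, y, z, p):
            return False
        is_markoff = (x * x + y * y + z * z - x * y * z) % p == 0
        if is_markoff != (trace(comm, p) == (-2) % p):
            return False
    return True
```

A call such as `check_trace_identity(101)` returns `True` when the identity holds on all sampled pairs.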
Claim 6.2.
The map $\mathrm{Tr}|_{\Sigma_{2,-2}(p)}$ is a bijection from
$\Sigma_{2,-2}(p)$ to $Y^*(p)$.
Proof.
Consider the map $\mathrm{Tr}\colon \mathrm{SL}(2,p) \times \mathrm{SL}(2,p) \to \mathbb{F}_p^3$
defined as in (25). By [Mac69, Theorems 2 and 3],
if $(x,y,z) \in \mathbb{F}_p^3$ is the image of some
generating pair in $\mathrm{SL}(2,p)$, then every two
pairs in $\mathrm{Tr}^{-1}((x,y,z))$ are
conjugate to one another by an element $g \in \mathrm{SL}(2,\mathbb{F}_p)$.
Since these pairs are generating, this conjugation by g is an automorphism
of SL(2,p). As every automorphism of SL(2,p)
is also an automorphism of PSL(2,p), we obtain
that
[TABLE]
is injective.
By [Mac69, Thm 1], the map $\mathrm{Tr}$
is surjective. The analysis in [MW13, Section 11]
shows that the only triple $(x,y,z) \in \mathbb{F}_p^3$
with $Q(x,y,z) = -2$ which does not correspond to generating
pairs is $(0,0,0)$. (To see that $(0,0,0)$ is not associated with a generating
pair, note that if $A \in \mathrm{PSL}(2,p)$ has trace $0$,
then $A$ is an involution. If both $A$ and $B$ are involutions,
then $\langle A,B \rangle$ is a dihedral group, which
is a proper subgroup of $\mathrm{PSL}(2,p)$.) This completes the proof of the claim.
∎
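For small primes the set $Y^*(p)$ can be enumerated directly, which gives a sanity check on the sizes $\frac{p(p+3)}{4}$ (for $p \equiv 1 \pmod 4$) and $\frac{p(p-3)}{4}$ (for $p \equiv 3 \pmod 4$) used in Section 5. The sketch below is a brute-force illustration with hypothetical function names; it assumes $Y^*(p)$ is the set of non-zero solutions modulo sign changes of two coordinates, as in the definition of $\sim$ above.

```python
def markoff_solutions(p):
    """Non-zero solutions of x^2 + y^2 + z^2 = xyz over Z/pZ."""
    return [(x, y, z)
            for x in range(p) for y in range(p) for z in range(p)
            if (x, y, z) != (0, 0, 0)
            and (x * x + y * y + z * z - x * y * z) % p == 0]


def sign_class(t, p):
    """Canonical representative of t under sign changes of two coordinates."""
    x, y, z = t
    orbit = [(x, y, z),
             (-x % p, -y % p, z),
             (-x % p, y, -z % p),
             (x, -y % p, -z % p)]
    return min(orbit)


def count_Y_star(p):
    """Number of sign-change classes of non-zero solutions modulo p."""
    return len({sign_class(t, p) for t in markoff_solutions(p)})
```

For instance, `count_Y_star(5)` gives 10 and `count_Y_star(11)` gives 22, matching $\frac{5 \cdot 8}{4}$ and $\frac{11 \cdot 8}{4}$ for the pair $p = 11$, $q = 5$ discussed in Lemma 5.13.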
It remains to show the isomorphism of $Q_p$ and the permutation
group induced by $\mathrm{Aut}(F_2)$ on $Y^*(p) \cong \Sigma_{2,-2}(p)$.
Recall that $Q_p = \langle \tau_{(12)}, \tau_{(23)}, R_3 \rangle$.
For $F_2 = F(a,b)$, $\mathrm{Aut}(F_2)$
is generated by the following Nielsen moves (we deliberately copy the notation for these Nielsen moves from [MW13]):
$r\colon (a,b) \mapsto (a^{-1},b)$, $s\colon (a,b) \mapsto (b,a)$
and $t\colon (a,b) \mapsto (a^{-1},ab)$. The induced
action of these three automorphisms on $Y^*(p)$ is easily
seen to be the same action given by $R_3$, $\tau_{(12)}$
and $\tau_{(23)}$, respectively.
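The correspondence between Nielsen moves and the generators of $Q_p$ can be checked on trace triples. The sketch below assumes that $R_3$ is the Vieta involution $(x,y,z) \mapsto (x,y,xy-z)$ and that $\tau_{(12)}$, $\tau_{(23)}$ swap the indicated coordinates; these conventions and all helper names are assumptions made for the illustration.

```python
import random


def mat_mul(A, B, p):
    """Product of two 2x2 matrices modulo p."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))


def mat_inv(A, p):
    """Inverse of a determinant-1 matrix modulo p."""
    (a, b), (c, d) = A
    return ((d % p, -b % p), (-c % p, a % p))


def random_sl2(p, rng):
    """An element of SL(2,p), built as a product of elementary matrices."""
    A = ((1, 0), (0, 1))
    for _ in range(8):
        s = rng.randrange(p)
        E = ((1, s), (0, 1)) if rng.random() < 0.5 else ((1, 0), (s, 1))
        A = mat_mul(A, E, p)
    return A


def trace_triple(A, B, p):
    """(tr A, tr B, tr AB) modulo p."""
    AB = mat_mul(A, B, p)
    return ((A[0][0] + A[1][1]) % p,
            (B[0][0] + B[1][1]) % p,
            (AB[0][0] + AB[1][1]) % p)


def check_nielsen_moves(p, trials=200, seed=0):
    """Compare the Nielsen moves r, s, t on pairs with R_3, tau_(12),
    tau_(23) on trace triples (conventions as assumed in the lead-in)."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_sl2(p, rng), random_sl2(p, rng)
        x, y, z = trace_triple(A, B, p)
        A_inv = mat_inv(A, p)
        if trace_triple(A_inv, B, p) != (x, y, (x * y - z) % p):  # r vs R_3
            return False
        if trace_triple(B, A, p) != (y, x, z):                    # s vs tau_(12)
            return False
        if trace_triple(A_inv, mat_mul(A, B, p), p) != (x, z, y): # t vs tau_(23)
            return False
    return True
```

The three comparisons rest on $\mathrm{tr}(A^{-1}) = \mathrm{tr}(A)$, $\mathrm{tr}(BA) = \mathrm{tr}(AB)$ and $\mathrm{tr}(A^{-1}B) = \mathrm{tr}(A)\mathrm{tr}(B) - \mathrm{tr}(AB)$ for matrices of determinant 1.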
Appendix
Appendix A On the order of a quadratic integer modulo most primes
By Dan Carmon
Throughout this appendix, we use the notation f≪g to mean that
there exists an absolute constant C>0 for which f≤Cg for
all valid values of the implicit variables. The similar notation $f \ll_a g$
means there exists a function $C = C(a) > 0$ for which $f \leq Cg$. The
notation f≍g is shorthand for “f≪g and g≪f”.
The main claim
Let $a \in \mathbb{Q}(\sqrt{D})$ be a fixed quadratic integer with
norm 1 and absolute value $|a| > 1$ (e.g. $a = \frac{3+\sqrt{5}}{2}$).
For primes $p \nmid D$, consider the residue $\bar{a} = (a \bmod p)$,
as an element of either $\mathbb{F}_p$ or $\mathbb{F}_{p^2}$,
depending on whether $D$ is a quadratic residue modulo $p$. In both
cases there are two possible choices for $\bar{a}$, but its order
$o_p(a)$, which is the smallest positive integer satisfying $\bar{a}^{o_p(a)} = 1 \in \mathbb{F}_{p^2}$,
is well-defined. Let $\pi(x) = \#\{p \leq x : p \text{ prime}\}$
be the prime counting function. We prove the following claim:
Proposition A.1.
For any constant C≥1,
[TABLE]
where δ is the Erdős-Tenenbaum-Ford constant,
[TABLE]
In particular, the set of primes with $o_p(a) > C\sqrt{p}$ has relative
density 1.
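For the example $a = \frac{3+\sqrt{5}}{2}$, the order $o_p(a)$ is easy to compute experimentally. The sketch below is an illustration under one stated assumption: since $a$ is a root of $t^2 - 3t + 1$, and this polynomial is separable modulo every prime $p \neq 5$, the order of $\bar{a}$ equals the order of the companion matrix of $t^2 - 3t + 1$ in $\mathrm{SL}(2,p)$. The function names are hypothetical.

```python
from math import sqrt


def mat_mul(A, B, p):
    """Product of two 2x2 matrices modulo p."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))


def is_prime(m):
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def order_of_a(p):
    """o_p(a) for a = (3 + sqrt 5)/2 and a prime p != 5, computed as the
    order of the companion matrix of t^2 - 3t + 1 modulo p."""
    M = ((3 % p, -1 % p), (1, 0))
    identity = ((1, 0), (0, 1))
    power, k = M, 1
    while power != identity:
        power = mat_mul(power, M, p)
        k += 1
    return k


def fraction_with_large_order(bound, C=1.0):
    """Fraction of primes p < bound, p != 5, with o_p(a) > C * sqrt(p)."""
    primes = [p for p in range(2, bound) if is_prime(p) and p != 5]
    large = [p for p in primes if order_of_a(p) > C * sqrt(p)]
    return len(large) / len(primes)
```

For example, `order_of_a(11)` is 5, a divisor of $p - 1$, and `order_of_a(7)` is 8, a divisor of $p + 1$. A small range such as `fraction_with_large_order(200)` only illustrates the density statement and proves nothing about its asymptotic form.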
Proof outline
Proposition A.1 follows from the combination
of two sub-lemmas:
Lemma A.2.
Let $\alpha = \alpha(x)$ tend to infinity arbitrarily
slowly with $x$, and let $y = \frac{\sqrt{x}}{\alpha}$. Then
[TABLE]
Lemma A.3.
Let $\alpha, y$ be as in the previous Lemma. Define
$z = C\sqrt{x}$, and $u_0 = \frac{\log\alpha}{\log x}$. Suppose further
that α∈(C24,Cx).
Then
[TABLE]
Indeed, since $a$ has norm 1, $o_p(a)$ is always a factor of either
$p-1$ when $D$ is a quadratic residue modulo $p$, or of $p+1$
when $D$ is a quadratic non-residue, i.e. $p \equiv \pm 1 \pmod{o_p(a)}$
in either case. Thus $o_p(a) \leq C\sqrt{x}$ implies that $p$ is
either included in the set of the first lemma if $o_p(a) \leq y$,
or in the set of the second lemma if $o_p(a) \in (y,z]$. Choosing
the optimal value
[TABLE]
yields the claimed value in the right hand side of both lemmas.
The following proof is an adaptation of an argument from Erdős
and Murty [EM99, Introduction], in which only integral values
a and a specific choice of α were considered.
For every $k \geq 1$ define $A_k = \frac{a^k - a^{-k}}{\sqrt{D}}$.
Note that $A_k$ is always an integer, with $|A_k| < |a|^k$,
and that $o_p(a) = k$ implies $p \mid A_k$. Define
[TABLE]
so that op(a)≤y implies p∣By. We now observe that
This lemma is a direct application of results due to Ford [For08].
We cite the relevant definitions and theorems. Ford’s main object
of study is the function
[TABLE]
We are particularly interested in the specialized function
[TABLE]
where Pλ={p+λ:p – prime} is a set of shifted
primes, and more specifically only for λ=±1.
In [For08, Theorem 1], Ford estimates H(x,y,z)
for all possible choices of y≤z≤x. The relevant case for
our choice of y,z is the third subcase of case (v), wherein x,y,z
are all large, $y \leq \sqrt{x}$, and $z \in [2y, y^2]$, all of which
are immediately validated for our values, due to the constraint on
α. For this case, the theorem states
[TABLE]
where $u$ is the number satisfying $z = y^{1+u}$, or equivalently
[TABLE]
In [For08, Theorem 6], Ford estimates $H(x,y,z;P_\lambda)$,
for any fixed non-zero $\lambda$. The behaviour of the function is
determined by whether $z$ is greater or less than $y + (\log y)^{2/3}$.
The constraint on $\alpha$ implies $z \geq 2y$, so we are certainly
in the regime of $z \geq y + (\log y)^{2/3}$, in which the theorem yields
[TABLE]
Combining the estimates (33),(34),(35)
yields (28), proving the lemma.
∎
Bibliography
[BGS16] Jean Bourgain, Alexander Gamburd, and Peter Sarnak. Markoff triples and strong approximation. Comptes Rendus Mathematique, 354(2):131–135, 2016.
[BGS17] Jean Bourgain, Alexander Gamburd, and Peter Sarnak. Markoff surfaces and strong approximation: 1. arXiv preprint arXiv:1607.01530, 2017+.
[CGMP16] Alois Cerbu, Elijah Gunther, Michael Magee, and Luke Peilen. The cycle structure of a Markoff automorphism over finite fields. preprint arXiv:1610.07077, 2016.
[DM96] John D. Dixon and Brian Mortimer. Permutation groups. Springer Science & Business Media, 1996.
[EM99] Pál Erdős and M. Ram Murty. On the order of a (mod p). CRM Proceedings and Lecture Notes, 19:87–97, 1999.
[Eva93] Martin J. Evans. T-systems of certain finite simple groups. Mathematical Proceedings of the Cambridge Philosophical Society, 113(1):9–22, 1993.
[For08] Kevin Ford. The distribution of integers with a divisor in a given interval. Annals of Mathematics, 168(2):367–433, 2008.
[Gil77] Robert Gilman. Finite quotients of the automorphism group of a free group. Canad. J. Math, 29(3):541–551, 1977.