Reductions of abelian surfaces over global function fields
Davesh Maulik, Ananth N. Shankar, and Yunqing Tang
Abstract.
Let A be a non-isotrivial ordinary abelian surface over a global function field with good reduction everywhere. Suppose that A does not have real multiplication by any real quadratic field with discriminant a multiple of p. We prove that there are infinitely many places modulo which A is isogenous to the product of two elliptic curves.
Let p be an odd prime and let A2 denote the moduli stack of principally polarized abelian surfaces over Fp.
We view A2 as (the special fiber of the canonical integral model of) a GSpin Shimura variety and let Z(m) denote the Heegner divisors in A2; more precisely, Z(m) parametrizes abelian surfaces with a special endomorphism s such that s∘s is the endomorphism given by multiplication by m (see §2.2).
Theorem 1.
Assume $p \ge 5$. Let $C$ be an irreducible smooth quasi-projective curve with a finite morphism $C \to \mathcal{A}_{2,\mathbb{F}_p}$. Assume that the generic point of $C$ corresponds to an ordinary abelian surface.
(1) If the image of $C$ is not contained in any Heegner divisor $Z(m)$, and if $C$ is projective, then there exist infinitely many $\overline{\mathbb{F}}_p$-points on $C$ which correspond to non-simple abelian surfaces.
(2) If the image of $C$ is contained in some $Z(m)$ such that $p \nmid m$, then there exist infinitely many $\overline{\mathbb{F}}_p$-points on $C$ which correspond to abelian surfaces isogenous to self-products of elliptic curves.
In Theorem 1(2), note that the elliptic curve may vary for these points. An equivalent statement is that there exist infinitely many $\overline{\mathbb{F}}_p$-points on $C$ which correspond to abelian surfaces whose Néron–Severi ranks are strictly larger than that of the generic point of $C$. Note that in case (2), any irreducible component of $Z(m) \subset \mathcal{A}_2$ is an irreducible component of a Hecke translate of some Hilbert modular surface associated to the real quadratic field $F = \mathbb{Q}(\sqrt{m})$ (if $m$ is a square number, then we obtain a Hecke translate of the self-product of the modular curve).
Remark.
The assumption that the generic point is ordinary is necessary (especially if we formulate the theorem in terms of the Néron–Severi rank). For instance, in the case (2), we may take C to be an irreducible component of the non-ordinary locus. If p is inert in F, then all the points on C are supersingular and the Néron–Severi rank does not jump. If p is split in F, then the only points where the Néron–Severi rank jumps are the finitely many supersingular points.
Remark.
We make the (technical) assumption that $C$ is projective in (1) because the Heegner divisors $Z(m)$ are all non-compact; we plan to remove this assumption in future work. On the other hand, the Hilbert modular surfaces considered in (2) do contain compact special divisors (see the second half of §2.2 for the definitions of special divisors in the Hilbert case, and §4.3.3 for a criterion for when these special divisors are compact) whose $\mathbb{F}_p$-points parameterize abelian surfaces isogenous to a self-product of elliptic curves. By working exclusively with these compact special divisors, we no longer need to assume that $C$ is projective.
Remark 2.
A modification of our argument shows that, under the same assumptions as in (1), for a fixed real quadratic number field $F$, there are infinitely many ordinary $\overline{\mathbb{F}}_p$-points on $C$ such that the corresponding abelian surfaces admit real multiplication by $F$. Here we need to assume $p \ge 7$ if $p$ is ramified in $F$; otherwise, $p \ge 5$ is enough.
To prove Theorem 1(1), we consider the intersection number of $C$ and $Z(\ell^2)$, where $\ell$ is a varying prime number. If we consider $Z(\ell)$ with $\ell \equiv 3 \bmod 4$ instead, we prove the following.
Theorem 3.
Suppose we have the same assumptions as in Theorem 1(1). Then there are infinitely many ordinary $\overline{\mathbb{F}}_p$-points on $C$ such that, for each of these points, the corresponding abelian surface admits real multiplication by the ring of integers of some real quadratic field (note that the quadratic fields may vary for these points).
It would be interesting to find $\overline{\mathbb{F}}_p$-points of complex multiplication by maximal orders, but our current method only asserts real multiplication by maximal orders.
1.2. Previous work and heuristics
Theorem 1 is a generalization of [CO06, Proposition 7.3], where Chai and Oort proved Theorem 1(2) with A1×A1 taking the place of a Hilbert modular surface. Their proof crucially uses the product structure of the Shimura variety, as well as the product structure of the Frobenius morphism. Following the discussion in §7 of [CO06], Theorem 1 is related to a bi-algebraicity conjecture. See §1.4 for more details.
We offer the following heuristic for Theorem 1(1).
Using Honda and Tate’s classification of $\mathbb{F}_{q^n}$-isogeny classes of abelian varieties in terms of Weil $q^n$-numbers, the number of $\mathbb{F}_{q^n}$-isogeny classes of abelian varieties is seen to equal $q^{n(3/2+o(1))}$. Similarly, the number of split $\mathbb{F}_{q^n}$-isogeny classes in $\mathcal{A}_2$ is seen to equal $q^{n(1+o(1))}$. If we treat the map from $C(\mathbb{F}_{q^n})$ to the set of $\mathbb{F}_{q^n}$-isogeny classes as a random map, we expect that the number of $\mathbb{F}_{q^n}$-points of $C$ which are not simple is around $q^{(n/2)(1+o(1))}$. Letting $n$ approach infinity, this heuristic suggests that infinitely many points of $C(\overline{\mathbb{F}}_q)$ are split.
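The exponent bookkeeping in this heuristic can be sketched numerically. The snippet below is only an illustration under the random-map assumption; the function names and the choice to drop every o(1) term are assumptions of this sketch, not part of the paper.

```python
from fractions import Fraction

def expected_split_exponent(points_exp=Fraction(1),
                            all_classes_exp=Fraction(3, 2),
                            split_classes_exp=Fraction(1)):
    """Exponent e such that roughly q^(n*e) points of C(F_{q^n}) are split.

    Random-map heuristic with o(1) terms dropped:
      #points * (#split classes / #all classes)
      = q^n * q^n / q^(3n/2) = q^(n/2).
    """
    return points_exp + split_classes_exp - all_classes_exp

exponent = expected_split_exponent()
print(exponent)  # 1/2

def expected_split_points(q, n):
    """Heuristic count q^(n/2) of split points over F_{q^n} (o(1) terms dropped)."""
    return float(q) ** (n * float(exponent))
```

Since the exponent 1/2 is positive, the heuristic count grows without bound as n increases, which is the content of the last sentence of the heuristic.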
There are analogous questions in other settings. For the case of equicharacteristic $0$, these results are well known (for instance, the density of Noether–Lefschetz loci is discussed in [Voisin]). In mixed characteristic, the analogue of Theorem 1(2) is treated in [Ch], [Ananth17]. The major difference between Theorem 1 and these other cases is that the ordinary generic point assumption is crucial, since the result is simply false otherwise (as remarked in §1.1).
Indeed, this difference hints at the key difficulty in our setting, which is that the local intersection number at a supersingular point is of the same magnitude as the total intersection number, which makes the approach more complicated than that of [Ananth17]; we discuss this in more detail in §1.3.
1.3. Proof of the main results
We view both Hilbert modular surfaces and the Siegel three-fold as GSpin Shimura varieties attached to a quadratic space (V,Q). In each setting, we have a notion of special endomorphisms and special divisors and, for simplicity, we use the same notation Z(m).
The main idea of the proof is to compare the global and local intersection numbers of $C.Z(m)$ for appropriate sequences of $m$, and to show that it is not possible for finitely many points to account for the total global intersection as $m$ increases. (Footnote 1: Although $C$ is not a substack of $\mathcal{A}_2$, we may define $C.Z(m)$ as the degree of the pullback of $Z(m)$ via $C \to \mathcal{A}_{2,\overline{\mathbb{F}}_p}$ when $C$ is projective.)
More precisely,
(1) The global intersection number $I(m) := C.Z(m)$ is controlled by Borcherds theory [Bor98] (see also [Davesh] and [HMP]).
(2) We prove that as $m \to \infty$, the total local contribution from supersingular points is at most $\frac{11}{12} I(m)$ by studying special endomorphisms. (Footnote 2: Indeed, the ratio depends on $p$ and it goes to $1/2$ as $p \to \infty$.)
(3) We prove that the local contribution from a non-supersingular point is $o(I(m))$ as $m \to \infty$.
This allows us to conclude that, as m→∞, more and more points of C contribute to the intersection C.Z(m). In order to prove Theorem 1(1), the sequence of m will consist only of squares, and in order to prove Theorem 3, the sequence will consist only of primes. Note that in A2, the Heegner divisor Z(m) for square m parametrizes abelian surfaces which are not geometrically simple, thereby allowing us to deduce Theorem 1(1). Similar arguments allow us to deduce Theorem 1(2) and Theorem 3.
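The counting step behind this conclusion can be sketched as follows. This is a minimal illustration, assuming that supersingular points account for at most a fixed share (11/12, the bound stated for supersingular points) of I(m), and that each non-supersingular point accounts for at most a share eps of I(m); the function name and the parameter eps are hypothetical.

```python
from fractions import Fraction
from math import ceil

def min_contributing_points(eps, supersingular_share=Fraction(11, 12)):
    """Lower bound on the number of non-supersingular points meeting Z(m).

    Assumes (i) supersingular points account for at most
    `supersingular_share` of I(m), and (ii) each non-supersingular point
    accounts for at most `eps` * I(m). The remaining share must then be
    spread over at least ceil((1 - supersingular_share) / eps) points.
    """
    remaining = 1 - Fraction(supersingular_share)
    return ceil(remaining / Fraction(eps))

# As the per-point share eps shrinks (the o(I(m)) estimate), the bound grows.
for eps in (Fraction(1, 12), Fraction(1, 120), Fraction(1, 1200)):
    print(eps, min_contributing_points(eps))
```

Because the per-point share tends to zero as m grows, no fixed finite set of points can absorb the remaining share for all m.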
Compared to the number field situation, the main difficulty of the positive characteristic function field case is that the local contributions at supersingular points are of the same magnitude as the global contribution. More precisely, taking the Hilbert case as an example, Borcherds theory implies that the generating series of Z(m) is a non-cuspidal modular form of weight 2;
on the other hand, the theta series attached to the special endomorphism lattice at a supersingular point is also a non-cuspidal weight 2 modular form since the lattice is of rank 4.
Therefore, even without considering higher intersection multiplicities, the local intersection number of C.Z(m) at a supersingular point is also of the same magnitude as the growth rate of Fourier coefficients of an Eisenstein series of weight 2.
Bounding the local contribution from a supersingular point
Let $\mathcal{A} \to C$ denote the family of principally polarized abelian surfaces induced from a morphism $C \to \mathcal{A}_{2,\mathbb{F}_p}$, and let $\operatorname{Spf} \mathbb{F}_p[[t]] \to C$ denote the formal neighborhood of a supersingular point. For a special endomorphism $s$ such that $s \circ s = m$, we say that $s$ is of norm $m$.
The local contribution to $C.Z(m)$ from this supersingular point equals $\sum_{n=0}^{\infty} r_n(m)$, where $r_n(m)$ is the number of special endomorphisms of $\mathcal{A} \bmod t^{n+1}$ with norm $m$. Therefore, in order to bound the local contribution, it suffices to prove that, as $n \to \infty$, there are many special endomorphisms of $\mathcal{A} \bmod t^{n}$ which decay rapidly enough (see 5.1.1, Theorem 5.1.2, Theorem A.0.1 for precise statements).
A similar decay result appears in the mixed characteristic setting (see [Ananth17]), by a straightforward application of Grothendieck–Messing theory. In the equicharacteristic case, however, proving our decay results is much more involved. In particular, we need to use Kisin’s description [Kisin, §1.4, 1.5] of the F-crystal associated to a certain automorphic vector bundle Lcris, whose F-invariant part is the lattice of special endomorphisms, in order to prove the required decay. See §3.1.5 and the proof of Theorem 5.1.2 for more details.
We will focus on the Siegel case from now on. Let $L_0$ denote the lattice of special endomorphisms of $\mathcal{A} \bmod t$, and let $L_n \subset L_0$ be the lattice of special endomorphisms of $\mathcal{A} \bmod t^{n+1}$. These lattices are of rank 5 and are equipped with natural quadratic forms such that $\mathcal{A} \bmod t^{n+1}$ admits a special endomorphism of norm $m$ if and only if $m$ is represented by $L_n$. Broadly speaking, we can bound the local contribution by using geometry-of-numbers techniques. To obtain the desired estimate, we choose the sequence $m$ as follows. We first prove the existence of a rank 2 sublattice $P_n \subset L_n$ that has the following property: for all $m$ bounded by an appropriate function of $n$, the abelian surface $\mathcal{A} \bmod t^{n+1}$ has a special endomorphism of norm $m$ only if the quadratic form restricted to $P_n$ represents $m$. This fact follows from the existence of a rank 3 submodule of special endomorphisms which decay rapidly (Theorems 5.1.2 and A.0.1). Furthermore, the discriminant of $P_n$ goes to infinity as $n \to \infty$.
Therefore, the density of numbers (or primes, or prime-squares) represented by the binary quadratic form $P_n$ approaches zero as $n \to \infty$.
We now pick a sequence of prime-squares $m$, none of which is represented by any of the lattices $P_n$ attached to the finitely many supersingular points on $C$.
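The thinning-out of represented integers can be observed in a toy family. The sketch below counts integers represented by the hypothetical forms x^2 + d*y^2 (discriminant -4d) as d grows; these forms are chosen only for illustration and are not the lattices P_n of the paper.

```python
from math import isqrt

def count_represented(bound, d):
    """Count integers 1 <= m <= bound of the form x^2 + d*y^2 with d >= 1."""
    represented = set()
    y = 0
    while d * y * y <= bound:
        rest = bound - d * y * y
        for x in range(isqrt(rest) + 1):
            m = x * x + d * y * y
            if m >= 1:
                represented.add(m)
        y += 1
    return len(represented)

bound = 10_000
for d in (1, 10, 100, 1_000, 10_000):
    # Proportion of integers up to `bound` represented by x^2 + d*y^2.
    print(d, count_represented(bound, d) / bound)
```

Once d exceeds the bound, only perfect squares are represented, so the proportion drops to roughly one over the square root of the bound.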
The non-ordinary locus is singular at superspecial points. This allows us to prove the existence of a special endomorphism that decays “more rapidly than expected” (see 5.1.1(3)). Consequently, by the explicit formula of Eisenstein series in these cases by [BK01], we prove that the sum of local contributions at supersingular points is at most 11/12 of the global contribution.
We remark that our proof is more involved than the proof of [CO06, Proposition 7.3] because the intersection theory on Hilbert modular surfaces and Siegel three-folds is more complicated than that on the product of j-lines.
1.4. Additional remarks
The key difference between the number field and function field situations is the following. Let $A$ be an abelian surface over $\mathcal{O}_K$, where $K$ is a local field. The $\mathbb{Z}_p$-module of special endomorphisms of $A[p^\infty]$ has rank $\le 3$. This rank equals three if and only if $A$ can be realized as the limit point (in the analytic topology) of a sequence of CM points. This can happen in the mixed characteristic case, but not in the equicharacteristic $p$ case unless $A$ is defined over a finite field. (Footnote 3: Ordinary abelian varieties which have CM are defined over finite fields.) Thus, we have a rank 3 decay in the Decay Lemmas (Theorems 5.1.2 and A.0.1).
In the setting of higher dimensional GSpin Shimura varieties, for the same reason, we expect that generalizations of the Decay Lemma will only yield a rank 3 $\mathbb{Z}_p$-module that decays rapidly. A consequence is the existence of formal curves such that the module of special endomorphisms of the p-divisible group over these formal curves has large rank. An interesting bi-algebraicity question is whether such formal curves can be algebraic without being special. In the ordinary case, Chai has the following conjecture:
Let X be a subvariety in a mod p Shimura variety passing through an ordinary point P. Assume that the formal germ of X at P is a formal torus in the Serre–Tate coordinates. Then X is a Shimura subvariety.
1.5. Organization of paper
In §2, we recall the notion of special endomorphisms, special divisors and crystalline realization Lcris of the automorphic vector bundle of special endomorphisms.
In §3, we recall the lattices of special endomorphisms of a supersingular point and compute Lcris on its deformation space.
In §4, we recall Borcherds theory and the explicit formula for the Fourier coefficients of vector-valued Eisenstein series due to Bruinier–Kuss; we use them to compare the global intersection number and the mod $t$ local intersection number at a supersingular point. Sections 5 and 6 are the key technical parts of the paper: there we prove the decay theorems for special endomorphisms, which we will use to bound the higher local intersection multiplicities at supersingular points. Section 7 provides the outline of the main proofs; using geometry-of-numbers arguments, we prove Theorem 1(2) in §8 and prove Theorem 1(1) and Theorem 3 in §9.
In order to get the main idea of the proof, the reader may focus on Theorem 1(2) and start from §§7,8 and refer back to §§3-5 when necessary.
1.6. Notation
We write $f \asymp g$ if $f = O(g)$ and $g = O(f)$. Throughout the paper, $p$ is an odd prime.
Acknowledgement
We thank Johan de Jong, Keerthi Madapusi Pera, Arul Shankar, Salim Tayou, and Jacob Tsimerman for helpful discussions.
D.M. is partially supported by NSF FRG grant DMS-1159265. Y.T. is partially supported by the NSF grant DMS-1801237.
We would like to thank the referee for valuable suggestions on the exposition of the paper.
2. Special endomorphisms
In this section, we first introduce quadratic lattices (L,Q) such that the associated GSpin Shimura varieties will be A2 and certain Hilbert modular surfaces related to the Heegner divisors Z(m). The definitions of special endomorphisms and Heegner divisors are given in §2.2.
2.1. The global lattice L
For a quadratic $\mathbb{Z}$-lattice $(L,Q)$, let $C(L)$ (resp. $C^+(L)$) denote the Clifford algebra (resp. even Clifford algebra) of $L$. Let $(-)'$ denote the standard involution on $C(L)$ fixing all elements in $L$, given by $(v_1 \cdots v_n)' = v_n \cdots v_1$ for $v_i \in L$. Let $V$ denote $L \otimes \mathbb{Q}$ endowed with the quadratic form $Q$. There is a bilinear form $[-,-]$ on $V$ given by $[x,y] := Q(x+y) - Q(x) - Q(y)$.
Let $L_S$ be the rank 5 $\mathbb{Z}$-lattice endowed with the quadratic form $Q(x) = x_0^2 + x_1 x_2 - x_3 x_4$ for $x = (x_0, \dots, x_4) \in \mathbb{Z}^5$. This quadratic form has signature $(3,2)$ and $L_S$ is an even lattice, maximal among $\mathbb{Z}$-valued sublattices in $L_S \otimes \mathbb{Q}$. For $p > 2$, $L_S$ is self-dual at $p$. A direct computation shows that $C^+(L_S) \cong M_4(\mathbb{Z})$.
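The stated signature and self-duality can be checked by a direct computation. The sketch below builds the Gram matrix of the bilinear form on the standard basis and reads off the signature from an orthogonal set of vectors; the helper names and the particular orthogonal vectors are choices made for this illustration.

```python
from itertools import permutations

def Q(x):
    """Quadratic form on L_S: Q(x) = x0^2 + x1*x2 - x3*x4."""
    return x[0] ** 2 + x[1] * x[2] - x[3] * x[4]

def pairing(x, y):
    """Bilinear form [x, y] = Q(x + y) - Q(x) - Q(y)."""
    s = [a + b for a, b in zip(x, y)]
    return Q(s) - Q(x) - Q(y)

def determinant(matrix):
    """Exact integer determinant via the Leibniz formula (fine for 5 x 5)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

basis = [[int(i == j) for j in range(5)] for i in range(5)]
gram = [[pairing(u, v) for v in basis] for u in basis]
gram_det = determinant(gram)

# Pairwise orthogonal vectors (chosen for this sketch) to read off the signature.
orthogonal = [[1, 0, 0, 0, 0], [0, 1, 1, 0, 0], [0, 1, -1, 0, 0],
              [0, 0, 0, 1, 1], [0, 0, 0, 1, -1]]
values = [Q(v) for v in orthogonal]
signature = (sum(v > 0 for v in values), sum(v < 0 for v in values))
print(gram_det, signature)
```

The Gram determinant is a power of 2 up to sign, so it is a unit at every odd prime, which is consistent with self-duality at p for p > 2.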
Let
[TABLE]
Then $\delta := v_0 \cdots v_4 \in C(L_S)$ lies in the center of $C(L_S)$ and satisfies $\delta' = \delta$, $\delta^2 = 1$. Therefore, there is an isomorphism between quadratic spaces given by $L_S \simeq \delta L_S \subset C^+(L_S)$. (See for instance [KR3, App. A].)
Given a vector $x \in L_S$ such that $Q(x) = m$ with $m \in \mathbb{Z}_{>0}$, the orthogonal complement $x^\perp \subset L_S$, endowed with the restriction of $Q$, is a quadratic $\mathbb{Z}$-lattice of signature $(2,2)$; let $L_H \subset x^\perp \otimes \mathbb{Q}$ be a maximal lattice containing $x^\perp$. If $m$ is not a perfect square, let $F$ denote the real quadratic field $\mathbb{Q}(\sqrt{m})$. A direct computation shows that there is an isomorphism $L_H \otimes \mathbb{Q} \cong \mathbb{Q}^2 \oplus F$ such that $Q((a,b,\gamma)) = ab + \mathrm{Nm}_{F/\mathbb{Q}}\,\gamma$. The assumptions $p \nmid m$ and $p > 2$ imply that $x^\perp$, and hence $L_H$, is self-dual at $p$.
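For one concrete choice of x, the direct computation can be carried out explicitly. The sketch below takes the hypothetical vector x = (0, 1, m, 0, 0) with m = 5, writes down a basis of its orthogonal complement, and checks that the restricted form agrees with ab + Nm(u + v*sqrt(m)); the choice of x, of m, and the helper names are assumptions of this illustration.

```python
def Q(y):
    """Quadratic form on L_S: Q(y) = y0^2 + y1*y2 - y3*y4."""
    return y[0] ** 2 + y[1] * y[2] - y[3] * y[4]

def pairing(x, y):
    """Bilinear form [x, y] = Q(x + y) - Q(x) - Q(y)."""
    s = [a + b for a, b in zip(x, y)]
    return Q(s) - Q(x) - Q(y)

m = 5                      # hypothetical non-square value, for illustration
x = [0, 1, m, 0, 0]        # Q(x) = 1 * m = m

# [x, y] = y2 + m*y1, so the complement is cut out by y2 = -m*y1.
perp_basis = [[1, 0, 0, 0, 0], [0, 1, -m, 0, 0],
              [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]]

def Q_on_perp(c):
    """Q restricted to the complement, in coordinates c = (c0, c1, c3, c4)."""
    y = [sum(c[i] * perp_basis[i][j] for i in range(4)) for j in range(5)]
    return Q(y)

def model_form(a, b, u, v):
    """ab + Nm(u + v*sqrt(m)) = ab + u^2 - m*v^2."""
    return a * b + u * u - m * v * v

# Identification used here: a = -c3, b = c4, gamma = c0 + c1*sqrt(m).
sample = (2, 3, 1, 4)
print(Q_on_perp(sample), model_form(-sample[2], sample[3], sample[0], sample[1]))
```

In these coordinates the restricted form is c0^2 - m*c1^2 - c3*c4, whose Gram determinant is 4m, so the complement is self-dual at p exactly when p is odd and does not divide m.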
Now let $(L,Q)$ have signature $(n,2)$, and let $p$ be a prime such that $(L,Q)$ is self-dual at $p$. As in [AGHMP, §4.1, §4.2] and [KR3, §1], there is a GSpin Shimura variety $M$ attached to $(L,Q)$, and this Shimura variety admits a smooth integral model $\mathcal{M}$ over $\mathbb{Z}_{(p)}$ since $L$ is self-dual at $p$; the Shimura variety (and its integral model) recovers the moduli space of principally polarized abelian surfaces when $L = L_S$ (see Remark 2.2.2 for details), and it is a Hilbert modular surface when $L = L_H$ (see for instance [HY, §2.2, §3.1]). We may write $M_L$ and $\mathcal{M}_L$ to emphasize the dependence on $L$.
We first introduce the notion of special endomorphisms when L=LS and M is the moduli space of principally polarized abelian surfaces. Given an M-scheme S, let AS denote the pullback of the universal principally polarized abelian surface on M via S→M; let † denote the Rosati involution on AS.
Definition 2.2.1.
A special endomorphism of $A_S$ is an element $s \in \mathrm{End}(A_S)$ such that $s^\dagger = s$ and $\operatorname{Tr} s = 0$, where $\operatorname{Tr}$ is the reduced trace on the semisimple algebra $\mathrm{End}(A_S) \otimes \mathbb{Q}$.
Remark 2.2.2.
Our definition of special endomorphisms is essentially the same as the one given by Kudla–Rapoport ([KR3, Def. 2.1, Eqn. (2.21)]). Indeed, as in [KR3, §§1-2], the moduli problem indicates that every $\mathcal{M}$-scheme $S$ gives rise to a principally polarized abelian scheme $B_S$ over $S$ with $\iota: C^+(L) \hookrightarrow \mathrm{End}(B_S)$ and a polarization such that the induced Rosati involution $\dagger$ satisfies $\iota(c)^\dagger = \iota(c^T)$, where $(-)^T$ is the transpose on $C^+(L) \simeq M_4(\mathbb{Z})$ (see condition (iii) and the first paragraph of [KR3, p.701]); moreover, for each $\ell \neq p$, there is an isomorphism $C^+(L) \otimes \mathbb{Z}_\ell \simeq T_\ell(B_S)$, where $T_\ell$ denotes the $\ell$-adic Tate module, compatible with the $C^+(L)$-action (it acts on itself via left multiplication; see [KR3, p.703]). (Footnote 4: Although Kudla–Rapoport uses abelian schemes up to isogeny to give the moduli interpretation, one may translate it into abelian schemes up to isomorphism; see also [AGHMP17, §2.2].) Therefore, via $\iota$, we have $B_S \cong A_S^4$, where $A_S$ is an abelian surface, and by the compatibility of the polarization with $\iota$ (see also [KR3, Eqn. (1.9), (1.10)]), the polarization on $B_S$ is induced by the self-product of a principal polarization on $A_S$. Hence $\mathcal{M}$ parameterizes principally polarized abelian surfaces.
Moreover, an element $s_B$ in $\mathrm{End}(B_S) \cong M_4(\mathrm{End}(A_S))$ commuting with $\iota(C^+(L))$ is of the form $\mathrm{diag}(s,s,s,s)$, where $s$ is an endomorphism of $A_S$. In the sense of Kudla–Rapoport, such an $s_B$ is special if and only if it is traceless and fixed by the Rosati involution on $B_S$; this is equivalent to $s$ being traceless and fixed by the Rosati involution on $A_S$. Therefore, our definition is the same as that of Kudla–Rapoport.
Definition 2.2.3.
Let $D$ denote the Dieudonné crystal over $\mathcal{M}_{\mathbb{F}_p}$ (i.e., the first relative crystalline homology of the universal family of principally polarized abelian surfaces over $\mathcal{M}_{\mathbb{F}_p}$). Let $L_{\mathrm{cris}} \subset \mathrm{End}(D)$ denote the sub-crystal of trace $0$ elements fixed by the Rosati involution. (Footnote 5: Note that Frobenius is not an endomorphism on $\mathrm{End}(D)$, due to the existence of negative slopes. However, we will abuse terminology, and still treat $\mathrm{End}(D)$ and $L_{\mathrm{cris}}$ as F-crystals in the sense that we will view Frobenius as a map from $\mathrm{End}(D)$ to $\mathrm{End}(D)[1/p]$, while remembering the integral structure, and similarly for $L_{\mathrm{cris}}$.)
By definition, when $S$ is an $\mathcal{M}_{\mathbb{F}_p}$-scheme, an element $s \in \mathrm{End}(A_S)$ is a special endomorphism if and only if the crystalline realization of $s$ in $\mathrm{End}(D_S)$ lies in $L_{\mathrm{cris},S}$.
Definition 2.2.4.
For the p-divisible group $A_S[p^\infty]$, we say $s \in \mathrm{End}(A_S[p^\infty])$ is a special endomorphism if the image of $s$ in $\mathrm{End}(D_S)$ lies in $L_{\mathrm{cris},S}$.
Remark 2.2.5.
In [MP16, §4.14], there is a definition of Lcris as a direct summand of the endomorphism of the first relative crystalline cohomology of the Kuga–Satake abelian scheme over MFp. More precisely, the left multiplication of GSpin(V,Q)⊂C+(V)× acting on C(V) induces a variation of Hodge structures on C(V) over M; this gives rise to the Kuga–Satake abelian scheme AKS over M and the Kuga–Satake abelian scheme extends over M. The 8-dimensional abelian scheme considered by Kudla–Rapoport is a sub abelian scheme of AKS via the natural embedding C+(V)⊂C(V). (Note that in [KR3], γ∈GSpin(V,Q) acts on C+(V) by the right multiplication by γ and C+(V) acts on C+(V) by left multiplication, which is opposite to the convention in [MP16]. This difference is due to the different choices of the symplectic pairing on C+(V) and C(V) in [KR3, (1.9)] and [MP16, §1.6]. If we use the symplectic pairing in [MP16] for the discussion in [KR3], then we obtain similar results as in [KR3] but with the convention consistent with that in [MP16].)
Let DKS denote the Dieudonné crystal of AKS over MFp; Madapusi Pera defined Lcris⊂End(DKS) as the crystalline realization of the absolute Hodge cycle induced by the GSpin(V,Q)-invariant idempotent which realizes V⊂End(C(V)) as a direct summand. Since the element δ given in §2.1 lies in the center of C(L), it induces an isomorphism End(C(L))⊃L≅δL⊂End(C+(L)) compatible with the GSpin(V,Q)-action.
Therefore, δ induces an isomorphism between the crystal Lcris in our sense and the one in the sense of Madapusi Pera; in particular, the notions of special endomorphisms coincide under the identification via δ. Also, for a special endomorphism s in both cases, s∘s is the scalar multiple Q(s) on the suitable abelian scheme; since δ²=1, Q(s) remains the same for the images of s under the various identifications of special endomorphisms. By [MP16, Lem. 5.2], Q(s)>0 for all nonzero special endomorphisms s.
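Definition 2.2.6.
For $m \in \mathbb{Z}_{>0}$, the special divisor $\mathcal{Z}(m)$ is the Deligne–Mumford stack over $\mathcal{M}$ with functor of points $\mathcal{Z}(m)(S) = \{ s \in \mathrm{End}(A_S) \text{ special} \mid Q(s) = m \}$ for any $\mathcal{M}$-scheme $S$. We use the same notation for the image of $\mathcal{Z}(m)$ in $\mathcal{M}$. By, for instance, [AGHMP, Prop. 4.5.8], $\mathcal{Z}(m)$ is flat over $\mathbb{Z}_{(p)}$ and hence $\mathcal{Z}(m)_{\mathbb{F}_p}$ is still a divisor of $\mathcal{M}_{\mathbb{F}_p}$; we denote $\mathcal{Z}(m)_{\mathbb{F}_p}$ by $Z(m)$.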
Lemma 2.2.7.
Every $\overline{\mathbb{F}}_p$-point of $Z(m^2)$ corresponds to a geometrically non-simple abelian surface.
Proof.
Let $s$ be a special endomorphism of an abelian surface $A$ such that $s \circ s = [m^2]$. Then $(s - [m]) \circ (s + [m]) = 0$. Since $\operatorname{Tr} s = 0$, we have $s \pm [m] \neq 0$, and hence neither $s - [m]$ nor $s + [m]$ is invertible. Then $\ker(s - [m])$ defines a non-trivial sub abelian scheme of $A$.
∎
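The algebra in this proof can be mirrored in a toy model where 2 x 2 integer matrices stand in for endomorphisms. The matrix s and the value m = 5 below are hypothetical choices for illustration; the sketch only checks the factorization and the non-invertibility of both factors, not anything about abelian surfaces.

```python
def matmul(a, b):
    """Product of two 2 x 2 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shift(s, c):
    """Return s + c * identity."""
    return [[s[i][j] + (c if i == j else 0) for j in range(2)]
            for i in range(2)]

def det(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

m = 5
s = [[3, 4], [4, -3]]            # traceless, with s*s = m^2 * identity

s_squared = matmul(s, s)
product = matmul(shift(s, -m), shift(s, m))   # (s - m)(s + m)

# Both factors are nonzero (s is traceless, so s != +m or -m), yet their
# product vanishes, so neither factor can be invertible.
print(s_squared, product, det(shift(s, -m)), det(shift(s, m)))
```

In this toy model the kernel of s - m is spanned by (2, 1), a proper nonzero subspace, which plays the role of the sub abelian scheme in the lemma.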
We now discuss the case when $L = L_H$. We keep the same notation as in §2.1. For simplicity, we first discuss the case when $L_H = x^\perp$, where $x \in L_S$ and $Q(x) = m$ with $p \nmid m$; for the general case, the following discussion still holds true when replacing endomorphisms with suitable elements in $\mathrm{End} \otimes \mathbb{Q}$ (see the end of this subsection). When $L_H = x^\perp \subset L_S$, the Shimura variety (and its integral model) $\mathcal{M}_{L_H}$ defined by $L_H$ is naturally a sub-Shimura variety of $\mathcal{M}_{L_S}$, the moduli space of principally polarized abelian surfaces, and hence a point on $\mathcal{M}_{L_H}$ corresponds to a polarized abelian surface with real multiplication by $\mathcal{O} := \mathbb{Z}[x]/(x^2 - m)$. Let $\sigma$ denote the ring automorphism of $\mathcal{O}$ satisfying $x^\sigma = -x$. As before, let $S$ be an $\mathcal{M}_{L_H}$-scheme, and let $A_S$ denote the abelian surface over $S$ with real multiplication by $\mathcal{O}$.
Definition 2.2.8 ([HY, §3.1 p.26]).
A special endomorphism (resp. special quasi-endomorphism) of $A_S$ is an element $s \in \mathrm{End}(A_S)$ (resp. $s \in \mathrm{End}(A_S) \otimes \mathbb{Q}$) such that $s^\dagger = s$ and $s \circ f = f^\sigma \circ s$ for all $f \in \mathcal{O}$.
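We still use $D$ to denote the pullback to $\mathcal{M}_{L_H,\mathbb{F}_p}$ of the Dieudonné crystal over $\mathcal{M}_{L_S,\mathbb{F}_p}$ from Definition 2.2.3; since the abelian surfaces over $\mathcal{M}_{L_H}$ admit an $\mathcal{O}$-action, the Dieudonné crystal $D$ is also endowed with an $\mathcal{O}$-action.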
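Definition 2.2.9.
Let $L_{\mathrm{cris}} \subset \mathrm{End}(D)$ denote the sub-crystal of elements $s$ fixed by the Rosati involution and satisfying $s \circ f = f^\sigma \circ s$ for all $f \in \mathcal{O}$. For the p-divisible group $A_S[p^\infty]$, we say $s \in \mathrm{End}(A_S[p^\infty])$ is a special endomorphism if the image of $s$ in $\mathrm{End}(D_S)$ lies in $L_{\mathrm{cris},S}$.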
Remark 2.2.10.
By Remark 2.2.5 and [AGHMP17, Prop. 2.5.1, Prop. 2.6.4], in order to show that the above definitions of special endomorphisms and $L_{\mathrm{cris}}$ can be identified with those of Madapusi Pera, we only need to show that an endomorphism $s$ (of either the abelian surface or its Dieudonné crystal $D$) fixed by the Rosati involution is traceless and orthogonal to $x$ if and only if $s \circ x = -x \circ s$. To see this, note that if $\operatorname{Tr} s = 0$, then $s \perp x$ if and only if $Q(s+x) - Q(s) - Q(x) = s \circ x + x \circ s = 0$; on the other hand, if $s \circ x = -x \circ s$, then $x^{-1} \circ s \circ x = -s$ and hence $\operatorname{Tr} s = 0$.
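2.2.11.
In general (i.e. when $x^\perp \subsetneq L_H$), we may still use the same definition for $L_{\mathrm{cris}}$ and special endomorphisms of p-divisible groups, as $x^\perp$ is self-dual at $p$ and hence $x^\perp \otimes \mathbb{Z}_p = L_H \otimes \mathbb{Z}_p$. On the other hand, we consider special quasi-endomorphisms $s \in \mathrm{End}(A_S) \otimes \mathbb{Q}$ which satisfy the following integrality condition: the $\ell$-adic realizations of $s$ lie in $L_H \otimes \mathbb{Z}_\ell \subset \mathrm{End}(T_\ell(A_S) \otimes \mathbb{Q}_\ell)$ for all $\ell \neq p$, and the crystalline realizations of $s$ lie in $L_{\mathrm{cris},S}$. As in 2.2.6, the special divisor $\mathcal{Z}(m)$ is the Deligne–Mumford stack over $\mathcal{M}_{L_H}$ with $\mathcal{Z}(m)(S)$ given by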
[TABLE]
for any M-scheme S. By the proof of [AGHMP, Prop. 4.5.8], where they used [MP16, Prop. 5.21], Z(m) is flat over Zp. We use Z(m) to denote the image of Z(m)Fp in MLH,Fp, which is a divisor in MLH,Fp.
2.3. Lattices of special endomorphisms of supersingular points
For a fixed supersingular point, let A denote the abelian surface attached to this point.
Definition 2.3.1.
Let $L''$ denote the $\mathbb{Z}$-lattice of special endomorphisms of $A$ (resp. special quasi-endomorphisms when $L = L_H$). Let $L'' \subset L' \subset L'' \otimes \mathbb{Q}$ be a $\mathbb{Z}$-lattice which is maximal at all $\ell \neq p$ and satisfies $L'' \otimes \mathbb{Z}_p = L' \otimes \mathbb{Z}_p$. Let $Q'$ denote the natural quadratic form on $L'$ given by $s \circ s = [Q'(s)] \in \mathrm{End}(A) \otimes \mathbb{Q}$. By the positivity of the Rosati involution, $Q'$ is positive definite (see for instance [MP16, Lem. 5.12]).
Even though there seem to be choices involved here, we will see that for our computation, these choices do not matter and the result will only depend on the Ekedahl–Oort stratum in which the supersingular point lies. The information on $L' \otimes \mathbb{Z}_p$ will be provided in §3.
Lemma 2.3.2.
$(L' \otimes \mathbb{Z}_\ell, Q') \cong (L \otimes \mathbb{Z}_\ell, Q)$ for $\ell \neq p$.
Proof.
Both lattices are maximal at $\ell$, and by [HP, Rem. 7.2.5], $(L' \otimes \mathbb{Q}_\ell, Q') \cong (L \otimes \mathbb{Q}_\ell, Q)$. We then conclude by the fact that there is a unique isometry class of $\mathbb{Z}_\ell$-maximal sublattices of a given $\mathbb{Q}_\ell$-quadratic space (see for instance [HP, Thm. A.1.2]).
∎
Remark 2.3.3.
Actually, for the case of Hilbert modular surfaces, the essential part of the above lemma is [HY, Prop. 3.1.3]. For the $\mathcal{A}_2$ case, we can explicitly compute $L''$ as follows, and it is maximal. By [Eke, Prop. 5.2], for any $\ell \neq p$, there is a unique class (up to $\mathrm{GL}_4(\mathbb{Z}_\ell)$-conjugation) of principal polarizations on the Tate module $T_\ell(A)$. Therefore, to compute $L'' \otimes \mathbb{Z}_\ell$, we may assume that $A = E^2$, endowed with the product principal polarization, where $E$ is a supersingular elliptic curve. Hence the quadratic form on the lattice $L''$, which is the trace $0$ part of $H^2(A)$, is given by $x_0^2 + \mathrm{Nm}$, where $\mathrm{Nm}$ is the quadratic form given by the reduced norm on the quaternion algebra $\mathrm{End}(E)$.
3. The F-crystals Lcris on local deformation spaces of supersingular points
In this section, we compute the lattices ($L'' \otimes \mathbb{Z}_p$ in 2.3.1) of special endomorphisms of supersingular points with the natural quadratic forms following Howard–Pappas [HP, §§5-6]. (Footnote 6: One may also carry out the computation following Ogus [Ogus79, §3].) In conjunction with [Kisin, §1], we then obtain $L_{\mathrm{cris}}$ (see 2.2.3 and 2.2.9) on the formal neighborhoods of supersingular points in the Shimura variety $\mathcal{M}$.
As a direct consequence, we obtain the local equation of the non-ordinary locus in §3.4. These are the key inputs to §§5-6; in particular, we use the explicit descriptions of this section to prove our decay results.
3.1. A brief review of the work of Howard–Pappas and Kisin
Since both [HP] and [Kisin] apply to GSpin Shimura varieties of any dimension, we first recall their results in the general setting.
Let (V,Q) denote a quadratic Q-vector space of signature (n,2) and let L⊂V be a maximal even lattice which is self-dual at p. Let M denote the smooth canonical integral model over Zp of the GSpin Shimura variety attached to (L,Q) in [Kisin].
Set $k = \mathbb{F}_p$, $W = W(k)$, $K = W[1/p]$.
In this section, we consider a fixed supersingular point $P \in \mathcal{M}(k)$. In the case of abelian surfaces considered in §2 (with $L = L_S$ or $L_H$), $P$ supersingular means the corresponding abelian surface over $P$ is supersingular. This in turn is equivalent to the action of the crystalline Frobenius $\varphi$ on $L_{\mathrm{cris},P}(W)$ being pure, with slope $0$. In the general setting, let $D$ denote the Dieudonné crystal of the universal Kuga–Satake abelian variety over $\mathcal{M}_{\mathbb{F}_p}$ and let $L_{\mathrm{cris}} \subset \mathrm{End}(D)$ denote the sub-crystal corresponding to $L \subset C(L)$ defined in [MP16, §4.14]. (Footnote 7: Note that in the cases $L = L_H, L_S$ in §2, we still take $D$ to be the Dieudonné crystal of the universal abelian surfaces, not that of the Kuga–Satake abelian varieties.) Let $\varphi$ denote the crystalline Frobenius on $D_P(W)$ and $L_{\mathrm{cris},P}(W)$. Then we say $P$ is supersingular if $\varphi$ acts on $L_{\mathrm{cris},P}(W)$ with pure slope $0$ (see for instance [HP, Lem. 4.2.4, §7.2.1]).
By Dieudonné theory, we have $L'' \otimes \mathbb{Z}_p = L_{\mathrm{cris},P}(W)^{\varphi = 1}$. In order to compute $L'' \otimes \mathbb{Z}_p$ and the $\varphi$-action on $L_{\mathrm{cris},P}(W)$, we introduce another free $W$-module $L_P^{\#}(W)$ following [HP, §6.2.1]. (Footnote 8: Note that in [HP], they use $y$ to denote a point in $\mathcal{M}(k)$, and $L_P^{\#}(W)$ is denoted by $L_y$ while $L_{\mathrm{cris},P}(W)$ is denoted by $L_y^{\#}$.)
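Definition 3.1.1.
The filtration on $D_P(W)$ is given by $\mathrm{Fil}^1 D_P(W) := \varphi^{-1}(p D_P(W))$. We define $L_P^{\#}(W) := \{ v \in L_{\mathrm{cris},P}(W) \otimes_W K \mid v\,\mathrm{Fil}^1 D_P(W) \subset \mathrm{Fil}^1 D_P(W) \}$.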
3.1.2.
By [HP, Thm. 7.2.4], studying supersingular points and their formal neighborhoods in $\mathcal{M}$ reduces to studying points and their formal neighborhoods in the associated Rapoport–Zink spaces; hence we use the results in [HP, §§5-6].
By [HP, Prop. 6.2.2], φ(LP#(W))=Lcris,P(W). In particular,
[TABLE]
Recall that in 2.3.1, we endow V′:=L′′⊗Qp with a quadratic form Q′; let [−,−]′ denote the bilinear form on V′ given by [x,y]′=Q′(x+y)−Q′(x)−Q′(y). Hence
[TABLE]
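Since $P$ is supersingular, we have $n = \operatorname{rk}_W L_{\mathrm{cris},P}(W) = \operatorname{rk}_{\mathbb{Z}_p} L'' = \dim_{\mathbb{Q}_p} V'$.
Let $\Lambda_P \subset V'$ denote the dual of $L'' \otimes \mathbb{Z}_p$ with respect to $[-,-]'$. Then by [HP, Prop. 6.2.2], $\Lambda_P$ is a vertex lattice, i.e., $\Lambda_P$ is a $\mathbb{Z}_p$-lattice in $V'$ such that $p\Lambda_P \subset \Lambda_P^\vee \subset \Lambda_P$. The type $t_P$ of $\Lambda_P$ is defined to be $\dim_{\mathbb{F}_p}(\Lambda_P / \Lambda_P^\vee)$.
By [HP, Prop. 5.1.2, (1.2.3.1)], there is $t_{\max} \in 2\mathbb{Z}$, which only depends on $n$ and $\det(V') = \det(V_{\mathbb{Q}_p})$, such that $t_P \in 2\mathbb{Z}$ and $2 \le t_P \le t_{\max}$. (Footnote 9: The determinant $\det(V')$ is the determinant of the Gram matrix $([x_i,x_j]')_{i,j=1,\dots,n+2}$, where $\{x_i\}_{i=1}^{n+2}$ is a $\mathbb{Q}_p$-basis of $V'$; we view $\det(V')$ as an element in $\mathbb{Q}_p^\times/(\mathbb{Q}_p^\times)^2$.) Moreover, there exists a vertex lattice $\Lambda \subset V'$ of type $t_{\max}$ such that $\Lambda_P \subset \Lambda$. Indeed, the proof of [HP, Prop. 5.1.2] constructs all possible isometry classes of $\Lambda$ (with the quadratic form) for all $(V,Q)$ (note that in loc. cit., they proved that for given $(V,Q)$, the isometry class of $\Lambda$ is unique).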
Therefore, given (V,Q), we first obtain the isometry class of Λ of type tmax and then all isometry classes of the lattices of special endomorphisms L′′⊗Zp attached to all supersingular points are given by the duals of the vertex lattices contained in Λ.
From Λ, we may compute all possible isomorphism classes of Lcris,P(W) and LP#(W) as rank n free W-modules endowed with a quadratic form/bilinear form and a σ-linear Frobenius φ (here we use σ to denote the Frobenius action on W) following [HP, Prop. 6.2.2, §5.3.1]. Indeed, LP#(W)⊂Λ⊗ZpW=:ΛW is the preimage of a Lagrangian LP#⊂ΛW/ΛW∨ with respect to the quadratic form pQ′modp such that
[TABLE]
where we use $\varphi$ to denote the $\sigma$-linear map on $\Lambda_W$ given by $\mathrm{Id}\otimes\sigma$, and $\varphi(\bar v):=\overline{\varphi(v)}$ is well-defined for $\bar v\in\Lambda_W/\Lambda_W^\vee$ with a lift $v\in\Lambda_W$. The quadratic form and $\varphi$-action on $L^{\#}_P(W)$ are the restrictions of the quadratic form and $\varphi$-action on $\Lambda_W$. We then obtain $L_{\mathrm{cris},P}(W)=\varphi(L^{\#}_P(W))$. Note that by [HP, Prop. 5.1.2], the even dimensional $\mathbb{F}_p$-quadratic space $(\Lambda/\Lambda^\vee,\,pQ'\bmod p)$ does not have a Lagrangian defined over $\mathbb{F}_p$ and hence is nonsplit; see [HP14, §§3.2-3.3] for a discussion on how to find all such $L^{\#}_P$.
Definition 3.1.3.
For a supersingular point $P$, we say $P$ is superspecial if $t_P=2$, and we say $P$ is supergeneric if $t_P=t_{\max}\neq2$. (Footnote 10: In the setting of §2, $P$ is superspecial if and only if the corresponding abelian surface is isomorphic to the product of two supersingular elliptic curves, which is the usual definition of a superspecial abelian surface.)
By [HP, Prop. 5.2.2], P is superspecial if and only if
[TABLE]
By [HP, (1.2.3.1)], in the setting of §2, we have tmax≤4 and hence the supersingular points in question are either superspecial or supergeneric.
Remark 3.1.4.
By [MP16, Prop. 4.7 (iii) (iv)], GSpin(L,Q)W acts on DP(W) and Lcris,P(W); moreover, as W-quadratic spaces, Lcris,P(W)≅L⊗W (we use QW to denote the quadratic form on L′′⊗Zp) and for x∈Lcris,P(W),x∘x=QW(x)⋅Id∈End(DP(W)). Therefore Q′ on L′′⊗Zp is the restriction of Q on Lcris,P(W) to L′′⊗Zp. We introduce the notation Q′ to emphasize that Q′ and Q (as Zp-quadratic forms) are restrictions of QW to Zp-lattices in different Qp-subspaces. Hence GSpin(L,Q)W=GSpin(Lcris,P(W),Q′).
3.1.5.
We now describe the F-crystal $L_{\mathrm{cris}}$ over the formal completion $\widehat{M}_P$ of $M$ along the supersingular point $P$ following [Kisin, §§1.4-1.5] and [Moonen98, §4.5]; see also [HP, §§3.1.4, 3.1.6].
The Hodge filtration $\mathrm{Fil}^1D_P(W)\bmod p\subset D_P(k)$ corresponds to a cocharacter $\bar\mu:\mathbb{G}_{m,k}\to\mathrm{GSpin}(L,Q)_k$, and we pick a cocharacter $\mu:\mathbb{G}_{m,W}\to\mathrm{GSpin}(L,Q)_W$ which lifts $\bar\mu$. Let $U_P\subset\mathrm{GSpin}(L,Q)_W$ denote the opposite unipotent of the parabolic subgroup defined by $\mu$, and let $\widehat{U}_P$ denote the formal completion of $U_P$ along the identity. Pick coordinates and write $\widehat{U}_P=\mathrm{Spf}\,W[[x_1,\dots,x_d]]$ such that $x_1=\cdots=x_d=0$ defines the identity element in $\widehat{U}_P$. Let $\sigma$ denote the Frobenius action on $W[[x_1,\dots,x_d]]$ which lifts the $\sigma$-action on $W$ and for which $\sigma(x_i)=x_i^p$.
Let $R$ denote $\widehat{\mathcal{O}}_{M,P}$, the complete local ring of $M$ at $P$. Then there exists an isomorphism from $\mathrm{Spf}\,R$ to $\widehat{U}_P$ (and we still use $\sigma$ to denote the Frobenius action on $R$ via the identification with $W[[x_1,\dots,x_d]]$) such that
(1) $D(R)=D_P(W)\otimes_WR$ and $L_{\mathrm{cris}}(R)=L_{\mathrm{cris},P}(W)\otimes_WR$ as $R$-modules;
(2) under the above identifications, the $\sigma$-linear Frobenius action, denoted by $\mathrm{Frob}$, on $D(R)$ and $L_{\mathrm{cris}}(R)$ is given by $u\cdot(\varphi\otimes\sigma)$, where $u$ denotes the universal $W[[x_1,\dots,x_d]]$-point in $\widehat{U}_P$ and $\varphi$ is the crystalline Frobenius on $D_P(W)$ or $L_{\mathrm{cris},P}(W)$ given in §3.1.2.
On $L_{\mathrm{cris}}$, the $\mathrm{GSpin}(L,Q)_W$-action factors through the quotient $\mathrm{SO}(L,Q)_W$. Since from now on we only care about $\mathrm{Frob}$ on $L_{\mathrm{cris}}$, by Remark 3.1.4 we will work with $\mu:\mathbb{G}_{m,W}\to\mathrm{SO}(L_{\mathrm{cris},P}(W),Q')$ and with $U_P$ the opposite unipotent of $\mu$ in $\mathrm{SO}(L_{\mathrm{cris},P}(W),Q')$.
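In the rest of this section, we will apply §§3.1.2, 3.1.5 to the setting in §2 and we will work with the coordinates on the formal completion $\widehat{U}_P$ of $U_P$. When $L=L_H$, we write $\widehat{U}_P=\mathrm{Spf}\,W[[x,y]]$ and when $L=L_S$, we write $\widehat{U}_P=\mathrm{Spf}\,W[[x,y,z]]$. We will use $\epsilon\in\mathbb{Z}_p^\times$ to denote an element which is not a perfect square in $\mathbb{Z}_p$. Let $\mathbb{Z}_{p^2}$ (resp. $\mathbb{Q}_{p^2}$) denote $W(\mathbb{F}_{p^2})$ (resp. $W(\mathbb{F}_{p^2})[1/p]$) and let $\lambda\in\mathbb{Z}_{p^2}^\times$ be an element such that $\sigma(\lambda)=-\lambda$ (for instance, we can take $\lambda$ to be a root in $\mathbb{Z}_{p^2}$ of $x^2-\epsilon=0$). We will use $\{v_i\}_{i=1}^{n+2}$ to denote a $W$-basis of $L_{\mathrm{cris},P}(W)$ and $\{w_i\}_{i=1}^{n+2}$ to denote a $\mathbb{Z}_p$-basis of $\Lambda_P^\vee=L_{\mathrm{cris},P}(W)^{\varphi=1}$; note that $\mathrm{Span}_W\{w_i\}$ is a $W$-sublattice of $L_{\mathrm{cris},P}(W)$.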
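The choice of $\epsilon$ and $\lambda$ can be sanity-checked at the level of residue fields. The following Python sketch is only an illustration and not part of the argument: it models $\mathbb{F}_{p^2}$ (the residue field of $\mathbb{Z}_{p^2}$) as $\mathbb{F}_p[\lambda]/(\lambda^2-\epsilon)$, picks the smallest non-square $\epsilon$ via Euler's criterion, and confirms that the $p$-power Frobenius sends $\lambda$ to $-\lambda$. All function names are hypothetical helpers introduced here.

```python
def is_nonsquare(e, p):
    """Euler's criterion: e is a non-square mod the odd prime p iff e^((p-1)/2) = -1 mod p."""
    return pow(e, (p - 1) // 2, p) == p - 1


def smallest_nonsquare(p):
    """Smallest eps in {2, ..., p-1} that is not a square mod the odd prime p."""
    return next(e for e in range(2, p) if is_nonsquare(e, p))


def fp2_mul(u, v, eps, p):
    """Multiply u = u0 + u1*lam and v = v0 + v1*lam in F_p[lam]/(lam^2 - eps)."""
    return ((u[0] * v[0] + eps * u[1] * v[1]) % p,
            (u[0] * v[1] + u[1] * v[0]) % p)


def fp2_pow(u, n, eps, p):
    """Square-and-multiply exponentiation in F_p[lam]/(lam^2 - eps)."""
    result = (1, 0)
    while n:
        if n & 1:
            result = fp2_mul(result, u, eps, p)
        u = fp2_mul(u, u, eps, p)
        n >>= 1
    return result


def frobenius_of_lambda(p):
    """Return (eps, lam^p), with lam^p written as a pair (a, b) meaning a + b*lam."""
    eps = smallest_nonsquare(p)
    return eps, fp2_pow((0, 1), p, eps, p)
```

For instance, `frobenius_of_lambda(7)` returns `(3, (0, 6))`, i.e. $\epsilon=3$ and $\lambda^7=6\lambda=-\lambda$ in $\mathbb{F}_{49}$, which is the mod $p$ shadow of $\sigma(\lambda)=-\lambda$.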
Assume that $p$ is inert in $\mathbb{Q}(\sqrt{m})$; then we have $t_{\max}=4$. (Footnote 11: If $m\in\mathbb{Z}$ is a perfect square, then by convention we view every prime $p$ as split in $\mathbb{Q}[x]/(x^2-m)$, and the discussion of the split case still holds.)
The vertex lattice with type tmax is Λ=SpanZp{e1,f1}⊕Z, where
[TABLE]
Hence Λ∨=pΛ. Set e2=(1⊗1+(1/λ)⊗λ)/2,f2=(1⊗1+(−1/λ)⊗λ)/2∈Zp2⊗ZpZ. Then, as elements in ΛW,
[TABLE]
All possible $L^{\#}_P$ are given by two families of Lagrangians in the $k$-quadratic space spanned by $\bar e_1,\bar e_2,\bar f_1,\bar f_2\in\Lambda_W/\Lambda_W^\vee$ with quadratic form $pQ$ satisfying (3.1.1):
[TABLE]
where $\bar c\in k$. (Footnote 12: Indeed, as $\dim_k\Lambda_W/\Lambda_W^\vee$ is small, in this case all Lagrangians satisfy (3.1.1). There are two families and each is parametrized by $\mathbb{P}^1(k)$, so more accurately we should view $\bar c\in\mathbb{P}^1(k)$, i.e., there are two more Lagrangians $\mathrm{Span}_k\{\bar f_1,\bar f_2\}$ and $\mathrm{Span}_k\{\bar e_2,\bar f_1\}$; however, since the roles of $e_i$ and $f_i$ are symmetric, the computations for these two cases are equivalent to those for $\mathrm{Span}_k\{\bar e_1,\bar e_2\}$ and $\mathrm{Span}_k\{\bar e_1,\bar f_2\}$, so we may safely omit them and only take $\bar c\in k$. Moreover, we use $\sigma^{-1}(\bar c)$ as the parameter here because eventually we want to work with $L_{\mathrm{cris},P}(W)=\varphi(L^{\#}_P(W))$.) Therefore, we have that
[TABLE]
where $c\in W$. (Footnote 13: Here we notice that $\varphi$ swaps the two families of $L^{\#}_P(W)$; in particular, the general formula for $L_{\mathrm{cris},P}(W)$ is the same as that for $L^{\#}_P(W)$, other than swapping between the two families. This observation holds true in general by [HP14, Rmk. 3.5].)
Moreover, by (3.1.2), P is superspecial if and only if σ−1(c)−σ(c)∈pW, which is equivalent to the Teichmuller lift of cˉ lying in Zp2. Note that if c−c′∈pW, then c,c′ define the same Lcris,P(W). Therefore, without loss of generality, from now on, we will only work with c∈W which is the Teichmuller lifting of cˉ∈k. Hence P is superspecial if and only if there exists c∈Zp2 such that Lcris,P(W) is given by the above form.
In order to compute the F-crystal $L_{\mathrm{cris}}$, we pick the following $W$-basis $\{v_1,\dots,v_4\}$ of $L_{\mathrm{cris},P}(W)$ such that the Gram matrix of $[-,-]'$ with respect to this basis is $\begin{bmatrix}0&I\\ I&0\end{bmatrix}$, where $I$ denotes the $2\times2$ identity matrix.
For the first family, take
[TABLE]
for the second family, take
[TABLE]
Then on Lcris,P(W), with respect to {v1,…,v4}, we have
[TABLE]
The filtration on Lcris,P(k) is given by
[TABLE]
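so we may choose $\mu:\mathbb{G}_{m,W}\to\mathrm{SO}(L_{\mathrm{cris},P}(W),Q')$ to be $t\mapsto\mathrm{diag}(t^{-1},1,t,1)$. Then $\widehat{U}_P=\mathrm{Spf}\,W[[x,y]]$ (the formal completion of $U_P$ along the identity) with the universal point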
[TABLE]
where a=σ(c)−σ−1(c); we have a=0 if P is superspecial and a∈W× if P is supergeneric.
When P is superspecial, {w1=pv1+v3,w2=λ(pv1−v3),w3=v2,w4=v4} is a Zp-basis of L′′⊗Zp. Using {w1,…,w4} as a K-basis of Lcris,P(W)[1/p], we have
[TABLE]
When $P$ is supergeneric, $\{w_1=v_4,\ w_2=pv_1+v_3+(c+\sigma^{-1}(c))v_4,\ w_3=\lambda(pv_1-v_3+(c-\sigma^{-1}(c))v_4),\ w_4=pv_2-cv_3-p\sigma^{-1}(c)v_1-c\sigma^{-1}(c)v_4\}$ is a $\mathbb{Z}_p$-basis of $L''\otimes\mathbb{Z}_p$ and with respect to this basis,
Frob=(I+pyA+xB)∘σ, where
[TABLE]
3.2.2.
Assume that $p$ is split in $\mathbb{Q}(\sqrt{m})$; then we have $t_{\max}=2$ and hence every $P$ is superspecial.
The vertex lattice with type tmax is Λ={(x1,x2,x3,x4)∈Zp4} with
[TABLE]
we have $\Lambda^\vee=\mathrm{Span}_{\mathbb{Z}_p}\{e_1,e_2,pe_3,pe_4\}$, where $e_i$ is the vector with $x_i=1$ and $x_j=0$ for $j\neq i$. Recall that we take $\epsilon=\lambda^2$. (Footnote 14: There are exactly two Lagrangians, and the other one is given by replacing $\lambda$ by $-\lambda$. Since $\lambda$ and $-\lambda$ play the same role in our later computation, there is no loss of generality here.) We then have that
[TABLE]
The Gram matrix is $\begin{bmatrix}0&I\\ I&0\end{bmatrix}$ and on $L_{\mathrm{cris},P}(W)$, the Frobenius is $\varphi=b\sigma$ with
[TABLE]
The filtration on Lcris,P(k) given by φ is the same as in §3.2.1 and hence we may use the same μ and u there. Therefore, on Lcris(W[[x,y]]), we have
[TABLE]
Moreover, {w1=pv1−v3,w2=λ(pv1+v3),w3=v2+v4,w4=λ(v4−v2)} is a Zp-basis of L′′⊗Zp and with respect to this basis,
[TABLE]
3.3. The Siegel case L=LS
We now compute $L_{\mathrm{cris}}$ for Theorem 1(1) and Theorem 3. In this case, we have $t_{\max}=4$.
The vertex lattice with type tmax is Λ=SpanZp{e1,f1}⊕ZS, where ZS={(x1,x2,x3)∈Zp3}
[TABLE]
for some c∈Zp×. Since detΛ=detL∈Qp×/(Qp×)2 and detL=2, we have c=−1. Let g=(1,0,0)∈ZS and Z=SpanZp{(0,1,0),(0,0,1)}⊂ZS. Then Λ/Λ∨=SpanFp{e1,f1}⊕Z/Z∨.
Note that $\mathrm{Span}_{\mathbb{Z}_p}\{e_1,f_1\}\oplus Z$ is exactly the quadratic $\mathbb{Z}_p$-lattice denoted by $\Lambda$ in §3.2.1; hence the same computation there applies to find $L_{\mathrm{cris},P}(W)\subset\Lambda\otimes W$. More precisely, there exist $v_1,\dots,v_4\in\mathrm{Span}_W\{e_1,f_1\}\oplus(Z\otimes W)$ and $c\in W$ which is the Teichmüller lift of some $\bar c\in k$ such that
(1) $L_{\mathrm{cris},P}(W)=\mathrm{Span}_W\{v_1,\dots,v_4,v_5\}$, where $v_5=g$;
(2) the Gram matrix of $[-,-]'$ with respect to $\{v_1,\dots,v_5\}$ is $\begin{bmatrix}0&I&0\\ I&0&0\\ 0&0&2\epsilon\end{bmatrix}$, where $I$ is the $2\times2$ identity matrix;
(3) the Frobenius $\varphi$ on $L_{\mathrm{cris},P}(W)$ with respect to the basis $\{v_i\}$ is
[TABLE]
(4) $P$ is superspecial if and only if $\sigma^2(c)=c$.
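We may choose $\mu:\mathbb{G}_{m,W}\to\mathrm{SO}(L_{\mathrm{cris},P}(W),Q')$ to be $t\mapsto\mathrm{diag}(t^{-1},1,t,1,1)$. Then $\widehat{U}_P=\mathrm{Spf}\,W[[x,y,z]]$ (the formal completion of $U_P$ along the identity) with the universal point
[TABLE]
acting on $L_{\mathrm{cris}}(W[[x,y,z]])$, where $a=\sigma(c)-\sigma^{-1}(c)$; note that $a=0$ if and only if $P$ is superspecial.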
For the proofs of Theorem 1(1) and Theorem 3, we only need to study superspecial points so we only give the matrix of Frob with respect to a basis of Lcris⊗WK consisting of elements in L′′⊗Zp when P is superspecial; we refer the reader to the appendix for the discussion when P is supergeneric.
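We now assume that $P$ is superspecial. Let $w_1=\lambda(pv_1-v_3)$, $w_2=pv_1+v_3$, $w_3=v_2$, $w_4=v_4$, $w_5=v_5$. Then $L''\otimes\mathbb{Z}_p=\mathrm{Span}_{\mathbb{Z}_p}\{w_1,\dots,w_5\}$. Viewing $\{w_i\}_{i=1}^5$ as a $K$-basis of $L_{\mathrm{cris},P}(W)\otimes K$, the Frobenius on $L_{\mathrm{cris}}(W[[x,y,z]])$ is given by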
[TABLE]
3.4. Equation of non-ordinary locus
We now use the computations in §§3.2, 3.3 to obtain the local equation of the non-ordinary locus in a formal neighborhood of a supersingular point $P$ using results in [Ogus01]. Although [Ogus01] only treats the case of K3 surfaces, the results that we recall here apply to any GSpin Shimura variety. We follow the notation in §3.1. For a perfect field $k'$ of characteristic $p$ and $P'\in M(k')$, we say $P'$ is ordinary if the slopes of the crystalline Frobenius $\varphi$ on $L_{\mathrm{cris},P'}(W)$ are $-1$ and $1$, each with multiplicity $1$, and $0$ with multiplicity $n$. (Footnote 15: When $L=L_H,L_S$, the point $P'$ is ordinary if and only if the corresponding abelian surface over $k'$ is ordinary by the definition of $L_{\mathrm{cris}}$.)
The cocharacter $\mu$ defines a filtration $\mathrm{Fil}^i$, $i=-1,0,1$, on $L_{\mathrm{cris},P}(k)$, which is the Hodge filtration in [Ogus01]; in particular, $\dim\mathrm{Fil}^1L_{\mathrm{cris},P}(k)=1$, $\dim\mathrm{Fil}^0L_{\mathrm{cris},P}(k)=n+1$, $\dim\mathrm{Fil}^{-1}L_{\mathrm{cris},P}(k)=n+2$, and the annihilator of $\mathrm{Fil}^1L_{\mathrm{cris},P}(k)$ in $L_{\mathrm{cris},P}(k)$ with respect to $Q$ is $\mathrm{Fil}^0L_{\mathrm{cris},P}(k)$ (see also [Ogus, p.411] for the definition). Note that here we work directly with the crystalline cohomology without using the canonical isomorphism to the de Rham cohomology. Note also that our filtration is shifted by 1 compared to the filtration in [Ogus01] because his Frobenius is $p$ times our Frobenius. The Hodge filtration over the mod $p$ complete local ring $R\otimes_Wk$ at $P$ is given by $\mathrm{Fil}^iL_{\mathrm{cris}}(R\otimes_Wk):=\mathrm{Fil}^iL_{\mathrm{cris},P}(k)\otimes_k(R\otimes k)$. Note that $\mathrm{Frob}(\mathrm{Fil}^0L_{\mathrm{cris}}(R\otimes_Wk))\subset\mathrm{Fil}^0L_{\mathrm{cris}}(R\otimes_Wk)$, so we have a well-defined map $p\mathrm{Frob}:\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk)\to\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk)$, where $\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk):=\mathrm{Fil}^{-1}L_{\mathrm{cris}}(R\otimes_Wk)/\mathrm{Fil}^0L_{\mathrm{cris}}(R\otimes_Wk)$.
Lemma 3.4.1 (Ogus).
For a supersingular point $P$, the non-ordinary locus (over $k$) in the formal neighborhood of $P$ is given by the equation
$$p\mathrm{Frob}\big|_{\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk)}=0.$$
Proof.
By [Ogus01, Prop. 11], the discussion of the conjugate filtration on [Ogus01, p.333-334], and the fact that the annihilator of Fil1Lcris(R⊗k) in Lcris(R⊗k) with respect to Q is Fil0Lcris(R⊗k), we have that the equation defining the non-ordinary locus is the projection of the conjugate filtration (denoted by Fcon2 in loc. cit.) to gr−1Lcris(R⊗k). By definition, Fcon2=pFrobLcris(R⊗k) and then the lemma follows.
∎
Corollary 3.4.2.
When $L=L_H$, the local equation of the non-ordinary locus in a formal neighborhood of a supersingular point $P$ is $xy=0$ if $P$ is superspecial and is $y=0$ if $P$ is supergeneric; when $L=L_S$, the local equation is $xy+z^2/(4\epsilon)=0$ if $P$ is superspecial and $(x+a)y+z^2/(4\epsilon)=0$ if $P$ is supergeneric, where $a\in W(k)^\times$ depends on $P$.
Proof.
We will prove this corollary in the case $L=L_S$, since the other case is handled the same way. Recall that we have the basis $v_1,\dots,v_5$ of $L_{\mathrm{cris}}$, with $\mathrm{Fil}^{-1}=L_{\mathrm{cris}}$ and $\mathrm{Fil}^0$ spanned by $v_2,v_3,v_4$ and $v_5$. Therefore, using the explicit formulas from the previous section, we see that the map $p\mathrm{Frob}:\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk)\to\mathrm{gr}_{-1}L_{\mathrm{cris}}(R\otimes_Wk)$ is given by $p\mathrm{Frob}(v_1)=-\left(xy+\frac{z^2}{4\epsilon}+ay\right)v_1$. Our result now follows from Ogus's description of the non-ordinary locus.
∎
4. Arithmetic Borcherds Theory, Siegel mass formula, and Eisenstein series
We use arithmetic Borcherds theory [HMP] to control the global intersection number of a curve C with special divisors. More precisely, we use the work of Bruinier and Kuss in [BK03] to study the Fourier coefficients of the Eisenstein part of the (vector-valued) modular form arising from Borcherds theory. In order to compare the global intersection number with the local contribution later in the paper, we also apply the computations in [BK03] and the Siegel mass formula to the Eisenstein part of the theta series attached to a supersingular point and reduce the question to a computation of local densities and determinants of the lattices L and L′ introduced in §2.1 and 2.3.1 (in §4.2, we will summarize the properties of L′). We use Hanke’s method in [Han04] to compute the local densities. Throughout this section, p is an odd prime such that L is self-dual at p. For a prime ℓ, we use vℓ:Zℓ\{0}→Z≥0 to denote the ℓ-adic valuation.
4.1. Arithmetic Borcherds theory and the explicit formula for the Eisenstein series
Recall the special divisors Z(m) from 2.2.6 and §2.2.11. The following modularity result is the key input to the estimate of the intersection number Z(m).C.
In order to state the result using vector-valued modular forms, for $\mu\in L^\vee/L$, $m\in\mathbb{Q}_{>0}$, let $\mathcal{Z}(m,\mu)$ denote the special divisors over $\mathbb{Z}$ in $M$ defined in [AGHMP, §4.5, Def. 4.5.6]. By definition, $\mathcal{Z}(m,0)$ is the divisor $\mathcal{Z}(m)$ defined in §2.2; roughly speaking, $\mathcal{Z}(m,\mu)$ parametrizes abelian surfaces $A$ with a special quasi-endomorphism $s$ such that $Q(s)=m$ and the $\ell$-adic and crystalline realizations of $s$ lie in the image of $(\mu+L)\otimes\mathbb{Z}_\ell$ and $(\mu+L)\otimes\mathbb{Z}_p$ in $\mathrm{End}(T_\ell(A)\otimes\mathbb{Q}_\ell)$ and $\mathrm{End}(D\otimes_WW[1/p])$ respectively, where $D$ is the Dieudonné module of $A$. By the proof of [AGHMP, Prop. 4.5.8] and [MP16, Prop. 5.21], the assumption that $L$ is self-dual at $p$ implies that $\mathcal{Z}(m,\mu)$ is flat over $\mathbb{Z}_p$. Let $Z(m,\mu)$ denote the special fiber $\mathcal{Z}(m,\mu)_{\mathbb{F}_p}$. Let $(e_\mu)_{\mu\in L^\vee/L}$ denote the standard basis of $\mathbb{C}[L^\vee/L]$. Let $\omega\in\mathrm{Pic}(M_{\mathbb{F}_p})_{\mathbb{Q}}$ denote the Hodge line bundle in the $\mathbb{Q}$-Picard group of $M_{\mathbb{F}_p}$; in other words, $\omega$ is the line bundle of weight 1 modular forms (see for instance [AGHMP, Thm. 4.4.6] for a definition of $\omega$).
Theorem 4.1.1.
Assume $(L,Q)$ is a maximal quadratic lattice of signature $(n,2)$ such that $L$ is self-dual at $p$.
The generating series
[TABLE]
lies in $M_{1+\frac n2}(\rho_L)\otimes\mathrm{Pic}(M_{\mathbb{F}_p})_{\mathbb{Q}}$. Here, $\rho_L$ denotes the Weil representation on $\mathbb{C}[L^\vee/L]$ and $M_{1+\frac n2}(\rho_L)$ denotes the space of vector-valued modular forms of $\mathrm{Mp}_2(\mathbb{Z})$ with respect to $\rho_L$ of weight $1+\frac n2$. (Footnote 17: In [Bor99], [BK01], [BK03], the authors work with $(L,-Q)$ and the modular form is with respect to the dual of the Weil representation of $(L,-Q)$, which is the Weil representation of $(L,Q)$. Our convention is the same as the one in [HMP] and [Br17].) In particular, for any $\mathbb{Q}$-linear functional $\alpha:\mathrm{Pic}(M_{\mathbb{F}_p})_{\mathbb{Q}}\to\mathbb{C}$, the vector-valued power series
[TABLE]
is the Fourier expansion of an element of $M_{1+\frac n2}(\rho_L)$.
Proof.
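By abuse of notation, we also use $\omega$ to denote the Hodge line bundle over $M$. By [HMP, Thm. B], the generating series $\omega^{-1}e_0+\sum_{m>0,\ \mu\in L^\vee/L}\mathcal{Z}(m,\mu)q^me_\mu$ of the integral special divisors $\mathcal{Z}(m,\mu)$ lies in $M_{1+\frac n2}(\rho_L)\otimes\mathrm{Pic}(M)_{\mathbb{Q}}$. Since the $\mathcal{Z}(m,\mu)$ are flat over $\mathbb{Z}_p$, the desired assertion follows from intersecting with $M_{\mathbb{F}_p}$.
Alternatively, Borcherds [Bor99, Thm. 4.5] proved the modularity assertion for $\mathcal{Z}(m,\mu)_{\mathbb{C}}$. By [Davesh, Lemma 5.12] and the flatness of $\mathcal{Z}(m,\mu)$, the proof of Borcherds implies the desired modularity for the special fibers $Z(m,\mu)$.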
∎
4.1.2.
In the setting of Theorem 1(2) (i.e. the case when $L=L_H$), we work with curves $C$ that are not necessarily proper. We therefore need a version of the above modularity result that holds for the special fiber of a toroidal compactification of $M$. To that end, let $M^{\mathrm{tor}}$ denote a toroidal compactification of $M$, and let $D_1,\dots,D_k$ denote the irreducible components of the boundary $M^{\mathrm{tor}}_{\mathbb{F}_p}\setminus M_{\mathbb{F}_p}$. In [BBK, Theorem 6.2], the authors prove the modularity result for $M^{\mathrm{tor}}$, which directly implies the modularity result for $M^{\mathrm{tor}}_{\mathbb{F}_p}$. The constant term is still given by the Hodge line bundle, still denoted by $\omega$, on $M^{\mathrm{tor}}_{\mathbb{F}_p}$, and the special divisors $Z(m,\mu)$ are replaced by $Z'(m,\mu)+E(m,\mu)$, where $Z'(m,\mu)$ is the Zariski closure of $Z(m,\mu)$ in $M^{\mathrm{tor}}_{\mathbb{F}_p}$ and $E(m,\mu)$ is a "correction term" whose irreducible components are the $D_i$ with appropriate multiplicities. (Footnote 18: Our notation $Z'+E$ is different from the notation used in loc. cit.) Crucially, when $Z(m,\mu)$ is proper (see §4.3.3 for when this happens), the multiplicities of the $D_i$ in the correction term $E(m,\mu)$ are all zero and hence $E(m,\mu)$ is trivial. Therefore, compact special divisors stay as they are in the modularity theorem for $M^{\mathrm{tor}}_{\mathbb{F}_p}$. (Footnote 19: We note that in [BBK], the authors work with Hilbert modular surfaces attached to real quadratic fields with prime discriminant $D$ and state the modularity result using modular forms with level $\Gamma_0(D)$. However, their proof, which uses Borcherds products for the Fourier expansion and the flatness of $Z(m,\mu)$, applies to all Hilbert modular surfaces in the setting of vector-valued modular forms by using the original work of Borcherds [Bor98]. We hence deduce modularity for $M^{\mathrm{tor}}_{\mathbb{F}_p}$. Although the integral special divisors (denoted by $T(n)$ in [BBK]) are defined by taking the Zariski closure in $M^{\mathrm{tor}}$ of the special divisors on the generic fiber $M^{\mathrm{tor}}_{\mathbb{Q}}$, this notion coincides with our definition by the flatness of the integral special divisors in both definitions.)
4.1.3.
Recall that we have a finite morphism π:C→MFˉp.
When C is proper, for Z∈Pic(MFp)Q, we define C.Z as the degree of π∗Z∈Pic(C)Q. For Theorem 1(2), we pick a toroidal compactification Mtor of the Hilbert modular surface M and let C′ denote the smooth compactification of C and the finite morphism π extends to a finite morphism π′:C′→MFˉptor. Then for a proper divisor Z in MFp, we use C.Z to denote degC′(π∗Z); since Z is proper, C′∩Z=C∩Z so we only need to consider points in MFˉp.
4.1.4.
We apply Theorem 4.1.1 and §4.1.2 to $\alpha(Z):=C.Z$ defined in §4.1.3 for $Z\in\mathrm{Pic}(M_{\mathbb{F}_p})_{\mathbb{Q}}$ (and we further assume that $Z$ is proper when $L=L_H$). We decompose the modular form $-(\omega.C)e_0+\sum_{m>0,\ \mu\in L^\vee/L}(Z(m,\mu).C)\,q^me_\mu$ as $E(q)+G(q)$, where $E(q)\in M_{1+\frac n2}(\rho_L)$ is an Eisenstein series and $G(q)\in M_{1+\frac n2}(\rho_L)$ is a cusp form. Note that the constant term of $E(q)$ is $-(\omega.C)e_0$.
We now recall the vector-valued Eisenstein series $E_0(\tau)\in M_{1+\frac n2}(\rho_L)$ which has constant term $e_0$. This Eisenstein series has been studied in [Br02, §1.2.3], [BK01, §4], and [BK03, §3]. Here we follow [Br17, §2.1] as we use the same convention of quadratic forms. We denote an element in $\mathrm{Mp}_2(\mathbb{Z})$ by $(g,\sigma)$, where $g=\begin{bmatrix}a&b\\ c&d\end{bmatrix}\in\mathrm{SL}_2(\mathbb{Z})$ and $\sigma$ is a choice of the square root of $c\tau+d$. Let $\Gamma'_\infty\subset\mathrm{Mp}_2(\mathbb{Z})$ denote the stabilizer of $\infty$. Then for $n\ge3$, the following summation converges on the upper half plane and we define
[TABLE]
When $n=2$, we define $E_0(\tau)$ using analytic continuation following [BK03, §3]. Write $\tau=x+iy$ and define for $s\in\mathbb{C}$,
[TABLE]
which converges on the upper half plane for $s$ with $\Re s>0$ ($n=2$ here). By [BK03, p. 1697], $E_0(\tau,s)$ has a meromorphic continuation in $s$ to all of $\mathbb{C}$ and is holomorphic at $s=0$; we define $E_0(\tau)$ to be the value at $s=0$ of the meromorphic continuation of $E_0(\tau,s)$. Moreover, by loc. cit., $E_0(\tau)$ is holomorphic and hence lies in $M_{1+\frac n2}(\rho_L)$ if $\rho_L$ does not contain the trivial representation as a subquotient. In the proof of Theorem 1(2), we work with $L=L_H$ and this condition for $\rho_L$ is always satisfied as long as the $m$ in the statement of Theorem 1(2) is not a perfect square, i.e., $M$ is not the product of modular curves.
We denote the $q$-expansion of $E_0(\tau)$ by $\sum_{m\ge0,\ m\in\mathbb{Z}+Q(\mu)}q_L(m,\mu)q^me_\mu$ and set $q_L(m):=q_L(m,0)$ for $m\in\mathbb{Z}_{>0}$.
4.1.5.
We fix some notations before we state the explicit formula of qL(m) given by Bruinier–Kuss.
Given a quadratic lattice L (not necessarily the lattice LH,LS), we write det(L) for the determinant of its Gram matrix.
We have ∣L∨/L∣=∣det(L)∣.
For a rational prime $\ell$, we use $\delta(\ell,L,m)$ to denote the local density of $L$ representing $m$ over $\mathbb{Z}_\ell$. More precisely, $\delta(\ell,L,m)=\lim_{a\to\infty}\ell^{a(1-\mathrm{rk}\,L)}\#\{v\in L/\ell^aL\mid Q(v)\equiv m\bmod\ell^a\}$. [BK01, Lem. 5] asserts that the limit is stable once $a\ge1+2v_\ell(2m)$. In particular, if $m$ is representable by $(L\otimes\mathbb{Z}_\ell,Q)$, then $\delta(\ell,L,m)>0$.
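The definition of the local density can be made concrete by brute force for small $\ell$ and $a$. The following Python sketch is an illustration only: the function names are hypothetical, the enumeration is exponential in the rank, and the example form is the quadratic form $Q(x)=x_0^2+x_1x_2-x_3x_4$ of $L_S$ recalled in §4.3.3.

```python
from fractions import Fraction
from itertools import product


def local_density(Q, rank, m, ell, a):
    """ell^{a(1-rank)} * #{v in (Z/ell^a)^rank : Q(v) = m mod ell^a}, by enumeration."""
    modulus = ell ** a
    count = sum(1 for v in product(range(modulus), repeat=rank)
                if (Q(*v) - m) % modulus == 0)
    return Fraction(count, modulus ** (rank - 1))


def Q_S(x0, x1, x2, x3, x4):
    """The quadratic form x0^2 + x1*x2 - x3*x4 of the lattice L_S."""
    return x0 * x0 + x1 * x2 - x3 * x4
```

For $\ell=3$ and $m=1$ the stabilization threshold is $a\ge1+2v_3(2)=1$, and indeed `local_density(Q_S, 5, 1, 3, 1)` and `local_density(Q_S, 5, 1, 3, 2)` both return `Fraction(10, 9)`, i.e. $1+3^{-2}$.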
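Given $0\neq D\in\mathbb{Z}$ such that $D\equiv0,1\bmod4$, we use $\chi_D$ to denote the Dirichlet character $\chi_D(a)=\left(\frac{D}{a}\right)$, where $\left(\frac{\cdot}{\cdot}\right)$ is the Kronecker symbol. For a Dirichlet character $\chi$, we set $\sigma_s(m,\chi)=\sum_{d\mid m}\chi(d)d^s$.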
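The character $\chi_D$ and the twisted divisor sum $\sigma_s(m,\chi_D)$ are easy to evaluate directly. The sketch below is a plain-Python illustration under the definitions just given; `kronecker` and `sigma_twisted` are hypothetical helper names, and exact rational arithmetic is used so that negative integer exponents $s$ are handled without rounding.

```python
from fractions import Fraction


def kronecker(D, a):
    """Kronecker symbol (D/a) for integers D and a."""
    if a == 0:
        return 1 if abs(D) == 1 else 0
    result = 1
    if a < 0:
        a = -a
        if D < 0:
            result = -result
    # Contribution of the prime 2: (D/2) = 0 if D is even, +1 if D = +-1 mod 8, -1 otherwise.
    twos = 0
    while a % 2 == 0:
        a //= 2
        twos += 1
    if twos:
        if D % 2 == 0:
            return 0
        if twos % 2 == 1 and D % 8 in (3, 5):
            result = -result
    # a is now odd and positive: compute the Jacobi symbol (D/a) by quadratic reciprocity.
    D %= a
    while D != 0:
        while D % 2 == 0:
            D //= 2
            if a % 8 in (3, 5):
                result = -result
        D, a = a, D
        if D % 4 == 3 and a % 4 == 3:
            result = -result
        D %= a
    return result if a == 1 else 0


def sigma_twisted(s, m, D):
    """sigma_s(m, chi_D) = sum over positive divisors d of m of chi_D(d) * d^s (s an integer)."""
    return sum(kronecker(D, d) * Fraction(d) ** s
               for d in range(1, m + 1) if m % d == 0)
```

For example, with $D=8$ the divisors of $6$ contribute $\chi_8(1)=1$, $\chi_8(2)=0$, $\chi_8(3)=-1$, $\chi_8(6)=0$, so `sigma_twisted(-1, 6, 8)` equals $1-\tfrac13=\tfrac23$.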
Theorem 4.1.6 (Bruinier–Kuss; see also [Br17, Thms. 2.3, 2.4]).
For $L=L_S$, write $m=m_0f^2$, where $\gcd(f,2\det L)=1$ and $v_\ell(m_0)\in\{0,1\}$ for all $\ell\nmid2\det L$. Then the Fourier coefficient $q_L(m)$ is
[TABLE]
where $\mu$ is the Möbius function and $D=-2m_0\det L$. (Footnote 20: Since $\det L_S=2$, and more generally $2\mid\det L$ for any odd rank quadratic lattice $L$, we have $D\equiv0\bmod4$.)
Proof.
When L=LS, this is [BK01, Thm. 11]. When L=LH, one modifies the proof of [BK01, Thm. 11] as follows. Using [BK03, Prop. 3.1] instead of [BK01, Prop. 2], we obtain [BK01, Prop. 3] since Shintani’s formula works in general. To express the formula in [BK01, Prop. 3] as a product of local terms, we use [Iwa97, §11.5, p. 196]. The rest of the proof, which computes the local terms at ℓ∤2detL, works in the same way (see also [Iwa97, eqns (11.71)–(11.74)]).
∎
If $Z(m)\neq\emptyset$, then $m$ is representable by $(L,Q)$; in particular, for every $\ell$, $m$ is representable by $(L\otimes\mathbb{Z}_\ell,Q)$ and hence $\delta(\ell,L,m)>0$.
By Theorem 4.1.6, we have $q_L(m)<0$ when $Z(m)\neq\emptyset$.
4.2. The lattice L′ and the Siegel mass formula
4.2.1.
For a supersingular point $P\in M(k)$, we defined $L''$, the lattice of special endomorphisms, in 2.3.1 and picked $L'\supset L''$ which is maximal at all $\ell\neq p$ and satisfies $L'\otimes\mathbb{Z}_p=L''\otimes\mathbb{Z}_p$. Though there may be choices for $L'$, the local lattices $L'\otimes\mathbb{Z}_\ell$ are well-defined up to isometry. More precisely, for $\ell\neq p$, $L'\otimes\mathbb{Z}_\ell$ is given by Lemma 2.3.2; and for $\ell=p$, $L'\otimes\mathbb{Z}_p=L''\otimes\mathbb{Z}_p$ is computed in §§3.2-3.3. Note that given $L$, the isometry class of the quadratic lattice $L'\otimes\mathbb{Z}_p$ only depends on whether $P$ is superspecial or supergeneric; indeed, following the notation in §3.1.2, if $t_P=t_{\max}$ (for instance, when $P$ is supergeneric), then $\Lambda_P$ is a maximal lattice with respect to $pQ'$ and hence its isometry class (and thus the isometry class of $L'\otimes\mathbb{Z}_p=\Lambda_P^\vee$) is unique; if $t_P=2$, i.e., $P$ is superspecial, then $\Lambda_P^\vee$ is a maximal lattice with respect to $Q'$ and hence is unique up to isometry.
In order to compute the local intersection number of $Z(m).C$ at $P$, we also need to consider sublattices $L'''$ of $L'$ such that $L'''\otimes\mathbb{Z}_\ell=L'\otimes\mathbb{Z}_\ell$ for all $\ell\neq p$ (more precisely, we will take $L'''$ to be the lattices defined in §7.2.3). In particular, $\det L'''=p^{2a}\det L'$ for some $a\in\mathbb{Z}_{\ge0}$.
Let $\theta_{L'''}(q)$ denote the theta series of the positive definite lattice $L'''$, which is a modular form of weight $\mathrm{rk}\,L'/2$; we decompose $\theta_{L'''}(q)=E_{L'''}(q)+G_{L'''}(q)$, where $E_{L'''}$ is an Eisenstein series and $G_{L'''}$ is a cusp form. Let $q_{L'''}(m)$ denote the $m$-th Fourier coefficient of $E_{L'''}$ (at the cusp $\infty$). The following theorem asserts that $q_{L'''}(m)$ only depends on the genus of $L'''$ and gives an explicit formula for it. In particular, when we consider the theta series for $L'$, the coefficient $q_{L'}(m)$ is independent of the choice of $L'$ above and only depends on $L$ and on whether $P$ is superspecial or supergeneric.
Theorem 4.2.2 (Siegel mass formula).
Notation as in §4.2.1. The Eisenstein series $E_{L'''}$ only depends on the genus of $L'''$. Moreover, for $m\in\mathbb{Z}_{>0}$,
(1) when $L=L_H$,
[TABLE]
(2) when $L=L_S$,
[TABLE]
where we write $m=m_0f^2$ with $\gcd(f,2\det L')=1$ and $v_\ell(m_0)\in\{0,1\}$ for all $\ell\nmid2\det L'$, and $D'=-2m_0\det L'$.
Proof.
The first assertion follows from the Siegel mass formula; see for instance [IK04, Thm. 20.9, eqn. (20.121), and pp. 479-480]. In order to obtain the formula above, we note that the proof of [BK01, Thm. 11] using [BK01, Thm. 6] also applies to L′′′ and hence we conclude that the formula in [Br17, Thms. 2.3, 2.4] also applies to L′′′ and obtain the formulae in the theorem with all L′ replaced by L′′′. Note that by the computations in §§3.2-3.3, we have p∣detL′, and hence ℓ∣2detL′′′ if and only if ℓ∣2detL′; also χ4detL′′′=χ4detL′ and χD′=χ−2m0detL′′′. Hence using L′ (instead of L′′′) for χ,D′ and the product ℓ∣2detL′ yields the same formulae.
∎
4.3. The asymptotic of qL(m)
The discussion of this subsection also applies to qL′′′(m) when m is representable by (L′′′,Q′), but we only focus on qL(m) here.
4.3.1.
Assume that m is representable by (L⊗Zℓ,Q) for every prime ℓ. We will also assume that, as m varies within a specified set T, there exists an absolute constant C>0 such that for all ℓ∣2detL, we have vℓ(m)≤C. As we shall see in §4.3.3, we will always be in this situation.
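For a given $\ell\mid2\det L$, as in [Br17, proof of Prop. 2.5], by [BK01, Lem. 5], we have $\delta(\ell,L,m)=\ell^{a(1-\mathrm{rk}\,L)}\#\{v\in L/\ell^aL\mid Q(v)\equiv m\bmod\ell^a\}$ with $a=1+2C+2v_\ell(2)$ and hence $\ell^{a(1-\mathrm{rk}\,L)}\le\delta(\ell,L,m)\le\ell^a$. (Footnote 21: When $\mathrm{rk}\,L\ge5$, for a fixed $\ell$, it is well known that $\delta(\ell,L,m)\asymp1$ for all $m$ representable by $(L\otimes\mathbb{Z}_\ell,Q)$ without imposing any bound on $v_\ell(m)$; see for instance [Iwa97, pp. 198-199].)
Therefore, given $(L,Q)$, by Theorem 4.1.6, we have that $|q_L(m)|\asymp m\,\sigma_{-1}(m,\chi_{4\det L})$ and hence $m^{1-\epsilon}\ll_\epsilon|q_L(m)|\ll_\epsilon m^{1+\epsilon}$ for $L=L_H$; and $|q_L(m)|\asymp m^{3/2}L(2,\chi_D)\sum_{d\mid f}\mu(d)\chi_D(d)d^{-2}\sigma_{-3}(f/d)$ for $L=L_S$. As in the proof of [Br17, Prop. 2.5], we have $\sum_{d\mid f}\mu(d)\chi_D(d)d^{-2}\sigma_{-3}(f/d)\ge1/5$ and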
[TABLE]
moreover, by loc. cit., $L(2,\chi_D)\ge\zeta(4)/\zeta(2)$ and $L(2,\chi_D)\le\prod_p(1-p^{-2})^{-1}=\zeta(2)$. Hence $|q_L(m)|\asymp m^{3/2}$ when $L=L_S$.
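The two-sided bound $\zeta(4)/\zeta(2)\le L(2,\chi_D)\le\zeta(2)$ can be checked numerically in a single case. The sketch below is an illustration only; it assumes $D=-4$ (which is $-2m_0\det L$ for $\det L=2$ and $m_0=1$), for which $\chi_{-4}$ is the non-trivial character mod 4, and approximates $L(2,\chi_{-4})$ by a partial sum. The helper names are hypothetical.

```python
import math


def chi_minus4(n):
    """The Kronecker character chi_{-4}: 0 on even n, +1 if n = 1 mod 4, -1 if n = 3 mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def L2_partial(chi, terms):
    """Partial sum of L(2, chi) = sum_{n >= 1} chi(n) / n^2."""
    return sum(chi(n) / n ** 2 for n in range(1, terms + 1))


zeta2 = math.pi ** 2 / 6        # zeta(2)
zeta4 = math.pi ** 4 / 90       # zeta(4)
lower_bound = zeta4 / zeta2     # equals pi^2 / 15, about 0.658
value = L2_partial(chi_minus4, 200_000)
```

Here `value` approximates Catalan's constant $0.9159\ldots$, which lies between `lower_bound` and `zeta2` (about $1.645$).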
Lemma 4.3.2.
We fix the same assumptions as in §4.3.1. For $m\gg1$, we have $Z(m)\neq\emptyset$ and the intersection number $Z(m).C=-q_L(m)(\omega.C)+o(|q_L(m)|)$. More precisely, when $L=L_H$, the error term can be bounded by $O_\epsilon(m^{1/2+\epsilon})$ and when $L=L_S$, the error term can be bounded by $O(m^{5/4})$.
Proof.
We follow the discussion in §4.1.4. Let $g(m)$, $m\in\mathbb{Z}_{>0}$, denote the $m$-th Fourier coefficient of the $e_0$-component of $G(q)$, which is also a cusp form of weight $1+\frac n2$ with respect to a certain congruence subgroup of $\mathrm{Mp}_2(\mathbb{Z})$ depending on $L$. When $L=L_H$, by Deligne's bound ([D73, D74]), we have $|g(m)|\ll m^{1/2}\sigma_0(m)\ll_\epsilon m^{1/2+\epsilon}=o_\epsilon(m^{1-\epsilon})=o(|q_L(m)|)$ for any $0<\epsilon<1/4$. When $L=L_S$, the trivial bound yields $|g(m)|\ll m^{5/4}=o(m^{3/2})$ (see [Sar90, Prop. 1.3.5]).
Therefore by Theorem 4.1.1, $Z(m).C=-q_L(m)(\omega.C)+o(|q_L(m)|)$; in particular, for $m\gg1$, $Z(m).C>0$ and hence $Z(m)\neq\emptyset$.
∎
4.3.3.
When $L=L_S$, recall from §2.1 that the quadratic form is $Q(x)=x_0^2+x_1x_2-x_3x_4$ and hence every $m\in\mathbb{Z}_{>0}$ is representable by $(L,Q)$. In particular, $Z(m)\neq\emptyset$ and $\delta(\ell,L,m)>0$ for all $\ell$. Moreover, in order to prove Theorem 1(1) and Remark 2, we will work with $m\in T:=\{Dq^2\mid q\text{ prime and }q\neq p\}$, where we take $D=1$ for Theorem 1(1) and $D$ to be the discriminant of the real quadratic field in Remark 2; and for Theorem 3, we work with $m\in T:=\{q\mid q\text{ prime},\ q\neq p,\ q\text{ is a quadratic residue}\bmod p,\text{ and }q\equiv3\bmod4\}$. In particular, for all such $m$, we have $v_\ell(m)\le2+v_\ell(D)$ and hence the assumptions in §4.3.1 are satisfied.
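Both claims in the Siegel case are elementary and can be illustrated by a short computation. The Python sketch below is an illustration only, with hypothetical helper names: it exhibits an explicit vector representing any $m>0$ by $Q(x)=x_0^2+x_1x_2-x_3x_4$, and lists the small elements of the set $T$ used for Theorem 3.

```python
def Q_S(v):
    """The quadratic form x0^2 + x1*x2 - x3*x4 of L_S."""
    x0, x1, x2, x3, x4 = v
    return x0 * x0 + x1 * x2 - x3 * x4


def representing_vector(m):
    """An explicit vector v with Q_S(v) = m: take x1 = m, x2 = 1 and all other entries 0."""
    return (0, m, 1, 0, 0)


def is_prime(n):
    """Trial division; adequate for the small bounds used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_quadratic_residue(q, p):
    """Euler's criterion for q a non-zero square modulo the odd prime p."""
    return pow(q, (p - 1) // 2, p) == 1


def theorem3_set(p, bound):
    """Primes q < bound with q != p, q a quadratic residue mod p, and q = 3 mod 4."""
    return [q for q in range(2, bound)
            if is_prime(q) and q != p and q % 4 == 3 and is_quadratic_residue(q, p)]
```

For $p=5$, `theorem3_set(5, 80)` returns `[11, 19, 31, 59, 71, 79]`; each such $m=q$ satisfies $v_\ell(m)\le1$ for every $\ell$.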
When $L=L_H$, since $L$ is maximal and isotropic, the quadratic form on $L\otimes\mathbb{Z}_\ell$ is given by $xy+Q_1(z)$, where $x,y\in\mathbb{Z}_\ell$, $z\in\mathbb{Z}_\ell^2$ and $Q_1$ is some quadratic form. Then $\delta(\ell,L,m)>0$ for all $\ell$; indeed, by [Han04, Def. 3.1, Lem. 3.2], $\delta(\ell,L,m)>0$ if there exist $x,y\in\mathbb{Z}/\ell^{1+2v_\ell(2)}$ such that $xy\equiv m\bmod\ell^{1+2v_\ell(2)}$ and $x\not\equiv0\bmod\ell$ (in the terminology of [Han04], this constructs a good type solution (taking $z=0$) for $(L,Q)\bmod\ell^{1+2v_\ell(2)}$, which can be lifted to $\mathbb{Z}/\ell^k$ for any $k\ge1+2v_\ell(2)$). Such $x,y$ always exist (for instance $x=1$, $y=m$), and hence every $m\in\mathbb{Z}_{>0}$ is representable by $(L\otimes\mathbb{Z}_\ell,Q)$ for all $\ell$; hence by Lemma 4.3.2, there exists $N\in\mathbb{Z}_{>0}$ such that for all $m>N$, $m$ is representable by $(L,Q)$. For the proof of Theorem 1(2), we work with $m$ in
[TABLE]
where $F$ is the real quadratic field attached to the Hilbert modular surface and the constant $C$ is chosen so that this set is non-empty. The existence of $q$ implies that $m\neq\mathrm{Nm}_{F/\mathbb{Q}}\gamma$ for any $\gamma\in F$, and hence for any $v\in L_H\otimes\mathbb{Q}$ such that $Q(v)=m$, the subspace $v^\perp\subset L_H\otimes\mathbb{Q}$ is anisotropic. Note that if $Z(m)$ is non-compact in $M_{\mathbb{F}_p}$, then $Z(m)$ parametrizes abelian surfaces which are isogenous to the self-product of elliptic curves, and then $v^\perp$ is isotropic. Therefore, for any $m\in T$, $Z(m)$ is compact in $M_{\mathbb{F}_p}$. Note that $T\subset\mathbb{Z}_{>0}$ is of positive density.
Lemma 4.3.4.
For $L=L_H$ and $M>0$, we have $\sum_{1\le m\le M,\ m\in T}|q_L(m)|\asymp M^2$.
Proof.
By §§4.3.1,4.3.3, we have for m∈T, ∣qL(m)∣≍mσ−1(m,χ), where χ=χ4detL. We write
[TABLE]
Note that
[TABLE]
[TABLE]
The second term is the main term. First let T′:={m∈Z∣m>N,p∤m,vℓ(m)≤C,∀ℓ∣2detL} then
[TABLE]
because vℓ(df)≤C⟺vℓ(d)≤C,∀ℓ∣2detL when vℓ(f)=0,∀ℓ∣2detL and if vℓ(f)>0 for some ℓ∣2detL, then χ(f)=0.
Note that $\sum_{1\le d\le M/f,\ p\nmid d,\ v_\ell(d)\le C\ \forall\ell\mid2\det L}d=C_1\frac{M^2}{f^2}+O(M)$, where $C_1$ and the implicit constant only depend on $C,L,p$. Hence
[TABLE]
To finish the proof, we only need to show that $\left|\sum_{1\le f\le M^{1/2}}\chi(f)\sum_{1\le d\le M/f,\ df\in T'-T}d\right|=o(M^2)$. Since $M/f\ge M^{1/2}$, by the definition of $T$, $\#\{d\mid1\le d\le M/f,\ df\in T'-T\}=o(M/f)$ with implicit constant independent of $f$, and hence we obtain the desired bound. ∎
4.4. Local densities at p and the ratios of Fourier coefficients
We set the same notation as in §4.2.1. Theorem 4.1.6 and Theorem 4.2.2 reduce the comparison between qL(m) and qL′′′(m) to the computation of the local density δ(p,L′′′,m), which we now compute following [Han04, §3]. Recall that p is an odd prime and vp(m)≤1 for all m∈T defined in §4.3.3.
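For an arbitrary quadratic lattice $(L,Q)$, let $\alpha(p,L,m):=p^{1-\mathrm{rk}\,L}\#\{v\in L/pL\mid Q(v)\equiv m\bmod p\}$; if we diagonalize $L\otimes\mathbb{Z}_p$ such that $Q$ is given by $\sum_{i=1}^{\mathrm{rk}\,L}a_ix_i^2$ with $a_i\in\mathbb{Z}_p$, then we define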
[TABLE]
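Lemma 4.4.1 (Hanke).
If $p\nmid m$, we have
$$\delta(p,L''',m)=\alpha(p,L''',m);$$
if $v_p(m)=1$, we have
$$\delta(p,L''',m)=\alpha^*(p,L''',m)+p^{1-s_0}\alpha(p,L'''_I,m/p),$$
where, writing $(L'''\otimes\mathbb{Z}_p,Q')$ in diagonal form $\sum_{i=1}^{\mathrm{rk}\,L'''}a_ix_i^2$ with $a_i\in\mathbb{Z}_p$, we define $s_0=\#\{a_i\mid v_p(a_i)=0\}$ and $L'''_I$ is the quadratic lattice with quadratic form $\sum_{i=1}^{\mathrm{rk}\,L'''}a_i'x_i^2$, where $a_i'=pa_i$ if $v_p(a_i)=0$ and $a_i'=p^{-1}a_i$ if $v_p(a_i)\ge1$.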
Proof.
If p∤m, the assertion follows from [Han04, Rmk. 3.4.1 (a), Lem. 3.2]; If vp(m)=1, then we only have good type and bad type I solutions in the sense of [Han04, Def. 3.1, p. 360] and the assertion follows from [Han04, Lem. 3.2, p. 360, Rmk. 3.4.1 (a)].
∎
We first compute δ(p,L′,m) by Lemma 4.4.1. We always pick ϵ∈Zp×\(Zp×)2 as in §3.1.2.
4.4.2.
Consider $L=L_H$ and recall that $p\nmid m$ for all $m\in T$. Let $F$ denote the real quadratic field attached to the Hilbert modular surface defined by $L_H$.
(1) Assume that $p$ is inert in $F$ and $P$ is supergeneric. By §3.2.1, $L'\otimes\mathbb{Z}_p=\Lambda^\vee=p\Lambda$ and hence $p\mid Q'(v)$ for all $v\in L'$; in particular, $\delta(p,L',m)=0$.
(2) Assume that $p$ is inert in $F$ and $P$ is superspecial. By §3.2.1, $Q'(v)=xy+p(z^2-\epsilon w^2)$, where the $w_i$ are given right above (3.2.1) and $v=xw_3+yw_4+zw_1+ww_2$ with $x,y,z,w\in\mathbb{Z}_p$. Hence $\delta(p,L',m)=\alpha(p,L',m)=1-1/p$.
(3) Assume that $p$ is split in $F$; hence $P$ is superspecial. By §3.2.2, $L'\otimes\mathbb{Z}_p=\Lambda^\vee$ with $Q'(v)=x^2-\epsilon y^2-pz^2+\epsilon pw^2$, where $v=xe_1+ye_2+z(pe_3)+w(pe_4)$ with $x,y,z,w\in\mathbb{Z}_p$. Hence $\delta(p,L',m)=\alpha(p,L',m)=1+1/p$.
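The two non-zero values above can be confirmed by counting solutions modulo $p$. The following Python sketch is an illustration only, under the assumption $p=5$, $\epsilon=2$ (a non-square mod 5); the function names are hypothetical. It evaluates $\alpha(p,L',m)=p^{1-\mathrm{rk}\,L'}\#\{v\bmod p\mid Q'(v)\equiv m\bmod p\}$ for the two explicit forms.

```python
from fractions import Fraction
from itertools import product

p, eps = 5, 2  # assumption for illustration: 2 is a non-square modulo 5


def alpha(Q, rank, m, prime):
    """prime^{1-rank} * #{v in (Z/prime)^rank : Q(v) = m mod prime}."""
    count = sum(1 for v in product(range(prime), repeat=rank)
                if (Q(*v) - m) % prime == 0)
    return Fraction(count, prime ** (rank - 1))


def Q_inert_superspecial(x, y, z, w):
    """Q'(v) = xy + p(z^2 - eps*w^2): the inert, superspecial case."""
    return x * y + p * (z * z - eps * w * w)


def Q_split(x, y, z, w):
    """Q'(v) = x^2 - eps*y^2 - p*z^2 + eps*p*w^2: the split case."""
    return x * x - eps * y * y - p * z * z + eps * p * w * w
```

For every unit $m$ modulo 5, `alpha(Q_inert_superspecial, 4, m, p)` is $4/5=1-1/p$ (the equation $xy\equiv m$ has $p-1$ solutions) and `alpha(Q_split, 4, m, p)` is $6/5=1+1/p$ (the norm equation $x^2-\epsilon y^2\equiv m$ has $p+1$ solutions).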
4.4.3.
Consider $L=L_S$.
(1) Assume that $P$ is superspecial. By §3.3, we have $Q'(v)=xy+\epsilon z^2+pw^2-p\epsilon u^2$, where $v=xw_3+yw_4+zw_5+ww_2+uw_1$ with $x,y,z,w,u\in\mathbb{Z}_p$ and the $w_i$ are given right above (3.3.1). Hence if $p\nmid m$, then $\delta(p,L',m)=\alpha(p,L',m)\le1+1/p$ by [Han04, Table 1]. If $v_p(m)=1$, then the quadratic form of $L'_I$ is $p(xy+\epsilon z^2)+w^2-\epsilon u^2$ and hence $\delta(p,L',m)=\alpha^*(p,L',m)+p^{-2}\alpha(p,L'_I,m/p)=(1-p^{-2})+p^{-2}(1+p^{-1})=1+p^{-3}$.
(2) Assume that $P$ is supergeneric. By §3.3, $L'\otimes\mathbb{Z}_p=\Lambda^\vee$ and hence the quadratic form is $pxy+\epsilon z^2+pw^2-p\epsilon u^2$. If $p\nmid m$, then $\delta(p,L',m)=\alpha(p,L',m)=0$ or $2$; if $v_p(m)=1$, then the quadratic form of $L'_I$ is $p\epsilon z^2+xy+w^2-\epsilon u^2$ and hence $\delta(p,L',m)=\alpha^*(p,L',m)+\alpha(p,L'_I,m/p)=0+1+p^{-2}=1+p^{-2}$ by [Han04, Table 1].
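The values in the case $v_p(m)=1$ can be cross-checked by counting solutions modulo $p^3$. The sketch below is an illustration only, under the assumption $p=3$, $\epsilon=2$, $m=3$; the names are hypothetical. To avoid enumerating all $27^5$ vectors, it tabulates the residues of each independent summand of the form and combines the tables by additive convolution on $\mathbb{Z}/p^a$.

```python
from fractions import Fraction
from itertools import product


def value_counts(term, arity, modulus):
    """Count how often each residue mod `modulus` is taken by `term` on (Z/modulus)^arity."""
    counts = [0] * modulus
    for args in product(range(modulus), repeat=arity):
        counts[term(*args) % modulus] += 1
    return counts


def convolve(a, b, modulus):
    """Additive convolution on Z/modulus: residue counts for a sum of two independent terms."""
    out = [0] * modulus
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % modulus] += ai * bj
    return out


def density_at_level(terms, m, prime, a):
    """prime^{a(1-rank)} * #{v mod prime^a : Q(v) = m}, where Q is the sum of the given terms."""
    modulus = prime ** a
    total = [1] + [0] * (modulus - 1)
    rank = 0
    for term, arity in terms:
        total = convolve(total, value_counts(term, arity, modulus), modulus)
        rank += arity
    return Fraction(total[m % modulus], modulus ** (rank - 1))


p, eps = 3, 2  # assumption for illustration: 2 is a non-square modulo 3
tail = [(lambda z: eps * z * z, 1),
        (lambda w: p * w * w, 1),
        (lambda u: -p * eps * u * u, 1)]
superspecial = [(lambda x, y: x * y, 2)] + tail      # xy + eps z^2 + p w^2 - p eps u^2
supergeneric = [(lambda x, y: p * x * y, 2)] + tail  # p xy + eps z^2 + p w^2 - p eps u^2
```

With these choices, `density_at_level(superspecial, 3, p, 3)` equals $28/27=1+p^{-3}$ and `density_at_level(supergeneric, 3, p, 3)` equals $10/9=1+p^{-2}$, in agreement with the values above; for $p\nmid m$ the supergeneric form gives $0$ for $m=1$ and $2$ for $m=2$ already at level $a=1$.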
We now estimate $\delta(p,L''',m)$ for the sublattices $L'''$ of $L'$ defined in §4.2.1.
Lemma 4.4.4.
If $p\nmid m$, then $\delta(p,L''',m)\le2$.
Proof.
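By Lemma 4.4.1, $\delta(p,L''',m)=\alpha(p,L''',m)$. Write the quadratic form $Q'$ on $L'''$ in diagonal form $\sum_{i=1}^{\mathrm{rk}\,L'''}a_ix_i^2$ with $a_i\in\mathbb{Z}_p$; we may assume that there exists $a_i$ such that $p\nmid a_i$, since otherwise $\delta(p,L''',m)=0$ and we are done. Now let $\overline{L'''}$ denote the quadratic form $\sum_{1\le i\le\mathrm{rk}\,L''',\ p\nmid a_i}a_ix_i^2$. Then by definition, $\alpha(p,L''',m)=\alpha(p,\overline{L'''},m)$.
Since $p\mid\mathrm{disc}\,L'$, we have $p\mid\mathrm{disc}\,L'''$ and $\mathrm{rk}\,\overline{L'''}\le\mathrm{rk}\,L'''-1\le4$. Then by [Han04, Table 1], $\alpha(p,\overline{L'''},m)\le2$ and hence $\delta(p,L''',m)\le2$.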
∎
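The bound $\alpha\le2$ for unit diagonal forms of rank at most 4 and $p\nmid m$ can be confirmed exhaustively for small primes. The Python sketch below is an illustration only (hypothetical helper names, brute-force enumeration): it runs over all diagonal forms $\sum a_ix_i^2$ with unit coefficients modulo $p$ and records the largest value of $\alpha$ at a unit $m$.

```python
from fractions import Fraction
from itertools import product


def alpha_values(coeffs, p):
    """List indexed by m of p^{1-r} * #{v mod p : sum a_i x_i^2 = m}, with r = len(coeffs)."""
    counts = [0] * p
    for v in product(range(p), repeat=len(coeffs)):
        counts[sum(a * x * x for a, x in zip(coeffs, v)) % p] += 1
    scale = p ** (len(coeffs) - 1)
    return [Fraction(n, scale) for n in counts]


def max_alpha_over_unit_forms(p, max_rank=4):
    """Largest alpha(p, form, m) over unit diagonal forms of rank <= max_rank and units m."""
    best = Fraction(0)
    for rank in range(1, max_rank + 1):
        for coeffs in product(range(1, p), repeat=rank):
            values = alpha_values(coeffs, p)
            best = max(best, max(values[1:]))  # index 0 is m = 0, which is excluded
    return best
```

For $p=3$ and $p=5$ the maximum is exactly $2$, attained in rank 1, where $ax^2\equiv m$ has two solutions whenever $m/a$ is a square.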
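Lemma 4.4.5.
Assume that $L=L_S$ and $v_p(m)=1$. We have $\delta(p,L''',m)\le2+2p$. Moreover, if $P$ is superspecial and $[L':L''']=p$, then $\delta(p,L''',m)\le4$.
Proof.
By Lemma 4.4.1, $\delta(p,L''',m)=\alpha^*(p,L''',m)+p^{1-s_0}\alpha(p,L'''_I,m/p)\le\alpha(p,L''',m)+p\,\alpha(p,L'''_I,m/p)$. For a diagonalized lattice, write $\overline{\,\cdot\,}$ for the quadratic form obtained by keeping only the terms $a_ix_i^2$ with $p\nmid a_i$. By the proof of Lemma 4.4.4, we have $\alpha(p,L''',m)=\alpha(p,\overline{L'''},m)\le2$. The same argument implies that $\alpha(p,L'''_I,m/p)\le2$ if $\mathrm{rk}(\overline{L'''_I})\le4$. If $\mathrm{rk}(\overline{L'''_I})=5$, then it is isotropic and we write the quadratic form as $xy+Q_1(z)$. The equation $xy+Q_1(z)\equiv(m/p)\bmod p$ has $(p-1)p^3$ solutions in $\mathbb{F}_p^5$ with $x\neq0$ and at most $p^4$ solutions with $x=0$. Hence $\alpha(p,L'''_I,m/p)=\alpha(p,\overline{L'''_I},m/p)<2$. Therefore, $\delta(p,L''',m)\le2+2p$.
If $P$ is superspecial and $[L':L''']=p$, then $s_0\ge1$ and hence $\delta(p,L''',m)\le\alpha^*(p,L''',m)+\alpha(p,L'''_I,m/p)\le4$.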
∎
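The point count used in the rank 5 step of the proof above can be checked by brute force for a small prime. The following sketch is purely illustrative: the ternary form `q1` and the sample values of `p` and `c` are assumptions made for the example and are not taken from the text. It counts the solutions of xy+Q1(z)≡c mod p in Fp^5 separately for x≠0 and x=0.

```python
from itertools import product


def q1(z1, z2, z3):
    """Hypothetical ternary quadratic form Q1, chosen only for illustration."""
    return z1 * z1 + z2 * z2 + z3 * z3


def count_solutions(p, c):
    """Count solutions of x*y + Q1(z) = c in F_p^5, split into x != 0 and x == 0."""
    with_x_nonzero = 0
    with_x_zero = 0
    for x, y, z1, z2, z3 in product(range(p), repeat=5):
        if (x * y + q1(z1, z2, z3) - c) % p == 0:
            if x != 0:
                with_x_nonzero += 1
            else:
                with_x_zero += 1
    return with_x_nonzero, with_x_zero


p, c = 3, 1
nonzero, zero = count_solutions(p, c)
# For x != 0 the value of y is determined by (x, z): exactly (p - 1) * p**3 solutions.
assert nonzero == (p - 1) * p**3
# For x == 0 the variable y is free and z is constrained: at most p * p**3 solutions.
assert zero <= p**4
# Hence the total count divided by p**4 is strictly less than 2.
assert nonzero + zero < 2 * p**4
```

The count for x≠0 does not depend on the choice of Q1, which is why any ternary form can be used in the illustration.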
The following lemma, which is the main goal of this subsection, will be used to compare the local intersection number at a supersingular point P with the global intersection number.
Lemma 4.4.6.
Notation as in §4.2.1 and consider m∈T (defined in §4.3.3).
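(1)
If P is superspecial or L=LH, then $\frac{q_{L'}(m)}{-q_L(m)}\le \frac{1}{p-1}$.
(2)
If L=LS and P is supergeneric, then $\frac{q_{L'}(m)}{-q_L(m)}\le \frac{2}{p^2-1}$.
(3)
If p∤m, then $\frac{q_{L'''}(m)}{-q_L(m)}\le \frac{2}{\sqrt{|(L'''\otimes\mathbb{Z}_p)^\vee/(L'''\otimes\mathbb{Z}_p)|}\,(1-p^{-2})}$.
(4)
Under the assumptions of Lemma 4.4.5, $\frac{q_{L'''}(m)}{-q_L(m)}\le \frac{2p}{\sqrt{|(L'''\otimes\mathbb{Z}_p)^\vee/(L'''\otimes\mathbb{Z}_p)|}\,(1-p^{-1})}$; moreover, if P is superspecial and [L′:L′′′]=p, then $\frac{q_{L'''}(m)}{-q_L(m)}\le \frac{4}{p^2-1}$.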
Proof.
Recall from §4.2.1 that L′′′⊗Zℓ≅L⊗Zℓ for all ℓ≠p; hence for ℓ≠p, we have δ(ℓ,L′′′,m)=δ(ℓ,L,m), and detL′′′=pkdetL for some k∈Z≥0. Since L is self-dual at p, we have p∤detL; by §3.1.2, detL′=p2bdetL for some b∈Z>0 (concretely, one may deduce this fact from the explicit formulas for Q′ in §§4.4.2-4.4.3) and hence k∈2Z>0. Thus χ4detL(d)=χ4detL′(d) and χ−2m0detL(d)=χ−2m0detL′(d) if p∤d.
5. The decay lemma for supersingular points and its proof in the Hilbert case
The goal of this section is to prove that special endomorphisms “decay rapidly”. More precisely, consider a generically ordinary two-dimensional abelian scheme over Fˉp[[t]] whose special fiber is supersingular. We consider the lattice of special endomorphisms of the abelian scheme mod tN as N varies, and establish bounds for the covolume of these lattices. These bounds are exactly what we need to bound the local intersection multiplicity SpfFˉp[[t]]⋅Z(m) – see Lemma 7.2.1. The precise definitions and results are in 5.1.1 and Theorem 5.1.2.
Throughout this section, as in §3, k=Fˉp, W=W(k), K=W[1/p]. We focus on the behavior of the curve C in Theorems 1 and 3 in a formal neighborhood of a supersingular point P, so we may let C=Spfk[[t]] denote a generically ordinary formal curve in Mk which specializes to P.
As in §3.1.5, σ denotes both the Frobenius on K and the Frobenius on the coordinate rings W[[x,y]], W[[x,y,z]] of MP, which is the unique extension of the Frobenius action on W for which $\sigma(x)=x^p$, $\sigma(y)=y^p$, $\sigma(z)=z^p$. For a matrix M with entries in K[[x,y]] or K[[x,y,z]], we use $M^{(n)}$ to denote $\sigma^n(M)$. Also recall that we fix $\lambda\in\mathbb{Z}_{p^2}^\times$ such that σ(λ)=−λ.
We use σt to denote the Frobenius on K[[t]] which extends σ on K and sends t to tp.
5.1. Statement of the Decay Lemma and the first reduction step
The map C→Mk gives rise to a local ring homomorphism k[[x,y]]→k[[t]] (in the Hilbert case) or k[[x,y,z]]→k[[t]] (in the Siegel case), and we denote by x(t), y(t), and z(t) the images of x, y, and z respectively. Let vt denote the t-adic valuation map on k[[t]]. Let A denote the t-adic valuation of the local equation defining the non-ordinary locus in Corollary 3.4.2. More precisely, if P is superspecial, then A=vt(xy) in the Hilbert case and A=vt(xy+z2/(4ϵ)) in the Siegel case.
Definition 5.1.1.
Let w denote a special endomorphism of the p-divisible group at P (i.e., w is an element in L′⊗Zp; see 2.2.4 and 2.2.9).
(1)
We say that w decays rapidly if $p^n w$ does not lift to an endomorphism modulo $t^{A_n+1}$ for all n∈Z≥0, where $A_n:=[A(p^n+p^{n-1}+\cdots+1+1/p)]$; here [x] denotes the largest integer y such that y≤x.
(2)
We say that a Zp-submodule of L′⊗Zp decays rapidly if every primitive vector in the submodule decays rapidly.
(3)
We say that w decays very rapidly if $p^n w$ does not lift to an endomorphism modulo $t^{A_{n-1}+ap^n+1}$ for some constant a≤A/2, for all n∈Z≥0, where $A_n$ is defined in (1) and we define $A_{-1}=[A/p]$.
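The thresholds in Definition 5.1.1 can be tabulated with exact integer arithmetic. The sketch below is only an illustration; the function names and the sample values of `A`, `a` and `p` are assumptions and not notation from the text. It uses the identity A(p^n+…+1+1/p) = A(1+p+…+p^{n+1})/p, so that the floor becomes an integer division.

```python
def geometric_sum(p, top):
    """Return 1 + p + ... + p**top (and 0 when top < 0)."""
    return sum(p**i for i in range(top + 1))


def A_n(A, p, n):
    """A_n = floor(A * (p**n + ... + 1 + 1/p)) for n >= -1; in particular A_{-1} = floor(A / p)."""
    # A * (p**n + ... + 1 + 1/p) equals A * (1 + p + ... + p**(n + 1)) / p.
    return (A * geometric_sum(p, n + 1)) // p


def rapid_decay_exponent(A, p, n):
    """Exponent A_n + 1 of t appearing in part (1) of the definition."""
    return A_n(A, p, n) + 1


def very_rapid_decay_exponent(A, a, p, n):
    """Exponent A_{n-1} + a * p**n + 1 of t appearing in part (3) of the definition."""
    return A_n(A, p, n - 1) + a * p**n + 1


# Sample values (hypothetical): A = 4, a = 2, p = 3.
for n in range(3):
    print(n, rapid_decay_exponent(4, 3, n), very_rapid_decay_exponent(4, 2, 3, n))
```

For these sample values the very rapid exponent is smaller than the rapid one for each n, reflecting that very rapid decay is the stronger condition.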
Theorem 5.1.2 (Decay Lemma).
Assume P is superspecial. There exists a rank 3 Zp-submodule of L′⊗Zp which decays rapidly and furthermore, there is a primitive vector in this submodule which decays very rapidly.
Here we only state the decay lemma for a superspecial point since we do not need to work with supergeneric points to prove Theorems 1 and 3. We refer the reader to the appendix for a decay lemma when P is supergeneric.
For m∈Z≥0, let Sm denote Speck[t]/(tm) and let Dm denote the p-adic completion of the PD enveloping algebra of the ideal (tm,p) in W[[t]]. Let ιm denote the composite map Sm→Spfk[[t]]→Spfk[[x,y]] or Spfk[[x,y,z]].
Then by [dJ95, §2.3], there exists a functor from the category of p-divisible groups over Sm to the category of Dieudonné modules over Dm. More precisely, a special endomorphism w~m of the p-divisible group over Sm which specializes to w∈L′⊗Zp gives rise to an endomorphism of the Dieudonné module which specializes to w. By functoriality of Dieudonné modules, images of special endomorphisms are horizontal sections of ιm∗Lcris(Dm) stable under the Frobenius action; here the connection on ιm∗Lcris(Dm) is the pull-back of the connection on Lcris(W[[x,y]]), Lcris(W[[x,y,z]]) by a ring homomorphism W[[x,y]]→W[[t]] or W[[x,y,z]]→W[[t]] which lifts the map k[[x,y]]→k[[t]] or k[[x,y,z]]→k[[t]] given by C (Footnote 22: We may pick a lift k→W, for instance the Teichmüller lift, and hence view x(t),y(t),z(t) as power series in W[[t]].), and the σt-linear Frobenius is given in [Moonen98, §4.3.3] (Footnote 23: Here we refer to [Moonen98] for the existence of an explicit formula for the σt-linear Frobenius, but we do not need this explicit formula for our purpose. We will always carry out our computation using the σ-linear Frobenius; see the rest of the proof for the details.).
The connection on Lcris(W[[x,y]]) or Lcris(W[[x,y,z]]) gives rise to a connection on Lcris,P(W)⊗WK[[x,y]]⊃Lcris(W[[x,y]]) or Lcris,P(W)⊗WK[[x,y,z]]⊃Lcris(W[[x,y,z]]).
Let w~ denote the horizontal section in Lcris,P(W)⊗WK[[x,y]] or Lcris,P(W)⊗WK[[x,y,z]] extending w∈L′⊗Zp⊂Lcris,P(W). Since the image of w~m in ιm∗Lcris(Dm) is horizontal and the connection on ιm∗Lcris(Dm) is the pull-back connection, we have w~m=ιm∗w~. Therefore, if w lifts to a special endomorphism over Sm, then ιm∗w~∈ιm∗Lcris(Dm)⊂Lcris,P(W)⊗WK[[t]].
The section w~ is constructed in [Kisin, §1.5.5] as follows. Recall from §§3.2-3.3, the Frobenius on Lcris(W[[x,y]]),Lcris(W[[x,y,z]]), with respect to a φ-invariant basis {wi}, is given by (I+F)∘σ for some matrix F with entries in (x,y)K[x,y] or (x,y,z)K[x,y,z].
We define F∞ to be the infinite product $\prod_{i=0}^{\infty}(1+F^{(i)})$, where $F^{(i)}$ is the i-th σ-twist of F (recall $\sigma(x)=x^p$, $\sigma(y)=y^p$, $\sigma(z)=z^p$). Since vt(x),vt(y),vt(z)≥1, the product is well-defined and the entries of F∞ are power series valued in K[[t]]. The vectors in the Qp-span of the columns of F∞ are elements of Lcris,P(W)⊗K[[x,y]] or Lcris,P(W)⊗K[[x,y,z]] which are Frobenius stable and horizontal.
Now we are ready to reduce the proof of the decay lemma to the following proposition. Indeed, by Proposition 5.1.3, with respect to {wi}, there exists a rank 3 Zp-submodule of L′⊗Zp such that for every primitive w in this submodule, the coefficient of $t^{k_n}$ in $p^n\tilde{w}$ does not lie in $(p^{-1}W)^4$ for some $k_n\le A(1+p+\cdots+p^{n+1})$; since $pL_{\mathrm{cris},P}(W)\subset L'\otimes W$, with respect to a W-basis of Lcris,P(W), the coefficient of $t^{k_n}$ in $p^n\tilde{w}$ does not lie in $W^4$. On the other hand, for any $N<p(A_n+1)$, we have $p^{-1}t^N\notin D_{A_n+1}$. Note that $p(A_n+1)>pA(p^n+\cdots+1/p)=A(p^{n+1}+\cdots+1)\ge k_n$. Hence $p^n\tilde{w}$ does not extend to a special endomorphism over $S_{A_n+1}$. Thus, this rank 3 submodule decays rapidly. Moreover, the existence of a vector decaying very rapidly follows from the second assertion of Proposition 5.1.3 via the same argument and the fact that $p(A_{n-1}+ap^n+1)>p(A(p^{n-1}+\cdots+1/p)+ap^n)=A(p^n+\cdots+1)+ap^{n+1}$.
∎
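The two numerical inequalities used in the reduction above, namely p(A_n+1) > A(1+p+…+p^{n+1}) and p(A_{n−1}+ap^n+1) > A(1+…+p^n)+ap^{n+1}, follow from the definition of the floor. As a sanity check they can also be verified on a range of sample values. The helper names and the ranges below are assumptions made for this illustration only.

```python
def geometric_sum(p, top):
    """Return 1 + p + ... + p**top (and 0 when top < 0)."""
    return sum(p**i for i in range(top + 1))


def A_n(A, p, n):
    """A_n = floor(A * (p**n + ... + 1 + 1/p)), valid for n >= -1."""
    return (A * geometric_sum(p, n + 1)) // p


def reduction_inequalities_hold(A, a, p, n):
    """Check both inequalities used to pass from Proposition 5.1.3 to Theorem 5.1.2."""
    rapid = p * (A_n(A, p, n) + 1) > A * geometric_sum(p, n + 1)
    very_rapid = (
        p * (A_n(A, p, n - 1) + a * p**n + 1)
        > A * geometric_sum(p, n) + a * p ** (n + 1)
    )
    return rapid and very_rapid


# Sample ranges (hypothetical); a runs over 0 <= a <= A / 2.
all_hold = all(
    reduction_inequalities_hold(A, a, p, n)
    for p in (3, 5, 7)
    for A in range(1, 30)
    for a in range(0, A // 2 + 1)
    for n in range(0, 5)
)
print(all_hold)
```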
Proposition 5.1.3.
Assume P is superspecial.
With respect to the wi-basis in §§3.2-3.3, there exists a rank 3 Zp-submodule of L′⊗Zp such that for every primitive w in this submodule, the coefficients of $1=t^0,\dots,t^{A(1+p+\cdots+p^n)}$ in the power series $p^n\tilde{w}\in(K[[t]])^4$ do not all lie in $W^4$ for all n∈Z≥0 (property DR); moreover, there exist a≤A/2 and a primitive w in the rank 3 submodule such that the coefficients of $1,\dots,t^{A(1+p+\cdots+p^{n-1})+ap^n}$ in $p^n\tilde{w}\in(K[[t]])^4$ do not all lie in $W^4$ for all n∈Z≥0 (property DvR).
By a slight abuse of terminology, if a submodule of L′⊗Zp satisfies property DR (with respect to the basis {wi}), we also say that this submodule decays rapidly; if a primitive vector satisfies property DvR, we also say that this vector decays very rapidly. By the proof of Theorem 5.1.2 above, property DR (resp. DvR) implies rapid (resp. very rapid) decay in the sense of Definition 5.1.1.
The rest of this section is devoted to proving Proposition 5.1.3 in the Hilbert case; the proof in the Siegel case is given in §6. In the following, the split/inert case means that p is split/inert in the real quadratic field attached to the Hilbert modular surface.
In the Hilbert case, by Corollary 3.4.2, the non-ordinary locus is cut out by the equation xy=0. As in the reduction of Theorem 5.1.2 to Proposition 5.1.3, we pick a lift W[[x,y]]→W[[t]] of the local ring homomorphism k[[x,y]]→k[[t]] defined by C. Since C is generically ordinary, both x and y map to non-zero power series in W[[t]]. Without loss of generality, we assume that vt(x)≤vt(y), and that $x(t)=t^a+\dots$ and $y(t)=\alpha t^b+\dots$, where α∈W×.
5.2. Decay in the split case
Notation as in the proof of Theorem 5.1.2. We first compute $F_\infty=\prod_{i=0}^{\infty}(1+F^{(i)})$, where by (3.2.2),
[TABLE]
Let F∞(1) and F∞(2) denote the top-left and top-right 2×2 blocks of F∞ respectively.
To simplify the notation, define (Footnote 24: These three matrices are the same; however, we use different notation to be consistent with the proof for the Siegel case in §6.)
[TABLE]
and let Ft, Fu and Fl denote the top-left, top-right, and bottom-left 2×2 blocks of F.
The following elementary lemma picks out the terms in F∞(1),F∞(2) with the desired p-power on the denominators.
Lemma 5.2.1.
(1)
The part of F∞(1) with p-adic valuation −(n+1) consists of sums of products of the form $\prod_{i=0}^{m_1+2m_2}X_i^{(n_i)}$. Here $X_i$ is either $F_t$, $F_u$ or $F_l$ (Footnote 25: The terms $X_i$ are chosen so that the product makes sense and has the right size. Note that this implies that $F_u,F_l$ must occur in consecutive pairs.), $m_1+1$ is the number of occurrences of $F_t$, $m_2$ is the number of occurrences of the pair $F_u,F_l$, $m_1+m_2=n$, and $n_i$ is a strictly increasing sequence of non-negative integers. The analogous statement holds for F∞(2) as well.
(2)
Fix values of $m_1,m_2$ as above. Among all the terms in the above sum, the ones with minimal t-adic valuation only occur when $n_i=i$, and either when $X_0=X_1=\dots=X_{m_1}=F_t$, or $X_0=X_2=\dots=X_{2m_2-2}=F_u$. The analogous statement holds for F∞(2) as well.
(3)
(for F∞(1)) The product $\prod_{i=0}^{m_1}F_t^{(i)}\prod_{i=0}^{m_2-1}F_u^{(m_1+2i+1)}F_l^{(m_1+2i+2)}$ (modulo terms with smaller p-power in denominators; Footnote 26: We use here that $x^p\pm y^p\equiv(x\pm y)^p \bmod p$.) equals
[TABLE]
(4)
(for F∞(2)) The product $\prod_{i=0}^{m_1}F_t^{(i)}\prod_{i=0}^{m_2-1}F_u^{(m_1+2i+1)}F_l^{(m_1+2i+2)}\cdot F_u^{(m_1+2m_2+1)}$ (modulo terms with smaller p-power in denominators) equals
[TABLE]
5.2.2.
Notations. We make the following definition to further lighten the notation.
Let P(1)m2,n denote the product
[TABLE]
Recall that A=a+b denotes the t-adic valuation vt(xy) of xy and let B denote vt(xp+1+yp+1). Note that B≥a(p+1) and the equality holds unless a=b.
In order to prove Proposition 5.1.3, we will consider the following case-by-case analysis depending on the relation between a and b. The following elementary lemmas will be used in the case-by-case analysis.
Lemma 5.2.3.
Let n,e,f be in Z≥0.
(1)
The kernel of the 2×2 matrix P(1)e,n modulo p is defined over Fp2 but not over Fp.
(2)
The reductions of P(1)e,n and P(1)f,n modulo p are not scalar multiples (over k) of each other if e≢f mod 2. In particular, these reductions are not scalar multiples of each other if f=e±1.
Proof.
As the entries of G, Hu and Hl are all in W(Fp2)[1/p], it follows that G(2m)=G and G(2m+1)=G(1) (and the analogous statements hold for Hu and Hl). A direct computation shows that GG(1)G=G, HuHl(1)HuHl(1)=HuHl(1), and Hu(1)HlHu(1)Hl=Hu(1)Hl. Therefore, if n−e is odd, then P(1)e,n simplifies to either GG(1)HuHl(1), GG(1) or HuHl(1); if n−e is even, P(1)e,n simplifies to G or GHu(1)Hl. A direct computation shows that the matrices GG(1), HuHl(1) and GG(1)HuHl(1) (resp. G and GHu(1)Hl) are equal to
[TABLE]
In either case, since λ∈W(Fp2)\Zp, there is no non-trivial Fp-linear combination of the columns modulo p which equals zero; this implies part (1). Furthermore, the above matrices are clearly not scalar multiples of each other, whence part (2) follows.
∎
Lemma 5.2.4.
Let n,e,f be in Z≥0.
(1)
The kernel of the 2×2 matrix P(1)e,n−1⋅Hu(n+e) modulo p is defined over Fp2 but not Fp.
(2)
The reductions of P(1)e,n−1⋅Hu(n+e) and P(1)f,n−1⋅Hu(n+f) modulo p are not scalar multiples of each other if e≢f mod 2. In particular, these reductions are not scalar multiples of each other if f=e±1.
Proof.
We argue along the lines of the proof of Lemma 5.2.3. Indeed, if n−e is odd (resp. even), we are reduced to the cases of GG(1)HuHl(1)Hu, GG(1)Hu, HuHl(1)Hu, and Hu (resp. GHu(1)HlHu(1) and GHu(1)). The rest of the argument is similar. ∎
We now prove Proposition 5.1.3 when p is split in the real quadratic field defining the Hilbert modular surface. The proof is a case-by-case study in the following four cases based on the relation of a=vt(x) and b=vt(y). The idea is to pick out the term(s) with minimal t-adic valuation among all the terms with the same p-power denominators given in Lemma 5.2.1. Case 4 is the generic case and it is easy to pick out such terms so we give the proof directly. In Cases 1-3, we first state the lemmas on the terms with minimal t-adic valuation and then prove the decay lemma. For the convenience of the reader, we summarize the desired vectors which decay rapidly enough at the beginning of each case.
Case 1: a=b.
Recall that A=vt(xy)=a+b=2a.
We will prove that every vector in SpanZp{w1,w2,wi} decays rapidly, where wi=w4 if the t-adic valuation of x−y is >a, and wi=w3 otherwise. Moreover, in either case, the vector wi decays very rapidly.
Lemma 5.2.5.
(1)
Among the terms appearing in F∞(1) described in Lemma 5.2.1 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
(2)
Among the terms appearing in F∞(2) described in Lemma 5.2.1 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
This lemma follows directly from Lemma 5.2.1 and the assumption that a=b.
We first prove that every primitive vector w∈SpanZp{w1,w2} decays rapidly.
Indeed, write w=cw1+dw2, by Lemma 5.2.3(1) and Lemma 5.2.5(1), there is a unique (non-vanishing) term in F∞(1)w with denominator 1/pn+1 and minimal t-adic valuation A(1+p+⋯+pn) given by P(1)0,n[cd]T(xy)1+p+⋯+pn. Hence, modulo tA(1+p+⋯+pn)+1, the horizontal section pnw~=F∞(pnw) does not lie in W[[t]] and hence w decays rapidly.
Secondly, let i∈{3,4} be defined as above; we show that wi decays very rapidly. Note that our definition of wi implies that the first two entries of the ith row of F have t-adic valuation equal to a. Furthermore, by Lemma 5.2.3(1), $P(1)_{0,n-1}\cdot v\neq 0 \bmod p$, where v is the nth Frobenius twist of either column of Hu. Therefore, among the terms in the ith column of F∞ with denominator $p^{n+1}$, the term with minimal t-adic valuation has t-adic valuation $2a(1+p+\dots+p^{n-1})+ap^n$. Hence wi decays very rapidly since a≤(2a)/2=A/2.
Finally, we show that every vector in SpanZp{w1,w2,wi} decays rapidly. Let wu denote a primitive vector in the span of w1,w2. It suffices to show that every vector of the form $p^m w_u+w_i$ or $w_u+p^m w_i$ decays rapidly, where m≥0. We first treat vectors of the form $p^m w_u+w_i$ with m≥0. Consider the two-dimensional vector whose entries are the first two entries of $F_\infty\cdot p^m w_u$. The t-adic valuation of the coefficient of $1/p^{n+1}$ equals $2a(1+p+\dots+p^{n+m})$. Similarly, consider the two-dimensional vector whose entries are the first two entries of $F_\infty\cdot w_i$. The t-adic valuation of the coefficient of $1/p^{n+1}$ equals $2a(1+p+\dots+p^{n-1})+ap^n$. Regardless of the value of m, the latter quantity is always smaller than the former, whence it follows that $p^m w_u+w_i$ decays rapidly.
Now consider a vector of the form $w_u+p^m w_i$, where m>0. Analogous to the previous case, consider the two-dimensional vector whose entries are the first two entries of $F_\infty\cdot w_u$. The t-adic valuation of the sum of all terms with denominator $p^{n+1}$ equals $2a(1+p+\dots+p^n)$. Similarly, consider the two-dimensional vector whose entries are the first two entries of $F_\infty\cdot p^m w_i$. The t-adic valuation of the coefficient of $1/p^{n+1}$ equals $2a(1+p+\dots+p^{n+m-1})+ap^{n+m}$. Regardless of the value of m (recall that m>0), the latter quantity is always greater than the former, whence it follows that $w_u+p^m w_i$ decays rapidly.
∎
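The two valuation comparisons in the last step of Case 1 are elementary, and can be confirmed numerically. The sketch below is an illustration under assumed sample ranges; the function names are hypothetical and do not appear in the text.

```python
def geometric_sum(p, top):
    """Return 1 + p + ... + p**top (and 0 when top < 0)."""
    return sum(p**i for i in range(top + 1))


def first_comparison(a, p, n, m):
    """For p^m w_u + w_i (m >= 0): the w_i valuation is strictly smaller."""
    from_wu = 2 * a * geometric_sum(p, n + m)
    from_wi = 2 * a * geometric_sum(p, n - 1) + a * p**n
    return from_wi < from_wu


def second_comparison(a, p, n, m):
    """For w_u + p^m w_i (m > 0): the w_i valuation is strictly greater."""
    from_wu = 2 * a * geometric_sum(p, n)
    from_wi = 2 * a * geometric_sum(p, n + m - 1) + a * p ** (n + m)
    return from_wi > from_wu


# Sample ranges (hypothetical).
case1_ok = all(
    first_comparison(a, p, n, m) and (m == 0 or second_comparison(a, p, n, m))
    for p in (3, 5, 7)
    for a in range(1, 6)
    for n in range(0, 5)
    for m in range(0, 5)
)
print(case1_ok)
```

Since the two contributions never share a t-adic valuation at the same p-power denominator, no cancellation between them can occur, which is the point of the argument.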
Case 2: b=p2ea for some e∈Z≥1
We will prove that SpanZp{w1,w2,w} decays rapidly where w is some primitive vector in SpanZp{w3,w4}. We will further prove that w decays very rapidly.
Lemma 5.2.6.
(1)
Among the terms appearing in F∞(1) described in Lemma 5.2.1 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
(2)
Among the terms appearing in F∞(2) described in Lemma 5.2.1 with denominator pn+1, there are exactly two terms with minimal t-adic valuation, and they are
[TABLE]
[TABLE]
Proof.
In the following, we will prove part (1); part (2) will follow by an identical argument.
Note that the t-adic valuation of all the entries of F(1) is a+b, and the t-adic valuation of the entries of Fu and Fl is a . Let k,l be in Z≥0 such that k+l=n+1. Consider the following terms of F∞(1) with denominator exactly pn+1:
[TABLE]
Similar to Lemma 5.2.1(2), we observe that among all the terms of F∞(1) with denominator exactly pn+1 given in Lemma 5.2.1(1), for each other term X not listed above, there exists at least one Xk,l (as k and l vary over all non-negative integers constrained by k+l=n+1) such that vt(Xk,l)<vt(X).
Therefore, to prove (1), it suffices to show that vt(Xk,l) with k=n−e+1 and l=e is less than vt(Xk,l) with any other choice of k,l.
Since $b=ap^{2e}$ and k+l=n+1, we have $f(k):=v_t(X_{k,n+1-k})=a\left((1+p^{2e})\frac{p^k-1}{p-1}+\frac{p^{2(n-k+1)}-1}{p-1}\,p^k\right)$, and we need to prove that k=n−e+1 minimizes this expression as k ranges over Z∩[0,n+1]. Note that if we allow k to take all real values in the interval [0,n+1], a direct computation shows that f is convex (i.e., f′′(k)>0). Therefore, it suffices to show that f(n−e+1)<f(n−e) and f(n−e+1)<f(n−e+2). These claims can be verified directly, which proves (1). ∎
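The minimization at the end of this proof can be checked by direct enumeration. The sketch below evaluates f(k) = a((1+p^{2e})(p^k−1)/(p−1) + p^k(p^{2(n−k+1)}−1)/(p−1)) for every integer k in [0,n+1] and reports where the minimum is attained. The function names and the sample ranges are assumptions made for this illustration.

```python
from fractions import Fraction


def f(k, a, p, e, n):
    """t-adic valuation f(k) from the proof of Lemma 5.2.6(1), with b = a * p**(2e)."""
    from_ft = (1 + p ** (2 * e)) * Fraction(p**k - 1, p - 1)
    from_pairs = Fraction(p ** (2 * (n - k + 1)) - 1, p - 1) * p**k
    return a * (from_ft + from_pairs)


def minimizers(a, p, e, n):
    """Return all integers k in [0, n + 1] at which f attains its minimum."""
    values = {k: f(k, a, p, e, n) for k in range(0, n + 2)}
    smallest = min(values.values())
    return [k for k, value in values.items() if value == smallest]


# Sample ranges (hypothetical); the claim is that k = n - e + 1 is the unique minimizer.
unique_minimum = all(
    minimizers(a, p, e, n) == [n - e + 1]
    for p in (3, 5, 7)
    for a in (1, 2, 3)
    for n in range(1, 7)
    for e in range(1, n + 1)
)
print(unique_minimum)
```

Up to the factor a/(p−1) and an additive constant, f(k) equals p^{2e+k}+p^{2n+2−k}, which is symmetric about k=n−e+1; this explains why the enumeration singles out that value.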
We first prove that SpanZp{w1,w2} decays rapidly.
Indeed, let w′ be a primitive vector in SpanZp{w1,w2}. Lemma 5.2.3(1) implies that P(1)e,n⋅w′ mod p is non-zero. This fact taken in conjunction with Lemma 5.2.6(1) yields that w′ decays rapidly.
Secondly, we prove that there exists a primitive vector w∈SpanZp{w3,w4} (independent of n) which decays very rapidly.
Set $Y_{e,n}:=P(1)_{e,n-1}\cdot F_u^{(n+e-1)}(xy)^{1+p+\dots+p^{n-e-1}}x^{p^{n-e}+p^{n-e+1}+\dots+p^{n+e-2}}+P(1)_{e+1,n-1}\cdot F_u^{(n+e)}(xy)^{1+p+\dots+p^{n-e-2}}x^{p^{n-e-1}+p^{n-e}+\dots+p^{n+e-1}}$, which is the sum of the two terms with minimal t-adic valuation listed in Lemma 5.2.6(2). The sum $Y_{e,n}$ is non-zero modulo p by Lemma 5.2.3(2). Furthermore, up to Frobenius twists and multiplication by scalars, the matrix $Y_{e,n} \bmod p$ is independent of n. Therefore, there exists a vector w∈SpanZp{w3,w4} which is independent of n and does not lie in the kernel of $Y_{e,n} \bmod p$. The very rapid decay of w follows from this observation and Lemma 5.2.6(2).
Finally, a valuation-theoretic argument analogous to Case 1 shows that every primitive vector in SpanZp{w1,w2,w} decays rapidly, thereby establishing Proposition 5.1.3 in this case.
∎
Case 3: b=p2e+1a for some e∈Z≥0
We will prove that SpanZp{w3,w4,w} decays rapidly where w is some primitive vector in SpanZp{w1,w2} and that SpanZp{w3,w4} decays very rapidly.
Lemma 5.2.7.
(1)
Among the terms appearing in F∞(2) described in Lemma 5.2.1 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
(2)
Among the terms appearing in F∞(1) described in Lemma 5.2.1 with denominator pn+1, there are exactly two terms with minimal t-adic valuation, and they are
[TABLE]
[TABLE]
Proof.
The proof of this lemma is identical to that of Lemma 5.2.6, so we omit the details.
∎
Analogous to Case 2, Lemma 5.2.4 and Lemma 5.2.7(2) imply the existence of a primitive w∈SpanZp{w1,w2} that decays rapidly; and by Lemma 5.2.4(1) and Lemma 5.2.7(1), SpanZp{w3,w4} decays very rapidly. Finally, a valuation-theoretic argument shows that every primitive vector in SpanZp{w,w3,w4} decays rapidly.
∎
Case 4: $b\neq p^k a$ for all k∈Z≥0
As this is the easiest case, we will be content with merely sketching a proof. Analogous to Lemmas 5.2.6 and 5.2.7, it is easy to see that in this case there are unique terms with minimal t-adic valuations with denominator $p^{n+1}$ occurring in both F∞(1) and F∞(2). It follows that every primitive vector in SpanZp{w1,w2} decays rapidly and every vector in SpanZp{w3,w4} decays very rapidly. Finally, a valuation-theoretic argument similar to Case 1 shows that every vector in the span of w1,w2,w3,w4 decays rapidly, finishing the proof of Proposition 5.1.3.
∎
5.3. Decay in the inert case
Notation as in the proof of Theorem 5.1.2 and §3.2.1.
Recall that P is superspecial. We will show that the Zp-span of w1,w2,w3 decays rapidly, and that the vector w3 decays very rapidly.
The proof goes along the same lines as the proof of the decay lemma for split Hilbert modular varieties, so we will be content with just outlining the salient points.
We first compute $F_\infty=\prod_{i=0}^{\infty}(1+F^{(i)})$, where by (3.2.1), with respect to the basis {w1,w2,w3,w4}, $F=\begin{pmatrix}F_t & F_u\\ F_l & 0\end{pmatrix}$, where
[TABLE]
Recall that the non-ordinary locus is cut out by the equation xy=0 and a=vt(x),b=vt(y)∈Z>0.
Similar to Lemma 5.2.1, it is easy to see that the top-left 2×2 block of F∞ with p-adic valuation −(n+1) has a term of the form FtFt(1)…Ft(n), and this term is the unique term with minimal t-adic valuation (equalling (a+b)(1+p+…pn)). Similarly, the top-right 2×2 block of F∞ with p-adic valuation −(n+1) has a term of the form FtFt(1)…Ft(n−1)Fu(n), and this term is the unique term with minimal t-adic valuation (equaling (a+b)(1+p+…pn−1)+apn).
Arguments identical to Lemma 5.2.3 and Lemma 5.2.4 yield that every primitive vector in the Zp span of w1,w2 (and in the span of w3) decays rapidly (very rapidly, in the case of w3). Further, as the t-adic valuation of FtFt(1)…Ft(m) is different from the t-adic valuation of FtFt(1)…Ft(n−1)Fu(n) for every pair of integers n,m, it follows that SpanZp{w1,w2,w3} also decays rapidly. The argument is elaborated on in the last paragraph of the proof for Case 1 in §5.2.
∎
6. Proof of the decay lemma in the Siegel case
In this section, we prove Proposition 5.1.3 and hence Theorem 5.1.2 (for superspecial points) in the Siegel case. We refer the reader to the appendix for a decay lemma for supergeneric points. The main idea of the proof is similar to that of the Hilbert case in §5.
6.1. Preparation of the proof
We follow the notation in §5, k=Fˉp, W=W(k), K=W[1/p], λ∈Zp2× such that σ(λ)=−λ, and C=Spfk[[t]] a generically ordinary formal curve in Mk which specializes to a superspecial point P. This gives rise to a local ring homomorphism k[[x,y,z]]→k[[t]] and we pick a lift W[[x,y,z]]→W[[t]] (still a ring homomorphism), and we denote by x(t),y(t) and z(t) the images of x,y,z respectively.
Let a,b,c denote the t-adic valuations of x(t),y(t) and z(t) respectively. We adopt the convention that a,b,c may take on the value ∞ if the corresponding power series is 0. As before, vt denotes the t-adic valuation map on K[[t]] or k[[t]].
Also recall that σ denotes both the Frobenius on K and the Frobenius on the coordinate rings W[[x,y,z]] with σ(x)=xp,σ(y)=yp,σ(z)=zp; and for a matrix M with entries in K[[x,y,z]], M(n) denotes σn(M).
The preparatory lemmas in the Siegel case are very similar to those in the split Hilbert case at the beginning of §5.2.
6.1.1.
Notations.
Recall that F∞=i=0∏∞(1+F(i)), where by (3.3.1), with respect to the basis {w1,⋯,w5},
[TABLE]
where ϵ=λ2∈Zp×.
We denote by Ft, Fu, and Fl the top-left 2×2 block, the top-right 2×3 block, and the bottom-left 3×2 block of F respectively.
Define
[TABLE]
Let F∞(1) and F∞(2) denote the top-left 2×2 block and top-right 2×3 of F∞ respectively.
By Corollary 3.4.2, the non-ordinary locus is cut out by the equation xy+z2/(4ϵ)=0.
Let ηtA and μtB denote the leading terms of xy+z2/(4ϵ) and xyp+xpy+z1+p/(2ϵ) respectively. In particular, A=vt(xy+z2/(4ϵ)), and B=vt(xyp+xpy+z1+p/(2ϵ)).
Lemma 6.1.2.
(1)
The part of F∞(1) with p-adic valuation −(n+1) consists of sums of products of the form $\prod_{i=0}^{m_1+2m_2}X_i^{(n_i)}$. Here, $X_i$ is either $F_t$, $F_u$ or $F_l$ (Footnote 27: The terms $X_i$ are chosen so that the product makes sense and has the right size. Note that this implies that $F_u,F_l$ must occur in consecutive pairs.), $m_1+1$ is the number of occurrences of $F_t$, $m_2$ is the number of occurrences of the pair $F_u,F_l$, $m_1+m_2=n$, and $\{n_i\}_{i=0}^{m_1+2m_2}$ is a strictly increasing sequence of non-negative integers. The analogous statement holds for F∞(2) as well.
(2)
Fix values of $m_1,m_2$ as above. Among all the terms in the above sum, the ones with minimal t-adic valuation only occur when $n_i=i$ for all i, and either when $X_0=X_1=\dots=X_{m_1}=F_t$, or $X_0=X_2=\dots=X_{2m_2-2}=F_u$, depending on whether A≥B. The analogous statement holds for F∞(2) as well.
(3)
(for F∞(1)) The product $\prod_{i=0}^{m_1}F_t^{(i)}\prod_{i=0}^{m_2-1}F_u^{(m_1+2i+1)}F_l^{(m_1+2i+2)}$ equals
[TABLE]
(4)
(for F∞(2)) The product $\prod_{i=0}^{m_1}F_t^{(i)}\prod_{i=0}^{m_2-1}F_u^{(m_1+2i+1)}F_l^{(m_1+2i+2)}\cdot F_u^{(m_1+2m_2+1)}$ equals
[TABLE]
6.1.3.
Notation.
Let P(1)m2,n denote the product
i=0∏m1G(i)i=0∏m2−131Hu(m1+2i+1)Hl(m1+2i+2).
The following will play a similar role as Lemma 5.2.3.
Lemma 6.1.4.
The kernel of $P(1)_{g,f+g} \bmod p$ does not contain any non-zero vector defined over Fp. Moreover, if f is odd (resp. even), the kernel of $P(1)_{g,f+g} \bmod p$ does not contain the vector $[\lambda^{-1},\,1]^T$ (resp. $[-\lambda^{-1},\,1]^T$).
Proof.
We prove the assertions by explicit computation as in Lemmas 5.2.3 and 5.2.4. Note that
[TABLE]
Both these matrices satisfy the relation $X^2=-X$ and hence $\prod_{i=0}^{m_2-1}H_u^{(m_1+2i+1)}H_l^{(m_1+2i+2)}$ equals, up to sign, one of these matrices depending on the parity of $m_1$.
Similarly, we have
[TABLE]
Therefore, P(1)g,f+g equals
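$\pm\frac{1}{2}\begin{bmatrix}1 & \lambda^{-1}\\ \lambda & 1\end{bmatrix}$
if f is odd, and equals
$\pm\frac{1}{2}\begin{bmatrix}1 & -\lambda^{-1}\\ \lambda & -1\end{bmatrix}$
if f is even.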
The lemma then follows immediately.
∎
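The kernel statements of Lemma 6.1.4 can be tested over F_{p^2} for small primes. The sketch below is an illustration under explicit assumptions: it takes the two matrices to be [[1, λ^{-1}], [λ, 1]] (f odd) and [[1, −λ^{-1}], [λ, −1]] (f even), ignores the unit factor ±1/2, and models F_{p^2} as F_p[λ]/(λ²−ϵ) with ϵ the smallest quadratic non-residue. All function names are hypothetical.

```python
def find_nonresidue(p):
    """Smallest quadratic non-residue modulo the odd prime p (Euler's criterion)."""
    for eps in range(2, p):
        if pow(eps, (p - 1) // 2, p) == p - 1:
            return eps
    raise ValueError("p must be an odd prime")


def make_field(p):
    """Arithmetic in F_p[lam] / (lam^2 - eps); the pair (u, v) stands for u + v * lam."""
    eps = find_nonresidue(p)

    def add(s, t):
        return ((s[0] + t[0]) % p, (s[1] + t[1]) % p)

    def mul(s, t):
        return ((s[0] * t[0] + eps * s[1] * t[1]) % p, (s[0] * t[1] + s[1] * t[0]) % p)

    def neg(s):
        return ((-s[0]) % p, (-s[1]) % p)

    lam = (0, 1)
    lam_inv = (0, pow(eps, -1, p))  # lam^{-1} = lam / eps
    return add, mul, neg, lam, lam_inv


def apply(matrix, vector, add, mul):
    """Multiply a 2x2 matrix by a column vector over F_{p^2}."""
    return [add(mul(row[0], vector[0]), mul(row[1], vector[1])) for row in matrix]


def kernel_claims_hold(p):
    """Check both assertions of the lemma for the two assumed matrices."""
    add, mul, neg, lam, lam_inv = make_field(p)
    one, zero = (1, 0), (0, 0)
    m_odd = [[one, lam_inv], [lam, one]]
    m_even = [[one, neg(lam_inv)], [lam, neg(one)]]
    cases = ((m_odd, [lam_inv, one]), (m_even, [neg(lam_inv), one]))
    for matrix, excluded in cases:
        # The distinguished vector must not lie in the kernel.
        if apply(matrix, excluded, add, mul) == [zero, zero]:
            return False
        # No non-zero vector with entries in F_p may lie in the kernel.
        for c in range(p):
            for d in range(p):
                vector = [(c, 0), (d, 0)]
                if (c, d) != (0, 0) and apply(matrix, vector, add, mul) == [zero, zero]:
                    return False
    return True


print(all(kernel_claims_hold(p) for p in (3, 5, 7, 11)))
```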
For fixed n, among the terms listed in Lemma 6.1.2 with denominator $p^{n+1}$, the number of terms with equal minimal t-adic valuation depends on certain numerical relations between A and B. We therefore perform the following case-by-case analysis in §§6.2-6.4 to prove the Decay Lemma. The first case, while technically the easiest, already contains the main ideas.
6.2. Case 1: A<B.
Note that if a+b≠2c, or more generally, if the leading terms of xy and z2/(4ϵ) do not cancel, then A<B.
For the ease of exposition, we assume that a≤b≤c. Note that this forces 2a≤A. Even though the statement of Proposition 5.1.3 is not symmetric in a,b,c, an identical argument as the one below suffices to deal with all the other cases.
We will prove that SpanZp{w1,w2,w3} decays rapidly.
For a primitive vector w∈SpanZp{w1,w2,w3},
write w=αuwu+αlw3, where wu is a primitive vector in SpanZp{w1,w2}, and αu,αl∈Zp. Since w is primitive, then either αu or αl is a p-adic unit.
We may assume that αu is a unit – the other case is entirely analogous to this one. Suppose that the p-adic valuation of αl is m≥0.
Consider the terms appearing in F∞(1) described in Lemma 6.1.2 with denominator pn+1. As A<B, the one with minimal t-adic valuation is P(1)0,n(xy+z2/(4ϵ))1+p+…+pn, and this is the unique term with this property. Similarly, consider the terms appearing in F∞(2) with denominator pn+1+m. As A<B, the unique term whose first column has minimal t-adic valuation is P(1)0,n+m−1⋅Fu(n+m)(xy+z2/(4ϵ))1+p+…+pn+m−1.
Let P denote the 2×3 matrix whose first two columns equal $P(1)_{0,n}(xy+z^2/(4\epsilon))^{1+p+\dots+p^n}$ (part of F∞(1)), and whose last column is the first column of $P(1)_{0,n+m-1}\cdot F_u^{(n+m)}(xy+z^2/(4\epsilon))^{1+p+\dots+p^{n+m-1}}$ (part of F∞(2)). Since 1≤a<A, for any m∈Z≥0 we have $A(1+\dots+p^n)\neq A(1+\dots+p^{n+m-1})+ap^{m+n}$. Therefore,
regardless of the value of m, the t-adic valuation of entries of the first two columns of P are different from the t-adic valuation of the last column of P.
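The inequality between the two t-adic valuations just stated can be confirmed numerically. The following sketch is illustrative only; the function names and the sample ranges are assumptions. It compares A(1+…+p^n) with A(1+…+p^{n+m−1})+ap^{n+m} for 1≤a<A.

```python
def geometric_sum(p, top):
    """Return 1 + p + ... + p**top (and 0 when top < 0)."""
    return sum(p**i for i in range(top + 1))


def column_valuations(A, a, p, n, m):
    """Valuations attached to the first two columns and to the last column of P."""
    first_two_columns = A * geometric_sum(p, n)
    last_column = A * geometric_sum(p, n + m - 1) + a * p ** (n + m)
    return first_two_columns, last_column


# Sample ranges (hypothetical), subject to 1 <= a < A.
always_distinct = all(
    column_valuations(A, a, p, n, m)[0] != column_valuations(A, a, p, n, m)[1]
    for p in (3, 5, 7)
    for A in range(2, 15)
    for a in range(1, A)
    for n in range(0, 5)
    for m in range(0, 5)
)
print(always_distinct)
```

For m=0 the two quantities differ by (A−a)p^n, and for m≥1 the second one is strictly larger, which is what the enumeration reflects.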
To prove that w decays rapidly, it suffices to prove that among the monomials in Pw with p-adic valuation equalling −(n+1), there exists a monomial with t-adic valuation ≤A(1+…pn). By the proof of Proposition 5.1.3 in Case 1 in §5.2, this in turn reduces to proving the following statement: if m≥1, then wumodp is not in the kernel of P(1)0,nmodp; and if m=0, the vector [(λ−1)(n)1]modp is not in the kernel of P(1)0,n−1modp. Both statements follow from Lemma 6.1.4, establishing the decay of the rank 3 submodule SpanZp{w1,w2,w3}.
Proposition 5.1.3 in this case follows from the observation that since 2a≤A, then w3 decays very rapidly.
∎
6.3. Case 2: A≥B, a≠b
Note that if A≥B, then a+b=2c (as the only way this can happen is if xy has the same t-adic valuation as z2/(4ϵ)). We may therefore assume without loss of generality that a<b. It follows then that a<c<b. Within this case, we will need to consider the following two subcases.
Subcase (2.1)e: B(1+p2e−1)<A(1+p)<B(1+p2e+1) for some e∈Z≥1
In this subcase, we will prove that SpanZp{w1,w2,wi} decays rapidly, where i∈{3,4,5} will be chosen depending on the values of a,b and c.
The following lemma, in conjunction with Lemma 6.1.4, implies (as in Case 1) that SpanZp{w1,w2} decays rapidly. It can be proved by the same argument as in the proof of Lemma 5.2.6(1), so we omit its proof.
Lemma 6.3.1.
Among the terms appearing in F∞(1) described in Lemma 6.1.2 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
The t-adic valuation of this term is A(1+…+pn−e)+B(pn−e+1+pn−e+3+…+pn+e−1).
The following lemmas will be used to show that one of w3,w4,w5 also decays rapidly. These lemmas imply that among the terms appearing in F∞(2) with denominator pn+1, for at least one of the columns of this matrix, there exists a unique term with minimum t-adic valuation.
Lemma 6.3.2.
Given g∈Z≥1,n∈Z≥0, consider the multiset consisting of numbers of the form A(1+…+pn−f−1)+B(pn−f+pn−f+2+…+pn+f−2)+gpn+f, as f varies over Z∩[0,n]. If the minimal number in this multiset occurs more than once, then it must occur for consecutive values of f.
Proof.
For any choice of f, let us denote the expression by v(f). It suffices to prove the following statement: for f1<f2−1, if v(f1)=v(f2), then v(f2)>v(f2−1).
To that end, suppose that v(f1)=v(f2). Then $A(1+p+\dots+p^{f_2-f_1-1})=B(p^{f_2-f_1}-1)(p^{f_2+f_1}+1)/(p^2-1)+gp^{f_2}(p^{f_2}-p^{f_1})$.
To prove v(f2)>v(f2−1), note that $p^{-(n-f_2)}(v(f_2)-v(f_2-1))=B(p^{2f_2-1}+1)/(p+1)+gp^{2f_2-1}(p-1)-A$. Multiplying this by $(1+p+\dots+p^{f_2-f_1})$ and applying the relation between A and B above, we have
[TABLE]
which is positive since f2>f1+1. The lemma follows.
∎
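The two algebraic identities used in the proof of Lemma 6.3.2 can be verified with exact rational arithmetic. In the sketch below, `v` implements the expression from the statement of the lemma; the function names and the sample ranges are assumptions made for this illustration. Both identities are polynomial in A, B and g, so arbitrary integer values may be used.

```python
from fractions import Fraction


def v(f, A, B, g, p, n):
    """v(f) = A(1+...+p^(n-f-1)) + B(p^(n-f) + p^(n-f+2) + ... + p^(n+f-2)) + g p^(n+f)."""
    a_part = A * sum(p**i for i in range(n - f))
    b_part = B * sum(p ** (n - f + 2 * j) for j in range(f))
    return a_part + b_part + g * p ** (n + f)


def consecutive_identity(f2, A, B, g, p, n):
    """p^-(n-f2) (v(f2) - v(f2-1)) = B(p^(2 f2 - 1) + 1)/(p + 1) + g p^(2 f2 - 1)(p - 1) - A."""
    lhs = Fraction(v(f2, A, B, g, p, n) - v(f2 - 1, A, B, g, p, n), p ** (n - f2))
    rhs = Fraction(B * (p ** (2 * f2 - 1) + 1), p + 1) + g * p ** (2 * f2 - 1) * (p - 1) - A
    return lhs == rhs


def coincidence_identity(f1, f2, A, B, g, p, n):
    """v(f1) = v(f2) is equivalent to the displayed relation between A and B (f1 < f2)."""
    lhs = Fraction(v(f1, A, B, g, p, n) - v(f2, A, B, g, p, n), p ** (n - f2))
    rhs = (
        A * sum(p**i for i in range(f2 - f1))
        - Fraction(B * (p ** (f2 - f1) - 1) * (p ** (f2 + f1) + 1), p * p - 1)
        - g * p**f2 * (p**f2 - p**f1)
    )
    return lhs == rhs


# Sample ranges (hypothetical).
identities_hold = all(
    consecutive_identity(f2, A, B, g, p, n) and coincidence_identity(f1, f2, A, B, g, p, n)
    for p in (3, 5)
    for n in range(1, 5)
    for f2 in range(1, n + 1)
    for f1 in range(0, f2)
    for A in (7, 10)
    for B in (4, 9)
    for g in (1, 2)
)
print(identities_hold)
```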
Lemma 6.3.3.
There are at most two numbers g in the set {a,b,c} such that there exists an integer f (f is allowed to depend on the choice of g) with $A(1+\dots+p^{n-f-1})+B(p^{n-f}+p^{n-f+2}+\dots+p^{n+f-2})+gp^{n+f}=A(1+\dots+p^{n-f})+B(p^{n-f+1}+p^{n-f+3}+\dots+p^{n+f-3})+gp^{n+f-1}$. (Footnote 28: Note that if the equation holds, then f is independent of n, since the equation is actually independent of n; see the proof of Lemma 6.3.2.)
Proof.
Suppose there existed choices of f∈Z≥0 for all three choices of g. Let f1,f2,f3 be the choices for f. Then, by the proof of Lemma 6.3.2, we have that ap2f1−1(p−1)=A−B(1+p2f1−1)/(1+p), and similarly bp2f2−1(p−1)=A−B(1+p2f2−1)/(1+p),cp2f3−1(p−1)=A−B(1+p2f3−1)/(1+p).
Substituting these expressions in the equality a+b=2c yields the equation
[TABLE]
Since A≥B≥p+1, we have A≠B/(p+1) and hence $p^{1-2f_1}+p^{1-2f_2}-2p^{1-2f_3}=0$. Since f1,f2,f3∈Z≥1, we must have f1=f2=f3 and hence a=b=c, which is a contradiction.
∎
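The final step of this proof uses that p^{1−2f1}+p^{1−2f2}=2p^{1−2f3} forces f1=f2=f3. This can be confirmed by a finite search with exact fractions. The search bound `max_f` and the function name are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def solutions(p, max_f):
    """All triples (f1, f2, f3) in [1, max_f]^3 with p^(1-2f1) + p^(1-2f2) = 2 p^(1-2f3)."""
    base = Fraction(p)
    found = []
    for f1, f2, f3 in product(range(1, max_f + 1), repeat=3):
        total = base ** (1 - 2 * f1) + base ** (1 - 2 * f2) - 2 * base ** (1 - 2 * f3)
        if total == 0:
            found.append((f1, f2, f3))
    return found


# Sample search (hypothetical bound): only the diagonal triples should appear.
only_diagonal = all(
    solutions(p, 6) == [(f, f, f) for f in range(1, 7)]
    for p in (3, 5, 7)
)
print(only_diagonal)
```

The underlying reason is that for f1≠f2 the left-hand side is a power of p times p^{2d}+1 with d≥1, which is prime to p and larger than 2.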
Let h∈{a,b,c} be such that there is no f which satisfies the hypothesis of Lemma 6.3.3 (indeed, the lemma guarantees the existence of such an h).
We first show the existence of a rank 3 submodule which decays rapidly.
Without loss of generality, we may assume that $h=a$ and we will prove that $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2,w_3\}$ decays rapidly (if $h=b$ or $c$, the identical proof will show sufficient decay, with $w_4$ or $w_5$ taking the place of $w_3$).
As in Case 1, Lemmas 6.3.1, 6.1.4, 6.3.2 and 6.3.3 imply that $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2\}$ and $\mathrm{Span}_{\mathbb{Z}_p}\{w_3\}$ both decay rapidly. Therefore, it suffices to show that $\alpha_uw_u+\alpha_3w_3$ decays rapidly, where $w_u$ is a primitive vector in the span of $w_1,w_2$, and either $\alpha_u$ or $\alpha_3$ in $\mathbb{Z}_p$ is a $p$-adic unit.
By Lemma 6.3.1, the $t$-adic valuation of the coefficient of $1/p^{n+1}$ of $F_\infty w_u$ is $d(n)=A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-1})$. Similarly, the $t$-adic valuation of the coefficient of $1/p^{m+1}$ of $F_\infty\cdot w_3$ is $c(m)=A(1+\dots+p^{m-f-1})+B(p^{m-f}+p^{m-f+2}+\dots+p^{m+f-2})+ap^{m+f}$ for some $f\in\mathbb{Z}\cap[0,m]$. As in Case 1, it suffices to prove that $d(n)$ is never equal to $c(m)$, regardless of the values of $n$ and $m$.
Let $c(f',m)=A(1+\dots+p^{m-f'-1})+B(p^{m-f'}+p^{m-f'+2}+\dots+p^{m+f'-2})+ap^{m+f'}$, for any value of $f'\leq m$. By the definition of $f$, $c(m)=c(f,m)$, and $f'=f$ minimizes the value of $c(f',m)$.
If $n\geq m$, since $a<A$, then $d(n)>c(e,m)\geq c(f,m)=c(m)$, as required. On the other hand, if $m>n$, we have $c(m)>A(1+\dots+p^{m-f-1})+B(p^{m-f}+p^{m-f+2}+\dots+p^{m+f-2})\geq d(n)$, where the last inequality follows from Lemma 6.3.1.
Finally, we treat the question of very rapid decay. If we may take $h=a$ or $h=c$, the very rapid decay of $w_3$ or $w_5$ is established by the inequality $2a<2c\leq A$. Otherwise, $h$ must be $b$ and for both $a,c$, there exist $f_1,f_3$ satisfying the equation in Lemma 6.3.3. Since $a\neq c$, we have $f_1\neq f_3$ and at least one $f_i\geq 2$. By the proof of Lemma 6.3.3, we have $A-B(1+p^{2f_i-1})/(p+1)>0$ and hence $A\geq 7B>2b$. Thus, $w_4$ decays very rapidly.
∎
Subcase $(2.2)_e$: $A(1+p)=B(1+p^{2e-1})$ for some $e\in\mathbb{Z}_{\geq 1}$
In this subcase, we will prove that $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4,w_5\}$ decays rapidly. We first need the following lemma.
Lemma 6.3.4.
Among the terms appearing in F∞(2) described in Lemma 6.1.2 with denominator pn+1, the unique term with minimal t-adic valuation is
[TABLE]
The $t$-adic valuation of the $i$th column term is $A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-3})+gp^{n+e-1}$, where $g$ is $a$, $b$ or $c$ according to whether $i$ is $1$, $2$ or $3$.
Proof.
It suffices to prove that choice of f=e minimizes the expression A(1+p+…+pn−f)+B(pn−f+1+pn−f+3…+pn+f−3)+gpn+f−1, where f is allowed to range between [math] and n. This can be verified by direct calculation.
∎
It follows from Lemmas 6.3.4 and 6.1.4 that $w_3$, $w_4$ and $w_5$ individually decay rapidly, and that $w_3$ decays very rapidly. In order to show that $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4,w_5\}$ decays rapidly, it suffices to show that the $t$-adic valuations of the coefficients of $1/p^{l+1}$, $1/p^{m+1}$, $1/p^{n+1}$ in $F_\infty(w_3)$, $F_\infty(w_4)$, $F_\infty(w_5)$ are always distinct, regardless of the values of $l,m,n$. By Lemma 6.3.4, these quantities equal $A(1+p+\dots+p^{l-e})+B(p^{l-e+1}+p^{l-e+3}+\dots+p^{l+e-3})+ap^{l+e-1}$, $A(1+p+\dots+p^{m-e})+B(p^{m-e+1}+p^{m-e+3}+\dots+p^{m+e-3})+bp^{m+e-1}$ and $A(1+p+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-3})+cp^{n+e-1}$.
As $a,b,c$ are all strictly less than $B$, these quantities will all be different unless two of $l,m,n$ are equal. In this case, the quantities still differ, because $a,b,c$ are all distinct integers by assumption. Therefore, $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4,w_5\}$ decays rapidly.
∎
6.4. Case 3: A≥B and a=b
In this case, $a=b=c$. We may assume that $x(t)=t^a$, $y(t)=\beta t^a+\sum_{i=a+1}^{\infty}\beta_it^i$, and $z(t)=\gamma t^a+\sum_{i=a+1}^{\infty}\gamma_it^i$. Since $A\geq B$, we have $\beta+\gamma^2/(4\epsilon)=0$. We will break the proof of the Decay Lemma into two subcases; the following lemma will be used in both.
Lemma 6.4.1.
Suppose that $\gamma\in\mathbb{F}_p$. Let $a'>a$ denote the smallest integer such that either $\beta_{a'}\neq 0$ or $\gamma_{a'}\neq 0$. Then both $\beta_{a'}$ and $\gamma_{a'}$ are non-zero and moreover, $B\geq(p-1)a+2a'$.
Proof.
Since γ∈Fp and β+γ2/(4ϵ)=0, then β∈Fp.
Therefore, in k[[t]],
[TABLE]
[TABLE]
If one of $\beta_{a'}$ and $\gamma_{a'}$ were zero, then $A=a'+a$, whereas $B\geq a'+pa$; this contradicts the assumption that $A\geq B$. Hence, we obtain the first assertion of the lemma.
Let $a''\geq a'$ denote the smallest integer $i$ such that $\beta_i+\gamma\gamma_i/(2\epsilon)\neq 0$. Then by applying the Frobenius action, we have $\beta_{a''}^p+\gamma\gamma_{a''}^p/(2\epsilon)\neq 0$, and $B\geq\min\{(p+1)a',a''+pa\}$. If $B\geq(p+1)a'$, then the second assertion of the lemma follows, since $(p+1)a'\geq(p-1)a+2a'$.
Therefore, we assume that $B=a''+pa<(p+1)a'$. The expansion of $xy+z^2/(4\epsilon)$ above has a non-zero term of the form $(\beta_{a''}+\gamma\gamma_{a''}/(2\epsilon))t^{a+a''}$. As $A\geq B$, this term has to be cancelled out by a term of the form $(4\epsilon)^{-1}\sum_{i+j=a+a'',\,i,j\geq a'}\gamma_i\gamma_jt^{i+j}$. Therefore, it follows that $2a'\leq a+a''$ and hence $B=a''+pa\geq(p-1)a+2a'$.
∎
Case $(3.1)_e$: $B(1+p^{2e-1})<(p+1)A<B(1+p^{2e+1})$ for some $e\in\mathbb{Z}_{\geq 1}$
The same argument as in Case 2.1 suffices to prove Proposition 5.1.3, unless $A=B\frac{1+p^{2e-1}}{1+p}+a(p^{2e}-p^{2e-1})$. Therefore, we will assume that this is the case.
Lemma 6.4.2.
Among the terms appearing in F∞(2) described in Lemma 6.1.2 with denominator pn+1, there are exactly two with minimal t-adic valuation. They are:
[TABLE]
[TABLE]
Both terms have $t$-adic valuation $A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-3})+ap^{n+e-1}$.
Proof.
This lemma follows from a similar argument as Lemma 5.2.6(2) and the proofs of Lemmas 6.3.2 and 6.3.3, so we omit the details.
∎
We will show that either $w_3$ or $w_5$ decays very rapidly. There are two terms with minimal $t$-adic valuation as in Lemma 6.4.2, appearing in the coefficient of $1/p^{n+1}$ of $F_\infty(w_3)$ and $F_\infty(w_5)$. A direct computation shows that the sum of these two terms equals
[TABLE]
where
•
u(t) stands for either x(t) or z(t), according to whether we work with w3 or w5,
•
X(t)=pFu⋅Fl(1)⋅pFu(2)⋯Fl(2e−1)⋅[(λ−1)(2e),1]T, and
•
Y(t)=pFt⋅pFu(1)⋅Fl(2)⋯pFu(2e−3)⋅Fl(2e−2)⋅[(λ−1)(2e−1),1]T. The superscript T stands for transpose.
The decay of $w_3$ and $w_5$ is determined by the $t$-adic valuation of the entries of $X(t)u(t)^{p^{2e}}+Y(t)u(t)^{p^{2e-1}}$. For the rest of the proof, it suffices to focus on the second row of $X(t),Y(t)$ and hence we view them as functions. We prove the very rapid decay of $w_3$ or $w_5$ in two cases.
(1)
Both β,γ∈Fp.
In this case, we claim that the $t$-adic valuation of $X(t)u(t)^{p^{2e}}+Y(t)u(t)^{p^{2e-1}}$ is at most $A+B(p+p^3+\dots+p^{2e-3})+a'p^{2e-1}$ for at least one choice of $u(t)$ between $x(t)$ and $z(t)$, where $a'$ is defined in Lemma 6.4.1.
This claim implies that the $t$-adic valuation of the coefficient of $1/p^{n+1}$ of $F_\infty(w_3)$ or $F_\infty(w_5)$ is at most $A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-3})+a'p^{n+e-1}$. This is sufficient to prove the rapid decay of $w_3$ or $w_5$. Indeed, this quantity is strictly less than $A(1+\dots+p^{n-f})+B(p^{n-f+1}+p^{n-f+3}+\dots+p^{n+f-3})+ap^{n+f-1}$ for all values of $f\neq e,e+1$ by Lemma 6.4.1 and hence the sum of the two terms in Lemma 6.4.2 gives the minimal $t$-adic valuation term of the coefficient of $1/p^{n+1}$ in $F_\infty(w_3)$ or $F_\infty(w_5)$. Moreover, the bounds on $a'$ in Lemma 6.4.1 prove that $w_3$ or $w_5$ decays very rapidly.
We now prove the claim by contradiction.
Suppose that $X(t)x(t)^{p^{2e}}+Y(t)x(t)^{p^{2e-1}}$ has $t$-adic valuation greater than $A+B(p+p^3+\dots+p^{2e-3})+a'p^{2e-1}$. Since $z(t)=\gamma x(t)+\gamma_{a'}t^{a'}+\dots$ with $\gamma\in\mathbb{F}_p$, $\gamma_{a'}\neq 0$, and we have assumed that $A=B\frac{1+p^{2e-1}}{1+p}+a(p^{2e}-p^{2e-1})$, it follows that there is a unique monomial in $X(t)z(t)^{p^{2e}}+Y(t)z(t)^{p^{2e-1}}$ with $t$-adic valuation $A+B(p+p^3+\dots+p^{2e-3})+a'p^{2e-1}$, thereby establishing the claim for $u(t)=z(t)$.
(2)
Either $\beta$ or $\gamma$ is not in $\mathbb{F}_p$.
In this case, as $\beta+\gamma^2/(4\epsilon)=0$, we may assume that $\gamma\notin\mathbb{F}_p$. We again consider the function $X(t)u(t)^{p^{2e}}+Y(t)u(t)^{p^{2e-1}}$. Suppose that the leading coefficient of $X(t)$ is $\mu_X$ and that of $Y(t)$ is $\mu_Y$. Then the terms of minimal equal $t$-adic valuation cancel out in the case $u(t)=x(t)$ only if $\mu_X+\mu_Y=0$; otherwise, by the same idea as in (1), $w_3$ decays very rapidly. Therefore, we may assume that $\mu_X+\mu_Y=0$. However, in this case, if we pick $u(t)=z(t)$, then the terms with minimal equal $t$-adic valuation cancel out only if $\mu_X\gamma^{p^{2e}}+\mu_Y\gamma^{p^{2e-1}}=0$, which is not possible as $\gamma^{p^{2e}}\neq\gamma^{p^{2e-1}}$. In other words, in this case $w_5$ decays very rapidly.
As in Case 2.1, $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2\}$ decays rapidly, and also every vector that can be written as $\alpha_uw_u+\alpha_iw_i$ with $\alpha_i\in\mathbb{Z}_p^\times$ ($i=3$ or $5$, depending on whether $w_3$ or $w_5$ decays) decays very rapidly. The latter statement follows by the same valuation-theoretic argument as in the proof of Case 2.1, which also proves that $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2,w_i\}$ decays rapidly.
∎
Case $(3.2)_e$: $A(1+p)=B(1+p^{2e-1})$ for some $e\in\mathbb{Z}_{\geq 1}$
Lemma 6.4.3.
Among the terms appearing in F∞(1) described in Lemma 6.1.2 with denominator pn+1, there are exactly two with minimal t-adic valuation.
They are:
[TABLE]
[TABLE]
Both these terms have $t$-adic valuation $A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-1})$.
As we have seen many lemmas of this flavor, we omit the proof.
This lemma shows that there are two terms with the same $t$-adic valuation, which could therefore lead to cancellation; this phenomenon prevents us from proving that $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2\}$ decays rapidly. Nevertheless, the following lemma shows that there is at least a saturated rank one submodule of $\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2\}$ which decays rapidly.
Lemma 6.4.4.
There is a vector w0 in SpanZp{w1,w2} which decays rapidly.
Proof.
By Lemma 6.4.3 and the proof of Lemma 6.1.4, the coefficient (viewed as a power series in $t$) of the sum of the two terms with minimal $t$-adic valuation among the terms with denominator $p^{n+1}$ is of the form $\mu_1M_1+\mu_2M_2$, for some $p$-adic units $\mu_i$, where $\{M_1,M_2\}=\left\{\begin{pmatrix}1&\lambda\\ \lambda^{-1}&1\end{pmatrix},\begin{pmatrix}1&\lambda\\ -\lambda^{-1}&-1\end{pmatrix}\right\}$.
As $M_1\bmod p$ and $M_2\bmod p$ are not scalar multiples of each other, the linear combination $\mu_1M_1+\mu_2M_2\bmod p$ is non-zero. Therefore, there exists a vector $\bar{w}_0$ defined over $\mathbb{F}_p$ which does not lie in $\ker(\mu_1M_1+\mu_2M_2\bmod p)$. Choosing $w_0\in\mathrm{Span}_{\mathbb{Z}_p}\{w_1,w_2\}$ which lifts $\bar{w}_0$ finishes the proof of this lemma.
∎
We will first prove that there is a rank 2 submodule of $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4,w_5\}$ which decays rapidly. For ease of notation, let $\overline{F_u}$ denote the matrix $\frac{1}{t^a}F_u$ evaluated at $t=0$.
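Let $K$ denote $\ker(P^{(1)}_{n-1,e-1}\overline{F_u}^{(n+e-1)}\bmod p)\cap\mathrm{Span}_{\mathbb{F}_p}\{w_3,w_4,w_5\}$. If $\dim_{\mathbb{F}_p}K\leq 1$, then lifting two linearly independent $\mathbb{F}_p$-vectors not in $K$ gives the desired rank 2 submodule. Therefore, we assume that $\dim_{\mathbb{F}_p}K=2$ (note that $P^{(1)}_{n-1,e-1}\overline{F_u}^{(n+e-1)}\bmod p$ is not the zero matrix, so $\dim_{\mathbb{F}_p}K\neq 3$). It follows that $\beta,\gamma\in\mathbb{F}_p$.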
We will prove that $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4\}$ decays rapidly. First, since $K\cap\mathrm{Span}_{\mathbb{F}_p}\{w_3,w_4\}=\mathrm{Span}_{\mathbb{F}_p}\{\beta w_3-w_4\}$, any primitive vector in $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4\}$ which modulo $p$ is not a multiple of $\beta w_3-w_4$ must decay rapidly. Now we consider $\beta w_3-w_4$. Up to constants, the coefficient of the $1/p^{n+1}$ part of the first entry of $F_\infty(\beta w_3-w_4)$ equals $\beta_{a'}t^{A(1+\dots+p^{n-e})+B(p^{n-e+1}+p^{n-e+3}+\dots+p^{n+e-3})+a'p^{n+e-1}}$.
Lemma 6.4.1 establishes the required decay as follows: firstly, as $a'\leq B\leq A$, the vector $\beta w_3-w_4$ decays rapidly. Secondly, the exact bound for $a'$ in Lemma 6.4.1 implies (as in the proof in Case 2.1) that $\mathrm{Span}_{\mathbb{Z}_p}\{w_3,w_4\}$ decays rapidly. Finally, the very rapid decay of $w_3,w_4$ follows from the bound $2a'\leq B\leq A$.
In this section, we provide the general setup of the proofs of Theorems 1 and 3. As mentioned in §1.3, the proofs consist of the following parts:
(1)
The sum of the local contributions at supersingular points is at most 11/12 of the global contribution; and
(2)
the local contribution from non-supersingular points is of smaller magnitude.
Proposition 7.2.5 makes (1) precise, and is stated in §7.2. We will prove Proposition 7.2.5 and (2) in §8 for the Hilbert case and in §9 for the Siegel case. The idea involved in the statement of Proposition 7.2.5 is that we break the global intersection number C.Z(m) into pieces, one for each non-ordinary point on C, by using the relation between the Hasse invariant and the Hodge line bundle in §7.1. We also relate the local intersection multiplicity at a point to a lattice-point count.
7.1. The global contribution and its decomposition
Recall that in §4.3.3, we list the set T of m∈Z>0 for which we will study C.Z(m) to prove our main theorems. In order to study the asymptotic behavior, we define TM={m∈T∣m≤M} for M∈Z>0. Moreover, in §§8-9, we will construct a subset SM⊂TM which consists of bad values of m that we want to rule out. The total global intersection number that we will consider is ∑m∈TM−SMC.Z(m). We sum over m instead of working with individual m because geometry-of-numbers techniques which we use to bound the local intersection multiplicity (for cumulative m) do not work for individual m. The following lemma gives the asymptotics of the global term using results in §4.
Lemma 7.1.1.
Assume that $\#S_M=O(M^{1-\epsilon})=O((\#T_M)^{1-\epsilon})$ for some $\epsilon>0$ if $L=L_H$ and that $\#S_M=o(\#T_M)$ if $L=L_S$. Then
[TABLE]
Moreover, we have, for Theorem 1(2), $\sum_{m\in T_M-S_M}C.Z(m)\asymp M^2$; for Theorem 1(1) and Remark 2, $\sum_{m\in T_M-S_M}C.Z(m)\asymp M^2/\log M$; for Theorem 3, $\sum_{m\in T_M-S_M}C.Z(m)\asymp M^{5/2}/\log M$.
Proof.
By §4.3.1 and the assumption on $S_M$, we have $\sum_{m\in S_M}|q_L(m)|=o(\sum_{m\in T_M}|q_L(m)|)$. Then the assertions follow from §4.3.1, Lemma 4.3.2, Lemma 4.3.4, and the prime number theorem.
∎
For each non-ordinary point P on C∩Z(m), we introduce the notion of global intersection number gP(m) at P using the following (well-known) relation between the non-ordinary locus and the divisor class of the Hodge bundle. Note that in the proof, we will only use the notion gP(m) for a supersingular point.
Lemma 7.1.2.
The non-ordinary locus in $M_k$ and $M_k^{\mathrm{tor}}$ is cut out by a Hasse invariant $H$, which is a section of $\omega^{p-1}$, and hence the number of non-ordinary points (counted with multiplicity) on $C$ is given by $(p-1)(C.\omega)$.
See for instance [Boxer, §§1.4-1.5, Theorem 6.2.3] for an explanation of this fact (we use the fact that the ordinary Newton stratum coincides with the ordinary Ekedahl–Oort stratum). For the last assertion in the lemma, we remark that when $L=L_H$, the boundary $M_k^{\mathrm{tor}}\setminus M_k$ is ordinary and hence the intersection of $C'$ (in §4.1.3) with the non-ordinary locus is the same as the intersection of $C$ with the non-ordinary locus.
Definition 7.1.3.
Let $t$ be the local coordinate at $P$ (i.e., $C_P=\mathrm{Spf}\,k[[t]]$) and let $A=v_t(H)$. We define $g_P(m)=\frac{A}{p-1}|q_L(m)|$.
Note that by the above lemmas, we have the following decomposition
[TABLE]
7.2. The lattices and the outline of the proof
Let $B\to\mathrm{Spf}\,k[[t]]$ denote the generically ordinary abelian surface given by pulling back the universal family over $M_k$ to $C_P=\mathrm{Spf}\,k[[t]]$ for some point $P\in C$. Recall the notion of special endomorphisms from §2.2; by a slight abuse of terminology, when $L=L_H$, we will also refer to a special quasi-endomorphism with a certain integrality condition in §2.2.11 as a special endomorphism. For any $n\in\mathbb{Z}_{>0}$, the lattice of special endomorphisms of $B\bmod t^n$ is a sublattice of that of $B\bmod t$, which is equipped with a positive definite quadratic form $Q'$ (see 2.3.1).
Lemma 7.2.1.
The local intersection multiplicity of C.Z(m) at P, denoted by lP(m), equals
[TABLE]
The lemma follows directly from the moduli interpretation of $Z(m)$. Note that as $B$ generically has no special endomorphisms, this infinite sum can actually be truncated at some finite stage (which will depend on $m$).
Remark 7.2.2.
Given $B$, the lattices of special endomorphisms of $B\bmod t^n$ have the same rank for all $n\in\mathbb{Z}_{>0}$. Indeed, the work of de Jong, Moonen and Kisin cited in the proof of Theorem 5.1.2 applies to any $P$, and for any special endomorphism $w$ of $B\bmod t$, we have the parallel extension $\tilde{w}\in K[[t]]$, which is invariant under the Frobenius on $L_{\mathrm{cris}}(W[[t]])$. By de Jong’s theory (here we need the full faithfulness of the Dieudonné functor, see [dJ95, Cor. 2.4.9]), whether $w$ extends over $\bmod\ t^n$ depends on the $p$-powers in the denominators of the coefficients of $\tilde{w}$. Therefore, given $n$, there exists $N$ such that $p^Nw$ extends over $\bmod\ t^n$ and hence these lattices tensor $\mathbb{Z}_\ell$, $\ell\neq p$, are all isomorphic; in particular, the rank of the lattices is independent of $n$.
Motivated by the Decay Lemmas Theorems 5.1.2 and A.0.1, we define the following lattices for supersingular points (note that the notation is slightly different from that in the introduction and we will use the notation in this section for the rest of the paper).
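7.2.3.
Assume $P$ is supersingular and recall that $A=v_t(H)$, where $H$ is the Hasse invariant, and that in 5.1.1, $A_n=[A(p^n+p^{n-1}+\dots+1+\frac{1}{p})]$.
• If $B\bmod t$ is superspecial, define $L_{0,1}$, $L_{n,1}$ ($n\in\mathbb{Z}_{>0}$), and $L_{n,2}$ ($n\in\mathbb{Z}_{\geq 0}$) to be the lattices of special endomorphisms of $B$ mod $t$, mod $t^{A_{n-1}+1}$, and mod $t^{A_{n-1}+ap^n+1}$ respectively. As in 2.3.1, we pick a lattice $L'_{n,i}\subset L'$ such that $L_{n,i}\subset L'_{n,i}$ and for $\ell\neq p$, $L'_{n,i}\otimes\mathbb{Z}_\ell=L'\otimes\mathbb{Z}_\ell$ and $L'_{n,i}\otimes\mathbb{Z}_p=L_{n,i}\otimes\mathbb{Z}_p$. In particular, $L'_{0,1}=L'$ and by Theorem 5.1.2, we have $[L'_{n,1}:L'_{n,2}]\geq p$ and $[L':L'_{n,1}]\geq p^{3n}$.
• If $B\bmod t$ is supergeneric, define $L_0$ and $L_n$ ($n\in\mathbb{Z}_{>0}$) to be the lattices of special endomorphisms of $B$ mod $t$ and mod $t^{A_{n-1}+1}$. Again, as in 2.3.1, we pick a lattice $L'_n\subset L'$ such that $L_n\subset L'_n$ and for $\ell\neq p$, $L'_n\otimes\mathbb{Z}_\ell=L'\otimes\mathbb{Z}_\ell$ and $L'_n\otimes\mathbb{Z}_p=L_n\otimes\mathbb{Z}_p$; Theorem A.0.1 implies that $[L':L'_n]\geq p^{3n}$.
Since we assume that $C$ does not admit any global special endomorphisms, we have $\cap_{n=0}^{\infty}L_{n,i}=\{0\}=\cap_{n=0}^{\infty}L_n$. Since, by Remark 7.2.2, the difference between $L'_n,L'_{n,i}$ and $L_n,L_{n,i}$ is the same as that between $L_0,L_{0,i}$ and $L'$, we also have $\cap_{n=0}^{\infty}L'_{n,i}=\{0\}=\cap_{n=0}^{\infty}L'_n$.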
where the last equality follows from the facts that $r_{n,1}(m)\geq r_{n,2}(m)$, $r_{n,2}(m)\geq r_{n+1,2}(m)$ and $a\leq A/2$, $A_n\leq A(p^n+\dots+p^{-1})$. We then obtain the assertion in (1) by rearranging the summations. A similar argument gives (2).
∎
The main task of the next two sections is to prove the following.
Proposition 7.2.5.
Given $C$, there exists $S_M$ satisfying the assumption in Lemma 7.1.1 such that for every supersingular point $P$ on $C$, we have
[TABLE]
Once we have this proposition, we will prove that the local contribution from non-supersingular points has a smaller order of magnitude, whence we conclude that there are infinitely many non-supersingular points on C which lie in the desired special divisors.
7.3. Ordinary points
In order to bound lP(m), we need the following decay lemma for ordinary points, which follows directly from Serre–Tate theory. We thank Keerthi Madapusi Pera for pointing this out to us. Let B→Spfk[[t]] denote the abelian surface with ordinary reduction given by pulling back the universal family over Mk to CP=Spfk[[t]] for an ordinary point P.
Lemma 7.3.1.
If $w$ is not a special endomorphism for the $p$-divisible group $B[p^\infty]\bmod t^{A+1}$, then $pw$ is not a special endomorphism for $B[p^\infty]\bmod t^{pA+1}$.
Proof.
Note that an endomorphism of $B[p^\infty]\bmod t^n$ is special if and only if its reduction on $B[p^\infty]\bmod t$ is special. Hence we only need to consider the deformation of endomorphisms.
Let $\{q_i\}$ ($i=1,2$ for the Hilbert case and $i=1,2,3$ for the Siegel case) denote the $q$-coordinates of the formal group in the Serre–Tate deformation theory. Then, the formal neighborhood of the Shimura variety is given by $\mathrm{Spf}\,W[[t_i]]$, where $t_i=q_i-1$. Note that over $W[[t_i]]$, $s$ is an endomorphism of a point $x=(x_1,x_2,x_3)$ (i.e., $x$ is defined by setting $t_1=x_1$, $t_2=x_2$ and $t_3=x_3$) if and only if all points $y=(y_1,y_2,y_3)$ satisfying the condition $(y_i+1)^p=x_i+1$ (equivalently $(y_i+1)^p-1=x_i$) have the property that $ps$ is an endomorphism of $y$. In other words, if $f(t_i)=f(q_i-1)=0$ is the local equation of the locus on $W[[t_i]]$ such that the endomorphism $w$ deforms, then $f(q_i^p-1)=f((t_i+1)^p-1)=0$ is the local equation for $pw$ to deform. Hence over $k[[t]]$, if $f=0$ is the local equation that deforms $w$, then $f^p=0$ is the local equation for $pw$, and this gives the desired assertion.
∎
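The key step of this proof is that the substitution $t\mapsto(t+1)^p-1$ becomes $t\mapsto t^p$ modulo $p$, so that composing with it agrees with raising to the $p$-th power. The following Python sketch illustrates this in one variable for a polynomial with coefficients in $\mathbb{F}_p$ (an assumption made here for simplicity; over a general $k$ the coefficients would also pick up a Frobenius twist). Polynomials are stored as coefficient lists, lowest degree first, and all helper names are illustrative.

```python
from math import comb


def trim(f):
    """Drop trailing zero coefficients."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_add(f, g, p):
    n = max(len(f), len(g))
    return trim([((f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)) % p
                 for i in range(n)])


def poly_mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)


def poly_pow(f, e, p):
    result = [1]
    for _ in range(e):
        result = poly_mul(result, f, p)
    return result


def compose(f, g, p):
    """f(g(t)) mod p, by Horner's rule."""
    result = []
    for coeff in reversed(f):
        result = poly_add(poly_mul(result, g, p), [coeff % p], p)
    return result


def frobenius_substitution(p):
    """(1 + t)^p - 1 reduced mod p; this should be t^p."""
    return trim([0] + [comb(p, k) % p for k in range(1, p + 1)])
```

For example, with $p=5$ and $f=2+3t^2+t^3$, `compose(f, frobenius_substitution(5), 5)` and `poly_pow(f, 5, 5)` return the same coefficient list.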
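Lemma 7.3.2.
Let $L_0$ and $L_n$, $n\in\mathbb{Z}_{>0}$, be the lattices of special endomorphisms of $B\bmod t$ and $B\bmod t^{Ap^{n-1}+1}$ respectively, where $A\in\mathbb{Z}_{>0}$. Then
(1)
for any $A$, we have $\mathrm{rk}_{\mathbb{Z}}L_n\leq 2$ if $L=L_H$ and $\mathrm{rk}_{\mathbb{Z}}L_n\leq 3$ if $L=L_S$;
(2)
there exist a constant $A$ and a $\mathbb{Z}_p$-lattice $\Lambda$ (depending on $P$) with $\mathrm{rk}_{\mathbb{Z}_p}\Lambda\leq 1$ when $L=L_H$ and $\mathrm{rk}_{\mathbb{Z}_p}\Lambda\leq 2$ when $L=L_S$ such that $L_n\subset(\Lambda+p^{n-1}L_1\otimes\mathbb{Z}_p)\cap L_0$.
In particular, if $\mathrm{rk}_{\mathbb{Z}}L_n=3$ when $L=L_S$ or $\mathrm{rk}_{\mathbb{Z}}L_n=2$ when $L=L_H$, then $(\mathrm{disc}\,L_n)^{1/2}\geq p^{n-1}$.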
Proof.
Note that $L_n\subset L_n\otimes\mathbb{Z}_p\subset L_0\otimes\mathbb{Z}_p=L_{\mathrm{cris},P}(W)^{\varphi=1}$, where $L_{\mathrm{cris},P}$ is the fiber of the $F$-crystal $L_{\mathrm{cris}}$ defined in 2.2.3 and 2.2.9 and $\varphi$ is the Frobenius action. Since $P$ is ordinary, $\varphi$ acts on $L_{\mathrm{cris},P}(W)$ with slopes $-1,1,0,0$ (Hilbert case) or $-1,1,0,0,0$ (Siegel case) and hence (1) follows.
Let $\Lambda'$ be the $\mathbb{Z}_p$-lattice of special endomorphisms of $B[p^\infty]$.
Since $C_P$ is not contained in any special divisor (this is the assumption of Theorem 1(1) and Theorem 3; for Theorem 1(2), we may assume this as otherwise the conclusion is automatic), $B[p^\infty]$ admits at most a rank 2 (resp. rank 1) module of special endomorphisms when $L=L_S$ (resp. $L=L_H$); indeed, if $\mathrm{rk}_{\mathbb{Z}_p}\Lambda'=3$ (resp. 2), then $\Lambda'\otimes\mathbb{Q}_p=L_0\otimes\mathbb{Q}_p$, and thus $B$ admits special endomorphisms.
We now mimic the proof of [Ananth17, Thm. 4.1.1] using Lemma 7.3.1 instead of [Ananth17, Lem. 4.1.2(2)].
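Let $\Lambda\subset L_0\otimes\mathbb{Z}_p$ be the saturation of $\Lambda'$ in $L_0\otimes\mathbb{Z}_p$; then there exists $\Lambda_0\subset L_0\otimes\mathbb{Z}_p$ such that $L_0\otimes\mathbb{Z}_p=\Lambda\oplus\Lambda_0$. Let $\Lambda_n$ denote $(L_n\otimes\mathbb{Z}_p+\Lambda)\cap\Lambda_0$; then
$L_n\otimes\mathbb{Z}_p+\Lambda=\Lambda\oplus\Lambda_n$.
It suffices to show that there exists $A$ such that $\Lambda_n\subset p\Lambda_{n-1}$ (and this implies that $\Lambda_n\subset p^{n-1}\Lambda_1$).
By definition, none of the elements in $\Lambda_0$ extend to $\mathrm{Spf}\,k[[t]]$, so there exists $A$ such that $\Lambda_1\subset p\Lambda_0$. For $n\geq 2$, assume for contradiction that there exists $\alpha\in\Lambda_n\setminus p\Lambda_{n-1}$. If $\alpha\in p\Lambda_{n-2}$, then write $\alpha=p\beta$ with $\beta\in\Lambda_{n-2}$. Since $p\beta=\alpha\in\Lambda_n$, then by Lemma 7.3.1, $\beta\in\Lambda_{n-1}$, which contradicts the assumption that $\alpha\notin p\Lambda_{n-1}$. Thus we have $\alpha\notin p\Lambda_{n-2}$; by iterating the argument, we have $\alpha\notin p\Lambda_0$. This is a contradiction since $\alpha\in\Lambda_n\subset\Lambda_1\subset p\Lambda_0$.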
∎
In this section, we use the results proved in §§4-5 to prove Proposition 7.2.5 in the case of Hilbert modular surfaces. This, in conjunction with Lemma 8.1.2, yields Theorem 1(2).
8.1. The bad set SM and the local intersection multiplicities at non-supersingular points
We first construct the set SM; the following lemma only concerns ordinary and superspecial points because we only need to consider such P for the proof of Theorem 1(2). Indeed, if P∈Z(m), then P is either ordinary or supersingular and if P∈Z(m),p∤m, then by §4.4.2(1), P is not supergeneric. Therefore for P∈Z(m),m∈T, P is either superspecial or ordinary.
Lemma 8.1.1.
Notation as in §7.1, 7.2.3 and Lemma 7.3.2. Given a finite set $\{P_i\}\subset(C\cap(\cup_{m\in\mathbb{Z}_{>0}}Z(m)))(k)$,
there exists $S_M\subset T_M$ with $\#S_M=O(M^{1-\epsilon})$ for some $0<\epsilon<1/6$ such that for all $i$,
(1)
if $P_i$ is superspecial, then $\{s\in L'_{N,1}\mid 0\neq Q'(s)\leq M,\ Q'(s)\notin S_M\}=\emptyset$ where $N=\frac{1+\epsilon}{3}\log_pM$;
(2)
if $P_i$ is ordinary, then $\{s\in L_N\mid 0\neq Q'(s)\leq M,\ Q'(s)\notin S_M\}=\emptyset$ where $N=\epsilon\log_pM$.
Proof.
Since the union of finitely many sets of cardinality $O(M^{1-\epsilon})$ still has cardinality $O(M^{1-\epsilon})$, it suffices to prove the assertion for each $P_i$ separately. We follow the idea of the proof of [Ananth17, Thm. 4.3.3].
If $P_i$ is superspecial, we take $S_M=\{m\in T_M\mid\exists s\in L'_{N,1}\text{ with }Q'(s)=m\}$, which satisfies (1) by definition. Note that $\#S_M\leq\#\{s\in L'_{N,1}\mid Q'(s)\leq M\}$. Then by a geometry-of-numbers argument (see for instance [Ananth17, Lem. 4.2.1]) and Theorem 5.1.2, we have
[TABLE]
where $d_N$ is the first successive minimum of $L'_{N,1}$ and $d_N\to\infty$ as $N\to\infty$ because $\cap L'_{N,1}=\{0\}$. Then $\#S_M=O(M^{1-\epsilon})$ by the definition of $N$.
If $P_i$ is ordinary, then $\mathrm{rk}\,L_N=2$ by Lemma 7.3.2 and the fact that $\mathrm{rk}\,L_N=\mathrm{rk}\,L_0$ is even by the Tate conjecture. Similar to the superspecial case, we take $S_M=\{m\in T_M\mid\exists s\in L_N\text{ with }Q'(s)=m\}$ and then by Lemma 7.3.2, $\#S_M=O(M/p^N+M^{1/2}/d_N)=O(M^{1-\epsilon})$.
∎
Lemma 8.1.2.
Notation as in Lemma 8.1.1.
For an ordinary point P=Pi∈C(k), we have
where $r_n(m)=\#\{s\in L_n\mid Q'(s)=m\}$. By a geometry-of-numbers argument and Lemma 7.3.2, we have $\sum_{m=1}^{M}r_n(m)=O(M/p^n+M^{1/2}/d_n)$, where $d_n$ is the first successive minimum of $L_n$ and the implicit constant here only depends on $p$. Thus $\sum_{m\in T_M-S_M}l_P(m)=O(NM+p^NM^{1/2})=O(M^{1+\epsilon})$.
∎
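The shape of the geometry-of-numbers bound used here, a main term of size $M$ divided by the root discriminant plus an error of size $M^{1/2}$ divided by the first successive minimum, can be seen on a toy rank 2 lattice. The Python sketch below counts nonzero vectors of the diagonal form $d^2x^2+(D/d)^2y^2$ of norm at most $M$; this diagonal form, with $d$ playing the role of $d_n$ and $D$ the role of the root discriminant, is a simplifying assumption made for illustration and is not one of the lattices $L_n$ of the paper.

```python
from math import isqrt, pi, sqrt


def count_nonzero(M, d, D):
    """Number of (x, y) != (0, 0) with d^2 x^2 + (D/d)^2 y^2 <= M.

    Assumes d divides D and d*d <= D, so that d is the first successive
    minimum and D is the square root of the discriminant of the toy form.
    """
    e = D // d
    total = 0
    y = 0
    while e * e * y * y <= M:
        rem = M - e * e * y * y
        xs = 2 * isqrt(rem // (d * d)) + 1      # integers x with d^2 x^2 <= rem
        total += xs if y == 0 else 2 * xs       # account for +y and -y
        y += 1
    return total - 1                            # remove the zero vector


def toy_bound(M, d, D):
    """pi*M/D + 4*sqrt(M)/d, an elementary upper bound for count_nonzero."""
    return pi * M / D + 4 * sqrt(M) / d
```

Comparing the sum over integer $y$ with the area of the ellipse gives `count_nonzero(M, d, D) <= toy_bound(M, d, D)` whenever $d\mid D$ and $d^2\leq D$, which mirrors the two terms $M/p^n$ and $M^{1/2}/d_n$ above.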
We follow the notation in Lemma 8.1.1 and take $P=P_i$ superspecial. We break $\sum_{m\in T_M-S_M}l_P(m)$ into two parts, which are treated in the following lemmas.
Lemma 8.2.1.
Notation as in Corollary 7.2.4.
For any $\epsilon>0$, there exists $c\in\mathbb{Z}_{>0}$ which only depends on $P$ and $\epsilon$ such that
[TABLE]
Proof.
By Lemma 8.1.1, $r_{n,i}(m)=0$ for $n>N=\frac{1+\epsilon}{3}\log_pM$ and hence
[TABLE]
since $r_{n,1}(m)\geq r_{n,2}(m)$.
By a geometry-of-numbers argument, $\sum_{m=1}^{M}r_{n,1}(m)\leq c_2(M^2/p^{3n}+M^{3/2}/p^{2n}+M/p^n+M^{1/2}/d_n)$, where $c_2$ is an absolute constant and $d_n$ is the first successive minimum of $L'_{n,1}$. Hence
[TABLE]
Note that Ac2∑n=cN≤Ac2(p2c(1−p−2))−1, which goes to 0 as c→∞ and the second term is
Let θn,i denote the theta series attached to the lattice Ln,i′. We decompose θn,i=En,i+Gn,i, where Gn,i is a cusp form and En,i is an Eisenstein series as in §4.2 and follow the proof of Lemma 4.3.2.
Let $E=\frac{A(p+2)}{2p}E_{0,1}+\frac{A}{2}E_{0,2}+\sum_{n=1}^{c}\frac{Ap^n}{2}(E_{n,1}+E_{n,2})$ and $G=\frac{A(p+2)}{2p}G_{0,1}+\frac{A}{2}G_{0,2}+\sum_{n=1}^{c}\frac{Ap^n}{2}(G_{n,1}+G_{n,2})$.
Note that $G$ is a weight 2 cusp form and by Deligne’s Weil bound, its $m$-th Fourier coefficient satisfies $q_G(m)=O(m^{1/2+\epsilon})$. Hence the total contribution from the cusp form $G$ is $\sum_{m\in T_M-S_M}q_G(m)=O(M^{3/2+\epsilon})$.
Let $q_{n,i}(m)$ and $q(m)$ denote the $m$-th Fourier coefficients of $E_{n,i}$ and $E$. Recall that $p\nmid m$ for $m\in T_M$; by Lemma 4.4.6 and the fact that $|L'^{\vee}/L'|=p^2$, we have for any $n,i$
[TABLE]
Recall from §7.2.3 that $[L':L'_{n,1}]\geq p^{3n}$ and $[L':L'_{n,2}]\geq p^{3n+1}$; therefore,
[TABLE]
Take $\alpha=\frac{p+2}{2p}+\frac{p}{p^2-1}$, which is $<11/12$ when $p\geq 5$. The left hand side then equals
[TABLE]
which gives the desired estimate by the definition of gP(m).
∎
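The inequality $\alpha<11/12$ for $p\geq 5$ is tight at $p=5$, so it is worth checking in exact arithmetic. The short Python sketch below does this with rational numbers; it reads $\alpha$ as $\frac{p+2}{2p}+\frac{p}{p^2-1}$, and the function name is illustrative.

```python
from fractions import Fraction


def alpha(p):
    """alpha = (p+2)/(2p) + p/(p^2 - 1), as an exact rational number."""
    return Fraction(p + 2, 2 * p) + Fraction(p, p * p - 1)


def alpha_is_small(p):
    """True when alpha(p) < 11/12."""
    return alpha(p) < Fraction(11, 12)
```

At $p=5$ one gets $\alpha=109/120$, just below $11/12=110/120$, while at $p=3$ one gets $\alpha=29/24>11/12$; both summands decrease in $p$, so the inequality persists for all larger primes.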
The set $S_M$ is constructed by Lemma 8.1.1, taking $\{P_i\}$ to contain all of the (finitely many) supersingular points in $C\cap(\cup_{p\nmid m}Z(m))$. Then the desired estimate follows from Lemma 8.2.1 and Lemma 8.2.2 by taking $c$ such that $\epsilon<\frac{11}{12}-\alpha$.
∎
If C is contained in Z(m) with m being a perfect square, then by applying suitable Hecke translates, we may assume that C is contained in the product of modular curves and then the assertion is a special case of [CO06, Proposition 7.3]. Now for the rest of the proof, we may assume that C is contained in some Hilbert modular surface and we will use Z(m) to denote special divisors on the Hilbert modular surface.
Note that any point on Z(m) corresponds to an abelian surface isogenous to the self-product of an elliptic curve.
Thus we assume for contradiction that there are only finitely many points on C∩(∪m∈TZ(m)) and take {Pi} to be this finite set and apply Lemma 8.1.1 to construct SM. Since all Z(m) are compact, it makes sense to consider C.Z(m). We deduce a contradiction by Lemma 7.1.1, Proposition 7.2.5, and Lemma 8.1.2.
∎
In this section, we prove all of Theorem 1 and Theorem 3. §9.1 consists of results pertaining to squares represented by positive definite quadratic forms. (Recall that we must prove that our curve intersects special divisors of the form $Z(D\ell^2)$ at infinitely many points. This involves dealing with squares represented by quadratic forms, and hence the geometry-of-numbers arguments are more involved than in the Hilbert case.) In §9.2, we prove Proposition 7.2.5 by combining results proved in §§4, 6, and 9.1. Finally, we deal with the intersection multiplicities at non-supersingular points in §9.3 to finish the proof of the main theorem.
We now set up notation that we will use for §9. For supersingular points $P$, recall that we defined $L'_{n,i}$ for superspecial points and $L'_n$ for supergeneric points in §7.2.3.
Let $l(n)_i$, $i=1,\dots,5$, denote the $i$th successive minimum of the quadratic form $Q'$ restricted to $L'_{n,1}$ or $L'_n$.
Let $P_n$ denote a rank two sublattice of $L'_{n,1}$ or $L'_n$ with minimal discriminant.
Note that $l(n)_1l(n)_2\asymp d_n$, where $d_n$ denotes the root discriminant of $P_n$. Moreover, since $\cap_{n=0}^{\infty}L'_{n,i}=\{0\}=\cap_{n=0}^{\infty}L'_n$, we have $l(n)_1\to\infty$ as $n\to\infty$.
9.1. Preparation
We need the following results to prove Proposition 7.2.5. Although Lemma 9.1.2 is stated for the rank 5 lattices Ln,1′,Ln′, the proof does not use the assumption on rank and hence it holds for the lattices Ln for ordinary points (notation as in Lemma 7.3.2) when rkZL0=3; see §9.3 for details.
Lemma 9.1.1.
We have $l(n)_1l(n)_2\cdots l(n)_i\gg p^{(i-2)n}$ for $i\geq 3$.
Proof.
Note that if we have two lattices $L_1\supset L_2$, then the successive minima of $L_2$ give upper bounds for those of $L_1$. Thus we may enlarge $L'_{n,i},L'_n$ and prove the assertion for the enlarged lattices.
We enlarge $L'_{n,i},L'_n$ as follows. For $\ell\neq p$, we still require $L'_{n,i}\otimes\mathbb{Z}_\ell=L'\otimes\mathbb{Z}_\ell$ and $L'_n\otimes\mathbb{Z}_\ell=L'\otimes\mathbb{Z}_\ell$; at $p$, let $\Lambda_0$ denote the rank 3 submodule of $L'\otimes\mathbb{Z}_p$ which decays rapidly in the Decay Lemmas (Theorems 5.1.2 and A.0.1), then we enlarge $L'_{n,1},L'_n$ such that $L'_{n,1}\otimes\mathbb{Z}_p=p^n\Lambda_0+L'\otimes\mathbb{Z}_p$ and $L'_n\otimes\mathbb{Z}_p=p^n\Lambda_0+L'\otimes\mathbb{Z}_p$.
For the enlarged $L'_{n,1},L'_n$, we have
[TABLE]
where the implied constants only depend on the lattice L′. Thus the assertion follows.
∎
Lemma 9.1.2.
Suppose that $d_n^2M=o(p^{2n})$ as $n\to\infty$. Then for any vector $v\in L'_{n,1}$ or $L'_n$ such that $Q'(v)\leq M$, we have that $v\in P_n$ for $n\gg 1$. In particular, if $d_n\leq p^{n/2}$, then for any vector $v\in L'_{n,1}$ or $L'_n$ such that $Q'(v)<p^{n-\epsilon}$ for some absolute constant $\epsilon>0$, we have that $v\in P_n$ for $n\gg 1$. (All the implicit constants here are independent of $n,M$.)
Proof.
Recall that $l(n)_1\cdot l(n)_2\asymp d_n$.
Thus, by Lemma 9.1.1, we have
[TABLE]
In other words, for any vector $v$ linearly independent of $P_n$, we have $Q'(v)\geq l(n)_3^2\gg p^{2n}/d_n^2$. Then the first assertion follows. The second assertion follows directly from the first assertion by taking $M=p^{n-\epsilon}$.
∎
Proposition 9.1.3.
Fix $D\in\mathbb{Z}_{>0}$. Recall $r_{n,i}(m),r_n(m)$ from Corollary 7.2.4. Then we have the following two bounds (here we state the results for $r_n(m)$; the same results also hold for $r_{n,1}(m)$):
(1)
$\sum_{m=D\ell^{2},\,m\leq M,\,\ell\text{ prime}}r_{n}(m)=O_{\epsilon}\Big(\frac{M^{2+\epsilon}}{p^{2n}}+\frac{M^{3/2+\epsilon}}{p^{n}}+M^{1+\epsilon}\Big)$.
(2)
$\sum_{m=D\ell^{2},\,m\leq M,\,\ell\text{ prime}}r_{n}(m)$ and $\sum_{\ell\leq M,\,\ell\text{ prime}}r_{n}(\ell)$ are both $O\Big(\frac{M^{5/2}}{p^{3n}}+\frac{M^{2}}{p^{2n}}+\frac{M^{3/2}}{p^{n}}+\frac{M}{d_{n}}+\frac{M^{1/2}}{l(n)_{1}}\Big)$.
Proof.
In the proof, we work with $L'_n$ and everything holds true for $L'_{n,1}$, too.
We note that (2) is a trivial upper bound from a geometry-of-numbers argument. Indeed, both $\sum_{m=D\ell^{2},\,m\leq M,\,\ell\text{ prime}}r_{n}(m)$ and $\sum_{\ell\leq M,\,\ell\text{ prime}}r_{n}(\ell)$ are no greater than $\sum_{m=1}^{M}r_n(m)$; we then obtain the desired bound by [Ananth17, Lem. 4.2.1] and Lemma 9.1.1.
Now we prove (1). We may assume that there exists a vector v0∈L0′ such that Q′(v0)=Dℓ02 for some prime ℓ0. Otherwise rn(m)=0 for all m=Dℓ2 for any prime ℓ. Let e1 denote a primitive vector in Ln′ such that e1=pkv0 for some k∈Z≥0. By definition, pnL0′⊂Ln′ and thus pnv0∈Ln′. Therefore k≤n. Since e1 is primitive in Ln′, we extend it into a basis {e1,e2,…e5} of Ln′. Let L′n denote the sublattice of L0′ spanned by f1:=v0=e1/pk,e2,…e5; since L′n is a sublattice of L0′, then Q′∣L′n is still Z-valued. Since Q′(v0)=Dℓ02=:N, then there exist vectors f2…f5∈(2N)−1L′n such that [fi,f1]′=0 for i≥2, and SpanZ{f1,f2,f3,f4,f5}⊃L′n.
Let Q′ denote the restriction of Q′⊗Q to SpanZ{f2,f3,f4,f5}⊂L0′⊗Q. By the definition of fi, we have Q′ is a (2N)−2Z-valued quadratic form. Let l(n)1,…,l(n)4 denote the successive minima of SpanZ{f2,f3,f4,f5}. Since (2N)−1L′n=(2N)−1Ln′+(2N)−1p−kZe1, then l(n)1⋯l(n)i≫p(i−1)n−k≥p(i−2)n for i≥2 (note that k≤n and N is absorbed in the implicit constant as N is independent of n,k). Then the standard geometry-of-numbers argument gives
[TABLE]
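On the other hand, on $\mathrm{Span}_{\mathbb{Z}}\{f_1,f_2,f_3,f_4,f_5\}$, for $v=xf_1+y_2f_2+\dots+y_5f_5$, we have $Q'(v)=D\ell_0^2x^2+Q'(v_y)$, where $v_y=y_2f_2+\dots+y_5f_5$. If $Q'(v)=D\ell^2\leq M$, then $Q'(v_y)\leq Q'(v)\leq M$ and $Q'(v_y)=D(\ell-\ell_0x)(\ell+\ell_0x)$. For a given $v_y$ with $Q'(v_y)\leq M$, there are at most $O_\epsilon(M^\epsilon)$ ways to factor $Q'(v_y)/D$ into two factors and thus there are at most $O_\epsilon(M^\epsilon)$ possible $x$ such that for $v=xf_1+v_y$, we have $Q'(v)=D\ell^2\leq M$ for some prime $\ell$. Since $L'_n\subset\mathrm{Span}_{\mathbb{Z}}\{f_1,f_2,f_3,f_4,f_5\}$, we get $\sum_{m=D\ell^{2},\,m\leq M,\,\ell\text{ prime}}r_{n}(m)=O_\epsilon(M^\epsilon Y_n)$, which gives (1) by the above bound for $Y_n$.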
∎
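The factoring step in this proof can be made concrete as follows. Given a positive integer $T$ standing in for $Q'(v_y)/D$ and the fixed prime $\ell_0$, the Python sketch below lists all pairs $(\ell,x)$ with $\ell$ prime, $x\geq 0$ and $(\ell-\ell_0x)(\ell+\ell_0x)=T$ by running over divisor pairs of $T$. Treating $T$ as an integer is a simplifying assumption (in the proof the restricted form takes values in $(2N)^{-2}\mathbb{Z}$), and the function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small inputs."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_representations(T, l0):
    """All (l, x) with l prime, x >= 0 and (l - l0*x) * (l + l0*x) == T.

    Each solution comes from a factorisation T = u * w with u <= w,
    l = (u + w) / 2 and l0 * x = (w - u) / 2, so the number of solutions
    is at most the number of divisors of T.
    """
    assert T >= 1 and l0 >= 1
    found = []
    u = 1
    while u * u <= T:
        if T % u == 0:
            w = T // u
            if (u + w) % 2 == 0:
                l, half_gap = (u + w) // 2, (w - u) // 2
                if half_gap % l0 == 0 and is_prime(l):
                    found.append((l, half_gap // l0))
        u += 1
    return found
```

For instance, `prime_representations(16, 3)` returns `[(5, 1)]`, reflecting $16=(5-3)(5+3)$; since the number of divisors of $T$ is $O_\epsilon(T^\epsilon)$, this is the source of the factor $M^\epsilon$ in (1).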
Proposition 9.1.4.
Fix D∈Z>0.
The proportion of primes ℓ≤(M/D)1/2 such that Dℓ2 is represented by the quadratic form restricted to Pn goes to zero as n→∞.
Proof.
Let $R_n$ denote the imaginary quadratic ring with discriminant $-d_n^2$.
The class group of $R_n$ is in bijection with equivalence classes of binary quadratic forms of discriminant $-d_n^2$. Let $\mathfrak{a}$ denote the ideal corresponding to $Q'$ restricted to $P_n$. Recall that $l(n)_1\to\infty$ as $n\to\infty$. Thus for $n\gg 1$, we have that $\mathfrak{a}$ is not equivalent to any ideal whose norm is $D$, i.e., $(P_n,Q')$ does not represent $D$. Note that it suffices to deal with primes $\ell$ which are relatively prime to $Dd_n^2$.
The correspondence between ideal classes and binary quadratic forms yields that the integer $D\ell^2$ is represented by $(P_n,Q')$ if and only if there exists an invertible ideal $\mathfrak{b}$ equivalent to $\mathfrak{a}$ with $\mathrm{Nm}\,\mathfrak{b}=D\ell^2$.
This implies that $\ell=\mathfrak{c}_1\mathfrak{c}_2$ (i.e. the prime $\ell$ splits in $R_n$), and that $\mathfrak{b}=\mathfrak{d}\mathfrak{c}_1^2$ or $\mathfrak{b}=\mathfrak{d}\mathfrak{c}_2^2$, where $\mathfrak{d}$ is some ideal such that $\mathrm{Nm}\,\mathfrak{d}=D$ (the case $\mathfrak{b}=\mathfrak{c}_1\mathfrak{c}_2$ is ruled out by the above discussion that $\mathfrak{a}$ and therefore $\mathfrak{b}$ is not equivalent to any ideal whose norm is $D$). In other words, $Q'$ restricted to $P_n$ represents $D\ell^2$ if and only if there exist some ideals $\mathfrak{c},\mathfrak{d}$ such that $\mathrm{Nm}\,\mathfrak{c}=\ell$, $\mathrm{Nm}\,\mathfrak{d}=D$ and $\mathfrak{c}^2\mathfrak{d}$ is equivalent to $\mathfrak{a}$.
Let $\mathcal{C}$ denote the equivalence classes of ideals $\mathfrak{c}$ such that $\mathfrak{c}^2$ is equivalent to $\mathfrak{a}\mathfrak{d}^{-1}$ for some $\mathfrak{d}$ with $\mathrm{Nm}\,\mathfrak{d}=D$. Since $D$ is fixed, $\mathcal{C}$ is a finite (independent of $n$) union of torsors for the 2-torsion of the class group of $R_n$, when $\mathcal{C}$ is nonempty. By genus theory, the cardinality of the 2-torsion of the class group of $R_n$ is bounded by a power of 2 whose exponent is the number of prime divisors of $d_n^2$; thus, $\#\mathcal{C}=O_\epsilon(d_n^\epsilon)$.
We finish the proof in two cases.
(1)
If dn≤(logM)2, it follows by [TZ] that the proportion of primes represented by the quadratic form associated to any ideal class c is 1/dn because dn≍ the class number of Rn. Thus the total proportion of ℓ such that Dℓ2 is representable is #C/dn=Oϵ(dnϵ−1), which goes to [math] as dn→∞.
(2)
If dn≥(logM)2, let fc denote the binary quadratic form associated to c. Then as in the proof of [Ananth17, Claim 3.1.9], we have
[TABLE]
Thus by the above discussion,
[TABLE]
which finishes the proof.∎
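The correspondence between ideal classes and binary quadratic forms used in the proof above can be explored numerically. The sketch below is an illustration under assumptions rather than part of the argument: it enumerates reduced primitive positive definite forms of an arbitrary negative discriminant (not specifically the discriminant of Rn) and counts the classes of order at most 2, using the standard fact that a reduced form (a, b, c) has order at most 2 exactly when b = 0, a = b, or a = c. The function names are hypothetical.

```python
from math import gcd

def reduced_forms(disc):
    """Primitive reduced forms (a, b, c) with b*b - 4*a*c == disc < 0.

    Reduced means -a < b <= a <= c, with b >= 0 whenever a == c.
    """
    if disc >= 0 or disc % 4 not in (0, 1):
        raise ValueError("disc must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -disc:      # reduced forms satisfy 3*a^2 <= |disc|
        for b in range(-a + 1, a + 1):
            if (b * b - disc) % (4 * a):
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def class_number(disc):
    """Number of classes of primitive forms of discriminant disc."""
    return len(reduced_forms(disc))

def two_torsion_count(disc):
    """Number of classes of order at most 2 (ambiguous reduced forms)."""
    return sum(1 for (a, b, c) in reduced_forms(disc)
               if b == 0 or a == b or a == c)

print(class_number(-84), two_torsion_count(-84))
```

For discriminant −84, which has three prime divisors, all four classes are 2-torsion, while for −23 only the principal class is; this is the kind of dependence on the number of prime divisors that genus theory controls.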
The following result gives a bound of Fourier coefficients of the cuspidal part of our theta series in terms of the discriminant of the quadratic lattice.
Proposition 9.1.5 (Duke, Waibel).
Let S be a fixed finite set of primes.
Let θ be the theta series attached to a positive definite quadratic lattice of rank 5 with discriminant Dθ such that all prime factors of Dθ lie in S. Write θ=E+G, where E is an Eisenstein series and G is a cusp form. Then there exist absolutely bounded positive constants N0 and C such that for all m∈T (the set T defined in §4.3.3), the m-th Fourier coefficient qG(m) of G satisfies $q_G(m)\le C D_\theta^{N_0} m^{1+1/4}$.
By Remark 7.2.2, we have that discLn,i′,discLn′ are independent of n,i away from p and hence all the theta series attached to these lattices satisfy the assumption on Dθ.
A result analogous to Proposition 9.1.5 was proved by Duke in the case of ternary quadratic forms. The main steps of his proof carry through in this case too, so we will be content with just sketching his proof.
Proof.
The proof of [Duke, Lem. 1] and the discussion on [Duke, p.40] apply to rank 5 quadratic forms (with suitable modification of the power of Dθ),
and we have that the Petersson norm of G satisfies ∣∣G∣∣=O(DθN1) for some absolute constant N1 (here we use the fact that the level Nθ of G is O(Dθ)).
Thus to obtain a bound for qG(m), we only need to bound the Fourier coefficients aj(m) for an orthonormal basis of the space of cusp forms of weight 5/2 and level Nθ (with respect to certain quadratic character determined by θ).
Now we apply [Waibel, Theorem 1]. Using the notation there, we have that if m=ℓ, then t=ℓ,v=1,w=1,(m,Nθ)=O(1); if m=Dℓ2, then t=D,v≍1,w≍ℓ,(m,Nθ)=O(1). Thus $|a_j(m)|\ll_\epsilon m^{27/28+\epsilon}D_\theta^{\epsilon}$ for m=ℓ and $|a_j(m)|\ll_\epsilon m^{3/4+\epsilon}D_\theta^{\epsilon}$ for m=Dℓ2, which gives the desired bound once we combine with the above estimate of ∣∣G∣∣.
∎
Notation as in §7.2.3 and Corollary 7.2.4. For a supersingular point P, we estimate ∑m∈TM−SMrn,i(m),∑m∈TM−SMrn(m) with respect to different ranges of n.
Definition 9.2.1.
Given absolute constants ϵ0,ϵ1>0 (we will choose ϵ0,ϵ1 in the proof of Proposition 7.2.5), the ranges of n are defined as follows:
•
n is small if n≤ϵ0logpM, where ϵ0 is a constant independent of M.
•
n is in the lower medium range if ϵ0logpM<n≤(3/4)logpM.
•
n is in the upper medium range if (3/4)logpM<n≤(1+ϵ1)logpM.
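As a reading aid, the ranges in Definition 9.2.1 can be written as a small classifier. This is a minimal sketch under assumptions: the function name `classify_n` is hypothetical, the values of ϵ0 and ϵ1 in the usage line are placeholders (the paper fixes them only in the proof of Proposition 7.2.5), and the label "large" for n>(1+ϵ1)logpM follows the later paragraph on large n. Floating-point logarithms make the classification unreliable exactly at the boundaries.

```python
import math

def classify_n(n, M, p, eps0, eps1):
    """Classify n into the ranges of Definition 9.2.1.

    eps0 and eps1 are the absolute constants of the definition; the
    values passed in are illustrative placeholders only.
    """
    log_p_M = math.log(M) / math.log(p)
    if n <= eps0 * log_p_M:
        return "small"
    if n <= 0.75 * log_p_M:
        return "lower medium"
    if n <= (1 + eps1) * log_p_M:
        return "upper medium"
    return "large"

# Example with p = 5 and M = 5**8, so that log_p M is about 8.
print([classify_n(n, 5**8, 5, 0.1, 0.1) for n in (0, 3, 7, 10)])
```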
For m∈TM, we have m=Dℓ2, where D is a non-zero quadratic residue mod p. Then by §4.4.3(2), no supergeneric point lies on Z(m). Hence we will only consider P superspecial.
Recall from Lemma 7.1.1 that for any SM such that #SM=o(#TM), we have
[TABLE]
We will first prove that there exists SM such that #SM=o(#TM) and the contribution from n≥ϵ0logpM is o(M2/logM).
The upper medium range.
We treat this part in two ways according to whether dn0≤M1/8, where n0=⌈(3/4)logpM⌉.
(1)
If dn0≥M1/8, then we bound this part using geometry-of-numbers.
Since Ln,1′⊂Ln0,1′ for all n≥n0, by definition we have dn≥dn0≥M1/8. By Proposition 9.1.3(2), we have that
[TABLE]
which is o(M2/logM) once we take ϵ1<1/8.
(2)
If dn0<M1/8, we control this part by putting m’s in this range into SM.
More precisely, consider RM:={m∈TM∣∃v∈Ln0,1′,Q′(v)=m}.
By our assumption, dn02M<M5/4=o(p2n0) and by Lemma 9.1.2, for M≫1, if m∈RM, then m is represented by Q′∣Pn0, which is a binary quadratic form. Then by Proposition 9.1.4 (note that n0→∞ as M→∞), #RM=o(M1/2/logM)=o(#TM). Thus we may choose SM such that SM⊃RM and then
[TABLE]
The large n’s. Let n0=⌈(1+ϵ1)logpM⌉ and let RM′:={m∈TM∣∃v∈Ln0,1′,Q′(v)=m}. We will show that #RM′=o(M1/2/logM) and thus we may choose SM such that SM⊃RM′ and then
[TABLE]
We bound the size of RM′ case by case depending on the size of dn0,l(n0)1.
•
Case (1): dn0≤M1/2+ϵ2 for some absolute constant ϵ2<ϵ1/2.
Then dn0≤M1/2+ϵ2<pn0/2 and M<pn0−ϵ1. By Lemma 9.1.2, for M≫1, if m∈RM′, then m is represented by Q′∣Pn0. By Proposition 9.1.4, #RM′=o(M1/2/logM).
•
Case (2): dn0>M1/2+ϵ2 for all ϵ2<ϵ1/2 and l(n0)1>Mϵ3 for some absolute constant ϵ3>0.
We have #RM′≤#{v∈Ln0,1′∣Q′(v)∈TM}, which is O(M1/2−ϵ1+M1/2−ϵ2+M1/2/l(n0)1)=o(M1/2/logM) by Proposition 9.1.3(2).
•
Case (3): dn0>M1/2+ϵ2 for some ϵ2<ϵ1/2 and l(n0)1≤Mϵ3 for some ϵ3<ϵ2.
Then l(n0)2=dn0/l(n0)1>M1/2. In other words, any vector v∈Ln0,1′ which is not a scalar multiple of the chosen vector v0 of minimum length has Qn0′(v)≥l(n0)22>M. Therefore, any m∈RM′ has to be represented by the rank 1 quadratic form spanned by v0. As M→∞, we have l(n0)1→∞. Thus once M is large enough that l(n0)12>D, this rank 1 quadratic form represents at most one element in TM and hence #RM′=o(#TM).
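The last step of Case (3) rests on an elementary fact: a rank 1 form ax2 with a>D represents at most one integer of the shape Dℓ2 with ℓ prime, because ax2=Dℓ2 forces x<ℓ, so ℓ/x is the reduced form of the square root of a/D. The following brute-force check is a sketch only; the names `primes_upto` and `represented_primes` are hypothetical, and a plays the role of Q′(v0).

```python
from math import isqrt

def primes_upto(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def represented_primes(a, D, M):
    """Primes l with D*l*l <= M such that D*l*l == a*x*x for some integer x >= 1."""
    out = []
    for l in primes_upto(isqrt(M // D)):
        q, r = divmod(D * l * l, a)
        if r == 0 and isqrt(q) ** 2 == q:
            out.append(l)
    return out

# With a > D, each list below has length at most one.
print(represented_primes(12, 3, 10**4), represented_primes(13, 3, 10**4))
```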
In conclusion, taking SM=RM∪RM′, we have #SM=o(#TM) and
[TABLE]
The small n’s. We follow the notation and the idea of the proof in Lemma 8.2.2.
We enlarge Ln,1′ as in the proof of Lemma 9.1.1; also let w be the vector which decays very rapidly in the Decay Lemma for superspecial points, then we enlarge Ln,2′ such that Ln,2′⊗Zp=Ln,1′⊗Zp+pn+1Zpw. Then discLn,i′≍p6n with the implicit constant only depending on P. Note that Corollary 7.2.4 still holds with the new definitions of Ln,i′.
Let
[TABLE]
Note that here the Eisenstein series E and the cusp form G depend on M.
Since discLn,i′=O(p6ϵ0logpM)=O(M6ϵ0) for n≤ϵ0logpM, then by Proposition 9.1.5,
the m-th Fourier coefficient
[TABLE]
and $\sum_{m\in T_M-S_M} q_G(m)=O(M^{(6N_0+1)\epsilon_0+7/4})=o(M^2/\log M)$ once we take $\epsilon_0<(24N_0+4)^{-1}$.
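The threshold on ϵ0 in the cuspidal estimate is exactly the condition that the exponent (6N0+1)ϵ0+7/4 stays below 2. A short exact-arithmetic check, offered as a sketch: N0 is the unspecified absolute constant of Proposition 9.1.5, so the values tried below are purely hypothetical, as are the function names.

```python
from fractions import Fraction

def cusp_exponent(N0, eps0):
    """Exponent of M in the bound for the cuspidal contribution."""
    return (6 * N0 + 1) * eps0 + Fraction(7, 4)

def eps0_threshold(N0):
    """The value 1/(24*N0 + 4) at which the exponent equals 2."""
    return Fraction(1, 24 * N0 + 4)

for N0 in (1, 2, 5):                       # hypothetical values of N0
    t = eps0_threshold(N0)
    print(N0, cusp_exponent(N0, t), cusp_exponent(N0, t / 2) < 2)
```

At the threshold the exponent is exactly 2, and any smaller ϵ0 gives a power saving, which absorbs the factor logM.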
The computation for the Eisenstein part is the same as in the proof of Lemma 8.2.2. More precisely, since p∤m, by Lemma 4.4.6(1)(3), we have
[TABLE]
Thus we finish the proof by putting all parts together and using Corollary 7.2.4.
∎
Compared to the previous case when p is split in F, the only differences are (1) Z(m) may contain supergeneric points; and (2) p∣m when p is ramified and the estimate for ∣qL(m)∣q(m) may change. Nevertheless, except for the computation of the Fourier coefficients of the Eisenstein series, all the bounds for n in the medium and large range and for the cuspidal contribution for small n remain valid for supergeneric points if we replace Ln,1′ by Ln′ for supergeneric points. Note that we only enlarge Ln′,Ln,i′ as in the proof of Lemma 9.1.1 when we treat small n. Thus, to finish the proof, we compute the contribution from the Eisenstein part (for the enlarged lattices).
If P is supergeneric, let θn denote the theta series attached to the lattice Ln′. We decompose θn=En+Gn, where Gn is a cusp form and En is an Eisenstein series. Let E=pA(p+1)E0+n=1∑[ϵ0logpM]ApnEn.
If p is inert, i.e., p∤D, p∤m, then by Lemma 4.4.6(2)(3) and the fact that ∣(L0′)∨/L0′∣=p2 for P supergeneric, we have
[TABLE]
If p is ramified, we have vp(m)=1. For P superspecial, by Lemma 4.4.6(1)(4), we have
[TABLE]
which is $<\frac{11}{12}\cdot\frac{A}{p-1}$ for p≥7. Similarly, for P supergeneric, by Lemma 4.4.6(2)(4), we have
Since every m∈TM in this case is a non-zero quadratic residue mod p, by §4.4.3(2) all supersingular points on Z(m) are superspecial. The idea of the proof is similar to the case of Theorem 1 (1).
By Lemma 7.1.1, we have ∑m∈TM−SMgP(m)≍M5/2(logM)−1. We construct SM by large n. More precisely, we set SM={m∈TM∣∃v∈Ln0,1′,Q′(v)=m}, where n0=⌈(1+ϵ1)logpM⌉. Then
[TABLE]
which is o(M/logM)=o(#TM) if there exists an absolute constant ϵ>0 such that dn0≫Mϵ. If not, then by Lemma 9.1.2, we have that for M≫1, all m∈SM are representable by the binary quadratic form Q′∣Pn0. Since dn0→∞, the density of primes representable by Q′∣Pn0 goes to zero, i.e., we still have #SM=o(#TM). With this choice of SM, we have
[TABLE]
For n in the medium range,
[TABLE]
since ∑m≤Mrn,1(m)=O(M5/2/p3n+M2/p2n+M3/2/pn+M).
The estimate for small n’s is exactly as in the case for Theorem 1(1) above and thus we finish the proof.
∎
9.3. Contribution from non-supersingular points and conclusions
To finish the proof, we only need to show that ∑m∈TM−SMlP(m) for each non-supersingular point P is o(∑m∈TM−SMC.Z(m)), which is o(M2/logM) for Theorem 1(1) and Remark 2 and is o(M5/2/logM) for Theorem 3. We still use the notation in Lemma 7.3.2 for ordinary points.
Recall that an abelian surface is ordinary, almost ordinary (i.e., its Newton polygon has slopes 0,1/2,1), or supersingular.
Lemma 9.3.1.
If P is almost ordinary or if P is ordinary with rkZL0≠3, then
[TABLE]
Proof.
By the classification of endomorphism rings of char p abelian surfaces (see for instance [Tate71, Thm. 1]), we see that if the abelian surface corresponding to P has almost ordinary reduction, then its lattice of special endomorphisms has rank at most 1. On the other hand, if P is ordinary, then rkZL0 is odd and hence rkZL0=1. In both cases, let anx2 denote the quadratic form in one variable given by Q′ restricted to the lattice of special endomorphisms of the abelian surface mod tn. Since the lattice mod tn+1 is a sublattice of the one mod tn, we have an∣an+1.
Since C does not have any global special endomorphisms, we have an→∞ and hence anx2 does not represent any element in TM⊂{Dℓ2∣ℓ prime } or TM⊂{ℓ∣ℓ prime } once n≫1 (with the implicit constant only depending on P).
Now it only remains to treat the case when P is ordinary and rkZL0=3. We first construct SM for such P.
Lemma 9.3.2.
Given M, set n0=⌈(1+ϵ0)logpM⌉ and SM={m∈TM∣∃v∈Ln0 with Q′(v)=m}. Then #SM=o(#TM).
Proof.
By a geometry-of-numbers argument and Lemma 7.3.2, we have
[TABLE]
where an0 is the minimal length of a non-zero vector in Ln0 and bn0 is the minimal root discriminant of a rank 2 sublattice in Ln0. Since C does not have any global special endomorphisms, we have an0,bn0→∞ as M→∞. Fix 0<ϵ1<ϵ0/4. We prove the desired estimate by a case-by-case discussion based on the size of an0,bn0.
(1)
an0<Mϵ1 and bn0>M1/2+2ϵ1. Then we conclude as in the proof of Proposition 7.2.5 for Theorem 1(1) for large n case (3). More precisely, all v∈Ln0 with Q′(v)≤M lie in a rank 1 sublattice of Ln0 and thus the total number of such v is o(#TM).
(2)
an0≥Mϵ1 and bn0>M1/2+2ϵ1.
Then
[TABLE]
(3)
bn0≤M1/2+2ϵ1. Then pn0/2=M1/2+ϵ0/2≥bn0 and by Lemma 9.1.2 (note the proof of this lemma applies to this case), for M≫1, if m∈SM, then m is represented by the binary quadratic form given by restricting Q′ to the rank 2 sublattice in Ln0 with minimal discriminant (=bn02). Since bn0→∞, we conclude by Proposition 9.1.4 for Theorem 1(1) and Remark 2 and by the fact that the density of primes represented by such quadratic forms goes to 0 for Theorem 3.∎
Now we estimate the total local contribution at an ordinary point with rkZL0=3.
Proposition 9.3.3.
Assume P is ordinary with rkZL0=3.
After possibly enlarging SM in Lemma 9.3.2 (still with #SM=o(#TM)), we have ∑m∈TM−SMlP(m)=o(∑m∈TM−SMC.Z(m)).
Second, for ∑n=n1[(1+ϵ0)logpM]pn∑m∈TM−SMrn(m), we bound it by studying the following two cases separately.
(1)
bn1≥M1/8. As in the first part, we have
[TABLE]
which is O(M3/2logM+M2+ϵ0−1/8+M3/2+ϵ0)=o(M2/logM) once we take ϵ0<1/8.
(2)
bn1<M1/8. We are going to enlarge SM to be {m∈TM∣∃v∈Ln1 with Q′(v)=m}. Since bn12M<M5/4=o(p2n1), then we conclude, as in the upper medium range Case (2) in the proof of Proposition 7.2.5 for Theorem 1(1), by Lemma 9.1.2 and Proposition 9.1.4 that #SM=o(#TM) and ∑n=n1[(1+ϵ0)logpM]pn∑m∈TM−SMrn(m)=0.∎
Assume for contradiction that there are only finitely many points on C∩(∪m∈TZ(m)). Then we construct SM by taking the union of the SM in Proposition 7.2.5 for supersingular points and that in Lemma 9.3.2 and Proposition 9.3.3 for ordinary points with rkZL0=3. Since it is a finite union, we still have #SM=o(#TM). We deduce a contradiction by Lemma 7.1.1, Proposition 7.2.5, Lemma 9.3.1, and Proposition 9.3.3.
∎
Appendix A The decay lemma in the supergeneric case
Notation as in §5, especially §5.1.
The goal of this section is to prove the following theorem:
Theorem A.0.1 (Decay Lemma).
At a supersingular point P on C, there exists a rank 3 submodule of the lattice L′⊗Zp of special endomorphisms which decays rapidly (see 5.1.1).
We will first prove the Siegel case, i.e., L=LS and then we deduce the Hilbert case (L=LH) from the proof of the Siegel case. Since we have proved the above theorem when P is superspecial in §§5-6, we will only treat the case when P is supergeneric in this appendix and hence we only need to consider the case when p is inert for the Hilbert case.
A.1. Siegel case
First, we follow §3.3 and let w1:=pv1+v3+(c+σ−1(c))v4, w2:=λpv1−λv3+λ(c−σ−1(c))v4, w3:=v4, w4:=−σ−1(c)pv1+pv2−cv3−σ−1(c)cv4, and w5:=v5, where recall that λ∈Zp2× such that σ(λ)=−λ. Then a direct computation implies that
[TABLE]
By §3.1.5 and using the universal unipotent u in §3.3, we have that the Frobenius on Lcris(W[[x,y,z]]), with respect to the basis {wi}i=15, is I+pyA+pQB+xC+zD, where Q=xy+z2/(4ϵ),
[TABLE]
[TABLE]
[TABLE]
and
[TABLE]
To lighten notation, let G denote the matrix pDD(1). We then have:
[TABLE]
Here, as in §3.3, we assume that c∈W is the Teichmuller lift of its reduction cˉ in k; since P is supergeneric, we have that cˉ≠σ2(cˉ) and hence c±σ(c)∈W×.
As in §5.1, we consider the formal curve C=Spfk[[t]] which is generically ordinary and specializes to P; the map C→Mk gives rise to a map k[[x,y,z]]→k[[t]], and we denote by x(t),y(t),z(t) the images of x,y,z. The proof of Theorem A.0.1 in the case y(t)=0 is much simpler than that of the case y(t)≠0, so in the rest of this appendix, we will present the proof of Theorem A.0.1 assuming y(t)≠0; similar ideas (with simpler case-by-case discussion) yield the proof when y(t)=0. Without loss of generality, we assume that y(t)=tvy for some vy∈Z>0 and z(t) has t-adic valuation vz, and that the leading coefficient of z(t)2/(4ϵ) (here we also use ϵ to denote the reduction of ϵ∈Zp mod p) is α (set vz=∞ and α=0 if z(t)=0). Let α~∈W be the Teichmuller lift of α, and define E=A+α~B.
Definition A.1.1.
For n∈Z>0, define Xn to be the product ∏i=0nX(i), where X stands for either A,B or E.
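To illustrate the Frobenius-twisted product in Definition A.1.1, here is a toy sketch over the finite field with p2 elements rather than over W, so it models only the shape of the construction and not the matrices A, B, E of the paper. All names, the prime 7, and the non-residue 3 are assumptions chosen for the illustration; elements a+bs with s2=3 are stored as pairs (a, b), and Frobenius sends s to −s.

```python
P = 7          # hypothetical odd prime
NONRES = 3     # a quadratic non-residue mod 7 (the squares mod 7 are 1, 2, 4)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c + NONRES * b * d) % P, (a * d + b * c) % P)

def frob(u, times=1):
    """Apply the p-power Frobenius `times` times to u = a + b*s."""
    a, b = u
    return (a, b) if times % 2 == 0 else (a, (-b) % P)

def twist(X, i):
    """Entrywise i-th Frobenius twist X^(i) of a matrix X."""
    return [[frob(e, i) for e in row] for row in X]

def mat_mul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = (0, 0)
            for t in range(inner):
                acc = add(acc, mul(X[i][t], Y[t][j]))
            row.append(acc)
        out.append(row)
    return out

def twisted_product(X, n):
    """Return X^(0) * X^(1) * ... * X^(n)."""
    result = twist(X, 0)
    for i in range(1, n + 1):
        result = mat_mul(result, twist(X, i))
    return result

# For a 1x1 matrix, the product with n = 1 is the norm of the entry.
print(twisted_product([[(1, 1)]], 1))
```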
We record the following lemma which we will make crucial use of in the proof of the Decay Lemma.
Lemma A.1.2.
(1)
The matrices An and Anmodp have rank 1. Further, the first four rows of An are each scalar multiples of R(n−1) by p-adic units, where
[TABLE]
(2)
Bn and Bnmodp have rank 1. Further, the first three rows of Bn are each scalar multiples of S(n−1) by p-adic units, where
[TABLE]
(3)
En has rank ≤1. Further, the rows of En are contained in the span of R(n−1)−α(n)S(n−1). If n≥2, then Enmodp has rank 1 if and only if c−c(2)−α(1) is a p-adic unit.
Proof.
As each of the above cases follows from similar straightforward inductive calculations, we will content ourselves with solely proving part (2). That the rows are spanned by S(n−1) follows directly from the fact that the row span of Bn is contained in the row span of B(n), which is indeed equal to the span of S(n−1).
We will prove that the first three rows of Bn are scalar multiples of S(n−1) by p-adic units by an inductive argument. The case of n=1 can be checked by a direct calculation. Observing that Bn=Bn−1B(n), it suffices to prove that the 1×5 matrix S(n−1)⋅B(n) has its first, second and fourth entries equalling p-adic units. This is also seen by a direct calculation, and the lemma follows.
∎
For the remainder of our paper, we will replace c,cˉ by c~,c respectively. We will also abuse notation by allowing λ to denote both λ and its mod p reduction – it will be clear from context what we refer to.
As in the proof of Theorem 5.1.2 in §5.1, we consider the matrix F∞, which has a product expansion ∏m=0∞(I+py(t)A+pQ(t)B+x(t)C+z(t)D)(m) (note that as in §5.1, the twist is with respect to σ(x)=xp,σ(y)=yp,σ(z)=zp and then we specialize to x=x(t),y=y(t),z=z(t)). Given any positive integer n, the term in the coefficient of 1/pn in F∞ having minimal t-adic valuation arises from products of (Frobenius-twisted) powers of A, B and D, with C contributing only larger order terms. We will therefore ignore C while proving our decay results.
We will use part (2) of Lemma A.1.2 to prove that SpanZp{w1,w2,w4} decays rapidly. As vy>2vz, the t-adic valuation of Q=xy+z2/(4ϵ) equals 2vz, which is strictly smaller than that of y.
Moreover, by Corollary 3.4.2, the equation of the non-ordinary locus is (x+a)y+z2/(4ϵ) and hence A=2vz, where A in 5.1.1 is the t-adic valuation of the equation. Further, the t-adic valuation of z1+p equals (p+1)vz, which is strictly greater than 2vz=vt(Q). Therefore, the term in the coefficient of 1/pn+1 with minimum t-adic valuation equals (QB)⋯(QB)(n). In order to prove that the property DR (in the statement of Proposition 5.1.3) holds, by Lemma A.1.2(2), it suffices to prove that the kernel of the matrix S(n−1)modp does not contain any non-zero Fp-linear combinations of w1,w2,w4modp. Indeed, this follows directly from the fact that c and therefore all its Frobenius twists do not lie in Fp2. Thus Theorem A.0.1 follows from the property DR by the same argument that Proposition 5.1.3 implies Theorem 5.1.2 in §5.1.
∎
It suffices to prove that SpanZp{w1,w2,w3} decays rapidly. As vy<2vz, the term in the coefficient of 1/pn+1 with minimum t-adic valuation equals (yA)…(yA)(n). Note also that A in 5.1.1 is vt((x+a)y+z2/(4ϵ))=vy. As in Case 1, it suffices to prove that the kernel of A⋯A(n)modp contains no non-zero Fp-linear combinations of w1,w2,w3modp.
To that end, let w be some Fp-linear combination of these three vectors which lies in the kernel of A⋯A(n). By Lemma A.1.2(1), the (first four) rows of A⋯A(n) are unit-scalar multiples of R(n−1). Therefore, R(n−1)⋅w=0. As w is Frobenius-invariant, this is equivalent to R⋅w=0.
The existence of w implies that there exist s1,s2,s3∈Fp such that s1(c+c(1))+s2λ(c−c(1))+s3=0. If either s1=0 or s2=0, it follows that either c+c(1)∈Fp or λ(c−c(1))∈Fp. Either case would imply that c=c(2), which is a contradiction as we have assumed that c∉Fp2. Therefore, we assume that s1=−1. We now have
[TABLE]
and therefore applying σ to the above equation, we also have
[TABLE]
Subtracting the two equations yields
[TABLE]
This is a contradiction, as λ∉Fp.
Therefore, such w could not have been in the kernel of A⋯A(n)modp, whence it follows that every primitive vector in SpanZp{w1,w2,w3} satisfies the condition DR in the statement of Proposition 5.1.3, as required. Therefore, Theorem A.0.1 follows in this case.
∎
A.4. Case 3: vy=2vz
In this case, it follows that z(t)≠0, and thus α≠0.
We will prove that SpanZp{wi,w3,w5} decays rapidly where i is either 1 or 2. In this case, we have A≥2vz.
The conditions imposed on vy,vz imply that the term with minimal t-adic valuation in the coefficient of 1/pn+1 is
[TABLE]
Note that the first term in the sum has its last column equalling zero, and the second term has its first four columns equalling zero.
For brevity, we will henceforth denote a Frobenius twist with a superscript of ′. We claim that at least one of c+c′−α′ and λ(c−c′−α′) does not lie in Fp. Indeed, the first element being in Fp implies that c′+c′′−α′′=c+c′−α′, consequently −(α′−α′′)=c′′−c. Similarly, λ(c−c′−α′) being an element of Fp implies that α′+α′′=c−c′′. Therefore, both elements being in Fp implies that α′+α′′=α′−α′′, which in turn implies that α=0, a contradiction.
Without loss of generality, we will assume that c+c′−α′∉Fp, and prove that SpanZp{w1,w3,w5} decays rapidly (if λ(c−c′−α′)∉Fp, then an identical argument would yield that SpanZp{w2,w3,w5} decays rapidly). We first prove that every primitive vector in SpanZp{w1,w3} decays rapidly. Indeed, let w be any primitive vector. In order to prove that w decays rapidly, it suffices to prove that wmodp is not in the kernel of EE(1)⋯E(n)modp. However, as the 4th row of this matrix is a unit-multiple of R(n−1)−α(n)S(n−1) by Lemma A.1.2 (3), it suffices to prove that wmodp is not in the kernel of the 1×5 matrix R(n−1)−α(n)S(n−1). As wmodp is Fp-rational, this is equivalent to asking that wmodp is not in the kernel of R−α′Smodp. This follows from the fact that c+c′−α′∉Fp.
We now show that w5 decays rapidly. Indeed, the last column of EE(1)⋯E(n−1)(pD)(n) has (R(n−2)−α(n−1)S(n−2))⋅v(n) as its fourth entry (up to a unit), where v is the last column of pD. It suffices to prove that (R(n−2)−α(n−1)S(n−2))⋅v(n)≠0modp, equivalently (R−α′S)⋅v(2)≠0modp. A direct computation shows that this element equals −(α′−c+c′′), which we have assumed to be non-zero. Therefore, it follows that w5 decays rapidly.
Let w denote a primitive vector in the span of w1,w3. Consider a Zp-linear combination aw+bw5, where either a or b is a p-adic unit. The only way for aw+bw5 to not decay rapidly is if the t-adic valuation of the coefficient of 1/pn+1 in F∞w equalled the t-adic valuation of the coefficient of 1/pm+1 in F∞w5. However, the former equals t2vz(1+p+…+pn) and the latter equals t2vz(1+p+…+pm−1)+vzpm. These two quantities are clearly never equal, thereby establishing the required decay as in Case 1 in §5.2.
∎
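The claim that the two valuations at the end of the preceding argument never coincide reduces to parity: after dividing by vz, the first quantity 2(1+p+…+pn) is even, while the second, 2(1+p+…+pm−1)+pm, is odd because p is odd. The brute-force check below is a sketch of this observation over a finite range; the function names are hypothetical.

```python
def geometric_sum(p, k):
    """Return 1 + p + ... + p**k, and 0 when k < 0."""
    return sum(p**i for i in range(k + 1))

def valuation_w(p, vz, n):
    """2*vz*(1 + p + ... + p**n), the valuation attached to w."""
    return 2 * vz * geometric_sum(p, n)

def valuation_w5(p, vz, m):
    """2*vz*(1 + p + ... + p**(m-1)) + vz*p**m, the valuation attached to w5."""
    return 2 * vz * geometric_sum(p, m - 1) + vz * p**m

def collisions(p, vz, bound):
    """All pairs (n, m) below `bound` for which the two valuations agree."""
    return [(n, m) for n in range(bound) for m in range(bound)
            if valuation_w(p, vz, n) == valuation_w5(p, vz, m)]

print(collisions(3, 1, 12), collisions(5, 2, 10))
```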
Subcase 2: α(1)−c+c(2)=0
In this case, we see that α=c(−1)−c(1), and thus by Corollary 3.4.2, the local equation of the non-ordinary locus is given by the equation Q(t)−αy(t). In particular, we see that H(t):=Q(t)−αy(t)=Q(t)−αtvy≠0. We will now express
[TABLE]
where E,B,C,D are defined in §A.1. We will establish the required decay by considering the fourth row of F∞. As mentioned above, we omit C and powers of x(t) in this analysis, as there are no negative powers of p in the entries of C.
We need the following lemma:
Lemma A.4.1.
Consider all products of the form W0W1W2…Wn, where Wi is the ith Frobenius twist of E,B or D. The only products which have a non-zero fourth row are those with the following properties:
(1)
W0=E.
(2)
Suppose that Wi,Wj are not twists of D but Wi+1,…,Wj−1 are twists of D for 1≤i<j≤n. Then, j−i has to be odd. Equivalently, any maximal consecutive subsequence consisting exclusively of Frobenius twists of D has to have even length, unless the subsequence terminates with Wn.
(3)
Apart from i=0, the only possible j≤n such that Wj=E(j) is j=n.
Further, a product that satisfies the above properties does indeed have a non-zero fourth row.
Finally, consider the length four vector given by the first four rows of the last column of the product. This vector is nonzero if and only if Wn=D(n) and the number of occurrences of twists of D is odd. If this is the case, the first four columns of the product are all zero.
Proof.
(1)(2) are clear. Part (3) follows from a direct computation. We will illustrate this computation in the particular case
[TABLE]
It will follow from explicitly computing the product that multiplying it by either E(m+2e+2), D(m+2e+2) or B(m+2e+2) yields the zero matrix. The other cases (where the Wi are other choices of B,D) are entirely analogous, and the same computation goes through.
An easy inductive argument shows that the product $\prod_{i=1}^{m}B^{(i)}\prod_{j=m+1}^{m+2e}D^{(j)}=\frac{1}{p^{e}}\prod_{i=1}^{m}B^{(i)}\prod_{j=m+1}^{m+e}G^{(2j-m-1)}$ equals
[TABLE]
Multiplying this matrix on the left by the fourth row of E and on the right by E(m+2e+1) and using the relation α(1)=c−c(2) yields
[TABLE]
Note that the product (not just the fourth row) matrix has rank one, and so every other row must be some multiple of this row. In order to show that the product in (A.4.1) multiplied by Wm+2e+2 (where Wm+2e+2 is either B(m+2e+2),D(m+2e+2) or E(m+2e+2)) is zero, it suffices to prove that the fourth row of this product is zero. This can be checked by direct computation.
∎
We record the fourth row of the product in (A.4.1) (scaled by pe) for future use:
Lemma A.4.2.
The fourth row of the product $p^{e}E\prod_{i=1}^{m}B^{(i)}\prod_{j=m+1}^{m+2e}D^{(j)}E^{(m+2e+1)}$ equals
[TABLE]
Let H(t)=ηtA+tA+1(z2(t)), where η∈k×. Let η~ denote the Teichmuller lift of η.
In this case, we will prove that SpanZp{w1,w2,w3} decays rapidly.
Lemma A.4.3.
The term with minimal t-adic valuation in the coefficient of 1/pn+1 in the fourth row of (the top-left 4×4 block of) F∞ is
[TABLE]
Proof.
Note that F∞ is composed of sums of products as in Lemma A.4.1, where each Wi is multiplied by:
•
y(t)(i)=t2vzpi if Wi=E(i);
•
h(t)(i)=η~(i)tApi+⋯ if Wi=B(i);
•
z(t)(i)=β(i)tvzpi if Wi=D(i), where β is the leading coefficient in z=βtvz+⋯.
Consider products as in Lemma A.4.1. As we are looking for matrices where the first four columns are not all zero, it follows that the number of occurrences of twists of D must be even. Therefore, consider a product with n1 occurrences of twists of either E or B, and 2n2 occurrences of twists of D. The fourth row of such a product would have a p-adic valuation of −(n1+n2), and hence we assume that n+1=n1+n2.
It is clear that the t-adic valuation of the expression is minimized if the first and last matrices in the product are both twists of E. Indeed, the t-adic valuation is minimized when Wi=E(i) for as many i as possible, and Lemma A.4.1 implies that this happens when the first and last matrices are both twists of E. As the t-adic valuation of H(t) is strictly greater than that of z(t), it follows that products of the form EB(1)⋯B(n1−2)D(n1−1)⋯D(n1+2n2−2)E(n1+2n2−1) contain the term with minimal t-adic valuation.
As in Case 4 in §5.2, a convexity argument yields that the t-adic valuation is minimized exactly for the product listed in the statement of this result, thereby concluding the proof.
∎
We will prove that SpanZp{w1,w2,w3} decays rapidly.
Let Re denote the mod p reduction of the row detailed in Lemma A.4.2. It suffices to show that there is no non-trivial Fp-linear combination of the first three entries of Re which evaluates to zero. This is equivalent to proving that the elements c+c(1), λ(c−c(1)) and 1 are Fp-linearly independent. This fact has already been established in Case 2 in §A.3. The result follows.
∎
Subsubcase 2: vz(2p2e−p2e−1+1)=A
Lemma A.4.4.
(1)
There are two terms with minimal t-adic valuation in the coefficient of 1/pn+1 in the fourth row of (the top-left 4×4 block of) F∞. They are:
•
peEB(1)⋯B(n−e−1)D(n−e)⋯D(n+e−1)E(n+e)η(1+p+⋯+pn−e−1)βpn−e+⋯+pn+e−1; and
We will show that there exists a rank 2 submodule W⊂SpanZp{w1,w2,w3} such that W+Zpw5 decays rapidly.
The term with minimal t-adic valuation in the coefficient of 1/pn+1 of the fourth row of the top-left 4×4 block of F∞ is the sum of the corresponding parts in the two matrices in Lemma A.4.4(1). The t-adic valuation of this sum is vz(2+pn−e+⋯+pn+e−1+2pn+e)+A(p+⋯+pn−e−1), which is ≤A(1+⋯+pn).
In order to prove that W decays rapidly with the above t-adic valuation, it suffices to show that the
kernel of the sum of the fourth rows of the two matrices in Lemma A.4.4(1) mod p∩Wmodp is {0} for a suitable choice of W. To that end, let $\mu=2\epsilon\beta^{p^{n-e}+p^{n+e-1}}$ and $\nu=\eta^{p^{n-e}}$. Then, the sum of the two rows with equal minimal t-adic valuation mod p is a scalar multiple (the scalar factor is $(-1)^{n-1}(2\epsilon)^{e-1}\eta^{1+p+\cdots+p^{n-e-1}}\beta^{p^{n-e+1}+\cdots+p^{n+e-2}}$) of
We now prove that depending on the values of μ and ν, we may take W to be either SpanZp{w1,w2}, SpanZp{w1,w3} or SpanZp{w2,w3}. To further lighten notation, let δ1=(μ+ν)c(n+e) and δ2=μc(n+e+1)+νc(n+e−1).
(1)
Suppose ν+μ=0. We will show that SpanZp{w1,w2} decays rapidly. It suffices to prove that there is no non-trivial Fp-linear relation between α1=μ(c(n+e−1)−c(n+e+1)) and α2=λμ(c(n+e−1)−c(n+e+1)). However, this follows directly from the facts that λ∉Fp and that c(n+e+1)≠c(n+e−1).
(2)
Suppose that δ1≠±δ2. We will show that either SpanZp{w1,w2} or SpanZp{w1,w3} decays rapidly. Indeed, the former happens when α1=δ1+δ2 and α2=λ(δ1−δ2) are not Fp multiples of each other. Therefore, suppose that they were. Then we have that for some l∈Fp,
δ1+δ2=lλ(δ1−δ2).
Note that this yields that δ2/δ1∈Fp2.
We will prove that SpanZp{w1,w3} decays rapidly. If not, then α1=sα3, where s∈Fp. That is, δ1+δ2=sδ1/c(n+e). Equivalently, δ2/δ1=(s−c(n+e))/c(n+e). As c∉Fp2, it follows that δ2/δ1∉Fp2, which is a contradiction.
(3)
Suppose that δ1=δ2. We will show that SpanZp{w1,w3} decays rapidly, by showing that α1,α3 are Fp-linearly independent. Indeed, α1=2(ν+μ)c(n+e), and α3=ν+μ. As c∉Fp2, the two quantities are Fp-linearly independent as required.
(4)
Suppose that δ1=−δ2. The same argument as above works to show that SpanZp{w2,w3} decays rapidly.
Thus we have proved that there exists a rank 2 submodule W of SpanZp{w1,w2,w3} which decays rapidly and the minimal t-valuation in the coefficient of 1/pn+1 is vz(2+pn−e+⋯+pn+e−1+2pn+e)+A(p+⋯+pn−e−1).
On the other hand, the term with minimal t-adic valuation in the coefficient of 1/pm+1 of the last column of F∞ is the last column of the matrix in Lemma A.4.4(2) - it is easy to see that this last column is non-zero. Hence w5 decays rapidly. The t-adic valuation of this term is vz(2+pm−e+⋯+pm+e−1+pm+e)+A(p+…pm−e−1), and we notice that this is always different from vz(2+pn−e+⋯+pn+e−1+2pn+e)+A(p+⋯+pn−e−1) regardless of the values of m,n. Therefore, W+Zpw5 decays rapidly by the same argument at the end of Case 3 Subcase 1 above.
∎
A.5. The case of inert Hilbert modular surfaces
That we have a rank 3 submodule of special endomorphisms follows directly from the Siegel case. Indeed, by §3.2.1 and §3.3, the F-crystal Lcris in the setting of inert Hilbert modular surfaces is obtained by setting z=0 in the Siegel case. The required decay follows directly from §A.3.
Bibliography