Counting rational points on a Grassmannian

Seungki Kim

arXiv:1908.01245·math.NT·October 14, 2022


TL;DR

This paper provides a refined estimate for counting rational points on Grassmannian varieties with bounded height, improving classical results and extending to all points, with implications for flag varieties.

Contribution

It introduces a new counting formula for rational points on Grassmannians that counts all points, refining previous bounds and extending to flag varieties.

Findings

01. Derived a comprehensive estimate for rational points on Grassmannians.

02. Extended counting results to flag varieties.

03. Improved upon classical bounds by including all points.

Abstract

We prove an estimate on the number of rational points on the Grassmannian variety of bounded twisted height, refining the classical results of Schmidt ([12]) and Thunder ([20]) over the rational field: most importantly, our formula counts all points. Among the consequences are a couple of new implications on the classical subject of counting rational points on flag varieties.

Equations

$a(n,d)$, $b(n,d)$

$$P(\mathbb{Z}^{n},d,H)=a(n,d)H^{n}+O\left(H^{n-b(n,d)}\right)$$

$$(P\cap\mathbb{Z}^{n})L=\{vL:v\in P\cap\mathbb{Z}^{n}\}\subseteq\mathbb{R}^{n}$$

$$\mathcal{L}_{i}=\mathcal{L}\cap\mathrm{span}_{\mathbb{R}}(v_{1},\ldots,v_{i})$$

$$\lambda_{i}(\mathcal{L})=\inf\{r>0:\dim\mathrm{span}_{\mathbb{R}}(\mathcal{L}\cap B(r))\geq i\}$$

$$P_{\mathcal{L}_{n-d}}(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\frac{H^{n-b(n,d)}}{(\det\mathcal{L})^{d-b(n,d)}(\det\mathcal{L}_{n-d})^{b(n,d)}}\right)$$

$$P(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(1+\left(\frac{H}{\lambda_{1}(\mathcal{L})^{d}}\right)^{n-b(n,d)}\right)$$

$$P(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in E_{n,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right)$$

$$b_{j}(\mathcal{L})=\prod_{i=1}^{n}(\det\mathcal{L}_{i})^{-\beta_{i}}$$

$$\frac{H^{n-b(n,d)}}{(\det\mathcal{L})^{d-b(n,d)}(\det\mathcal{L}_{n-d})^{b(n,d)}}$$

$$c(n,d)=a(n,d)\prod_{i=1}^{d}\zeta(n-i+1)$$

$$N(\mathcal{L},d,H)=c(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in F_{n,d}}b^{\prime}_{j}(\mathcal{L})H^{\gamma_{j}}\right)$$

$$P_{\mathcal{S}}(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in G_{n,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right)$$

$$\frac{aH}{(\det\mathcal{L})^{d}}\log\frac{H}{(\det\mathcal{L})^{d}}+O\left(\frac{H}{(\det\mathcal{L})^{d-b(n,d)(n-d)/n}(\det\mathcal{L}_{n-d})^{b(n,d)}}\right)$$

$$\frac{aH}{(\det\mathcal{L})^{d}}\log\frac{H}{\varepsilon_{e}^{d}\varepsilon_{d}^{n-e}}+O\left(\sum_{j\in E_{n,e,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right)$$

$$1-\frac{b(n,d)}{n},\quad 1-\frac{b(d,e)(n-e)}{nd},\quad\text{or}\quad 1-\frac{1}{n}\left(1-\frac{2b(d,e)}{d}+\frac{1-b(d,e)}{n-e}\right)$$

$$O\left(H\log\frac{(\det\mathcal{L})^{d}}{\varepsilon_{e}^{d}\varepsilon_{d}^{n-e}}\right)$$

$$Hp(\log H)+o(H)$$

$$\int_{X_{n}}\sum_{v\in\mathcal{L}\setminus\{0\}}f(v)\,d\mu_{n}(\mathcal{L})=\int_{\mathbb{R}^{n}}f(x)\,dx$$

$$\int_{X_{n}}\sum_{\substack{v_{1},\ldots,v_{k}\in\mathcal{L}\\ \text{independent}}}f(v_{1},\ldots,v_{k})\,d\mu_{n}(\mathcal{L})=\int_{\mathbb{R}^{n}}\cdots\int_{\mathbb{R}^{n}}f(x_{1},\ldots,x_{k})\,dx_{1}\cdots dx_{k}$$

$$\int_{X_{n}}P(\mathcal{L},d_{1},\ldots,d_{k},H_{1},\ldots,H_{k})\,d\mu_{n}(\mathcal{L})=\prod_{i=1}^{k}a(n,d_{i})H_{i}^{n}$$

$$\prod_{i=1}^{k}(\det\mathcal{S}^{d_{i}})^{d_{i+1}-d_{i-1}}$$

$$\int_{X_{n}}P(\mathcal{L},\mathfrak{d},H)\,d\mu_{n}(\mathcal{L})=\infty$$

$$P(\mathcal{L},d,H)=P\left(\bar{\mathcal{L}},d-1,\frac{H}{\lambda_{1}(\mathcal{L})}\right)+\Phi(P(\bar{\mathcal{L}},d,H))$$

$$f_{H}(M)=\begin{cases}1&\text{if }\det M\leq H\\ 0&\text{otherwise}\end{cases}$$

$$\sum_{\mathcal{A}\in\mathrm{Gr}(\mathcal{M},d)}F(\det\mathcal{A})=\sum_{\mathcal{B}\in\mathrm{Gr}(\mathbb{Z}^{m},d)}F(\det\mathcal{B}M)=\sum_{B\in\Gamma\backslash\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})}F(\det BM)$$

$$P(\mathcal{L},1,H)=a(n,1)\frac{H^{n}}{\det\mathcal{L}}+O\left(\sum_{i=1}^{n}\frac{H^{n-i}}{\det\mathcal{L}_{n-i}}\right)$$

$$\det\mathcal{S}^{\perp}=\frac{\det\mathcal{S}}{\det\mathcal{L}}$$

$$P(\mathcal{L},d,H)=P\left(\mathcal{L}^{P},n-d,\frac{H}{\det\mathcal{L}}\right)$$

$$P(\mathcal{L},n-1,H)=a(n,n-1)\frac{H^{n}}{(\det\mathcal{L})^{n}\det\mathcal{L}^{P}}+O\left(\sum_{i=1}^{n}\frac{H^{n-i}}{(\det\mathcal{L})^{n-i}\det(\mathcal{L}^{P})_{n-i}}\right)$$


Full text

Counting rational points of a Grassmannian


Key words and phrases:

rational points, Grassmannians, flag varieties, Manin’s conjecture.

2020 Mathematics Subject Classification:

11H06, 11G50, 14G05

1. Introduction

1.1. Main result

For $\mathcal{L}\subseteq\mathbb{R}^{n}$ a lattice, $1\leq d<\mathrm{rk\,}\mathcal{L}$ and $H>0$ , let $P(\mathcal{L},d,H)$ be the number of primitive rank $d$ sublattices of $\mathcal{L}$ of determinant less than or equal to $H$ . The purpose of this paper is to investigate the quantitative behavior of $P(\mathcal{L},d,H)$ . The earliest result of this kind goes back to the mid-twentieth century, due to W. Schmidt ([12]):

Theorem 1.1 (Schmidt [12], Theorem 1).

Let

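$$a(n,d)=\frac{1}{n}\binom{n}{d}\prod_{i=1}^{d}\frac{V(n-i+1)\,\zeta(i)}{V(i)\,\zeta(n-i+1)},\qquad b(n,d)=\max\left(\frac{1}{d},\frac{1}{n-d}\right),$$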

where $V(i):=\pi^{i/2}/\Gamma(i/2+1)$ is the volume of the unit ball in $\mathbb{R}^{i}$ and $\zeta(s)$ is the Riemann zeta function, except that we understand $\zeta(1)=1$ for convenience. Then

$$P(\mathbb{Z}^{n},d,H)=a(n,d)H^{n}+O\left(H^{n-b(n,d)}\right),$$

where the implicit constant depends on $n$ only.
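As a rough numerical companion to Theorem 1.1, the sketch below evaluates the two constants. It assumes the form of Schmidt's constant usually quoted in the literature, $a(n,d)=\frac{1}{n}\binom{n}{d}\prod_{i=1}^{d}\frac{V(n-i+1)\zeta(i)}{V(i)\zeta(n-i+1)}$ with the convention $\zeta(1)=1$, and $b(n,d)=\max(1/d,1/(n-d))$. The function names and the truncated-series `zeta` helper are illustrative choices, not part of the paper.

```python
from math import comb, gamma, pi


def unit_ball_volume(i):
    """V(i) = pi^(i/2) / Gamma(i/2 + 1), the volume of the unit ball in R^i."""
    return pi ** (i / 2) / gamma(i / 2 + 1)


def zeta(s, terms=10_000):
    """Riemann zeta for integer s >= 2, with the paper's convention zeta(1) = 1."""
    if s == 1:
        return 1.0
    partial = sum(k ** -s for k in range(1, terms))
    # Euler-Maclaurin tail estimate: integral from `terms` plus half the first omitted term.
    tail = terms ** (1 - s) / (s - 1) + 0.5 * terms ** -s
    return partial + tail


def a(n, d):
    """Leading constant a(n, d), in the assumed form stated in the lead-in."""
    prod = 1.0
    for i in range(1, d + 1):
        prod *= unit_ball_volume(n - i + 1) * zeta(i)
        prod /= unit_ball_volume(i) * zeta(n - i + 1)
    return comb(n, d) * prod / n


def b(n, d):
    """Exponent saving b(n, d) = max(1/d, 1/(n-d))."""
    return max(1 / d, 1 / (n - d))
```

For $d=1$ the formula collapses to $V(n)/(2\zeta(n))$, which matches the classical density of primitive vectors counted up to sign; for example `a(2, 1)` should come out close to $3/\pi$.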

For $\mathcal{L}\subseteq\mathbb{R}^{n}$ of full rank, $P(\mathcal{L},d,H)$ may also be understood in terms of a counting problem on the Grassmannian variety $\mathrm{Gr}(n,d)$ consisting of the $d$ -dimensional subspaces of $\mathbb{R}^{n}$ . A rational point $P$ on $\mathrm{Gr}(n,d)$ is a $d$ -dimensional subspace such that $P\cap\mathbb{Z}^{n}$ is a rank $d$ sublattice of $\mathbb{Z}^{n}$ . Its height is given by the determinant of $P\cap\mathbb{Z}^{n}$ , and given a real $n\times n$ matrix $L$ , its height twisted by $L$ is given by the determinant of the rank $d$ lattice

$$(P\cap\mathbb{Z}^{n})L=\{vL:v\in P\cap\mathbb{Z}^{n}\}\subseteq\mathbb{R}^{n}$$

(we write vectors horizontally, so that matrices multiply from the right). Observe that, if $L$ and $L^{\prime}$ are two $n\times n$ matrices whose row vectors generate the same lattice $\mathcal{L}$, i.e. $L=gL^{\prime}$ for some $g\in\mathrm{GL}(n,\mathbb{Z})$, then the number of rational points whose heights are bounded by $H$ is the same, regardless of whether one twists the height by $L$ or $L^{\prime}$. In addition, one checks that this number is precisely $P(\mathcal{L},d,H)$ as defined above. For the general definition of a twisted height, we refer the reader to Thunder ([19, Introduction], [20, Part I]), where it was first introduced.

From this perspective, Thunder ([20]) proved a vast generalization of Theorem 1.1 above, extending it to any lattice and to any number field $K$ (here $\mathcal{O}_{K}$-modules play the role of lattices). His result, from the 1990s, remains the state of the art to this day. We state his result in the case $K=\mathbb{Q}$:

Theorem 1.2 (Thunder [20], Theorem 3).

Let $\mathcal{L}\subseteq\mathbb{R}^{n}$ be a lattice of full rank. In addition to the notations in Theorem 1.1, define

$$\mathcal{L}_{i}=\mathcal{L}\cap\mathrm{span}_{\mathbb{R}}(v_{1},\ldots,v_{i}),$$

where $v_{1},\ldots,v_{n}$ are choices of linearly independent vectors in $\mathcal{L}$ such that $\|v_{i}\|=\lambda_{i}(\mathcal{L})$ , where $\lambda_{i}(\mathcal{L})$ is the $i$ -th successive minimum of $\mathcal{L}$ defined by

$$\lambda_{i}(\mathcal{L})=\inf\{r>0:\dim\mathrm{span}_{\mathbb{R}}(\mathcal{L}\cap B(r))\geq i\},$$

where $B(r)$ is the closed ball of radius $r$ centered at the origin in $\mathbb{R}^{n}$. Let $P_{\mathcal{L}_{n-d}}(\mathcal{L},d,H)$ be the number of rank $d$ sublattices of $\mathcal{L}$ of determinant $\leq H$ whose intersection with $\mathcal{L}_{n-d}$ is trivial. Then

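$$P_{\mathcal{L}_{n-d}}(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\frac{H^{n-b(n,d)}}{(\det\mathcal{L})^{d-b(n,d)}(\det\mathcal{L}_{n-d})^{b(n,d)}}\right),$$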

where the implicit constant depends only on $n$ .

A notable feature of Theorem 1.2 is that it provides an explicit description of the dependence of the error term on the successive minima $\lambda_{1}(\mathcal{L}),\ldots,\lambda_{n}(\mathcal{L})$ of $\mathcal{L}$ (observe that $\det\mathcal{L}_{i}\sim\lambda_{1}\lambda_{2}\ldots\lambda_{i}$ by Minkowski’s second theorem). Informally speaking, it reflects the “skewness” of the lattice: in case $\mathcal{L}$ is severely skewed, in the sense that $\lambda_{i}(\mathcal{L})$ is much smaller than $\lambda_{i+1}(\mathcal{L})$ for some $i$, one expects a different behavior of the error term than in the case in which most $\lambda_{i}(\mathcal{L})$ are about equal. Theorem 1.2 may be seen as a realization of this intuition.
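To make the successive minima concrete, here is a brute-force sketch for small lattices. It enumerates integer combinations of the basis with coefficients in a box, sorts them by length, and greedily keeps the shortest vectors that raise the rank. The function names and the `box` parameter are hypothetical, and the result is only correct if `box` is large enough to contain the vectors attaining the minima.

```python
import itertools
import math


def rank(vectors, tol=1e-9):
    """Rank of a list of real vectors, by Gaussian elimination."""
    rows = [list(map(float, v)) for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def successive_minima(basis, box=5):
    """Approximate lambda_1 <= ... <= lambda_m of the lattice spanned by `basis` (rows)."""
    m, n = len(basis), len(basis[0])
    vectors = []
    for coeffs in itertools.product(range(-box, box + 1), repeat=m):
        if any(coeffs):
            vectors.append([sum(c * row[j] for c, row in zip(coeffs, basis))
                            for j in range(n)])
    vectors.sort(key=lambda v: math.hypot(*v))
    chosen = []
    for v in vectors:
        # Keep v only if it is linearly independent of the vectors chosen so far.
        if rank(chosen + [v]) > len(chosen):
            chosen.append(v)
            if len(chosen) == m:
                break
    return [math.hypot(*v) for v in chosen]
```

For the basis `[[1, 0], [0, 3]]` the minima are $1$ and $3$, and their product equals the determinant, in line with $\det\mathcal{L}\sim\prod\lambda_{i}(\mathcal{L})$.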

However, Thunder ([20]) does not provide a corresponding estimate for $P(\mathcal{L},d,H)$ , remarking that it would “be a cumbersome task.” In the present paper, we introduce a method that circumvents this difficulty, and prove

Theorem 1.3’.

Continue with the notations in Theorems 1.1 and 1.2 above. Then for all $H>0$ ,

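$$P(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(1+\left(\frac{H}{\lambda_{1}(\mathcal{L})^{d}}\right)^{n-b(n,d)}\right),\tag{1.1}$$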

where the implied constant depends only on $n$ .

In fact, we prove the more precise

Theorem 1.3.

Let $\mathcal{L}\subseteq\mathbb{R}^{n}$ be a lattice of full rank. For all $H>0$ ,

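$$P(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in E_{n,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right),\tag{1.2}$$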

where the implied constant in the big-O notation depends only on $n$ , and $E_{n,d}$ is a finite set of indices, of cardinality less than $n^{3n}$ for $n\geq 2$ . Each $j\in E_{n,d}$ is associated with $\gamma_{j}\in[0,n-b(n,d)]$ and $b_{j}$ of the form

$$b_{j}(\mathcal{L})=\prod_{i=1}^{n}(\det\mathcal{L}_{i})^{-\beta_{i}}$$

for some real $\beta_{i}\geq 0$ such that $\sum_{i=1}^{n}i\beta_{i}=d\gamma_{j}$. This makes the right-hand side of (1.2) scale-invariant, i.e. it remains unchanged if $\mathcal{L}$ and $H$ are replaced by $c\mathcal{L}$ and $c^{d}H$, respectively.

In particular, the leading error term is

$$\frac{H^{n-b(n,d)}}{(\det\mathcal{L})^{d-b(n,d)}(\det\mathcal{L}_{n-d})^{b(n,d)}},$$

as in Theorem 1.2.
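The scale invariance claimed in Theorem 1.3 can be checked in a few lines. The sketch below takes $b_{j}(\mathcal{L})=\prod_{i=1}^{n}(\det\mathcal{L}_{i})^{-\beta_{i}}$ with $\sum_{i}i\beta_{i}=d\gamma_{j}$, uses $\det(c\mathcal{L}_{i})=c^{i}\det\mathcal{L}_{i}$, and then verifies that the leading error term fits the same pattern, reading off $\gamma_{j}=n-b(n,d)$, $\beta_{n}=d-b(n,d)$ and $\beta_{n-d}=b(n,d)$ (our reading of the exponents, with all other $\beta_{i}=0$).

```latex
\begin{align*}
b_j(c\mathcal{L})\,(c^{d}H)^{\gamma_j}
  &= \prod_{i=1}^{n}\bigl(c^{i}\det\mathcal{L}_i\bigr)^{-\beta_i}\cdot c^{d\gamma_j}H^{\gamma_j} \\
  &= c^{\,d\gamma_j-\sum_{i} i\beta_i}\; b_j(\mathcal{L})\,H^{\gamma_j}
   = b_j(\mathcal{L})\,H^{\gamma_j}, \\[4pt]
\frac{(c^{d}H)^{n}}{(\det c\mathcal{L})^{d}}
  &= \frac{c^{dn}H^{n}}{c^{nd}(\det\mathcal{L})^{d}}
   = \frac{H^{n}}{(\det\mathcal{L})^{d}}, \\[4pt]
\text{leading error term:}\quad \sum_{i} i\beta_i
  &= n\bigl(d-b(n,d)\bigr) + (n-d)\,b(n,d)
   = d\bigl(n-b(n,d)\bigr) = d\gamma_j .
\end{align*}
```

So both the main term and every error term are unchanged under $(\mathcal{L},H)\mapsto(c\mathcal{L},c^{d}H)$, and the leading error term is one admissible pair $(b_{j},\gamma_{j})$.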

Remark.

If $\mathcal{L}$ is of rank $m<n$, we may adapt Theorem 1.3 by observing that, for any isometry $\iota:\mathbb{R}^{m}\rightarrow\mathrm{span}_{\mathbb{R}}(\mathcal{L})$, it holds that $P(\mathcal{L},d,H)=P(\iota^{-1}(\mathcal{L}),d,H)$. The same applies to the results of a similar flavor that we state below.

Remark.

The estimate $|E_{n,d}|<n^{3n}$ is an extremely crude one, provided only to assure that the sum is finite. Describing $E_{n,d}$ exactly from our computations would be quite a laborious task that would not yield a pretty formula and whose fruits seem unclear.

Thus one may feel that the complicated statement of Theorem 1.3 is unnecessary. However, we leave it as it is, since the precise knowledge of at least a part of the error term may be useful for certain applications. For instance, to prove Corollary 1.1 below, we really need the (close-to-)optimal version of the leading error term stated as in Theorem 1.3. If we could compute the coefficients $b_{j}$ ’s optimally for more $j$ ’s, we expect to be able to strengthen Corollary 1.1 accordingly.

We also state the subsequent results in the precise form. If one desires simplification, one may replace $b_{j}$ by an appropriate power of $\lambda_{1}$ ’s, as in (1.1).

Before we go on to discuss a few applications of Theorem 1.3, let us present a few of its variants that may also be of use.

Theorem 1.4.

Let $N(\mathcal{L},d,H)$ be the number of (not necessarily primitive) rank $d$ sublattices of $\mathcal{L}$ of determinant $\leq H$ . Also let

$$c(n,d)=a(n,d)\prod_{i=1}^{d}\zeta(n-i+1).$$

Then similarly to Theorem 1.3, for a full-rank lattice $\mathcal{L}\subseteq\mathbb{R}^{n}$ we have

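$$N(\mathcal{L},d,H)=c(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in F_{n,d}}b^{\prime}_{j}(\mathcal{L})H^{\gamma_{j}}\right),$$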

where the implied constant in the big-O notation depends only on $n$ , and $F_{n,d}$ is a set of indices of cardinality at most $n^{3n}$ . The description of $b^{\prime}_{j}$ (resp. $\gamma_{j}$ ) is the same as that of $b_{j}$ (resp. $\gamma_{j}$ ) in Theorem 1.3. Also, if $d<n-1$ , the leading error term is the same as in Theorem 1.3. If $d=n-1$ , then the largest $\gamma_{j}$ for $j\in F_{n,d}$ may be chosen to be $n-1+\eta$ , for any $\eta>0$ .

Theorem 1.5.

For a lattice $\mathcal{L}\subseteq\mathbb{R}^{n}$ , choose a primitive sublattice $\mathcal{S}\subseteq\mathcal{L}$ of rank $\leq n-d$ , and let $P_{\mathcal{S}}(\mathcal{L},d,H)$ be the number of primitive rank $d$ sublattices of $\mathcal{L}$ whose intersection with $\mathcal{S}$ is trivial. For $\mathcal{L}\subseteq\mathbb{R}^{n}$ of full rank, we have

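$$P_{\mathcal{S}}(\mathcal{L},d,H)=a(n,d)\frac{H^{n}}{(\det\mathcal{L})^{d}}+O\left(\sum_{j\in G_{n,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right),$$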

where again the description of the error term is identical to that of Theorem 1.3, except that $G_{n,d}$ is now a set of cardinality less than $3n^{3n}$ . In particular, the leading error term is the same as in Theorem 1.3. Moreover, this formula is independent of $\mathcal{S}$ .

The analogous statement holds for $N_{\mathcal{S}}(\mathcal{L},d,H)$ , which counts both primitive and non-primitive lattices.

1.2. Applications

Below we demonstrate a few immediate applications of Theorem 1.3 and the techniques used in its proof. Its main strength lies in the fact that it counts all the sublattices, and that it provides information regardless of how skewed the given lattice is, in particular relative to $H$. In comparison, its predecessor, Theorem 1.2, misses the sublattices that nontrivially intersect $\mathcal{L}_{n-d}$, and thus it does not say anything about the lattices for which $\det\mathcal{L}/\det\mathcal{L}_{n-d}>H$.

We expect there to be more uses of Theorem 1.3; for instance, see a recent work of Le Boudec ([7]), which employs the $d=1$ case (due to Schmidt ([12]), see (2.1) below) as the main device.

1.2.1. Rational points of flag varieties

It is natural to expect that a counting formula on Grassmannians should yield a counting formula for general flag varieties. Indeed, Thunder ([20]) derives such a formula as a relatively simple application of Theorem 1.2. We present its simplest case to initiate the discussion:

Theorem 1.6 (Thunder [20], Theorem 5).

Let $\mathcal{L}\subseteq\mathbb{R}^{n}$ be a lattice of rank $n$ , and suppose $H$ is sufficiently large. Then the number of flags $\mathcal{S}^{e}\subseteq\mathcal{S}^{d}\subseteq\mathcal{L}$ of type $(e,d)$ (hence $\mathrm{rk\,}\mathcal{S}^{i}=i$ ) such that $\mathcal{S}^{i}\cap\mathcal{L}_{n-i}=0$ for $i\in\{e,d\}$ , and $\mathcal{S}^{e}\cap\mathcal{S}^{d}_{d-e}=0$ , whose height twisted by $\mathcal{L}$ is at most $H$ is

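$$\frac{aH}{(\det\mathcal{L})^{d}}\log\frac{H}{(\det\mathcal{L})^{d}}+O\left(\frac{H}{(\det\mathcal{L})^{d-b(n,d)(n-d)/n}(\det\mathcal{L}_{n-d})^{b(n,d)}}\right),\tag{1.3}$$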

where $a$ is some explicit constant depending only on $n,d,e$ , and the implicit constant in the error depends only on $n$ .

In this context, the height of the flag $\mathcal{S}^{e}\subseteq\mathcal{S}^{d}$ is the quantity $(\det\mathcal{S}^{e})^{d}(\det\mathcal{S}^{d})^{n-e}$ ; see e.g. Thunder ([20]) for details.

In comparison, we can derive from Theorem 1.3 the following

Corollary 1.1.

Let $\mathcal{L}$ be a lattice of rank $n\geq 3$ , and $H$ be sufficiently large — more precisely, $\lambda_{n}(\mathcal{L})\ll_{n}H^{1/nd}$ . Then the number of flags $\mathcal{S}^{e}\subseteq\mathcal{S}^{d}\subseteq\mathcal{L}$ of type $(e,d)$ such that $\mathcal{S}^{e}\cap\mathcal{S}^{d}_{d-e}=0$ whose height twisted by $\mathcal{L}$ is at most $H$ is

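$$\frac{aH}{(\det\mathcal{L})^{d}}\log\frac{H}{\varepsilon_{e}^{d}\varepsilon_{d}^{n-e}}+O\left(\sum_{j\in E_{n,e,d}}b_{j}(\mathcal{L})H^{\gamma_{j}}\right),\tag{1.4}$$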

where $\varepsilon_{e}=\min\{\det\mathcal{X}:\mathcal{X}\subseteq\mathcal{L},\mathrm{rk\,}\mathcal{X}=e\}$ and likewise for $\varepsilon_{d}$ , $a$ is the same as in (1.3), the implicit constant depends on $n$ only, $E_{n,e,d}$ is an index set of cardinality at most $3n^{3n+1}$ , and $b_{j}(\mathcal{L})$ and $\gamma_{j}$ are as in the statement of Theorem 1.3, except that $\sum_{i}i\beta_{i}=dn\gamma_{j}$ .

Furthermore, the largest $\gamma_{j}$ is $1$ , and the next largest is either

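$$1-\frac{b(n,d)}{n},\quad 1-\frac{b(d,e)(n-e)}{nd},\quad\text{or}\quad 1-\frac{1}{n}\left(1-\frac{2b(d,e)}{d}+\frac{1-b(d,e)}{n-e}\right);$$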

if $d\leq n/2$ , it is always $1-b(n,d)/n=1-1/dn$ .

In order to keep the proof relatively short and simple, we had to keep some of the assumptions made by Theorem 1.6. Still, it has a couple of new implications that may be of interest. First, it shows that there must exist a gap between Theorem 1.6 and an ideal counting formula that would count all the rational flags, and that it must be at least of size

$$O\left(H\log\frac{(\det\mathcal{L})^{d}}{\varepsilon_{e}^{d}\varepsilon_{d}^{n-e}}\right).$$

For a heavily skewed $\mathcal{L}$, for instance one with $\det\mathcal{L}=1$ but $\varepsilon_{d}\ll H^{-1/n}$, this is of comparable size to the main term.

The second implication has to do with the error term in the well-known theorem of Franke, Manin, and Tschinkel ([4]) on the number of rational points on flag varieties. In the corollary to Theorem 5 therein, which says that the number of rational points on a flag variety $V$ of (“untwisted”) height bounded by $H$ is

$$Hp(\log H)+o(H),$$

where $p$ is a polynomial of degree $\mathrm{rk\,}\mathrm{Pic}(V)-1$ , they conjecture that the error term is of size $O(H^{1-\varepsilon})$ with $\varepsilon=(\dim V)^{-1}$ . On the other hand, when $V$ is a Grassmannian, the literature (for instance Schmidt [12], and Thunder [19] [20]) suggests that $\varepsilon=b(n,d)/n$ , as their analyses seem fairly sharp. Our Corollary 1.1 extends this to flag varieties of type $(e,d)$ , suggesting that we have $\varepsilon=b(n,d)/n$ again, at least when $d\leq n/2$ . In case $e\geq n/2$ , one may be able to estimate $\varepsilon=1/(n-e)n$ by duality. But in general the nature of $\varepsilon$ appears rather complicated.

It is possible to prove by a similar argument an analogue of Corollary 1.1 for a flag variety of any type that is strong enough to yield these same implications. On the other hand, we expect the ideal formula that would count all points on a flag variety to require another substantial amount of effort along the lines of the present paper. As stated in the remark after Theorem 1.3, if we could find explicit expressions for every $b_{j}$ in (1.2), preferably containing large powers of $\det\mathcal{L}$ , it would allow our error-bounding techniques to apply immediately. Using the methods of the present paper, it may be possible to achieve this for the first few small values of $d$ .

1.2.2. Mean value theorems over lattices

Mean value theorems over random lattices provide a method of averaging lattice-point counting formulas over the space $X_{n}:=\mathrm{SL}(n,\mathbb{Z})\backslash\mathrm{SL}(n,\mathbb{R})$ of determinant $1$ full-rank lattices in $\mathbb{R}^{n}$ . The first known such theorem is the famous Siegel integration formula:

Theorem 1.7 (Siegel [17], Theorem on p.341).

Let $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ be a Borel measurable and compactly supported function. Then

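$$\int_{X_{n}}\sum_{v\in\mathcal{L}\setminus\{0\}}f(v)\,d\mu_{n}(\mathcal{L})=\int_{\mathbb{R}^{n}}f(x)\,dx,\tag{1.5}$$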

where $d\mu_{n}$ is the normalized Haar measure on $\mathrm{SL}(n,\mathbb{R})$ , and $dx$ is the Lebesgue measure.

For example, if one sets $f$ to be the characteristic function of a set $S\subseteq\mathbb{R}^{n}$ , then the sum on the left-hand side of (1.5) equals $\left|S\cap\mathcal{L}\backslash\{0\}\right|$ , and thus Theorem 1.7 implies that a random lattice sampled according to $d\mu_{n}$ has on average $\mathrm{vol}(S)$ nonzero vectors contained in $S$ . Another important example is the Rogers integral formula, one of the main tools in geometry of numbers (see e.g. [5], [15], [18] for some of the applications):

Theorem 1.8 (Rogers [9], (essentially) Theorem 4).

For $k<n$ , let $f:(\mathbb{R}^{n})^{k}\rightarrow\mathbb{R}$ be a Borel measurable and compactly supported function. Then

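$$\int_{X_{n}}\sum_{\substack{v_{1},\ldots,v_{k}\in\mathcal{L}\\ \text{independent}}}f(v_{1},\ldots,v_{k})\,d\mu_{n}(\mathcal{L})=\int_{\mathbb{R}^{n}}\cdots\int_{\mathbb{R}^{n}}f(x_{1},\ldots,x_{k})\,dx_{1}\cdots dx_{k},$$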

where $d\mu_{n}$ is the normalized Haar measure on $\mathrm{SL}(n,\mathbb{R})$ , and each $dx_{i}$ is the Lebesgue measure.

In the author’s work [6], written concurrently with the present paper, the machinery that turns a lattice-point counting formula, such as Theorems 1.3 and 1.5, into a mean-value theorem has been developed, inspired by the argument of Rogers ([9]). As a result, the following extension of Theorem 1.8 to Grassmannians is obtained from Theorem 1.5:

Theorem 1.9 (Kim [6], Theorem 3).

Suppose $1\leq k<n$ , $1\leq d_{1},\ldots,d_{k}<n$ and $d_{1}+\ldots+d_{k}<n$ . Define $P(\mathcal{L},d_{1},\ldots,d_{k},H_{1},\ldots,H_{k})$ to be the number of independent primitive sublattices $\mathcal{A}_{1},\ldots,\mathcal{A}_{k}$ of $\mathcal{L}$ of ranks $d_{1},\ldots,d_{k}$ and determinants bounded by $H_{1},\ldots,H_{k}$ , respectively. Then

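$$\int_{X_{n}}P(\mathcal{L},d_{1},\ldots,d_{k},H_{1},\ldots,H_{k})\,d\mu_{n}(\mathcal{L})=\prod_{i=1}^{k}a(n,d_{i})H_{i}^{n}.$$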

We note that Thunder ([21]) proved the $k=1$ case of this result. Theorem 1.9 has several implications on the statistics of the randomized heights of the points on Grassmannians and flag varieties, such as the following.

Corollary 1.2 (Kim [6], Corollary to Theorem 5).

Let $k\geq 2$, and $1=d_{0}\leq d_{1}<\ldots<d_{k}<d_{k+1}=n$. For a lattice $\mathcal{L}\subseteq\mathbb{R}^{n}$ and its flag $\mathcal{S}^{d_{1}}\subseteq\ldots\subseteq\mathcal{S}^{d_{k}}\subseteq\mathcal{L}$ of type $\mathfrak{d}=(d_{1},\ldots,d_{k})$, its height is defined as the quantity

$$\prod_{i=1}^{k}(\det\mathcal{S}^{d_{i}})^{d_{i+1}-d_{i-1}}.$$

Let $P(\mathcal{L},\mathfrak{d},H)$ be the number of type $\mathfrak{d}$ flags of $\mathcal{L}$ whose heights are bounded by $H$ . Then

$$\int_{X_{n}}P(\mathcal{L},\mathfrak{d},H)\,d\mu_{n}(\mathcal{L})=\infty.$$

It may be interesting to compare this result with Theorem 1.6 and Corollary 1.1. The author speculates that the divergence here is related to the main term of (1.4) in the statement of Corollary 1.1 being dependent on the skewness of the lattice in question.

1.3. Method of proof

All previous works on this topic ([12], [19], [20]) count “upwards,” i.e. they construct the $d$ -dimensional sublattice from either a $(d-1)$ -dimensional sublattice or a $d$ -dimensional sublattice lying in an $(n-1)$ -dimensional ambient space. Our main idea is to take the dual approach, and count “downwards” instead: we project all the $d$ -dimensional sublattices to a hyperplane, and count the cardinality of each fiber. This lets us bypass some of the technical difficulties that arise when counting upwards.

To elaborate, we prove Theorem 1.3 by the following inductive procedure that resembles the Pascal’s triangle method of computing the binomial coefficients. In case $d=1$ or $d=n-1$ , the formulas are well-known. Otherwise, let $\bar{\mathcal{L}}$ be the projection of $\mathcal{L}$ onto the orthogonal complement of a shortest nonzero vector of $\mathcal{L}$ . Then we have

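$$P(\mathcal{L},d,H)=P\left(\bar{\mathcal{L}},d-1,\frac{H}{\lambda_{1}(\mathcal{L})}\right)+\Phi(P(\bar{\mathcal{L}},d,H)),\tag{1.6}$$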

where $\Phi$ can be regarded as a certain integral transformation. For a choice of a basis $\{v_{1},\ldots,v_{n}\}$ of $\mathcal{L}$ and a sublattice $\mathcal{B}\subseteq\mathcal{L}$ of rank $d$ , let us say $\mathcal{B}$ is of d-type $(\alpha_{1},\ldots,\alpha_{n})$ — “d” stands for “dual” — if the projection of $\mathcal{B}$ onto $\mathrm{span}_{\mathbb{R}}(v_{1},\ldots,v_{n-i+1})^{\perp}$ has rank $\alpha_{i}$ . Then the first term on the right-hand side of (1.6) is counting the sublattices of d-types $(*,\ldots,*,d-1,d)$ , and the second term is counting those of d-types $(*,\ldots,*,d,d)$ .

In comparison, Theorem 1.2 counts precisely the sublattices of d-type $(1,2,\ldots,d,\ldots,d)$ . The upward counting method forces one to count the sublattices of each d-type separately, which is precisely what Thunder refers to as being “cumbersome.” The downward method resolves this difficulty.

Most of this paper is devoted to explicitly writing out and estimating $\Phi(P(\bar{\mathcal{L}},d,H))$ . Many parts of the computation can be done by slightly refining the methods of Schmidt ([12]) or Thunder ([20]). However, the fact that $\mathcal{L}$ can be arbitrarily skewed presents a new difficulty, especially when bounding the error terms. This is resolved by comparing the gaps between the successive minima to $H^{1/d}$ : if $\lambda_{i+1}-\lambda_{i}\ll H^{1/d}$ for all $i=1,\ldots,n-1$ , the lattice may be considered not so severely skewed, as the classical techniques continue to apply. If in contrast $\lambda_{i+1}-\lambda_{i}\gg H^{1/d}$ for some $i$ , we exploit this gap to finesse the desired error bound.

1.4. Organization

In Section 2, we introduce the definitions and notations used throughout the paper, and state the known formulas for $P(\mathcal{L},1,H)$ and $P(\mathcal{L},n-1,H)$ . In Section 3, we set up the induction argument, establishing the precise version of (1.6). Sections 4 and 5 are devoted to the main and error term estimates, respectively. Section 6 collects all the computations and concludes the proof of Theorem 1.3. The variants are all proved in Section 7.

1.5. Acknowledgment

Part of this work is supported by NSF grant CNS-2034176. The author thanks the anonymous referee, Lillian Pierce, Anders Södergren, and Jeffery Thunder for helpful comments and suggestions.

2. Background

2.1. Definitions, notations, and conventions

Unless mentioned otherwise, the definitions and notations of this section apply.

Generalities

The lowercase letter $p$ denotes a prime. Let us write $\Gamma=\mathrm{GL}(d,\mathbb{Z})$ for short. We use capital letters such as $L,M$ to refer to matrices, and calligraphic fonts such as $\mathcal{L},\mathcal{M}$ to denote lattices. $n\in\mathbb{Z}_{>0}$ and $d\in\{1,\ldots,n-1\}$ are fixed integers throughout the paper.

As in the statement of Theorem 1.2, $\lambda_{i}(\mathcal{L})$ is the $i$ -th successive minimum of $\mathcal{L}$ , and $\mathcal{L}_{i}$ denotes (a choice of) a primitive $i$ -dimensional sublattice containing $v_{1},\ldots,v_{i}\in\mathcal{L}$ , which are linearly independent with $\|v_{i}\|=\lambda_{i}(\mathcal{L})$ . The $(i,j)$ -entry of a matrix is denoted by the lowercase letter of the name of the matrix indexed by $ij$ . For example, if $A$ is a $d\times n$ matrix, then $A=(a_{ij})_{1\leq i\leq d\atop 1\leq j\leq n}$ . Similarly, if $x\in\mathbb{R}^{n}$ , then the $i$ -th entry of $x$ is denoted by $x_{i}$ .

Later, given a $d\times(n-1)$ matrix $A$ and a $d\times 1$ vector $v$ , we will need to consider the $d\times n$ matrix $B$ whose $i$ -th row equals $(a_{i1},\ldots,a_{i,n-1},v_{i})$ . We write $B=(A;v)$ to describe such a matrix.

For two quantities $f$ and $g$ , $f\ll g$ means $f<Cg$ , where $C$ is a positive constant possibly depending on $d$ and $n$ but no other variables. $f\sim g$ means $f\ll g$ and $g\ll f$ . For example, Minkowski’s second theorem says that $\det\mathcal{L}\sim\prod\lambda_{i}(\mathcal{L})$ .

For two matrices $A$ and $B$ with $d$ rows, $A\sim B$ means they differ by the left multiplication by an element of $\Gamma$ . If the rows of each of $A$ and $B$ respectively span $\mathcal{A}$ and $\mathcal{B}$ in $\mathrm{Gr}(\mathcal{M},d)$ (whose precise definition is given below), $A\sim B$ means that $\mathcal{A}=\mathcal{B}$ .

Later in the paper, we will need a few facts from reduction theory. Let $\{v_{1},\ldots,v_{m}\}$ be a basis of a lattice $\mathcal{M}$, and $\{v_{1}^{*},\ldots,v_{m}^{*}\}$ be its Gram-Schmidt orthogonalization, that is, each $v_{i}^{*}$ is the projection of $v_{i}$ to the orthogonal complement of $\mathrm{span}(v_{1},\ldots,v_{i-1})$. Let us say the basis is reduced if $\|v_{i}\|\sim\lambda_{i}(\mathcal{M})$ for each $i$ and $|\langle v_{i}^{*},v_{j}\rangle|/\|v_{i}^{*}\|^{2}\leq 1/2$ for all $i<j$. It is known by reduction theory (see e.g. [2, Chapter 1]) that any lattice has a reduced basis. Moreover, the LLL algorithm ([8]) outputs a reduced basis of any lattice, given any basis of that lattice.
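The second condition in the definition of a reduced basis is easy to test numerically. The sketch below computes the Gram-Schmidt vectors and checks $|\langle v_{i}^{*},v_{j}\rangle|/\|v_{i}^{*}\|^{2}\leq 1/2$ for all $i<j$. It deliberately does not test $\|v_{i}\|\sim\lambda_{i}(\mathcal{M})$, since that condition involves an unspecified constant; the function names are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Gram-Schmidt orthogonalization (without normalization) of the rows of `basis`."""
    ortho = []
    for v in basis:
        w = [float(x) for x in v]
        for u in ortho:
            mu = dot(v, u) / dot(u, u)
            w = [wi - mu * ui for wi, ui in zip(w, u)]
        ortho.append(w)
    return ortho


def is_size_reduced(basis, bound=0.5, tol=1e-12):
    """Check |<v_i^*, v_j>| / ||v_i^*||^2 <= bound for all i < j."""
    ortho = gram_schmidt(basis)
    m = len(basis)
    return all(
        abs(dot(ortho[i], basis[j])) / dot(ortho[i], ortho[i]) <= bound + tol
        for i in range(m)
        for j in range(i + 1, m)
    )
```

For example, `[[1, 0], [0.4, 1]]` passes the check, while `[[1, 0], [0.9, 1]]` fails it and would be repaired by replacing the second vector with `[-0.1, 1]`.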

$\mathrm{Gr}(\mathcal{M},d)$ and the determinant/height

A $d\times n$ integral matrix $X\in\mathrm{Mat}_{d\times n}(\mathbb{Z})$ is said to be primitive if $X$ can be completed to an element of $\mathrm{GL}(n,\mathbb{Z})$ . When $d=1$ , this agrees with the standard definition of a primitive vector. We denote the set of all primitive $d\times n$ matrices by $\mathrm{Mat}^{pr}_{d\times n}(\mathbb{Z})$ .
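Primitivity of an integral matrix can be tested without constructing the completion. The sketch below relies on a standard fact that is not stated in the text: a $d\times n$ integer matrix extends to an element of $\mathrm{GL}(n,\mathbb{Z})$ exactly when the gcd of its $d\times d$ minors is $1$. The helper names are hypothetical, and the Laplace-expansion determinant is only sensible for small $d$.

```python
from itertools import combinations
from math import gcd


def int_det(m):
    """Exact integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * int_det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def is_primitive(x):
    """True if the gcd of all maximal minors of the d x n integer matrix x equals 1."""
    d, n = len(x), len(x[0])
    g = 0
    for cols in combinations(range(n), d):
        minor = int_det([[row[c] for c in cols] for row in x])
        g = gcd(g, minor)
    return g == 1
```

For $d=1$ this reduces to the usual gcd test for a primitive vector: `is_primitive([[2, 3]])` holds, whereas `is_primitive([[2, 4]])` does not.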

For a lattice $\mathcal{M}\subseteq\mathbb{R}^{n}$ of rank $m\leq n$, a sublattice $\mathcal{A}\subseteq\mathcal{M}$ is said to be primitive if $\mathrm{span}_{\mathbb{R}}(\mathcal{A})\cap\mathcal{M}=\mathcal{A}$. We write $\mathrm{Gr}(\mathcal{M},d)$ for the set of all rank $d$ primitive sublattices of $\mathcal{M}$ inside $\mathbb{R}^{n}$. Choose a basis $\{v_{1},\ldots,v_{m}\}$ of $\mathcal{M}$, and a basis $\{w_{1},\ldots,w_{d}\}$ of $\mathcal{A}\in\mathrm{Gr}(\mathcal{M},d)$. Let $M$ and $A$ respectively denote the $m\times n$ and $d\times m$ matrices such that the $i$-th row of $M$ is $v_{i}$, and the $i$-th row of $AM$ is $w_{i}$. One checks that $A\in\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})$ by the fact that $\mathcal{A}\in\mathrm{Gr}(\mathcal{M},d)$.

Suppose one chooses a different basis $\{w^{\prime}_{1},\ldots,w^{\prime}_{d}\}$ of $\mathcal{A}$ , and let $A^{\prime}\in\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})$ be the matrix such that the $i$ -th row of $A^{\prime}M$ is $w^{\prime}_{i}$ . Then $A=gA^{\prime}$ for some $g\in\Gamma$ . Conversely, if $A=gA^{\prime}$ for some $g\in\Gamma$ and $A,A^{\prime}\in\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})$ , then the rows of $AM$ and $A^{\prime}M$ span the same element of $\mathrm{Gr}(\mathcal{M},d)$ . Therefore, with a choice of $M\in\mathrm{Mat}_{m\times n}(\mathbb{R})$ whose rows span $\mathcal{M}$ over $\mathbb{Z}$ , there exists a bijection between $\mathrm{Gr}(\mathcal{M},d)$ and the set of orbits $\Gamma\backslash(\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})\cdot M)$ of the action of $\Gamma$ on $\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})\cdot M$ by left multiplication.

To make this even more explicit, recall that each element of $\Gamma\backslash\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})$ is uniquely represented by a primitive Hermite normal form over $\mathbb{Z}$ . Thus there is also a bijection between $\mathrm{Gr}(\mathcal{M},d)$ and the set of all elements of form $AM$ where $A$ is a $d\times m$ primitive Hermite normal form over $\mathbb{Z}$ . Whenever convenient, we will use these identifications of $\mathrm{Gr}(\mathcal{M},d)$ interchangeably throughout the paper.

In order to simplify some notations, we adopt the unusual convention that all determinants are nonnegative (the groups $\mathrm{GL}(n,\cdot)$ and $\mathrm{SL}(n,\cdot)$ maintain their usual meanings, though). Specifically, for a square matrix $X$ , we write $\det X$ for the absolute value of its usual definition of determinant. For a non-square matrix $X$ , we define $\det X=\sqrt{\det XX^{\mathrm{tr}}}$ . For a lattice $\mathcal{A}\subseteq\mathbb{R}^{n}$ , we define $\det\mathcal{A}$ to be its covolume within its $\mathbb{R}$ -span. For $\mathcal{A}\in\mathrm{Gr}(\mathcal{M},d)$ , note that $\det\mathcal{A}=\det AM$ holds, where $M$ is any choice of a matrix whose row vectors form a basis of $\mathcal{M}$ , and $A$ is any choice of an element of $\mathrm{Mat}^{pr}_{d\times m}(\mathbb{Z})$ such that the row vectors of $AM$ form a basis of $\mathcal{A}$ .
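The determinant convention above translates directly into a small routine: form the Gram matrix $XX^{\mathrm{tr}}$, take its determinant, and return the square root. This is only a sketch with illustrative names; for a square matrix it returns the absolute value of the usual determinant, consistent with the convention that all determinants are nonnegative.

```python
import math


def det(m):
    """Determinant of a small square matrix by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def generalized_det(x):
    """sqrt(det(X X^tr)): the covolume of the lattice spanned by the rows of x."""
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in x] for r1 in x]
    # abs() guards against tiny negative values caused by floating-point rounding.
    return math.sqrt(abs(det(gram)))
```

For instance, the rank-2 sublattice of $\mathbb{Z}^{3}$ spanned by $(1,0,0)$ and $(0,1,1)$ has Gram matrix $\mathrm{diag}(1,2)$, hence determinant $\sqrt{2}$.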

For a matrix $M$ , we define

[TABLE]

and similarly for a lattice $\mathcal{M}$ .

It is easy to see that, for any compactly supported function $F$ defined on a subset of $\mathbb{R}$ , we have

[TABLE]

where $\mathcal{B}M$ is understood as the image of $\mathcal{B}$ by the linear map $\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ induced by the matrix $M$ . Again we will switch freely between these notations as we see fit.

Orthogonality notions

Following Schmidt ([12]), we define the polar lattice $\mathcal{M}^{P}$ of $\mathcal{M}$ by $\mathcal{M}^{P}=\{w\in\mathrm{span}_{\mathbb{R}}(\mathcal{M}):\langle v,w\rangle\in\mathbb{Z},\forall v\in\mathcal{M}\}$ . If $\mathcal{S}\in\mathrm{Gr}(\mathcal{M},d)$ , we define its orthogonal lattice $\mathcal{S}^{\perp}\in\mathrm{Gr}(\mathcal{M}^{P},m-d)$ by $\mathcal{S}^{\perp}=\{w\in\mathcal{M}^{P}:\langle v,w\rangle=0,\forall v\in\mathcal{S}\}$ .

In addition, for an $m\times n$ matrix $M$ whose $i$-th row vector is denoted by $v_{i}$, we define its polar matrix $M^{P}$ as the $m\times n$ matrix whose $j$-th row vector $v^{P}_{j}$ lies in $\mathrm{span}_{\mathbb{R}}(v_{1},\ldots,v_{m})$ and satisfies $\langle v_{i},v^{P}_{j}\rangle=\delta_{ij}$. Then the rows of $M^{P}$ generate the polar lattice of the lattice generated by the rows of $M$.

2.2. Base cases

In case $d=1$ , Theorem 1.3 is precisely Theorem 4 in [20] (also Lemma 2 of [12]), which states that

[TABLE]

Below in Lemma 3.6, we present an extension of (2.1) to an affine lattice, which we will need later.

In case $d=n-1$, we apply to (2.1) the duality theorem (see Section 2 of [20]), which states that, for a sublattice $\mathcal{S}\subseteq\mathcal{L}$ and its orthogonal lattice $\mathcal{S}^{\perp}\subseteq\mathcal{L}^{P}$,

[TABLE]

holds, and thus

[TABLE]

Therefore (2.1) implies

[TABLE]

By the well-known facts that $\det\mathcal{L}\cdot\det\mathcal{L}^{P}=1$ and $\lambda_{i}(\mathcal{L})\lambda_{n-i+1}(\mathcal{L}^{P})\geq 1$ (in fact, $\lambda_{i}(\mathcal{L})\lambda_{n-i+1}(\mathcal{L}^{P})\sim 1$ , by [1, Theorem 2.1]),

[TABLE]

so we can rewrite the above as

[TABLE]

3. Division into two parts

3.1. Preliminaries

Until the end of Section 6, we fix $n\geq 4$ and $2\leq d\leq n-2$ . We will divide $P(\mathcal{L},d,H)$ into two parts, and deal with them one at a time. We induct on $n$ , assuming that $P$ has been computed for all lattices of rank $<n$ .

Throughout the rest of the paper, we fix a basis $\{v_{1},\ldots,v_{n}\}$ of $\mathcal{L}$, and denote by $L$ the $n\times n$ matrix whose $i$-th row is $v_{i}$. Define $\bar{\mathcal{L}}=\mathcal{L}/\langle v_{n}\rangle$, and identify it with the projection of $\mathcal{L}$ onto the subspace of $\mathbb{R}^{n}$ orthogonal to $v_{n}$; that is, we think of $\bar{\mathcal{L}}$ as a subset of $\mathbb{R}^{n}$. Let $\bar{v}_{i}$ be the component of $v_{i}$ orthogonal to $v_{n}$, so that $v_{i}=\bar{v}_{i}+a_{i}v_{n}$ for some $a_{i}\in\mathbb{R}$ and $\bar{\mathcal{L}}=\mathrm{span}_{\mathbb{Z}}(\bar{v}_{1},\ldots,\bar{v}_{n-1})$. Also let $\bar{L}$ be the $(n-1)\times n$ matrix whose $i$-th row is $\bar{v}_{i}$.
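The projection can be made concrete with exact rational arithmetic. The sketch below takes $a_{i}=\langle v_{i},v_{n}\rangle/\langle v_{n},v_{n}\rangle$, forms $\bar{v}_{i}=v_{i}-a_{i}v_{n}$, and checks the standard identity $\det\mathcal{L}=\|v_{n}\|\det\bar{\mathcal{L}}$ in its squared (Gram determinant) form. The example basis and the function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det(m):
    """Signed determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def gram_det(rows):
    """det(X X^tr) for the matrix X with the given rows (the squared lattice determinant)."""
    return det([[dot(u, v) for v in rows] for u in rows])

def project_away_from_last(rows):
    """Return the rows bar_v_i = v_i - a_i v_n (i < n) and the coefficients a_i."""
    vn = rows[-1]
    coeffs = [Fraction(dot(v, vn), dot(vn, vn)) for v in rows[:-1]]
    bar_rows = [[x - a * y for x, y in zip(v, vn)] for v, a in zip(rows[:-1], coeffs)]
    return bar_rows, coeffs

# Hypothetical basis of a rank 3 lattice; the last row plays the role of v_n.
L = [[1, 0, 0], [1, 2, 0], [1, 1, 3]]
v_n = L[-1]
bar_L, a = project_away_from_last(L)

print(a)                                   # the coefficients a_1, a_2
print(gram_det(L), gram_det(bar_L) * dot(v_n, v_n))
```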

We write

$$P(\mathcal{L},d,H)=P^{1}(\mathcal{L},d,H)+P^{2}(\mathcal{L},d,H),$$

where $P^{1}(\mathcal{L},d,H)$ is the number of rank $d$ primitive sublattices of $\mathcal{L}$ of determinant $\leq H$ whose projection to $\bar{\mathcal{L}}$ is also of rank $d$, and $P^{2}(\mathcal{L},d,H)$ is the number of those whose projection is of rank $d-1$. Equivalently, $P^{1}$ counts primitive sublattices whose $\mathbb{R}$-span does not contain $v_{n}$, and $P^{2}$ counts those whose $\mathbb{R}$-span does.

As discussed in Section 1.4 above, we may identify $\mathcal{A}\in\mathrm{Gr}(\mathcal{L},d)$ with an orbit $\Gamma ML$ of the left multiplication of $\Gamma$ on $\mathrm{Mat}^{pr}_{d\times n}(\mathbb{Z})\cdot L$ , for some $M=(c_{ij})_{1\leq i\leq d\atop 1\leq j\leq n}\in\mathrm{Mat}^{pr}_{d\times n}(\mathbb{Z})$ . Also, let $\tilde{L}$ be the $n\times n$ matrix whose $i$ -th row vector equals $\bar{v}_{i}$ for $1\leq i\leq n-1$ , and $v_{n}$ for $i=n$ , so that

$$L=\begin{pmatrix}1&&&a_{1}\\&\ddots&&\vdots\\&&1&a_{n-1}\\&&&1\end{pmatrix}\tilde{L}.$$

Then we can also write $\mathcal{A}$ in the form $\Gamma(C;c+c^{\prime})\tilde{L}$ , where $C=(c_{ij})_{1\leq i\leq d\atop 1\leq j\leq n-1}$ is the first $d\times(n-1)$ submatrix of $M$ , and $c=(c_{1n},\ldots,c_{dn})^{\mathrm{tr}}$ and $c^{\prime}=(\sum_{j}a_{j}c_{1j},\ldots,\sum_{j}a_{j}c_{dj})^{\mathrm{tr}}$ are vectors in $\mathbb{R}^{d}$ .

3.2. Computing $P^{2}(\mathcal{L},d,H)$

Consider first the case $\mathrm{rank\,}\,C=d-1$, so that $\mathcal{A}$ contributes to $P^{2}$. We may assume that $M$ is a Hermite normal form, so that $C$ is too. Because $M$ is primitive, so is $C$, and the $d$-th entry of the vectors $c$ and $c^{\prime}$ must be equal to $1$ and $0$ respectively. This forces each of the other entries of $c+c^{\prime}$ to have only one choice modulo the left action of $\Gamma$. Thus

$$P^{2}(\mathcal{L},d,H)=P\left(\bar{\mathcal{L}},d-1,\frac{H}{\|v_{n}\|}\right),\tag{3.1}$$

to which we can simply apply Theorem 1.3 (see the remark after its statement).

3.3. Some lemmas

Working with $P^{1}$ is much more involved. Most of the remainder of this paper is devoted to this task. The goal of this section is to derive the expression (3.5) for $P^{1}$ that is amenable to computation.

We start by recalling the standard choice of the representatives of the right cosets of $\Gamma$ in the double coset $\Gamma a\Gamma$ , where $a\in\mathrm{Mat}_{d\times d}(\mathbb{Z})$ has determinant $k>0$ . Such a representative, say $h=(h_{ij})_{1\leq i,j\leq d}$ , is a lower triangular matrix with determinant $k$ , with the condition that $0\leq h_{ji}<h_{ii}$ for all $j>i$ . Of course, $\Gamma h\subseteq\Gamma a\Gamma$ if and only if $a$ and $h$ have the same invariant factors.

Lemma 3.1.

Given an integral $d\times n$ matrix $(C;c)$ with $\mathrm{rank\,}\,C=d$, there exists a unique triple $(h,B,b)$, where $h$ is one of the right coset representatives described above, $B$ is a $d\times(n-1)$ primitive Hermite normal form of rank $d$, and $b\in\mathbb{Z}^{d}$, such that $(C;c)\sim(hB;b)$.

Proof.

By the theory of the Smith normal form, we have $(C;c)\sim(aB_{0};b_{0})$ where $a$ is an invariant factor matrix (that is, $a=\mathrm{diag}(a_{1},\ldots,a_{d})$ with $a_{i}|a_{i+1}$), $B_{0}$ is a primitive $d\times(n-1)$ matrix of full rank, and $b_{0}\in\mathbb{Z}^{d}$. Write $B_{0}=\gamma B$, where $B$ is the Hermite normal form of $B_{0}$ and $\gamma\in\Gamma$. Then there exist $\gamma^{\prime}\in\Gamma$ and a coset representative $h$ of $\Gamma a\Gamma$ such that $\gamma^{\prime}h=a\gamma$. Therefore, writing $b=\gamma^{\prime-1}b_{0}$, we have $(C;c)\sim(hB;b)$.

Suppose we have another triple $(h^{\prime},B^{\prime},b^{\prime})$ such that $(hB;b)\sim(h^{\prime}B^{\prime};b^{\prime})$. This is possible only if the row vectors of $B$ and $B^{\prime}$ generate the same lattice. Since both $B$ and $B^{\prime}$ are in Hermite normal form, $B=B^{\prime}$. This in turn implies $h=h^{\prime}$ and $b=b^{\prime}$. ∎

Lemma 3.2.

Again given an integral $d\times n$ matrix $(C;c)$ , write $C=\gamma aB$ , where $\gamma\in\Gamma$ , $a=\mathrm{diag}(a_{1},\ldots,a_{d})$ is an invariant factor matrix, and $B$ is primitive. Thus $(C;c)\sim(aB;\gamma^{-1}c)=(aB;b)$ , where $b:=\gamma^{-1}c$ .

Then $(aB;b)$ is primitive if and only if $a_{1}=\ldots=a_{d-1}=1$ and $b_{d}$ is coprime to $a_{d}$ .

Proof.

Without loss of generality, we may assume $B$ to be the matrix which has $1$'s on the diagonal and $0$'s elsewhere. Then $(aB;b)$ is imprimitive if and only if there exist integers $0\leq r_{i}<a_{i}$ for $i=1,\ldots,d$, not all zero, such that $(r_{1},\ldots,r_{d},0,\ldots,0,\sum_{i}b_{i}r_{i}/a_{i})\in\mathbb{Z}^{n}$, or equivalently $\sum_{i}b_{i}r_{i}/a_{i}\in\mathbb{Z}$.

Suppose $a_{d-1}\neq 1$ . We claim that, for any $b_{d-1}$ and $b_{d}$ , $b_{d-1}r_{d-1}/a_{d-1}+b_{d}r_{d}/a_{d}\in\mathbb{Z}$ for a nontrivial choice of the $r$ ’s. There exists a prime $p$ such that $p|a_{d-1}$ and $p|a_{d}$ , so it suffices to find a nontrivial solution to the expression $b_{d-1}r_{d-1}+b_{d}r_{d}\equiv 0(\mathrm{mod}\,p)$ . But this is clearly possible.

Next suppose $a_{d-1}=1$ . We are led to consider the condition $b_{d}r_{d}/a_{d}\in\mathbb{Z}$ . This is impossible if and only if $(b_{d},a_{d})=1$ , which completes the proof. ∎
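The criterion of Lemma 3.2 can be tested on small examples. The sketch below uses the standard characterization that a full-rank integral $d\times n$ matrix is primitive exactly when the gcd of its $d\times d$ minors equals $1$; we assume this agrees with the notion of primitivity used for $\mathrm{Mat}^{pr}$. The examples take $d=2$, $n=3$ and $B$ the $2\times 2$ identity, so that $(aB;b)$ has rows $(a_{1},0,b_{1})$ and $(0,a_{2},b_{2})$. The function names are ours.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Signed integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_primitive(m):
    """True iff the gcd of all maximal (d x d) minors of the d x n matrix m is 1."""
    d, n = len(m), len(m[0])
    g = 0
    for cols in combinations(range(n), d):
        g = gcd(g, det([[row[c] for c in cols] for row in m]))
    return g == 1

# a = diag(1, 6), b = (5, 5): gcd(b_2, a_2) = 1, so the lemma predicts primitive.
print(is_primitive([[1, 0, 5], [0, 6, 5]]))
# a = diag(1, 6), b = (5, 4): gcd(b_2, a_2) = 2, so the lemma predicts imprimitive.
print(is_primitive([[1, 0, 5], [0, 6, 4]]))
# a = diag(2, 2): a_1 != 1, so the lemma predicts imprimitive for every b.
print(is_primitive([[2, 0, 1], [0, 2, 1]]))
```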

Lemma 3.3.

Write $e(p^{\alpha})=\mathrm{diag}(1,\ldots,1,p^{\alpha})$ . Then the necessary and sufficient condition for $h\in\mathrm{Mat}_{d\times d}(\mathbb{Z})$ to be one of the standard form right coset representatives of $\Gamma$ in $\Gamma e(p^{\alpha})\Gamma$ is as follows: $h$ is a lower triangular matrix with $h_{ii}=p^{a_{i}}$ , where $a_{i}\geq 0$ and $\sum a_{i}=\alpha$ , $0\leq h_{ji}<h_{ii}$ for $j>i$ , and in addition if $i<j$ are two indices such that $a_{i},a_{j}\geq 1$ and $a_{i+1}=\ldots=a_{j-1}=0$ — i.e. all diagonal entries between $h_{ii}$ and $h_{jj}$ are trivial — then $(h_{ji},p)=1$ .

Proof.

Let $h$ be a coset representative of some double coset of a matrix of determinant $p^{\alpha}$ , in the form that we chose in the beginning of this section. Then all but the last condition are automatically satisfied. For the last condition, choose the three smallest indices $i<j<k$ for which $a_{i},a_{j},a_{k}>0$ . We consider the $3\times 3$ matrix

$$\begin{pmatrix}p^{a_{i}}&0&0\\h_{ji}&p^{a_{j}}&0\\h_{ki}&h_{kj}&p^{a_{k}}\end{pmatrix}.\tag{3.2}$$

We will show that this matrix has invariant factors $(1,1,p^{a_{i}+a_{j}+a_{k}})$ if and only if $h_{ji}$ and $h_{kj}$ are coprime to $p$ . Then the proof is complete because we can repeatedly apply this argument to $h$ to compute the invariant factors of $h$ .

If $h_{ji}$ and $p$ are coprime, there exist integers $x,y$ such that $yh_{ji}-xp^{a_{i}}=1$ , so that the matrix

[TABLE]

has determinant $1$ . Multiplying this on the left of (3.2), we have

[TABLE]

which, upon multiplying by suitable elements of $\Gamma$ from both sides, becomes

[TABLE]

If furthermore $h_{kj}$ is coprime to $p$ , then so is $h_{kj}-yp^{a_{j}}h_{ki}$ , so we can use the same trick to see that (3.2) has invariant factors $(1,1,p^{a_{i}+a_{j}+a_{k}})$ indeed.

Now go back to (3.2) and consider the case $h_{ji}=cp^{b}$ ; we can assume $1\leq b<a_{j}$ and $(c,p)=1$ . We restrict our attention to the $2\times 2$ upper-left corner submatrix of (3.2), and temporarily use $\approx$ to denote the equivalence under the left and right multiplication by $\Gamma$ . Then, by a similar argument as earlier, for an appropriate integer $y$ ,

[TABLE]

so $p^{b}$ appears as one of the invariant factors.

∎

Lemma 3.4.

Write $e(k)=\mathrm{diag}(1,\ldots,1,k)$ , as in the previous lemma. Then the number of the right cosets of $\Gamma$ in $\Gamma e(k)\Gamma$ equals

[TABLE]

Proof.

From the general theory of Hecke operators (see Chapter 3 of Shimura [16]), it suffices to prove the lemma for the case $k=p^{\alpha}$ . We proceed by induction on $\alpha$ .

In case $\alpha=1$, there exist $p^{d-i}$ coset representatives that have $a_{ii}=p$ and $a_{jj}=1$ for all $j\neq i$. This exhausts all the representatives of $\Gamma e(p)\Gamma$, so the lemma holds true in this case.

For the general case, it suffices to match, to each representative $h$ of $\Gamma e(p^{\alpha-1})\Gamma$, $p^{d-1}$ representatives of $\Gamma e(p^{\alpha})\Gamma$, different for each $h$. Suppose $j$ is the smallest number for which $h_{jj}$ is a power of $p$. Then modifying $h_{jj}$ to $ph_{jj}$ and $h_{kj}(k>j)$ to $h_{kj}+c_{k}h_{jj}$, for any choice of $0\leq c_{k}<p$, yields a representative of $\Gamma e(p^{\alpha})\Gamma$, accounting for $p^{d-j}$ out of $p^{d-1}$ total. Also, for each $i<j$, replacing $h_{ii}(=1)$ by $p$, a choice of each $h_{ki}$ $(k\neq j)$ from $\{0,\ldots,p-1\}$ and of $h_{ji}$ from $\{1,\ldots,p-1\}$ ($h_{ji}$ cannot be $0$ by the previous lemma) yields a representative of $\Gamma e(p^{\alpha})\Gamma$, and there are $p^{d-i-1}(p-1)$ of this kind. Therefore, for each $h$ there is a total of $p^{d-j}+p^{d-j}(p-1)+p^{d-j+1}(p-1)+\ldots+p^{d-2}(p-1)=p^{d-1}$ coset representatives of $\Gamma e(p^{\alpha})\Gamma$ constructed in this manner, as desired. It remains to show that these representatives do not overlap with those constructed from a different choice of $h$. But this is immediate since, given a representative of $\Gamma e(p^{\alpha})\Gamma$, one can read off which representative of $\Gamma e(p^{\alpha-1})\Gamma$ it came from, by discarding the first factor of $p$ that appears in its diagonal. ∎
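The induction in this proof gives $p^{(d-1)(\alpha-1)}(p^{d}-1)/(p-1)$ right cosets in $\Gamma e(p^{\alpha})\Gamma$: there are $\sum_{i=1}^{d}p^{d-i}$ for $\alpha=1$, and each further power of $p$ multiplies the count by $p^{d-1}$. The brute-force sketch below checks this for small $d$ and $k$. It enumerates the standard lower triangular representatives of determinant $k$ and keeps those with invariant factors $(1,\ldots,1,k)$, which we detect by the classical criterion that the gcd of the $(d-1)\times(d-1)$ minors equals $1$. This criterion is used here in place of the condition of Lemma 3.3, and all function names are ours.

```python
from itertools import combinations, product
from math import gcd

def det(m):
    """Signed integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def diagonals(remaining, slots):
    """Yield all tuples of `slots` positive integers whose product is `remaining`."""
    if slots == 0:
        if remaining == 1:
            yield ()
        return
    for first in range(1, remaining + 1):
        if remaining % first == 0:
            for rest in diagonals(remaining // first, slots - 1):
                yield (first,) + rest

def standard_representatives(d, k):
    """Lower triangular d x d matrices with det k and 0 <= h[j][i] < h[i][i] for j > i."""
    below = [(j, i) for i in range(d) for j in range(i + 1, d)]
    for diag in diagonals(k, d):
        for values in product(*[range(diag[i]) for (_, i) in below]):
            h = [[0] * d for _ in range(d)]
            for i in range(d):
                h[i][i] = diag[i]
            for (j, i), v in zip(below, values):
                h[j][i] = v
            yield h

def has_invariant_factors_1_1_k(h):
    """True iff the gcd of the (d-1) x (d-1) minors of h is 1 (requires d >= 2)."""
    idx = range(len(h))
    g = 0
    for rows in combinations(idx, len(h) - 1):
        for cols in combinations(idx, len(h) - 1):
            g = gcd(g, det([[h[r][c] for c in cols] for r in rows]))
    return g == 1

def count_cosets(d, k):
    return sum(1 for h in standard_representatives(d, k) if has_invariant_factors_1_1_k(h))

def expected_for_prime_power(d, p, alpha):
    return p ** ((d - 1) * (alpha - 1)) * (p ** d - 1) // (p - 1)

print(count_cosets(3, 4), expected_for_prime_power(3, 2, 2))
```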

3.4. A computable expression for $P^{1}(\mathcal{L},d,H)$

For $\mathcal{A}\in\mathrm{Gr}(\mathcal{L},d)$, recall we defined $f_{H}(\mathcal{A})=1$ if $\det\mathcal{A}\leq H$ and $0$ otherwise. Also, as in the statement of Lemma 3.4, write $e(k):=\mathrm{diag}(1,\ldots,1,k)$. Thanks to Lemmas 3.1, 3.2 and 3.4, we can rewrite $P^{1}(\mathcal{L},d,H)$ as

[TABLE]

where the sum over $h$ is taken over all coset representatives of $\Gamma e(k)\Gamma$ in the standard form.

Fix $h,k,B$ for a moment, and consider the innermost summation in (3.3). For some $B^{\prime}\sim B$ , it is equal to (cf. Lemma 3.2)

[TABLE]

where $\mu$ is the Möbius function, and we wrote

[TABLE]

for short. Note that $v_{n}$ is a row vector, whereas $b$ and $t$ are column vectors.

Temporarily write $X=(e(l)b+e(k)t)v_{n}$ and $Y=e(k)B^{\prime}\bar{L}$ . We need to compute the determinant of $X+Y$ . First observe that

[TABLE]

because $v_{n}\bar{L}^{\mathrm{tr}}=0$ , and also

[TABLE]

This motivates the use of the matrix-determinant lemma, which asserts that for a $d\times d$ matrix $A$ and (row) vectors $x,y\in\mathbb{R}^{d}$ , $\det(A+x^{\mathrm{tr}}y)=\det A\cdot(1+yA^{-1}x^{\mathrm{tr}})$ . To this end, we also need the following lemma.

Lemma 3.5.

Let $Y$ be a full-rank $d\times n$ matrix whose $i$-th row equals $y_{i}\in\mathbb{R}^{n}$. Let $z_{1},\ldots,z_{d}\in\mathbb{R}^{n}$ be the basis of the polar lattice of the lattice spanned by $y_{1},\ldots,y_{d}$ that satisfies $\langle z_{i},y_{j}\rangle=\delta_{ij}$. Let $Z=Y^{P}$ be the $d\times n$ matrix whose $i$-th row equals $z_{i}$. Then the inverse of $YY^{\mathrm{tr}}$ is given by $ZZ^{\mathrm{tr}}$.

Proof.

Complete $Y$ to an invertible $n\times n$ matrix $\bar{Y}=\binom{Y}{Y^{\prime}}$ , such that the rows of $Y^{\prime}$ are orthogonal to the rows of $Y$ . Similarly complete $Z$ to $\bar{Z}=\binom{Z}{Z^{\prime}}$ , so that the rows of $\bar{Z}$ form the dual basis to that formed by the rows of $\bar{Y}$ . Then the rows of $Z^{\prime}$ are orthogonal to the rows of $Z$ as well.

Since $\bar{Z}$ and $\bar{Y}^{\mathrm{tr}}$ are inverses of each other, we have $\bar{Y}\bar{Y}^{\mathrm{tr}}\bar{Z}\bar{Z}^{\mathrm{tr}}=I$ . By abuse of language, write $Y=\binom{Y}{0},Y^{\prime}=\binom{0}{Y^{\prime}}$ , and similarly with $Z$ . Then

[TABLE]

and observe that the first term on the right is zero outside the first $d\times d$ submatrix, and the second term is zero outside the “last” $(n-d)\times(n-d)$ submatrix. This completes the proof. ∎
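The two ingredients, the matrix-determinant lemma and Lemma 3.5, can be checked together on a toy example with exact arithmetic. In the sketch below, $Y$ has rows orthogonal to $v_{n}$ and $X=wv_{n}$ for an integer column vector $w$, mirroring the shape of $X$ and $Y$ above. We take $Z=(YY^{\mathrm{tr}})^{-1}Y$ as the polar matrix and verify that $ZY^{\mathrm{tr}}=I$, that $ZZ^{\mathrm{tr}}=(YY^{\mathrm{tr}})^{-1}$, and that $\det\big((X+Y)(X+Y)^{\mathrm{tr}}\big)=\det(YY^{\mathrm{tr}})\,(1+\|v_{n}\|^{2}\|w^{\mathrm{tr}}Z\|^{2})$. The last identity is our own combination of the two facts, not a transcription of the paper's display, and the specific $Y$, $v_{n}$, $w$ are hypothetical.

```python
from fractions import Fraction

def det(m):
    """Signed determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Exact inverse of an integer matrix (size >= 2) via the adjugate."""
    n, d = len(m), det(m)
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]
        return (-1) ** (i + j) * det(minor)
    return [[Fraction(cofactor(j, i), d) for j in range(n)] for i in range(n)]

Y = [[1, -1, 0], [0, 1, -1]]   # rows orthogonal to v_n
v_n = [1, 1, 1]
w = [2, 3]                     # stands in for the column vector multiplying v_n

G = matmul(Y, transpose(Y))    # Y Y^tr
G_inv = inverse(G)
Z = matmul(G_inv, Y)           # candidate polar matrix Y^P

X = [[wi * c for c in v_n] for wi in w]
XY = [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
lhs = det(matmul(XY, transpose(XY)))

wZ = [sum(wi * Z[i][j] for i, wi in enumerate(w)) for j in range(len(v_n))]
norm_vn_sq = sum(c * c for c in v_n)
rhs = det(G) * (1 + norm_vn_sq * sum(c * c for c in wZ))

print(lhs, rhs)
```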

Thanks to the above lemma, with $Z=Y^{P}$ we compute that $\det(X+Y)$ is the square root of

[TABLE]

In the last line, we used the fact that $Z=\left(e(k)B^{\prime}\bar{L}\right)^{P}=e(k^{-1})(B^{\prime}\bar{L})^{P}$ .

Let

[TABLE]

if $H\geq k\det(B\bar{L})$ , and set $K(B)=0$ otherwise. We will use this notation throughout the rest of the paper. Then (3.4) becomes

[TABLE]

The lemma below ensures that the translation of the vectors by $t^{\mathrm{tr}}(B^{\prime}\bar{L})^{P}$ does not present any extra difficulty in our estimate of this sum.

Lemma 3.6.

Let $\Lambda\subseteq\mathbb{R}^{d}$ be a lattice of rank $d$, and $t\in\mathbb{R}^{d}$. Temporarily denote by $N(r)$ the number of points $v\in\Lambda+t$ with $\|v\|\leq r$. Then

[TABLE]

where the implicit constant depends on $d$ only.

Proof.

This is Lemma 2 in [12] generalized to an affine lattice, and is also a special case of Theorem 5.4 in [22]. We provide a proof here for completeness.

We proceed by induction on $d$ . The base case $d=1$ is clear. Now assume the lemma for $d-1$ . By adjusting $\det\Lambda$ , we may assume $r=1$ .

First consider the case $\lambda_{d}\leq 1$ . Let $x_{i}\in\Lambda$ , $i\in\{1,\ldots,d\}$ , be a vector with $\|x_{i}\|=\lambda_{i}$ , and consider the parallelepiped spanned by $x_{1},\ldots,x_{d}$ . Its diameter is $\leq\lambda_{1}+\ldots+\lambda_{d}\leq d\lambda_{d}$ , and it contains a fundamental parallelepiped $F$ of $\Lambda$ , which also has diameter $\leq d\lambda_{d}$ .

Write $B(s)$ for the ball in $\mathbb{R}^{d}$ of radius $s$ centered at the origin. Then since $B(\max(0,1-d\lambda_{d}))\subseteq(\Lambda+t)\cap B(1)+F\subseteq B(1+d\lambda_{d})$, we have

[TABLE]

and thus

[TABLE]

where the second equality follows from Minkowski's second theorem.

It remains to consider the case $\lambda_{d}>1$ . Then $(\Lambda+t)\cap B(1)$ lies in at most two translates of $\Lambda_{d-1}$ in the direction of $x_{d}$ . Thus the induction hypothesis implies $N(r)=O\left(\sum_{i=1}^{d}1/\det\Lambda_{d-i}\right)$ . Also we have

[TABLE]

as above. This completes the proof.

∎
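The sandwich argument in the proof can be observed numerically. The sketch below counts points of the affine lattice $\mathbb{Z}^{2}+t$ in a disc of radius $r$ and checks that the count lies between the volumes of the discs of radius $r-d\lambda_{d}$ and $r+d\lambda_{d}$, where $d=2$, $\lambda_{2}(\mathbb{Z}^{2})=1$ and $\det\mathbb{Z}^{2}=1$. The choice of lattice, radius and translates is purely illustrative.

```python
from math import ceil, floor, pi

def count_affine_points(t, r):
    """Number of v in Z^2 + t with Euclidean norm at most r."""
    tx, ty = t
    count = 0
    # An integer x contributes only if |x + tx| <= r, and likewise for y.
    for x in range(floor(-r - tx), ceil(r - tx) + 1):
        for y in range(floor(-r - ty), ceil(r - ty) + 1):
            if (x + tx) ** 2 + (y + ty) ** 2 <= r * r:
                count += 1
    return count

r = 20.0
shift = 2 * 1.0  # d * lambda_d for the lattice Z^2
for t in [(0.0, 0.0), (0.3, 0.7), (0.5, 0.5)]:
    n_points = count_affine_points(t, r)
    lower, upper = pi * (r - shift) ** 2, pi * (r + shift) ** 2
    print(t, n_points, lower <= n_points <= upper)
```

In each case the count stays close to $\pi r^{2}$ regardless of the translate $t$, which is the content of the lemma in the regime $\lambda_{d}\leq r$.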

It follows that (3.4) equals

[TABLE]

where $\mathfrak{L}(x,M)$ here denotes the lattice spanned by the row vectors of $e(x)M$ . We have $\mathfrak{L}(l/k,(B^{\prime}\bar{L})^{P})=\mathfrak{L}(k/l,B^{\prime}\bar{L})^{P}$ , and $\det(\mathfrak{L}(k/l,B^{\prime}\bar{L})^{P})_{d-i}\gg\det\mathfrak{L}(k/l,B^{\prime}\bar{L})_{i}/\det\mathfrak{L}(k/l,B^{\prime}\bar{L})$ by (2.3). Also, $\det\mathfrak{L}(k/l,B^{\prime}\bar{L})_{i}\gg\det\mathfrak{L}(1,B^{\prime}\bar{L})_{i}$ , so the above sum can be rewritten as

[TABLE]

which we in turn rewrite as, for the lattice $\mathcal{B}\in\mathrm{Gr}(\mathbb{Z}^{n-1},d)$ spanned by $B^{\prime}$ ,

[TABLE]

Summing up all our work in this section, we deduce that (3.3) equals

[TABLE]

Here $\varphi(k)=\sum_{l|k}\mu(l)\frac{k}{l}$ is the Euler totient.
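The identity $\varphi(k)=\sum_{l|k}\mu(l)\frac{k}{l}$ is classical; for readers who wish to see it in action, the following sketch compares the Möbius sum with a direct count of the residues coprime to $k$. The function names are ours.

```python
from math import gcd

def mobius(n):
    """Moebius function: 0 if n has a square factor, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def totient_via_mobius(k):
    return sum(mobius(l) * (k // l) for l in range(1, k + 1) if k % l == 0)

def totient_by_definition(k):
    return sum(1 for m in range(1, k + 1) if gcd(m, k) == 1)

print([totient_via_mobius(k) for k in range(1, 13)])
```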

The remainder of this paper is devoted to computing (3.5). Because $K(\mathcal{B})$ depends on $k$ , we cannot deal with the constant factor just yet. However, we will later use

Lemma 3.7.

For $m>d+1$ ,

[TABLE]

Proof.

We can write the expression under question multiplicatively as

[TABLE]

which then becomes

[TABLE]

∎

4. Main term of (3.5)

In this section, we estimate the intended main term of (3.5), namely

[TABLE]

for each $k\geq 1$ and $2\leq d\leq n-2$. We may also assume $H\geq k\min_{\mathcal{B}}\det(\mathcal{B}\bar{L})$, since otherwise (4.1) is equal to $0$. Our approach is essentially that of Schmidt [12], who uses summation by parts. We improve it somewhat by adopting the language of the Riemann-Stieltjes integral, in order to simplify the computation and to obtain cleaner error terms.

Let us rewrite (4.1) as

[TABLE]

where

[TABLE]

and

[TABLE]

It is easy to check that $\psi(t)$ is a twice differentiable function on $0<t\leq H/k$ , with $\psi^{\prime}(t)=-((d-1)(H/t)^{2}+k^{2})((H/t)^{2}-k^{2})^{(d/2-1)}\leq 0$ .
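The stated derivative can be sanity-checked numerically. The displayed definition of $\psi$ is not reproduced here, so the sketch below assumes $\psi(t)=t\big((H/t)^{2}-k^{2}\big)^{d/2}$, which is the antiderivative of the stated $\psi^{\prime}$ that vanishes at $t=H/k$; this formula is our reconstruction and should be compared against the original display. The code compares a central difference quotient of this $\psi$ with the closed form for $\psi^{\prime}$ at a few sample points.

```python
def psi(t, H, k, d):
    """Assumed form of psi: t * ((H/t)^2 - k^2)^(d/2) on 0 < t <= H/k."""
    return t * ((H / t) ** 2 - k ** 2) ** (d / 2)

def psi_prime_stated(t, H, k, d):
    """The derivative as stated in the text."""
    u = (H / t) ** 2
    return -((d - 1) * u + k ** 2) * (u - k ** 2) ** (d / 2 - 1)

def central_difference(f, t, step=1e-6):
    return (f(t + step) - f(t - step)) / (2 * step)

H, k, d = 10.0, 2.0, 3
for t in (1.0, 2.0, 4.0):  # sample points with t < H/k = 5
    numeric = central_difference(lambda s: psi(s, H, k, d), t)
    print(t, numeric, psi_prime_stated(t, H, k, d))
```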

Choose a $\delta>0$ with $\delta\leq\min_{\mathcal{B}}\det(\mathcal{B}\bar{L})$ . Write $H/k=(\alpha+s)\delta$ with $\alpha\in[0,1)$ and $s\in\mathbb{Z}$ . Also, let $P_{1}(t)$ be the number of elements $\mathcal{B}\in\mathrm{Gr}(\mathbb{Z}^{n-1},d)$ such that $t<\det(\mathcal{B}\bar{L})\leq t+\delta$ , and $P_{2}(t)=P_{1}(t-\delta)$ . Then for $i=1,2$ ,

[TABLE]

Write $R_{1}(t)=P(\bar{\mathcal{L}},d,t+\delta)$ and $R_{2}(t)=P(\bar{\mathcal{L}},d,t)$. Since $\psi((\alpha+s)\delta)=0$, summation by parts gives

[TABLE]

Thus we have bounded $Q(k,H)$ from both sides by certain Riemann-Stieltjes sums. We need to show that those sums converge as $\delta\rightarrow 0$ (and thus $s\sim H/k\delta\rightarrow\infty$). First, observe that, since the $R_{i}$ are supported strictly away from zero by $\varepsilon=\min_{\mathcal{B}}\det(\mathcal{B}\bar{L})$, we may assume the same of $\psi$, so that $\psi$ is of bounded variation. Second, the $R_{i}$ are clearly not continuous, but by the induction hypothesis on $n$, each of them is bounded from both sides by a polynomial in $t$. More precisely,

[TABLE]

where $c_{j}=b_{j}(\bar{\mathcal{L}})$ is as in Theorem 1.3, and

[TABLE]

By Theorem 6.8 of Rudin ([10]), we have shown that

[TABLE]

Since the same argument will be used repeatedly later in this paper, we summarize our discussion so far in the form of a lemma:

Lemma 4.1.

Assume Theorem 1.3 for $n=m$ , and let $\mathcal{M}$ be a lattice of rank $m$ . Suppose $\psi$ is a decreasing twice differentiable function supported on $[a,b]$ . Then

[TABLE]

We return to estimating (4.3). Recall $\varepsilon=\min_{\mathcal{B}}\det(\mathcal{B}\bar{L})\sim\prod_{i=1}^{d}\lambda_{i}(\bar{\mathcal{L}})$. In (4.3), for the integrals inside the $O$-notation, there is no harm in replacing $\varepsilon$ with $0$ if $\gamma>d-1$. For the main term, we can do the same at the cost of

[TABLE]

Now the main term of $Q(k,H)$ contributes

[TABLE]

For the second-to-last equality, we used the identity on the beta function (see e.g. [3, Section 6.2.1])

[TABLE]

and the last equality follows from the definition of $a(n,d)$ .

Now accounting for the factor of $1/(\|v_{n}\|^{d}k^{d})$ in the rewritten form of (4.1), we obtain for the intended main term of (4.1)

[TABLE]

It is clear that this term is scale-invariant i.e. invariant under replacing $\mathcal{L}$ and $H$ by $c\mathcal{L}$ and $c^{d}H$ for any $c>0$ .

The error terms of $Q(k,H)$ are dealt with in a similar way, only simpler. For $j\in E_{m,d}$ with $\gamma_{j}>d-1$ , the corresponding term contributes

[TABLE]

It is apparent that $c_{j}H^{\gamma_{j}+1}k^{d-\gamma_{j}-1}/(\|v_{n}\|^{d}k^{d})$ is scale-invariant, since both $c_{j}H^{\gamma_{j}}$ and $H/\|v_{n}\|^{d}$ are.

For those with $\gamma_{j}\leq d-1$ , we proceed as follows:

[TABLE]

In case $\gamma_{j}=d-1$ , we used our assumption $H\geq k\varepsilon$ . Also, to retain the polynomial shape of the error term, we note $c_{j}H^{d}(1+\log\frac{H}{k\varepsilon})=O\left(c_{j}H^{d+\eta}(k\varepsilon)^{-\eta}\right)$ for any $\eta>0$ , and use this bound instead. The scale-invariance can be checked in a straightforward manner.

In conclusion, we proved that (4.1) equals

[TABLE]

where $E^{(1)}$ is an index set of cardinality $|E_{n-1,d}|+1$ , each $c^{\prime}_{j}$ is a reciprocal of products of $\lambda_{i}(\bar{\mathcal{L}})$ ’s and $\|v_{n}\|$ , so that $c^{\prime}_{j}(H/k)^{\gamma_{j}}$ is invariant under the appropriate scaling. The leading error term is of degree $n-b(n-1,d)$ .

5. Error term of (3.5)

In this section, we work on the intended error term of (3.5), namely

[TABLE]

for $1\leq i\leq d$ . Rewrite (5.1) as $1/(\|v_{n}\|)^{d-i}$ times

[TABLE]

which we simplify and bound from above by

[TABLE]

Our analysis of (5.2) depends on the “skewness” of $\mathcal{B}$ and $\bar{\mathcal{L}}$. We will first explain how to deal with (5.2) in case all $\lambda_{i}(\bar{\mathcal{L}})$ are of size $(H/k)^{1/d}$ (i.e. $\bar{\mathcal{L}}$ is not too skewed) and then work out the general case.

In addition, for the rest of this section, we assume $k=1$ for simplicity. To restore the general case, one could simply replace $H$ by $H/k$ .

5.1. When $\bar{\mathcal{L}}$ is “not skewed”

Assume $\lambda_{n-1}(\bar{\mathcal{L}})\leq 2^{n-1}H^{1/d}$ . For each $0\leq d^{\prime}\leq d$ , consider the restriction of the sum (5.2) to those $\mathcal{B}\in\mathrm{Gr}(\bar{\mathcal{L}},d)$ for which $d^{\prime}$ is the lowest number such that

[TABLE]

(in fact, the former inequality follows from the latter and the minimality of $d^{\prime}$ ), where we interpret $\lambda_{0}=0$ and $\lambda_{d+1}=\infty$ . Such a sum is then bounded by a constant times

[TABLE]

where the sum is over all $\mathcal{B}$ satisfying (5.3), since it follows from Minkowski's second theorem and (5.3) that

[TABLE]

The idea for bounding (5.4) is that, because we are assuming $\lambda_{n-1}(\bar{\mathcal{L}})\leq 2^{n-1}H^{1/d}$ , we can proceed as in Section 9 of Schmidt ([12]). The lemma below is a refinement of Lemma 6 of [12], so as to make explicit the dependence on the successive minima of $\bar{\mathcal{L}}$ and $\mathcal{B}$ .

Lemma 5.1.

Let $\bar{\mathcal{L}}$ be an $m$ ( $=n-1$ in our context) dimensional lattice. Fix a $\mathcal{B}^{\prime}\in\mathrm{Gr}(\bar{\mathcal{L}},d^{\prime})$ , and let $j=d-d^{\prime}$ . Then the number of $\mathcal{B}\in\mathrm{Gr}(\bar{\mathcal{L}},d)$ such that $\mathcal{B}_{d-j}=\mathcal{B}^{\prime}$ , $\lambda_{d^{\prime}+1}(\mathcal{B})\gg\lambda_{m}(\bar{\mathcal{L}})$ and $\det\mathcal{B}\leq H$ is

[TABLE]

where the implicit constant here depends only on $n$ and the implied constants on the bound relating $\lambda_{m}(\bar{\mathcal{L}})$ and $\lambda_{d^{\prime}+1}(\mathcal{B})$ .

Proof.

We may assume $\det\mathcal{B}^{\prime}\ll H^{d^{\prime}/d}$, because by Minkowski's second theorem

[TABLE]

We proceed by induction on $j$ . Suppose first that $j=1$ . Let $\pi:\mathrm{span}_{\mathbb{R}}(\bar{\mathcal{L}})\rightarrow\mathrm{span}_{\mathbb{R}}(\bar{\mathcal{L}})$ be the orthogonal projection onto the orthogonal complement of $\mathrm{span}_{\mathbb{R}}(\mathcal{B}^{\prime})$ . Then $\pi(\bar{\mathcal{L}})$ ( $\cong\bar{\mathcal{L}}/\mathcal{B}^{\prime}$ ) is a rank $m-d^{\prime}$ lattice of determinant $\det\bar{\mathcal{L}}/\det\mathcal{B}^{\prime}$ , and $\pi(\mathcal{B})$ is a $1$ -dimensional primitive sublattice spanned by a vector whose length is $\det\mathcal{B}/\det\mathcal{B}^{\prime}$ . Therefore, the number of $\mathcal{B}$ is bounded by the number of primitive vectors of $\pi(\bar{\mathcal{L}})$ of length $\leq H/\det\mathcal{B}^{\prime}$ .

If $\mathfrak{F}$ is a fundamental domain of $\bar{\mathcal{L}}$ , then $\pi(\mathfrak{F})$ is a fundamental domain of $\pi(\bar{\mathcal{L}})$ . Since we can choose an $\mathfrak{F}$ of diameter $\lambda_{1}(\bar{\mathcal{L}})+\ldots+\lambda_{m}(\bar{\mathcal{L}})\leq m\lambda_{m}(\bar{\mathcal{L}})$ and $\pi$ is a contraction, $\pi(\mathfrak{F})$ has diameter $\leq m\lambda_{m}(\bar{\mathcal{L}})$ . So the number of vectors of $\pi(\bar{\mathcal{L}})$ of length $\leq H/\det\mathcal{B}^{\prime}$ is bounded by a constant times

[TABLE]

Here we used the fact that $H/\det\mathcal{B}^{\prime}\geq\det\mathcal{B}/\det\mathcal{B}^{\prime}\sim\lambda_{d^{\prime}+1}(\mathcal{B})\gg\lambda_{m}(\bar{\mathcal{L}})$ .

For a general $j$ , by inductive hypothesis what we need to estimate is

[TABLE]

where the sum is over all $\mathcal{C}\in\mathrm{Gr}(\bar{\mathcal{L}},d^{\prime}+1)$ such that $\mathcal{C}_{d^{\prime}}=\mathcal{B}^{\prime}$ and $\lambda_{d^{\prime}+1}(\mathcal{C})\gg\lambda_{m}(\bar{\mathcal{L}})$ . In addition, $\mathcal{C}$ must satisfy $\det\mathcal{C}\ll\det\mathcal{B}^{\prime}(H/\det\mathcal{B}^{\prime})^{1/j}=:h$ say, since $\lambda_{d^{\prime}+1}(\mathcal{B})\ll(H/\det\mathcal{B}^{\prime})^{1/j}$ .

From the (proof of the) case $j=1$, the number of $\mathcal{C}$ with $\mathcal{C}_{d^{\prime}}=\mathcal{B}^{\prime}$, $\lambda_{d^{\prime}+1}(\mathcal{C})\gg\lambda_{m}(\bar{\mathcal{L}})$, and $\det\mathcal{C}\leq t$ is

[TABLE]

But $t\geq\det\mathcal{C}\sim\det\mathcal{B}^{\prime}\cdot\lambda_{d^{\prime}+1}(\mathcal{C})\gg\det\mathcal{B}^{\prime}\cdot\lambda_{m}(\bar{\mathcal{L}})$ , so we may disregard the latter possibility.

Therefore, we can apply Lemma 4.1, the Riemann-Stieltjes argument in the previous section, and deduce that (5.6) is bounded by a constant times

[TABLE]

which turns out to be equal to a constant times

[TABLE]

as desired.

∎

We proceed to estimating (5.4). Thanks to Lemma 5.1, for some constant $C>0$ depending only on $n$ such that $\det\mathcal{B}^{\prime}<CH^{d^{\prime}/d}$ (which exists by Minkowski's second theorem), we can bound it by

[TABLE]

This can be handled again as in the previous section using Lemma 4.1, yielding $|E_{n-1,d^{\prime}}|+1$ terms of $H$ -degree at most $n-i/d$ satisfying all the miscellaneous conditions such as the scaling invariance.

5.2. The skewed case

Now assume that $0\leq l<n-1$ is the lowest number such that

[TABLE]

As earlier, we again restrict the sum (5.2) to those $\mathcal{B}\in\mathrm{Gr}(\bar{\mathcal{L}},d)$ for which $0\leq d^{\prime}\leq d$ is the lowest number such that

[TABLE]

Then we must have $d^{\prime}\leq l$ and $\mathcal{B}_{d^{\prime}}\subseteq\bar{\mathcal{L}}_{l}$ . There is a decomposition

[TABLE]

where $\mathcal{M}$ is an $n-1-l$ dimensional lattice chosen as follows: take a reduced basis $\{x_{1},\ldots,x_{n-1}\}$ of $\bar{\mathcal{L}}$ such that $\|x_{i}\|\sim\lambda_{i}(\bar{\mathcal{L}})$ and $\mathrm{span}\{x_{1},\ldots,x_{l}\}=\bar{\mathcal{L}}_{l}$. Then we let $\mathcal{M}=\mathrm{span}\{x_{l+1},\ldots,x_{n-1}\}$. Also, let $\bar{\mathcal{M}}$ be the orthogonal projection of $\mathcal{M}$ onto $\mathrm{span}_{\mathbb{R}}(\bar{\mathcal{L}}_{l})^{\perp}\subseteq\mathrm{span}_{\mathbb{R}}(\bar{\mathcal{L}})$. An important fact we will use later is that $\lambda_{1}(\bar{\mathcal{M}})\gg H^{1/d}$ by construction.

We further restrict (5.2) to those $\mathcal{B}$ for which $\mathrm{rk\,}\mathcal{B}\cap\bar{\mathcal{L}}_{l}=r$ for a fixed $r\in\{d^{\prime},\ldots,\min(l,d)\}$ , and call $\mathcal{B}_{(r)}=\mathcal{B}\cap\bar{\mathcal{L}}_{l}$ . Note that $(\mathcal{B}_{(r)})_{d^{\prime}}=\mathcal{B}_{d^{\prime}}$ . We also let $\mathcal{A}\subseteq\bar{\mathcal{M}}$ be the projection of $\mathcal{B}$ onto $\bar{\mathcal{M}}$ . Clearly $\det\mathcal{B}=\det\mathcal{B}_{(r)}\det\mathcal{A}$ , and since $\det\mathcal{A}\gg H^{(d-r)/d}$ we have $\det\mathcal{B}_{(r)}\ll H^{r/d}$ .

Our considerations so far lead us to bound the restriction of (5.2) by, for some constant $C>0$ ,

[TABLE]

Using the induction hypothesis on our main theorem, and the fact that $\lambda_{1}(\bar{\mathcal{M}})\gg H^{1/d}$ , we can rewrite the inner sum so that this becomes

[TABLE]

Let us look at one $\gamma$ at a time, and consider

[TABLE]

By Lemma 5.1 as in the previous “not skewed” section, we obtain that this is

[TABLE]

Applying Lemma 4.1, it is seen that (5.8) may be bounded by at most $|E_{l,r}|+1$ error terms. We need to make sure that the $H$-degrees of those terms are strictly below $n$. Here we only discuss the terms of the highest degrees, as the rest can be dealt with in a similar fashion.

If $-d+i-\gamma+r\neq 0$ , estimating the sum in (5.8) using Lemma 4.1 yields a term of $H$ -degree $\frac{d^{\prime}}{d}(-d+i-\gamma+r)$ . Therefore, the $H$ -degree of (5.8) equals

[TABLE]

which attains its maximum $n-i/d$ only if $r=d$ and $l=n-1$ . But recall that we are assuming $l<n-1$ .

If $-d+i-\gamma+r=0$ , the sum is of size $O(\log H)$ , in which case we can say that, for a small $\eta>0$ , the $H$ -degree is $\leq n-i/d-id^{\prime}/d+\eta$ if $d^{\prime}\neq 0$ , and is $\leq n-1-i/d+\eta$ if $d^{\prime}=0$ .

5.3. The number of the error terms

We summarize and estimate the maximum number of error terms arising from our estimate of (5.2) so far. If $\bar{\mathcal{L}}$ is “not skewed,” then our estimate yielded $(d+1)E_{n-1}$ error terms, where we write

[TABLE]

and we understand $E_{n-1,0}$ to be the empty set.

As for the skewed case, it is really $n-1$ separate cases corresponding to the parameter $0\leq l<n-1$ , and for each $l$ we obtained at most $(l+1)E_{n-1-l}E_{l}$ error terms. Hence, regardless of $H$ and $d$ , we are able to estimate (5.2) using at most

[TABLE]

terms.

6. Summary, and a proof of Theorem 1.3

6.1. A polynomial expression for $P(\mathcal{L},d,H)$

Summing up all our work so far, we have that

[TABLE]

where $\varepsilon=\min_{\mathcal{B}\in\mathrm{Gr}(\mathbb{Z}^{n-1},d)}\det(\mathcal{B}\bar{L})$ , $E^{(2)}$ is an index set of cardinality at most

[TABLE]

(collecting all error terms from the previous two sections), and each $c_{j}$ is a reciprocal of products of $\lambda_{i}(\bar{\mathcal{L}})$ ’s and $\|v_{n}\|$ so that $c_{j}H^{\gamma_{j}}$ is scale-invariant. In this section, we will estimate the sum (6.1), and then make a choice of $v_{n}\in\mathcal{L}$ so that the dependence on $\lambda_{i}(\bar{\mathcal{L}})$ ’s turns into dependence on $\lambda_{i}(\mathcal{L})$ ’s. This will prove our main theorem.

We treat (6.1) one monomial at a time. The highest degree term contributes

[TABLE]

The corresponding infinite series, by Lemma 3.7, equals

[TABLE]

the desired main term. It remains to bound the tail, which we can, up to a constant factor, approximate as

[TABLE]

which is of size

[TABLE]

We need to show that $\varepsilon^{n-d-1}/(\det\mathcal{L})^{d}$ is bounded by a reciprocal of a product of $\lambda_{i}(\mathcal{L})$ ’s. Since $\varepsilon\sim\prod_{i=1}^{d}\lambda_{i}(\bar{\mathcal{L}})$ and $\lambda_{i}(\bar{\mathcal{L}})\leq\lambda_{i+1}(\mathcal{L})$ (which can be seen by projecting a $(i+1)$ -dimensional subspace of $\mathbb{R}^{n}$ onto the orthogonal complement of $v_{n}$ ), we have $\varepsilon\ll\prod_{i=1}^{d}\lambda_{i+1}(\mathcal{L})$ , and thus $\varepsilon^{n-d-1}\ll\prod_{i=1}^{d}\lambda_{i+1}(\mathcal{L})^{n-d-1}$ . On the other hand, $(\det\mathcal{L})^{d}\sim\prod_{j=1}^{n}\lambda_{j}(\mathcal{L})^{d}$ , which contains the factor $\prod_{j=d+2}^{n}\lambda_{j}(\mathcal{L})$ $d$ times. For any $i\leq d+1$ , $\lambda_{i}(\mathcal{L})^{n-d-1}/\prod_{j=d+2}^{n}\lambda_{j}(\mathcal{L})\leq 1$ , so $\varepsilon^{n-d-1}/(\det\mathcal{L})^{d}\ll\prod_{j=1}^{d+1}\lambda_{j}(\mathcal{L})^{-d}$ , as desired.
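The chain of inequalities above reduces to a statement about nondecreasing positive numbers $\lambda_{1}\leq\cdots\leq\lambda_{n}$, which can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it replaces every $\sim$ and $\ll$ by the exact products of the $\lambda_{i}$ (so all implied constants are taken to be $1$), and the function name `ratio_bound_holds` is introduced here only for illustration.

```python
import math
import random

def ratio_bound_holds(lams, d):
    """Check, for sorted positive lams, that
    prod_{i=1}^{d} lam_{i+1}^{n-d-1} / prod_{j=1}^{n} lam_j^d
        <= prod_{j=1}^{d+1} lam_j^{-d}.
    Requires 1 <= d <= n-1. Logarithms are used to avoid overflow."""
    n = len(lams)
    lam = sorted(lams)  # lam[k] plays the role of lambda_{k+1}
    log_eps_pow = (n - d - 1) * sum(math.log(lam[i]) for i in range(1, d + 1))
    log_det_pow = d * sum(math.log(x) for x in lam)
    lhs = log_eps_pow - log_det_pow
    rhs = -d * sum(math.log(lam[j]) for j in range(0, d + 1))
    return lhs <= rhs + 1e-9  # small slack for floating-point rounding

# Random trials over several ranks n and dimensions d (illustrative ranges only).
random.seed(0)
all_ok = all(
    ratio_bound_holds([random.uniform(0.1, 10.0) for _ in range(n)], d)
    for n in range(3, 8)
    for d in range(1, n)
    for _ in range(200)
)
print(all_ok)
```

The check passes because each of the $d$ factors $\lambda_{i}^{n-d-1}$ with $2\leq i\leq d+1$ is absorbed by one copy of $\prod_{j=d+2}^{n}\lambda_{j}$, exactly as in the text.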

We return to other monomials in (6.1). For the indices $j$ with $\gamma_{j}>d+1$ , the sum under consideration is

[TABLE]

which we can bound by the corresponding infinite series and apply Lemma 3.7, obtaining $O(c_{j}H^{\gamma_{j}})$ . If $\gamma_{j}<d+1$ , the sum is of size

[TABLE]

and if $\gamma_{j}=d+1$ , it is

[TABLE]

for any $\eta>0$ . Hence, together with the expression (3.1) of $P^{2}$ , we conclude that

[TABLE]

for some index set $E_{n,d}$ of cardinality at most $dn\sum_{l=0}^{n-1}E_{n-1-l}E_{l}$ , and where each $b_{j}$ is a product of reciprocals of $\lambda_{i}(\mathcal{L})$ ’s, $\lambda_{i}(\bar{\mathcal{L}})$ ’s, and $\|v_{n}\|$ , so that $b_{j}H^{\gamma_{j}}$ is scale-invariant.

At this point, choose $v_{n}$ to be one of the shortest nonzero vectors of $\mathcal{L}$. Then the following lemma shows that we can replace $\lambda_{i-1}(\bar{\mathcal{L}})$ by $\lambda_{i}(\mathcal{L})$ for each $i$, so that each $b_{j}$ depends only on $\mathcal{L}$.

Lemma 6.1.

Recall that $\bar{\mathcal{L}}$ is the orthogonal projection of $\mathcal{L}$ onto the complement of a vector $v_{n}\in\mathcal{L}$ . If we choose $v_{n}$ to be a shortest nonzero vector of $\mathcal{L}$ , then $\lambda_{i-1}(\bar{\mathcal{L}})\sim\lambda_{i}(\mathcal{L})$ for all $i=2,\ldots,n$ .

Proof.

Let $\{w_{1},\ldots,w_{n}\}$ be a reduced basis of $\mathcal{L}$ containing $v_{n}=w_{1}$ . Then, writing $\bar{w}_{i}$ for the projection of $w_{i}$ to the complement of $v_{n}$ , $\{\bar{w}_{2},\ldots,\bar{w}_{n}\}$ is a reduced basis of $\bar{\mathcal{L}}$ . Therefore, $\|w_{i}\|\sim\lambda_{i}(\mathcal{L})$ and $\|\bar{w}_{i}\|\sim\lambda_{i-1}(\bar{\mathcal{L}})$ .

On the other hand, by the definition of a reduced basis, $\|\bar{w}_{i}\|^{2}=\|w_{i}\|^{2}-\mu^{2}\|w_{1}\|^{2}$ for some $|\mu|\leq 1/2$ . This immediately implies $\|\bar{w}_{i}\|\leq\|w_{i}\|$ , and also, since $\|w_{1}\|\leq\|w_{i}\|$ , we have $\|\bar{w}_{i}\|\gg\|w_{i}\|$ , completing the proof. ∎
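The last step of the proof can be made explicit: from $\|\bar{w}_{i}\|^{2}=\|w_{i}\|^{2}-\mu^{2}\|w_{1}\|^{2}$ with $|\mu|\leq 1/2$ and $\|w_{1}\|\leq\|w_{i}\|$ one gets $\|\bar{w}_{i}\|^{2}\geq\tfrac{3}{4}\|w_{i}\|^{2}$. Below is a small numerical sketch of this identity and the two-sided bound. The vectors are hand-picked for illustration so that $|\mu|\leq 1/2$ and $\|w_{1}\|\leq\|w_{i}\|$ hold; they are not claimed to form a reduced basis in any particular sense, and the helper names are hypothetical.

```python
import math

def project_away(w, v):
    """Return (mu, w_bar), where w_bar = w - mu*v is orthogonal to v."""
    mu = sum(a * b for a, b in zip(w, v)) / sum(b * b for b in v)
    return mu, [a - mu * b for a, b in zip(w, v)]

def norm_sq(x):
    """Squared Euclidean norm."""
    return sum(a * a for a in x)

# w1 stands in for the shortest vector v_n; the others satisfy |mu| <= 1/2.
w1 = [1.0, 0.0, 0.0]
others = [[0.5, 2.0, 0.0], [-0.5, 0.25, 3.0]]

checks = []
for w in others:
    mu, w_bar = project_away(w, w1)
    identity_ok = math.isclose(norm_sq(w_bar), norm_sq(w) - mu ** 2 * norm_sq(w1))
    upper_ok = norm_sq(w_bar) <= norm_sq(w)          # projection never lengthens
    lower_ok = norm_sq(w_bar) >= 0.75 * norm_sq(w)   # uses |mu| <= 1/2, |w1| <= |w|
    checks.append((abs(mu) <= 0.5, identity_ok, upper_ok, lower_ok))
print(checks)
```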

6.2. The number of the error terms

Let us give a quick, crude estimate of $E_{n}=\max_{0\leq d\leq n}|E_{n,d}|+1$ . From the above discussion, we have

[TABLE]

Recall that $E_{0}=1$ by definition. Also, it is clear from Section 2.2 that $E_{1}=2,E_{2}=3,E_{3}=4$ . We claim in general that $E_{n}\leq n^{3n}$ for $n\geq 2$ . Indeed, the base case is obvious, and assuming the truth for the $n-1$ case, it follows from the above inequality that

[TABLE]
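The growth claim can be checked numerically for small $n$. The sketch below is only illustrative and rests on an assumption: it uses the recurrence $E_{n}\leq n^{2}\sum_{l=0}^{n-1}E_{n-1-l}E_{l}+1$, obtained from the cardinality bound $dn\sum_{l=0}^{n-1}E_{n-1-l}E_{l}$ on $E_{n,d}$ stated in Section 6.1 together with $d\leq n$, which may be cruder than the inequality actually displayed above. The starting values $E_{0},\ldots,E_{3}$ are the ones quoted in the text, and the function name is hypothetical.

```python
def error_term_bounds(n_max):
    """Upper bounds for E_n under the assumed recurrence
    E_n <= n^2 * sum_{l=0}^{n-1} E_{n-1-l} * E_l + 1  (for n >= 4)."""
    E = {0: 1, 1: 2, 2: 3, 3: 4}  # values quoted in the text
    for n in range(4, n_max + 1):
        conv = sum(E[n - 1 - l] * E[l] for l in range(n))
        E[n] = n * n * conv + 1
    return E

bounds = error_term_bounds(10)
for n in range(2, 11):
    # Compare the recursive bound with the claimed n^(3n).
    print(n, bounds[n], bounds[n] <= n ** (3 * n))
```

Python integers are arbitrary precision, so the comparison with $n^{3n}$ is exact.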

6.3. The primary error term, $d\leq n/2$

Finally, we provide an estimate on the primary error term of $P(\mathcal{L},d,H)$ , again assuming $\|v_{n}\|=\lambda_{1}(\mathcal{L})$ . We temporarily assume $d\leq n/2$ , and argue the cases $d>n/2$ by duality. Tracing back our estimates so far, there are two candidates for the primary error term: one is from the estimate of the “main part” (4.1), which contributes

[TABLE]

where $b(\bar{\mathcal{L}})=b_{j}(\bar{\mathcal{L}})$ for $j\in E_{n-1,d}$ corresponding to the leading error term, and the other is from the estimate of the “error part” (5.1) in case $i=1$ , which contributes

[TABLE]

but by rewriting everything in terms of $\lambda_{i}(\mathcal{L})$ ’s with help of Lemma 6.1, we find that this is bounded by

[TABLE]

The reason we use this slightly inferior bound is that this possesses a convenient symmetry under duality, as we will see below.

We claim by induction that the main error term has degree $n-b(n,d)$ , and that (6.2) is no greater than (6.3). In the base case $n=4,d=2$ , this is clear. In general, if $d=n/2$ , (6.2) is of degree strictly less than $n-b(n,d)$ , and we are done. If $d<n/2$ , then by the fact that $\|v_{n}\|=\lambda_{1}(\mathcal{L})$ and Lemma 6.1,

[TABLE]

which shows that (6.2) has the same size as (6.3), by the inductive hypothesis on $b(\bar{\mathcal{L}})$ . This proves the claim.

6.4. The primary error term, $d>n/2$

Write $d^{\prime}=n-d$ for short. By the duality theorem (2.2), $P(\mathcal{L},d,H)=P(\mathcal{L}^{P},d^{\prime},H^{\prime})$, where $H^{\prime}=H/\det\mathcal{L}$. Observe that both have the same main term, that is,

[TABLE]

Moreover, from the previous section, $P(\mathcal{L}^{P},d^{\prime},H^{\prime})$ has the leading error term of size

[TABLE]

which is equal to

[TABLE]

as desired.
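That the two main terms agree is a matter of exponent bookkeeping, which the following sketch checks on sample numbers. It assumes the main term has the shape $a(n,d)H^{n}/(\det\mathcal{L})^{d}$ as in the statement of the main theorem, that $\det\mathcal{L}^{P}=1/\det\mathcal{L}$, and that $a(n,d)=a(n,n-d)$; only the $H$- and determinant-dependent factor is compared, and the function name and sample values are hypothetical.

```python
import math

def main_term_scale(H, det, n, d):
    """The factor H^n / det^d of the main term, without the constant a(n, d)."""
    return H ** n / det ** d

n, d = 7, 5
H, det_L = 50.0, 3.0

d_dual = n - d            # d' = n - d
H_dual = H / det_L        # H' = H / det L
det_dual = 1.0 / det_L    # assumption: the polar lattice has determinant 1/det L

direct = main_term_scale(H, det_L, n, d)
dual = main_term_scale(H_dual, det_dual, n, d_dual)
print(math.isclose(direct, dual, rel_tol=1e-9))
```

Indeed $(H/\det\mathcal{L})^{n}\cdot(\det\mathcal{L})^{n-d}=H^{n}/(\det\mathcal{L})^{d}$, which is what the comparison confirms.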

7. Proofs of the variants

7.1. Formula for $N(\mathcal{L},d,H)$

An asymptotic formula on $N(\mathcal{L},d,H)$ can be derived easily from that of $P(\mathcal{L},d,H)$ by a standard Möbius inversion, as in Schmidt ([12, Sections 3,4,10]). As in [12], define $\sigma_{d}(m)$ inductively by

[TABLE]

It is shown in [12, Section 3] that $\sigma_{d}(m)$ equals the number of index $m$ sublattices of a rank $d$ lattice, and that

[TABLE]

for $d\leq n-1$ . From the latter it follows that

[TABLE]

where $\varepsilon:=\min_{\mathcal{B}\in\mathrm{Gr}(\mathcal{L},d)}\det\mathcal{B}$ . We handle each sum over $m$ one at a time. For the main term, we have

[TABLE]

On the right-hand side, the first sum is $H^{n}\prod_{i=1}^{d}\zeta(n+1-i)$ by (7.1), which is exactly what we need. The second sum is bounded by a constant times

[TABLE]

for any $\eta>0$ .
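The role of $\sigma_{d}(m)$ in the main term can be illustrated numerically. The sketch below is hedged in two ways. First, it computes $\sigma_{d}(m)$ by the standard divisor recursion $\sigma_{d}(m)=\sum_{k\mid m}k^{d-1}\sigma_{d-1}(m/k)$ for the number of index-$m$ sublattices of $\mathbb{Z}^{d}$, which may differ in form from the inductive definition used in [12]. Second, it takes (7.1) to be the identity $\sum_{m\geq 1}\sigma_{d}(m)m^{-n}=\prod_{i=1}^{d}\zeta(n+1-i)$, as suggested by how (7.1) is applied above. All function names are introduced here for illustration.

```python
import math
from functools import lru_cache

def divisors(m):
    """All positive divisors of m (naive, fine for small m)."""
    return [k for k in range(1, m + 1) if m % k == 0]

@lru_cache(maxsize=None)
def sigma(d, m):
    """Number of index-m sublattices of Z^d via the divisor recursion."""
    if d == 1:
        return 1
    return sum(k ** (d - 1) * sigma(d - 1, m // k) for k in divisors(m))

def count_rank2_by_hnf(m):
    """Brute-force count of Hermite normal forms [[a, b], [0, c]]
    with a*c = m and 0 <= b < c, one per index-m sublattice of Z^2."""
    return sum(1 for a in range(1, m + 1) for c in range(1, m + 1)
               if a * c == m for b in range(c))

def zeta(s, terms=10000):
    """Truncated zeta series; accurate enough here for s >= 4."""
    return sum(k ** (-s) for k in range(1, terms + 1))

# Compare a partial sum of sigma_2(m)/m^5 with zeta(5)*zeta(4) (case n=5, d=2).
partial = sum(sigma(2, m) / m ** 5 for m in range(1, 401))
target = zeta(5) * zeta(4)
print(abs(partial - target) < 1e-6)
```

The truncation at $m=400$ leaves a tail of order $10^{-8}$, well inside the tolerance used in the comparison.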

In the error term, for those $j\in E_{n,d}$ with $\gamma_{j}>d$ we can replace the sum $\sum_{m=1}^{H/\varepsilon}$ by the infinite sum $\sum_{m=1}^{\infty}$ , and apply (7.1). For those with $\gamma_{j}\leq d$ , we see that

[TABLE]

for any $\eta>0$ . If $d<n-1$ , $\eta$ can be set small enough, so that the secondary term has $H$ -degree $n-b(n,d)$ . If $d=n-1$ , the secondary term has degree $n-1+\eta$ .

The required properties of the coefficients $b^{\prime}_{j}(\mathcal{L})$ can be checked straightforwardly, so we omit the proof.

Remark.

One may wonder what the formula for $N(\mathcal{L},n,H)$ would be. In this case, the skewness of $\mathcal{L}$ induces no subtlety at all, and simply

[TABLE]

for any $\eta>0$ . The proof is identical to the argument in Section 3 of [12]; indeed, observe that $N(\mathcal{L},n,H)=N(\mathbb{Z}^{n},n,H)$ for any full-rank $\mathcal{L}$ of covolume 1.

7.2. Formula for $P_{\mathcal{S}}(\mathcal{L},d,H)$

Let $\mathcal{S}\subseteq\mathcal{L}$ be a sublattice of rank $e\leq n-d$ . By choosing the basis $\{v_{1},\ldots,v_{n}\}$ of $\mathcal{L}$ so that $\{v_{n-e+1},\ldots,v_{n}\}$ is a basis of $\mathcal{S}$ , and proceeding analogously as in Section 3 with $\mathcal{L}/\mathcal{S}$ instead of $\bar{\mathcal{L}}$ , we obtain an estimate of $P_{\mathcal{S}}(\mathcal{L},d,H)$ analogous to that of $P(\mathcal{L},d,H)$ in (1.2), with the coefficients $b_{j}$ being a product of reciprocals of $\lambda_{i}(\mathcal{L})$ and $\lambda_{i}(\mathcal{L}/\mathcal{S})$ (here we identify $\mathcal{L}/\mathcal{S}$ with the projection of $\mathcal{L}$ onto the orthogonal complement of $\mathrm{span}_{\mathbb{R}}(\mathcal{S})$ in $\mathbb{R}^{n}$ ). However, the reciprocal of $\lambda_{i}(\mathcal{L}/\mathcal{S})$ could be arbitrarily large, which may cause difficulties in some applications of Theorem 1.3. For instance, suppose one wants to compute

[TABLE]

Here one is eventually led to sum the multiples of the reciprocals of $\lambda_{i}(\mathcal{L}/\mathcal{A})$ over sublattices $\mathcal{A}$ of height bounded by $H_{1}$ . It seems to be a nontrivial task to show that such a sum is asymptotically small.

Fortunately, with minor modifications to our proof of Theorem 1.3, it is possible to provide a formula for $P_{\mathcal{S}}(\mathcal{L},d,H)$ independent of $\mathcal{S}$ , avoiding the above complication altogether. In this section, we explain what modifications are to be made.

Consider first the base cases $d=1$ or $n-1$ . If $d=1$ , $P_{\mathcal{S}}(\mathcal{L},1,H)=P(\mathcal{L},1,H)-P(\mathcal{S},1,H)$ , and bounding the contribution from $P(\mathcal{S},1,H)$ in terms of $\mathcal{L}$ using $\lambda_{i}(\mathcal{S})\geq\lambda_{i}(\mathcal{L})$ (because $\mathcal{S}\subseteq\mathcal{L}$ ), we obtain the same type of estimate as in (2.1). In case $d=n-1$ , we must have $\mathrm{rk\,}\mathcal{S}=1$ , and thus for $\mathcal{B}\in\mathrm{Gr}(\mathcal{L},n-1)$ , $\mathcal{B}\cap\mathcal{S}=\{0\}$ if and only if $\mathcal{B}^{\perp}\cap\mathcal{S}^{\perp}=\{0\}$ ; hence the proof follows from the $d=1$ case and the duality theorem.

For other values of $d$ , we proceed by induction on $n$ , and split $P_{\mathcal{S}}=P^{1}_{\mathcal{S}}+P^{2}_{\mathcal{S}}$ as in Section 3. For $P^{2}_{\mathcal{S}}$ , we simply bound it by $P^{2}$ . As for $P^{1}_{\mathcal{S}}$ , observe that, analogously to (3.3), we can write

[TABLE]

where $\mathfrak{L}(M)$ denotes the lattice spanned by the row vectors of $M$ . The idea is that the main contribution of the above sum comes from those $B$ with $\mathfrak{L}(B\bar{L})\cap\bar{\mathcal{S}}=\{0\}$ , where $\bar{\mathcal{S}}$ is the projection of $\mathcal{S}$ onto $\bar{\mathcal{L}}$ . Since $\mathfrak{L}(B\bar{L})\cap\bar{\mathcal{S}}=\{0\}$ implies $\mathfrak{L}((hB;b)L)\cap\mathcal{S}=\{0\}$ , we may write

[TABLE]

where

[TABLE]

But since $P^{1,1}_{\mathcal{S}}+P^{1,2}_{\mathcal{S}}=P^{1}$ , it also holds that

[TABLE]

Therefore all we need to show is that $P^{1,2}_{\mathcal{S}}$ is small, or equivalently, that $P^{1,1}_{\mathcal{S}}$ is close to $P^{1}$ .

Estimating $P^{1,1}_{\mathcal{S}}$ amounts to considering an analogous expression to (3.5) where the sum over $\mathcal{B}$ is further restricted to those for which $\mathcal{B}\bar{L}\cap\bar{\mathcal{S}}=\{0\}$. With the same restriction added to all the subsequent computations, all of the arguments in Section 4 go through, including Lemma 4.1, since by the induction hypothesis $P$ and $P_{\mathcal{S}}$ satisfy the same asymptotics on lattices of rank $\leq n-1$. As for the error terms of $P^{1,1}_{\mathcal{S}}$, we may simply bound them by those of $P^{1}$, namely (5.2) for $i=1,\ldots,d$, and so there are no changes to make. This shows that $P^{1}$ and $P^{1,1}_{\mathcal{S}}$ satisfy the same asymptotics, and hence that $P^{1,2}_{\mathcal{S}}$ is bounded by terms of leading degree strictly less than $n$.

To count the number of error terms, recall that $P_{\mathcal{S}}=P^{1}+O(P^{2})+O(P^{1,2}_{\mathcal{S}})$. The expression $P^{1}+O(P^{2})$ has $<n^{3n}$ error terms, and since $P^{1,2}_{\mathcal{S}}=P^{1}-P^{1,1}_{\mathcal{S}}$, it has at most $2n^{3n}$ error terms. Thus the number of error terms in our formula for $P_{\mathcal{S}}$ is no more than $3n^{3n}$.

It remains to determine the main error term. If $d\leq n/2$ , the argument of Section 6.3 carries over, showing that it is the same as in Theorem 1.3. If $d>n/2$ , extend $\mathcal{S}$ to a sublattice $\mathcal{S}^{\prime}\subseteq\mathcal{L}$ of rank precisely $n-d$ . Then

[TABLE]

on the one hand, and on the other hand,

[TABLE]

Now we can argue as in Section 6.4, using duality, to show that $P_{\mathcal{S^{\prime}}}$ has the leading error term of the same size, and therefore so does $P_{\mathcal{S}}$ . This completes the proof of Theorem 1.5; for $N_{\mathcal{S}}$ , one may proceed as in the last section.

7.3. Flag varieties of type $(e,d)$

Let $\mathcal{L}\subseteq\mathbb{R}^{n}$ be a lattice, and let $1\leq e<d<n$ . Our goal is to estimate the sum

[TABLE]

Here $b=b(d,e)$ . Note that there is also the constraint

[TABLE]

coming from the definition of height.

Estimating the sum over the main term is very similar to the computation in Section 4, so we will be brief. The largest term of (7.2), obtained by applying Lemma 4.1 to $a(d,e)H/(\det\mathcal{W})^{n}$ , comes from the integral

[TABLE]

where $\varepsilon_{e}=\min\{\det\mathcal{B}:\mathcal{B}\subseteq\mathcal{L},\dim\mathcal{B}=e\}$ and likewise for $\varepsilon_{d}$ . The smaller terms can be computed similarly, and it turns out the largest error term is of $H$ -degree $1$ , and the second largest is of $H$ -degree $1-b(n,d)/(n-e)$ , both coming from the leading error term of $P(\mathcal{L},d,(H/\varepsilon_{e}^{d})^{1/(n-e)})$ . One obtains a total of $2|E_{n,d}|<2E_{n}$ error terms from here.

The harder part of (7.2) is the sum over the error term, namely

[TABLE]

To bound this, we employ our method in Section 5 above. In order to avoid repetitive and unenlightening computations, we only show how to compute the first two largest $H$ -degree terms, and suppress the $\lambda_{i}(\mathcal{L})$ factors from the expressions for the error terms.

As in Section 5, for each $0\leq d^{\prime}\leq d$ , we restrict the sum in (7.4) to those $\mathcal{W}$ for which $d^{\prime}$ is the smallest number such that

[TABLE]

Then we can bound (7.4) by

[TABLE]

In order to work on the inner sum, we first determine the range of $\det\mathcal{W}$. By Minkowski’s second theorem, we have $(\det\mathcal{W}_{d^{\prime}})^{d/d^{\prime}}\ll\det\mathcal{W}$. On the other hand, again by Minkowski’s second theorem we have $\det\mathcal{W}_{d^{\prime}}H^{(e-d^{\prime})/nd}\leq\det\mathcal{W}_{e}$, so (7.3) implies

[TABLE]

In the last line, we also used $\det\mathcal{W}_{d^{\prime}}\ll(\det\mathcal{W})^{d^{\prime}/d}$ .

If $d^{\prime}=0$, the outer sum of (7.5) is vacuous, and the inner sum can be computed as in the main term estimate above, yielding $O(H+H^{1-b(n+d-e)/nd}+H^{1-b(n,d)/n})$ up to lower $H$-degree terms. Similarly, we obtain the same bound in case $d^{\prime}=d$. Each of these cases adds at most $2E_{n}$ error terms to our estimate.

So assume $1\leq d^{\prime}\leq d-1$ . We will apply Lemma 5.1. To do so, it is required that $\lambda_{d^{\prime}+1}(\mathcal{W})\gg\lambda_{n}(\mathcal{L})$ , which is true provided $\lambda_{n}(\mathcal{L})\ll H^{1/nd}$ . Thus, assuming $H$ is sufficiently large, the inner sum of (7.5) is bounded by a constant times

[TABLE]

We divide into cases according to whether $b(n+d-e)/d-d^{\prime}<0$ or not:

(i)

If $b(n+d-e)/d-d^{\prime}<0$ , then (7.5) is

[TABLE]

The bound (7.6) imposes the additional constraint $\det\mathcal{W}^{\prime}\ll H^{d^{\prime}/dn}$ on this sum. Therefore, by Lemma 4.1, the contribution from the largest $H$-degree term in our estimate (1.2) of $P(\mathcal{L},d^{\prime},\mathrm{(const)}H^{d^{\prime}/dn})$ to (7.5) is

[TABLE]

up to the two largest $H$-degree terms, as desired. The contributions from the smaller terms of $P(\mathcal{L},d^{\prime},\mathrm{(const)}H^{d^{\prime}/dn})$ will be discussed later.

(ii)

If $b(n+d-e)/d-d^{\prime}>0$ , then we instead have

[TABLE]

Estimating in the same manner as in Case (i) above, this is

[TABLE]

and since $1-2b/d+(d^{\prime}-b)/(n-e)\geq 0$, we are done.

(iii)

In the rare, yet possible, case that $b(n+d-e)/d-d^{\prime}=0$ , we proceed similarly, and bound (7.5) by

[TABLE]

In addition, in all three cases above, the contribution from the leading error term of $P(\mathcal{L},d^{\prime},H^{d^{\prime}/dn})$ is of size $O(H^{1-b(n,d^{\prime})d^{\prime}/dn})$ , and the number of error terms obtained is at most $2E_{n}$ . This completes the error estimate. We note that the related computation in Thunder ([20]), lines 5-6 on p.185, contains a minor error: if $d-e=1$ , the integral there diverges.

In summary, we estimated (7.2) to be

[TABLE]

where $E_{n,e,d}$ is an index set of cardinality at most $(d+2)\cdot 2E_{n}\leq 3n^{3n+1}$ , $b_{j}(\mathcal{L})$ ’s are appropriate inverse products of $\lambda_{i}(\mathcal{L})$ ’s, and the implied constant depends on $n$ only. The largest $\gamma_{j}$ is $1$ , and the second largest is one of

[TABLE]

When $d\leq n/2$ , it is always $1-b(n,d)/n=1-1/dn$ , but otherwise both of the other two are possible.

Bibliography


[1] W. Banaszczyk. New bounds in some transference theorems in the geometry of numbers. Math. Ann. 296 (1993), no. 4, 625-635.
[2] A. Borel. Introduction to arithmetic groups. University Lecture Series, Vol. 73. The American Mathematical Society, Providence, RI, 2019.
[3] P. J. Davis. 6. Gamma function and related functions, in M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York: Dover Publications, 1972.
[4] J. Franke, Y. Manin, and Y. Tschinkel. Rational points of bounded height on Fano varieties. Invent. Math. 95 (1989), no. 2, 421-435.
[5] S. Kim. Random lattice vectors in a set of size $O(n)$. Int. Math. Res. Not. (2020), 2020(5): 1385-1416.
[6] S. Kim. Mean value formulas on sublattices and flags of the random lattice. J. Number Theory, to appear.
[7] P. Le Boudec. Height of rational points on random Fano hypersurfaces. arXiv:2006.02288v1.
[8] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann. 261 (1982), no. 4, 515-534.
