Sums of squares in real quadratic fields and Hilbert modular groups

Fernando Chamizo; Roberto J. Miatello

arXiv:1812.10725·math.NT·February 5, 2020

TL;DR

This paper employs spectral theory of Hilbert-Maass forms for real quadratic fields to analyze the asymptotic behavior of sums related to representations as sums of two squares in the ring of integers.

Contribution

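It applies the spectral theory of Hilbert-Maass forms, through the hyperbolic circle problem on a product of two upper half-planes, to the self-correlation sum of $r(\lambda)r(\lambda+1)$ over the integers of a real quadratic field, obtaining an asymptotic formula with an explicit constant $C_{D}$ and a power-saving error term.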

Findings

01. Derived asymptotic formulas for sums of representations as sums of two squares.

02. Connected spectral theory of Hilbert-Maass forms with classical number theory problems.

03. Extended understanding of sums of squares in the context of real quadratic fields.

Abstract

We use the spectral theory of Hilbert-Maass forms for real quadratic fields to obtain the asymptotics of some sums involving the number of representations as a sum of two squares in the ring of integers.

Equations

\sum_{n\leq x}r(n)=\pi x+O(x^{\alpha}).

\sum_{n\leq x}r(n)r(n+1)=8x+O(x^{2/3})

\sum_{\begin{subarray}{c}0\leq\lambda<V_{1}\\ 0\leq\lambda^{\sigma}<V_{2}\end{subarray}}r(\lambda)=\frac{\pi^{2}}{\Delta}V_{1}V_{2}+O\big{(}(V_{1}V_{2})^{\alpha}\big{)}\qquad\text{for any }\alpha>\frac{2}{3}

r(\lambda)=\#\big{\{}(\xi,\eta)\in\mathcal{O}^{2}\;:\;\lambda=\xi^{2}+\eta^{2}\big{\}}.

N_{D}(V_{1},V_{2})=\sum_{\begin{subarray}{c}0\leq\lambda<V_{1}\\ 0\leq\lambda^{\sigma}<V_{2}\end{subarray}}r(\lambda)r(\lambda+1)\qquad\text{and}\qquad C_{D}=\frac{32\Delta}{\big{(}2-\chi(2)+2\chi(4)\big{)}\sum_{n=1}^{\Delta}n^{2}\chi(n)}

N_{D}(V_{1},V_{2})=C_{D}V_{1}V_{2}+O\big{(}(V_{1}V_{2})^{3/4}\big{)}.

N_{D}(V_{1},V_{2})=C_{D}V_{1}V_{2}+O\big{(}V_{1}^{3/4}V_{2}^{1/2}\big{)}

\cosh\rho(z,w)=1+2u(z,w)\qquad\text{with}\quad u(z,w)=\frac{|z-w|^{2}}{4\,\Im z\,\Im w}.

k\in C_{0}^{\infty}\big{(}[0,\infty)\big{)}\longmapsto h(t)=\int_{\mathbb{H}}k\big{(}u(z,i)\big{)}y^{1/2+it}\;d\mu(z).

h\text{ is holomorphic in }S_{\delta}\text{ and }|h(z)|\ll(|z|+1)^{-2-\delta}\text{ for }z\in S_{\delta}.

\mathcal{O}=\begin{cases}\mathbb{Z}[\sqrt{D}]&\text{if }D\not\equiv 1\pmod{4},\\ \mathbb{Z}\big{[}(1+\sqrt{D})/2\big{]}&\text{if }D\equiv 1\pmod{4}.\end{cases}

\Gamma_{\mathcal{O}}=\big{\{}(\gamma,\gamma^{\sigma})\,:\,\gamma\in\textrm{PSL}_{2}(\mathcal{O})\big{\}}

\Delta_{z_{1}}\psi_{\ell}=\lambda_{\ell_{1}}\psi_{\ell},\qquad\Delta_{z_{2}}\psi_{\ell}=\lambda_{\ell_{2}}\psi_{\ell}.

\lambda_{\ell_{j}}=\frac{1}{4}+t_{\ell_{j}}^{2}=\Big{(}\frac{1}{2}+it_{\ell_{j}}\Big{)}\Big{(}\frac{1}{2}-it_{\ell_{j}}\Big{)}\qquad\text{with}\quad t_{\ell_{j}}\in[0,\infty)\cup i(0,1/2].

u(\mathbf{z},\mathbf{w})=\big{(}u(z_{1},w_{1}),u(z_{2},w_{2})\big{)}\qquad\text{for}\quad\mathbf{z}=(z_{1},z_{2}),\,\mathbf{w}=(w_{1},w_{2})\in\mathbb{H}^{2}.

K(\mathbf{z},\mathbf{w})=\sum_{\gamma\in\Gamma}k\big{(}u(\gamma(\mathbf{z}),\mathbf{w})\big{)}.

K(\mathbf{z},\mathbf{w})=K(\gamma(\mathbf{z}),\mathbf{w})=K(\mathbf{z},\gamma(\mathbf{w}))\qquad\text{for every }\gamma\in\Gamma.

\mathcal{N}_{k}=\sum_{\lambda\in\mathcal{O}}r(\lambda)r(\lambda+1)k(\lambda,\lambda^{\sigma})\qquad\text{with}\quad k:\mathbb{R}^{2}\longrightarrow\mathbb{C}.

\begin{cases}a^{2}+c^{2}+D(b^{2}+d^{2})=n,\\ 2ab+2cd=m.\end{cases}

\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r^{2}(n+m\sqrt{D})}{n^{\alpha}}<\infty\qquad\text{for }\alpha>3.

\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r^{2}(n+m\sqrt{D})}{n^{\alpha}}\ll\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r(n+m\sqrt{D})}{n^{\alpha-1-\epsilon}}\ll\sideset{}{{}^{\prime}}{\sum}_{a,b,c,d}\big{(}a^{2}+Db^{2}+c^{2}+Dd^{2}\big{)}^{1+\epsilon-\alpha}

\Gamma=\big{\{}(\gamma,\gamma^{\sigma})\;:\;\gamma\in M/\{\pm\text{\rm Id}\}\big{\}}\quad\text{with}\quad M=\left\{\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in\textrm{\rm SL}_{2}(\mathcal{O})\,:\,a+d,b+c\in 2\mathcal{O}\right\}.

\mathcal{N}_{k}=2\sum_{\gamma\in\Gamma}k\big{(}u(\gamma(\mathbf{i}),\mathbf{i})\big{)}\qquad\text{where}\quad\mathbf{i}=(i,i).

\displaystyle\mathcal{C}:=\big{\{}(A,B,C,D)\in\mathcal{O}^{4}\,:\,A^{2}+B^{2}=C^{2}+D^{2}+1\big{\}}

(A, B, C, D)

\mathcal{N}_{k}=\sum_{(A,B,C,D)\in\mathcal{C}}k\big{(}C^{2}+D^{2},(C^{2}+D^{2})^{\sigma}\big{)}=\sum_{\tau\in M}k\big{(}u(\tau(i),i),u(\tau^{\sigma}(i),i)\big{)}.

\Lambda_{0}=\big{\{}\ell\,:\,\text{$\lambda$}_{\ell}\text{ totally exceptional}\big{\}},\qquad\Lambda_{j}=\big{\{}\ell\not\in\Lambda_{0}\,:\,\Im t_{\ell j}\in(0,1/2)\big{\}}\quad j=1,2.

H_{j}=\sum_{n=0}^{\infty}2^{2n}\sup_{t\in I_{n}}|h_{j}(t)|\qquad\text{with}\quad I_{0}=[0,2)\text{ and }I_{n}=[2^{n},2^{n+1})\text{ for }n\geq 1

\mathcal{M}=\frac{h_{1}(i/2)h_{2}(i/2)}{|\Gamma\backslash\mathbb{H}^{2}|}+\sum_{\ell\in\Lambda_{0}}h_{1}(t_{\ell 1})h_{2}(t_{\ell 2})\overline{\psi}_{\ell}(\mathbf{z})\psi_{\ell}(\mathbf{w}).

K(\mathbf{z},\mathbf{w})=\mathcal{M}+O_{\mathbf{z},\mathbf{w},\Gamma}\big{(}H_{1}H_{2}+H_{1}\sup_{\ell\in\Lambda_{2}}|h_{2}(t_{\ell_{2}})|+H_{2}\sup_{\ell\in\Lambda_{1}}|h_{1}(t_{\ell_{1}})|\big{)}.

K(\mathbf{z},\mathbf{w})=\sum_{\ell}h(t_{\ell})\overline{\psi}_{\ell}(\mathbf{z})\psi_{\ell}(\mathbf{w})+2\sum_{\kappa}c_{\kappa}\sum_{\mu\in\mathcal{L}_{\kappa}}\int_{(\mathbb{R}^{+})^{2}}h(t+\mu)\overline{E}(\kappa;it,i\mu;\mathbf{z})E(\kappa;it,i\mu;\mathbf{w})\,dt_{1}dt_{2}

Full text

Sums of squares in real quadratic fields and Hilbert modular groups

Fernando Chamizo

and

Roberto J. Miatello

Abstract.

We use the spectral theory of Hilbert-Maass forms for real quadratic fields to obtain the asymptotics of some sums involving the number of representations as a sum of two squares in the ring of integers.

2010 Mathematics Subject Classification:

11F72, 11F41, 11N45

The first named author is partially supported by the MTM2017-83496-P grant of the MICINN (Spain) and by “Severo Ochoa Programme for Centres of Excellence in R&D” (SEV-2015-0554)

1. Introduction

The asymptotic study of the average of the arithmetical function $r(n)$ giving the number of representations of $n$ as a sum of two squares is the goal of the celebrated Gauss circle problem. It asks for the infimum of the exponents $\alpha$ satisfying

$$\sum_{n\leq x}r(n)=\pi x+O(x^{\alpha}).\tag{1.1}$$

The left-hand side counts the number of lattice points in a circle of radius $\sqrt{x}$ and $\pi x$ gives its area. Gauss used an approximation of this kind when studying the class number of quadratic forms [Gau81]. A simple geometric reasoning, already employed by Gauss, shows (1.1) for $\alpha=1/2$ . Sierpiński [Sie74] proved the estimate for $\alpha=1/3$ with a rather complicated argument (see [Hux96] for a short modern elementary approach). On the other hand, $\alpha=1/3$ can be obtained in a more direct way using the Euclidean spectral expansion (i.e., classical Fourier analysis) of radial functions on $\mathbb{R}^{2}$ [Lan93, VIII§8].
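The lattice-point interpretation of (1.1) can be checked numerically. The following is a minimal sketch, not part of the paper: the helper names `r2` and `lattice_points` are hypothetical, and the sum is taken over $0\leq n\leq x$ so that the origin is counted as a lattice point.

```python
import math


def r2(n: int) -> int:
    """Number of pairs (a, b) in Z^2 with a^2 + b^2 = n."""
    if n < 0:
        return 0
    count = 0
    limit = math.isqrt(n)
    for a in range(-limit, limit + 1):
        rest = n - a * a
        b = math.isqrt(rest)
        if b * b == rest:
            # b = 0 gives one point, otherwise both b and -b work
            count += 1 if b == 0 else 2
    return count


def lattice_points(x: int) -> int:
    """Sum of r2(n) for 0 <= n <= x: lattice points in the disc of radius sqrt(x)."""
    return sum(r2(n) for n in range(x + 1))


if __name__ == "__main__":
    # Compare the count with the area pi * x of the disc.
    for x in (10, 100, 1000, 10000):
        count = lattice_points(x)
        print(x, count, count - math.pi * x)
```

Printing the difference between the count and $\pi x$ for growing $x$ gives a rough feeling for the size of the error term, although such small ranges say nothing about the true exponent.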

A finer asymptotic property to be studied about $r(n)$ is its self-correlation. In [Iwa02] it is proved that

$$\sum_{n\leq x}r(n)r(n+1)=8x+O(x^{2/3})\tag{1.2}$$

and there are similar formulas replacing $r(n+1)$ by $r(n+k)$ (see [Cha99] for the uniformity). Although (1.1) for $\alpha=1/3$ and (1.2) seem unrelated, from the analytic point of view one can run both proofs along similar lines, replacing Euclidean spectral expansions based on Fourier series by hyperbolic spectral expansions based on Maass forms and Eisenstein series. This second situation is far more involved and one has to bypass unsettled problems like the existence of exceptional eigenvalues or the $L^{\infty}$ bounds of the eigenfunctions, which become trivial in the Euclidean setting. It is worth mentioning that the error term in (1.2) remains unimproved, while the van der Corput method and other finer techniques of exponential sums have proved (1.1) for some $\alpha<1/3$ [Hux96]; namely, the best known result [BW17] allows one to take $\alpha$ slightly smaller than $0.314$.
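The self-correlation sum in (1.2) is equally easy to tabulate. This is a minimal sketch under the assumption that the sum runs over $1\leq n\leq x$; the function names `r2_table` and `correlation_sum` are illustrative only.

```python
import math


def r2_table(limit: int) -> list:
    """table[n] = number of (a, b) in Z^2 with a^2 + b^2 = n, for 0 <= n <= limit."""
    table = [0] * (limit + 1)
    s = math.isqrt(limit)
    for a in range(-s, s + 1):
        for b in range(-s, s + 1):
            n = a * a + b * b
            if n <= limit:
                table[n] += 1
    return table


def correlation_sum(x: int) -> int:
    """Sum of r(n) * r(n + 1) over 1 <= n <= x."""
    r = r2_table(x + 1)
    return sum(r[n] * r[n + 1] for n in range(1, x + 1))


if __name__ == "__main__":
    # The ratio to x can be compared with the constant 8 in the main term.
    for x in (100, 1000, 10000):
        total = correlation_sum(x)
        print(x, total, total / x)
```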

Several authors have considered the analogue of Gauss circle problem in totally real number fields [Sie36a], [Sch65], [Sch62], [Rau88] where $n\in\mathbb{Z}$ is replaced by $\lambda\in\mathcal{O}$ , with $\mathcal{O}$ the ring of integers. A basic issue is to find a real number field analogue of $n\leq x$ (note that the existence of infinitely many units prevents us from using the norm). Following Siegel [Sie36a], a natural condition is to limit the size of every Galois conjugate. In the real quadratic case of discriminant $\Delta$ , the main result in [Sch62] implies that

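$$\sum_{\begin{subarray}{c}0\leq\lambda<V_{1}\\ 0\leq\lambda^{\sigma}<V_{2}\end{subarray}}r(\lambda)=\frac{\pi^{2}}{\Delta}V_{1}V_{2}+O\big{(}(V_{1}V_{2})^{\alpha}\big{)}\qquad\text{for any }\alpha>\frac{2}{3}\tag{1.3}$$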

where $\lambda^{\sigma}$ is the real conjugate of $\lambda$ and $r(\lambda)$ is defined in the natural way:

$$r(\lambda)=\#\big{\{}(\xi,\eta)\in\mathcal{O}^{2}\;:\;\lambda=\xi^{2}+\eta^{2}\big{\}}.$$

The purpose of this paper is to get an analogue of (1.2) for real quadratic fields by applying the hyperbolic circle problem for products of two upper half-planes. This latter problem was studied in general in [BGM11] for multiple products corresponding to totally real number fields of arbitrary degree. The underlying analysis involves the spectral theory of Hilbert-Maass forms.

To state our main results we use the abbreviations

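$$N_{D}(V_{1},V_{2})=\sum_{\begin{subarray}{c}0\leq\lambda<V_{1}\\ 0\leq\lambda^{\sigma}<V_{2}\end{subarray}}r(\lambda)r(\lambda+1)\qquad\text{and}\qquad C_{D}=\frac{32\Delta}{\big{(}2-\chi(2)+2\chi(4)\big{)}\sum_{n=1}^{\Delta}n^{2}\chi(n)}$$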

where $\Delta$ is the discriminant of the real quadratic field $\mathbb{Q}(\sqrt{D})$ under consideration and $\chi$ is the character corresponding to the Kronecker symbol $\Big{(}\frac{\Delta}{\cdot}\Big{)}$. See the next section for more details about the notation.
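The constant $C_{D}=32\Delta\big/\big((2-\chi(2)+2\chi(4))\sum_{n=1}^{\Delta}n^{2}\chi(n)\big)$ can be evaluated exactly for small $D$. The sketch below is an illustration only: the functions `jacobi`, `kronecker`, `discriminant` and `c_D` are hypothetical helpers written for this example, and the Kronecker symbol is implemented only for positive lower argument, which is all the formula needs.

```python
from fractions import Fraction


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def kronecker(a: int, n: int) -> int:
    """Kronecker symbol (a/n) for positive a and n >= 1."""
    sign = 1
    while n % 2 == 0:
        n //= 2
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):  # (a/2) = -1 when a = 3 or 5 mod 8
            sign = -sign
    return sign * jacobi(a, n)


def discriminant(D: int) -> int:
    """Discriminant of Q(sqrt(D)) for squarefree D > 1."""
    return D if D % 4 == 1 else 4 * D


def c_D(D: int) -> Fraction:
    """Exact value of C_D from its defining formula."""
    delta = discriminant(D)
    chi = lambda n: kronecker(delta, n)
    char_sum = sum(n * n * chi(n) for n in range(1, delta + 1))
    factor = 2 - chi(2) + 2 * chi(4)
    return Fraction(32 * delta, factor * char_sum)


if __name__ == "__main__":
    for D in (2, 3, 5, 6, 7, 13, 17):
        print(D, c_D(D))
```

For instance, with $D=5$ the character sum is $1-4-9+16=4$ and the factor in the denominator is $5$, so $C_{5}=160/20=8$.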

Firstly we state a result like (1.3) for the self-correlation.

Theorem 1.1.

For $V_{1},V_{2}\to\infty$

$$N_{D}(V_{1},V_{2})=C_{D}V_{1}V_{2}+O\big{(}(V_{1}V_{2})^{3/4}\big{)}.$$

Secondly, we are interested in the global uniformity in $V_{1}$ and $V_{2}$ that is not considered in [BGM11]. Note that there are limiting arithmetical situations, for instance if $V_{2}$ is like $V_{1}^{-1}$ we are essentially considering solutions of Pell’s equation which have an exponential spacing, too sparse to be captured by harmonic analysis and by an asymptotic formula.

Theorem 1.2.

For $0<V_{2}<1$ and $V_{1}V_{2}^{2}\to\infty$ , if there are no exceptional eigenvalues then

$$N_{D}(V_{1},V_{2})=C_{D}V_{1}V_{2}+O\big{(}V_{1}^{3/4}V_{2}^{1/2}\big{)}$$

and if there exist exceptional eigenvalues then we have to add $O\big{(}V_{1}^{1/2+c}V_{2}^{1/4}\big{)}$ with $c=\sup\Im t_{\ell_{1}}$ .

Remark. In principle one could suspect a chaotic behavior of $C_{D}$ because of the arithmetic nature of the character sum, but it is not difficult to prove that $C_{D}\asymp D^{-3/2}$. A precise result is included in §7. It may sound surprising that the existence or not of exceptional eigenvalues plays a role in Theorem 1.2 but not in Theorem 1.1. In the terminology of [BGM11], we have a large spectral gap and the influence of the exceptional eigenvalues in Theorem 1.1 is absorbed by the error term. The fundamental result here is the bound of Kim and Shahidi [KS02] for the size of the potential exceptional eigenvalues, which allows one to take $c=1/9$ in Theorem 1.2.

2. Notation and basic concepts

We mainly follow the notation of [BM09] and [BGM11]. We recall it briefly, reviewing the basic concepts at the same time. We use Landau’s $O$-notation and Vinogradov’s $\ll$-notation interchangeably.

The Poincaré half-plane $\mathbb{H}$ is the Riemannian manifold given by the upper half-plane $\Im z>0$ endowed with the hyperbolic metric $y^{-2}(dx^{2}+dy^{2})$ , which induces the invariant measure $d\mu(z)=y^{-2}dxdy$ . It is possible to give an explicit formula for the corresponding hyperbolic distance $\rho$ , namely

$$\cosh\rho(z,w)=1+2u(z,w)\qquad\text{with}\quad u(z,w)=\frac{|z-w|^{2}}{4\,\Im z\,\Im w}.\tag{2.1}$$

The group $\textrm{PSL}_{2}(\mathbb{R})=\textrm{SL}_{2}(\mathbb{R})/\{\pm\textrm{Id}\}$ acts faithfully on $\mathbb{H}$ in the standard way by linear fractional transformations and in fact it coincides with the group of orientation preserving isometries of $\mathbb{H}$ . This implies in particular that $u$ is invariant, meaning $u(z,w)=u\big{(}\gamma(z),\gamma(w)\big{)}$ for any $z,w\in\mathbb{H}$ and $\gamma\in\textrm{PSL}_{2}(\mathbb{R})$ .
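The invariance of $u(z,w)=|z-w|^{2}/(4\,\Im z\,\Im w)$ under linear fractional transformations, and its relation $\cosh\rho=1+2u$ with the hyperbolic distance, can be sanity-checked in floating point. This is a minimal sketch; the names `u` and `mobius` and the sample matrix and points are arbitrary choices made for the example.

```python
import math


def u(z: complex, w: complex) -> float:
    """Point-pair invariant u(z, w) = |z - w|^2 / (4 Im z Im w) on the upper half-plane."""
    return abs(z - w) ** 2 / (4 * z.imag * w.imag)


def mobius(g, z: complex) -> complex:
    """Action of g = (a, b, c, d), with ad - bc = 1, by z -> (az + b) / (cz + d)."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)


if __name__ == "__main__":
    g = (2.0, 1.0, 1.0, 1.0)  # determinant 2*1 - 1*1 = 1
    z, w = complex(1.0, 2.0), complex(-0.5, 0.3)
    print(u(z, w), u(mobius(g, z), mobius(g, w)))  # the two values should agree

    # rho(i, 2i) = log 2 along the imaginary axis, so cosh(log 2) should equal 1 + 2u(i, 2i)
    print(1 + 2 * u(1j, 2j), math.cosh(math.log(2)))
```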

For each discrete subgroup $\Gamma<\textrm{PSL}_{2}(\mathbb{R})$ such that $\Gamma\backslash\mathbb{H}$ has finite volume, the spectral theory of automorphic forms allows one to expand any $f\in L^{2}(\Gamma\backslash\mathbb{H})$ in terms of the eigenfunctions of the Laplace-Beltrami operator $\Delta=-y^{2}\big{(}\partial^{2}_{x}+\partial^{2}_{y}\big{)}$. In some sense, the role of the Fourier transform is played in this context by the Selberg transform

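$$k\in C_{0}^{\infty}\big{(}[0,\infty)\big{)}\longmapsto h(t)=\int_{\mathbb{H}}k\big{(}u(z,i)\big{)}y^{1/2+it}\;d\mu(z).$$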

As in the case of the Fourier transform, we can considerably relax the $C_{0}^{\infty}$ regularity and still have a sound and useful Selberg transform. It is easier to introduce the conditions in terms of the transform itself, taking for granted the existence of the integral. Following Selberg [Ber54, Satz 3.4], we ask for the existence of a strip $S_{\delta}=\{z\,:\,|\Im z|<1/2+\delta\}$ with $\delta>0$ such that

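$$h\text{ is holomorphic in }S_{\delta}\text{ and }|h(z)|\ll(|z|+1)^{-2-\delta}\text{ for }z\in S_{\delta}.\tag{2.2}$$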

This is satisfied when $k\in C_{0}^{\infty}$ (see [Iwa02, §1.8]).

A novelty with respect to the Euclidean setting is that in the cases of arithmetical relevance, e.g. $\Gamma=\textrm{PSL}_{2}(\mathbb{Z})$, there is a discrete spectrum (corresponding mainly to Maass cusp forms) and a continuous spectrum (corresponding to Eisenstein series). This non-classical harmonic analysis built with nonholomorphic automorphic forms has had a profound impact on analytic number theory, especially since the development of Kuznetsov’s formula [Haf85].

We focus on the case of real quadratic number fields $\mathbb{Q}(\sqrt{D})$ with $D\in\mathbb{Z}_{>1}$ squarefree. The corresponding ring of integers is

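$$\mathcal{O}=\begin{cases}\mathbb{Z}[\sqrt{D}]&\text{if }D\not\equiv 1\pmod{4},\\ \mathbb{Z}\big{[}(1+\sqrt{D})/2\big{]}&\text{if }D\equiv 1\pmod{4}.\end{cases}$$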

To parallel the previous case $\textrm{PSL}_{2}(\mathbb{Z})$ , the natural object to work with is the full Hilbert modular group

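$$\Gamma_{\mathcal{O}}=\big{\{}(\gamma,\gamma^{\sigma})\,:\,\gamma\in\textrm{PSL}_{2}(\mathcal{O})\big{\}}$$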

acting on $\mathbb{H}^{2}=\mathbb{H}\times\mathbb{H}$ where $\gamma^{\sigma}$ denotes the action by the nontrivial element in the Galois group of $\mathbb{Q}(\sqrt{D})$ (the real conjugation) on the entries of $\gamma$ . It turns out that $\Gamma_{\mathcal{O}}$ is a discrete subgroup of $\textrm{PSL}_{2}(\mathbb{R})^{2}$ and $\Gamma_{\mathcal{O}}\backslash\mathbb{H}^{2}$ has finite volume. In general the groups with these two properties are called lattices and they are said to be irreducible if the projections on each factor of $\textrm{PSL}_{2}(\mathbb{R})^{2}$ are dense. This avoids artificial examples like $\textrm{PSL}_{2}(\mathbb{Z})^{2}$ that can be “reduced” to discrete groups acting on $\mathbb{H}$ .

If $\Gamma$ is an irreducible lattice, spectral theory allows one to analyze $L^{2}(\Gamma\backslash\mathbb{H}^{2})$ in terms of the simultaneous eigenfunctions $\psi=\psi(z_{1},z_{2})$ of $\Delta_{z_{1}}$ and $\Delta_{z_{2}}$ (where the subscript indicates the variable). Imposing $\psi\in L^{2}(\Gamma\backslash\mathbb{H}^{2})$ we have a discrete sequence of couples of eigenvalues $\{\text{$ \lambda $}_{\ell}\}_{\ell}$ with $\text{$ \lambda $}_{\ell}=(\lambda_{\ell_{1}},\lambda_{\ell_{2}})$ and corresponding orthonormal eigenfunctions $\psi_{\ell}$,

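$$\Delta_{z_{1}}\psi_{\ell}=\lambda_{\ell_{1}}\psi_{\ell},\qquad\Delta_{z_{2}}\psi_{\ell}=\lambda_{\ell_{2}}\psi_{\ell}.$$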

We reserve the label $\ell=0$ for the trivial couple $\text{$ \lambda $}_{0}=(0,0)$ and consequently $\psi_{0}=|\Gamma\backslash\mathbb{H}^{2}|^{-1/2}$ . It is said that $\text{$ \lambda $}_{\ell}$ is exceptional if $0<\lambda_{\ell_{1}}<1/4$ or $0<\lambda_{\ell_{2}}<1/4$ , and it is said to be totally exceptional if both conditions hold simultaneously. The relevance of the exceptional $\text{$ \lambda $}_{\ell}$ is that the analogue of the Fourier transform has a quite different behavior at them. To emphasize this point we write

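$$\lambda_{\ell_{j}}=\frac{1}{4}+t_{\ell_{j}}^{2}=\Big{(}\frac{1}{2}+it_{\ell_{j}}\Big{)}\Big{(}\frac{1}{2}-it_{\ell_{j}}\Big{)}\qquad\text{with}\quad t_{\ell_{j}}\in[0,\infty)\cup i(0,1/2].$$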

(Note that we deviate slightly from [BGM11].) In this way, $t_{01}=t_{02}=i/2$ and $\text{$ \lambda $}_{\ell}$ is exceptional if $\Im t_{\ell_{1}}$ or $\Im t_{\ell_{2}}$ belongs to $(0,1/2)$. Although it is conjectured that there are no exceptional $\text{$ \lambda $}_{\ell}$ in the cases of arithmetic interest (this is the generalization of a famous conjecture due to Selberg [Sel65]), in principle there might be infinitely many such $\text{$ \lambda $}_{\ell}$. On the other hand, only finitely many can be totally exceptional (because the set $\{\text{$ \lambda $}_{\ell}\}$ is a discrete set) and the result of Kim and Shahidi [KS02] implies $\Im t_{\ell_{1}},\Im t_{\ell_{2}}<1/9$ for the lattices $\Gamma$ in this paper.

As in the one-dimensional case, it turns out that $\{\psi_{\ell}\}_{\ell}$ does not span $L^{2}(\Gamma\backslash\mathbb{H}^{2})$ if $\Gamma\backslash\mathbb{H}^{2}$ is not compact and a continuous spectrum corresponding to Eisenstein series enters into the game. The corresponding spectral theorem is technical in nature and we only need a particular case, so we have limited its application to the proof of a single lemma. The reader preferring not to enter into the details of the proof can use Lemma 4.1 as a black box embodying the spectral theorem. We refer the reader to [BGM11] and [BM09, Ch.1] for more extensive comments on the spectral theorem (see also [HC68] for a more comprehensive theory).

Rather than the expansion of functions in $L^{2}(\Gamma\backslash\mathbb{H}^{2})$ , we need to expand a type of automorphic kernels. To introduce them it is convenient to extend the definition of $u$ in (2.1) to $\mathbb{H}^{2}$ in the natural manner:

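$$u(\mathbf{z},\mathbf{w})=\big{(}u(z_{1},w_{1}),u(z_{2},w_{2})\big{)}\qquad\text{for}\quad\mathbf{z}=(z_{1},z_{2}),\,\mathbf{w}=(w_{1},w_{2})\in\mathbb{H}^{2}.$$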

Given an irreducible lattice $\Gamma$ and $k:[0,\infty)^{2}\longrightarrow\mathbb{C}$ decaying rapidly enough at zero and infinity, for instance $k$ compactly supported, we can construct an automorphic kernel

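$$K(\mathbf{z},\mathbf{w})=\sum_{\gamma\in\Gamma}k\big{(}u(\gamma(\mathbf{z}),\mathbf{w})\big{)}.\tag{2.4}$$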

Using the fact that $u$ is invariant by isometries, we deduce that $K$ is actually automorphic in both variables, that is

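$$K(\mathbf{z},\mathbf{w})=K(\gamma(\mathbf{z}),\mathbf{w})=K(\mathbf{z},\gamma(\mathbf{w}))\qquad\text{for every }\gamma\in\Gamma.$$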

3. An arithmetic lemma

To study the sum of $r(\lambda)r(\lambda+1)$ by analytic methods it is convenient to consider more general weighted sums

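$$\mathcal{N}_{k}=\sum_{\lambda\in\mathcal{O}}r(\lambda)r(\lambda+1)k(\lambda,\lambda^{\sigma})\qquad\text{with}\quad k:\mathbb{R}^{2}\longrightarrow\mathbb{C}.$$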

A preliminary consideration is whether this sum actually makes sense when $k$ decays rapidly enough. We state a general elementary result of this kind although, for the aims of this paper, we could restrict ourselves to compactly supported functions.

Lemma 3.1.

If $k(x,y)\ll\big{(}x+y+1)^{-\alpha}$ with $\alpha>3$ for $x,y\geq 0$ then $\mathcal{N}_{k}$ is well-defined.

Proof.

Note first that if $\lambda$ is a sum of two squares then so is $\lambda^{\sigma}$ and if both are positive then we can restrict the sum to $\lambda,\lambda^{\sigma}>0$ .

If $\lambda=n+m\sqrt{D}$ , expanding $\lambda=(a+b\sqrt{D})^{2}+(c+d\sqrt{D})^{2}$ we see that $r(\lambda)$ counts the number of integral or half-integral solutions of

$$\begin{cases}a^{2}+c^{2}+D(b^{2}+d^{2})=n,\\ 2ab+2cd=m.\end{cases}$$

The positivity of the first equation shows at once that $r(\lambda)$ is well-defined i.e., $r(\lambda)<\infty$ . In fact using that the number of representations of an integer as a sum of two squares in $\mathbb{Z}$ tends to zero when divided by any positive power [HW79, Th. 338], we have the trivial bound $r(\lambda)=O(n^{1+\epsilon})$ for any $\epsilon>0$ . Note also that, necessarily, in order to have a solution one must have $|m|<4n$ .
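The description of $r(\lambda)$ for $\lambda=n+m\sqrt{D}$ as the number of solutions of $a^{2}+c^{2}+D(b^{2}+d^{2})=n$, $2ab+2cd=m$ lends itself to a brute-force count. The sketch below makes a simplifying assumption not made in the text: it takes $D\not\equiv 1\pmod 4$, so that $\mathcal{O}=\mathbb{Z}[\sqrt{D}]$ and only integral $(a,b,c,d)$ occur. The name `r_lambda` is hypothetical.

```python
import math


def r_lambda(n: int, m: int, D: int) -> int:
    """Count (a, b, c, d) in Z^4 with a^2 + c^2 + D(b^2 + d^2) = n and 2ab + 2cd = m.

    Assumes D is squarefree with D % 4 != 1, so that O = Z[sqrt(D)].
    """
    if n < 0:
        return 0
    count = 0
    bound_ac = math.isqrt(n)
    bound_bd = math.isqrt(n // D)
    for a in range(-bound_ac, bound_ac + 1):
        for c in range(-bound_ac, bound_ac + 1):
            rest = n - a * a - c * c
            if rest < 0 or rest % D != 0:
                continue
            for b in range(-bound_bd, bound_bd + 1):
                d_squared = rest // D - b * b
                if d_squared < 0:
                    continue
                d0 = math.isqrt(d_squared)
                if d0 * d0 != d_squared:
                    continue
                for d in {d0, -d0}:  # a set, so d = 0 is counted once
                    if 2 * a * b + 2 * c * d == m:
                        count += 1
    return count


if __name__ == "__main__":
    # lambda = 3 + 2*sqrt(2) = (1 + sqrt(2))^2 in Z[sqrt(2)]
    print(r_lambda(3, 2, 2))
```

The first equation bounds all four variables, which is the finiteness argument of the proof made concrete.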

The inequality $2r(\lambda)r(\lambda+1)\leq r^{2}(\lambda)+r^{2}(\lambda+1)$ and the equation $\lambda+\lambda^{\sigma}=2n$ reduce the assertion to proving that

$$\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r^{2}(n+m\sqrt{D})}{n^{\alpha}}<\infty\qquad\text{for }\alpha>3.$$

Using the trivial bound we have

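$$\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r^{2}(n+m\sqrt{D})}{n^{\alpha}}\ll\mathop{\sum\!\sum}_{0\leq m<4n}\frac{r(n+m\sqrt{D})}{n^{\alpha-1-\epsilon}}\ll\sideset{}{{}^{\prime}}{\sum}_{a,b,c,d}\big{(}a^{2}+Db^{2}+c^{2}+Dd^{2}\big{)}^{1+\epsilon-\alpha}$$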

where we disregard the value $a=b=c=d=0$ in the last sum. It is plain that the latter series converges when $\alpha>3+\epsilon$ , for instance by comparing with the integral $\int_{B^{\prime}}\|\vec{x}\|^{-2-\delta}d\vec{x}$ , $\delta>0$ , where $B^{\prime}$ is the exterior of the unit ball in $\mathbb{R}^{4}$ . ∎

The key point to apply spectral methods is to translate $\mathcal{N}_{k}$ into an automorphic kernel (2.4). The argument is an adaptation of that in the 1-dimensional case in [Iwa02, Cor.12.2].

Lemma 3.2.

Consider the lattice in $\textrm{\rm PSL}_{2}(\mathbb{R})^{2}$ defined as

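$$\Gamma=\big{\{}(\gamma,\gamma^{\sigma})\;:\;\gamma\in M/\{\pm\text{\rm Id}\}\big{\}}\quad\text{with}\quad M=\left\{\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in\textrm{\rm SL}_{2}(\mathcal{O})\,:\,a+d,b+c\in 2\mathcal{O}\right\}.$$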

Then for $k$ as in Lemma 3.1 we have

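$$\mathcal{N}_{k}=2\sum_{\gamma\in\Gamma}k\big{(}u(\gamma(\mathbf{i}),\mathbf{i})\big{)}\qquad\text{where}\quad\mathbf{i}=(i,i).$$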

Proof.

The map

[TABLE]

clearly establishes a bijection between $\mathcal{C}$ and $M$ . On the other hand, if $\tau$ denotes the last matrix a calculation proves $u\big{(}\tau(i),i\big{)}=C^{2}+D^{2}$ . Hence

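$$\mathcal{N}_{k}=\sum_{(A,B,C,D)\in\mathcal{C}}k\big{(}C^{2}+D^{2},(C^{2}+D^{2})^{\sigma}\big{)}=\sum_{\tau\in M}k\big{(}u(\tau(i),i),u(\tau^{\sigma}(i),i)\big{)}.$$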

This proves the result because $\pm\tau$ give rise to the same element in $\Gamma$ . Note that the sign changes do not affect the values of $\lambda=u(\tau(i),i)$ and $\lambda^{\sigma}=u(\tau^{\sigma}(i),i)$ . ∎
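The identity $u\big(\tau(i),i\big)=C^{2}+D^{2}$ can be tested on examples. The explicit matrix of the map is not reproduced here, so the sketch below assumes one choice consistent with the conditions $a+d,\,b+c\in 2\mathcal{O}$ and $\det\tau=1$, namely $\tau=\begin{pmatrix}A+C&D+B\\ D-B&A-C\end{pmatrix}$; this choice, the restriction to rational integers, and the names `u_at_i` and `check_quadruples` are all assumptions of the example and need not match the paper's normalization.

```python
from fractions import Fraction
from itertools import product


def u_at_i(a: int, b: int, c: int, d: int) -> Fraction:
    """u(tau(i), i) for tau = [[a, b], [c, d]] with ad - bc = 1, in exact arithmetic."""
    den = c * c + d * d
    x = Fraction(a * c + b * d, den)  # real part of tau(i)
    y = Fraction(1, den)              # imaginary part of tau(i)
    return (x * x + (y - 1) ** 2) / (4 * y)


def check_quadruples(bound: int) -> int:
    """Verify u(tau(i), i) = C^2 + D^2 for all integer solutions in a box; return how many were checked."""
    checked = 0
    for A, B, C, D in product(range(-bound, bound + 1), repeat=4):
        if A * A + B * B != C * C + D * D + 1:
            continue
        a, b, c, d = A + C, D + B, D - B, A - C  # assumed form of the matrix
        assert a * d - b * c == 1
        assert u_at_i(a, b, c, d) == C * C + D * D
        checked += 1
    return checked


if __name__ == "__main__":
    print(check_quadruples(4))
```

Behind the check is the general formula $4u(\tau(i),i)+2=a^{2}+b^{2}+c^{2}+d^{2}$ for $\tau\in\textrm{SL}_{2}(\mathbb{R})$: with the assumed entries the right-hand side equals $2(A^{2}+B^{2}+C^{2}+D^{2})$, and the defining equation of $\mathcal{C}$ turns this into $4(C^{2}+D^{2})+2$.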

4. A rough spectral bound

Here we state the consequence of the application of the spectral theorem to automorphic kernels in the form needed for our purposes. It will be convenient to classify the labels of the exceptional $\text{$ \lambda $}_{\ell}$ in three sets:

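$$\Lambda_{0}=\big{\{}\ell\,:\,\text{$\lambda$}_{\ell}\text{ totally exceptional}\big{\}},\qquad\Lambda_{j}=\big{\{}\ell\not\in\Lambda_{0}\,:\,\Im t_{\ell j}\in(0,1/2)\big{\}}\quad j=1,2.$$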

Lemma 4.1.

Let $k_{1}$ and $k_{2}$ be continuous functions $k_{j}:[0,\infty)\longrightarrow\mathbb{C}$ with Selberg transforms $h_{j}$ satisfying (2.2). Consider the automorphic kernel (2.4) with $k(x,y)=k_{1}(x)k_{2}(y)$ . Define

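$$H_{j}=\sum_{n=0}^{\infty}2^{2n}\sup_{t\in I_{n}}|h_{j}(t)|\qquad\text{with}\quad I_{0}=[0,2)\text{ and }I_{n}=[2^{n},2^{n+1})\text{ for }n\geq 1$$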

and

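$$\mathcal{M}=\frac{h_{1}(i/2)h_{2}(i/2)}{|\Gamma\backslash\mathbb{H}^{2}|}+\sum_{\ell\in\Lambda_{0}}h_{1}(t_{\ell 1})h_{2}(t_{\ell 2})\overline{\psi}_{\ell}(\mathbf{z})\psi_{\ell}(\mathbf{w}).\tag{4.1}$$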

Then we have

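$$K(\mathbf{z},\mathbf{w})=\mathcal{M}+O_{\mathbf{z},\mathbf{w},\Gamma}\big{(}H_{1}H_{2}+H_{1}\sup_{\ell\in\Lambda_{2}}|h_{2}(t_{\ell_{2}})|+H_{2}\sup_{\ell\in\Lambda_{1}}|h_{1}(t_{\ell_{1}})|\big{)}.$$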

Remark. The dependence of the $O$ constant on $\mathbf{z}$ and $\mathbf{w}$ could be made explicit but it is irrelevant in our application. Of course if $\Lambda_{0}=\emptyset$ the sum over $\ell\in\Lambda_{0}$ must be omitted in $\mathcal{M}$ and if there are no exceptional eigenvalues, the same applies to the suprema over $\Lambda_{1}$ and $\Lambda_{2}$ in the error term.

Proof.

The spectral expansion of $K$ , the analogue of the Poisson summation formula, as given in [BGM11, (39)], reads

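$$K(\mathbf{z},\mathbf{w})=\sum_{\ell}h(t_{\ell})\overline{\psi}_{\ell}(\mathbf{z})\psi_{\ell}(\mathbf{w})+2\sum_{\kappa}c_{\kappa}\sum_{\mu\in\mathcal{L}_{\kappa}}\int_{(\mathbb{R}^{+})^{2}}h(t+\mu)\overline{E}(\kappa;it,i\mu;\mathbf{z})E(\kappa;it,i\mu;\mathbf{w})\,dt_{1}dt_{2}$$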

where $h(t_{1},t_{2})=h_{1}(t_{1})h_{2}(t_{2})$, $\kappa$ runs over the finitely many inequivalent cusps, $c_{\kappa}$ are positive constants, $\mathcal{L}_{\kappa}$ is a lattice in $\mathbb{R}^{2}$ and $E$ denotes the Eisenstein series. In the first sum the terms with $\ell\in\Lambda_{0}\cup\{0\}$ contribute exactly $\mathcal{M}$. Let $K^{*}=K-\mathcal{M}$; we have to prove that it is bounded by the error term in the statement. Using that $|ab|\leq(|a|^{2}+|b|^{2})/2$ we have for $\mathbf{v}=\mathbf{z}$ or $\mathbf{v}=\mathbf{w}$

[TABLE]

Now we need a form of Bessel’s inequality that allows us to bound, for a fixed $\mathbf{z}\in\mathbb{H}^{2}$ and every $(n_{1},n_{2})\in\mathbb{Z}_{\geq 0}^{2}$, the expression

[TABLE]

where

[TABLE]

The instance of Bessel’s inequality we need is [BGM11, Th. 4.2]

[TABLE]

The intuitive interpretation is that $\psi_{\ell}$ and $E$ behave as constants on average.

If we divide the integral in (4.2) into the dyadic pieces $\pm(t_{j}+\mu_{j})\in I_{n_{j}}$ indicated by $\mathcal{X}_{c}$ and, using the positivity, we apply (4.3) to each of them, we have that the last term in (4.2) contributes at most

[TABLE]

The same argument works to get this bound for the contribution of the first term in the right-hand side of (4.2) when $t_{\ell 1}$ and $t_{\ell 2}$ are real. The remaining terms have $\ell\in\Lambda_{1}\cup\Lambda_{2}$ and we can proceed in the same way keeping the supremum of $h_{j}$ if $\ell\in\Lambda_{j}$ . For instance, the terms with $\ell\in\Lambda_{1}$ contribute at most

[TABLE]

and the sum is $O(H_{2})$ by (4.3).

Therefore, we have proved that $|K^{*}(\mathbf{z},\mathbf{w})|$ is bounded by the error term appearing in the statement. ∎

5. Volume computations

The main term (4.1) in the spectral expansion depends on the volume of the fundamental region and it becomes closely related to the constant $C_{D}$ appearing in the asymptotic formulas in Theorem 1.1 and Theorem 1.2.

Our aim in this section is to prove the following result:

Proposition 5.1.

Let $\Gamma$ be the subgroup of the full Hilbert modular group $\Gamma_{\mathcal{O}}$ introduced in Lemma 3.2. The volume of a fundamental region of $\Gamma\backslash\mathbb{H}^{2}$ is given by

[TABLE]

where $\chi$ is the character corresponding to the Kronecker symbol $\Big{(}\frac{\Delta}{\cdot}\Big{)}$ and $\Delta$ is the discriminant of $\mathbb{Q}(\sqrt{D})$ i.e., $\Delta=D$ if $4\mid D-1$ and $\Delta=4D$ otherwise.

To prove it we are going to use an old result due to Siegel [Sie36b] and the computation of the index $[\Gamma_{\mathcal{O}}:\Gamma]$. In fact, we give an explicit description of coset representatives.

The elements of the full Hilbert modular group and its subgroups are pairs of matrices related by the real conjugation. This redundant presentation is important when the action on $\mathbb{H}^{2}$ is considered but from the group theoretical point of view we get an isomorphic group dropping the second matrix in the pair. For the sake of brevity, in the next result and in the rest of the section we identify $(\gamma,\gamma^{\sigma})$ and $\gamma$ when the action on $\mathbb{H}^{2}$ is irrelevant.

Proposition 5.2.

Define

[TABLE]

and

[TABLE]

where $\Omega=\{1,\omega,\overline{\omega}\}$ and $\Omega^{*}=\{\omega,\overline{\omega}\}$ . Then a complete set of representatives of the cosets $\Gamma_{\mathcal{O}}/\Gamma$ is given by

[TABLE]

Hence the index in these three cases is, respectively, 6, 9 and 15. This can be written in an artificial but compact way with the Kronecker symbol.

Corollary 5.3.

We have $[\Gamma_{\mathcal{O}}:\Gamma]=6-3\chi(2)+6\chi(4)$ with $\chi$ as in Proposition 5.1.
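The values 6, 9 and 15 of the index can be cross-checked by a finite computation that is independent of the coset representatives. The sketch below is not the argument of the paper; it rests on two assumptions stated here: every matrix congruent to the identity modulo $2\mathcal{O}$ satisfies $a+d,\,b+c\in 2\mathcal{O}$, so $\Gamma$ contains the principal congruence subgroup of level 2, and reduction $\textrm{SL}_{2}(\mathcal{O})\to\textrm{SL}_{2}(\mathcal{O}/2\mathcal{O})$ is surjective. Under these assumptions the index is a ratio of two counts over the four-element ring $\mathcal{O}/2\mathcal{O}$. The name `index_mod_2` is hypothetical.

```python
from itertools import product


def index_mod_2(D: int) -> int:
    """Ratio |SL_2(O/2O)| / |image of M| for O the ring of integers of Q(sqrt(D))."""
    # O/2O has basis 1, theta with theta^2 = p + q*theta (mod 2):
    # theta = (1 + sqrt(D))/2 if D = 1 mod 4, and theta = sqrt(D) otherwise.
    if D % 4 == 1:
        p, q = ((D - 1) // 4) % 2, 1
    else:
        p, q = D % 2, 0

    def add(x, y):
        return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

    def mul(x, y):
        top = x[1] * y[1]  # coefficient of theta^2
        return ((x[0] * y[0] + top * p) % 2,
                (x[0] * y[1] + x[1] * y[0] + top * q) % 2)

    elements = list(product((0, 1), repeat=2))
    zero, one = (0, 0), (1, 0)
    sl2_size = 0
    subgroup_size = 0
    for a, b, c, d in product(elements, repeat=4):
        # In characteristic 2 the determinant ad - bc equals ad + bc.
        if add(mul(a, d), mul(b, c)) != one:
            continue
        sl2_size += 1
        if add(a, d) == zero and add(b, c) == zero:
            subgroup_size += 1
    return sl2_size // subgroup_size


if __name__ == "__main__":
    for D in (2, 3, 5, 13, 17, 33):
        print(D, D % 8, index_mod_2(D))
```

The three outcomes correspond to 2 being ramified ($4\nmid D-1$), split ($8\mid D-1$) or inert ($8\mid D-5$) in $\mathcal{O}$, matching $6-3\chi(2)+6\chi(4)$ with $\chi(2)=0,1,-1$ respectively.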

We divide the proof of Proposition 5.2 according to whether $8$ divides $D-5$ or not. In the second case, $8\nmid D-5$, we benefit from a simple description of the group $\Gamma$ given in the following result, which allows a substantial reduction in the computations. In its proof and in that of Proposition 5.2 we will use that for $4\nmid D-1$, we have $\eta^{2}\in 2\mathcal{O}$ and when writing each $x\in\mathcal{O}$ as $x=x_{1}+x_{2}\eta$, $x_{1},x_{2}\in\mathbb{Z}$ we have $x\in 2\mathcal{O}$ if and only if $x_{1}$ and $x_{2}$ are even.

Lemma 5.4.

If $8\nmid D-5$ we have $\Gamma=C^{-1}\Gamma_{0}(2\mathcal{O})C$ where $C=ST_{1}$ and $\Gamma_{0}(2\mathcal{O})$ is the subgroup of matrices in $\Gamma_{\mathcal{O}}$ with lower left entry in $2\mathcal{O}$ . Moreover, two matrices in $\Gamma_{\mathcal{O}}$ belong to the same left coset of $\Gamma_{0}(2\mathcal{O})$ if and only if the determinant of the matrix formed by their first columns belongs to $2\mathcal{O}$ .

Proof.

Note the computations

[TABLE]

The first one shows $\Gamma\subset C^{-1}\Gamma_{0}(2\mathcal{O})C$ because if $a+d,b+c\in 2\mathcal{O}$ then $a+c-b-d\in 2\mathcal{O}$ .

The second computation shows that $\Gamma\supset C^{-1}\Gamma_{0}(2\mathcal{O})C$; this reduces to checking that $ad-1\in 2\mathcal{O}$ implies $a+d\in 2\mathcal{O}$. We consider two cases depending on whether or not $8$ divides $D-1$.

If $8\mid D-1$ , write $a=a_{1}+a_{2}\omega$ and $d=d_{1}+d_{2}\omega$ and note $\omega^{2}-\omega\in 2\mathcal{O}$ . Expanding $ad$ we get that $ad-1\in 2\mathcal{O}$ if and only if $2\nmid a_{1}d_{1}$ and $2\mid a_{1}d_{2}+a_{2}d_{1}+a_{2}d_{2}$ or equivalently if $a_{1}$ , $d_{1}$ are both odd and $a_{2}$ , $d_{2}$ are both even. Hence $a+d\in 2\mathcal{O}$ .

If $8\nmid D-1$ , write $a=a_{1}+a_{2}\eta$ and $d=d_{1}+d_{2}\eta$ . Expanding $ad$ we see that $ad-1\in 2\mathcal{O}$ implies $2\nmid a_{1}d_{1}+a_{2}d_{2}$ and $2\mid a_{1}d_{2}+a_{2}d_{1}$ in particular $a_{1}$ and $a_{2}$ have different parity. In fact we can assume $2\nmid a_{1}$ (by the symmetry $a_{1}\leftrightarrow a_{2}$ , $d_{1}\leftrightarrow d_{2}$ ) then $2\mid a_{2}$ and we conclude $2\nmid d_{1}$ , $2\mid d_{2}$ that gives $a+d\in 2\mathcal{O}$ as expected.

Finally, $\gamma_{1},\gamma_{2}\in\Gamma_{\mathcal{O}}$ belong to the same coset if and only if $\gamma_{2}^{-1}\gamma_{1}\in\Gamma_{0}(2\mathcal{O})$, and the last assertion of the lemma reduces to writing out the formula for the lower left entry of this product. ∎

Proof of Proposition 5.2 for $8\nmid D-5$.

We check first that different elements in $\mathcal{R}_{D}$ represent different cosets. A calculation shows that $T_{u}^{-1}ST_{v}\in\Gamma$ for $u,v\in\mathcal{O}$ if and only if $u-v,v^{2}\in 2\mathcal{O}$. Then $T_{u}$ and $ST_{v}$ are in different cosets when $v\in\{1,\eta+1\}$ if $4\nmid D-1$ and when $v\in\{1,\omega,\overline{\omega}\}$ if $8\mid D-1$. Clearly $T_{u}$ and $T_{v}$ belong to different cosets when $u-v\not\in 2\mathcal{O}$, since $T_{u}^{-1}T_{v}=T_{v-u}$, and the same applies to $ST_{u}$ and $ST_{v}$. It only remains to check that in the case $8\mid D-1$ the elements $T_{1}ST_{\omega}$ and $T_{1}ST_{\overline{\omega}}$ do not share a coset with the other elements. As $T_{u}$, $ST_{\omega}$ and $ST_{\overline{\omega}}$ are in different cosets, the same holds for $T_{u}$ and $T_{1}ST_{v}$ with $v\in\{\omega,\overline{\omega}\}$. Writing $u=a+bv$ and using $v^{2}-v\in 2\mathcal{O}$, after a calculation $(ST_{u})^{-1}T_{1}ST_{v}\in\Gamma$ imposes $2\mid a+1$, $2\mid b+1$ and $2\mid a+b+1$, which leads to a contradiction.

We focus firstly on the case $4\nmid D-1$ . By Lemma 5.4 we have to prove that for each element in $\Gamma_{\mathcal{O}}$ there is an element in $C\mathcal{R}_{D}C^{-1}$ belonging to the same coset of $\Gamma_{0}(2\mathcal{O})$ . The first column of the matrices in $C\mathcal{R}_{D}C^{-1}$ is given by the following vectors, except for adding to the coordinates elements of $2\mathcal{O}$ ,

[TABLE]

With the criterion given at the end of Lemma 5.4 it is enough to prove that if we add to these vectors a column with elements $a,b\in\mathcal{O}$, at least one of the corresponding determinants is in $2\mathcal{O}$. Clearly we can assume $a,b\in\{0,1,\eta,\eta+1\}$. Note that the vectors corresponding to Id and $ST_{1}$ take care of all the cases with $a$ or $b$ zero. Then there are nine cases to be considered. Three of them have $a=b$ and the determinant with the second vector is zero. For the same reason, we can also disregard the three cases in which $a$ and $b$ form the vectors corresponding to $T_{\eta}$, $T_{\eta+1}$ and $ST_{\eta+1}$. The remaining cases are $(a,b)=(\eta+1,\eta)$, $(\eta+1,1)$ and $(\eta,1)$ and we get determinants in $2\mathcal{O}$ using the vectors corresponding respectively to $T_{\eta}$, $T_{\eta+1}$ and $ST_{\eta+1}$.

We deal now with the case $8\mid D-1$ . To prove that for each element in $\Gamma_{\mathcal{O}}$ there exists an element in $C\mathcal{R}_{D}C^{-1}$ in the same coset of $\Gamma_{0}(2\mathcal{O})$ we proceed as before. This case is much simpler and no calculations are needed because after excluding the cases $b=0$ , $a=0$ and $a=b$ using as above respectively Id, $ST_{1}$ and $T_{1}$ , we have only six possibilities with $a,b\in\{1,\omega,\overline{\omega}\}$ and they are all covered since each remaining element in $C\mathcal{R}_{D}C^{-1}$ treats at least the case corresponding to its own first column. ∎
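As a sanity check on the number of cosets, the determinant criterion at the end of Lemma 5.4 can be run by brute force over $\mathcal{O}/2\mathcal{O}$. The following Python sketch is not part of the original argument: the names `residue_ring_mul`, `add` and `count_cosets` are hypothetical, and it assumes that every unimodular column modulo $2\mathcal{O}$ occurs as the first column of an element of $\Gamma_{\mathcal{O}}$. It models $\mathcal{O}/2\mathcal{O}$ with basis $\{1,\theta\}$, where $\theta=\sqrt{D}$ if $4\nmid D-1$ and $\theta=(1+\sqrt{D})/2$ otherwise, and counts the classes of first columns when two columns are identified if their determinant lies in $2\mathcal{O}$.

```python
from itertools import product

def residue_ring_mul(D):
    """Multiplication on O/2O; an element (p, q) stands for p + q*theta mod 2."""
    if D % 4 == 1:
        # theta = (1 + sqrt(D))/2 satisfies theta^2 = theta + (D - 1)/4
        s, t = ((D - 1) // 4) % 2, 1
    else:
        # theta = sqrt(D) satisfies theta^2 = D
        s, t = D % 2, 0

    def mul(u, v):
        (p1, q1), (p2, q2) = u, v
        qq = q1 * q2  # coefficient of theta^2 = s + t*theta
        return ((p1 * p2 + qq * s) % 2, (p1 * q2 + q1 * p2 + qq * t) % 2)

    return mul

def add(u, v):
    """Addition in O/2O (the same as subtraction, since 2 = 0 there)."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def count_cosets(D):
    """Number of classes of unimodular columns (a, c) modulo 2O, two columns
    being identified when the 2x2 determinant they form lies in 2O."""
    mul = residue_ring_mul(D)
    ring = list(product((0, 1), repeat=2))
    zero, one = (0, 0), (1, 0)
    # Unimodular column: x*a + y*c = 1 is solvable in O/2O.
    columns = [(a, c) for a in ring for c in ring
               if any(add(mul(x, a), mul(y, c)) == one
                      for x in ring for y in ring)]
    reps = []
    for a1, c1 in columns:
        # a2*c1 - a1*c2 equals a2*c1 + a1*c2 modulo 2
        if not any(add(mul(a2, c1), mul(a1, c2)) == zero for a2, c2 in reps):
            reps.append((a1, c1))
    return len(reps)

for D in (2, 3, 17, 5):
    print(D, count_cosets(D))
```

Under these assumptions the count is $6$ when $4\nmid D-1$ and $9$ when $8\mid D-1$, in agreement with the number of vectors used in the two cases above. For $8\mid D-5$ the same count for $\Gamma_{0}(2\mathcal{O})$ is $5$.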

If $8\mid D-5$ the group $\Gamma$ is not a conjugate of $\Gamma_{0}(2\mathcal{O})$. In this case, and actually also when $8\mid D-1$, there is a simple set of generators of $\Gamma_{\mathcal{O}}$. Using some relations among them, it is possible to simplify any word to one of the representatives indicated below multiplied by an element of $\Gamma$. The drawback of this method is that it requires distinguishing a number of cases that involve somewhat tedious calculations. The advantage is that it gives a unified treatment of the case $4\mid D-1$ (see the remarks after the proof) and it potentially works for other subgroups.

Proof of Proposition 5.2 for $8\mid D-5$.

A result due to Vaseršteĭn [Vas72] (see also [Lie81], [May07, §1.2.2] and [Eve16, §5.1]) ensures that the group $\Gamma_{\mathcal{O}}$ is generated by $S$, $T_{1}$ and $T_{\omega}$. We note that $S^{2}=-\textrm{Id}$ and $S\in\Gamma$. It is clear that, modulo $\Gamma$, we can write any element $g\in\Gamma_{\mathcal{O}}$ as an alternating product of factors equal to $S$ and $T_{u}$ with $u\in{\mathcal{O}}$, which we will generically call a word. We employ the usual notation $g_{1}\sim g_{2}$, or $g_{1}$ and $g_{2}$ equivalent modulo $\Gamma$, to mean that $g_{1},g_{2}\in\Gamma_{\mathcal{O}}$ belong to the same left coset, i.e., $g_{2}^{-1}g_{1}\in\Gamma$. Note that $gS\sim g$, so we can always consider words with $T_{u}\not\in\Gamma$ on the right.

By the multilinear properties of the multiplication of matrices, replacing $T_{u}$ in a product by $T_{u+w}$ with $w$ in $2\mathcal{O}$ changes the entries of the final result by elements of this ideal. Recalling the definition of $\Gamma$, it is easy to see that two matrices in $\Gamma_{\mathcal{O}}$ whose entries differ by elements of $2\mathcal{O}$ belong to the same coset. Hence it is enough to consider products involving translations $T_{u}$ with $u\in\Omega$, because for any $u\in\mathcal{O}$ we can find $w\in 2\mathcal{O}$ such that $u-w\in\{0\}\cup\Omega$. Note that any word of length one is equivalent to exactly one element of $C_{1}$.

A calculation shows

[TABLE]

For $v=1$ it lies in $\Gamma$ . This implies that for any $g\in\Gamma_{\mathcal{O}}$ ,

[TABLE]

Hence any element of the form $gT_{1}\not\in\Gamma$ is equivalent to $T_{1}$ or $ST_{1}$ .

The set $C_{2}$ contains representatives of all the words of length 2 and they are clearly nonequivalent. We have

[TABLE]

and it is easy to deduce that $C_{1}\cup C_{2}$ does not contain equivalent elements. As shown before, the words $T_{u}ST_{v}$ can be simplified if $v=1$ , hence they are all equivalent to some element in $C_{1}\cup C_{2}\cup C_{3}$ . Noticing

[TABLE]

it follows from (5.3) that $C_{1}\cup C_{3}$ does not contain equivalent elements and by (5.1) (recall $S^{-1}=-S$ ) any element in $C_{2}$ is not equivalent to an element in $C_{3}$ .

For the words of length four $ST_{u}ST_{v}$ we could consider in principle $u\in\Omega$ , $v\in\Omega^{*}$ which makes six possibilities but some calculations show that $(T_{\overline{\omega}}ST_{\omega})^{-1}ST_{\overline{\omega}}ST_{\omega}$ and $(T_{1}ST_{\overline{\omega}})^{-1}ST_{1}ST_{\omega}$ are respectively

[TABLE]

that belong to $\Gamma$ using (5.4). Taking conjugates we deduce

[TABLE]

Then the only possibilities for elements not equivalent to those in $C_{1}\cup C_{2}\cup C_{3}$ are the ones considered in $C_{4}$ .

Assuming that the elements in $\mathcal{R}_{D}$ are not equivalent, we are going to prove that they form a complete set of representatives. This follows by an inductive argument if we prove that the words with length 5 are equivalent to shorter words because we have seen that the words of length at most 4 are equivalent to elements in $\mathcal{R}_{D}$ . By (5.5) and (5.2) it is enough to consider words with $ST_{\omega}ST_{\omega}$ or $ST_{\overline{\omega}}ST_{\overline{\omega}}$ to the right. In fact, taking real conjugates if necessary, it is enough to prove that $T_{u}ST_{\omega}ST_{\omega}\sim ST_{\omega}ST_{\omega}$ , or equivalently

[TABLE]

Using (5.4) we have that $\omega-\omega^{2}+1\in 2\mathcal{O}$ and then the trace of this matrix belongs to $2\mathcal{O}$ . In the same way, $(\omega^{2}-1)^{2}-\omega^{2}=\omega^{4}-3\omega^{2}+1$ differs from $\omega-3\overline{\omega}+1\in 2\mathcal{O}$ in an element of $2\mathcal{O}$ . Hence the matrix is in $\Gamma$ .

It remains to prove that $\mathcal{R}_{D}$ does not contain equivalent elements. Recall that we knew that this was the case for $C_{1}\cup C_{2}\cup C_{3}$. Note that $(ST_{\overline{\omega}}ST_{\overline{\omega}})^{-1}ST_{\omega}ST_{\omega}=T_{\overline{\omega}}ST_{1}ST_{\omega}$, which is not in $\Gamma$ by (5.1). Then $ST_{\omega}ST_{\omega}\not\sim ST_{\overline{\omega}}ST_{\overline{\omega}}$. By real conjugation we can restrict ourselves to checking that $g=ST_{\omega}ST_{\omega}$ is not equivalent to any element in $C_{1}\cup C_{2}\cup C_{3}$. From (5.6), $T_{u}g\sim g$ and then $g$ is not equivalent to any element in $C_{1}\cup C_{3}$. Clearly $h^{-1}g\in C_{2}\cup C_{3}$ for $h\in C_{2}$, so $g\not\sim h$ also in this case and the proof is complete. ∎

As mentioned before, it is possible to extend the previous proof to cover the case $8\mid D-1$ . We now sketch the alternative proof following the same lines as in the case when $8|D-5$ .

The calculation (5.1) does not depend on the properties of $D$ and we can deduce as before that $gT_{1}$ is equivalent to Id, $T_{1}$ or $ST_{1}$ . In fact (5.1) implies

[TABLE]

The latter two establish the main difference with the case $8\mid D-5$ and are due to the fact that (5.4) must be replaced by $\omega\overline{\omega}\in 2\mathcal{O}$ and $\omega^{2}-\overline{\omega}+1\in 2\mathcal{O}$ when $8\mid D-1$. Multiplying on the left by $T_{1}$ one deduces (recall $\omega+1-\overline{\omega}\in 2\mathcal{O}$)

[TABLE]

Hence any word of length at most 3 is equivalent to one of the elements in $C_{1}\cup C_{2}\cup\big{\{}T_{1}ST_{\omega},T_{1}ST_{\overline{\omega}}\big{\}}$ . These elements are inequivalent (this can be obtained essentially following the scheme of the previous proof, where a bigger set is considered).

To conclude that it is a full set of representatives we have to prove that any word of length 4 can be reduced to one of its elements. By (5.7) and (5.8) it is enough to consider $ST_{1}ST_{\omega}$ and $ST_{1}ST_{\overline{\omega}}$ . The following calculation shows $ST_{1}ST_{\omega}\sim T_{1}ST_{\overline{\omega}}$

[TABLE]

and by conjugation $ST_{1}ST_{\overline{\omega}}\sim T_{1}ST_{\omega}$ .

Summarizing, the relations (5.5) were not enough to exclude words of length 4 when $8\mid D-5$ while the two last formulas in (5.7) cause the collapse of the potential new possibilities in the case $8\mid D-1$ .

Proof of Proposition 5.1.

Siegel proved a closed formula [Sie36b, (19)] for $|\Gamma_{\mathcal{O}}\backslash\mathbb{H}^{d}|$ when $\mathcal{O}$ is the ring of integers of a totally real number field. In the quadratic case, it reads

[TABLE]

where $\zeta_{D}$ is the Dedekind zeta function of $\mathbb{Q}(\sqrt{D})$ . We have $\zeta_{D}(2)=\zeta(2)L(2,\chi)=\frac{1}{6}\pi^{2}L(2,\chi)$ with $L$ the Dirichlet $L$ -function associated to the character $\chi$ . By the functional equation of $L$ in its asymmetric form [MV07, Cor.10.9] (note that $\chi$ is primitive and even)

[TABLE]

By Proposition 5.2, the volume of the fundamental region for $\Gamma$ is $6-3\chi(2)+6\chi(4)$ times that for $\Gamma_{\mathcal{O}}$ , hence

[TABLE]

Now we appeal to the known formula [Was97, Th.4.2] (as an aside, we note that the evaluation of $L(1-n,\chi)$ plays an important role in the definition of $p$ -adic $L$ -functions)

[TABLE]

As $\chi$ is even,

[TABLE]

and the sum of both sides equals $\Delta\sum_{n=1}^{\Delta}\chi(n)=0$, hence both vanish. Then we can replace $B_{2}(n/\Delta)$ by $(n/\Delta)^{2}$. ∎
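For the reader's convenience, the last replacement can be spelled out. This is only a worked restatement of the step above, using the standard expression $B_{2}(x)=x^{2}-x+\frac{1}{6}$ for the second Bernoulli polynomial together with the vanishing of $\sum_{n=1}^{\Delta}\chi(n)$ and $\sum_{n=1}^{\Delta}n\chi(n)$:

$$\sum_{n=1}^{\Delta}\chi(n)B_{2}\Big(\frac{n}{\Delta}\Big)=\frac{1}{\Delta^{2}}\sum_{n=1}^{\Delta}n^{2}\chi(n)-\frac{1}{\Delta}\sum_{n=1}^{\Delta}n\chi(n)+\frac{1}{6}\sum_{n=1}^{\Delta}\chi(n)=\frac{1}{\Delta^{2}}\sum_{n=1}^{\Delta}n^{2}\chi(n).$$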

6. Proof of the main results

After Lemma 3.2 and Lemma 4.1, the proof of Theorem 1.1 and Theorem 1.2 boils down to approximating sharp cutoffs by smooth kernels and estimating Selberg transforms. Following [Cha96], with a hyperbolic version of a classical Euclidean device, it is possible to reduce the whole problem to estimating the Selberg transform of the characteristic function of an interval (which is done in [Cha96, Lemma 2.4]). With this idea in mind we write $\chi_{V}$ and $h_{V}$ respectively for the characteristic function of $[0,V]$ and its Selberg transform

[TABLE]

The slow decay of $h_{V}$ would cause serious convergence problems when applying the spectral theorem. The idea introduced in [Cha96] is to replace $\chi_{V}$ by a manageable approximation in such a way that its Selberg transform is like a product of two $h_{V}$’s at different scales. This doubles the decay without requiring estimates of new transforms. The key argument is a simple one and it is summarized in the following result.

Lemma 6.1.

For $0<v<V$ consider the function $F:\mathbb{H}^{2}\rightarrow\mathbb{R}$ given by

[TABLE]

Then there exists $f:[0,\infty)\rightarrow\mathbb{R}$ such that $F(z,w)=f\big{(}u(z,w)\big{)}$ and

[TABLE]

Moreover the Selberg transform of $f$ is $h_{V}h_{v}$ .

Proof.

Note that $F(\gamma z,\gamma w)=F(z,w)$ for $\gamma\in\text{PSL}_{2}(\mathbb{R})$ because $d\mu$ is the invariant measure; hence $F(z,w)$ only depends on $\rho(z,w)$, which ensures the existence of $f$. Using geodesic polar coordinates [Iwa02, (1.17)] it is plain that $\int_{\mathbb{H}}\chi_{v}\big{(}u(\zeta,w)\big{)}\;d\mu(\zeta)=4\pi v$ (the area of a hyperbolic circle) and hence $0\leq f\leq 1$.
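The value $4\pi v$ can be checked in geodesic polar coordinates $(r,\theta)$ centred at $w$, where $d\mu=\sinh r\,dr\,d\theta$. The following short computation is a sketch that assumes the normalization $2u+1=\cosh\rho$ used in the rest of the proof and writes $\widetilde{v}$ for the radius with $\cosh\widetilde{v}=2v+1$:

$$\int_{\mathbb{H}}\chi_{v}\big(u(\zeta,w)\big)\,d\mu(\zeta)=\int_{0}^{2\pi}\int_{0}^{\widetilde{v}}\sinh r\,dr\,d\theta=2\pi(\cosh\widetilde{v}-1)=4\pi v.$$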

Take $\widetilde{V},\widetilde{v}>0$ such that $2V+1=\cosh\widetilde{V}$ and $2v+1=\cosh\widetilde{v}$ . Then by (2.1) we can write

[TABLE]

By the triangle inequality, if $\rho(z,i)\geq\widetilde{V}+\widetilde{v}$ the integral vanishes. As $0\leq f\leq 1$, we can rewrite this as $f\big{(}(\cosh x-1)/2\big{)}\leq\chi_{\widetilde{V}+\widetilde{v}}(x)$ for $x\geq 0$, which means $f\leq\chi_{V^{+}}$ with $2V^{+}+1=\cosh(\widetilde{V}+\widetilde{v})$, and the addition formula for $\cosh$ gives the claimed formula for $V^{+}$. In the same way, if $\rho(z,i)\leq\widetilde{V}-\widetilde{v}$ then $\rho(\zeta,i)<\widetilde{v}$ implies $\rho(z,\zeta)<\widetilde{V}$ and consequently we can omit $\chi_{\widetilde{V}}$ to get $f\big{(}u(z,i)\big{)}=1$. This can be rephrased as $f\big{(}(\cosh x-1)/2\big{)}\geq\chi_{\widetilde{V}-\widetilde{v}}(x)$ for $x\geq 0$ and, proceeding as before, $f\geq\chi_{V^{-}}$.
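The use of the addition formula can be made explicit. The derivation below is a worked sketch that relies only on $2V+1=\cosh\widetilde{V}$, $2v+1=\cosh\widetilde{v}$ and $2V^{\pm}+1=\cosh(\widetilde{V}\pm\widetilde{v})$, so that $\sinh\widetilde{V}=2\sqrt{V(V+1)}$ and $\sinh\widetilde{v}=2\sqrt{v(v+1)}$; the outcome should agree with the expression for $V^{\pm}$ in the statement of the lemma.

$$2V^{\pm}+1=\cosh\widetilde{V}\cosh\widetilde{v}\pm\sinh\widetilde{V}\sinh\widetilde{v}=(2V+1)(2v+1)\pm 4\sqrt{V(V+1)v(v+1)},$$

$$V^{\pm}=V+v+2Vv\pm 2\sqrt{Vv(V+1)(v+1)}.$$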

The last part of the statement reduces to an application of Fubini’s theorem using that

[TABLE]

See [CRRC13, Lemma 2.2] for the details and a general result. ∎

Theorem 1.1 and Theorem 1.2 will be proved by an application of Lemma 4.1, for different choices of the parameters $v_{1}$ and $v_{2}$ in the following result.

Proposition 6.2.

Let $V_{j}>0$ and $0<v_{j}<C\min(1,V_{j})$ , $j=1,2$ where $0<C<1$ is an absolute constant. Then there exists $V_{j}^{\prime}>0$ with $V_{j}^{\prime}=V_{j}+O\big{(}V_{j}v_{j}^{1/2}+(V_{j}v_{j})^{1/2}\big{)}$ such that

[TABLE]

where the Selberg transform of $k_{j}$ is $(4\pi v_{j})^{-1}h_{V_{j}^{\prime}}h_{v_{j}}$ .

Remark. The constant $C$ can be replaced by an explicit numerical value by following the steps in the proof, but this value is unimportant for our arguments, so we shall not carry out this calculation.

Proof.

Let us abbreviate $M_{j}=\max(1,V_{j}^{-1/2})$ ; then $0<C^{-1/2}v_{j}^{1/2}M_{j}<1$ .

In Lemma 6.1 take $v=v_{j}$ and $V=V_{j}(t)$ with $V_{j}(t)=V_{j}+tC^{-1/4}V_{j}M_{j}v_{j}^{1/2}$ and $t\in[-1,1]$. For $t$ in this interval, the hypothesis $0<v<V$ is satisfied choosing $C$ small enough because $C^{-1/4}M_{j}v_{j}^{1/2}\leq C^{1/4}\to 0$ when $C\to 0^{+}$. Let $f_{tj}$ be the corresponding function $f$ in Lemma 6.1, which ensures that $f_{-1\,j}\leq\chi_{V_{j}}\leq f_{1\,j}$ under

[TABLE]

We have

[TABLE]

hence the inequalities hold choosing $C$ small enough.

Consequently, defining $K_{t}$ as the automorphic kernel (2.4) that corresponds to $k_{t}(x,y)=f_{t1}(x)f_{t2}(y)$ , we have by Lemma 3.2

[TABLE]

The intermediate value theorem implies $N_{D}(V_{1},V_{2})=2K_{t_{0}}(V_{1},V_{2})$ for some $t_{0}\in[-1,1]$ and the result follows with $V_{j}^{\prime}=V_{j}(t_{0})$ . ∎
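Two elementary estimates are implicit in the proof above and can be verified directly. As a sketch, from $v_{j}<C\min(1,V_{j})$ and the identity $\min(1,V_{j})^{1/2}\max(1,V_{j}^{-1/2})=1$ one gets

$$v_{j}^{1/2}M_{j}<C^{1/2}\min(1,V_{j})^{1/2}\max(1,V_{j}^{-1/2})=C^{1/2},$$

which gives both $C^{-1/2}v_{j}^{1/2}M_{j}<1$ and $C^{-1/4}M_{j}v_{j}^{1/2}<C^{1/4}$. Moreover, since $V_{j}M_{j}=\max(V_{j},V_{j}^{1/2})\leq V_{j}+V_{j}^{1/2}$, for $|t|\leq 1$ we have

$$|V_{j}(t)-V_{j}|\leq C^{-1/4}V_{j}M_{j}v_{j}^{1/2}\leq C^{-1/4}\big(V_{j}v_{j}^{1/2}+(V_{j}v_{j})^{1/2}\big),$$

which is the size of $V_{j}^{\prime}-V_{j}$ claimed in the statement.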

For the sake of completeness, we include here the estimates we need for the Selberg transform. A more precise result with asymptotic formulas is given in [Cha96].

Lemma 6.3.

The Selberg transform $h_{V}$ of the characteristic function of $[0,V]$ satisfies for $t\geq 0$

[TABLE]

and for $t$ pure imaginary $0<\Im t\leq 1/2$ we have

[TABLE]

Proof.

This is a consequence of Lemma 2.4 in [Cha96]. ∎

The logarithm appearing in these estimates cannot be avoided when $V$ is large and $t$ close to [math]. This is reflected in some average results for the hyperbolic circle problem [PR94].

Now, we are ready to prove our main results.

Proof of Theorem 1.1.

Take in Proposition 6.2 $v_{1}=v_{2}=(V_{1}V_{2})^{-1/2}$ , hence we have $V_{1}^{\prime}=V_{1}+O\big{(}V_{1}^{3/4}V_{2}^{-1/4}\big{)}$ and $V_{2}^{\prime}=V_{2}+O\big{(}V_{2}^{3/4}V_{1}^{-1/4}\big{)}$ . With the notation of Lemma 4.1, for $t\in I_{n}$ , (6.1) gives

[TABLE]

and a term $\log(V_{j}+1)$ must be introduced for $t\in I_{0}$ . Then $H_{j}\ll V_{j}^{1/2}(V_{1}V_{2})^{1/8}$ . On the other hand, the $1/9$ bound for $\Im t_{\ell j}$ from [KS02] implies by (6.2)

[TABLE]

Consequently, the sum over $\ell\in\Lambda_{0}$ in Lemma 4.1 contributes $O\big{(}(V_{1}V_{2})^{11/18}\big{)}$ and, as $h_{V_{j}^{\prime}}(i/2)=4\pi V_{j}^{\prime}$ and $h_{v_{j}}(i/2)=4\pi v_{j}$ , we obtain the main term (4.1) of the automorphic kernel

[TABLE]

where we have used expression (1.5) and Proposition 5.1. On the other hand, the error term in Lemma 4.1 is

[TABLE]

and clearly this is $O\big{(}(V_{1}V_{2})^{3/4}\big{)}$ . ∎
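The exponent bookkeeping in this proof can be summarized as follows. This is a worked sketch which assumes $V_{1},V_{2}\geq 1$ for simplicity and uses only the choice $v_{1}=v_{2}=(V_{1}V_{2})^{-1/2}$, the size of $V_{j}^{\prime}-V_{j}$ from Proposition 6.2 and the bounds $H_{j}\ll V_{j}^{1/2}(V_{1}V_{2})^{1/8}$:

$$V_{1}v_{1}^{1/2}=V_{1}^{3/4}V_{2}^{-1/4},\qquad (V_{1}v_{1})^{1/2}=V_{1}^{1/4}V_{2}^{-1/4}\leq V_{1}^{3/4}V_{2}^{-1/4},$$

$$V_{1}^{\prime}V_{2}^{\prime}-V_{1}V_{2}\ll V_{1}^{3/4}V_{2}^{-1/4}\,V_{2}+V_{2}^{3/4}V_{1}^{-1/4}\,V_{1}+(V_{1}V_{2})^{1/2}\ll(V_{1}V_{2})^{3/4},$$

$$H_{1}H_{2}\ll(V_{1}V_{2})^{1/2}(V_{1}V_{2})^{1/4}=(V_{1}V_{2})^{3/4},\qquad \frac{11}{18}<\frac{3}{4}.$$

In particular, replacing $V_{j}^{\prime}$ by $V_{j}$ in the main term and the contribution $O\big((V_{1}V_{2})^{11/18}\big)$ are both within the final error $O\big((V_{1}V_{2})^{3/4}\big)$.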

Proof of Theorem 1.2.

In Proposition 6.2 take $v_{1}=(V_{1}V_{2}^{2})^{-1/2}$ and $v_{2}=V_{2}^{-1/2}$, which implies $V_{1}^{\prime}=V_{1}+O\big{(}V_{1}^{1/2}V_{2}^{-1}\big{)}$ and $V_{2}^{\prime}=V_{2}+O\big{(}V_{1}^{-1/4}V_{2}^{1/2}\big{)}$. The Selberg transforms $h_{V_{1}^{\prime}}$, $h_{v_{1}}$ and $h_{v_{2}}$ admit bounds as in the previous proof, which give

[TABLE]

and $H_{1}\ll V_{1}^{1/2}(V_{1}V_{2}^{2})^{1/8}=V_{1}^{5/8}V_{2}^{1/4}$ .

The difference is that for $h_{V_{2}^{\prime}}$ we have to use the second bound in (6.1) because $V_{2}^{\prime}<1$ . Then in this case

[TABLE]

which leads to $H_{2}\ll V_{1}^{1/8}V_{2}^{1/4}$ .

If there are no exceptional eigenvalues, the error term in Lemma 4.1 is simply $O(H_{1}H_{2})=O\big{(}V_{1}^{3/4}V_{2}^{1/2}\big{)}$ and we get the first part of the result.

For the second part, note that we only need a bound for $\Im t_{\ell_{j}}$ when $j=1$ because the second formula of (6.2) does not involve $t$ . Under our assumption

[TABLE]

Then the error term in Lemma 4.1 includes two new terms following $V_{1}^{3/4}V_{2}^{1/2}$ , namely, it is

[TABLE]

Finally, note that the finite sum in $\mathcal{M}$ contributes $O\big{(}V_{1}^{1/2+c}V_{2}\big{)}$ that is absorbed by the error term. ∎

7. Appendix. Numerical considerations

A difficulty in carrying out numerical calculations related to (3.1) is that $r$ is essentially a divisor function in $\mathcal{O}[i]$ and it seems hard to implement in an efficient way. Here we make some comments for the reader interested in doing direct computations and we display some of the data obtained in this way.

A simple calculation, already displayed in the proof of Lemma 3.1, shows that for $\lambda=x+y\sqrt{D}\in\mathcal{O}$ the value of $r(\lambda)$ is the number of solutions $a,b,c,d$ of

[TABLE]

Here, $a+b\sqrt{D},c+d\sqrt{D}\in\mathcal{O}$. Equivalently, we consider $a,b,c,d\in\mathbb{Z}$ if $D\not\equiv 1\ (4)$, and $2a,2b,2c,2d,a-b,c-d\in\mathbb{Z}$ if $D\equiv 1\ (4)$.

The solutions of (7.1) enjoy some symmetries that are useful for numerical calculations.

Proposition 7.1.

Given $\lambda=x+y\sqrt{D}\in\mathcal{O}$ , let $M_{j}$ and $M^{*}_{k}$ be the number of solutions $a+b\sqrt{D},c+d\sqrt{D}\in\mathcal{O}$ of (7.1) satisfying the mutually exclusive conditions indicated in this list:

[TABLE]

Then $r(\lambda)=8(M_{1}+M_{2}+M_{3}+M_{4})+2(M_{1}^{*}+M_{2}^{*})$ .

Proof.

We have a group of transformations acting on the solutions, $G\cong\mathbb{Z}_{2}\times\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ , generated by

[TABLE]

This action is free (fixed point free) except for the set of solutions $S^{*}$ satisfying one of the following four sets of conditions:

[TABLE]

The solutions satisfying the first or the second set of conditions amount to $2M_{1}^{*}$, and the rest of the conditions give $2M_{2}^{*}$ solutions. In the complement of $S^{*}$ the action of $G$ gives equivalence classes with eight elements and we can always select exactly one of these eight elements satisfying the conditions indicated for $M_{1}$, $M_{2}$, $M_{3}$ and $M_{4}$. This follows by imposing an ordering between [math], $a$ and $c$ when possible, i.e., when there are no coincidences among them. In this way $M_{1}$ counts the ordered case and $M_{2}$, $M_{3}$ and $M_{4}$ the cases with coincidences. ∎

For the numerical calculation of $N_{D}(V_{1},V_{2})$ note that $0\leq\lambda<V_{1}$ and $0\leq\lambda^{\sigma}<V_{2}$ is equivalent to

[TABLE]

The first one readily imposes upper bounds on $|a|$ , $|b|$ , $|c|$ and $|d|$ . If we let $a$ , $b$ , $c$ and $d$ vary in these ranges and, for each combination, we add 1 to the position $x$ , $y$ of a matrix with $x$ and $y$ as in (7.1), then this matrix will store the required values of $r(\lambda)$ and the evaluation of $r(\lambda)r(\lambda+1)$ in (3.1) corresponds to a sum of products of adjacent entries.
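The accumulation procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used for the tables: it assumes $\mathcal{O}=\mathbb{Z}[\sqrt{D}]$ (as for $D=2$), so that $a,b,c,d$ are integers, and it takes the relations between $x$, $y$ and $a,b,c,d$ to be $x=a^{2}+c^{2}+D(b^{2}+d^{2})$, $y=2(ab+cd)$, obtained by expanding $(a+b\sqrt{D})^{2}+(c+d\sqrt{D})^{2}$. The names `r_table` and `N` are hypothetical, a dictionary replaces the matrix, floating point is used for the inequalities, and no symmetry reduction is applied.

```python
import math
from collections import Counter

def r_table(D, V1, V2):
    """Counter mapping (x, y) to r(x + y*sqrt(D)) for all lambda with
    0 <= lambda < V1 and 0 <= lambda^sigma < V2, assuming O = Z[sqrt(D)]."""
    s = math.sqrt(D)
    # If xi^2 + eta^2 = lambda then xi^2 < V1 and (xi^sigma)^2 < V2,
    # which bounds the integer coordinates of xi = a + b*sqrt(D).
    bound = math.sqrt(V1) + math.sqrt(V2)
    A = int(bound / 2) + 1
    B = int(bound / (2 * s)) + 1
    candidates = [(a, b)
                  for a in range(-A, A + 1) for b in range(-B, B + 1)
                  if (a + b * s) ** 2 < V1 and (a - b * s) ** 2 < V2]
    table = Counter()
    for a, b in candidates:
        for c, d in candidates:
            x = a * a + c * c + D * (b * b + d * d)
            y = 2 * (a * b + c * d)
            if x + y * s < V1 and x - y * s < V2:
                table[(x, y)] += 1   # one more representation of x + y*sqrt(D)
    return table

def N(D, V1, V2):
    """Sum of r(lambda) * r(lambda + 1) over 0 <= lambda < V1, 0 <= lambda^sigma < V2."""
    # lambda + 1 can be as large as V1 + 1 (and its conjugate as large as V2 + 1).
    table = r_table(D, V1 + 1, V2 + 1)
    s = math.sqrt(D)
    return sum(count * table.get((x + 1, y), 0)
               for (x, y), count in table.items()
               if x + y * s < V1 and x - y * s < V2)

table = r_table(2, 3, 3)
print(table[(1, 0)], table[(2, 0)], N(2, 2, 2))
```

The cost grows like the square of the number of admissible $\xi=a+b\sqrt{D}$, so this direct form is only suitable for small ranges; the symmetries of Proposition 7.1 and a dense array are the natural next steps for larger ones.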

Now Proposition 7.1 allows us to reduce the ranges of $a$, $b$, $c$ and $d$ to be considered by a factor of approximately $8$. In the implementation, the loop corresponding to $M_{1}$ takes the bulk of the calculations because the rest of the loops involve fewer free variables.

The actual issue for direct computations of this kind is the limit of the fast-access allocatable memory to store the matrix. The size of the stored elements can be reduced to one fourth by noting that $r(\lambda)=r(\lambda^{\sigma})$ and that $y$ is always even by (7.1). With these reductions, when $D\not\equiv 1\ (4)$, in the more demanding case $V_{1}\approx V_{2}\approx V$ (less sparse matrix), the memory to be allocated for the values of $x$ and $y$ is around

[TABLE]

integers. When $V_{1}$ and $V_{2}$ are very unbalanced, meaning $V_{1}$ large and $V_{2}<1$, then $y$ is determined by $x$ and one only needs room for roughly $V_{1}/2$ integer values of $x$. If $D\equiv 1\ (4)$ these figures should be multiplied by $2$.

We mention here a couple of tables for $D=2$ that we have obtained by implementing the previous idea in an average PC with a simple C program.

As in the case of the Gauss circle problem, the error term in the asymptotic approximation of $N_{2}(V,V)$ oscillates and to get a reliable idea on the limits for a general upper bound, rather than picking large special values of $V$ , it is more informative to consider

[TABLE]

Extensive computations prove

[TABLE]

As an aside, a possibility that we have not explored is to carry out numerical calculations via Lemma 3.2 with a description of the group in terms of words and relations (cf. [PR94]). It looks promising when $4\mid D-1$ because $\Gamma_{\mathcal{O}}$ has simple generators and Proposition 5.2 gives a full description of the cosets but when $4\nmid D-1$ the natural generators are linked to the class group [Eve16, §5.2] and the calculations for large values of $D$ could be more difficult. In both cases one should have some control on $u(\gamma(\mathbf{i}),\mathbf{i})$ in terms of the length of $\gamma$ .

In the case of unbalanced arguments of $N_{2}$ , the numerical experiments suggest that the asymptotic formula

[TABLE]

in Theorem 1.2 could hold in some ranges beyond the condition $V_{1}V_{2}^{2}\to\infty$ , but even for reasonably large $V_{1}$ the values of $N_{2}(V_{1},V_{2})$ are too small to draw trustworthy conjectures. For instance, if we define

[TABLE]

then we have

[TABLE]

It is tempting to claim that $G(V)$ goes to $1$ when $V\to\infty$ .

We finish with some considerations about the constant $C_{D}$ giving the limit of $N_{D}(V,V)/V^{2}$. Here is a table of some values

[TABLE]

As we mentioned before, and the table confirms, $C_{D}$ decays as $D^{-3/2}$ . This decay can be proved theoretically with explicit constants. Namely, we have

Proposition 7.2.

With the notation as in (1.5), we have

[TABLE]

This is in accordance with the previous table, in which the minimal value of $\Delta^{3/2}C_{D}$ is around $84.12$ (reached at $D=1001$) and the maximal value is around $181.02$ (reached at $D=2$).

Proof.

The formula (5.9) due to Siegel for the volume of the fundamental region of the full Hilbert modular group and Corollary 5.3 imply

[TABLE]

Clearly

[TABLE]

Using $\zeta(2)=\pi^{2}/6$ , $\zeta(4)=\pi^{4}/90$ and comparing with Proposition 5.1,

[TABLE]

Substituting in the definition of $C_{D}$ , we complete the proof. ∎
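The values of $\Delta^{3/2}C_{D}$ discussed in this section can be recomputed directly from the definition of $C_{D}$ in (1.5). The Python sketch below is an illustration with hypothetical names (`kronecker`, `discriminant`, `C`, `normalized`); it assumes that $D>1$ is squarefree, that $\Delta$ is the discriminant of $\mathbb{Q}(\sqrt{D})$ and that $\chi$ is the Kronecker symbol $(\Delta/\cdot)$.

```python
def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and an integer n >= 1."""
    result = 1
    while n % 2 == 0:              # contribution of the prime 2
        n //= 2
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            result = -result
    a %= n                         # n is now odd: Jacobi symbol via reciprocity
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def discriminant(D):
    """Discriminant of Q(sqrt(D)) for squarefree D > 1."""
    return D if D % 4 == 1 else 4 * D

def C(D):
    """C_D = 32*Delta / ((2 - chi(2) + 2*chi(4)) * sum_{n<=Delta} n^2 chi(n))."""
    delta = discriminant(D)
    chi = lambda n: kronecker(delta, n)
    weighted = sum(n * n * chi(n) for n in range(1, delta + 1))
    return 32 * delta / ((2 - chi(2) + 2 * chi(4)) * weighted)

def normalized(D):
    """The quantity Delta^(3/2) * C_D."""
    return discriminant(D) ** 1.5 * C(D)

for D in (2, 3, 5):
    print(D, C(D), normalized(D))
```

For $D=2$ this gives $C_{2}=8$ and $\Delta^{3/2}C_{2}\approx 181.02$. The sum over $n\leq\Delta$ makes the sketch linear in $\Delta$, so it is only practical for moderate discriminants.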

Sharper bounds can be proved separating congruence classes modulo 8. For instance, if $8\mid D-1$ the previous argument leads to

[TABLE]

and in fact $\Delta^{3/2}C_{D}\to 64$ if we choose $D=4\prod_{p\leq N}p+1$ with $N\to\infty$ to force $\chi(p)=1$ for small primes. For example, taking $N=19$ we have $D=38798761$ and $\Delta^{3/2}C_{D}\approx 64.84$.

Bibliography

[Ber54] S. Berg. Part 1 of Selberg’s Gottingen lectures: a rough transcription. Available on the website of the Institute for Advanced Study, https://publications.ias.edu/selberg, 1954.
[BGM11] R. W. Bruggeman, F. Grunewald, and R. J. Miatello. New lattice point asymptotics for products of upper half-planes. Int. Math. Res. Not. IMRN, (7):1510–1559, 2011.
[BM09] R. W. Bruggeman and R. J. Miatello. Sum formula for $\mathrm{SL}_{2}$ over a totally real number field. Mem. Amer. Math. Soc., 197(919):vi+81, 2009.
[BW17] J. Bourgain and N. Watt. Mean square of zeta function, circle problem and divisor problem revisited. arXiv:1709.04340 [math.AP], 2017.
[Cha96] F. Chamizo. Some applications of large sieve in Riemann surfaces. Acta Arith., 77(4):315–337, 1996.
[Cha99] F. Chamizo. Correlated sums of $r(n)$. J. Math. Soc. Japan, 51(1):237–252, 1999.
[CRRC13] F. Chamizo, D. Raboso, and S. Ruiz-Cabello. Exotic approximate identities and Maass forms. Acta Arith., 159(1):27–46, 2013.
[Eve16] L. M. Everhart. On generators of Hilbert modular groups of totally real number fields. Master’s thesis, UNC Greensboro, Greensboro, 2016. Available online at https://libres.uncg.edu/ir/uncg/f/Everhart_uncg_0154M_11992.pdf.
