Symmetry projected Jastrow mean field wavefunction in variational Monte   Carlo

Ankit Mahajan; Sandeep Sharma

arXiv:1902.07690·cond-mat.str-el·June 19, 2019

Symmetry projected Jastrow mean field wavefunction in variational Monte Carlo

Ankit Mahajan, Sandeep Sharma

PDF

TL;DR

This paper introduces an extension of variational Monte Carlo methods to optimize symmetry-projected Jastrow mean field wavefunctions, enabling efficient treatment of complex correlated quantum systems with multiple symmetries.

Contribution

The authors develop a low-scaling VMC algorithm for optimizing symmetry-projected Jastrow mean field wavefunctions, including various advanced wavefunction forms like JAGP and RVB, with applications to benchmark systems.

Findings

01

Significant correlation energy captured in benchmark systems.

02

Efficient optimization with multiple broken symmetries.

03

Ability to compute reduced density matrices and other observables.

Abstract

We extend our low-scaling variational Monte Carlo (VMC) algorithm to optimize the symmetry projected Jastrow mean field (SJMF) wavefunctions. These wavefunctions consist of a symmetry-projected product of a Jastrow and a general broken-symmetry mean field reference. Examples include Jastrow antisymmetrized geminal power (JAGP), Jastrow-Pfaffians, and resonating valence bond (RVB) states among others, all of which can be treated with our algorithm. We will demonstrate using benchmark systems including the nitrogen molecule, a chain of hydrogen atoms, and the 2-D Hubbard model that a significant amount of correlation can be obtained by optimizing the energy of the SJMF wavefunction. This can be achieved at a relatively small cost when multiple symmetries including spin, particle number, and complex conjugation are simultaneously broken and projected. We also show that reduced density…

Figures2

Click any figure to enlarge with its caption.

Tables6

Table 1. Table 1: The table lists the various mean-field wavefunctions and the symmetries that are broken in them.

Ansatz	Broken symmetries
GHF	$S_{z}$ , $S^{2}$ , $K$
UHF	$S^{2}$ , $K$
RHF	$K$
GBCS	$N$ , $S_{z}$ , $S^{2}$ , $K$
UBCS	$N$ , $S^{2}$ , $K$
RBCS	$N$ , $K$

Table 2. Table 2: Errors in the ground state energy (E h ) for the \ce H8 linear chain calculated using various wavefunctions.

Wavefunction	$d = 1.4$	$d = 1.8$	$d = 2.4$
RHF	0.085	0.124	0.219
UHF	0.085	0.121	0.135
AGP	0.055	0.088	0.172
$S_{z}$ GHF	0.040	0.053	0.068
K $S_{z}$ GHF	0.021	0.029	0.037
K $S_{z}$ Pfaffian	0.010	0.016	0.025
J-K $S_{z}$ Pfaffian	0.0001	0.0002	0.0002

Table 3. Table 3: Errors in energy (mE h ) per electron for the ground state of the \ce H50 chain with open boundary conditions as a function of interatomic distance (Bohr). Exact energies were obtained using DMRG. MC statistical errors are less than 0.02 mE h /electron.

$d$	J-KRHF		J-KAGP		J-KUHF		J-K $S_{z}$ GHF		J-K $S_{z}$ Pfaffian
	VMC	GFMC	VMC	GFMC	VMC	GFMC	VMC	GFMC	VMC	GFMC
1.6	1.93	0.84	1.61	0.68	1.66	0.70	0.92	0.30	0.68	0.22
1.8	2.64	1.14	2.02	0.91	2.17	0.98	0.94	0.26	0.79	0.23
2.5	3.59	1.60	2.96	1.26	3.43	1.47	0.76	0.18	0.62	0.12

Table 4. Table 4: Errors in ground state energy per electron (units of 10 − 3 t superscript 10 3 𝑡 10^{-3}t ) for the 3 2 × 3 2 3 2 3 2 3\sqrt{2}\times 3\sqrt{2} tilted Hubbard model with periodic boundary conditions for different U / t 𝑈 𝑡 U/t values at half-filling. MC statistical errors are less than 0.1 × 10 − 3 t / 0.1\times 10^{-3}\ t/ electron.

$U / t$	J- $S_{z}$ GHF		J-K $S_{z}$ GHF		J- $S_{z}$ Pfaffian		J-K $S_{z}$ Pfaffian
	VMC	GFMC	VMC	GFMC	VMC	GFMC	VMC	GFMC
4	7.3	2.6	5.1	1.9	6.8	2.3	4.9	1.9
8	10.2	4.1	7.0	2.0	10.1	3.9	6.8	2.0
10	8.9	2.5	5.7	1.4	7.6	2.3	5.4	1.2
20	3.9	0.7	2.9	0.2	3.9	0.6	2.9	0.2

Table 5. Table 5: Ground state energy per electron (units of t 𝑡 t ) for the 7 2 × 7 2 7 2 7 2 7\sqrt{2}\times 7\sqrt{2} tilted Hubbard model with periodic boundary conditions for different U 𝑈 U values. The energies in the first two columns use the J-K S z subscript 𝑆 𝑧 S_{z} GHF wavefunction. The E ref ref {}_{\text{ref}} results are taken from reference 68 . MC statistical errors are less than 0.1 × 10 − 3 t / 0.1\times 10^{-3}\ t/ electron.

$U / t$	VMC	GFMC	E $_{ref}$
2	-1.1920	-1.1939	-1.1962
4	-0.8566	-0.8598	-0.8620
8	-0.5183	-0.5221	-0.5237

Table 6. Table 6: Density correlation function ⟨ n i n j ⟩ delimited-⟨⟩ subscript 𝑛 𝑖 subscript 𝑛 𝑗 \langle n_{i}n_{j}\rangle for the 3 2 × 3 2 3 2 3 2 3\sqrt{2}\times 3\sqrt{2} tilted Hubbard model with periodic boundary conditions for U / t = 4 𝑈 𝑡 4 U/t=4 at half-filling for different values of distance between sites i 𝑖 i and j 𝑗 j . MC statistical errors are less than 0.001 0.001 0.001 .

Wavefunction	$d = 1$	$d = \sqrt{2}$	$d = 2$	$d = \sqrt{5}$	$d = 3$
DMRG	0.944	0.993	0.992	0.991	0.998
J-K $S_{z}$ GHF	0.942	0.992	0.993	0.992	0.997
J-K $S_{z}$ Pfaffian	0.941	0.993	0.992	0.990	0.997

Equations110

⟨ H ⟩

⟨ H ⟩

= n \sum ρ_{n} E_{L} [n] = ⟨ E_{L} [n] ⟩_{ρ_{n}}

E_{L} [n]

E_{L} [n]

= m \sum H_{n, m} \frac{⟨ m ∣Ψ ( p )⟩}{⟨ n ∣Ψ ( p )⟩},

E_{L} [n, ϵ] =

E_{L} [n, ϵ] =

r (m \leftarrow n) = (\frac{ρ ( m )}{ρ ( n )})^{1/2} = \frac{⟨ m ∣Ψ ( p )⟩}{⟨ n ∣Ψ ( p )⟩}

r (m \leftarrow n) = (\frac{ρ ( m )}{ρ ( n )})^{1/2} = \frac{⟨ m ∣Ψ ( p )⟩}{⟨ n ∣Ψ ( p )⟩}

m^{(i)} =

m^{(i)} =

n^{(i)} =

Δ p_{j} =

Δ p_{j} =

∣ ψ ⟩ = \hat{C} \hat{P} ∣ ϕ_{0} ⟩,

∣ ψ ⟩ = \hat{C} \hat{P} ∣ ϕ_{0} ⟩,

[\hat{H}, \hat{N}] = [\hat{H}, \hat{S}] = [\hat{H}, \hat{K}] = 0,

[\hat{H}, \hat{N}] = [\hat{H}, \hat{S}] = [\hat{H}, \hat{K}] = 0,

∣ GHF ⟩ = k = 1 \prod N_{e} \overset{a}{^}_{k}^{†} ∣0 ⟩,

∣ GHF ⟩ = k = 1 \prod N_{e} \overset{a}{^}_{k}^{†} ∣0 ⟩,

\overset{a}{^}_{k}^{†} = p = 1 \sum M σ =↑, ↓ \sum θ_{k}^{p σ} \overset{a}{^}_{p σ}^{†},

\overset{a}{^}_{k}^{†} = p = 1 \sum M σ =↑, ↓ \sum θ_{k}^{p σ} \overset{a}{^}_{p σ}^{†},

∣ UHF ⟩ = k = 1 \prod N_{↑} \overset{a}{^}_{k ↑}^{†} l = 1 \prod N_{↓} \overset{a}{^}_{l ↓}^{†} ∣0 ⟩,

∣ UHF ⟩ = k = 1 \prod N_{↑} \overset{a}{^}_{k ↑}^{†} l = 1 \prod N_{↓} \overset{a}{^}_{l ↓}^{†} ∣0 ⟩,

\overset{a}{^}_{k ↑}^{†} = p = 1 \sum M θ_{k ↑}^{p} \overset{a}{^}_{p ↑}^{†}, \overset{a}{^}_{l ↓}^{†} = p = 1 \sum M θ_{l ↓}^{p} \overset{a}{^}_{p ↓}^{†} .

\overset{a}{^}_{k ↑}^{†} = p = 1 \sum M θ_{k ↑}^{p} \overset{a}{^}_{p ↑}^{†}, \overset{a}{^}_{l ↓}^{†} = p = 1 \sum M θ_{l ↓}^{p} \overset{a}{^}_{p ↓}^{†} .

∣ GBCS ⟩ = exp (p σ, q γ \sum F_{p σ, q γ} \overset{a}{^}_{p σ}^{†} \overset{a}{^}_{q γ}^{†}) ∣0 ⟩,

∣ GBCS ⟩ = exp (p σ, q γ \sum F_{p σ, q γ} \overset{a}{^}_{p σ}^{†} \overset{a}{^}_{q γ}^{†}) ∣0 ⟩,

∣ Pf ⟩ = \hat{P}_{N} ∣ GBCS ⟩ = \frac{1}{p !} [mσ, nγ \sum F_{mσ, nγ} \overset{a}{^}_{mσ}^{†} \overset{a}{^}_{nγ}^{†}]^{N_{e} /2} ∣0 ⟩ .

∣ Pf ⟩ = \hat{P}_{N} ∣ GBCS ⟩ = \frac{1}{p !} [mσ, nγ \sum F_{mσ, nγ} \overset{a}{^}_{mσ}^{†} \overset{a}{^}_{nγ}^{†}]^{N_{e} /2} ∣0 ⟩ .

∣ UBCS ⟩ = exp (p, q \sum F_{p ↑, q ↓} \overset{a}{^}_{p ↑}^{†} \overset{a}{^}_{q ↓}^{†}) ∣0 ⟩,

∣ UBCS ⟩ = exp (p, q \sum F_{p ↑, q ↓} \overset{a}{^}_{p ↑}^{†} \overset{a}{^}_{q ↓}^{†}) ∣0 ⟩,

∣ GHF ⟩ = k = 1 \prod N_{e} \overset{a}{^}_{k}^{†} ∣0 ⟩ = p = 1 \prod N_{e} /2 (\overset{a}{^}_{2 p - 1}^{†} \overset{a}{^}_{2 p}^{†}) ∣0 ⟩ \propto p = 1 \sum N_{e} /2 \overset{a}{^}_{2 p - 1}^{†} \overset{a}{^}_{2 p}^{†}^{N_{e} /2} ∣0 ⟩ = mσ, nγ \sum p = 1 \sum N_{e} /2 θ_{2 p - 1}^{mσ} θ_{2 p}^{nγ} \overset{a}{^}_{mσ}^{†} \overset{a}{^}_{nγ}^{†}^{N_{e} /2} ∣0 ⟩,

∣ GHF ⟩ = k = 1 \prod N_{e} \overset{a}{^}_{k}^{†} ∣0 ⟩ = p = 1 \prod N_{e} /2 (\overset{a}{^}_{2 p - 1}^{†} \overset{a}{^}_{2 p}^{†}) ∣0 ⟩ \propto p = 1 \sum N_{e} /2 \overset{a}{^}_{2 p - 1}^{†} \overset{a}{^}_{2 p}^{†}^{N_{e} /2} ∣0 ⟩ = mσ, nγ \sum p = 1 \sum N_{e} /2 θ_{2 p - 1}^{mσ} θ_{2 p}^{nγ} \overset{a}{^}_{mσ}^{†} \overset{a}{^}_{nγ}^{†}^{N_{e} /2} ∣0 ⟩,

F = θ A θ^{T},

F = θ A θ^{T},

\hat{J}

\hat{J}

= p σ \geq q γ \prod [1 - \overset{n}{^}_{p σ} \overset{n}{^}_{q γ} (1 - J_{p σ, q γ})],

J_{p σ, q γ} = exp (v_{p σ, q γ}) .

J_{p σ, q γ} = exp (v_{p σ, q γ}) .

⟨ m ∣ ψ ⟩ = ⟨ m ∣ \hat{C} \hat{P} ∣ ϕ_{0} ⟩ = C [m] ⟨ m ∣ \hat{P} ∣ ϕ_{0} ⟩,

⟨ m ∣ ψ ⟩ = ⟨ m ∣ \hat{C} \hat{P} ∣ ϕ_{0} ⟩ = C [m] ⟨ m ∣ \hat{P} ∣ ϕ_{0} ⟩,

C [m] = p σ, q γ \prod occ. J_{p σ, q γ},

C [m] = p σ, q γ \prod occ. J_{p σ, q γ},

⟨ m ∣ \hat{P}_{K} \hat{P}_{S_{z}} \hat{P}_{N} ∣ ϕ_{0} ⟩ = Re [⟨ m ∣ ϕ_{0} ⟩],

⟨ m ∣ \hat{P}_{K} \hat{P}_{S_{z}} \hat{P}_{N} ∣ ϕ_{0} ⟩ = Re [⟨ m ∣ ϕ_{0} ⟩],

⟨ m ∣ GHF ⟩ = det (θ [m]),

⟨ m ∣ GHF ⟩ = det (θ [m]),

⟨ m ∣ GBCS ⟩ = ⟨ m ∣ PF ⟩ = pf (F [m]),

⟨ m ∣ GBCS ⟩ = ⟨ m ∣ PF ⟩ = pf (F [m]),

pf (A) = P \sum (- 1)^{P} A_{i_{1} j_{1}} A_{i_{2} j_{2}} \dots A_{i_{M} j_{M}},

pf (A) = P \sum (- 1)^{P} A_{i_{1} j_{1}} A_{i_{2} j_{2}} \dots A_{i_{M} j_{M}},

pf [0 - C^{T} C 0] \propto det (C) .

pf [0 - C^{T} C 0] \propto det (C) .

⟨ m ∣ BCS ⟩ = ⟨ m ∣ AGP ⟩ = det (F [m]),

⟨ m ∣ BCS ⟩ = ⟨ m ∣ AGP ⟩ = det (F [m]),

E_{L} [n] = \frac{\sum _{m} ⟨ n ∣ H ∣ m ⟩ ⟨ m ∣ ψ ⟩}{⟨ n ∣ ψ ⟩} = m \sum H_{n, m} \frac{⟨ m ∣ ψ ⟩}{⟨ n ∣ ψ ⟩},

E_{L} [n] = \frac{\sum _{m} ⟨ n ∣ H ∣ m ⟩ ⟨ m ∣ ψ ⟩}{⟨ n ∣ ψ ⟩} = m \sum H_{n, m} \frac{⟨ m ∣ ψ ⟩}{⟨ n ∣ ψ ⟩},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Symmetry Projected Jastrow Mean-field Wavefunction in Variational Monte Carlo

Ankit Mahajan

[email protected]

Department of Chemistry and Biochemistry, University of Colorado Boulder, Boulder, CO 80302, USA

Sandeep Sharma

[email protected]

Department of Chemistry and Biochemistry, University of Colorado Boulder, Boulder, CO 80302, USA

Abstract

We extend our low-scaling variational Monte Carlo (VMC) algorithm to optimize the symmetry projected Jastrow mean field (SJMF) wavefunctions. These wavefunctions consist of a symmetry-projected product of a Jastrow and a general broken-symmetry mean field reference. Examples include Jastrow antisymmetrized geminal power (JAGP), Jastrow-Pfaffians, and resonating valence bond (RVB) states among others, all of which can be treated with our algorithm. We will demonstrate using benchmark systems including the nitrogen molecule, a chain of hydrogen atoms, and 2-D Hubbard model that a significant amount of correlation can be obtained by optimizing the energy of the SJMF wavefunction. This can be achieved at a relatively small cost when multiple symmetries including spin, particle number, and complex conjugation are simultaneously broken and projected. We also show that reduced density matrices can be calculated using the optimized wavefunctions, which allows us to calculate other observables such as correlation functions and will enable us to embed the VMC algorithm in a complete active space self-consistent field (CASSCF) calculation.

1 Introduction

Variational Monte Carlo (VMC) is one of the most versatile methods available for obtaining the wavefunction and energy of a system.Foulkes et al. (2001); Nightingale and Umrigar (1999); Toulouse et al. (2016); Kolorenč and Mitas (2011); Becca and Sorella (2017); McMillan (1965); Ceperley et al. (1977) Compared to deterministic variational methods, VMC allows much greater flexibility in the functional form of the wavefunction. In particular, if one can calculate the overlap of the wavefunction with a Slater determinant at polynomial cost, then it is possible to perform an efficient VMC calculation. Thus sometimes VMC is the only feasible method for calculating wavefunctions of challenging quantum systems.

While the accuracy of VMC is limited by the flexibility of the wavefunction ansatz, projector Monte Carlo (PMC) does not suffer from this shortcoming. However, the cost of performing an unbiased PMC calculation for fermionic systems scales exponentially (with the exception of special cases that display usable symmetries) due to the fermion sign problem. The most common way of overcoming this exponential scaling is by introducing the so-called fixed-node bias. The cost of the fixed-node PMC algorithm is polynomial with the system size, but it no longer delivers unbiased energies. The bias, or the error of PMC, strongly depends on the accuracy of the nodal structure which is often obtained from a VMC calculation. Thus, VMC in addition to being extremely powerful in its own right, also determines the accuracy of various flavors of PMC calculations such as diffusion Monte Carlo (DMC),Grimm and Storer (1971); Anderson (1975); Ceperley and Alder (1980) Green’s function Monte Carlo (GFMC)Runge (1992); Trivedi and Ceperley (1990); Sorella (1998) and Auxilliary field quantum Monte Carlo (AFQMC).Zhang and Krakauer (2003); Motta and Zhang

The most commonly used version of VMC is the real-space-VMC, while its orbital-space counterpart is predominately restricted to use with model Hamiltonians such as Hubbard, Heisenberg, etc. The major reason for this is that the cost of performing orbital-space VMC on an ab-initio Hamiltonian scales a factor of $O(N)$ worse than both, (1) the cost of performing real-space VMC on an ab-initio Hamiltonian and (2) the cost of performing orbital-space VMC on a model Hamiltonian. Recently, we have demonstrated that this cost discrepancy can be removed by introducing three algorithmic improvementsSabzevari and Sharma (2018). The most significant of these allowed us to efficiently screen the two-electron integrals which are obtained by projecting the ab-initio Hamiltonian onto a finite orbital basis. The efficient screening reduced the cost of local energy calculation from $O(N^{4})$ to $O(N^{2})$ and lowered the overall cost of the algorithm for obtaining a system-size-independent stochastic error from $O(N^{5})$ to $O(N^{4})$ bringing it on par with other VMC calculations.

In this follow-up work, we will use this newly introduced algorithm to study the performance of the various SJMF states. Projection of symmetry broken mean field (without the Jastrow) wavefunctions is a well-established technique in nuclear physicsRing and Schuck (2004); Blaizot and Ripka (1986) and electronic structure theory.Bach et al. (1994); Löwdin (1955) Recently, these wavefunctions have received renewed attention due to the work of Scuseria et al.Scuseria et al. (2011); Jiménez-Hoyos et al. (2012), who have shown that several symmetries can simultaneously be broken to recover a significant part of the strong correlation at a mean-field cost. They have also shown that a greater amount of strong electron correlation can be captured by including a linearized form of the Jastrow factor with these spin-projected reference states.Henderson and Scuseria (2013) Attempts to include dynamical correlation in this context have also appeared including perturbation theory,Tsuchimochi and Ten-no (2016) configuration interaction,Tsuchimochi and Ten-no (2016) and coupled cluster.Qiu et al. (2018)

Unfortunately not all these symmetry projected mean field wavefunctions (e.g., AGP) are size consistent. This shortcoming can be remedied by using local number projectors, which take the form of Hilbert space Jastrow factors as shown by Neuscamman.Neuscamman (2012, 2013) In real space VMC, use of Jastrow factors with AGP wavefunctions was first proposed by Sorella et al.Casula and Sorella (2003); Casula et al. (2004); Sorella et al. (2007) In addition to making the wavefunction size-consistent, the Jastrow correlator also recovers some dynamical correlation missing from the symmetry projected mean field reference.Neuscamman (2016) The drawback is that it is no longer possible to calculate the energy efficiently using a deterministic algorithm and one has to resort to the VMC algorithm. Imada and co-workersTahara and Imada (2008, 2008); Kurita et al. (2015); Zhao et al. (2017); Darmawan et al. (2018) have performed VMC calculations using these SJMF wavefunctions to study model Hamiltonians. Here, we use the full exponential form of the Jastrow on top of a reference that breaks number, spin, and complex conjugation symmetries as the wavefunction ansatz in variational Monte Carlo. We will show that this general wavefunction can be used to study molecular systems across potential energy surfaces and model systems over their parameter space.

The rest of the article is organized as follows. We begin by recapitulating the most important aspects of our VMC algorithm in Section 2.1. We then discuss the wavefunction ansatz and details of symmetry breaking and projection (2.2), here we explain the notation used by various researchers and the relations between them. In section 2.3, we present the computational details for evaluating local energy and its gradient efficiently. Finally, we present benchmark results of calculations on the dissociation of the \ceN2 molecule, hydrogen chain, and the two dimensional Hubbard model (3).

2 Theory and Methods

2.1 Algorithm

In VMC, the energy of a suitably parametrized wavefunction is minimized. The energy of a wavefunction $\Psi(\mathbf{p})$ , where $\mathbf{p}$ is the set of wavefunction parameters, is calculated by performing a Monte Carlo summation.

[TABLE]

where, $E_{L}[\mathbf{n}]=\frac{\langle\mathbf{n}|H|\Psi(\mathbf{p})\rangle}{\langle\mathbf{n}|\Psi(\mathbf{p})\rangle}$ is the local energy of a Slater determinant $\mathbf{n}$ and $\rho_{\mathbf{n}}=\frac{|\langle\Psi(\mathbf{p})|\mathbf{n}\rangle|^{2}}{\sum_{\mathbf{k}}|\langle\Psi(\mathbf{p})|\mathbf{k}\rangle|^{2}}$ is the probability distribution of the determinants in the wavefunction. There are three aspects of a VMC algorithm: (1) efficient calculation of the local energy, (2) sample determinants according to the probability distribution $\rho_{\mathbf{n}}$ and (3) optimizing the parameters $\mathbf{p}$ to minimize the energy of the wavefunction. We have recently proposed a set of improvements to all these steps to reduce the cost and lower the scaling of the algorithm. These will be summarized below, but we refer the reader to our original publication Ref.16 for more details.

Reduced scaling evaluation of the local energy

The local energy $E_{L}[\mathbf{n}]$ of a determinant $|\mathbf{n}\rangle$ is calculated as follows

[TABLE]

where the sum is over all determinants $\mathbf{m}$ that have a nonzero transition matrix element ( $H_{\mathbf{n},\mathbf{m}}=\langle\mathbf{n}|H|\mathbf{m}\rangle$ ) with $\mathbf{n}$ . The number of such non-zero matrix elements $H_{\mathbf{n},\mathbf{m}}$ is on the order of $n_{e}^{2}n_{o}^{2}$ , where $n_{e}$ is the number of electrons and $n_{o}$ is the number of open orbitals. This number increases as a fourth power of the system size and naive use of this formula results in an algorithm that scales poorly with the size of the problem. To reduce the cost of calculating the local energy we truncate the summation over all $\mathbf{m}$ to just a summation over those $\mathbf{m}$ , that have a Hamiltonian transition matrix element larger than a user-specified threshold i.e.

[TABLE]

where $\epsilon$ on the summation denotes that only those terms are included for which $|H_{\mathbf{n},\mathbf{m}}|>\epsilon$ . Note that in the limit that $\epsilon\rightarrow 0$ , we recover the exact local energy, $E_{L}[\mathbf{n},\epsilon]\rightarrow E_{L}[\mathbf{n}]$ . It is useful to note that when a local basis set is used the number of elements $H_{\mathbf{n},\mathbf{m}}$ that have a magnitude larger than a fixed non-zero $\epsilon$ scales quadratically with the size of the system. Thus if we are able to efficiently screen the transition matrix elements for a given $\epsilon\neq 0$ , no matter how small the $\epsilon$ is, we are guaranteed to obtain a quadratically scaling evaluation of the local energy. This trick of screening matrix elements is inspired by the heat-bath configuration interaction (HCI) algorithm.Holmes et al. (2016)

Continuous time Monte Carlo for sampling determinants

The usual procedure for generating determinants $\mathbf{n}$ according to a probability distribution $\rho_{\mathbf{n}}$ uses the Metropolis-Hastings algorithm in which a random walk is performed to generate a Markov chain. The efficiency of this algorithm depends on the proposal probability distribution used in simulating the random walks. A good proposal probability distribution will lead to large steps with very few rejections, but in practice, it is not easy to suggest such a distribution. In this work, we bypass the need for devising complicated proposal probability distributions, by using the continuous time Monte Carlo (CTMC).Bortz et al. (1975); Gillespie (1976) This is an alternative to the Metropolis-Hastings algorithm and has the advantage that every proposed move is accepted. Here, the CTMC algorithm is realized by using the following steps:

Starting from a determinant $\mathbf{n}$ calculate the quantity

[TABLE]

for all determinants $\mathbf{m}$ that are connected to $\mathbf{n}$ by a single excitation or a double excitation. 2. 2.

Calculate the quantity $t_{\mathbf{n}}=\frac{1}{\sum_{\mathbf{m}}r(\mathbf{m}\leftarrow\mathbf{n})}$ and assign that weight to the walker $\mathbf{n}$ . 3. 3.

Next, a new determinant is selected, without rejection, out of all the determinants $\mathbf{m}$ with a probability proportional to $r(\mathbf{m}\leftarrow\mathbf{n})$ .

We note that in the VMC algorithm, the quantities $\left|\frac{\langle\mathbf{m}|\Psi(\mathbf{p})\rangle}{\langle\mathbf{n}|\Psi(\mathbf{p})\rangle}\right|$ are already used in the calculation of the local energy (see Equation 3) and just by storing those quantities the CTMC algorithm can be used with almost no overhead once the local energy has been calculated.

AMSGrad algorithm for optimizing the energy

The optimized wavefunction ( $\Psi(\mathbf{p})$ ) is obtained by minimizing its energy with respect to its parameters $\mathbf{p}$ . This is a challenging optimization problem because the energy is a non-linear function of the wavefunction parameters. Further, the gradient of the energy with respect to the parameters is noisy because a stochastic method is used. Several first-order optimization algorithms (that only require gradients) such as the conjugate gradient method become unstable when the gradient is noisy. Booth and coworkersSchwarz et al. (2017) first proposed the use of adaptive stochastic gradient descent (SGD) optimization in VMC. In our previous work, we have demonstrated that the SGD method called AMSGradReddi et al. (2018) can be used to effectively optimize the VMC wavefunctions. In AMSGrad an exponentially decaying moving average of the first and second moments $\mathbf{m}$ and $\mathbf{n}$ are respectively calculated

[TABLE]

where, $\beta_{1}$ and $\beta_{2}$ are used to determine the rate of decay. These first and second moments are then used to update the parameters ( $\mathbf{p}$ )

[TABLE]

where, $\alpha$ determines the step size. In equations 6 and 7, the product and division are element-wise operations. AMSgrad has the advantage that the CPU and memory cost scales linearly with the number of wavefunction parameters and in our tests, it outperformed simple gradient descent. In the calculations performed in this paper, we have used the parameters $\alpha=0.01,\beta_{1}=0.1,\beta_{2}=0.01$ which give satisfactory convergence rates (in some cases we have to run a few iterations with less aggressive parameters until reasonable estimates of the first and second moments $\mathbf{m}$ and $\mathbf{n}$ are built up).

2.2 Wavefunctions

The accuracy of the VMC results depends critically on the wavefunction ansatz employed. The ansatz needs to be general enough to capture the relevant physics of the system, however, to be computationally tractable with the VMC algorithm, the wavefunction must allow efficient computation (at polynomial cost) of the overlap with a Slater determinant. A wavefunction that satisfies both these requirements is used in this work and has the form

[TABLE]

where $\hat{C}$ is a correlator and $\hat{P}$ is the projector that restores symmetries of the symmetry-broken mean-field reference $|\phi_{0}\rangle$ . A combination of different correlators and references can be used to represent the ground state of the system under study. In this section, we study each of these terms in detail.

Mean field Reference

The reference describes uncorrelated electrons, in other words, it is the ground state of a general quadratic Hamiltonian. Here we will allow the mean field wavefunction ( $\phi_{0}$ ) of the system to break the symmetries of the Hamiltonian, which gives the wavefunction additional variational flexibility resulting in lower energies. However, the wavefunction also has a projector ( $\hat{P}$ ), that restores these symmetries. The functional form of the resulting wavefunction $\hat{P}|\phi_{0}\rangle$ depends on the symmetries that are broken and restored, which we will describe in this section.

In a finite system, the true ground state obeys all the symmetries of the Hamiltonian. On the other hand, the VMC wavefunction is an approximate ansatz and forcing it to obey these symmetries can only restrict its variational freedom thereby raising its energy.Lykos and Pratt (1963) One can get around this by allowing the reference to break the symmetries and then projecting it onto the desired symmetry sector. This has the advantage of affording the wavefunction more variational freedom as well as correct symmetries. More physically, breaking symmetries introduces quantum fluctuations in the reference necessary for representing multideterminant states while projection serves to filter out unwanted fluctuations.

We note an important point the regarding optimization of such wavefunctions. One could either optimize the symmetry-broken reference without projectors and apply the projectors after optimization. Alternatively, one could optimize the symmetry-broken reference in the presence of projectors. The wavefunction produced by the former procedure is in the variational space explored by the latter. Thus the variation after projection approach is more general and always gives lower energies. In this work, we will use the wavefunction obtained by performing the variation after projection approach.

Here we will break and restore the particle number, spin, and complex conjugation symmetries. Molecular electronic systems in the absence of spin-orbit coupling and Hubbard model with real hopping parameters obey all these symmetries, i.e.

[TABLE]

where $\hat{N}$ is the number operator, $\mathbf{\hat{S}}$ is the vector spin operator, and $\hat{K}$ is the complex conjugation operator. The number ( $U(1)$ ) and spin ( $SU(2)$ ) symmetries are continuous, while complex conjugation is discrete. The projection after variation approach has previously been used in deterministic algorithms,Scuseria et al. (2011); Jiménez-Hoyos et al. (2012) where continuous symmetry projectors were written by discretizing the integrals obtained by using the generator coordinate method, while the discrete projector $\hat{P}_{K}$ was implemented by diagonalizing a $2\times 2$ matrix. In VMC, these expensive integrals can be avoided for certain symmetries. To see this, recall that the central quantity of interest is the overlap ( $\langle\mathbf{n}|\hat{P}|\phi_{0}\rangle$ ) of the wavefunction ( $\hat{P}|\phi_{0}\rangle$ ) with a walker which is simply a Slater determinant ( $|\mathbf{n}\rangle$ ). In VMC, instead of applying the projector on to the symmetry broken mean-field wavefunction, we apply it to the walker $|\mathbf{n}\rangle$ . Thus, the $\hat{N}$ and $\hat{S}_{z}$ projections can be done by using walkers with the desired number of electrons and spin component. The $\hat{S}^{2}$ projection still needs to be done using an integral projector and we will not be using it here. Complex conjugation symmetry can be restored by simply taking the real part of the overlap ( $\langle\mathbf{n}|\hat{P}|\phi_{0}\rangle$ ).

First, let’s consider reference states that have a fixed particle number, i.e. those obeying the particle number symmetry. These are the Slater determinants widely used in Hartree-Fock (HF) methods. The most general form of a Slater determinant used is the generalized HF (GHF) which is given by

[TABLE]

where $N_{e}$ is the number of electrons and $\hat{a}_{k}^{\dagger}$ creates an electron in the molecular orbital $k$ given by

[TABLE]

where $\hat{a}_{p\sigma}^{\dagger}$ creates an electron in the spatially local atomic orbital $p$ with spin $\sigma$ , $M$ is the number of atomic orbitals, and $\theta^{p\sigma}_{k}$ are complex numbers. In this paper, we order the spin orbital indices such that all spin up orbitals come before all the spin down ones, i.e. $i\uparrow<j\downarrow$ for all $i$ and $j$ . Note that the GHF molecular orbitals are not separable into spatial and spin parts in general, i.e. their spatial and spin degrees of freedom can be entangled. By putting constraints on the $\theta$ matrix we can obtain specialized forms of Slater determinants. The unrestricted Hartree Fock (UHF) wavefunction is given by

[TABLE]

where

[TABLE]

In restricted Hartree Fock (RHF), the state is further restricted by the requirement $\theta^{p}_{k\uparrow}=\theta^{p}_{k\downarrow}$ .

Now, let’s look at reference states that break the particle number symmetry. Here we will only consider systems with an even number of electrons, although extension to the odd case is possible. The most general such state for a system with even number of electrons is given by

[TABLE]

where $p$ and $q$ are the spatial orbital indices, while $\sigma$ and $\gamma$ are spin indices. $F_{p\sigma,q\gamma}$ are complex numbers and $F_{p\sigma,q\gamma}=-F_{q\gamma,p\sigma}$ due to fermionic anticommutation. This is a generalized Bardeen-Cooper-Schrieffer (GBCS) wavefunction. Its number projected form ( $\hat{P}_{N}|\text{GBCS}\rangle$ ) is known as PfaffianBecca and Sorella (2017)

[TABLE]

By allowing only opposite spin triplet pairings, i.e.

[TABLE]

we get the unrestricted BCS (UBCS) wavefunction. Further restricting the pairing matrix to be symmetric ( $F_{p\uparrow,q\downarrow}=F_{q\uparrow,p\downarrow}$ ) ensuring that each pairing is a singlet, we get the conventional restricted BCS (RBCS) wavefunction. Its number projected form ( $\hat{P}_{N}|\text{RBCS}\rangle$ ) is known as antisymmetrized geminal product (AGP).Casula and Sorella (2003); Casula et al. (2004)

Despite their distinct appearance, Slater determinants and pairing wavefunctions are closely relatedMisawa et al. (2019). We can express the GHF wavefunction in a pairing form as follows:

[TABLE]

where in the third line we have ignored an unimportant normalization factor and used the fact that products of pairs of creation operators commute with each other. This shows that GHF is a special form of Pfaffians given in Equation 13. The explicit relation between the GHF coefficient matrix and the corresponding Pfaffian pairing matrix is thus given (to within an unimportant overall normalization factor) by

[TABLE]

where $A$ is a $N_{e}\times N_{e}$ block diagonal matrix with $\begin{bmatrix}0&1\\ -1&0\end{bmatrix}$ as blocks. We can similarly prove that RHF is a special case of AGP.

Correlators

Correlation between the motion of different electrons is not completely captured by the reference outlined above. The correlators ( $\hat{C}$ ) acting upon a reference encode these correlations explicitly and can in principle, with sufficiently large correlators, account for 100% of the electron correlation. We have previously used correlated product states (CPS)Neuscamman et al. (2011, 2012) as correlators, however, here we use Hilbert space Jastrows because of their more compact representation. They are completely equivalent to two-electron CPS. We use the following form of the Jastrow:

[TABLE]

where $n_{p\sigma}$ and $n_{q\gamma}$ are number operators for the spin orbitals $p\sigma$ and $q\gamma$ , respectively and $J_{p\sigma,q\gamma}$ are the variational parameters related to $v_{p\sigma,q\gamma}$ in the exponential form by

[TABLE]

The Jastrow is not invariant to unitary rotations of these spin orbitals and thus a judicious choice is necessary to ensure good quality. Although the optimal choice of the spin orbital basis is far from obvious, the use of local basis ensures that the wavefunction is size consistent due to its ability to perform local particle number projections.Neuscamman (2012, 2013, 2016) Jastrows in local basis includes the onsite Gutzwiller factorsGutzwiller (1963) as well as long-range density correlations.Capello et al. (2005, 2005) Thus, in this work, we use local bases to represent the Jastrows. It has also been shown that Jastrow is a limited form of coupled cluster doubles operator,Neuscamman (2016) that impart some dynamical correlation to the wavefunction.

2.3 Computational details

At each iteration of the VMC algorithm, the overlap between a walker and the wavefunction is needed. This overlap can be calculated by using the expression

[TABLE]

where we have used the fact that the Jastrow is diagonal in the configuration space of the local orbitals. $|\mathbf{m}\rangle$ is a determinant in the local orbital Hilbert space used in the definition of references orbitals (Eq 10) and Jastrow factors (Eq. 16). Let us examine each of these terms in more detail.

•

The overlap with the Jastrow is simply given by

[TABLE]

where the product is over all pairs of occupied spin orbitals. This computation has $O(N^{2})$ cost.

•

For the projectors considered here, we have

[TABLE]

where we have used the fact that the walker $|\mathbf{m}\rangle$ is an $\hat{S_{z}}$ and $\hat{N}$ eigenstate with desired eigenvalues.

•

When $|\phi_{0}\rangle$ is the GHF state, we get

[TABLE]

where $\text{det}(\theta[\mathbf{m}])$ is the determinant of the $N\times N$ matrix $\theta\left[\mathbf{m}\right]$ which itself is the slice of the coefficient matrix obtained by using the rows corresponding to spin orbitals occupied in $|\mathbf{m}\rangle$ . Overlaps can be calculated similarly for UHF and RHF. For the GBCS wavefunction, we have

[TABLE]

where $F\left[\mathbf{m}\right]$ is the slice of the pairing matrix obtained by using rows and columns corresponding to spin orbitals occupied in $|\mathbf{m}\rangle$ . Pfaffian of a skew-symmetric matrix $[A]_{2M\times 2M}$ is defined as

[TABLE]

where the sum is over all partitions $\{(i_{k},j_{k})\}$ of the $2M$ indices with $i_{k}<j_{k}$ and $(-1)^{\mathcal{P}}$ is the parity of the partition $\mathcal{P}$ . It is possible to calculate the Pfaffian of a skew-symmetric matrix in $O(N^{3})$ (same as the determinant calculation) steps using the Parlett-Reid algorithm which is based on Gaussian transformations.Wimmer (2012) Pfaffian has the property

[TABLE]

The pairing matrix for BCS has the form on the left side of the above equation. Thus

[TABLE]

where $F\left[\mathbf{m}\right]$ is the slice of the AGP pairing matrix with rows and columns corresponding to up and down spin orbitals occupied in $|\mathbf{m}\rangle$ , respectively.

Local energy calculation

The local energy $E_{L}[\mathbf{n}]$ of a determinant $|\mathbf{n}\rangle$ is calculated as follows

[TABLE]

where the sum is over all determinants $\mathbf{m}$ that have a nonzero transition matrix element ( $H_{\mathbf{n},\mathbf{m}}=\langle\mathbf{n}|H|\mathbf{m}\rangle$ ) with $\mathbf{n}$ . Note that for the molecular Hamiltonian only determinants connected by at most two electron excitations have a nonzero transition matrix element. For performing fast VMC calculations it is essential to be able to calculate the overlap ratios

[TABLE]

efficiently. A naive calculation of the correlator overlap ratio by individually calculating both numerator and denominator has cost $O(N^{2})$ . We can reduce this cost by calculating and storing the following vector at the start of the calculation:

[TABLE]

which has a length equal to the number of local spin orbitals. The overlap ratio with a determinant $|\mathbf{m}\rangle$ obtained from $|\mathbf{n}\rangle$ by a single excitation $i\sigma\rightarrow a\sigma$ is given in terms of $v_{\mathbf{n}}$ as

[TABLE]

A similar equation, but still with $O(1)$ cost, can be obtained for double excitations. As the walker moves during a simulation, changing by at most two excitations, the $v_{\mathbf{n}}$ vector is updated in $O(N)$ steps.

A naive implementation of the overlap ratios for the reference would have an $O(N^{3})$ cost. Here also, we can store intermediate quantities and reduce this cost by virtue of the fact that the determinants in the ratio differ by at most a double excitation.

[TABLE]

For GHF Slater determinants, we have

[TABLE]

An excitation of one electron amounts to a change of a row in the determinant. For a determinant $|\mathbf{m}\rangle$ obtained from the excitation $i\sigma\rightarrow a\sigma$ , the ratio can be calculated using the Woodbury lemma111M. Brookes, The Matrix Reference Manual, (2011); see online at http://www.ee.imperial.ac.uk/hp/staff/dmb/matrix/intro.html. :

[TABLE]

Thus, by precalculating and storing the matrix $R[\mathbf{n}]_{a\sigma,i\sigma}$ , the ratios of determinants can be calculated with an $O(1)$ cost. Similar expressions can be derived for double excitations that allow calculations of overlap ratios to be evaluated at $O(1)$ cost. Calculation and update of the $R[\mathbf{n}]_{a\sigma,i\sigma}$ matrix require the inverse of $\theta[\mathbf{n}]$ . We can avoid calculating the inverse at every step, an $O(N^{3})$ scaling operation, by updating it using the Sherman-Morrison formula in $O(N^{2})$ steps. Similar relations can also be derived for RHF and UHF references. As outlined earlier, the AGP overlaps are given by determinants as well which implies that their overlap ratios can be calculated using similar manipulations with an $O(N^{2})$ cost.

Overlap ratio for the GBCS wavefunction is given byBajdich et al. (2006, 2008)

[TABLE]

Again instead of calculating both overlaps separately, the ratio can be calculated at a reduced cost by using an identity analogous to the determinant lemma, given byBecca and Sorella (2017); Morita et al. (2015); Misawa et al. (2019)

[TABLE]

where $C$ is a $2m\times 2m$ skew-symmetric matrix and $B$ is $N\times 2m$ matrix, chosen to affect the update in the top equation for an $m$ electron excitation. Since $m$ is at most 2, the Pfaffians on the right hand side can be calculated explicitly. For a single excitation $i\sigma\rightarrow a\sigma$ by choosing the $B$ and $C$ matrices appropriately, we get

[TABLE]

where $F^{c}[\mathbf{n}]$ is the slice of the pairing matrix obtained by using rows and columns corresponding to unoccupied and occupied spin orbitals in $|\mathbf{n}\rangle$ , respectively. For a double excitation $i\sigma\rightarrow a\sigma$ and $j\gamma\rightarrow b\gamma$ (assuming $i\sigma<j\gamma$ ), we get

[TABLE]

where $F^{r}[\mathbf{n}]$ is the slice of the pairing matrix obtained by using rows and columns corresponding to occupied and unoccupied spin orbitals in $|\mathbf{n}\rangle$ , respectively. We precalculate the $R[\mathbf{n}]$ and $F^{c}[\mathbf{n}]F[\mathbf{n}]^{-1}F^{r}[\mathbf{n}]$ matrices, and update them before each Monte Carlo iteration in $O(N^{2})$ cost. To avoid the expensive direct calculation of the inverse in this expression, we instead use the inverse update identity

[TABLE]

Gradient overlap ratios

Gradient overlap ratios are given by

[TABLE]

where $\mathbf{p}$ denotes the vector of all wavefunction parameters and

[TABLE]

is the wavefunction derivative with respect to the $i$ th parameter. For Jastrow parameters, we have

[TABLE]

where $n_{p}$ and $n_{q}$ are occupation numbers and we have suppressed the spin indices for clarity. Thus, we get

[TABLE]

Since the parameters in the reference ( $\phi_{i}$ ) are complex, we need to consider derivatives with respect to their real and imaginary parts. For the derivative with respect to the real part, we get

[TABLE]

Similarly for the derivative with respect to the imaginary part

[TABLE]

For Slater determinants

[TABLE]

For Pfaffians

[TABLE]

3 Results

Before discussing results we make a few remarks about notation for symmetry restored wavefunctions. All the reported VMC energies refer to wavefunctions that are $N$ , $S_{z}$ and $K$ eigenfunctions. We use the prefix $S_{z}$ and K if these symmetries are broken and restored in the reference state. For example, K $S_{z}$ GHF denotes a complex conjugation and $S_{z}$ projected GHF wavefunction, while $S_{z}$ GHF denotes an $S_{z}$ projected GHF wavefunction which does not break the complex conjugation symmetry. We will denote number symmetry restored GBCS and RBCS wavefunctions by their conventional names Pfaffian and AGP, respectively. Jastrow factors are denoted by adding the prefix J to the wavefunction name.

The initial guess for Slater determinant calculations was computed using the Hartree-Fock modules in PySCF.Sun et al. (2018) For pairing wavefunctions, we used the converged result of the corresponding Slater determinant calculations as the initial guess. We used our selected CI program DiceSharma et al. (2017); Li et al. (2018); Smith et al. (2017) to obtain full configuration interaction (FCI) energies and MOLPROWerner et al. (2012) to get complete active space perturbation theoryWerner (1996); Celani and Werner (2000) (CASPT2) energies for \ceN2. For several systems, we have also performed the fixed node Green’s function Monte Carlo (GFMC) calculations.Van Bemmel et al. (1994); ten Haaf et al. (1995) The GFMC calculations use the VMC wavefunctions as the trial state and deliver variational energies that are strictly between the VMC results and the true ground state energy. (The details of the GFMC algorithm will be reported in a forthcoming publication.

3.1 \ceHydrogen chain

Hydrogen chains are important systems in their own right and embody some of the more interesting physics of real systems. They include long range Coulomb interactions and by changing the interatomic distance the strength of the correlation can be modulated. Exact results for large chains at all bond lengths can be obtained using the Density Matrix Renormalization Group (DMRG) algorithm. They are a challenging benchmark system for our method, particularly so, because our wavefunctions do not make use of the fact that the underlying system is one dimensional.

We first present results on the small open \ceH8 chain using the minimal sto-6g basis to illustrate the quality of different wavefunctions discussed above. This system is small enough (4900 determinants in the $S_{z}=0$ subspace) to allow deterministic sampling (every determinant is visited once) of the wavefunction, so there are no stochastic errors in the results. Fig 1 shows the ground state potential energy curves obtained for symmetric dissociation of the chain, and Table 2 shows the errors in the energies relative to the FCI energy for three different geometries. The RHF and AGP wavefunctions are not size-consistent and do not approach the correct dissociation limit. All the other wavefunctions shown here converge to the exact FCI limit. For this system UHG and GHF energies are identical. Restoring the broken $S_{z}$ symmetry of the GHF wavefunction recovers slightly more than half of the missing correlation energy, while additionally breaking and restoring the complex conjugation symmetry cuts the remaining error by another 50%. Breaking and restoring the number symmetry of the K $S_{z}$ GHF wavefunction leading to K $S_{z}$ Pfaffian results in further significant improvement in accuracy. The J-K $S_{z}$ Pfaffian wavefunction (not shown in the plot) has energy errors less than 0.2 mEh for all the geometries considered. We have not restored the $S^{2}$ symmetry in any of these calculations, and work is currently underway to utilize this and other symmetries, e.g. point group symmetry, within our VMC implementation.

Table 3 shows the errors in ground state energies of an open \ceH50 chain at different interatomic distances. We used the sto-6g minimal basis in this calculation. This much larger system has $\sim 10^{28}$ determinants in the $S_{z}=0$ subspace. For all geometries, we used the screening parameter value of $\epsilon=10^{-6}$ , which was sufficiently small to obtain results that are converged to all shown decimal places.Sabzevari and Sharma (2018) The energy per electron obtained from the J-K $S_{z}$ Pfaffian wavefunction is within 1 mEh of the exact DMRG results. The fixed node GFMC calculations performed using the converged VMC wavefunction as the trial wavefunction recover a significant amount of correlation while also giving a strictly variational result. The GFMC energies per electron obtained using the J-K $S_{z}$ Pfaffian trial wavefunction are within 0.2 mEh of the DMRG energies. From these results, it is clear that Pfaffian wavefunctions seem to offer only a marginal improvement over GHF states in this case. This is in contrast to the results of the non-Jastrow calculations on the \ceH8 chain where Pfaffian results were significantly superior to the GHF results, which indicates that the Jastrow factors are able to make up for the missing correlation between the Pfaffians and GHF in this system. Another important observation about the calculations is that the Pfaffian wavefunctions are significantly more difficult to optimize and many more iterations are needed to obtain (apparent) convergence. It is possible that more sophisticated, albeit expensive, optimization schemes such as the linear methodUmrigar et al. (2007); Toulouse and Umrigar (2007, 2008) may converge to lower energies for these wavefunctions.

3.2 \ceN2

Correctly dissociating the N2 molecule is a significant challenge for several electronic structure methods. Here we perform several SJMF calculations with various broken symmetries. The Jastrow factors are defined over orthogonal local orbitals and to obtain these orbitals we first symmetrically orthogonalized the atomic orbitals using Lowdin’s ( $\mathbf{S}^{-1/2}$ ) procedure. We performed an additional unitary transformation that performs rotations among orbitals on the same nitrogen atom to obtain $sp$ hybrid orbitals. In our testing, these hybrid orbitals were found to give lower energies and faster convergence compared to bare Lowdin orbitals.

Fig. 2 shows the errors in ground state energy in the 6-31g basis. The Jastrow-K $S_{z}$ Pfaffian wavefunction gives better absolute energies than CASPT2, with $p$ valence active space (minimal active space required for qualitatively correct energies), for all geometries considered. It has a lower non-parallelity error as well. Although the Jastrow helps capture a significant amount of correlation in this basis set, our calculations with larger DZ/TZ basis sets have shown it to not be an efficient way of obtaining dynamical correlation. This suggests that a perturbation theory or CI expansions on top of these wavefunctions may be a better way to add dynamical correlation.

3.3 Two dimensional tilted Hubbard model

In this section, we consider the Hubbard model on a two dimensional tilted square lattice, with the Hamiltonian

[TABLE]

where $\langle\ \rangle$ denotes nearest neighbors. We report calculations on the [math] tilted $3\sqrt{2}\times 3\sqrt{2}$ square lattice with periodic boundary conditions for which exact Lanczos diagonalization results are available.Becca et al. (2000) Table 4 shows errors in the ground state energy at half-filling. It is again evident that breaking and restoration of complex conjugation symmetry lower the energy significantly.

In table 5, we show the results for the much bigger half-filled $7\sqrt{2}\times 7\sqrt{2}$ lattice containing 98 sites. Since exact energies for this lattice are not available, we compare our energies to GFMC energies reported in reference 68. These were obtained using a Jastrow-Slater trial wavefunction with backflow correlation included and were shown to converge to a thermodynamic limit with an error of $0.0015t$ relative to the exact AFQMC limit.

Correlation functions can be used to extract useful physical information from a wavefunction. Their accuracy reflects the quality of the wavefunction. Here we calculate the density-density correlation functions given by

[TABLE]

This function can be calculated using Monte Carlo sampling in a manner analogous to energy and gradient calculations:

[TABLE]

where, $N^{ij}_{L}[\mathbf{n}]=\frac{\langle\mathbf{n}|n_{i}n_{j}|\Psi(\mathbf{p})\rangle}{\langle\mathbf{n}|\Psi(\mathbf{p})\rangle}$ is the local correlation function that is averaged during a Monte Carlo run, and $i$ and $j$ are spatial orbital indices. Note that, unlike the energy, local correlation function does not satisfy the zero variance principle and we expect the results to be noisier than the energies. In table 6, the values of the correlation function are shown for the $3\sqrt{2}\times 3\sqrt{2}$ lattice with $U/t=4$ . In this lattice, only five unique $\langle n_{i}n_{j}\rangle$ values exist due to symmetry. For reference we use the values obtained using DMRG, which for this small two-dimensional system, gives correlation function values very close to that of the exact wavefunction. The agreement between the VMC and DMRG wavefunctions is good and is not worse than the error in the total energies.

4 Conclusions

In this paper, we have presented a VMC algorithm for optimizing SJMF wavefunctions. We described a unified hierarchy of wavefunctions that have appeared in different contexts. VMC provides an efficient route to optimizing these wavefunctions and the symmetry projectors used here can be applied in a natural and efficient manner. Our benchmark calculations demonstrate that these wavefunctions are capable of accurately describing strong correlations. In particular, the restoration of complex conjugation symmetry, which has not been used in VMC before to the best of our knowledge, yields significant relaxation of energies and can be potentially used with many other wavefunctions. Because Jastrows are capable of local number and $S_{z}$ projections, the J- $S_{z}$ Pfaffian and J- $S_{z}$ GHF wavefunctions are exactly size-consistent.

Other symmetries, including $S^{2}$ , point group, time reversal, and translational symmetry can also be restored in a VMC approach and work is underway in this direction. This will allow us to obtain more accurate correlation functions. It will be a challenge to make $S^{2}$ projected wavefunctions size-consistent, while also keeping the calculation cost down. Another possible improvement is using more sophisticated optimization methods such as the linear method to avoid the large number of iterations needed to optimize the wavefunctions containing Pfaffians. We are also working towards implementing ways to add dynamical correlation beyond these wavefunctions using the configuration interaction approach and perturbation theory.Sharma et al. (2017); Jeanmairet et al. (2017)

Since the accuracy and performance of projection Monte Carlo techniques depends critically on the quality of the trial wavefunction used, our VMC wavefunctions will be useful in such approaches. It will be interesting to analyze how symmetry breaking and projection affect the nodal structure of the wavefunction. Using the approach developed in Ref. 71, one can use Jastrow correlated wavefunctions in AFQMC as well.

5 Acknowledgements

We thank Cyrus Umrigar for several helpful discussions. The funding for this project was provided by the national science foundation through the grant CHE-1800584.

Bibliography71

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Foulkes et al. (2001) Foulkes, W. M. C.; Mitas, L.; Needs, R. J.; Rajagopal, G. Quantum Monte Carlo simulations of solids. Rev. Mod. Phys. 2001 , 73 , 33–83.
2Nightingale and Umrigar (1999) Nightingale, M. P., Umrigar, C. J., Eds. Quantum Monte Carlo Methods in Physics and Chemistry ; NATO ASI Ser. C 525; Kluwer: Dordrecht, 1999.
3Toulouse et al. (2016) Toulouse, J.; Assaraf, R.; Umrigar, C. J. In Electron Correlation in Molecules – ab initio Beyond Gaussian Quantum Chemistry ; Hoggan, P. E., Ozdogan, T., Eds.; Advances in Quantum Chemistry; Academic Press, 2016; Vol. 73; pp 285–314.
4Kolorenč and Mitas (2011) Kolorenč, J.; Mitas, L. Applications of quantum Monte Carlo methods in condensed systems. Rep. Prog. Phys. 2011 , 74 , 026502.
5Becca and Sorella (2017) Becca, F.; Sorella, S. Quantum Monte Carlo Approaches for Correlated Systems ; Cambridge University Press, 2017.
6Mc Millan (1965) Mc Millan, W. L. Ground State of Liquid He 4 superscript He 4 {\mathrm{He}}^{4} . Phys. Rev. 1965 , 138 , A 442–A 451.
7Ceperley et al. (1977) Ceperley, D.; Chester, G. V.; Kalos, M. H. Monte Carlo simulation of a many-fermion study. Phys. Rev. B 1977 , 16 , 3081–3099.
8Grimm and Storer (1971) Grimm, R.; Storer, R. Monte-Carlo solution of Schrödinger’s equation. J. Comp. Phys. 1971 , 7 , 134 – 156.