Diagrammatic Coupled Cluster Monte Carlo

Charles J. C. Scott; Roberto Di Remigio; T. Daniel Crawford; Alex J.; W. Thom

arXiv:1901.02753·physics.chem-ph·June 19, 2019

Diagrammatic Coupled Cluster Monte Carlo

Charles J. C. Scott, Roberto Di Remigio, T. Daniel Crawford, Alex J., W. Thom

PDF

TL;DR

This paper introduces diagCCMC, a modified coupled cluster Monte Carlo method that efficiently samples connected terms, significantly reducing memory costs and enabling accurate simulations of strongly correlated systems.

Contribution

The paper presents diagCCMC, a novel stochastic algorithm that samples only connected components of the similarity-transformed Hamiltonian, improving memory efficiency in coupled cluster Monte Carlo methods.

Findings

01

Memory cost scales linearly with system size for local, noninteracting systems.

02

Significant memory reduction observed during dissociation of helium chains.

03

Method remains stable and accurate for strongly correlated molecules like stretched N₂.

Abstract

We propose a modified coupled cluster Monte Carlo algorithm that stochastically samples connected terms within the truncated Baker--Campbell--Hausdorff expansion of the similarity transformed Hamiltonian by construction of coupled cluster diagrams on the fly. Our new approach -- diagCCMC -- allows propagation to be performed using only the connected components of the similarity-transformed Hamiltonian, greatly reducing the memory cost associated with the stochastic solution of the coupled cluster equations. We show that for perfectly local, noninteracting systems, diagCCMC is able to represent the coupled cluster wavefunction with a memory cost that scales linearly with system size. The favorable memory cost is observed with the only assumption of fixed stochastic granularity and is valid for arbitrary levels of coupled cluster theory. Significant reduction in memory cost is also shown…

Figures7

Click any figure to enlarge with its caption.

Tables3

Table 1. Table 1 : Correlation energy for different levels of theory and basis sets for \ce N2 with r NN = 3.6 a 0 subscript 𝑟 NN 3.6 subscript 𝑎 0 r_{\mathrm{NN}}=3.6\,a_{0} . Molecular integrals were generated in FCIDUMP format with the Psi4 program package. 107 The STO-3G and 6-31G results were computed using MRCC 103 . The canonical restricted Hartree–Fock orbitals were used, giving E ref = − 106.937 562 E h subscript 𝐸 ref times -106.937562 hartree E_{\mathrm{ref}}=$-106.937\,562\text{\,}\mathrm{\text{ $E_{\textup{ h }}$ }}$ and − 108.360 046 E h times -108.360046 hartree -108.360\,046\text{\,}\mathrm{\text{ $E_{\textup{ h }}$ }} in the STO-3G and 6-31G bases, respectively.

		STO-3G	6-31G
SD	CC	$- 0.589 163$	$- 0.491 480$
SD	diagCCMC	$- 0.799$ ${(2)}^{a}$	$- 0.4921 (7)$
SDT	CC	$- 0.589 923$	$- 0.533 600$
SDT	diagCCMC	$- 0.6092$ ${(8)}^{a}$	$- 0.5341 (9)$
SDTQ	CC	$- 0.523 049$
SDTQ	diagCCMC	$- 0.5244 (9)$	$^{b}$
SDTQ5	CC	$- 0.523 036$
SDTQ5	diagCCMC	$- 0.5249 (6)$	$^{b}$
SDTQ56	CC	$- 0.527 863$
SDTQ56	diagCCMC	$- 0.5271 (8)$	$^{b}$

Table 2. Table 2 : Correlation energy for different levels of theory using 1, 2 and 4 \ce Be replicas in a cc-pVDZ basis set. Note that for these systems CCSDTQ is equivalent to FCI . Molecular integrals were generated in FCIDUMP format with the Q-Chem program package 106 . The canonical Hartree–Fock orbitals for a single-atom calculation were used, and no spin symmetry breaking was observed, giving E ref = − 14.572 341 E h subscript 𝐸 ref times -14.572341 hartree E_{\mathrm{ref}}=$-14.572\,341\text{\,}\mathrm{\text{ $E_{\textup{ h }}$ }}$ .

		$n_{replicas}$
		$1$	$2$	$4$
SD	CC MC	$- 0.045 032 (2)$	$- 0.090 07 (2)$	$- 0.1799 (1)$
SD	diagCCMC	$- 0.045 00 (5)$	$- 0.090 11 (7)$	$- 0.1801 (3)$
SDT	CC MC	$- 0.045 067 (2)$	$- 0.090 12 (2)$	$- 0.180 34 (7)$
SDT	diagCCMC	$- 0.045 12 (4)$	$- 0.0902 (2)$	$- 0.1802 (3)$
SDTQ	CC MC	$- 0.045 070 (2)$	$- 0.090 15 (2)$	$- 0.180 44 (9)$
SDTQ	diagCCMC	$- 0.045 04 (4)$	$- 0.0902 (3)$	$- 0.1807 (3)$
FCI		$- 0.045 072 1 (7)$	$- 0.090 151 (5)$	$- 0.180 36 (6)$

Table 3. Table 3 : Correlation energy for different levels of theory using 1, 2 and 4 \ce Ne replicas in a cc-pVDZ basis set. Molecular integrals were generated in FCIDUMP format with the Psi4 program package 107 and exact CC results obtained using MRCC. 103 The canonical Hartree–Fock orbitals for a single-atom calculation were used and no spin symmetry breaking was observed, giving E ref = − 128.488 776 E h subscript 𝐸 ref times -128.488776 hartree E_{\mathrm{ref}}=$-128.488\,776\text{\,}\mathrm{\text{ $E_{\textup{ h }}$ }}$ .

		$n_{replicas}$
		$1$	$2$	$4$
SD	CC MC	$- 0.190 865 (3)$	$- 0.381 72 (3)$	$- 0.7633 (2)$
SD	diagCCMC	$- 0.190 94 (5)$	$- 0.3817 (1)$	$- 0.7641 (5)$
	CC	$- 0.190 861$	$- 0.381 72$ $3^{b}$	$- 0.763 44$ $6^{b}$
SDT	CC MC	$- 0.191 951 (4)$	$- 0.383 89 (4)$	$- 0.7676 (3)$
SDT	diagCCMC	$- 0.191 85 (10)$	$- 0.3839 (2)$	$- 0.7685 (7)$
	CC	$- 0.191 945$	$- 0.383 89$ $1^{b}$	$- 0.767 78$ $1^{b}$
SDTQ	CC MC	$- 0.192 092 (4)$	$- 0.384 18 (6)$	$^{a}$
SDTQ	diagCCMC	$- 0.1924 (1)$	$- 0.3843 (6)$	$- 0.7668 (7)$
	CC	$- 0.192 095$	$- 0.384 19$ $1^{b}$	$- 0.768 38$ $2^{b}$
SDTQ5	CC MC	$- 0.192 103 (4)$	$- 0.384 36 (9)$	$^{a}$
SDTQ5	diagCCMC	$- 0.1924 (2)$	$- 0.3840 (5)$	$- 0.7686 (5)$
	CC	$- 0.192 106$	$- 0.384 21$ $2^{b}$	$- 0.768 42$ $4^{b}$
SDTQ56	CC MC	$- 0.192 119 (5)$	$^{a}$	$^{a}$
SDTQ56	diagCCMC	$- 0.1919 (1)$	$- 0.3846 (6)$	$- 0.7691 (5)$
	CC	$- 0.192 106$	$- 0.384 21$ $1^{b}$	$- 0.768 42$ $2^{b}$
FCI		$- 0.192 106 (5)$	$^{a}$	$^{a}$

Equations30

∣ CC ⟩ = e^{T} ∣ D_{0} ⟩

∣ CC ⟩ = e^{T} ∣ D_{0} ⟩

T = k \sum T_{k},

T = k \sum T_{k},

T_{k} = i \in k^{th} replacements \sum t_{i} τ_{i} = \frac{1}{( k ! ) ^{2}} a_{1}, a_{2}, \dots, a_{k} i_{1}, i_{2}, \dots, i_{k} \sum t_{a_{1} a_{2} \dots a_{k}}^{i_{1} i_{2} \dots i_{k}} τ_{i_{1} i_{2} \dots i_{k}}^{a_{1} a_{2} \dots a_{k}},

T_{k} = i \in k^{th} replacements \sum t_{i} τ_{i} = \frac{1}{( k ! ) ^{2}} a_{1}, a_{2}, \dots, a_{k} i_{1}, i_{2}, \dots, i_{k} \sum t_{a_{1} a_{2} \dots a_{k}}^{i_{1} i_{2} \dots i_{k}} τ_{i_{1} i_{2} \dots i_{k}}^{a_{1} a_{2} \dots a_{k}},

⟨ D_{0} ⟩ \overset{ˉ}{H}_{N} D_{0}

⟨ D_{0} ⟩ \overset{ˉ}{H}_{N} D_{0}

Ω_{n} (t) = ⟨ D_{n} ⟩ \overset{ˉ}{H}_{N} D_{0}

H_{N}

H_{N}

= pq \sum f_{p}^{q} e_{q}^{p} + \frac{1}{4} pq r s \sum \overset{g}{ˉ}_{pq}^{r s} e_{r s}^{pq} = H - E_{ref},

\overset{ˉ}{H}_{N}

\overset{ˉ}{H}_{N}

+ \frac{1}{3 !} \contraction H_{N} T \contraction H_{N} T T \contraction H_{N} T T T H_{N} T T T + \frac{1}{4 !} \contraction H_{N} T \contraction H_{N} T T \contraction H_{N} T T T \contraction H_{N} T T T T H_{N} T T T T .

∣ Ψ (τ + δ τ) ⟩ = [1 - δ τ (H - S)] ∣ Ψ (τ) ⟩

∣ Ψ (τ + δ τ) ⟩ = [1 - δ τ (H - S)] ∣ Ψ (τ) ⟩

t_{n}^{'} \to t_{n}^{'} - δ τ ⟨ D_{0} ⟩ τ_{n}^{†} [H - S] CCMC,

t_{n}^{'} \to t_{n}^{'} - δ τ ⟨ D_{0} ⟩ τ_{n}^{†} [H - S] CCMC,

t_{n}^{'}

t_{n}^{'}

N_{0}

t_{n}^{'} \to t_{n}^{'} - δ τ N_{0} ⟨ D_{0} ⟩ τ_{n}^{†} [\overset{ˉ}{H} - E_{CC}] D_{0} - δ τ (E_{CC} - S) t_{n},

t_{n}^{'} \to t_{n}^{'} - δ τ N_{0} ⟨ D_{0} ⟩ τ_{n}^{†} [\overset{ˉ}{H} - E_{CC}] D_{0} - δ τ (E_{CC} - S) t_{n},

t_{n} (τ + δ τ) = t_{n} (τ) - δ τ ⟨ D_{0} ⟩ τ_{n}^{†} \overset{ˉ}{H}_{N} (τ) D_{0} .

t_{n} (τ + δ τ) = t_{n} (τ) - δ τ ⟨ D_{0} ⟩ τ_{n}^{†} \overset{ˉ}{H}_{N} (τ) D_{0} .

p_{diagram} = p_{select} p_{hver} p_{cont} p_{int} p_{ext},

p_{diagram} = p_{select} p_{hver} p_{cont} p_{int} p_{ext},

\frac{w _{diagram}}{p _{diagram}} = \frac{( - 1 ) ^{σ} w _{clus} w _{hamil}}{p _{select} p _{hver} p _{cont} p _{int} p _{ext}},

\frac{w _{diagram}}{p _{diagram}} = \frac{( - 1 ) ^{σ} w _{clus} w _{hamil}}{p _{select} p _{hver} p _{cont} p _{int} p _{ext}},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Diagrammatic Coupled Cluster Monte Carlo

Charles J. C. Scott

[email protected]

Department of Chemistry, University of Cambridge, Cambridge, UK

Roberto Di Remigio

[email protected]

Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Tromsø - The Arctic University of Norway, N-9037 Tromsø, Norway

T. Daniel Crawford

Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States

Alex J. W. Thom

Department of Chemistry, University of Cambridge, Cambridge, UK

Abstract

We propose a modified coupled cluster Monte Carlo algorithm that stochastically samples connected terms within the truncated Baker–Campbell–Hausdorff expansion of the similarity transformed Hamiltonian by construction of coupled cluster diagrams on the fly. Our new approach – diagCCMC – allows propagation to be performed using only the connected components of the similarity-transformed Hamiltonian, greatly reducing the memory cost associated with the stochastic solution of the coupled cluster equations. We show that for perfectly local, noninteracting systems, diagCCMC is able to represent the coupled cluster wavefunction with a memory cost that scales linearly with system size. The favorable memory cost is observed with the only assumption of fixed stochastic granularity and is valid for arbitrary levels of coupled cluster theory. Significant reduction in memory cost is also shown to smoothly appear with dissociation of a finite chain of helium atoms. This approach is also shown not to break down in the presence of strong correlation through the example of a stretched nitrogen molecule. Our novel methodology moves the theoretical basis of coupled cluster Monte Carlo closer to deterministic approaches.

LGPL GNU Lesser General Public Licence WF wave function WFT wave function theory LHS left-hand side CC coupled cluster CCS coupled cluster with single substitutions CCSD coupled cluster with single and double substitutions CCSD(T) CCSD with perturbative triples correction CCSDT coupled cluster with single, double and triple substitutions CC2 approximate coupled cluster singles and doubles CC3 approximate coupled cluster singles, doubles and triples PT perturbation theory MBPT many-body perturbation theory MO molecular orbital AO atomic orbital MP Møller–Plesset HF Hartree–Fock QM quantum mechanics QC quantum chemistry RHS right-hand side SCF self-consistent field BCH Baker–Campbell–Hausdorff MC Monte Carlo FCI full configuration interaction DMC diffusion Monte Carlo QMC quantum Monte Carlo VMC variational Monte Carlo RI resolution-of-the-identity FCIQMC full configuration interaction quantum Monte Carlo SCCT stochastic coupled cluster theory RDM reduced density matrix MSQMC Model Space Quantum Monte Carlo CCMC Coupled Cluster Monte Carlo diagCCMC diagrammatic Coupled Cluster Monte Carlo RHF Restricted Hartree–Fock UHF Unrestricted Hartree–Fock

\alsoaffiliation

Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States

{tocentry}

Over the last half-century the coupled cluster (CC) wavefunction Ansatz has proved remarkably effective at representing the solution of the Schrödinger equation in a polynomial scaling number of parameters while providing size-extensive and -consistent results. Despite reducing the full configuration interaction (FCI) $N!$ factorial scaling to polynomial, the computational cost of CC methods, measured in terms of both required CPU floating-point operations and memory, is still an issue. The coupled cluster with single and double substitutions (CCSD) and CCSD with perturbative triples correction (CCSD(T)) approximations provide a balance between computational cost and accuracy that has led to relatively wide adoption, but are eventually precluded for many large systems.

Recent work has made great progress on this issue through application of various approximations, which enable calculations to be performed with reduced memory and computational costs. In particular, various approximations exploiting the locality of electron correlation allow calculations with costs asymptotically proportional to measures of system size. These include approaches based on orbital localisation1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, molecular fragmentation40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, and decompositions, such as resolution-of-the-identity, Cholesky or singular-value, of the two-electron integrals tensors53, 19, 20, 54, 55, 56, 57, 58. However, while providing large efficiencies in CCSD calculations, higher truncation levels will generally exceed available memory resources before such approximations are a reasonable proposition.

In this letter we propose and demonstrate a coupled cluster-based projector Monte Carlo (MC) algorithm that enables automatic exploitation of the wavefunction sparsity for arbitrary excitation orders. Our methodology can be particularly beneficial for localised representations of the wavefunction, but it is not limited by assumptions of locality. The approach can fully leverage the sparsity inherent in the CC amplitudes at higher excitation levels,59 allowing dramatic reductions in memory costs for higher levels of theory.

The CC wavefunction is expressed as an exponential transformation of a reference single-determinant wavefunction $\ket{D_{\textbf{0}}}$ :

[TABLE]

where the cluster operator $T$ is given as a sum of second-quantised excitation operators:

[TABLE]

with the $k$ -th order cluster operators expressed as sums of excitation operators weighted by the corresponding cluster amplitudes:

[TABLE]

in the tensor notation for second quantisation proposed by Kutzelnigg and Mukherjee 60. Upon truncation of the cluster operator to a certain excitation level $l$ and projection of the Schrödinger equation onto the corresponding excitation manifold one obtains the linked energy and cluster amplitudes equations:

[TABLE]

We have introduced the similarity-transformed Hamiltonian, $\bar{H}_{\mathrm{N}}=\mathrm{e}^{-T}H_{\mathrm{N}}\mathrm{e}^{T}$ , and $\ket{D_{\textbf{n}}}$ can be any state within the projection manifold (up to an $l$ -fold excitation of $\ket{D_{\textbf{0}}}$ ). These CC equations are manifestly size-extensive order-by-order and term-by-term and furthermore provide the basis for the formulation of response theory.61

CC methods have to be carefully derived order-by-order and their implementation subsequently carried out, a process that can be rather time-consuming and error-prone.62, 63, 64 It has long been recognised that the use of normal-ordering, 65, 66 Wick’s theorem,67 and the ensuing diagrammatic techniques68 can be leveraged to automate both steps69, 65, 70, 71, 72, 73, 74, 75, though spin-adaptation can still pose significant challenges.76, 77, 78 Consider the normal-ordered, electronic Hamiltonian:

[TABLE]

its similarity transformation admits a Baker–Campbell–Hausdorff (BCH) expansion truncating exactly after the four-fold nested commutator.79, 66 Since all excitation operators are normal-ordered and commuting, the commutator expansion lets us reduce the Hamiltonian-excitation operator products to only those terms which are connected.65, 66 Excitation operators will only appear to the right of the Hamiltonian and only terms where each excitation operator shares at least one index with the Hamiltonian will lead to nonzero terms in the residuals $\Omega_{\textbf{n}}(\bm{t})$ appearing in eqs. (4):

[TABLE]

Moreover, by virtue of Wick’s theorem67, 60, the products of normal-ordered strings appearing in the connected expansion will still be expressed as normal-ordered strings, further simplifying the algebra. The requirement of shared indices between the Hamiltonian and cluster coefficients enables the resulting equations to be solved via a series of tensor contractions between multi-index quantities: the sought-after cluster amplitudes and the molecular one- and two-electron integrals. The iterative process required to solve eqs. (4) is highly amenable for a rapid evaluation on conventional computing architectures,80, 81 but remains non-trivial to parallelise,82 especially for higher truncation orders in the CC hierarchy.77 A proper factorisation of intermediates is essential to achieve acceptable time to solution and memory requirements.

In recent years some of us have been involved in developing a projector MC algorithm to obtain the CC solutions within a stochastic error bar.83, 84, 85, 86 The starting point, as with any projector MC method, is the imaginary-time Schrödinger equation87, 88, 89 obtained after a Wick rotation $\tau\leftarrow\mathrm{i}t$ . Repeated application of the approximate linear propagator to a trial wavefunction will yield the ground-state solution:

[TABLE]

where $S$ is a free parameter that is varied to keep the normalisation of $\Psi(\tau)$ approximately constant. In the CC MC and full configuration interaction quantum Monte Carlo (FCIQMC) approaches, a population of particles in Fock space represents the wavefunction and evolves according to simple rules of spawning, death, and annihilation.87, 83 For a CC Ansatz, unit particles may represent nonunit contributions to CC amplitudes by letting the intermediate normalisation condition vary with the population on the reference determinant: $\braket{D_{\textbf{0}}}{\mathrm{CCMC}(\tau)}=N_{0}(\tau)$ . A factor of $\frac{1}{N_{0}(\tau)}$ is removed from the definition of $T(\tau)$ and this determines the granularity of amplitude representation: amplitude values smaller than $\frac{1}{N_{0}(\tau)}$ are stochastically rounded during the calculation, vide infra. To avoid confusion, we denote the so-modified cluster operators and amplitudes as $T^{\prime}$ and $t_{\textbf{n}}^{\prime}$ , respectively, so $\ket{\mathrm{CCMC}}=N_{0}\mathrm{e}^{\frac{T^{\prime}}{N_{0}}}\ket{D_{\textbf{0}}}$ . Thus, in the unlinked formulation first put forward by Thom,83 the dynamic equation for the amplitudes becomes:

[TABLE]

where we have dropped the $\tau$ -dependence for clarity. CC MC is fully general with respect to the truncation level in the cluster operator and sidesteps the need to store a full representation of the wavefunction at any point. CC MC should allow for the effective solution of the CC equations with a much reduced memory cost, as previously realised in the FCIQMC method.87, 90, 91, 92 However, while various cases demonstrate memory cost reduction, especially in the presence of weak correlation,93 the corresponding increase in computational cost was large even by the standards of projector MC methods and modifications used in related approaches, such as the initiator approximation,90 proved comparatively ineffective.84

Combination with the linked CC formulation seems to be one possible remedy for these issues and is furthermore the basis for decades of theoretical and implementation work in the deterministic community. Franklin et al. 85 have discussed a CC MC algorithm to sample eqs. (4) using the update step:

[TABLE]

The authors however noted that the use of the similarity-transformed Hamiltonian required an ad hoc modification:

[TABLE]

to deal with convergence issues with the projected energy prior to the initialisation of population control. In addition, due to evaluation of $\bar{H}$ via the commutator expansion of the bare Hamiltonian, rather than the sum of connected Hamiltonian-excitation operator products (6), some disconnected terms were included. These extraneous terms in the algorithm of Franklin et al. 85 have been observed to correctly cancel out on average, but render unnecessarily complex the sampling of connected contributions only. Eventually, it is difficult to develop stochastic counterparts to approximations, such as the CCn hierarchy,94, 95 proposed within deterministic CC theory.

We here reconsider the implementation of the linked CC MC algorithm in the light of the diagrammatic techniques used in deterministic CC, an approach we name diagrammatic Coupled Cluster Monte Carlo (diagCCMC). The update equation can be easily derived as a finite difference approximation to the exact imaginary-time dynamics of the coupled cluster wavefunction under the assumption of constant intermediate normalisation:

[TABLE]

This has been noted elsewhere,96 and we will discuss its implications in greater detail in a subsequent communication97, but for now it will suffice to observe that since this is a projector MC approach it will eventually converge to the lowest energy solution of the CC equations. The existence of multiple solutions to the nonlinear CC equations is well-documented,98, 99, 100 and a projector MC approach could result in a different solution to the CC equations than the one found via a deterministic procedure, where iteration stabilises upon whichever solution is approached first from a given starting point. In practice a difference is only observed if a highly truncated form of CC has been applied inappropriately to a system, and even then only in the worst cases.

The second term on the right-hand side is the contribution to the CC vector function $\Omega_{\textbf{n}}(\mathbf{t})$ resulting from the projection upon the determinant $\ket{D_{\textbf{n}}}$ and is representable as a finite sum of enumerable diagrams. Thus, at each iteration, we wish to randomly select $n_{\mathrm{a}}$ diagrams from $\braket{D_{\textbf{0}}}{\tau_{\textbf{n}}^{\dagger}\bar{H}_{\mathrm{N}}}{D_{\textbf{0}}}$ . Each of these will be in the form of an excitation operator, $\tau_{\textbf{{i}}}$ , and corresponding weight, $w_{\textbf{i}}$ , selected with some known, normalised probability, $p_{\mathrm{diagram}}$ , such that we expect to select any given contributing diagram $p_{\mathrm{diagram}}\times n_{a}$ times at each iteration. As by construction $\braket{D_{\textbf{0}}}{\tau_{\textbf{j}}^{\dagger}\tau_{\textbf{i}}}{D_{\textbf{0}}}=\delta_{\textbf{ij}}$ , a selected term can be found to contribute to the update of a single coefficient with no additional sign considerations. Rather than explicitly introduce a particulate representation of the coefficients, as in FCIQMC and previous CC MC approaches, we stochastically round all coefficients $t_{\textbf{n}}$ with magnitude below some strictly positive granularity parameter $\Delta$ . If $|t_{\textbf{n}}|<\Delta$ , then $|t_{\textbf{n}}|$ is rediscretised to either $\Delta$ (with probability $\left|\frac{t_{\textbf{n}}}{\Delta}\right|$ ) or 0 (with probability $1-\left|\frac{t_{\textbf{n}}}{\Delta}\right|$ ).101, 97 This can be shown to be equivalent to a representation with unit particles and constant intermediate normalisation $\frac{1}{\Delta}$ .

We perform diagram selection by reading off terms from right-to-left in $\braket{D_{\textbf{0}}}{\tau_{\textbf{n}}^{\dagger}\bar{H}_{\mathrm{N}}}{D_{\textbf{0}}}$ :

Select a random cluster of excitation operators with probability $p_{\mathrm{select}}$ utilising the even selection scheme86 restricted to clusters of at most 4 excitation operators. This corresponds to simultaneously selecting a term in the BCH expansion (6) and the excitation level of each excitation operator in the commutator. 2. 2.

Select one of the 13 possible $H_{N}$ vertices65, 66 with some probability $p_{\mathrm{hvertex}}$ . 3. 3.

Select the contraction pattern of the chosen cluster and Hamiltonian vertex. This identifies a specific Kucharski–Bartlett sign sequence68, 65, 66 for the diagram we are considering and which excitation operators are associated with which term within the sign sequence with probability $p_{\mathrm{contract}}$ . 4. 4.

Select which indices of each excitation operator will be contracted with the Hamiltonian vertex. Having selected the contraction pattern this is a matter of simple combinatorics, with a given set of indices selected with probability $p_{\mathrm{internal}}$ . 5. 5.

Select the external indices of the Hamiltonian vertex with probability $p_{\mathrm{external}}$ . 6. 6.

Evaluate the index of resulting projection determinant in the update step, i. e. $\bra{D_{\textbf{0}}}\tau_{\textbf{n}}^{\dagger}$ , and the diagrammatic amplitude including all parity factors.

This obtains a single specific diagram with probability:

[TABLE]

where the obvious abbreviations have been used to refer to each of the previously stated probabilities. These are conditional probabilities, as the various events leading to the computed $p_{\mathrm{diagram}}$ are not independent. This procedure to select diagrams can be visualised as graphically building the diagram bottom-up, see Figure 1.

To evaluate the contribution of a selected diagram to our propagation, we slightly modify the standard rules of diagrammatic interpretation. Instead of summing over all indices, and thus having to correct for any potential double counting, our algorithm selects a specific diagram along with a specific set of indices for all lines.

To ensure proper normalisation of our sampling probability, we require there be only a single way to select diagrams related by:

•

The antipermutation of antisymmetrised Goldstone vertex indices.

•

The antipermutation of cluster operator particle or hole indices.

•

The commutation of cluster operators.

All these modifications can be viewed as replacing sums $\frac{1}{2}\sum\limits_{ij}$ with $\sum\limits_{i>j}+\frac{1}{2}\delta_{ij}$ . In the first two cases summation runs over equivalent indices and the $i=j$ term must be zero, while in the third case summation runs over excitation operators and the $i=j$ term corresponds to a diagram with additional symmetry that as such must be treated more carefully to ensure unique selection of a Kucharski–Bartlett sign sequence68, 65, 66. Specifically, we do not require an additional factor of $\frac{1}{2}$ for:

•

Each pair of equivalent internal or external lines.

•

Two cluster operators of the same rank but with different specific indices, provided they have a well-determined ordering on selection.

Additionally, to include the effect of permutation operators $\hat{P}$ for inequivalent external lines we must permute the hole and particle indices of a resulting excitation operator to a unique antisymmetrised ordering for storage. This ensures proper cancellation between all equivalent orderings, which could otherwise differ due to the stochastic sampling. Eventually, the amplitude of the contribution of the selected diagram, $w_{\mathrm{diagram}}$ , is given as the product of the cluster amplitude, $w_{\mathrm{clus}}=\prod_{\textbf{i}}t_{\textbf{i}}$ , and Hamiltonian element, $w_{\mathrm{hamil}}$ , with appropriately determined parity $(-1)^{\sigma}$ . The overall contribution of a single selected diagram to the coefficient $t_{\textbf{n}}$ determined by the open lines of the diagram will be:

[TABLE]

wherever possible we aspire to have $p_{\mathrm{diagram}}\propto|w_{\mathrm{diagram}}|$ .86

We will now demonstrate the ability of diagCCMC to recover energies at high levels of CC theory on the nitrogen molecule in a stretched geometry ( $r_{\mathrm{NN}}=3.6\,a_{0}$ ). It has previously been shown that connected contributions up to hextuples are vital to obtaining high accuracy for this system.102 Correlation energies for a range of basis sets and truncation levels are reported in Table 1, showing agreement within error bars with deterministic results103 and the existing literature in all but the most extreme cases, where convergence to a different solution is observed as noted previously.

We then turn our attention to test systems of beryllium and neon atoms at a variety of truncation levels. Extending these systems by introducing noninteracting replicas illustrates the behaviour of our approach in the presence of locality in comparison to previous Fock-space stochastic methods, namely the original unlinked CC MC (hereafter simply referred to as CC MC) and FCIQMC.

To allow reasonable comparison between diagCCMC, CC MC, and FCIQMC all calculations were performed with:

•

Granularity parameter $\Delta$ equal to $10^{-4}$ . This is the threshold for the stochastic rounding of the cluster amplitudes.

•

$\delta\tau$ and $n_{\text{attempts}}$ such that, on each iteration, a spawning event may have maximum size of $3\times 10^{-4}$ .

For Coupled Cluster Monte Carlo (CCMC) and FCIQMC this corresponds to a stable calculation with reference population of $N_{0}=10^{4}$ and a timestep such that no spawning event produces more than three particles. CC MC and FCIQMC calculations were performed with the HANDE-QMC code104, 105 using the default, uniform excitation generators. For CC MC, we adopted the even selection scheme of Scott and Thom 86. The molecular integrals were generated in FCIDUMP format using the Q-Chem106 and Psi4107 quantum chemistry program packages, see the Supporting Information for more details.108

We report the correlation energies obtained for an isolated \ceBe atom and the noninteracting replicas systems in Table 2. We compare CC results up to and including quadruple excitations with FCIQMC. For these systems CCSDTQ is equivalent to FCI, thus providing a good sanity check for the diagCCMC approach. In addition, results at each level of theory are expected to agree within statistical errors due to the size-consistency of all considered approaches, as is observed.

In order to assess the computational performance of diagCCMC we compare two measures of efficiency:

•

$n_{\mathrm{attempts}}/\delta\tau$ , that is, the number of stochastic samples performed per unit imaginary time. This metric is a measure of the minimum CPU cost, provided that the length of propagation in imaginary time is roughly constant between approaches, or equivalently a roughly constant inefficiency between the approaches.109

•

$n_{\mathrm{states}}$ , that is, the number of occupied excitation operators. This metric is a measure of the minimum memory cost. For a deterministic calculation this would amount to the Hilbert space size for the selected truncation level.

The promise of stochastic methods is to greatly reduce the cost of high-level correlated calculations by naturally exploiting the wavefunction sparsity. Figure 2 reports the ratio of $n_{\mathrm{states}}$ per replica and the size of the Hilbert space for an isolated atom at the given truncation level. For an isolated \ceBe atom, the reduction in memory footprint is clearly evident: all methods compared require significantly less than the full size of the Hilbert space (ratio $<1$ ) to successfully achieve convergence and recover the deterministic results. Unsurprisingly and correctly, diagCCMC requires the same amount of storage as its unlinked counterparts. Notice also that the ratio decreases in going from CCSD to CCSDTQ showing how stochastic methods single out the important portions of the Hilbert space. For perfectly local systems, such as the noninteracting 2- and 4-atom replicas, one also expects the number of states per replica to roughly stay constant. This expectation stems from the linked diagram theorem66 and is met by the diagCCMC approach where at each iteration only connected diagrams are sampled. The same is, quite emphatically, not true for either FCIQMC or CC MC: the number of states per replica approaches and surpasses the size of the single-atom Hilbert space.

In Figure 3 we can see that diagCCMC outperforms each of the corresponding CCMC approaches also when estimating the CPU cost of the calculations on the \ceBe systems here considered. It is particularly striking to note the order of magnitude difference between the diagrammatic and unlinked approaches at the CCSD level of theory even for this tiny system.

The same observation also holds true for higher orders of CC theory, as can clearly be seen from Figure 4 where we plot the $n_{\mathrm{states}}$ metric for an isolated \ceNe atom and its corresponding 2 and 4 noninteracting replicas system. Table 3 reports the correlation energies per replica for a systems of noninteracting \ceNe atoms. diagCCMC affords calculations practically at constant memory cost per replica in contrast with CC MC for which the increasing cost exceeded available computational resources for the higher order excitations.

Finally, we studied the dissociation of a chain of 5 helium atoms as an example of interacting system. The diagrammatic algorithm shows favourable CPU and memory cost for noninteracting systems, further suggesting that it might also straightforwardly leverage localisation in the orbital space to achieve reduced cost for calculations on interacting systems. As a preliminary test for this conjecture, Figure 5 shows the memory cost for the dissociation curve of an interacting chain of five helium atoms. We localised the occupied and virtual orbital sets with the Foster–Boys110 and the Pipek–Mezey111 criteria, respectively. We compare the $n_{\mathrm{states}}$ metric with the memory cost at the dissociation limit for a deterministic and a diagCCMC CCSD calculation. The former (dotted line) is the maximum memory cost for performing CCSD calculations on the isolated atoms: below it, the cost is comparable to that for a wavefunction with excitations localised to each \ceHe atom. The onset of such behaviour is evident from Figure 5, which also shows the recovery of the noninteracting limit at large separations.

In conclusion, we have described a stochastic realisation of linked CC theory that fully exploits the connectedness of the similarity-transformed Hamiltonian, as exemplified in the diagrammatic expansion of the CC equations. Our stochastic diagrammatic implementation avoids the computational and memory cost issues associated with deterministic and unlinked stochastic approaches, by generating diagrams on-the-fly and accumulating the corresponding amplitudes. Finally, we have shown how the stochastic and deterministic implementations can be rationalised within the same framework. This bridges the existing gap between the two strategies: by clearing possible misunderstandings on how and why stochastic methods work and enabling future cross-fertilisation.

{acknowledgement}

C.J.C.S. is grateful to the Sims Fund for a studentship and A.J.W.T. to the Royal Society for a University Research Fellowship under Grant Nos. UF110161 and UF160398. Both are grateful for support under ARCHER Leadership Project grant e507. R.D.R. acknowledges partial support by the Research Council of Norway through its Centres of Excellence scheme, project number 262695 and through its Mobility Grant scheme, project number 261873. R.D.R. is also grateful to the Norwegian Supercomputer Program through a grant for computer time (Grant No. NN4654K). T.D.C. was supported by grants CHE-1465149 and ACI-1450169 from the U.S. National Science Foundation.

{suppinfo}

Additional data related to this publication, including a copy of the diagCCMC code, raw and analysed data files and analysis scripts, is available at the University of Cambridge data repository https://doi.org/10.17863/CAM.34952 and https://doi.org/10.17863/CAM.36097.

We used the goldstone LaTeX package, available on GitHub https://github.com/avcopan/styfiles, to draw the coupled cluster diagrams. We used matplotlib for all the plots in the paper.112

Bibliography112

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Pulay 1983 Pulay, P. Localizability of dynamic electron correlation. Chem. Phys. Lett. 1983 , 100 , 151–154
2Stoll 1992 Stoll, H. On the correlation energy of graphite. J. Chem. Phys. 1992 , 97 , 8449–8454
3Saebo and Pulay 1993 Saebo, S.; Pulay, P. Local Treatment of Electron Correlation. Annu. Rev. Phys. Chem. 1993 , 44 , 213–236
4Hampel and Werner 1996 Hampel, C.; Werner, H.-J. Local treatment of electron correlation in coupled cluster theory. J. Chem. Phys. 1996 , 104 , 6286–6297
5Schütz et al. 1999 Schütz, M.; Hetzer, G.; Werner, H.-J. Low-order scaling local electron correlation methods. I. Linear scaling local MP 2. J. Chem. Phys. 1999 , 111 , 5691–5705
6Schütz and Werner 2000 Schütz, M.; Werner, H.-J. Local perturbative triples correction (T) with linear cost scaling. Chem. Phys. Lett. 2000 , 318 , 370–378
7Schütz 2000 Schütz, M. Low-order scaling local electron correlation methods. III. Linear scaling local perturbative triples correction (T). J. Chem. Phys. 2000 , 113 , 9986–10001
8Schütz and Werner 2001 Schütz, M.; Werner, H.-J. Low-order scaling local electron correlation methods. IV. Linear scaling local coupled-cluster (LCCSD). J. Chem. Phys. 2001 , 114 , 661–681