Multireference Stochastic Coupled Cluster

Maria-Andreea Filip; Charles J. C. Scott; Alex J. W. Thom

arXiv:1812.01898·physics.chem-ph·October 22, 2020

TL;DR

This paper introduces a stochastic multireference coupled cluster method that effectively handles strong electron correlation by incorporating multiple reference determinants, enhancing the flexibility and insight of electronic structure calculations.

Contribution

It presents a novel modification of stochastic coupled cluster that includes multiple references, improving the description of strongly correlated systems.

Findings

01 Successfully describes strongly correlated molecules with few references

02 Maintains the advantages of single reference stochastic coupled cluster

03 Provides flexible control over references and excitations

Abstract

We describe a modification of the stochastic coupled cluster algorithm that allows the use of multiple reference determinants. By considering the secondary references as excitations of the primary reference and using them to change the acceptance criteria for selection and spawning, we obtain a simple form of stochastic multireference coupled cluster which preserves the appealing aspects of the single reference approach. The method is able to successfully describe strongly correlated molecular systems using few references and low cluster truncation levels, showing promise as a tool to tackle strong correlation in more general systems. Moreover, it allows simple and comprehensive control of the included references and excitors thereof, and this flexibility can be taken advantage of to gain insight into some of the inner workings of established electronic structure methods.

Tables

Table 1: Values of the calculated correlation energy for $\text{H}_{4}$ in a minimal basis.

Method	Energy / $E_{h}$
FCI	-0.117621
CCMCSD	-0.12044(3)
CCMCSDT	-0.12059(7)
CCMCSDTQ	-0.11761(7)
2r-CCMCSD	-0.11763(4)
MRCCSD³⁴	-0.117580
MRCCSD-1⁴¹	-0.117686
MRCCSD-2,3⁴¹	-0.117575
MRACPQ-1⁴²	-0.117102

Table 2: Leading triply excited cluster coefficients^a in the $\text{H}_{8}$ wavefunction for $\alpha/a_{0}=1.0,0.1,0.0001$. a All wavefunctions are normalised such that $\braket{D_{0}}{\Psi}=1$. b As given in 63. c Values taken from instantaneous snapshots of the stochastic wavefunction.

$\alpha/a_{0}$	$\hat{T}_{3}$ term	CCSDTQ^b	SS-CCSDTQ^b	2r-CCMCSD^c	4r-CCMCSD^c	2r-CCMCSDT^c
1.0	$t_{875}^{653}$	$0.0027990300$	$0.0$	$0.0$	$0.0$	$- 0.006938820228$
	$t_{873}^{543}$	$- 0.0027927466$	$0.0$	$0.0$	$0.0$	$- 0.005822500732$
	$t_{871}^{853}$	$0.0026117601$	$0.0$	$0.0$	$0.0$	$- 0.00622210191$
	$t_{875}^{741}$	$- 0.002605554$	$- 0.0026078696$	$0.004204674226$	$0.006681790253$	$0.005893986503$
	$t_{764}^{872}$	$0.0025749909$	$0.0026218416$	$0.0$	$0.0$	$- 0.005829632611$
0.1	$t_{875}^{321}$	$- 0.0353278703$	$- 0.0304035827$	$- 0.02906899421$	$- 0.02533820583$	$- 0.03510182677$
	$t_{873}^{521}$	$0.0341886609$	$0.0294198501$	$- 0.03320221354$	$- 0.2576256265$	$- 0.03417247948$
	$t_{763}^{721}$	$0.0112172942$	$0.0091712186$	$0.009037433273$	$0.01719834927$	$0.01032064989$
	$t_{754}^{721}$	$- 0.0109948657$	$- 0.0089897429$	$- 0.009720349583$	$- 0.009384567264$	$0.01176631082$
	$t_{732}^{521}$	$- 0.0108251409$	$- 0.0087530913$	$0.007848827095$	$0.01093788654$	$0.005768813771$
	$t_{764}^{872}$	–	–	$0.0$	$0.0$	$0.007187521769$

Equations

$$\Psi_{\mathrm{CC}}=e^{\hat{T}}\ket{D_{0}}, \tag{1}$$

$$\hat{T}=\sum_{\mathbf{i}}t_{\mathbf{i}}\hat{a}_{\mathbf{i}} \tag{2}$$

$$\hat{T}=\sum_{i}\hat{T}_{i} \tag{3}$$

$$\langle D_{\mathbf{i}}|\hat{H}-E|\Psi_{\mathrm{CC}}\rangle=0, \tag{4}$$

$$\langle D_{\mathbf{i}}|1-\delta\tau(\hat{H}-E)|\Psi_{\mathrm{CC}}\rangle=\braket{D_{\mathbf{i}}}{\Psi_{\mathrm{CC}}} \tag{5}$$

$$\braket{D_{\mathbf{i}}}{\Psi_{\mathrm{CC}}}=\pm\langle D_{0}|\hat{a}_{\mathbf{i}}^{\dagger}|\Psi_{\mathrm{CC}}\rangle=t_{\mathbf{i}}+\mathcal{O}(\hat{T}^{2}), \tag{6}$$

$$t_{\mathbf{i}}(\tau+\delta\tau)=t_{\mathbf{i}}(\tau)-\delta\tau\langle D_{\mathbf{i}}|\hat{H}-E|\Psi_{\mathrm{CC}}\rangle \tag{7}$$

$$E_{\mathrm{proj}}=\frac{\langle D_{0}|\hat{H}|\Psi_{\mathrm{CCMC}}\rangle}{\braket{D_{0}}{\Psi_{\mathrm{CCMC}}}} \tag{8}$$

$$S(\tau)=S(\tau-A\delta\tau)-\frac{\zeta}{A\delta\tau}\ln\frac{N_{\mathrm{tot}}(\tau)}{N_{\mathrm{tot}}(\tau-A\delta\tau)} \tag{9}$$

$$\Psi_{\nu}=\sum_{\mu}c_{\nu\mu}e^{\hat{T}^{\mu}}\Phi_{\mu}, \tag{10}$$

$$e^{\hat{T}^{(\mathbf{i})}}\ket{D_{\mathbf{i}}}=e^{\hat{T}^{(\mathbf{i})}}\hat{a}_{\mathbf{i}}\ket{D_{0}}. \tag{11}$$

$$\hat{T}^{\prime}=\hat{T}_{1}+\hat{T}_{2}+\sum_{I^{\prime},j,k,a,b,C^{\prime}}\hat{T}_{3}\begin{pmatrix}abC^{\prime}\\ I^{\prime}jk\end{pmatrix}+\sum_{k,l,c,d,A,B}\hat{T}_{4}\begin{pmatrix}cdAB\\ IJkl\end{pmatrix} \tag{12}$$

$$\hat{T}^{\prime}=\hat{T}^{\mathrm{int}}+\hat{T}^{\mathrm{ext}} \tag{13}$$

$$\hat{T}^{\mathrm{int}}=\sum_{I^{\prime},A^{\prime}}\hat{T}_{1}\begin{pmatrix}A^{\prime}\\ I^{\prime}\end{pmatrix}+\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix} \tag{14}$$

$$\hat{T}^{\mathrm{ext}}=\sum_{i,j,\dots,a,b,\dots}\hat{T}_{1}\begin{pmatrix}a\\ i\end{pmatrix}+\hat{T}_{2}\begin{pmatrix}ab\\ ij\end{pmatrix}+\dots \tag{15}$$

$$\Psi=e^{\hat{T}^{\mathrm{ext}}}e^{\hat{T}^{\mathrm{int}}}\ket{D_{0}} \tag{16}$$

$$\hat{T}^{\prime}=\hat{T}_{1}+\dots+\hat{T}_{\lambda}+\hat{T}_{\lambda+1}\begin{pmatrix}a_{1}\dots a_{\lambda}A_{1}\\ I_{1}i_{1}\dots i_{\lambda}\end{pmatrix}+\dots+\hat{T}_{\lambda+k}\begin{pmatrix}a_{1}\dots a_{\lambda}A_{1}\dots A_{k}\\ I_{1}\dots I_{k}i_{1}\dots i_{\lambda}\end{pmatrix}, \tag{17}$$

$$\begin{split}\hat{T}^{\prime}&=\hat{T}_{1}+\hat{T}_{2}+\sum_{A^{\prime},I^{\prime},a,b,j,k,l}\Big[\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abA^{\prime}\\ IJk\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}aAB\\ I^{\prime}jk\end{pmatrix}+\hat{T}_{4}\begin{pmatrix}abAB\\ IJkl\end{pmatrix}\Big]\end{split} \tag{18}$$

$$\hat{T}^{\mathrm{int}}=\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix} \tag{19}$$

$$\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix}\propto\hat{a}_{1}, \tag{20}$$

$$\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\propto\hat{T}_{1}^{(1)}\begin{pmatrix}a\\ k\end{pmatrix}\hat{a}_{1} \tag{21}$$

$$\hat{T}_{3}\begin{pmatrix}abA\\ IJk\end{pmatrix}\propto\hat{T}_{2}^{(1)}\begin{pmatrix}ab\\ Bk\end{pmatrix}\hat{a}_{1} \tag{22}$$

$$\hat{T}^{\prime}=\hat{T}_{1}+\hat{T}_{2}+\left(\hat{T}_{1}^{(1)}+\hat{T}_{2}^{(1)}\right)\hat{a}_{1} \tag{23}$$

$$\hat{T}^{\prime}=\sum_{i=1}^{m_{0}}\hat{T}_{i}+\sum_{n=1}^{N}\sum_{j=0}^{m_{n}}\hat{T}_{j}^{(n)}\hat{a}_{n} \tag{24}$$

$$\begin{split}\hat{T}^{\prime}&=\hat{T}_{1}+\hat{T}_{2}+\sum_{A^{\prime},I^{\prime},a,b,j,k,l}\Big[\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abB\\ Jkl\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}abA\\ Ijk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abA^{\prime}\\ IJk\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}aAB\\ I^{\prime}jk\end{pmatrix}+\hat{T}_{4}\begin{pmatrix}abAB\\ IJkl\end{pmatrix}\Big]\end{split} \tag{25}$$

Full text

Maria-Andreea Filip

Charles J C Scott

Alex J W Thom

Department of Chemistry, University of Cambridge, Cambridge, UK

1 Introduction

The study of strong correlation in electron systems has been an important theme in electronic structure theory in recent years, as it is present in a range of interesting chemical systems, such as radicals, excited states, transition states and dissociating bonds.1 In the presence of strong correlation, methods that are typically highly accurate, such as coupled cluster, often fail to describe the system correctly. This failure has been attributed to the decrease in quality of the Hartree–Fock wavefunction as a first-order representation of the system, as the static correlation present often leads to near-degeneracies in the Hilbert space which cannot be captured by a single-determinant wavefunction.

Coupled cluster (CC) theory2, 3 has become the most popular ab initio approach to electronic structure calculations, as it provides good results for medium-sized weakly correlated systems, while maintaining size-consistency and scaling polynomially with system size. However, for strongly correlated systems, it requires consideration of high level excitors in order to correctly estimate the correlation energy.4 Since its computational cost scales as $O(N^{2i+2})$, where $i$ is the truncation level and $N$ is the system size, this limits its use to very small systems.

One way to circumvent this issue and accurately treat some strongly correlated systems has been to use multiple reference determinants. Today, the field of multireference coupled cluster is very broad, with numerous methods developed over the last forty years, falling broadly into two categories: particle-conserving methods, which either use multiple cluster operators in a Jeziorski-Monkhorst ansatz5, 6, 7, 8, 9, 10, 11, 12 or a single cluster operator,13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 and Fock-space methods,25, 26, 27, 28, 29, 30, 31, 32 which generate wavefunctions with different numbers of electrons. While some of these have been successful in capturing the correlation energies of various test systems,33, 34, 35, 36, 37, 25 they are plagued by various size-consistency and intruder-state issues.38, 39, 40, 29, 41, 42, 43, 17, 44, 45, 46

In recent years, conventional quantum chemical techniques have been successfully combined with stochastic wavefunction propagation methods to improve computational performance. A prime example of this is the Full Configuration Interaction Quantum Monte Carlo (FCIQMC) method.47 While, like Full Configuration Interaction (FCI), FCIQMC scales exponentially with system size, it does so with a significantly lower prefactor. This has allowed the method, together with its initiator adaptation,48 to successfully treat a variety of systems.49, 50, 51

A stochastic solution to the coupled cluster equations has also been implemented using Projector Monte Carlo.52 This Coupled Cluster Monte Carlo (CCMC) method reproduces deterministic CC results to within stochastic error bars, but only needs to store a small fraction of the Hilbert space, leading to significantly lowered memory and computational costs. Both FCIQMC and CCMC have recently been used in conjunction with conventional coupled cluster as a means to include selected higher-order clusters in a CC calculation, either iteratively or not.53, 54

In this paper we describe an implementation of multireference coupled cluster (using a single-reference formalism similar to that of refs. 14, 15) within the stochastic paradigm, which allows for very quick implementation of such methods. In the following section we give an overview of the CCMC method and in the third section we describe our implementation of multireference Coupled Cluster Monte Carlo (mr-CCMC). The fourth and fifth sections then present a series of results obtained using this method on known strongly correlated molecular systems. These results are discussed in the general context of multireference methods in section 5 and some conclusions are given in section 6.

2 Stochastic Coupled Cluster

In deterministic CC, the wavefunction is represented by the exponential ansatz

$$\Psi_{\mathrm{CC}}=e^{\hat{T}}\ket{D_{0}}, \tag{1}$$

where $\ket{D_{0}}$ is a reference wavefunction (usually the Hartree–Fock wavefunction),

$$\hat{T}=\sum_{\mathbf{i}}t_{\mathbf{i}}\hat{a}_{\mathbf{i}} \tag{2}$$

and $\hat{a}_{\textbf{i}}$ are excitors — combinations of creation and annihilation operators (for example, a second order excited determinant $\ket{D_{ij}^{ab}}=\hat{a}_{b}^{\dagger}\hat{a}_{a}^{\dagger}\hat{a}_{i}\hat{a}_{j}\ket{D_{0}}$ , so $\hat{a}_{ij}^{ab}=\hat{a}_{b}^{\dagger}\hat{a}_{a}^{\dagger}\hat{a}_{i}\hat{a}_{j}$ ). If we group the excitors based on their excitation level relative to the reference, we can also write

$$\hat{T}=\sum_{i}\hat{T}_{i} \tag{3}$$

where $i$ is the excitation level. This wavefunction is equivalent to the FCI wavefunction if all possible excitors are included. In truncated CC, $\hat{T}$ is limited to only excitors of up to a certain excitation level. In order to obtain $t_{\textbf{i}}$ , the Schrödinger equation is projected onto each of the determinants $\ket{D_{\textbf{i}}}$ (including the reference), leading to a series of coupled cluster equations to be solved:

$$\langle D_{\mathbf{i}}|\hat{H}-E|\Psi_{\mathrm{CC}}\rangle=0, \tag{4}$$

where $E$ is the energy of $\Psi_{\mathrm{CC}}$. The number and complexity of these equations increase with the highest excitation level considered.

These equations are equivalent to

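$$\langle D_{\mathbf{i}}|1-\delta\tau(\hat{H}-E)|\Psi_{\mathrm{CC}}\rangle=\braket{D_{\mathbf{i}}}{\Psi_{\mathrm{CC}}} \tag{5}$$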

Since

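$$\braket{D_{\mathbf{i}}}{\Psi_{\mathrm{CC}}}=\pm\langle D_{0}|\hat{a}_{\mathbf{i}}^{\dagger}|\Psi_{\mathrm{CC}}\rangle=t_{\mathbf{i}}+\mathcal{O}(\hat{T}^{2}), \tag{6}$$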

this can be approximately recast in an iterative form as55

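$$t_{\mathbf{i}}(\tau+\delta\tau)=t_{\mathbf{i}}(\tau)-\delta\tau\langle D_{\mathbf{i}}|\hat{H}-E|\Psi_{\mathrm{CC}}\rangle \tag{7}$$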
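As a rough illustration of this kind of projection, the sketch below iterates the analogous update for a linear (CI-like) parameterisation with intermediate normalisation, $c_{\mathbf{i}}\leftarrow c_{\mathbf{i}}-\delta\tau[(\mathbf{H}\mathbf{c})_{\mathbf{i}}-Ec_{\mathbf{i}}]$, on a small model Hamiltonian. It is deterministic and omits the nonlinear cluster terms, so it is not the CCMC algorithm itself; the function name `project_ground_state`, the model matrix and the step parameters are assumptions chosen for the example.

```python
import math


def project_ground_state(h, dtau=0.1, n_steps=2000):
    """Iterate c_i <- c_i - dtau * ((H c)_i - E c_i) with c_0 fixed to 1.

    `h` is a small symmetric matrix (list of lists) whose first basis state
    plays the role of the reference, with its energy taken as zero.
    This is a linear, deterministic analogue of the projection, not CCMC.
    """
    n = len(h)
    c = [1.0] + [0.0] * (n - 1)  # start from the reference determinant
    for _ in range(n_steps):
        hc = [sum(h[i][j] * c[j] for j in range(n)) for i in range(n)]
        energy = hc[0]  # projected energy, since c[0] = 1
        c = [1.0] + [c[i] - dtau * (hc[i] - energy * c[i]) for i in range(1, n)]
    energy = sum(h[0][j] * c[j] for j in range(n))
    return energy, c


# Hypothetical two-state model: reference coupled to one excited state.
h_model = [[0.0, 0.1], [0.1, 1.0]]
energy, amplitudes = project_ground_state(h_model)

# Lowest eigenvalue of the 2x2 model, for comparison.
exact = 0.5 * (1.0 - math.sqrt(1.0 + 4.0 * 0.1 ** 2))
```

For this two-state model the iteration converges to the lowest eigenvalue, provided the time step is small enough for the update to be a contraction.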

It is possible to obtain the solutions to these equations from the population dynamics of a set of ‘excips’ in Hilbert space. This is done by stochastically sampling the action of the Hamiltonian, described by two processes: spawning of an excip from $\ket{D_{\mathbf{i}}}$ onto another $\ket{D_{\mathbf{j}}}$ coupled to it by the action of the Hamiltonian (with probability proportional to $\langle D_{\mathbf{j}}|\hat{H}|D_{\mathbf{i}}\rangle$) and death of excips on $\ket{D_{\mathbf{i}}}$ (with probability proportional to $\langle D_{\mathbf{i}}|\hat{H}-S|D_{\mathbf{i}}\rangle$). The ‘shift’ $S$ replaces the parameter $E$ in the stochastic coupled cluster equations. Finally, pairs of excips of opposite signs on the same excitor annihilate each other, which helps ensure that the algorithm converges on the correct nodal structure.47 In order to improve computational performance and stability, a series of modifications to this algorithm have been made, such as the deterministic selection of the reference and non-composite excitors,56 the implementation of an efficient importance-based selection method,57 the use of a similarity transformed Hamiltonian in the linked CCMC formalism58 and the development of efficient excitation generators59, 60 and parallelizable algorithms.56 More recently a diagrammatic version of CCMC has been implemented.61

From a CCMC calculation, we have two estimators for the correlation energy of $\ket{\Psi_{\mathrm{CCMC}}}$ :

The instantaneous projected energy

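$$E_{\mathrm{proj}}=\frac{\langle D_{0}|\hat{H}|\Psi_{\mathrm{CCMC}}\rangle}{\braket{D_{0}}{\Psi_{\mathrm{CCMC}}}} \tag{8}$$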

The ‘shift’ $S$, which is expected to converge to the correlation energy once the calculation has reached a stable excip population. Once a target population has been reached in a CCMC calculation, the shift is set to vary starting from the instantaneous value of the projected energy. The shift is then updated every $A$ steps using47

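$$S(\tau)=S(\tau-A\delta\tau)-\frac{\zeta}{A\delta\tau}\ln\frac{N_{\mathrm{tot}}(\tau)}{N_{\mathrm{tot}}(\tau-A\delta\tau)} \tag{9}$$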

where $N_{\mathrm{tot}}$ is the total excip population.
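The shift update can be written as a one-line function. The sketch below is only illustrative: the function and argument names are assumptions, and $\zeta$ is treated as the damping parameter it usually denotes in the FCIQMC literature, which the text here does not define explicitly.

```python
import math


def update_shift(shift_old, n_tot_new, n_tot_old, dtau, steps_between_updates, damping):
    """Return S(tau) from S(tau - A*dtau) and the change in total population.

    steps_between_updates is A, damping is zeta (assumed meaning: a damping
    parameter), and n_tot_* are total excip populations at the two times.
    """
    growth = math.log(n_tot_new / n_tot_old)
    return shift_old - damping / (steps_between_updates * dtau) * growth


# Hypothetical numbers: a population that grew by 10% over A = 10 steps.
unchanged = update_shift(-0.1, 1000, 1000, dtau=0.01, steps_between_updates=10, damping=0.05)
grown = update_shift(-0.1, 1100, 1000, dtau=0.01, steps_between_updates=10, damping=0.05)
shrunk = update_shift(-0.1, 900, 1000, dtau=0.01, steps_between_updates=10, damping=0.05)
```

A growing population lowers the shift and a shrinking one raises it, which is what holds the total population near its target.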

3 Multireference Coupled Cluster

Multireference methods are justified by the desire to include “important” highly-excited determinants in the wavefunction expansion (e.g. configurations with many electrons in antibonding orbitals that acquire large coefficients during bond breaking). These are only included in the single reference CC (sr-CC) algorithm at high truncation levels. Their inclusion causes a significant improvement in the energy estimate (see Figure 4), but also requires an increased computational cost. However, by considering such determinants as part of the reference (or model) space of the calculation, they can be included without increasing the truncation level.

3.1 Conventional MRCC

Most Hilbert-space multireference coupled cluster methods are based on the Jeziorski-Monkhorst formalism5

$$\Psi_{\nu}=\sum_{\mu}c_{\nu\mu}e^{\hat{T}^{\mu}}\Phi_{\mu}, \tag{10}$$

where $\Phi_{\mu}$ are the reference-space functions, $\hat{T}^{\mu}$ are cluster operators defined relative to each of these references and $c_{\nu\mu}$ are CI coefficients. The formalism is the basis for so-called state-universal methods, where multiple wavefunctions are determined simultaneously. State-specific methods have also been developed; here we will discuss a particular approach based on a single-reference formalism, known as SS CCSD(TQ)14, 62, 15, 63 or CCSDtq.16 The starting point is to observe that for a given reference $\ket{D_{\mathbf{i}}}=\hat{a}_{\mathbf{i}}\ket{D_{0}}$,

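$$e^{\hat{T}^{(\mathbf{i})}}\ket{D_{\mathbf{i}}}=e^{\hat{T}^{(\mathbf{i})}}\hat{a}_{\mathbf{i}}\ket{D_{0}}. \tag{11}$$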

It is therefore possible to rewrite any multireference wavefunction in terms of excitations of a single reference only. With the appropriate intermediate normalisation ($\braket{D_{0}}{\Psi}=1$), it is further possible to write it in an exponential form, $e^{\hat{T}^{\prime}}\ket{D_{0}}$, where as before $\hat{T}^{\prime}=\sum_{i}\hat{T}_{i}^{\prime}$, but highly excited $\hat{T}_{i}^{\prime}$ no longer include all possible excitations of order $i$ from $\ket{D_{0}}$, but only those that can be reached by lower excitations from other references. To clarify, let us consider a complete (2,2) reference space, given by the four determinants $\{\ket{D_{0}},\ket{D_{I}^{A}},\ket{D_{J}^{B}},\ket{D_{IJ}^{AB}}\}$. $I$ and $A$ may be taken to be $\alpha$ spin-orbitals, and $J$ and $B$ as $\beta$. If we are interested in including up to double excitations out of this space in our cluster expansion, the wavefunction can be described as a single-reference exponential ansatz15 with

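$$\hat{T}^{\prime}=\hat{T}_{1}+\hat{T}_{2}+\sum_{I^{\prime},j,k,a,b,C^{\prime}}\hat{T}_{3}\begin{pmatrix}abC^{\prime}\\ I^{\prime}jk\end{pmatrix}+\sum_{k,l,c,d,A,B}\hat{T}_{4}\begin{pmatrix}cdAB\\ IJkl\end{pmatrix} \tag{12}$$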

where model space orbitals have been labelled with capital letters $I,J,K,...\ A,B,C...$ and general orbitals as $i,j,k...\ a,b,c...$ . Primes have been used to differentiate summation indices over the model space from fixed model space orbitals.

In general, the cluster operator $\hat{T}^{\prime}$ can be written as a sum of internal and external cluster operators,15

$$\hat{T}^{\prime}=\hat{T}^{\mathrm{int}}+\hat{T}^{\mathrm{ext}} \tag{13}$$

where $\hat{T}^{\mathrm{int}}$ gives excitations within the model space and $\hat{T}^{\mathrm{ext}}$ produces excitations outside the model space.

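$$\hat{T}^{\mathrm{int}}=\sum_{I^{\prime},A^{\prime}}\hat{T}_{1}\begin{pmatrix}A^{\prime}\\ I^{\prime}\end{pmatrix}+\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix} \tag{14}$$

$$\hat{T}^{\mathrm{ext}}=\sum_{i,j,\dots,a,b,\dots}\hat{T}_{1}\begin{pmatrix}a\\ i\end{pmatrix}+\hat{T}_{2}\begin{pmatrix}ab\\ ij\end{pmatrix}+\dots \tag{15}$$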

where at least one of the indices in each cluster in $\hat{T}^{\mathrm{ext}}$ is not in the model space. The wavefunction may then be written as

$$\Psi=e^{\hat{T}^{\mathrm{ext}}}e^{\hat{T}^{\mathrm{int}}}\ket{D_{0}} \tag{16}$$

It is worth pointing out at this stage that the singly excited terms in the model space are constrained to have the correct spin (and therefore $\ket{D_{I}^{B}}$ and $\ket{D_{J}^{A}}$ are not included). However, pairs of model space orbitals of different spin can be included in $\hat{T}_{3}\begin{pmatrix}abC^{\prime}\\ I^{\prime}jk\end{pmatrix}$, so long as the overall spin of the terms is correct. In the case where all but two indices in the term come from outside the model space, such terms, while included in the ansatz, are not within a double excitation of any of the model space determinants. This feature will be important when comparing to our stochastic method.

The ansatz can be generalised to include any excitation level $\lambda$ from a model space comprising all excitations of up to $k$-tuple order of a reference determinant, using a cluster operator15

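$$\hat{T}^{\prime}=\hat{T}_{1}+\dots+\hat{T}_{\lambda}+\hat{T}_{\lambda+1}\begin{pmatrix}a_{1}\dots a_{\lambda}A_{1}\\ I_{1}i_{1}\dots i_{\lambda}\end{pmatrix}+\dots+\hat{T}_{\lambda+k}\begin{pmatrix}a_{1}\dots a_{\lambda}A_{1}\dots A_{k}\\ I_{1}\dots I_{k}i_{1}\dots i_{\lambda}\end{pmatrix}, \tag{17}$$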

where indices $a_{1},...,a_{\lambda},i_{1},...,i_{\lambda}$ correspond to active or inactive orbitals, while $A_{1},...,A_{k}$ , $I_{1},...I_{k}$ are active indices.

3.2 Stochastic MRCC

Consider a stochastic coupled cluster calculation with truncation level $m$ . Currently, the single reference algorithm selects clusters that correspond to an excitation of up to order $m+2$ of the reference and allows spawning onto those that correspond to excitations up to order $m$ . We introduce a secondary reference in this model by allowing spawning and selection to occur in an expanded space, of size and shape determined by this secondary reference. However, the clusters will still be described exclusively by their effect on the primary reference, so we will not need to consider propagation differently for the two references. For example, in a system where the secondary reference considered is an excitation of order $n$ of the primary reference, we allow clusters to be selected if they correspond to excitations up to order $n+m+2$ of the primary reference. For high separations between references, this requires sampling a significantly larger space than the single reference equivalent, but due to recent improvements to the selection algorithm,57 this can be done relatively efficiently. Spawning is then only allowed onto excitors within $m$ excitations of either of the references.
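The acceptance rules described above can be sketched with determinants stored as sets of occupied spin-orbital indices. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, spin and symmetry are ignored, a single truncation level $m$ is used for every reference, and the selection limit is taken from the most highly excited secondary reference.

```python
def excitation_level(det_a, det_b):
    """Number of orbitals occupied in det_a but not in det_b."""
    return len(det_a - det_b)


def spawn_allowed(det, references, truncation):
    """Spawning is accepted within `truncation` excitations of any reference."""
    return any(excitation_level(det, ref) <= truncation for ref in references)


def selection_allowed(cluster_det, references, truncation):
    """Clusters are accepted up to order n + m + 2 of the primary reference,
    where n is the largest excitation level of any reference (n = 0 if only
    the primary reference is present)."""
    primary = references[0]
    n = max(excitation_level(ref, primary) for ref in references)
    return excitation_level(cluster_det, primary) <= n + truncation + 2


# Hypothetical 4-electron example: the secondary reference is a quadruple
# excitation of the primary reference.
primary = frozenset({0, 1, 2, 3})
secondary = frozenset({4, 5, 6, 7})
refs = [primary, secondary]

near_secondary = frozenset({0, 4, 5, 6})  # 3 from primary, 1 from secondary
far_from_both = frozenset({0, 4, 8, 9})   # 3 from primary, 3 from secondary
```

With $m=2$, `near_secondary` is rejected in a single-reference calculation but accepted once the secondary reference is added, while `far_from_both` is rejected in both cases.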

Figure 1 shows an example of two references 6 excitations apart, treated at CCSD level. While this model nicely highlights the relation between the selection and spawning spaces, for easier comparison to the conventional method of Piecuch, Oliphant and Adamowicz,14, 15 we will consider two references two excitations apart, $\ket{D_{0}}$ and $\ket{D_{1}}=\ket{D_{IJ}^{AB}}$, and treat both at the CCSD level. The resulting wavefunction is given by $\Psi=e^{\hat{T}^{\prime}}\ket{D_{0}}$, where

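$$\begin{split}\hat{T}^{\prime}&=\hat{T}_{1}+\hat{T}_{2}+\sum_{A^{\prime},I^{\prime},a,b,j,k,l}\Big[\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abA^{\prime}\\ IJk\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}aAB\\ I^{\prime}jk\end{pmatrix}+\hat{T}_{4}\begin{pmatrix}abAB\\ IJkl\end{pmatrix}\Big]\end{split} \tag{18}$$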

where $A^{\prime}$ runs over model orbitals empty in $\ket{D_{0}}$ , $I^{\prime}$ runs over occupied ones, $j,k,l$ are core orbitals and $a,b$ are virtual orbitals. In terms of internal and external cluster operators,

$$\hat{T}^{\mathrm{int}}=\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix} \tag{19}$$

and $\hat{T}^{\mathrm{ext}}=\hat{T}^{\prime}-\hat{T}^{\mathrm{int}}$ . Obviously,

$$\hat{T}_{2}\begin{pmatrix}AB\\ IJ\end{pmatrix}\propto\hat{a}_{1}, \tag{20}$$

where $\hat{a}_{1}\ket{D_{0}}=\ket{D_{1}}$. Also, if we label cluster operators relative to $\ket{D_{1}}$ as $\hat{T}^{(1)}$, then

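$$\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\propto\hat{T}_{1}^{(1)}\begin{pmatrix}a\\ k\end{pmatrix}\hat{a}_{1} \tag{21}$$

$$\hat{T}_{3}\begin{pmatrix}abA\\ IJk\end{pmatrix}\propto\hat{T}_{2}^{(1)}\begin{pmatrix}ab\\ Bk\end{pmatrix}\hat{a}_{1} \tag{22}$$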

and so on. We can therefore write

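$$\hat{T}^{\prime}=\hat{T}_{1}+\hat{T}_{2}+\left(\hat{T}_{1}^{(1)}+\hat{T}_{2}^{(1)}\right)\hat{a}_{1} \tag{23}$$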

taking care to only include overlapping contributions in the cluster operator relative to a single reference. This leads to multiple equivalent representations of $\hat{T}^{\prime}$ in this form. In our stochastic approach, we only consider each excited determinant once, regardless of which references and clusters it can be reached from, so these considerations are trivially avoided. This makes CCMC an ideal framework for this kind of algorithm, as it allows simple implementations of potentially complicated reference spaces. Generalizing to an arbitrary number of references, with arbitrary corresponding truncation levels, we obtain

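$$\hat{T}^{\prime}=\sum_{i=1}^{m_{0}}\hat{T}_{i}+\sum_{n=1}^{N}\sum_{j=0}^{m_{n}}\hat{T}_{j}^{(n)}\hat{a}_{n} \tag{24}$$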

where $\hat{T}_{i}$ are $i$ -th order excitors of the first reference, $\hat{T}^{(n)}_{j}$ are $j$ -th order excitors of the $n$ -th secondary reference, $\hat{a}_{n}$ is the excitor that generates the $n$ -th secondary reference from the first, $m_{n}$ is the truncation level for reference $n$ and $N$ is the number of secondary references used. We note here two differences from Eq. 17. Firstly, our formalism allows the definition of an arbitrary reference space, rather than requiring the inclusion of all excitations up to a certain order. Secondly, the truncation level with respect to each reference can be selected independently, allowing for additional flexibility.

This algorithm effectively allows consideration of secondary references while maintaining the relative simplicity of the sr-CCMC approach. It is worth noting that, in a multireference calculation that explores the set of determinants within $m$ excitations of two references, there is an approximately twofold increase in the proportion of the Hilbert space that must be stored compared to the corresponding single-reference calculation, truncated at excitation level $m$ . In general, we expect the space spanned by a calculation to increase at most linearly with the number of references, provided the truncation levels are the same for all references, as the spaces spanned by the cluster expansion about each reference may overlap, leading to slight sublinearity. In large basis sets, this is insignificant relative to the $\mathcal{O}(N^{2n})$ increase in memory costs associated with increasing the truncation level to $m+n$ in order to include the same determinants in a single-reference calculation. If lower truncation levels can be used to obtain results of the same accuracy, the scaling with system size is reduced polynomially. With the current selection scheme, the size of the selection space is only determined by the highest excited secondary reference, so we expect the computational scaling with number of references to be favourable.
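The growth of the stored space with the number of references can be checked by brute-force counting on a toy system. The sketch below is an assumption-laden illustration: it uses a hypothetical 6-electron, 12-spin-orbital system, ignores spin and spatial symmetry, and takes the secondary reference to be the sextuple excitation of the primary one.

```python
from itertools import combinations


def count_spawn_space(n_orbitals, references, truncation):
    """Count determinants within `truncation` excitations of any reference."""
    n_electrons = len(references[0])
    count = 0
    for occupied in combinations(range(n_orbitals), n_electrons):
        det = frozenset(occupied)
        if any(len(det - ref) <= truncation for ref in references):
            count += 1
    return count


primary = frozenset(range(6))        # orbitals 0..5 occupied
secondary = frozenset(range(6, 12))  # all six electrons excited

single_sd = count_spawn_space(12, [primary], 2)
double_sd = count_spawn_space(12, [primary, secondary], 2)
single_sdt = count_spawn_space(12, [primary], 3)
double_sdt = count_spawn_space(12, [primary, secondary], 3)
```

In this toy setting the two-reference space at the doubles level is exactly twice the single-reference one, because the two expansions do not overlap. At the triples level they share the determinants three excitations from each reference, so the union grows by less than a factor of two.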

If we include the two single excitations $\ket{D_{I}^{A}}$ and $\ket{D_{J}^{B}}$ in the model space, equation (18) becomes

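$$\begin{split}\hat{T}^{\prime}&=\hat{T}_{1}+\hat{T}_{2}+\sum_{A^{\prime},I^{\prime},a,b,j,k,l}\Big[\hat{T}_{3}\begin{pmatrix}aAB\\ IJk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abB\\ Jkl\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}abA\\ Ijk\end{pmatrix}\\ &+\hat{T}_{3}\begin{pmatrix}abA^{\prime}\\ IJk\end{pmatrix}+\hat{T}_{3}\begin{pmatrix}aAB\\ I^{\prime}jk\end{pmatrix}+\hat{T}_{4}\begin{pmatrix}abAB\\ IJkl\end{pmatrix}\Big]\end{split} \tag{25}$$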

If we compare this to equation (12), we find all terms are accounted for, except for those of the form $\hat{T}_{3}\begin{pmatrix}abB\\ Ikl\end{pmatrix}$ and $\hat{T}_{3}\begin{pmatrix}abA\\ Jkl\end{pmatrix}$. This is because, as mentioned at the end of the previous section, these are not within two excitations of any of the references. We therefore expect that, depending upon the magnitude of the contributions of such terms to the wavefunction, we may be able to observe differences between the SS CCSD(TQ) method and the mr-CCMCSD method, even when using the same model space. It is also worth pointing out that, while in our case there is a difference between equations (18) and (25), both of these model spaces would be described by the same cluster operator in SS CCSD(TQ), as the set of active orbitals is unchanged. Only the formal split of excitors between $\hat{T}^{\mathrm{int}}$ and $\hat{T}^{\mathrm{ext}}$ would change.
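The claim that these two terms lie more than two excitations from every reference can be checked by counting orbital differences. In the sketch below the orbital labels follow the text, but the helper names and the minimal four-electron occupation are assumptions, and spin is ignored because only the excitation distance matters here.

```python
def excite(det, holes, particles):
    """Remove the `holes` orbitals from a determinant and add `particles`."""
    return (det - frozenset(holes)) | frozenset(particles)


def excitation_level(det_a, det_b):
    """Number of orbitals occupied in det_a but not in det_b."""
    return len(det_a - det_b)


# Occupied orbitals of the primary reference: active I, J and core k, l.
d0 = frozenset({"I", "J", "k", "l"})
references = [
    d0,
    excite(d0, ["I"], ["A"]),            # D_I^A
    excite(d0, ["J"], ["B"]),            # D_J^B
    excite(d0, ["I", "J"], ["A", "B"]),  # D_IJ^AB
]


def min_distance(det):
    """Smallest excitation level between `det` and any reference."""
    return min(excitation_level(det, ref) for ref in references)


# Determinants generated from D_0 by the two terms absent from mr-CCMCSD.
missing_1 = excite(d0, ["I", "k", "l"], ["a", "b", "B"])
missing_2 = excite(d0, ["J", "k", "l"], ["a", "b", "A"])
# A term that is present: it is a double excitation of D_J^B.
included = excite(d0, ["J", "k", "l"], ["a", "b", "B"])
```

Both missing determinants are three excitations away from all four references, whereas the included one is only two excitations from $\ket{D_{J}^{B}}$.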

4 Two-Reference Results

4.1 The S4 model

First we look at a simple 4-electron system, known as the S4 model: $\text{H}_{4}$ in a square geometry.41 The symmetry of the system and the fact that each of the H-H distances may be longer than an equilibrium $\text{H}_{2}$ bond introduces significant electron correlation to the system, so we expect it to have some multireference character. As we increase the H–H separation, while maintaining the square geometry, the amount of strong correlation in the system also increases.

In a minimal basis,64 this system only has 10 Slater determinants in its Hilbert space, so we can easily obtain the FCI energy. In this case, both CCSDTQ and mr-CCSD with two references (2r-CCMCSD), where the second reference is chosen to be the highest excited determinant, explore the entire Hilbert space, so we expect very good agreement of both methods with FCI. Therefore this system is a good test that the behaviour of our algorithm is as expected. We can see from Table 1 that, at $r_{\mathrm{HH}}=2a_{0}$ there is indeed good agreement between the FCI result, CCMCSDTQ and 2r-CCMCSD projected energies, with differences of less than 0.1 milliHartrees, well within chemical accuracy ( $1.6\times 10^{-3}$ Hartree). Our results compare favourably to conventional MRCC results obtained for this system34, 41, 42.
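The size of these differences can be read off Table 1 directly. The short sketch below converts the tabulated energies into errors relative to FCI in milliHartrees; the variable names are illustrative and the stochastic error bars quoted in the table are ignored.

```python
# Correlation energies from Table 1, in Hartree.
FCI = -0.117621
ccmc_energies = {
    "CCMCSD": -0.12044,
    "CCMCSDT": -0.12059,
    "CCMCSDTQ": -0.11761,
    "2r-CCMCSD": -0.11763,
}

CHEMICAL_ACCURACY_MEH = 1.6  # 1.6e-3 Hartree, as quoted in the text


def error_in_millihartree(energy, reference=FCI):
    """Signed deviation from the reference energy, in milliHartrees."""
    return (energy - reference) * 1000.0


errors = {name: error_in_millihartree(e) for name, e in ccmc_energies.items()}
within_chemical_accuracy = {
    name: abs(err) < CHEMICAL_ACCURACY_MEH for name, err in errors.items()
}
```

On these numbers CCMCSDTQ and 2r-CCMCSD both lie about 0.01 milliHartrees from FCI, while CCMCSD and CCMCSDT overshoot by close to 3 milliHartrees.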

A range of conventional MRCC methods41, 42 have been used to investigate this system in its strongly correlated regime ( $\alpha/a_{0}\in[2,7]$ ). As can be seen in Figure 2 the quality of our energy estimate remains consistent over this interval, showing an order of magnitude improvement over previous results.

4.2 The N2 molecule

The next system of interest is $\text{N}_{2}$ , which is known to be difficult to accurately describe by single reference methods at stretched geometries, due to correlation effects caused by the dissociation of the triple bond.65, 4 Going from the equilibrium bond length ( $2.118a_{0}$ ) to $3.6a_{0}$ , the convergence of the coupled cluster energy with truncation level becomes significantly poorer (Figure 4), requiring costly, high-truncation level calculations to converge on the FCI result. mr-CCMC can be applied to this system, using a sixth order excitation of the Hartree–Fock determinant as our second reference. This corresponds to exciting six electrons from bonding $\sigma$ and $\pi$ orbitals to anti-bonding ones (see Figure 3). We expect that this determinant is crucial in describing the bond breaking that occurs as the nitrogen molecule is stretched and therefore a good candidate for a second reference.

The numerical results of single- and multireference calculations on stretched nitrogen are given in the Supporting Information. For reference, Hartree–Fock energies are also given. In a STO-3G basis,66 2r-CCMC provides a significant improvement to our energy estimates, making 2r-CCMCSDT sufficient to get within chemical accuracy of the calculated FCI energy (Figure 4). A similar improvement can also be observed when treating the molecule in a larger Dunning cc-pVDZ basis set67 with frozen core electrons (Figure 4), confirming that the faster convergence is not simply a consequence of the multireference space effectively covering a high proportion of the relatively small STO-3G Hilbert space.

Figure 5 shows the proportion of the Hilbert space populated after the system has reached steady-state for single and multireference calculations. In general, coupled cluster memory costs should scale as $\mathcal{O}(N^{2l})$ and even calculations with high truncation levels only use a small fraction of the full Hilbert space of the system. Stochastic methods decrease the memory cost by a constant pre-factor.55 It can be seen from Figure 5 that 2r-CCMC produces more accurate results at a reduced memory cost relative to single-reference CCMC. Also, if CCSDT can be used to obtain results of similar accuracy to CCSDTQ, this reduces the scaling with system size by a factor of $N^{2}$ ($N^{6}$ vs. $N^{8}$), provided an efficient sampling method for the multireference space is implemented.

We have also used mr-CCMC to calculate a binding curve for N2, given in Figure 6. Curves obtained from 2r-CCMCSDT are in significantly better agreement with FCI values4 than CCSD or CCSDT. These results will be discussed further in the following section.

4.3 The $\text{N}_{3}^{-}$ anion

Finally, we look at the azide anion in order to assess the effect of using a second reference in systems with larger numbers of electrons. We have found that both the equilibrium geometry ( $r_{\mathrm{NN}}=1.16$ Å)68 and a linear symmetrically stretched geometry ( $r_{\mathrm{NN}}=2.0$ Å) require high truncation levels for the CCMC energy to converge. For the multireference calculations, a quadruple excitation was used as the second reference, corresponding to the excitation of the four electrons in the non-bonding $\pi$ orbitals to the corresponding antibonding orbitals. As can be seen in Figure 7, once again 2r-CCMC provides a significant improvement to the energy estimate, even if 2r-CCMCSDT is not sufficient to reach chemical accuracy in the stretched case.

The poorer convergence of 2r-CCMC for N ${}_{3}^{-}$ suggests that the choice of secondary reference has a significant effect on the quality of the results. This is as expected, following from the notion that references should be highly weighted determinants in the expansion of the true ground state wavefunction. In the case of N2 we were aware of such a determinant, but for N ${}_{3}^{-}$ we have multiple reasonable choices of secondary reference, one of which is the fourth-order excitation used. However, given that this excitation is already significant in the equilibrium geometry, it is likely that upon stretching the bonds, more highly excited determinants (perhaps the one corresponding to the excitation of both $\sigma$ and $\pi$ electrons, as for N2, or the excitation of bonding rather than non-bonding $\pi$ electrons) become highly weighted in the ground state and would therefore serve as better secondary references.

5 Beyond two references

We have shown in the previous section that, using two references, mr-CCMC is more successful in capturing the correlation in difficult molecular systems than the corresponding single reference methods. In this section we will turn our attention to the performance of the method relative to conventional MRCC methods and propose a procedure to balance the accuracy of our method against its computational cost.

5.1 Comparison to conventional MRCC methods

5.1.1 The H8 model

First, we turn our attention to the H8 model,69 shown in Figure 8, in a minimal basis.64

As the parameter $\alpha$ is varied from 0 to $\infty$ , the degree of electron correlation in the system decreases, as it dissociates to 4 independent H2 molecules. The HOMO and LUMO of the system at $\alpha=a_{0}$ become closer in energy as we decrease $\alpha$ , tending to degeneracy at $\alpha=0$ . Therefore, these orbitals form a natural choice of model space for mr-CCMC. This system has been studied using the SS CCSD(TQ) method of Piecuch, Oliphant and Adamowicz,63 allowing a direct comparison. We investigate the system for $\alpha/a_{0}\in[0.0001,1]$ and the results are given in Figure 9.

We find a noticeable ( $\approx 0.5$ millihartree) discrepancy between 2r-CCMC and SS CCSD(TQ) energies. Including the full (2,2) CAS in the reference space decreases this discrepancy, but 4r-CCMCSD is still not in agreement with SS CCSD(TQ). We expect this to be due to the absence of some terms in our cluster expansions compared to SS CCSD(TQ). To verify this, we compare the values given by these different approaches for the leading $\hat{T}_{3}$ terms in the CCSDTQ expansion.

As can be seen from Table 2, we observe more sign differences between the mr-CCSD wavefunctions and CCSDTQ than when comparing to SS CCSD(TQ). Also, at $\alpha=a_{0}$ , the fifth largest triple excitation coefficient in CCSDTQ is on a term that is excluded from our calculations, but included in SS CCSD(TQ). The presence of such terms in the wavefunction at other geometries as well could explain the discrepancy between our methods and SS CCSD(TQ). Indeed, if we look at the 2r-CCSDT wavefunction (which includes some pentuple contributions and therefore has significantly different cluster amplitudes than CCSDTQ), this cluster continues to make a significant contribution at $\alpha=0.1a_{0}$ . We can therefore expect such clusters to continue being significant in the CCSDTQ and SS CCSD(TQ) wavefunctions, potentially justifying the discrepancy relative to 2r- and 4r-CCMCSD. While our method does not include them when considering the active space, a small number of additional references could be included to ensure their presence. Alternatively, increasing the truncation level to CCSDT in a two-reference calculation is sufficient to recover these terms and indeed obtain much more accurate results than either mr-CCMCSD or SS CCSD(TQ).

5.1.2 The N2 molecule

Figure 10 shows the difference between the energies from various implementations of MRCC at the CCSD level37, 4 and the FCI energy along the N2 binding curve. It can be easily observed that 2r-CCMCSDT performs as well as the best of these methods, while 2r-CCMCSD shows a significant deviation from the FCI values. We believe that the primary cause of this is the fact that all conventional methods are built on top of a CASSCF calculation in the N2 (6,6) CAS,37 with all double excitations out of this CAS considered. The space spanned by such calculations is a large superset of the space our two-reference CCSD calculation spans, which could be expected to improve the accuracy of these calculations.

To obtain a fairer comparison, we have included all 400 determinants in the CAS as references in our calculation (400r-CCSD), allowing double excitations out of each. Including the CAS in such a way is equivalent to using a CASCI reference wavefunction and significantly improves the quality of the obtained correlation energy (see Figure 10), yielding a method that outperforms all but the most accurate conventional methods. The remaining gap can be bridged by using CASSCF rather than HF orbitals; however, this comes at an increased computational cost.

5.1.3 The H2O molecule

We also investigate the symmetric dissociation of the water molecule, over a range of OH bond lengths ranging from the equilibrium value $R_{\mathrm{e}}=1.84345a_{0}$ to $3R_{\mathrm{e}}$ , with the HOH angle fixed at $110.6^{\circ}$ . As can be seen from Figure 11, for this system the CCSDT description fails at long bond lengths. By comparison, CCSD (see footnote 1) continues to provide reasonable descriptions across the binding curve. As in the case of N2, this molecule has been studied using state-specific MRCC methods,37 based on the (4,4) CASSCF wavefunction as a reference. SS-MRCCSD and sr-MRBWCCSD consistently give errors of less than 5 and 15 millihartree, respectively, relative to the FCI results. The CCSDtq method has also been applied to this system, giving errors consistently below 3 millihartree, which can be reduced by applying further corrections.16, 70

Footnote 1. The performance of all methods is highly dependent on the exact Hartree–Fock reference used at $r_{\mathrm{OH}}=3R_{\mathrm{e}}$ , where there are two low-lying RHF states. One, with $E=-75.34439$ Hartree, gives the results shown in Figure 11 for CCSD, CCSDT and mr-CCMC. The other, with $E=-75.4341998$ Hartree, causes conventional CCSD as well as CCMCSD, CCMCSDT and mr-CCMCSDT to converge to a metastable excited state with $E_{\mathrm{corr}}\approx-0.4$ Hartree.

2r-CCMCSD, using the highest excited determinant in the (4,4) CAS as a secondary reference, performs comparably to sr-MRBWCCSD; however, 2r-CCMCSDT shows a significant improvement, with errors of less than 1.5 millihartree across the entire binding curve. Unlike its single-reference counterpart, 2r-CCSDT provides a consistent description of the system at all bond lengths.

As before, we can use all determinants in the (4,4) CAS as references for a CCMCSD method. Once again, we observe a significant improvement in the quality of our estimates, generally outperforming conventional MRCC methods, but not 2r-CCMCSDT or CCSDtq.

5.2 Bridging the gap

While the two methods achieve results of similar quality for N2, it is worth noting that the stochastic Hilbert space of 2r-CCMCSDT is less than half of that of 400r-CCMCSD (68000 vs. 151100 determinants). A direct comparison of these Hilbert spaces shows that, rather than the 2r-CCMCSDT calculation spanning a strict subset of the 400r-CCMCSD space, they only partially overlap. The CAS shows significant redundancy in spanning this overlap, with an average of 9 CAS determinants connected to any given (connected) determinant. However, there are determinants in the overlap that are solely connected to one CAS determinant. Altogether these connect to only 38 of the CAS determinants and it turns out these 38 determinants are also sufficient to span the whole overlap. This suggests that the significant part of the wavefunction is encoded in this subspace. The flexibility mr-CCMC has in terms of defining references and their accepted cluster excitation levels allows us to easily investigate this hypothesis. Indeed, an mr-CCSD calculation using these 38 determinants as references recovers $98.7\%$ of the correlation energy at $r=3.6a_{0}$ , while decreasing the Hilbert space (and therefore memory cost) by $82\%$ compared to the 400r-CCSD case. It maintains this level of accuracy consistently across the binding curve, as can be seen in Figure 10.
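The connectivity analysis described in this paragraph can be illustrated with a minimal Python sketch. It assumes that determinants are stored as integer bit strings over spin orbitals and that a determinant counts as connected to a reference when it lies within the truncation level of that reference. The function names and the toy determinants are hypothetical and are not taken from HANDE-QMC or from the actual N2 calculations.

```python
def excitation_level(det_a: int, det_b: int) -> int:
    """Number of electrons that differ between two bit-string determinants."""
    # Each excitation flips one occupied and one virtual bit, so halve the count.
    return bin(det_a ^ det_b).count("1") // 2


def connected_refs(det, refs, max_level):
    """References lying within max_level excitations of det."""
    return [r for r in refs if excitation_level(det, r) <= max_level]


def essential_references(overlap, refs, max_level):
    """References that are the only connection of at least one determinant."""
    essential = set()
    for det in overlap:
        conn = connected_refs(det, refs, max_level)
        if len(conn) == 1:
            essential.add(conn[0])
    return essential


def covers(overlap, refs, max_level):
    """True if every determinant in overlap is connected to some reference."""
    return all(connected_refs(det, refs, max_level) for det in overlap)


# Toy data (assumption): 4 electrons in 8 spin orbitals, singles only.
refs = [0b00001111, 0b00110011, 0b11110000]
overlap = [0b00010111, 0b01001110, 0b00110101]
max_level = 1

essential = essential_references(overlap, refs, max_level)
print(sorted(bin(r) for r in essential))
print(covers(overlap, essential, max_level))
```

In this toy case the third reference is redundant: two determinants each have a single connected reference, and those two references already span the whole overlap, mirroring the reduction from 400 to 38 references on a much smaller scale.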

The mr-CCMC method shows fast convergence of the correlation energy with increasing number of references from this subset (see Figure 12). While the exact details of the convergence depend on the order in which the references are included, the behaviour is significantly outside the standard deviation of a randomly selected set of 38 references, supporting the idea that these references and their excitations encode the significant part of the wavefunction. We also observe the expected sub-linear scaling of memory cost with number of references, as their spawned spaces begin to overlap (see Figure 13).

In order to obtain this optimised reference space, one requires knowledge of the spanned Hilbert spaces of the larger 400r-CCSD and 2r-CCSDT calculations. In this work, the information was acquired from stochastic snapshots of the two calculations; however, a list of all determinants in the Hilbert space of each calculation can be easily generated if the references and excitation levels of the methods are known. This could then be analysed in the same way we have done here and used to predict an optimised, less computationally expensive method, without incurring the cost of actually running the more demanding calculations.
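As a rough illustration of how such a determinant list might be generated, the following Python sketch enumerates every determinant within a given excitation level of each reference and takes the union. It is a simplified sketch under stated assumptions: determinants are integer bit strings over spin orbitals, spin and spatial symmetry restrictions are ignored, and the names `excitations` and `spanned_space` are hypothetical.

```python
from itertools import combinations


def excitations(ref: int, n_orb: int, max_level: int) -> set:
    """All determinants reachable from ref by at most max_level excitations."""
    occ = [i for i in range(n_orb) if (ref >> i) & 1]
    virt = [i for i in range(n_orb) if not (ref >> i) & 1]
    dets = {ref}
    for level in range(1, max_level + 1):
        for holes in combinations(occ, level):
            for particles in combinations(virt, level):
                det = ref
                for h in holes:
                    det &= ~(1 << h)  # remove an electron
                for p in particles:
                    det |= 1 << p     # add an electron
                dets.add(det)
    return dets


def spanned_space(refs_with_levels, n_orb: int) -> set:
    """Union of the excitation spaces of all (reference, level) pairs."""
    space = set()
    for ref, level in refs_with_levels:
        space |= excitations(ref, n_orb, level)
    return space


# Toy example (assumption): 4 electrons in 8 spin orbitals, two references
# related by a double excitation, each with doubles allowed.
n_orb = 8
space_a = excitations(0b00001111, n_orb, 2)
space_b = excitations(0b00110011, n_orb, 2)
union = spanned_space([(0b00001111, 2), (0b00110011, 2)], n_orb)
print(len(space_a), len(space_b), len(union))
```

Because the two excitation spaces share determinants, the union is smaller than the sum of the individual spaces, which is the origin of the sub-linear growth of memory cost with the number of references.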

6 Conclusions

We have successfully implemented a simple multireference technique within the framework of stochastic coupled cluster. The method shows a systematic improvement over single-reference CCMC, giving high-accuracy energy estimates in known strongly correlated molecular systems. The memory requirements are expected to scale sublinearly with the number of references used. This scaling is significantly better than the one expected with increasing the truncation level in a large Hilbert space, making the technique likely useful for the treatment of more complicated systems, with multiple highly weighted determinants in the true ground state. Significantly, in most cases the method performs at least as well as many deterministic multireference methods, while providing a simple algorithm and significant possibilities for expansion.

In one case, we have shown that lowered accuracy may be correlated with the absence of some potentially significant clusters from our expansion. The effect of including these clusters will be investigated further. We have also observed that the choice of one-electron orbitals can affect the quality of the mr-CCMC results. We are interested in investigating iterative schemes to optimise the orbitals used.

The method also shows great flexibility in the choice of reference and excitation space used, without significant effects on the stability and general behaviour of the calculations. This allows for potential detailed investigation into the structure of coupled cluster wavefunctions, as well as potential optimised computations, using the minimal required reference space.

The extent to which the use of multiple references improves the correlation energy is system dependent, which may be at least partly due to the different quality of the secondary references. Therefore, a systematic way of selecting the best secondary references, especially in systems where chemical intuition is lacking, is of further interest. This could potentially be done by iteratively modifying the reference space, using an amplitude threshold, similarly to what is done in the initiator approach or selected CI.71, 72, 73, 74, 75 A connectivity criterion could also be implemented. With this refinement, we expect that this formulation of stochastic multireference coupled cluster could provide a flexible and robust method to compute accurate energies for a wide range of strongly correlated systems.
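One possible form of a single threshold-based update of the reference space is sketched below in Python. This is purely illustrative and is not a procedure implemented in this work: it assumes that signed walker populations per determinant are available from a snapshot, that the first entry of the reference list is the primary reference, and that a determinant is promoted when its population exceeds a chosen fraction of the primary reference population. The function name, the threshold value and the toy populations are all hypothetical.

```python
def update_references(populations, current_refs, threshold):
    """Return current_refs extended by highly populated determinants.

    populations: dict mapping determinant (int bit string) to signed population.
    current_refs: list of determinants, primary reference first.
    threshold: minimum |population| relative to the primary reference.
    """
    norm = abs(populations[current_refs[0]])
    candidates = [
        det
        for det, pop in populations.items()
        if det not in current_refs and abs(pop) >= threshold * norm
    ]
    # Most highly weighted candidates are added first.
    candidates.sort(key=lambda det: abs(populations[det]), reverse=True)
    return list(current_refs) + candidates


# Toy snapshot (assumption): populations on four determinants.
populations = {0b0011: 100.0, 0b0101: -4.0, 0b1100: 35.0, 0b1010: -20.0}
new_refs = update_references(populations, [0b0011], threshold=0.15)
print([bin(d) for d in new_refs])
```

In an iterative scheme, the calculation would be rerun with the extended reference list and the update repeated until no further determinants cross the threshold.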

Acknowledgement

M-A.F. is grateful to Magdalene College, Cambridge for summer project funding and to the Cambridge Trust and Corpus Christi College for a studentship. C.J.C.S. is grateful to the Sims Fund for a studentship and A.J.W.T. to the Royal Society for a University Research Fellowship under Grant No. UF160398. All are grateful for support under ARCHER Leadership Project grant e507.

Molecular orbital integrals were generated using PySCF,76 Psi477 and Q-Chem.78 CASSCF orbitals were obtained using PySCF or ORCA.79 Stochastic post-Hartree–Fock and some FCI calculations were performed using a development version of HANDE-QMC.80 Deterministic CC calculations for N2 at 1.8 $a_{0}$ were performed in MRCC.81

Supporting Information

Numerical values for CCMC energies, as well as Hartree–Fock and FCI references.

Bibliography

1. Lyakh, D. I.; Musiał, M.; Lotrich, V. F.; Bartlett, R. J. Multireference Nature of Chemistry: The Coupled-Cluster View. Chem. Rev. 2011, 182
2. Čížek, J. On the Correlation Problem in Atomic and Molecular Systems. Calculation of Wavefunction Components in Ursell-Type Expansion Using Quantum-Field Theoretical Methods. J. Chem. Phys. 1966, 45, 4256
3. Čížek, J. On the Use of the Cluster Expansion and the Technique of Diagrams in Calculations of Correlation Effects in Atoms and Molecules. Adv. Chem. Phys. 1969, 24, 35
4. Chan, G. K.-L.; Kállay, M.; Gauss, J. State-of-the-art density matrix renormalization group and coupled cluster theory studies of the nitrogen binding curve. J. Chem. Phys. 2004, 121, 6110
5. Jeziorski, B.; Monkhorst, H. J. Coupled-cluster method for multideterminantal reference states. Phys. Rev. A 1981, 24, 1668
6. Piecuch, P.; Paldus, J. Orthogonally spin-adapted multi-reference Hilbert space coupled-cluster formalism: diagrammatic formulation. Theor. Chim. Acta 1992, 83, 69
7. Piecuch, P.; Paldus, J. Orthogonally spin-adapted state-universal coupled-cluster formalism: Implementation of the complete two-reference theory including cubic and quartic coupling terms. J. Chem. Phys. 1994, 101, 5875
8. Mahapatra, U. S.; Datta, B.; Mukherjee, D. A state-specific multi-reference coupled cluster formalism with molecular applications. Mol. Phys. 1998, 94, 157
