Stochastic vertex corrections: linear scaling methods for accurate   quasiparticle energies

Vojtech Vlcek

arXiv:1904.05524·physics.chem-ph·August 27, 2019

Stochastic vertex corrections: linear scaling methods for accurate quasiparticle energies

Vojtech Vlcek

PDF

TL;DR

This paper introduces stochastic linear-scaling methods within many-body perturbation theory to compute quasiparticle energies accurately, incorporating vertex corrections and validated on various molecular systems.

Contribution

It develops and verifies three stochastic approaches for quasiparticle energy calculations, including non-local vertex corrections, with linear scaling demonstrated for large molecules.

Findings

01

Vertex corrections are essential for accurate unoccupied state description.

02

All three stochastic methods scale linearly with the number of electrons.

03

Methods are validated against deterministic results for small molecules.

Abstract

New stochastic approaches for the computation of electronic excitations are developed within the many-body perturbation theory. Three approximations to the electronic self-energy are considered: $G_{0} W_{0}$ , $G_{0} W_{0}^{t} c$ , and $G_{0} W_{0}^{t c} Γ_{x}$ . All three methods are formulated in the time domain and the latter two incorporate non-local vertex corrections. In case of $G_{0} W_{0}^{t c} Γ_{x}$ , the vertex corrections are included both in the screened Coulomb interaction and in the expression for the self-energy. The implementation of the three approximations is verified by comparison to deterministic results for a set of small molecules. The performance fully stochastic implementation is tested on acene molecules, C $_{60}$ and PC $_{60}$ BM. The vertex correction appears crucial for the description of unoccupied states. Unlike conventional (deterministic) approaches, all three stochastic…

Tables3

Table 1. Table 1: Quasiparticle energies computed with distinct approximations to the self-energy based on HF starting point. Statistical error of each calculation is ≤ 0.05 absent 0.05 \leq 0.05 eV. The numbers of stochastic vectors representing the Green’s function ( N ζ subscript 𝑁 𝜁 N_{\zeta} ) are given in the parentheses. N ζ subscript 𝑁 𝜁 N_{\zeta} is rounded up to the nearest 100. Δ G 0 W 0 t c Δ subscript 𝐺 0 superscript subscript 𝑊 0 𝑡 𝑐 \Delta G_{0}W_{0}^{tc} represents the difference between the G 0 W 0 t c subscript 𝐺 0 superscript subscript 𝑊 0 𝑡 𝑐 G_{0}W_{0}^{tc} and G 0 W 0 subscript 𝐺 0 subscript 𝑊 0 G_{0}W_{0} result; the same notation is used for G 0 W 0 t c Γ X subscript 𝐺 0 superscript subscript 𝑊 0 𝑡 𝑐 subscript Γ 𝑋 G_{0}W_{0}^{tc}\Gamma_{X} in the last two columns. Reference results (labeled Ref. in the column header) are taken from Ref. 49 . All values are in eV.

system	$G_{0} W_{0}$ ( $N_{ζ}$ )	$Δ G_{0} W_{0}^{t c}$ ( $N_{ζ}$ )	Ref. $Δ G_{0} W_{0}^{t c}$	$Δ G_{0} W_{0}^{t c} Γ_{X}$ ( $N_{ζ}$ )	Ref. $Δ G_{0} W_{0}^{t c} Γ_{X}$
ammonia	-11.17 (4400)	0.56 (8800)	0.59	0.13 (4300)	0.17
ethylene	-11.02 (1700)	0.17 (3200)	0.20	0.36 (1700)	0.25
methane	-15.04 (1500)	0.40 (2800)	0.40	0.45 (1400)	0.38
methanol	-11.56 (3500)	0.64 (6300)	0.61	0.28 (3000)	0.32
water	-13.25 (4800)	0.71 (9200)	0.80	0.26 (5100)	0.18

Table 2. Table 2: Real-space grids (characterized by the number of points N g subscript 𝑁 𝑔 N_{g} ) and the range-separation parameters ( γ 𝛾 \gamma ) used in the calculations. The molecular structures were taken from Refs. 80 , 38 , 81

system	$N_{g}$	$γ$ [ $a_{0}^{- 1}$ ]
anthracene	$80 \times 60 \times 50$	0.23
tetracene	$88 \times 60 \times 50$	0.21
pentacene	$108 \times 60 \times 50$	0.19
hexacene	$112 \times 60 \times 50$	0.17
C₆₀	$88 \times 88 \times 88$	0.18
PC₆₀BM	$88 \times 88 \times 88$	0.15

Table 3. Table 3: Quasiparticle energies computed with DFT (LC- ω 𝜔 \omega PBE) and distinct approximations to the self-energy. Statistical errors of the stochastic methods are given in the parentheses. The reference values are taken from Refs. 38 , 82 , 83 , 84 , 85 . The number of ζ 𝜁 \zeta states is 1,500; the remaining parameters are described in the text.

	HOMO
system	DFT	$G_{0} W_{0}$	$G_{0} W_{0}^{t c}$	$G_{0} W_{0}^{t c} Γ_{X}$	Ref.
anthracene	-7.33	-7.42 ( $\pm$ 0.04)	-7.31 ( $\pm$ 0.04)	-7.25 ( $\pm$ 0.05)	-7.48
tetracene	-6.70	-7.00 ( $\pm$ 0.06)	-6.89 ( $\pm$ 0.05)	-6.79 ( $\pm$ 0.06)	-6.96
pentacene	-6.47	-6.65 ( $\pm$ 0.04)	-6.55 ( $\pm$ 0.05)	-6.42 ( $\pm$ 0.05)	-6.58
hexacene	-6.15	-6.32 ( $\pm$ 0.05)	-6.22 ( $\pm$ 0.04)	-6.11 ( $\pm$ 0.05)	-6.32
C₆₀	-7.90	-7.69 ( $\pm$ 0.06)	-7.68 ( $\pm$ 0.04)	-7.60 ( $\pm$ 0.05)	-7.69*
PC₆₀BM	-7.27	-7.42 ( $\pm$ 0.06)	-7.26 ( $\pm$ 0.03)	-7.20 ( $\pm$ 0.04)	-7.17*
	LUMO
system	DFT	$G_{0} W_{0}$	$G_{0} W_{0}^{t c}$	$G_{0} W_{0}^{t c} Γ_{X}$	Ref.
anthracene	-0.49	-0.54 ( $\pm$ 0.04)	-0.71 ( $\pm$ 0.04)	+0.00 ( $\pm$ 0.05)	-0.28
tetracene	-1.04	-1.22 ( $\pm$ 0.06)	-1.39 ( $\pm$ 0.05)	-0.55 ( $\pm$ 0.06)	-0.82
pentacene	-1.53	-1.65 ( $\pm$ 0.05)	-1.80 ( $\pm$ 0.05)	-0.91 ( $\pm$ 0.05)	-1.21
hexacene	-1.84	-1.94 ( $\pm$ 0.05)	-2.07 ( $\pm$ 0.04)	-1.27 ( $\pm$ 0.05)	-1.47
C₆₀	-2.47	-2.77 ( $\pm$ 0.07)	-2.91 ( $\pm$ 0.04)	-1.90 ( $\pm$ 0.05)	-2.68*
PC₆₀BM	-2.46	-2.61 ( $\pm$ 0.07)	-2.85 ( $\pm$ 0.04)	-1.95 ( $\pm$ 0.04)	-2.63*

Equations105

i G (1, 2) = ⟨ Ψ T \hat{ψ} (1) \hat{ψ}^{†} (2) Ψ ⟩,

i G (1, 2) = ⟨ Ψ T \hat{ψ} (1) \hat{ψ}^{†} (2) Ψ ⟩,

[i \frac{\partial}{\partial t} - \hat{h}] G (1, 2) - Σ_{T} (1, \overset{ˉ}{3}) G (\overset{ˉ}{3}, 2) = δ (r_{1} - r_{2})

[i \frac{\partial}{\partial t} - \hat{h}] G (1, 2) - Σ_{T} (1, \overset{ˉ}{3}) G (\overset{ˉ}{3}, 2) = δ (r_{1} - r_{2})

\hat{h} ψ (r_{1}) + \int Σ_{T} (r_{1}, r_{2}, ω = ε) ψ (r_{2}) d r_{2} = ε ψ (r_{1})

\hat{h} ψ (r_{1}) + \int Σ_{T} (r_{1}, r_{2}, ω = ε) ψ (r_{2}) d r_{2} = ε ψ (r_{1})

Σ_{T} (1, 2) = Σ_{H} (1) δ (1, 2) + Σ_{x c}, (1, 2),

Σ_{T} (1, 2) = Σ_{H} (1) δ (1, 2) + Σ_{x c}, (1, 2),

Σ_{H} (1) = - i ν (1, \overset{ˉ}{2}) G (\overset{ˉ}{2}, \overset{ˉ}{2}^{+})

Σ_{H} (1) = - i ν (1, \overset{ˉ}{2}) G (\overset{ˉ}{2}, \overset{ˉ}{2}^{+})

ν (1, 2) = \frac{1}{∣ r _{1} - r _{2} ∣} δ (t_{1} - t_{2})

ν (1, 2) = \frac{1}{∣ r _{1} - r _{2} ∣} δ (t_{1} - t_{2})

Σ_{x c} (1, 2) = - i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ G ^{- 1} ( 3 ˉ , 2 )}{δ U ( 4 ˉ )},

Σ_{x c} (1, 2) = - i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ G ^{- 1} ( 3 ˉ , 2 )}{δ U ( 4 ˉ )},

\frac{δ G ^{- 1} ( 3 , 2 )}{δ U ( 4 )} = \frac{δ G ^{- 1} ( 3 , 2 )}{δ U _{c l} ( 5 ˉ )} \frac{δ U _{c l} ( 5 ˉ )}{δ U ( 4 )},

\frac{δ G ^{- 1} ( 3 , 2 )}{δ U ( 4 )} = \frac{δ G ^{- 1} ( 3 , 2 )}{δ U _{c l} ( 5 ˉ )} \frac{δ U _{c l} ( 5 ˉ )}{δ U ( 4 )},

ϵ^{- 1} (5, 4) \equiv \frac{δ U _{c l} ( 5 )}{δ U ( 4 )} .

ϵ^{- 1} (5, 4) \equiv \frac{δ U _{c l} ( 5 )}{δ U ( 4 )} .

Γ (3, 2, 5) := - \frac{δ G ^{- 1} ( 3 , 2 )}{δ U _{c l} ( 5 )} .

Γ (3, 2, 5) := - \frac{δ G ^{- 1} ( 3 , 2 )}{δ U _{c l} ( 5 )} .

Σ_{x c} (1, 2) = iW (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) Γ (\overset{ˉ}{4}, \overset{ˉ}{3}, 2),

Σ_{x c} (1, 2) = iW (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) Γ (\overset{ˉ}{4}, \overset{ˉ}{3}, 2),

Σ_{x c} (1, 2) =

Σ_{x c} (1, 2) =

i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) [δ (\overset{ˉ}{3}, 2) δ (\overset{ˉ}{3}, \overset{ˉ}{4}) + \frac{δ Σ _{T} ( 3 ˉ , 2 )}{δ U ( 4 ˉ )}] .

Σ_{x} (1, 2) = i ν (1, 2) G (1, 2) .

Σ_{x} (1, 2) = i ν (1, 2) G (1, 2) .

Σ_{c} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ Σ _{T} ( 3 ˉ , 2 )}{δ U ( 4 ˉ )} .

Σ_{c} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ Σ _{T} ( 3 ˉ , 2 )}{δ U ( 4 ˉ )} .

\frac{δ Σ _{T} ( 3 , 2 )}{δ U ( 4 )} = \frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 ˉ , 5 ˉ )} \frac{δ G ( 6 ˉ , 5 ˉ )}{δ U ( 4 )} .

\frac{δ Σ _{T} ( 3 , 2 )}{δ U ( 4 )} = \frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 ˉ , 5 ˉ )} \frac{δ G ( 6 ˉ , 5 ˉ )}{δ U ( 4 )} .

^{3} χ (6, 5, 4) := - \frac{δ G ( 6 , 5 )}{δ U ( 4 )} .

^{3} χ (6, 5, 4) := - \frac{δ G ( 6 , 5 )}{δ U ( 4 )} .

Σ_{c} (1, 2) = - i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ Σ _{T} ( 3 ˉ , 2 )}{δ G ( 6 ˉ , 5 ˉ )}^{3} χ (\overset{ˉ}{6}, \overset{ˉ}{5}, \overset{ˉ}{4}) .

Σ_{c} (1, 2) = - i ν (1, \overset{ˉ}{4}) G (1, \overset{ˉ}{3}) \frac{δ Σ _{T} ( 3 ˉ , 2 )}{δ G ( 6 ˉ , 5 ˉ )}^{3} χ (\overset{ˉ}{6}, \overset{ˉ}{5}, \overset{ˉ}{4}) .

\frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 , 5 )} \approx - ν (2, 5) δ (6, 5) δ (3, 2),

\frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 , 5 )} \approx - ν (2, 5) δ (6, 5) δ (3, 2),

χ (5, 4) := - \frac{δ G ( 5 , 5 ^{+} )}{δ U ( 4 )} = \frac{δ n ( 5 )}{δ U ( 4 )} .

χ (5, 4) := - \frac{δ G ( 5 , 5 ^{+} )}{δ U ( 4 )} = \frac{δ n ( 5 )}{δ U ( 4 )} .

Σ_{c}^{G W} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, 2) ν (2, \overset{ˉ}{5}) χ (\overset{ˉ}{5}, \overset{ˉ}{4}) .

Σ_{c}^{G W} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, 2) ν (2, \overset{ˉ}{5}) χ (\overset{ˉ}{5}, \overset{ˉ}{4}) .

Σ_{x c}^{G W} (1, 2) = iW (1, 2^{+}) G (1, 2),

Σ_{x c}^{G W} (1, 2) = iW (1, 2^{+}) G (1, 2),

W (1, 2) = ν (1, 2) + ν (1, \overset{ˉ}{3}) χ (\overset{ˉ}{3}, \overset{ˉ}{4}) ν (\overset{ˉ}{4}, 2) .

W (1, 2) = ν (1, 2) + ν (1, \overset{ˉ}{3}) χ (\overset{ˉ}{3}, \overset{ˉ}{4}) ν (\overset{ˉ}{4}, 2) .

\frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 , 5 )} \approx

\frac{δ Σ _{T} ( 3 , 2 )}{δ G ( 6 , 5 )} \approx

- ν (2, 5) δ (6, 5) δ (3, 2) + ν (3, 2) δ (5, 2) δ (6, 3)

Σ_{c}^{G W Γ_{X}} (1, 2) =

Σ_{c}^{G W Γ_{X}} (1, 2) =

-

Σ_{c}^{G W + S O S E X} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, 2) ν (2, \overset{ˉ}{5}) χ (\overset{ˉ}{5}, \overset{ˉ}{4})

Σ_{c}^{G W + S O S E X} (1, 2) = i ν (1, \overset{ˉ}{4}) G (1, 2) ν (2, \overset{ˉ}{5}) χ (\overset{ˉ}{5}, \overset{ˉ}{4})

-

ε_{i} = ε_{i}^{0} + ⟨ ϕ_{i} \hat{Σ} (ω = ε_{i}) - \overset{v}{^}_{x c} ϕ_{i} ⟩,

ε_{i} = ε_{i}^{0} + ⟨ ϕ_{i} \hat{Σ} (ω = ε_{i}) - \overset{v}{^}_{x c} ϕ_{i} ⟩,

\hat{H}_{0} = \hat{h} + \hat{Σ}_{H} + \hat{V}_{c} + \hat{V}_{x} + \hat{Σ}_{x}^{γ}

\hat{H}_{0} = \hat{h} + \hat{Σ}_{H} + \hat{V}_{c} + \hat{V}_{x} + \hat{Σ}_{x}^{γ}

Σ_{x}^{γ} (1, 2) = i ν^{γ} (1, 2) G_{0} (1, 2),

Σ_{x}^{γ} (1, 2) = i ν^{γ} (1, 2) G_{0} (1, 2),

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Stochastic vertex corrections: linear scaling methods for accurate quasiparticle energies

Vojtěch Vlček

Department of Chemistry and Biochemistry, University of California, Santa Barbara, 93106, U.S.A

[email protected]

Abstract

New stochastic approaches for the computation of electronic excitations are developed within the many-body perturbation theory. Three approximations to the electronic self-energy are considered: $G_{0}W_{0}$ , $G_{0}W_{0}^{t}c$ , and $G_{0}W_{0}^{tc}\Gamma_{x}$ . All three methods are formulated in the time domain and the latter two incorporate non-local vertex corrections. In case of $G_{0}W_{0}^{tc}\Gamma_{x}$ , the vertex corrections are included both in the screened Coulomb interaction and in the expression for the self-energy. The implementation of the three approximations is verified by comparison to deterministic results for a set of small molecules. The performance fully stochastic implementation is tested on acene molecules, C60 and PC60BM. The vertex correction appears crucial for the description of unoccupied states. Unlike conventional (deterministic) approaches, all three stochastic methods scale linearly with the number of electrons.

1 Introduction

Efficient first-principles methods allow calculations of the ground state electronic structure in large molecules and solids. 1, 2, 3, 4, 5, 6, 7, 8, 9 However, quantitative prediction of electronic excitation is still computationally prohibitive. Traditional quantum chemistry methods, such as configurational interaction10, 11 or coupled cluster12, 13, 14, 15 approaches scale at least as $N_{e}^{6}$ (where $N_{e}$ is the number of electrons).2, 16 As a result, these methods are applied only to small systems.

Many-body perturbation theory17, 18 offers an alternative and becomes an increasingly popular tool for computation of quasiparticle (QPs) energies of molecules. The central quantity is the QP self-energy, i.e., a dynamical potential that embodies all many-body interactions. In principle, it is found in a self-consistent manner.19, 18 The self-consistency relates the self-energy to the QP Green’s function, polarizability and screened Coulomb interaction.

In practice, the expression for the self-energy is often simplified by neglecting high-order terms (non-trivial part of vertex function $\Gamma$ ) leading to the $GW$ approximation.19, 20, 21, 18 In addition, self-consistency in $GW$ calculations is either further approximated or completely avoided.22, 23, 20, 24, 25, 26, 27, 18 The latter thus corresponds to a one-shot correction, labeled $G_{0}W_{0}$ , on top a mean-field starting point (usually DFT). Such practical simplification of $GW$ has two purposes: (i) Conventional implementations scale as $N_{e}^{3}$ or $N_{e}^{4}$ , and repeated evaluation of the self-energy is thus costly even for small systems.28, 29, 30, 31 (ii) Self-consistent $GW$ may yield worse results than $G_{0}W_{0}$ due to the absence of the vertex term.32, 27, 33 The typical strategy is thus to use $G_{0}W_{0}$ on top of the “best” possible DFT starting point.34, 35, 36, 37, 38 Recent benchmark for acenes, however, revealed that $GW$ suffers from substantial errors for QP energies of unoccupied states.38

Beyond $GW$ techniques include approximate vertex functions ( $\Gamma$ ), which are closely related to the electron-hole interaction kernel in the Bethe-Salpeter equation (BSE).21 In practice $\Gamma$ is computed in various ways: Local vertex functions derived from the Kohn-Sham time-dependent density functional theory (TDDFT) are simple and relatively inexpensive,39, 40, 41, 42, 43 but they do not remedy failures of $GW$ , such as spurious “self-screening error”.44 Furthermore, they do not outperform simple $G_{0}W_{0}$ .45, 40, 43

Non-local vertex functions seem to improve the description of the QP energies;46, 44, 47, 48, 49 however, they are costly and suffer from steep scaling ( $N_{e}^{6}$ ).49 Alternatively, the vertex term has been approximated up to the second order,50, 51, 36 but this is associated with only mild cost reduction ( $N_{e}^{5}$ scaling).52, 51 Consequently, the beyond $GW$ calculations have been applied only to model or few-electron systems.46, 53, 48, 49

Here, numerical and theoretical developments are combined to overcome this limitation. A self-consistent expression for the self-energy with non-local $\Gamma$ is constructed using derivatives of the inverse Green’s function.53, 54 In practice, we apply only a one-shot correction, in which $\Gamma$ is derived from the non-local exchange term present in the mean-field starting point: either in the Hartree-Fock (HF) approximation or generalized Kohn-Sham (GKS) theory.55 The approach is labeled $GW\Gamma_{X}$ .

To lower the computational cost, $GW\Gamma_{X}$ is implemented using real-time stochastic numerical techniques.8, 56, 57, 58, 59, 60, 61 Up to now, stochastic calculations of QP energies were limited to $G_{0}W_{0}$ with DFT based on the local density approximation (LDA) to exchange and correlation (xc). Here, we first extend the methodology to HF and hybrid xc functionals. Next, the stochastic form of the self-energy is presented and tested on a set of molecules.

The stochastic implementation scales (sub)linearly with the number of electrons. Furthermore, favorable self-averaging leads to low statistical noise. For large systems, the $GW\Gamma_{X}$ method is found to be computationally less expensive than the stochastic implementation of $G_{0}W_{0}$ based on hybrid xc functionals. Results for ionization potentials and electron affinities of molecules suggest that the inclusion of a non-local vertex is necessary for accurate predictions of QP energies.

The manuscript is organized as follows: Derivation of the self-energy expressions is presented in Sec. 2. The stochastic formulation suitable for numerical implementation is shown in Sec. 3. Performance of the method and its implementation are demonstrated in the results section (Sec. 4) followed by conclusions and outlook (Sec. 5).

2 Theory

In this section, we first review the theoretical description of quasiparticles (QP), namely quasi-electron and quasi-holes. Propagation of a QP is described by a single-particle Green’s functions (GF), which is defined as

[TABLE]

where $\Psi$ is the ground state many-body wave-function of the $N_{e}$ -electron system, $\mathcal{T}$ denotes time ordering operator, $\hat{\psi}$ and $\hat{\psi}^{\dagger}$ are the electron annihilation and creation operators. Here, we adopt a short-hand notation for space-time coordinates: $\left({\bf r}_{1},t_{1}\right)\equiv 1$ .

The GF satisfies the equation of motion

[TABLE]

where $\hat{h}$ contains the kinetic energy and the electron-nuclear attraction terms, and $\Sigma_{T}$ is the total self-energy, which contains both Hartree and exchange-correlation interactions. Further, we simplified the notation by omitting integration symbol and introducing a bar symbol above the space-time coordinates that should be integrated.

In this work, we focus on QP energies of quasi-electrons and quasi-holes ( $\varepsilon$ ), which are obtained from the QP equation:

[TABLE]

where $\psi$ is the QP state. Eq. 3 is a fixed point expression where $\Sigma_{T}$ has to be computed at the frequency corresponding to $\varepsilon$ . In the rest of the paper, we use a real-time representation of the self-energy, $\Sigma_{T}({\bf r}_{1},{\bf r}_{2},t_{1},t_{2})$ which is merely a Fourier transformation of $\Sigma_{T}({\bf r}_{1},{\bf r}_{2},\omega)$ .

We will now review the expressions for the total self-energy in Sec. 2.1 and then we turn to the approximate forms in Sec. 2.2.

2.1 Self-energy

The total self-energy ( $\Sigma_{T}$ ) in principle requires knowledge of the two-particle GF, leading to a hierarchy of coupled equations of motion.53, 54, 18 Alternatively, $\Sigma_{T}$ is written as a sum:19, 20

[TABLE]

where $\Sigma_{H}$ and $\Sigma_{xc}$ are the Hartree and exchange-correlation self-energies. Note that the former is local and instantaneous; hence, it appears together with a delta function $\delta(1,2)\equiv\delta({\bf r}_{1}-{\bf r}_{2})\delta(t_{1}-t_{2})$ .

The Hartree term represents the interaction with the electron density:

[TABLE]

where $\nu$ is the instantaneous Coulomb interaction defined as

[TABLE]

and the density is given by the equal-time GF, i.e., $n({\bf r}_{1})\equiv G(1,1^{+})$ . The $1^{+}$ argument represents $({\bf r}_{1},t_{1}^{+})$ , where $t_{1}^{+}$ is only infinitesimally after $t_{1}$ .

The exchange-correlation self-energy is:

[TABLE]

where $U$ is an external potential introduced to remove the two-particle GF in the expression for $\Sigma_{T}$ .19, 20 The derivative of the inverse GF in Eq. 7 leads to two equivalent expressions: The first one is very compact and includes a three-point irreducible vertex function $\Gamma$ ; the second is slightly more involved, but it is more versatile expression leading to useful approximations to $\Gamma$ . For completeness, we will now review both.

2.1.1 $\Sigma_{xc}$ with the vertex function

In the first route, we consider a chain rule of derivatives:

[TABLE]

where $U_{cl}$ is a classical potential, consisting of the Coulomb and external potentials. From classical electrostatics, the change of $U_{cl}$ with the variation of the external potential $U$ corresponds to the inverse dielectric function:

[TABLE]

The first derivative in the right side of Eq. 8 serves as the definition of the irreducible vertex function:

[TABLE]

Combining Eqs. 7,9 and 10 leads to the following compact expression for the exchange-correlation self-energy:

[TABLE]

where $W$ is the screened Coulomb interaction obtained by convolution of Eqs. 6 and 9.

2.1.2 $\Sigma_{xc}$ with generalized polarizability

In the second route, we start again from Eq. 7 and make use of the functional derivatives.53, 54 First, the change of the inverse GF with respect to $U$ is:

[TABLE]

The two terms in the square brackets lead to a suitable definition of the exchange and correlation self-energies. The former is

[TABLE]

Note that the Coulomb kernel is instantaneous (Eq. 6), so $G(1,2)$ in Eq. 13 is the density matrix $\rho(1,2)\equiv G(1,2)\delta(t_{1}-t_{2})$ .

The correlation self-energy has a more complicated expression:

[TABLE]

It is convenient to recast the functional derivative of the total self-energy in Eq. 14 as

[TABLE]

Further, we introduce a generalized three-point reducible polarizability:

[TABLE]

The final expression of the correlation self-energy thus reads

[TABLE]

The compact expression with the vertex function (Eq. 11) is equivalent to the sum of the exchange (Eq. 13) and polarization (Eq. 17) self-energies. Note that $\Sigma_{c}$ depends on a functional derivative of $\Sigma_{T}$ (Eq. 17); hence, the total self-energy should be, in principle, found by a self-consistent cycle.

2.2 Approximate Self-energy

Due to the quadruple integration, Eq. 17 is computationally difficult. In the following, we outline how to construct practical expressions for $\Sigma_{c}$ based on successive approximations for the functional derivative $\delta\Sigma_{T}/\delta G$ .

2.2.1 The $GW$ approximation

In the $GW$ approximation, the total-self-energy in Eq. 17 is substituted with the classical Hartree self-energy $\Sigma_{H}$ . Hence, the functional derivative of $\Sigma_{T}$ with respect to the GF becomes

[TABLE]

where the delta functions are due to the locality of $\Sigma_{H}$ (Eq. 5).

The approximation introduced in Eq. 18 greatly simplifies the polarization self-energy: $\,{}^{3}\chi$ becomes two-point reducible polarizability, i.e.,

[TABLE]

Note the $\chi$ is a time-ordered quantity, but it is trivially related to the standard retarded response function.60, 61 Consequently, the polarization self-energy has the following form:

[TABLE]

The $GW$ xc self-energy is a sum of Eqs. 13 and 20, which becomes

[TABLE]

where we used an alternative definition of the screened Coulomb interaction:

[TABLE]

Note that the $GW$ self-energy is often obtained from Eq. 11 by approximating the vertex function as $\Gamma(1,2,3)\approx\delta(1,2)\delta(1,3)$ . Such derivation of $\Sigma_{xc}^{GW}$ is quite simple and compact, but it is not immediately clear how to construct a better approximation.

2.2.2 $GW\Gamma_{X}$ approximation

In this part, we will consider the next step in the construction of the self-energy. We use Eq. 17, and take $\Sigma_{T}\approx\Sigma_{H}+\Sigma_{x}$ . Note that this expression omits the functional derivative of the correlation self-energy. We denote this approximation $GW\Gamma_{X}$ . The derivative of the total self-energy becomes:

[TABLE]

The first term on the right is the classical Hartree interaction (as in Eq. 18), the second term is due to a non-local exchange. While both terms include the Coulomb kernel (Eq. 6), they have a different structure, i.e., each contracts distinct space-time points. As a result, the polarization self-energy contains a contribution from the reducible three-point polarizability:

[TABLE]

The definition of $\nu$ (Eq. 6) contains a delta function which guarantees that the Coulomb interaction is instantaneous in time. Hence, the second term in Eq. 24 implicitly contains $\delta(t_{2}-t_{\bar{3}})$ . As a result, the generalized polarizability depends on three spatial coordinates but only two time points. In other words, $\,{}^{3}\chi(\bar{3},2,\bar{4})$ (defined by Eq. 16) yields time-dependent induced density matrix $\delta\rho({\bf r}_{\bar{3}},{\bf r}_{2},t_{\bar{3}},t_{2})\delta(t_{2}-t_{\bar{3}})$ due to the variation of the external potential at $4$ . The response of the density matrix is time-ordered with respect to $t_{4}$ . In practice, $\,{}^{3}\chi(\bar{3},2,\bar{4})$ is not computed explicitly; as shown in Sec. 3.2.1, the second term is evaluated using real-time propagation of the time-dependent induced density matrix.

Finally, it is important to comment on the relation between $GW\Gamma_{X}$ approximation and the second-order screened-exchange (SOSEX) method.62, 51, 36 In the latter, two distinct steps are involved in approximating Eq. 17: (i) $\Sigma_{T}\approx\Sigma_{H}+\Sigma^{GW}_{xc}$ , and the functional derivative of the second term is $\delta\Sigma^{GW}_{xc}/\delta G\approx W$ (ii) the three-point polarizability is $\,{}^{3}\chi(1,2,3)\approx G(1,3)G(3,2)$ , which can be viewed as a generalized case of the independent QP approximation.21

Together, the $GW$ +SOSEX self-energy is62, 51

[TABLE]

Clearly, this expression is different from Eq. 24. Unlike $GW\Gamma_{X}$ , it contains a screened Coulomb interaction, and it lacks the induced density matrix. SOSEX approximates the vertex to second order49 and, as discussed, it represents a distinct approach to solve Eq. 17.

In the rest of the paper, we will consider only $GW$ (with and without RPA - c.f., Sec. 3.2.3) and the $GW\Gamma_{X}$ self-energies, in which the vertex term is derived from the mean-field starting-point.

3 Computational methodology

In this section, we present practical steps which allow the application of Eqs. 20 and 24 to large molecules. In practice, we employ two simplifications:

(i) We do not seek self-consistency in $\Sigma_{T}$ . The $i^{\rm th}$ QP energy is computed as:

[TABLE]

where $\varepsilon_{i}^{0}$ is an eigenvalue of the mean-field Hamiltonian, $\phi_{i}$ is the corresponding eigenstate, and $\hat{v}_{xc}$ is the mean-field exchange-correlation potential operator. The one-shot correction means that the GF and the screened Coulomb interaction (denoted $G_{0}$ and $W_{0}$ ) are expressed using the mean-field eigenvalues and eigenstates. Further, $\,{}^{3}\chi$ and $\chi$ in Eq. 24 are substituted with $\,{}^{3}\chi_{0}$ and $\chi_{0}$ , which correspond to the response functions computed with the mean-field Hamiltonian

In a one-shot correction scheme, $\delta\Sigma_{x}/\delta G_{0}$ is obtained from the derivative of the mean-field non-local exchange. Here, we consider HF and GKS starting points. In the second case, only the non-local part of the exchange (introduced in Eq. 28 in Sec. 3.1) is considered.

(ii) We reformulate Eqs. 21 and 24 using the stochastic approach, i.e., the expectation values become statistical estimators over (many) stochastic samples. This method was applied to the $G_{0}W_{0}$ approximation as described in earlier publications.56, 60, 61 In contrast to the previous work, the starting point is no longer constrained to DFT with local functionals. The stochastic formulation of $\Sigma_{c}^{GW\Gamma_{X}}$ is a new development.

The details of the starting point Hamiltonians are given in the next subsection (3.1), followed by a short overview of the stochastic approach and the description of the new developments (3.2).

3.1 Mean-field starting points

The starting point is computed with a Hamiltonian:

[TABLE]

where $\hat{h}$ contains the kinetic energy and the electron-nuclear attraction (as in Eq. 2), and $V_{c}$ is the correlation potential approximated by a semilocal functional of the density. The exchange interaction is based on spatial separation of the $1/r$ kernel into short and long-range parts.63, 64, 65 $\Sigma_{x}^{\gamma}$ is the non-local long-range exchange interaction

[TABLE]

where $G_{0}$ is the GF constructed from the $H_{0}$ and $\nu^{\gamma}$ is the exchange kernel:

[TABLE]

The short-range part, $V_{x}$ , is derived from the complementary error-function term; it is given by a semilocal density functional which depends on the value of $\gamma$ . In HF calculations, the $V_{c}$ and $V_{x}$ are set to zero, and the non-local exchange is given by Eq. 13 (i.e., the range-separation parameter $\gamma\to\infty$ ).

The calculations presented in this work rely on two starting points: HF and the optimally tuned LC- $\omega$ PBE functional66 implemented using the LibXC library.67, 68 In practice, optimal tuning amounts to finding range-separation parameter $\gamma$ which enforces the IP theorem, i.e., $\gamma$ is varied such that the negative of the HOMO energy corresponds to the ionization potential (energy difference between a neutral system and a cation). Optimal tuning is associated with mitigation of spurious electron self-interaction and leads to good $I$ and fundamental band gaps ( $E_{g}$ ) in finite systems 69, 70, 71. Further, TDDFT with optimally tuned functionals with long-range exchange treats attractive electron-hole interactions and efficiently mimics BSE.72, 73, 74

3.2 Stochastic approach

3.2.1 Stochastic sampling of Green’s functions and self-energies

We will now introduce the basics of the stochastic approach and describe how $\Sigma^{GW\Gamma_{X}}$ and $\Sigma^{GW}$ are computed using stochastic sampling of the GFs.

In the initial part of the algorithm, random functions are prepared on a real space grid as:

[TABLE]

where ${\rm d}\Omega$ is the volume element associated with each grid point. The $\pm$ in front of the fraction represents a random sign assigned to each space-point ${\bf r}$ . This choice satisfies the stochastic resolution of identity $\hat{\mathcal{I}}\equiv\left\{\left|\bar{\zeta}\middle\rangle\middle\langle\bar{\zeta}\right|\right\}$ , where $\hat{\mathcal{I}}$ is the identity operator and $\left\{\cdots\right\}$ denotes an average over the entire (in principle infinite) set of random functions.

In the stochastic representation, the density matrix is given as an average:

[TABLE]

Here, $\eta$ are random vectors within the occupied subspace, i.e., $\left|\eta\right\rangle=\hat{P}\left|\bar{\zeta}\right\rangle$ and $\hat{P}$ is a projection operator. $\hat{P}$ depends on the chemical potential and the Hamiltonian $\hat{H}_{0}$ . In practice, $\eta$ states can be constructed either by projecting on the occupied eigenstates or, e.g., by Chebyshev filtering.75, 8, 56, 60, 61 In this paper, we follow the former approach.

The stochastic form of the GKS Green’s function $G_{0}$ is56, 60, 61

[TABLE]

where the $\zeta$ vector is either in the occupied or unoccupied subspace, i.e., it is obtained by projection with $\hat{P}$ or its complement $\left(\hat{\mathcal{I}}-\hat{P}\right)$ . Since the equilibrium GF depends only on the difference between $t_{1}$ and $t_{2}$ , we set $t_{1}=0$ and let only the projected stochastic vectors to evolve in time. For negative/positive times, the GF represents a propagator of holes/electrons. The corresponding time-evolved random vectors are:

[TABLE]

In practice, the time propagation is performed using Trotter (split operator) technique. The computational cost of the time propagation scales with the number of occupied states in $\hat{H}_{0}$ . It is possible to reduce the cost by employing stochastic time propagation described in Sec. 3.2.3 and in Refs. 56, 60, 61, 76.

We will now focus on the two approximations introduced earlier (Eqs. 20 and 24) and combine them with the stochastic form of the GF (Eq. 32).

The $G_{0}W_{0}$ approximation to $\Sigma_{c}$ is:56, 60, 61

[TABLE]

where we introduced a time-ordered polarization potential $W_{P}(t)=\hat{\nu}\,\hat{\chi}_{0}(t)\,\hat{\nu}\left|\bar{\zeta}\phi_{i}\right\rangle$ , which is obtained from a retarded potential, $W_{P}^{r}$ , by manipulating its imaginary components in the frequency domain56, 60, 61 using sparse stochastic compression technique111Details of the implementation are given in Vlcek et al. Phys. Rev. B, 98(7):075107, 2018. Here, we use 20,000 stochastic vectors with size of 10% of the total real-space grid. This was found to be sufficient and agrees with our previous finding both molecules and periodic systems.. Note that the response function $\chi_{0}$ is based on the mean-field starting point (i.e., HF or GKS) as discussed earlier.

The retarded potential is computed as

[TABLE]

where the induced time-dependent density is

[TABLE]

In practice, the induced density is computed as a difference $\delta{n}({\bf r},t)={n}({\bf r},t)-{n}({\bf r},0)$ , where the density is constructed from time-evolved occupied states ${n}({\bf r},t)=\sum_{i}^{N_{occ}}|\phi({\bf r},t)|^{2}$ . The system is perturbed at $t=0$ by a potential ${\nu}({\bf r},\bar{\bf r}_{2})\bar{\zeta}(\bar{\bf r}_{2})\phi_{i}(\bar{\bf r}_{2})$ .

In the fully stochastic formulation, $\delta{n}({\bf r},t)$ is computed via stochastic sampling detailed in Sec. 3.2.2. Further, the propagation is performed using: (i) random phase approximation (RPA) or (ii) TDDFT. Both approaches are discussed at the end of Sec. 3.2.3.

Eq. 34 provides an intuitive interpretation of the $GW$ correlation self-energy: it is a time-dependent induced Coulomb potential due to the addition of an electron/hole to the state $\phi$ .

The correlation self-energy in the $GW\Gamma_{X}$ approximation is based on Eq. 24. Using the expression for the $GW$ self-energy, we obtain

[TABLE]

where we introduced a time-ordered induced exchange potential $W_{x}(t)=\hat{\nu}^{\gamma}\,{}^{3}\hat{\chi}_{0}(t)\,\hat{\nu}\left|\bar{\zeta}\phi_{i}\right\rangle$ , which contains the exchange kernel $\nu^{\gamma}$ (Eq. 29) and the three-point polarizability ${}^{3}\hat{\chi}_{0}(t)$ . The polarizability is based on the mean-field starting point (i.e., HF or GKS). The induced exchange potential is computed from its retarded form $W^{r}_{x}({\bf r},t)$ . The time-ordering is performed in the frequency domain and $W_{P}$ acquires positive/negative sign for electrons/holes.

The retarded potential is computed as:

[TABLE]

where the induced time-dependent density matrix is

[TABLE]

In practice, it is computed as a difference $\delta{\rho}({\bf r},{\bf r^{\prime}},t)={\rho}({\bf r},{\bf r^{\prime}},t)-{\rho}({\bf r},{\bf r^{\prime}},0)$ , the density matrix is constructed from time-evolved occupied states ${\rho}({\bf r},{\bf r^{\prime}},t)=\sum_{i}^{N_{occ}}\phi({\bf r},t)\phi^{*}({\bf r^{\prime}},t)$ . In the fully stochastic formulation, $\delta{\rho}({\bf r},{\bf r^{\prime}},t)$ is computed via stochastic sampling detailed in Sec. 3.2.2. Hence, $\Sigma_{c}^{GW\Gamma_{X}}$ contains time-dependent induced Coulomb and exchange potentials due to the addition of an electron/hole to the state $\phi$ .

Note that the $GW$ and $GW\Gamma_{x}$ self-energies can be written by Eqs. 34 and 37 by virtue of the stochastic decomposition of the GF. Only then, it is possible to express the correlation simply in terms of the induced time polarization and exchange potentials $W_{P}$ and $W_{x}$ .

3.2.2 Stochastic induced time-dependent density and density matrix

In this part, we describe how we evaluate both $\delta{n}$ and $\delta{\rho}$ using stochastic sampling rather than by summation over all occupied states. Since the density matrix is not constructed from eigenstates of $\hat{H}_{0}$ , it will naturally fluctuate in time unless an infinite number of stochastic vectors is used.

In practice, the density matrix induced by the addition/removal of an electron is computed as

[TABLE]

Here, $\rho_{\lambda}$ represents the perturbed density matrix and $\lambda$ denotes the strength of the perturbing potential due to charge addition (see discussion below Eq. 36). $\rho_{0}$ is the unperturbed density matrix which exhibits time dependence due to its stochastic nature.

The time-dependent density matrices are constructed from random vectors in the occupied subspace

[TABLE]

The stochastic states $\eta_{\lambda}(t)$ are found for each $t$ by time evolution according to:

[TABLE]

where $\hat{H}_{0}$ is the GKS Hamiltonian from Eq. 27, which adiabatically depends on $t$ since $\hat{\Sigma}_{H}$ , $\hat{V}_{c}$ and $\hat{V}_{x}$ are functionals of the time-dependent density, and $\hat{\Sigma}^{\gamma}_{x}$ is a functional fo the density matrix. The time-dependent density is, of course, $n({\bf r}_{1},t_{1})=\rho({\bf r}_{1},{\bf r}_{1},t_{1})$ . Numerically, Eq. 42 is solved using Trotter propagation technique (see Sec. 3.2.3).

The states $\eta_{\lambda}$ , are perturbed at $t=0$ :

[TABLE]

where $\hat{v}_{\lambda}$ is a perturbing potential:

[TABLE]

Here, $\lambda$ is the strength of the perturbation. In practice, we take $\lambda=10^{-4}E_{h}^{-1}$ , but the value of $\lambda$ between $10^{-5}E_{h}^{-1}$ and $10^{-3}E_{h}^{-1}$ does not affect the results for molecules in Sec.4. This is consistent with previous observations.56, 59, 60, 61, 76

In practice, stochastic computation of the induced charge density requires only a few random states $\eta$ (typically between 4 and 20), i.e., the number of states that are propagated by Eq. 42 is much smaller than the number of occupied states. Further, the induced density matrix is damped by a factor $\exp[-(\alpha t)^{2}/2]$ , where the damping factor is related to the maximum propagation time $\alpha=3/t_{max}$ .

3.2.3 Time propagation and stochastic decomposition of the long-range exchange – RPA and TDDFT response

The self-energy requires two distinct time-propagations to be computed: for the Green’s function (Eq. 33), and for the density matrix (Eq. 42). In both cases, the time-evolution operator is split into the local and non-local part of $\hat{H}_{0}$ , and it is calculated in discrete time steps $\Delta t$ as:

[TABLE]

where the local part of the Hamiltonian is

[TABLE]

The time-evolution due to the non-local part is computed simply as:

[TABLE]

Here, the time-step is a parameter subject to convergence tests; in typical calculations, $\Delta t$ ranges between 0.02 and 0.05 a.u.

There are only a few vectors $\zeta$ and $\eta$ . Nevertheless, the time evolution is costly due to the non-locality of $\Sigma^{\gamma}_{x}$ . To make Eq. 47 less expensive, we use two additional sets of stochastic vectors to represent $\Sigma^{\gamma}_{x}$ (Eq. 28):

(i) The first set is used for the long-range Coulomb interaction $\nu^{\gamma}$ :58

[TABLE]

This form was applied previously to the ground state calculations.58, 76 Here it is applied only to the time-evolution of stochastic states.

(ii) The second set, $\left\{\vartheta\right\}$ , is used to decompose $G_{0}$ . Since the exchange interaction is instantaneous, the GF is merely a density matrix. In practice, it is sufficient to use one or two the stochastic states $\vartheta$ (see the next section), which are obtained by a linear combination:

[TABLE]

where $\theta\in[0,2\pi]$ is a random phase, and $N_{\eta}$ is the number of $\eta$ vectors used for decomposition of the density matrix (Eq. 31).

Together, the action of $\hat{\Sigma}_{x}^{\gamma}$ on an arbitrary vector $\psi$ is

[TABLE]

Time arguments are omitted here since the exchange interaction is instantaneous (note the delta function in Eq. 48). Also, note that the $\bar{\bf r}_{2}$ coordinates are integrated out, i.e., $\chi^{\gamma}\left(\bar{{\bf r}}_{2}\right)\vartheta\left(\bar{{\bf r}}_{2}\right)\psi(\bar{\bf r}_{2})$ is a complex number.

In practice, the numbers of $\vartheta$ and $\chi$ vectors are finite, and hence the stochastic noise is, in principle, increased. However, at each time step, new random phases $\left\{\theta_{j}\right\}$ are selected. Frequent resampling of $\vartheta$ helps to reduce the stochastic error. As a result, only a few states are needed in actual calculations (see Sec. 4).

Finally, it is necessary to point out the difference between response computed with random phase approximation (RPA) and TDDFT. In the first case, only the Hartree self-energy evolves in time (i.e., it is constructed at each time step). Such treatment corresponds to the time-dependent Hartree approximation (equivalent to RPA). Here, the exchange and correlation terms are static. The non-local exchange interaction is computed with $\vartheta$ that repeatedly sample the static unperturbed vectors $\eta$ at $t=0$ . The corresponding results are labeled as $G_{0}W_{0}$ .

If the screening is computed with TDDFT, both Hartree and xc terms are constructed from the time-dependent states. The $\vartheta$ vectors that are made at each time step by a linear combination of time-evolved $\eta_{\lambda}$ states. To distinguish the level of theory applied, we label the beyond-RPA approaches as $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{X}$ , because the screened Coulomb interaction is based on a test charge-test charge response function.77 All the methods are tested in the next section.

4 Results

4.1 Verification of the time-dependent formulation

In this section, we use the time-dependent formulation of the self-energy (derived in Sec. 2.2) combined with the stochastic approach described in Sec. 3. We first compute the HOMO energies of small molecules using HF starting point. The results are verified against deterministic calculations in the frequency domain from Ref. 49.

We deliberately limit the stochastic approach to the decomposition of the Green’s functions. Another set of stochastic orbitals is used only for sparse stochastic compression and time-ordering of the induced potentials.61 We use Eqs. 34 and 37, in which $W_{P}(t)$ and $W_{x}(t)$ are computed deterministically. This partially stochastic approach is chosen because, for small systems, the stochastic time propagation leads to substantial statistical errors, which decrease only slowly with the number of stochastic states.60, 76

The ground state electronic structure was computed on a real-space grid for selected molecules (Table 1). In all cases, a grid of $64^{3}$ points with a spacing of $h=0.30~{}a_{0}$ yields results converged to 0.02 eV. Our calculations are performed only for valence electrons; we use LDA Trouiller-Martins pseudopotentials78 and kinetic energy cutoff of 28 $E_{h}$ .

As discussed in Sec. 3.2, the time propagations of $G_{0}$ and $\delta\rho$ are performed in discrete steps $\Delta t$ for a limited total simulation time $t_{max}$ . We use a time-grid spacing of 0.05 a.u.; $t_{max}$ was 100 a.u. This yields converged results with two exceptions: in $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{x}$ runs for ethylene and methane, $t_{max}=150$ a.u. was necessary due to a low-frequency features discussed later. Note that $\Delta t$ and $t_{max}$ are parameters of the calculation (similar to grid size and spacing), and hence they affect the statistical error only indirectly. The GF was sampled by $N_{\zeta}$ stochastic vectors (Table. 1); the value of $N_{\zeta}$ was converged so that the statistical errors are less than 0.05 meV. $N_{\zeta}$ varies strongly among different systems and methods.

The $G_{0}W_{0}$ results (Table. 1) are in excellent agreement with the calculations from Ref. 49 extrapolated to the complete basis-set limit. The mean absolute error (MAE) is 0.16 eV, which is a typical discrepancy between two distinct implementations (real-space and real-time versus atomic basis-set and frequency-domain).79 In all cases, $N_{\zeta}<5,000$ , which is slightly lower compared to the fully stochastic $G_{0}W_{0}$ based on a KS DFT starting point.60

Inclusion of vertex corrections shifts QPs to higher energies. $G_{0}W_{0}^{tc}$ usually provides higher estimates than $G_{0}W_{0}^{tc}\Gamma_{x}$ . To facilitate the comparison between our results and Ref. 49, we only consider differences between the $G_{0}W_{0}$ and the other methods. Our predictions are in excellent agreement with previous calculations yielding MAE of 0.04 and 0.07 eV for $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{x}$ .

Fig. 1 illustrates the self-energy of methanol calculated with the three distinct methods using the stochastic sampling of the GF. The spectral features in the self-energy (both real and imaginary part) are broadened due to a finite length of the time propagation. The present frequency resolution is, however, sufficient. In small systems, Eq. 26 requires self-energy at frequencies sufficiently distant from the poles of $\Sigma_{c}$ , where the curves are smooth and monotonic. We have increased the computational time by 50% and found that the QP energies change by $<0.02$ eV.

The vertex corrections tend to shift the spectral features to lower frequencies, as shown in Fig. 1. Hence, the variation of the self-energy is extended over longer time-scales in $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{x}$ . This explains why longer propagation times were needed in some calculations which included vertex (namely for ethylene and methane). Further, the vertex function in the screened Coulomb interaction tends to increase the amplitude of the frequency variation of $\Sigma_{c}$ . More stochastic samples are thus needed to converge the calculation to the same level of statistical error, which is also seen in the results in Table. 1.

4.2 Fully stochastic method

We now turn to the calculations of the QP energies using a fully stochastic approach. Treating HF exchange by stochastic sampling may require a very high number of stochastic vectors leading to a high computational cost. However, it is possible to decompose the long-range non-local exchange interactions in GKS DFT.58, 59, 76

Here, we apply the LC- $\omega$ PBE functional discussed in Sec. 3.1. Both the GF and the time-dependent potentials ( $W_{P}$ and $W_{x}$ in Eqs. 34 and 37) are sampled stochastically using multiple sets of stochastic vectors.

We calculate HOMO and LUMO QP energies of the molecules listed in Table 2. The results were converged with respect to the real-space grid; the number of grid points ( $N_{g}$ ) is specified in Table. 2 for each system. In all calculations, we used the grid spacing of $h=0.35~{}a_{0}$ . The ground state eigenvalues are converged with respect to the grid parameters to $<0.01$ eV. The QP energies are converged to within 0.03 eV. Fig. 2 shows that the differences in the self-energy for grids with $h=0.35$ and $h=0.30~{}a_{0}$ are small.

The range-separation parameter ( $\gamma$ ) was selected such that the ionization potential theorem of the neutral system is satisfied. Tuning $\gamma$ to enforce the ionization potential simultaneously for the neutral molecule and an anion tends to change the parameter negligibly. Indeed, the DFT results presented here are in agreement with Ref. 38 , in which the latter tuning approach was applied. The mean absolute deviation from Ref.38 is 0.10/0.06 eV for HOMO/LUMO.

We focus only on relatively large molecules that form stable anions since the goal is to test how different approximations treat both ionization potential and electron affinities. Further, we investigate how the computational cost scales with the system size. The convergence of the computed QP energies is described in the next section; the results of the three methods are compared afterward.

4.2.1 Convergence of stochastic errors

In the fully stochastic implementation, the statistical errors arise predominantly from the induced time-dependent potentials $W_{P}$ and $W_{x}$ in Eqs. 34 and 37. The time evolution is performed with $\Delta t=0.05$ a.u. and $t_{max}=50$ a.u. The propagation time is shorter than for small molecules with HF starting point because the dynamics with LC- $\omega$ PBE functional exhibits a faster time decay of the response. These values of $\Delta t$ and $t_{max}$ yield QP energies converged to better than $0.02$ eV. This is consistent with the previous stochastic calculations for $G_{0}W_{0}$ based on LDA starting point.60, 61 Increasing $t_{max}$ or decreasing $\Delta t$ affects neither the QP energies nor the self-energies, as illustrated in Fig. 2.

For a given set of time and real-space grid parameters, the QP energies exhibit stochastic fluctuations stemming from the random sampling vectors. In all three approximations, the following sets of stochastic orbitals are employed: (i) $\zeta$ for decomposition of the GKS Green’s function (Eq. 32); (ii) $\eta$ for decomposition of the induced density matrix (Eq. 41); (iii) $\chi$ for the decomposition of the exchange kernel $\nu^{\gamma}$ (Eq. 48); (iv) $\vartheta$ for decomposition of the density matrix in $\Sigma_{x}^{\gamma}$ (Eq. 49).

Three types of stochastic vectors are part of a “nested sampling”: There are $N_{\eta}\times N_{\chi}\times N_{\vartheta}$ states per each $\zeta$ vector. The overall error is thus governed mainly by the number of $N_{\zeta}$ samples, each having a stochastic fluctuation determined by $N_{\eta}\times N_{\chi}\times N_{\vartheta}$ . In practice, $N_{\zeta}$ is increased until the statistical error is below a predetermined threshold.

For all systems investigated, we take $N_{\eta}=15$ , which is consistent with previous calculations for acenes and C60.61 Additional tests for anthracene and C60 show that the same QP energies are obtained with $N_{\eta}=12$ (albeit with higher stochastic fluctuation per single $\zeta$ ). Fig 2 illustrates that doubling the number of stochastic vectors $\eta$ , i.e., $N_{\eta}=30$ , does not affect the results.

For the other two sampling vectors, it is sufficient to take $N_{\vartheta}=1$ and $N_{\chi}=20$ . Such low values are due to a small magnitude of the $\Sigma_{x}^{\gamma}$ term, which stems from a weak long-range exchange ( $\gamma\leq 0.23\,a_{0}^{-1}$ , see Table. 2). Fig. 2 shows that twice as high numbers of stochastic vectors $N_{\vartheta}$ and $N_{\chi}$ again does not change the results. The self-energy curves are almost identical; they differ by $<0.02$ eV at the QP energy, which is much less than the statistical error due to finite $N_{\zeta}$ and $N_{\eta}$ .

The overall error of the fully stochastic approach accumulates each of the contributions discussed above (i.e., real-space and time grids, and the numbers of stochastic vectors). Calculations for tetracene and hexacene LUMO states showed that the total (accumulated) error is $<0.04$ eV. The error was estimated by comparing the results in Table 3 with results for $h=0.30$ , $\Delta t=0.03$ a.u., $t_{max}=100$ a.u., $N_{\eta}=30$ , $N_{\vartheta}=2$ and $N_{\chi}=40$ . The error is relatively low due to mutual cancellation among the different contributions. This is consistent with previous calculations for molecules.60, 61

Finally, we compare the total stochastic error in the different approximations to $\Sigma_{c}$ . In this analysis, the target fluctuation, $\sigma(\varepsilon)$ , is 0.05 eV. The results are shown in Fig. 3. For large systems, the $G_{0}W_{0}$ calculations converge slower compared to the other approximations, yet the computational cost maintains linear scaling (with a steep slope of $\sim 15$ core hours per electron). Further, $N_{\zeta}$ rises with system size for the two largest molecules. In contrast, the costs of $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{X}$ depend much less on the system size and their computational time remains practically constant for systems between 100 and 300 valence electrons.

The distinct behavior of the stochastic $G_{0}W_{0}$ calculations is due to RPA applied in the time propagation. As discussed in Sec. 3.2.3, RPA assumes that $\Sigma_{x}^{\gamma}$ is time-independent, i.e., it is not constructed from time propagated states. Although the $\Sigma_{x}^{\gamma}$ term is sampled by distinct stochastic vectors $\vartheta$ at each time step, it leads to a strong stochastic noise. This random fluctuation is amplified with time (similar to the breakdown of stochastic BSE59). Tests for hexacene showed that taking $N_{\vartheta}=2$ leads to only a $\sim 1\%$ reduction of the fluctuation. Fig. 4 clearly illustrates the amplification of the stochastic error in $G_{0}W_{0}$ with $t_{max}$ . In contrast, $\sigma(\varepsilon)$ in $G_{0}W_{0}^{tc}$ calculation remains almost constant regardless of $t_{max}$ . In $G_{0}W_{0}^{tc}\Gamma_{X}$ calculations, the statistical error is higher, but it does not increase significantly with $t_{max}$ .

In general, the stochastic approach is aimed for large systems.8, 56, 60, 61 The calculation for tetracene requires the highest number of stochastic samples irrespective of the method chosen (Fig. 3). This behavior can be understood as follows: For systems smaller than tetracene, $N_{\eta}=15$ is relatively high compared to the number of occupied states (for instance, there are 33 occupied valence states in anthracene). As a result, the occupied subspace is sufficiently well sampled. For large systems, the stochastic approach exhibits strong self-averaging,8, 56, 61 which leads to a decrease of $N_{\zeta}$ required for target $\sigma(\varepsilon)$ , i.e., the computational cost decreases. Tetracene is found to be the “worst-case scenario” in which the stochastic sampling introduces relatively large errors, and there is only limited self-averaging.

Overall, the fully stochastic implementation of $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{X}$ is efficient and numerically stable, while $G_{0}W_{0}$ suffers from stochastic fluctuations. The total computational time of the beyond-RPA-methods depends only weakly on the system size (Fig. 3). For large systems, the more involved expressions for the self-energy are less expensive than their $G_{0}W_{0}$ counterpart. The low cost of stochastic beyond- $GW$ calculations is in striking contrast to their conventional (deterministic) implementations.

4.2.2 Performance of the approximations to $\Sigma_{c}$

We will now address how distinct approximations to $\Sigma_{c}$ affect predictions of the HOMO and LUMO energies. Here, we will report results of stochastic calculations with $N_{\zeta}=1500$ , $N_{\vartheta}=1$ , $N_{\chi}=20$ , and $N_{\eta}=15$ . The DFT starting point (optimally tuned LC- $\omega$ PBE) is already in a good agreement with the reference values, as shown in Table 3; yet, the HOMO/LUMO energies are consistently over/underestimated. The mean absolute error (MAE) of the DFT reference point for acenes is 0.17 eV for the HOMO energies and 0.28 eV for the LUMO energies.

The $G_{0}W_{0}$ approximation makes the HOMO energies more negative (Table 3). As a result, the ionization potentials (negative of the HOMO energy) are significantly improved, and the corresponding MAE is only 0.07 eV. The $G_{0}W_{0}$ performance is, however, very different for LUMO. Here, the correction is too large, and the QP energies are thus much higher than the reference values leading to MAE of 0.28 eV. If we exclude the experimental reference data for C60 and PC60BM, we get MAE of 0.04 eV and 0.39 eV for HOMO and LUMO, which are in agreement with an earlier benchmark for acenes.38

This is a disappointing result, because a more advanced computational technique ( $G_{0}W_{0}$ ), which is aimed to improve upon DFT, yields worse results than the DFT itself. Further, the stochastic implementation of $G_{0}W_{0}$ on top of hybrid functionals is numerically expensive for large systems due to numerical instabilities discussed earlier. Note that such instabilities were not observed in previous stochastic $G_{0}W_{0}$ calculations based on LDA starting point.

The $G_{0}W_{0}^{tc}$ approximation is computationally stable and, similar to $G_{0}W_{0}$ , it provides good ionization potentials; MAE for HOMO is 0.08 eV. In the systems selected, the presence of the vertex corrections thus does not have a pronounced effect on the occupied states. However, the method amplifies the problems for unoccupied states. The affinities (negative of the LUMO energy) are predicted to be significantly larger than the reference values, leading to MAE of 0.44 eV which is worse than in $G_{0}W_{0}$ . If experimental data for C60 and PC60BM are excluded from the analysis, $G_{0}W_{0}^{tc}$ leads to even larger errors for LUMO (0.55 eV). This failure for unoccupied states indicates that abandoning RPA has detrimental effects on unoccupied QP states.

Finally, we turn to the analysis of $G_{0}W_{0}^{tc}\Gamma_{X}$ predictions. The presence of the non-local exchange interaction in Eq. 23 has a significant impact on the QP energies. The HOMO states are higher in energy, leading to MAE of 0.19 eV for acenes. This error is significantly worse than $G_{0}W_{0}$ but similar to the DFT results.

In contrast to the other approximations tested here, $G_{0}W_{0}^{tc}\Gamma_{X}$ self-energies are of the unoccupied states are qualitatively different. Fig. 5 illustrates that inclusion of the vertex changes the self-energy dramatically for the LUMO state (while for HOMO it is similar to the $G_{0}W_{0}$ and $G_{0}W_{0}^{tc}$ results). This observation is consistent with previous theoretical results based on local vertex corrections that incorporate derivative discontinuity.41, 42 Here such discontinuity is included via the time-dependent induced exchange potential $W_{x}$ in the $G_{0}W_{0}^{tc}\Gamma_{X}$ self-energy (Eq. 37). The vertex correction thus acts similar to exchange in the static ground-state calculations and shifts unoccupied states higher in energy. In all cases tested, the QP energy of the unoccupied states are significantly increased and are higher than the reference values. Note that all other methods, the LUMO QP energies are too low. For the acenes molecules, we find MAE of 0.26 eV; hence, the perturbation theory slightly improves upon the DFT starting point.

Based on the results for acenes LUMO energies, $G_{0}W_{0}^{tc}\Gamma_{X}$ appears to be more successful than self-consistent $GW$ methods benchmarked in Ref. 38. If eigenvalue self-consistent $GW$ is employed and both the GF and the screened Coulomb interactions are updated, the predictions yield MAE of 0.41 eV. For the same subset of systems, $G_{0}W_{0}^{tc}\Gamma_{X}$ yields MAE that is $\sim 50\%$ smaller. In an alternative scheme, the self-consistency is applied only to the GF; however, for LUMO states the MAE increases to 0.44 eV, i.e., the deviation is again higher than for $G_{0}W_{0}^{tc}\Gamma_{X}$ results.

5 Conclusions and Outlook

In this work, we presented an efficient way for improving predictions of QP energies beyond the popular $GW$ approach using stochastic paradigm. In practice, this improvement amounts to the inclusion of nontrivial parts of the vertex function in the correlation self-energy ( $\Sigma_{c}$ ). Here, an approximate non-local vertex correction was derived from the functional derivative of the exchange self-energy. In principle, the self-energy should be found by a self-consistent set of expressions. In practice, we employ only a one-shot correction scheme on top of a mean-field starting point, which includes a non-local long-range exchange. This approach is labeled $G_{0}W_{0}^{tc}\Gamma_{X}$ , and it is compared to (stochastic) $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}$ methods.

A new stochastic formulation reduces the overall computational cost considerably. In contrast to the previous implementation of stochastic $G_{0}W_{0}$ , it is possible to use Hartree-Fock and generalized Kohn-Sham starting points.

The real-time formulation of all three methods was verified against previous results for small molecules computed in the frequency domain. Note that the time-domain version of $G_{0}W_{0}^{tc}\Gamma_{X}$ is entirely new. The Hartree-Fock starting point is used together with a stochastic sampling of the Green’s function. The calculations are in excellent agreement with the reference values.

For larger systems, we present a fully stochastic approach in which two additional sets of stochastic vectors sample the non-local vertex. For the large molecules investigated, we employed DFT with a long-range non-local exchange interaction. Such functional form is efficiently sampled even with a small number of stochastic vectors. The, otherwise extremely involved, beyond $GW$ calculations can thus be performed for large molecules at a low cost. While deterministic implementations scale as $N_{e}^{4}$ for $GW$ and $N_{e}^{6}$ for $GW\Gamma$ (where $N_{e}$ is the number of electrons), the stochastic formulation scales (sub)linearly. In fact, we found that RPA is numerically unstable; its statistical error worsens with the system size and the simulation time. In contrast, more difficult $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{X}$ calculations are stable and, paradoxically, computationally less expensive than $G_{0}W_{0}$ .

The three stochastic approximations were tested on a set of acene molecules, C60 and PC60BM. The computational costs of $G_{0}W_{0}^{tc}$ and $G_{0}W_{0}^{tc}\Gamma_{X}$ were practically constant with the system size. The overall computational time required to converge $G_{0}W_{0}^{tc}\Gamma_{X}$ QP energies was on average $\sim 50\%$ higher than in $G_{0}W_{0}^{tc}$ due to increased statistical fluctuation. The overall sublinear scaling is due to the rapid convergence of the statistical errors with the number of electrons. Hence, stochastic algorithms will be a method of choice for demanding beyond- $GW$ calculations or, at least, for efficient implementation of non-local vertex functions.

While DFT with optimally-tuned range-separated hybrid functionals (LC- $\omega$ PBE) provides a good starting point, some deviation from reference data is observed. One-shot $G_{0}W_{0}$ and $G_{0}W^{tc}_{0}$ improve the description of the ionization potentials compared to LC- $\omega$ PBE, but severely increase errors for electron affinities. On average, both methods perform worse than DFT; the worst performance is observed for the $G_{0}W^{tc}_{0}$ approach.

The $G_{0}W_{0}^{tc}\Gamma_{X}$ performs worse than DFT for the occupied states, but it improves the description of unoccupied states. While all other approaches underestimate the LUMO QP energies, $G_{0}W_{0}^{tc}\Gamma_{X}$ predicts them to be higher than the reference values. The energy increase of the unoccupied QP states is due to the vertex correction based on time-dependent induced non-local exchange potential. For the set of molecules considered, $G_{0}W_{0}^{tc}\Gamma_{X}$ is the only method that outperforms DFT for the electron affinities.

Previous calculations which included approximate vertex corrections were mostly applied to ionization potentials of small molecules that do not form stable anions. The accuracy of predicted electron affinities is another major and more sensitive indicator for performance assessment.

Together, these findings indicate that beyond $GW$ schemes are crucial for an improved description of QP energies. As shown here, stochastic techniques make such calculations affordable even for large systems. Future steps are directed toward the formulation of better self-energy expressions that include higher-order interactions (beyond $G_{0}W_{0}^{tc}\Gamma_{X}$ ) to improve the prediction of QP energies further. In particular, the vertex terms stemming from the time-dependent induced correlation potential will be studied. Investigations of charge transfer systems with strong electron hole-interactions are underway.

{acknowledgement}

The calculations were performed as part of the XSEDE86 Project No. TG-CHE180051. Use was made of computational facilities purchased with funds from the National Science Foundation (CNS-1725797) and administered by the Center for Scientific Computing (CSC). The CSC is supported by the California NanoSystems Institute and the Materials Research Science and Engineering Center (MRSEC; NSF DMR 1720256) at UC Santa Barbara.

Bibliography86

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Goedecker 1999 Goedecker, S. Linear scaling electronic structure methods. Rev. Mod. Phys. 1999 , 71 , 1085
2Goedecker and Scuseria 2003 Goedecker, S.; Scuseria, G. Linear scaling electronic structure methods in chemistry and physics. Comput. Sci. Eng. 2003 , 5 , 14–21
3Skylaris et al. 2005 Skylaris, C.-K.; Haynes, P. D.; Mostofi, A. A.; Payne, M. C. Introducing ONETEP: Linear-scaling density functional simulations on parallel computers. J. Chem. Phys. 2005 , 122 , 084119
4Vande Vondele et al. 2005 Vande Vondele, J.; Krack, M.; Mohamed, F.; Parrinello, M.; Chassaing, T.; Hutter, J. Quickstep: Fast and accurate density functional calculations using a mixed Gaussian and plane waves approach. Comput. Phys. Commun. 2005 , 167 , 103–128
5Zhou et al. 2006 Zhou, Y.; Saad, Y.; Tiago, M. L.; Chelikowsky, J. R. Self-consistent-field calculations using Chebyshev-filtered subspace iteration. J. Comput. Phys. 2006 , 219 , 172–184
6Neese et al. 2009 Neese, F.; Wennmohs, F.; Hansen, A.; Becker, U. Efficient, approximate and parallel Hartree–Fock and hybrid DFT calculations. A ‘chain-of-spheres’ algorithm for the Hartree–Fock exchange. Chem. Phys. 2009 , 356 , 98–109
7Saad et al. 2010 Saad, Y.; Chelikowsky, J. R.; Shontz, S. M. Numerical methods for electronic structure calculations of materials. SIAM Rev. 2010 , 52 , 3–54
8Baer et al. 2013 Baer, R.; Neuhauser, D.; Rabani, E. Self-averaging stochastic Kohn-Sham density-functional theory. Phys. Rev. Lett. 2013 , 111 , 106402

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Stochastic vertex corrections: linear scaling methods for accurate quasiparticle energies

Abstract

1 Introduction

2 Theory

2.1 Self-energy

2.1.1 Σxc\Sigma_{xc}Σxc​ with the vertex function

2.1.2 Σxc\Sigma_{xc}Σxc​ with generalized polarizability

2.2 Approximate Self-energy

2.2.1 The GWGWGW approximation

2.2.2 GWΓXGW\Gamma_{X}GWΓX​ approximation

3 Computational methodology

3.1 Mean-field starting points

3.2 Stochastic approach

3.2.1 Stochastic sampling of Green’s functions and self-energies

3.2.2 Stochastic induced time-dependent density and density matrix

3.2.3 Time propagation and stochastic decomposition of the long-range exchange – RPA and TDDFT response

4 Results

4.1 Verification of the time-dependent formulation

4.2 Fully stochastic method

4.2.1 Convergence of stochastic errors

4.2.2 Performance of the approximations to Σc\Sigma_{c}Σc​

5 Conclusions and Outlook

2.1.1 $\Sigma_{xc}$ with the vertex function

2.1.2 $\Sigma_{xc}$ with generalized polarizability

2.2.1 The $GW$ approximation

2.2.2 $GW\Gamma_{X}$ approximation

4.2.2 Performance of the approximations to $\Sigma_{c}$