Analytical Gradients for Projection-Based Wavefunction-in-DFT Embedding
Sebastian J. R. Lee, Feizhi Ding, Frederick R. Manby, Thomas F., Miller III

TL;DR
This paper derives and demonstrates analytical nuclear gradients for projection-based wavefunction-in-DFT embedding, enabling accurate and efficient calculations of chemical systems with combined wavefunction and DFT methods.
Contribution
It introduces a general formulation for analytical gradients in WF-in-DFT embedding, compatible with various WF and DFT methods, and simplifies their evaluation.
Findings
Gradients formulated in the Lagrangian framework with enforced constraints.
WF contributions to gradients can be computed using existing WF gradient codes.
Application to a cobalt catalyst reveals significant differences from pure DFT results.
Abstract
Projection-based embedding provides a simple, robust, and accurate approach for describing a small part of a chemical system at the level of a correlated wavefunction method while the remainder of the system is described at the level of density functional theory. Here, we present the derivation, implementation, and numerical demonstration of analytical nuclear gradients for projection-based wavefunction-in-density functional theory (WF-in-DFT) embedding. The gradients are formulated in the Lagrangian framework to enforce orthogonality, localization, and Brillouin constraints on the molecular orbitals. An important aspect of the gradient theory is that WF contributions to the total WF-in-DFT gradient can be simply evaluated using existing WF gradient implementations without modification. Another simplifying aspect is that Kohn-Sham (KS) DFT contributions to the projection-based embedding…
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4| Method | MAE (hartree/bohr) |
|---|---|
| HF | |
| HF-in-HF | |
| LDA | |
| LDA-in-LDA | |
| HF-in-LDA | |
| MP2-in-LDA | |
| CCSD-in-LDA | |
| CCSD(T)-in-LDA | |
| CCSD-in-LDA (AO) | |
| CCSD(T)-in-LDA (AO) | |
| CCSD(T)-in-PBE0 | |
| CCSD(T)-in-PBE0 (AO) |
| Method | r(O-H) | OH | r(-) | r(-O) |
|---|---|---|---|---|
| LDA/6-31G | 0.988 | 110.4 | 1.503 | 1.439 |
| CCSD-in-LDA/6-31G | 0.979 | 110.7 | 1.506 | 1.435 |
| CCSD/6-31G | 0.979 | 110.6 | 1.532 | 1.475 |
| LDAX | CCSD-in-LDAX | |||
| Sub A | r(Co-N1) | 1.836 | 1.846 | 0.010 |
| r(Co-N2) | 1.893 | 1.883 | 0.010 | |
| r(Co-N3) | 1.932 | 1.951 | 0.019 | |
| r(Co-N4) | 1.900 | 1.926 | 0.026 | |
| r(Co-N5) | 1.978 | 2.026 | 0.048 | |
| Boundary | r(N5-O1) | 1.317 | 1.355 | 0.038 |
| r(N5-O2) | 1.262 | 1.301 | 0.039 | |
| r(C1-N5) | 1.131 | 1.150 | 0.019 | |
| r(N2-C2) | 1.430 | 1.458 | 0.028 | |
| r(N3-C3) | 1.428 | 1.458 | 0.030 | |
| Sub B | ||||
| r(O1-H) | 1.025 | 1.030 | 0.005 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Analytical Gradients for Projection-Based Wavefunction-in-DFT Embedding
Sebastian J. R. Lee
Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
Feizhi Ding
Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
Frederick R. Manby
Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
Thomas F. Miller III
Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
Abstract
Projection-based embedding provides a simple, robust, and accurate approach for describing a small part of a chemical system at the level of a correlated wavefunction method while the remainder of the system is described at the level of density functional theory. Here, we present the derivation, implementation, and numerical demonstration of analytical nuclear gradients for projection-based wavefunction-in-density functional theory (WF-in-DFT) embedding. The gradients are formulated in the Lagrangian framework to enforce orthogonality, localization, and Brillouin constraints on the molecular orbitals. An important aspect of the gradient theory is that WF contributions to the total WF-in-DFT gradient can be simply evaluated using existing WF gradient implementations without modification. Another simplifying aspect is that Kohn-Sham (KS) DFT contributions to the projection-based embedding gradient do not require knowledge of the WF calculation beyond the relaxed WF density. Projection-based WF-in-DFT embedding gradients are thus easily generalized to any combination of WF and KS-DFT methods. We provide numerical demonstration of the method for several applications, including calculation of a minimum energy pathway for a hydride transfer in a cobalt-based molecular catalyst using the nudged-elastic-band method at the CCSD-in-DFT level of theory, which reveals large differences from the transition state geometry predicted using DFT.
I Introduction
The theoretical description of many chemical processes demands accurate, ab initio electronic structure theories. However, the study of complex reactive processes, including those arising in inorganic and enzyme catalysis, gives rise to the need for a compromise between accuracy and the ability to complete the computation in a reasonable amount of time. For systems in which the complicated chemical rearrangements (e.g. bond breaking and forming) occurs in a local spatial region, an effective strategy for balancing accuracy and computational cost is to employ one of various multiscale embedding strategies Kitaura et al. (1999); Deev and Collins (2005); Collins and Deev (2006); Fedorov and Kitaura (2007); Elliott et al. (2009); Goodpaster et al. (2010); Huang and Carter (2011); Goodpaster, Barnes, and Miller III (2011); Manby et al. (2012); Goodpaster et al. (2012); Gordon et al. (2012); Knizia and Chan (2012); Barnes et al. (2013); Goodpaster et al. (2014); Neuhauser, Baer, and Rabani (2014); Barnes et al. (2015); Fornace et al. (2015); Bennie et al. (2015); Stella, Bennie, and Manby (2015); Huo et al. (2016); Bennie et al. (2016); Pennifold et al. (2017); Zhang et al. (2018); Chapovetsky et al. (2018); Mühlbach and Reiher (2018). Generally, embedding methodologies hinge on the condition that a system can be efficiently partitioned into a local subsystem that demands a high-level treatment and an environment that can be treated with a lower (and computationally less expensive) level of theory.
The current paper focuses on projection-based embedding, Manby et al. (2012); Lee et al. (2019) a DFT-based embedding theory in which subsystem partitioning is performed in terms of localized Kohn-Sham (KS) molecular orbitals (LMOs). The method describes subsystem interactions at the level of KS and allows for the partitioning of the subsystems across covalent and even conjugated bonds, and it enables the use of relatively small subsystem sizes for an embedded WF description. A recent review of projection-based WF-in-DFT embedding is available in Ref. 26.
Projection-based embedding has proven to be a useful tool in a wide range of chemical contexts including transition-metal complexes Stella, Bennie, and Manby (2015); Huo et al. (2016); Chapovetsky et al. (2018); Welborn, Manby, and Miller III (2018), protein active sites Bennie et al. (2016); Zhang et al. (2018), excited states de Lima Batista, de Oliveira-Filho, and Galembeck (2017); Bennie et al. (2017); Chen and Franklin Goldsmith (2019) and condensed phase systems Barnes et al. (2015), among others Parrish et al. (2015); Libisch et al. (2017); Yao et al. (2017); Meitei and Heßelmann (2017); Chulhai and Goodpaster (2018); Lin and Zepeda-Núñez (2018); Böckers and Neugebauer (2018). The development of analytical nuclear gradients for projection-based embedding will expand its applicability to include geometry optimization, transition state searches, and potentially ab initio molecular dynamics. Analytical nuclear gradients already exist for a number of other embedding methodologies, including the incremental molecular fragmentation method Hesselmann and Meitei (2018), fragment molecular orbital method Nagata, Fedorov, and Kitaura (2012); Nakata et al. (2013); Brorsen et al. (2014), quantum mechanics/molecular mechanics (QM/MM) Warshel and Levitt (1976); Field, Bash, and Karplus (1990); Lin and Truhlar (2007), ONIOM Dapprich et al. (1999); Hratchian et al. (2008); Mayhall, Raghavachari, and Hratchian (2010); Hratchian et al. (2011), embedded mean-field theory (EMFT) Fornace et al. (2015); Manby et al. (2019), frozen density embedding Dułak, Kamiński, and Wesołowski (2007); Heuser and Höfener (2016, 2017); Heuser and Hoefener (2018), and subsystem DFT Kovyrshin and Neugebauer (2016); Schlüns et al. (2017); Klahr, Schlüns, and Neugebauer (2018). However, the projection-based approach provides a number of advantages for WF-in-DFT embedding calculations and leads to a distinct analytical gradient theory, which we derive and numerically demonstrate in several applications.
In section II.1 we outline projection-based WF-in-DFT embedding and in section II.2 we provide the derivation of its analytical nuclear gradients. Section IV numerically validates the analytical nuclear gradient theory and its implementation in MolproWerner et al. (2019) via comparison with finite difference calculations, as well as presenting results for optimizing geometries in benchmark systems and the calculation of a minimum energy profile for an organometallic reaction using the nudged-elastic-band (NEB) method. We additionally provide the analytical nuclear gradient theory for WF-in-DFT embedding with atomic orbital (AO) truncation Bennie et al. (2015) in Appendices C and D.
II Projection-based Embedding Analytical Nuclear Gradients
II.1 Projection-based Embedding Energy Theory
Projection-based WF-in-DFT embedding relies on the partitioning the LMOs of a system into two subsystems. Subsystem A contains the LMOs that are treated using the WF method and subsystem B contains the remaining LMOs that are treated using KS. This WF-in-DFT procedure is accomplished by first performing a KS calculation on the full system to obtain a set of KS MOs. The occupied KS MOs are then localized and partitioned into subsystems A and B. Finally, subsystem A is treated using the WF method in the presence of the embedding potential created by the frozen LMOs of subsystem B. Note that the cost of the KS calculation on the full system is typically negligible in comparison to the subsystem WF calculation. This results in our working equation for projection-based WF-in-DFT embedding, Manby et al. (2012)
[TABLE]
where and are the WF and energy of subsystem A, is the subsystem A one-particle reduced density matrix that corresponds to , is the KS energy, and and are respectively the KS subsystem A and B one-particle densities that equal the full system KS density, , when summed together. Throughout, we shall use a tilde to indicate quantities that have been calculated using the WF method. The embedding potential, , is defined as
[TABLE]
where includes all KS two-electron terms,
[TABLE]
and where , , and label atomic orbital basis functions, are the two-electron repulsion integrals, is the fraction of exact exchange and is the exchange-correlation (XC) potential matrix. The level-shift operator, , is given by
[TABLE]
where is the overlap matrix. In the limit of , the LMOs that make up subsystems A and B are enforced to be exactly orthogonal, eliminating the non-additive kinetic energy present in other embedding frameworks Wesolowski and Warshel (1993); Götz, Beyhan, and Visscher (2009). In practice, finite values of in the range of hartree to hartree are found to provide accurate results regardless of chemical system.Manby et al. (2012) If greater accuracy is needed, a perturbative correction outlined in Ref. (9) can be added to the WF-in-DFT energy expression to account for the finiteness of , but in practice, this correction is found to contribute negligibly to the total energy and is thus neglected here.
Projection-based embedding can also be used for DFT-in-DFT embedding via a simplified version of Eq. 1. The working equation for projection-based DFT-in-DFT embedding isManby et al. (2012)
[TABLE]
The only differences between WF-in-DFT and DFT-in-DFT embedding is that the first term on the RHS of Eq. 1 is replaced with the KS energy on subsystem A, , and in the second and last terms is reduced to the subsystem A KS density matrix, .
II.2 Projection-based WF-in-DFT Embedding Gradient Theory
Since projection-based embedding is a non-variational theory, its analytical gradient is conveniently derived using a Lagrangian approach. We first construct a Lagrangian based on the projection-based WF-in-DFT energy. We then minimize the Lagrangian with respect to the variational parameters in the embedding energy, which include the subsystem A WF and the LMO coefficients. Then we show how to solve for each of the Lagrange multipliers and provide the working equation for the gradient of the total energy.
For consistency in notation, the MO coefficient matrix refers to the entire set of KS MOs (occupied and virtual). The submatrix of that refers to the (occupied) LMOs is denoted as with column indices . The submatrix of that refers to the canonical virtual space is denoted as with column indices . The indices are used to index generic molecular orbitals.
II.3 Total Energy Lagrangian
We now derive the total energy Lagrangian for projection-based WF-in-DFT embedding. Where appropriate we will provide WF method specific examples (e.g. MP2) of general terms outlined in the equations. The WF-in-DFT Lagrangian is
[TABLE]
The first term on the right hand side (RHS) of Eq. 6 is the projection-based WF-in-DFT embedding energy described by Eq. 1. The second term on the RHS of Eq. 6 contains any constraints, , and the corresponding Lagrange multipliers, , that arise from ensuring that the Lagrangian is variational with respect to parameters in the WF method. The third term on the RHS constrains the KS MOs, , to be orthonormal, which accounts for the basis set being atom centered; this term is commonly referred to as the Pulay force Pulay (1969) and arises from the atomic orbital basis set being atom centered. The localization conditions, , take into account how the KS MOs are localized before being selected for subsystems A and B. This is important because the LMOs will have a different dependence on nuclear perturbation than canonical MOs. In this work, we use Pipek-Mezey localization Pipek and Mezey (1989) to obtain LMOs. Generalization to other localization methods (e.g. Boys Foster and Boys (1960) and intrinsic bond orbitalsKnizia (2013)) is straightforward. The localization conditions for Pipek-Mezey are
[TABLE]
where corresponds to an atom in the molecule. The matrices are defined as
[TABLE]
where the summation over is restricted to basis functions at atom . The Brillouin conditions, , reflect how the KS MOs are optimized before being used to construct subsystems A and B. The Brillouin conditions are only needed because subsystem B is frozen at the KS level of theory. However, due to the non-additivity of the XC potential, the Lagrange multipliers, , span the full virtual-occupied space.
The type and number of constraints applied to the WF method depend on the chosen method. For example, if the WF method is MP2 then the constraints are
[TABLE]
where the first term on the RHS of Eq. 9 constrains the Hartree-Fock MOs, , to be orthonormal, the condition restricts the sum to occupied MOs in subsystem A, and the second term on the RHS are the Brillouin conditions using the embedded Fock matrix, . The embedded Fock matrix is defined asManby et al. (2012)
[TABLE]
where is the standard one-electron Hamiltonian, includes all of the usual HF two-electron terms and is the subsystem A HF one-particle density. These constraints also arise in the derivation of the MP2 analytical nuclear gradient Schütz et al. (2004).
For the projection-based WF-in-DFT energy to equal the Lagrangian, the Lagrangian must be minimized with respect to all of its parameters, including , , and all of the Lagrange multipliers.
II.4 Minimizing the Lagrangian with respect to the variational parameters of the WF method
Upon minimizing the WF-in-DFT Lagrangian with respect to , only terms associated with the first two terms on the RHS of Eq. 6 survive, all of which are familiar from the WF Lagrangian for the corresponding WF gradient theories.
[TABLE]
Since the embedding potential is independent of , the Z-vector coupled perturbed Hartree-Fock (Z-CPHF) equations of any post-HF method are only impacted through the eigenvalues of the subsystem A HF WF. Therefore, the solutions for the WF Lagrange multipliers (e.g. and for MP2 in Eq. 9) are obtained using the standard implementation of the WF gradient no matter what KS method is selected to describe subsystem B. However, if an alternative embedding potential is used that depends on the subsystem A WF, such as the Huzinaga constraint (i.e. Ref. (65)), then the formulation of the WF gradient is changed; the Z-CPHF equations for a general WF method would need to be modified to include the contributions from the derivative of the embedding potential with respect to the subsystem A WF, .
II.5 Minimizing the Lagrangian with respect to the MO coefficients
The remaining Lagrange multipliers, , , and in Eq. 6, are determined by minimizing the WF-in-DFT Lagrangian with respect to the variational parameters of the KS method, namely the MO coefficients, . Differentiation of the Lagrangian with respect to these parameters yields
[TABLE]
where
[TABLE]
[TABLE]
[TABLE]
and
[TABLE]
The 4-dimensional tensors, and , are expanded in Appendices A and B, respectively, corresponds to , and includes all two-electron terms of the generalized Fock matrix and is shown explicitly in Appendix B. Since the embedded Fock matrix, , contains the embedding potential, , its derivative with respect to the MO coefficients, , is nonzero resulting in the WF relaxed density being needed to construct in Eq. 13, which is explicitly shown in Appendix B. Therefore, the subsystem A WF gradient only affects the embedding contributions to the gradient through the WF relaxed density.
We now show that solving for the Lagrange multipliers leads to familiar coupled perturbed equations. Combining the stationary conditions described by Eq. 12 with the auxiliary conditions yields the linear Z-vector equationsSchütz et al. (2004)
[TABLE]
where permutes the indices and , which is used to solve for and . The matrix is then obtained as
[TABLE]
The Lagrange multipliers pertain to the occupied-occupied MO space; considering only the occupied-occupied part of Eq. 17 yields
[TABLE]
Using the Brillouin conditions and the knowledge that , Eq. 19 can be further simplified by showing that
[TABLE]
The solutions, , are thus independent of , such that Eq. 19 reduces to
[TABLE]
These are the Z-vector coupled perturbed localization (Z-CPL) equations, which are used to solve for . Subsequently, can be computed according to Eq. 14.
The Lagrange multipliers pertain to the virtual-occupied MO space; considering only the virtual-occupied part of Eq. 17 yields
[TABLE]
which further simplifies to
[TABLE]
These are the Z-vector coupled perturbed Kohn-Sham (Z-CPKS) equations. Having solved the Z-CPL and Z-CPKS equations, the remaining Lagrangian multipliers associated with the orthogonality constraints, , can be obtained from Eq. 18.
II.6 Gradient of the Total Energy
Once the Lagrangian is minimized with respect to all variational parameters, the gradient of the total energy takes the form
[TABLE]
Since the Lagrangian is minimized with respect to the subsystem A WF and the KS LMO coefficients, calculation of the WF and KS LMO responses to nuclear perturbation, and respectively, are avoided. This yields the following expression for the gradient,
[TABLE]
where the superscript denotes the explicit derivative of the quantity with respect to a nuclear coordinate. Eq. 25 can be further simplified by folding into the first three terms on the RHS of Eq. 25, yielding
[TABLE]
Here, denotes the total derivative of the subsystem A WF energy with respect to nuclear coordinate, which can be directly calculated using existing WF gradient implementations, and is the WF-relaxed density for subsystem A. For example, the MP2-relaxed density is
[TABLE]
which contains the subsystem A Hartree-Fock density, , the MP2 density matrix, , and the solutions of the subsystem A Brillouin conditions, . Eq. 26 can be expressed in terms of the WF gradient on subsystem A and the derivative AO integrals, yielding our final expression for the projection-based WF-in-DFT analytical gradient,
[TABLE]
The effective one-particle density and effective two-particle density are defined
[TABLE]
and
[TABLE]
The effective one-particle densities and are defined
[TABLE]
and
[TABLE]
The matrix is defined
[TABLE]
where
[TABLE]
The second term on the RHS of Eq. 34 is expanded in Appendix A.
The analytical nuclear gradient expression for projection-based DFT-in-DFT closely follows that for WF-in-DFT, with regard to evaluation of both the Lagrange multipliers (Eq. 12) and the final gradient (Eq. 28). To obtain the corresponding DFT-in-DFT expressions, becomes the subsystem A KS density
[TABLE]
which affects the evaluation of in Eq. 12 (expanded in Eq. 39) and the evaluation of the final gradient expression, Eq. 28. Additionally, the first term on the RHS of the final gradient expression, Eq. 28, is replaced with the subsystem A KS gradient, .
III Computational Details
The implementation of projection-based WF-in-DFT embedding gradients is available in the 2019 general release of Molpro Werner et al. (2019). In all embedding calculations reported here, unless otherwise specified, the Pipek-Mezey localization method Pipek and Mezey (1989) is used with the core and occupied MOs localized together. The subsystem A region is chosen by including any LMOs with a net Mulliken population larger than 0.4 on the atoms associated with subsystem A, although more sophisticated partitioning algorithms have been introduced.Welborn, Manby, and Miller III (2018) A level-shift parameter of hartree is used for all embedding calculations. The perturbative correction to using a finite value of in Eq. 5 is less than 20 microhartrees for the applications presented here and thus not included (accomplished by specifying the option ). Throughout this work, all embedding calculations are described using the nomenclature “(WF method)-in-DFT/basis,” where the WF method describes subsystem A and the KS method describes subsystem B. For some embedding calculations a mixed-basis set is used and is denoted by “(WF method)-in-DFT/large-basis:small-basis,” where the large basis is used to describe subsystem A and the small basis is used to describe subsystem B.
All SCF calculations employ a tighter threshold than default for MO convergence by specifying the option a.u. in Molpro. All KS calculations used in projection-based embedding are done without density fitting, employing the local-density approximation (LDA) Hohenberg and Kohn (1964); Vosko, Wilk, and Nusair (1980), Perdew-Burke-Ernzerhof (PBE) Perdew, Burke, and Ernzerhof (1996), PBE0 Adamo and Barone (1999), and LDAX functionals with the def2-TZVPP, def2-SVP, def2-ASVP, Weigend and Ahlrichs (2005); Weigend (2006) cc-pVDZ, Dunning (1989) and 6-31G Hehre, Ditchfield, and Pople (1972) basis sets. Note that the def2-ASVP basis set used in Molpro is constructed by adding one set of even tempered diffuse functions to the def2-SVP basis set. The LDAX functional is constructed by including exact exchange and reducing the weight of the DIRAC functional to in the LDA functional. For the calculations in sections IV.1 and IV.2.2, the XC functional is evaluated on a fixed-pruned grid with index 7 (Ref. 74). For the optimized geometries shown in section IV.2, and the malondialdehyde calculations in section IV.3 the XC functional is evaluated on an adaptively generated quadrature grid that reproduces the energy of the Slater-Dirac functional to a specified threshold accuracy of . All WF calculations are performed with the frozen-core approximation, without density fitting, employing the MP2 Møller and Plesset (1934), coupled-cluster singles and doubles (CCSD) Scuseria, Janssen, and Schaefer (1988); Purvis and Bartlett (1982), and coupled-cluster singles, doubles, and perturbative triples [CCSD(T)] Raghavachari et al. (1989) correlation treatments with the def2-TZVPP, def2-SVP, cc-pVDZ and 6-31G basis sets. Even though the density fitting approximation is not used for the WF methods in this study, density fitted gradients are available for the aforementioned WF methods. Bozkaya and Sherrill (2016, 2017) The default values for integral screening were used in Molpro. For all Z-CPKS calculations an iterative subspace solver employing the Davidson algorithm Davidson (1975); Kauczor, Jørgensen, and Norman (2011) is used with a convergence threshold of a.u. For all Z-CPHF calculations needed for the subsystem A WF gradient an iterative solver with a convergence threshold of a.u. is used. Grid weight derivatives are included for all gradient calculations involving the XC functional and potential.
For all geometry optimizations the number of LMOs in subsystem A is kept unchanged throughout the optimization. A natural way of enforcing this in future work is to employ even-handed partitioning,Welborn, Manby, and Miller III (2018) although this was not needed in the examples studied here; the default procedure based on net Mulliken population sufficed to keep subsystem A unchanged. All geometries are optimized using the translation-rotation-internal coordinate system devised by Wang and Song, Wang and Song (2016) which is available in the GeomeTRIC package.Wang (2019) Convergence parameters for the geometry optimizations follow the default parameters used by Molpro, namely that the maximum gradient value becomes less than hartree/bohr and the energy change between adjacent steps becomes less than hartree or the maximum component of the step displacement becomes less than bohr. The maximum gradient value is evaluated in the Cartesian basis. All geometries are provided in the supporting information.
Nudged elastic band (NEB) Henkelman and Jónsson (2000) calculations are run using the implementation of the method in the atomic simulation environment (ASE) package Larsen et al. (2017). All NEB calculations use Molpro forces which are provided through a Molpro calculator interface within the ASE package.
The intramolecular proton transfer of malondialdehyde is modeled with an NEB consisting of 15 images connected by springs with spring constants of eV/Å2. The CCSD/def2-aSVP, CCSD-in-LDA/def2-aSVP and LDA/def2-aSVP optimized NEBs used the image dependent pair potential (IDPP) method Smidstrup et al. (2014) as the initial guess for the band with reactant and product geometries previously optimized at the corresponding level of theory. All NEB calculations for malondialdehyde are converged with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update of the Hessian and by enforcing that the maximum gradient value is less than eV/Å2.
The intramolecular proton transfer of the organometallic cobalt complex is modeled with an NEB consisted of images connected by springs with spring constants of eV/Å2. The PBE0/cc-pVDZ climbing image NEB Henkelman, Uberuaga, and Jónsson (2000), consisting of 26 images, used the IDPP method as the initial guess for the band with the reactant geometry previously optimized. The CCSD-in-PBE0/cc-pVDZ NEB consisting of 23 images, used its optimized reactant and the climbing image NEB converged at the PBE0/cc-pVDZ level of theory as its initial guess. Since the product is spatially far away from the reactant, an intermediate geometry between the transition state and the product is used as the endpoint of the NEB. This intermediate geometry is determined by initially converging a NEB with an extra image such that the maximum force dropped below eV/Å2. Then the second to last image is used as the new endpoint and a new NEB is converged. The PBE0/cc-pVDZ climbing image NEB is optimized using the FIRE Bitzek et al. (2006) algorithm using a convergence criteria of eV/Å2 for the maximum gradient value. The projection-based WF-in-DFT embedding NEB is optimized at the CCSD-in-PBE0/cc-pVDZ level of theory using the FIRE algorithm with a convergence criteria of eV/Å2 for the maximum gradient value.
The CCSD-in-PBE0/cc-pVDZ calculations used for the NEB optimization are performed by specifying 51 occupied MOs to be in subsystem A using the N_ORBITALS option and by using AO truncation with a threshold of a.u. An even-handed selection of AOs were used along the NEB by creating a union of the AOs that were selected for each image by the truncation procedure to ensure that the NEB traversed a smooth potential energy surface. The final, reported energies of the WF-in-DFT NEB are performed using the PNO-LCCSDSchwilk, Usvyat, and Werner (2015)/cc-pVDZ and PNO-LCCSD-in-PBE0/cc-pVDZ levels of theory. Both the PNO-LCCSD and PNO-LCCSD-in-PBE0 calculations are performed with density fitting using the cc-pVTZ/JKFIT Weigend (2002) (the def2-TZVPP/JKFIT Weigend (2008) basis set was used for cobalt since the cc-pVTZ/JKFIT basis set was not available) and the cc-pVTZ/MP2FIT Weigend, Köhn, and Hättig (2002) density fitting basis sets. Tighter domain approximations were employed for all PNO-LCCSD calculations by specifying the DOMOPT=TIGHT option. Additionally, the Boughton-Pulay completeness criterion was used for the selection of the primary projected atomic orbitals domain by specifying the option THRBP=1 and the Pipek-Mezey localization method was used. For the PNO-LCCSD-in-PBE0 calculations, AO truncation is not used, the core and valence DFT molecular orbitals are localized separately using the Pipek-Mezey localization method, and the subsystem A orbitals are selected using the default procedure based on the Mulliken population threshold.
All calculations using AO truncation Bennie et al. (2015) ensure that at least one AO is kept per atom (specified by option AO_PER_ATOM) to make evaluation of the integral derivative contributions from the one electron Hamiltonian simpler within Molpro. This adds a negligible amount of AO functions than would have been selected using only the density threshold parameter Bennie et al. (2015) for the systems studied in this paper. In all embedding geometry optimizations that employ AO truncation, the number of truncated AOs is fixed using the STOREAO option to ensure smoothness of the potential energy function. Upon convergence, the truncated AO list is reevaluated using the same density threshold parameter; if the number of kept AOs remains a subset of the original list of truncated AOs then the optimization is converged.
IV Results and Discussion
IV.1 Comparison of Analytical and Numerical Gradients
The implementation of the projection-based WF-in-DFT analytical gradient is tested by comparison with the gradient evaluated by numerical finite difference for a distorted geometry of ethanol. The finite difference gradients are evaluated using a four-point central difference formula with a base step size of bohr. The mean absolute error (MAE) between the analytical and finite difference gradients is reported for a range of embedding calculations in Table 1. These results show that the analytical nuclear gradient for projection-based WF-in-DFT embedding is essentially numerically exact with respect to the gradients calculated by finite difference. Comparison of the results obtained using HF over the full system versus using LDA over the full system illustrate that some of the finite difference error comes from the DFT exchange-correlation grid. Comparison of the HF-in-HF results with full HF and of the LDA-in-LDA results with full LDA illustrate the modest effect of using a large-but-finite value for the level-shift operator in projection based embedding. These results confirm the correct implementation of projection-based WF-in-DFT analytical nuclear gradients.
IV.2 Optimized Geometries
IV.2.1 Ethanol
As a proof of concept, CCSD-in-LDA/6-31G analytical nuclear gradients are employed to determine the ground state geometry of ethanol, which is shown in Fig. 1. For this simple case, the O-H moiety is treated by CCSD and the remainder of the molecule is treated by LDA. Table 2 shows that the O-H bond length within subsystem A reproduces the CCSD predicted bond length of Å and the remaining bonds within subsystem B reproduce the LDA predicted bond lengths. This indicates that the potential energy surface produced by projection-based embedding varies smoothly from CCSD-like interactions for subsystem A and LDA-like interactions for subsystem B. Interestingly, the C-O bond located at the boundary between subsystems A and B closely reproduces the LDA bond length and is not an interpolation between the CCSD and LDA bond lengths.
IV.2.2 Cobalt-based Organometallic Complex
As a demonstration of embedding gradients with AO truncation, the geometry of the cobalt-based organometallic complex, shown in Fig. 2, is optimized. Fig. 2a shows the CCSD-in-LDAX/def2-TZVPP:def2-SVP optimized structure of the cobalt complex where the solid atoms are included in subsystem A and the transparent atoms are included in subsystem B. In Fig. 2b the optimized structures evaluated at the CCSD-in-LDAX/def2-TZVPP:def2-SVP (solid) and the LDAX/def2-TZVPP:def2-SVP (transparent) levels of theory are overlaid. While only modest differences are seen in the overall structure, Table 3 shows that the optimized bond lengths do change between the two levels of theory, both for the region within subsystem A and at the subsystem boundary. This indicates that the WF method is capable of relaxing the atoms in subsystem A even when they are strongly coordinated with subsystem B. It is also seen that the bond lengths across the boundary of subsystems A and B also differ from the LDAX geometry since the bonds in question experience the effects of both the WF and KS methods. Finally, if a bond length associated with atoms in subsystem B is considered, such as the O1-H bond, it is found to closely match the LDAX predicted bond length.
IV.3 Malondialdehyde: Minimum Energy Reaction Pathway
The minimum energy reaction pathway for the proton transfer in malondialdehyde is determined using the NEB method. Fig. 3 shows that with minimal embedding (Fig. 3a) the CCSD-in-LDA/def2-aSVP reaction barrier, shown in Fig. 3b, is kcal/mol which is within kcal/mol of the CCSD/def2-aSVP reference reaction barrier of kcal/mol. This is a vast improvement over the LDA/def2-aSVP result, which predicts an essentially barrierless reaction. In addition to correctly predicting the reaction barrier, Fig. 3c shows that the CCSD-in-LDA/def2-aSVP reaction pathway lies precisely on top of the CCSD/def2-aSVP pathway with only a small deviation in the basins. In contrast, Fig. 3d shows that the LDA/def2-aSVP reaction pathway and potential energy surface reveal errors in the location of the reactant and product basins, with the hydrogen-bond length vastly underestimated. This is consistent with the tendency of LDA to over stabilize hydrogen bonds.
IV.4 Cobalt-based Organometallic Complex: Minimum Energy Reaction Pathway
The minimum energy reaction pathway for the intramolecular proton transfer in a cobalt diimine-dioxime catalyst (Fig. 4a) is now investigated. Previously, the reaction pathway for the transfer of the [-NH] to form a cobalt hydride had been investigated using geometries obtained using DFT. Huo et al. (2016) Fig. 4b shows the energy profile for this reaction determined by various levels of theory. We observe that the reaction pathway determined by the NEB optimized at the PBE0/cc-pVDZ level of theory (purple curve) predicts a barrier height of kcal/mol. However, when single-point PNO-LCCSD-in-PBE0/cc-pVDZ embedding energy calculations are run on the PBE0 optimized geometries (blue curve), the barrier height is lowered to kcal/mol and the position of the transition state is shifted towards the reactant. The NEB optimized at the CCSD-in-PBE0/cc-pVDZ embedding level of theory (red curve) shows an even lower barrier height of kcal/mol and predicts a substantially different transition state geometry (Fig. 4c) than the DFT result. The difference between the transition states predicted by the PBE0/cc-pVDZ and PNO-LCCSD-in-PBE0/cc-pVDZ levels of theory is clearly seen in Fig. 4d, which shows the projection of the NEB onto the two dimensions of the Co-H and N-H bonds and with the position of the transition state geometry indicated with stars. This result clearly shows the large degree to which commonly employed DFT transition state geometries can differ from the CCSD-quality result that is obtained using projection-based embedding.
V Conclusions
We present the derivation and numerical demonstration of analytical nuclear gradients for projection-based embedding both with and without AO truncation. A key aspect of the gradient theory is that the WF contributions can be evaluated using existing WF gradient implementations without the need for modification or additional programming, thereby allowing projection-based WF-in-DFT embedding gradients to be easily generalized to any combination of WF and KS-DFT methods. It is demonstrated that projection-based embedding gradients produce accurate geometries for a variety of benchmark systems, including for bond-lengths that span the interface between subsystems. Furthermore, in applications to both malondialdehyde and a transition-metal catalyst, WF-in-DFT minimum energy pathways obtained via the NEB method reveal large errors in DFT-computed transition-state energies and geometries. Finally, we note that the Lagrangian framework presented here can be used to derive other analytical gradients of the projection-based WF-in-DFT energy with respect to quantities such as electric and magnetic fields.
Acknowledgements.
We thank Matthew Welborn for helpful discussions. This material is based upon work supported by the U.S. Army Research Laboratory under Grant No. W911NF-12-2-0023 (S.J.R.L.). S.J.R.L. thanks the Caltech Resnick Sustainability Institute for a graduate fellowship. T.F.M. and F.R.M. acknowledge joint support from the DOE (Award No. DEFOA-0001912), and F.R.M. acknowledges support form the Engineering and Physical Sciences Research Council for funding (EP/M013111/1).
VI Supporting Information
All geometries used in all tables and figures are available for download.
Appendix A Pipek-Mezey Localization
Equation 14 from the main text
[TABLE]
corresponds to the derivative of the localization conditions, Eq. 7, with respect to the molecular orbital coefficients , where
[TABLE]
Next, the overlap derivative contribution from the localization conditions from Eq. 34 is
[TABLE]
where permutes the indices and , and is restricted to atomic orbitals on atom .
Appendix B Orbital Derivatives of Projection-based WF-in-DFT Embedding Energy
This appendix provides additional details for the terms in Eqs. 13 and 15 of the main text. The derivative of the projection-based WF-in-DFT embedding energy and the WF constraints with respect to the MO coefficients shown in Eq. 13 is
[TABLE]
where the partial derivative of the WF constraints causes the appearance of the WF relaxed density, , in the last four terms on the RHS of Eq. 39. Equation 39 simplifies to
[TABLE]
where indicates that the index is restricted to LMOs, indicates that is restricted to LMOs in subsystem A, and is the KS Fock matrix evaluated with the bracketed density. The last term on the RHS of Eq. 40
[TABLE]
simplifies to
[TABLE]
where
[TABLE]
In the current study, we employ both LDA and generalized gradient approximation (GGA) exchange-correlation functionals; for the special case of LDA, the term assumes the form
[TABLE]
where is the XC kernel which is defined as the second derivative of the XC functional with respect to density.
The derivative of the Brillioun conditions in Eq. 15 can be expanded as follows.
[TABLE]
where and is defined as
[TABLE]
Appendix C Atomic Orbital Truncation
Projection-based WF-in-DFT embedding reduces the cost of the WF calculation on subsystem A by reducing the number of LMOs that are correlated at the WF level, but thus far leaves the virtual space untouched. However, the scaling of most WF methods is dominated by the number of virtual MOs (e.g. for CCSD). One strategy has been to employ local correlation WF methods such as PNO-LMP2 Werner et al. (2015) and PNO-LCCSD Schwilk et al. (2017); Ma and Werner (2018) to describe subsystem A since these methods are able to leverage the reduced number of LMOs to significantly lower the number of occupied-virtual orbital pairs that need to be included, resulting in a cheap and accurate WF calculation. However, a real advantage of projection-based embedding hinges on being able to use any WF method to describe subsystem A. Therefore, having a more general approach to reduce the cost of the WF calculation on the subsystem A is desirable.
The AO truncation scheme devised by Bennie et al. Bennie et al. (2015) provides a simple way to significantly reduce the cost of the WF calculation by reducing the size of the basis used to describe subsystem A. The AOs that are discarded are selected through a single density threshold parameter: if the net Mulliken population, computed using the subsystem A density, of an AO is less than the specified threshold, it is removed from the basis set. This scheme has shown to greatly speedup up WF-in-DFT calculations at a small cost in accuracy in total and relative energies. Bennie et al. (2015) Additionally, it has the nice feature that given a fixed subsystem A, the size of the truncated subsystem A basis scales asymptotically as the size of the environment grows. This basis set modification does not cause any complications in the evaluation of the subsystem A WF gradient so existing implementations can be used without any modifications. The energy expression for a projection-based WF-in-DFT calculation with AO truncation using the so-called type-in-type correction Bennie et al. (2015) is
[TABLE]
where is the subsystem A WF in the truncated basis, is the KS subsystem A one-particle density in the truncated basis, and are the KS subsystem A and B one-particle densities in the full basis respectively, is the subsystem A one-particle reduced density matrix that corresponds to , is the embedding potential in the truncated basis which is evaluated by
[TABLE]
and is the projection operator in the truncated basis which is evaluated by
[TABLE]
Here, is the rectangular matrix that maps the full basis to the truncated basis which is created by starting with identity matrix and deleting columns corresponding to thrown away AO functions. We note that the even though the notation for Eq. 47 is different from the one used in Ref. 18 the approach is identical.
Appendix D Projection-based WF-in-DFT Gradient Theory with AO Truncation
D.1 Total Energy Lagrangian
We now derive the total energy Lagrangian for for projection-based WF-in-DFT embedding with AO truncation. The WF-in-DFT AO truncation Lagrangian is
[TABLE]
where the bar superscript refers to subsystem A quantities optimized by the KS functional in the truncated basis. The constraints that appear in Eq. 50 are all the same as those that appear in Eq. 6 from the main text, except for the third term on the RHS of Eq. 50. This term constrains the MOs, , to be orthogonal.
D.1.1 Minimizing the Lagrangian with respect to the variational parameters of the WF method – .
Minimizing the WF-in-DFT AO truncation Lagrangian with respect to simplifies to the minimization of the subsystem A WF energy and the WF constraints (as explained in section II.4), which corresponds to the conventional WF Lagrangian used to derive WF gradient theories, albeit in the truncated basis.
D.1.2 Minimizing the Lagrangian with respect to the MO coefficients, .
The minimization of the Lagrangian with respect to the optimized KS MO coefficients in the truncated basis, , results in the SCF equations using the embedded Fock matrix.
[TABLE]
[TABLE]
[TABLE]
Therefore, the Lagrange multipliers are simply the MO eigen energies of the KS optimized subsystem A MOs in the truncated basis.
D.1.3 Minimizing the Lagrangian with respect to the MO coefficients, .
The minimization of the Lagrangian with respect to the KS MO coefficients in the full basis, , is
[TABLE]
where only the matrix differs from the ones outlined in Eqns. 13-16.
[TABLE]
With the updated matrix, the Lagrangian multipliers are solved in the same way as outlined for the WF-in-DFT Lagrangian multipliers.
D.1.4 Gradient of the Total Energy
Once the Lagrangian is minimized with respect to all variational parameters, the gradient of the energy with respect to nuclear coordinate, , takes the form
[TABLE]
where the first two terms on the RHS of Eq. 56, and , are the total derivative of the truncated subsystem A WF and KS energy respectively, minus the embedding contribution, . These two terms are calculated using existing gradient implementations, whereas the embedding contribution has been folded into the remaining terms. The effective one-particle densities , and are
[TABLE]
[TABLE]
and
[TABLE]
The effective two-particle density is
[TABLE]
The matrix is
[TABLE]
where
[TABLE]
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Kitaura et al. (1999) K. Kitaura, E. Ikeo, T. Asada, T. Nakano, and M. Uebayasi, Chemical Physics Letters 313 , 701 (1999) . · doi ↗
- 2Deev and Collins (2005) V. Deev and M. A. Collins, The Journal of Chemical Physics 122 , 154102 (2005) . · doi ↗
- 3Collins and Deev (2006) M. A. Collins and V. A. Deev, Journal of Chemical Physics 125 , 104104 (2006) . · doi ↗
- 4Fedorov and Kitaura (2007) D. G. Fedorov and K. Kitaura, Journal of Physical Chemistry A 111 , 6904 (2007) . · doi ↗
- 5Elliott et al. (2009) P. Elliott, M. H. Cohen, A. Wasserman, and K. Burke, Journal of Chemical Theory and Computation 5 , 827 (2009) . · doi ↗
- 6Goodpaster et al. (2010) J. D. Goodpaster, N. Ananth, F. R. Manby, and T. F. Miller III, The Journal of Chemical Physics 133 , 084103 (2010) . · doi ↗
- 7Huang and Carter (2011) C. Huang and E. A. Carter, The Journal of Chemical Physics 135 , 194104 (2011) . · doi ↗
- 8Goodpaster, Barnes, and Miller III (2011) J. D. Goodpaster, T. a. Barnes, and T. F. Miller III, The Journal of Chemical Physics 134 , 164108 (2011) . · doi ↗
