Bridging molecular dynamics and correlated wave-function methods for accurate finite-temperature properties
Dario Rocca, Anant Dixit, Michael Badawi, S\'ebastien Leb\`egue, Tim, Gould, Tom\'a\v{s} Bu\v{c}ko

TL;DR
The paper presents the selPT perturbative approach that combines ab initio molecular dynamics with correlated wave-function methods to accurately predict finite-temperature properties, demonstrated on adsorption enthalpies in zeolites with excellent experimental agreement.
Contribution
The novel selPT method efficiently integrates AIMD with correlated wave-function techniques for precise finite-temperature property calculations.
Findings
Accurate adsorption enthalpies for CH₄ and CO₂ in zeolites at 300 K.
Excellent agreement with experimental data.
Advances toward quantitative AIMD predictions.
Abstract
We introduce the "selPT" perturbative approach, based on ab initio molecular dynamics (AIMD), for computing accurate finite-temperature properties by efficiently using correlated wave-function methods. We demonstrate the power of the method by computing prototypical molecular enthalpies of adsorption in zeolite (CH and CO on protonated chabazite at 300~K) using the random phase approximation. Results are in excellent agreement with experiment. The improved accuracy provided by selPT represents a crucial step towards the goal of truly quantitative AIMD prediction of experimental observables at finite temperature.
| MD | MD | SET1 | SET2+3 | SET1+2+3 | ||
| @PBE | @PBE+D2 | @PBE+D2 | @PBE+D2 | @PBE+D2 | ||
| PBE+D2 | - | - | - | |||
| PBE | ||||||
| RPA | – | – |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Chemical Physics Studies · Spectroscopy and Quantum Chemical Studies · Zeolite Catalysis and Synthesis
Bridging molecular dynamics and correlated wave-function methods for accurate finite-temperature properties
Dario Rocca
Université de Lorraine, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
CNRS, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
Anant Dixit
Université de Lorraine, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
CNRS, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
Michael Badawi
Université de Lorraine, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
CNRS, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
Sébastien Lebègue
Université de Lorraine, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
CNRS, LPCT, UMR 7019, 54506 Vandœuvre-lès-Nancy, France
Tim Gould
Qld Micro- and Nanotechnology Centre, Griffith University, Nathan, Qld 4111, Australia
Tomáš Bučko
Department of Physical and Theoretical Chemistry, Faculty of Natural Sciences, Comenius University in Bratislava, Ilkovičova 6, SK-84215 Bratislava, Slovakia
Institute of Inorganic Chemistry, Slovak Academy of Sciences, Dúbravská cesta 9, SK-84236 Bratislava, Slovakia
Abstract
We introduce the “selPT” perturbative approach, based on ab initio molecular dynamics (AIMD), for computing accurate finite-temperature properties by efficiently using correlated wave-function methods. We demonstrate the power of the method by computing prototypical molecular enthalpies of adsorption in zeolite (CH4 and CO2 on protonated chabazite at 300 K) using the random phase approximation. Results are in excellent agreement with experiment. The improved accuracy provided by selPT represents a crucial step towards the goal of truly quantitative AIMD prediction of experimental observables at finite temperature.
pacs:
Since the seminal work of Car and Parrinello Car and Parrinello (1985) the field of ab initio molecular dynamics (AIMD) has become impressively popular as it offers insights into finite-temperature systems. Because of the high computational cost involved in ab initio electronic structure calculations, AIMD is usually based on density functional theory (DFT) Kohn and Sham (1965), which provides a good compromise between accuracy and computational cost. However, approximate DFT functionals do not systematically reach chemical accuracy (1 kcal/mol) and are affected by a series of known shortcomings Cohen et al. (2011). This can hamper understanding of systems for which DFT approaches are insufficient to obtain a quantitative, or even qualitative, standard.
An alternative to DFT is to use correlated wave-function methods, which are more accurate and systematically improvable, but involve a higher computational cost. For applications to molecular systems the Møller-Plesset perturbation theory Møller and Plesset (1934) (MP2, MP3 etc.) and coupled-cluster theory Bartlett and Musial (2007) (CC) have long been used Szabo and Ostlund (2012). For condensed phase applications, MP2 has only recently appeared – using plane-wave Marsman et al. (2009); Dixit et al. (2017a), localized Pisani et al. (2008), and hybrid Del Ben et al. (2012) basis sets. Approximations based on coupled-cluster theory have been implemented in the VASP Booth et al. (2013) and PySCF codes McClain et al. (2017); Sun et al. (2018).
A particularly promising correlated approach for condensed phase applications is the random phase approximation (RPA) Bohm and Pines (1953); Gell-Mann and Brueckner (1957); Langreth and Perdew (1975); Dobson and Wang (1999); Furche (2001); Eshuis and Furche (2011); Heßelmann and Görling (2011a); Ren et al. (2012). This approach to the ground state electronic correlation energy can be seen both as a DFT approximation Langreth and Perdew (1975, 1977) or a CC approximation Scuseria et al. (2008). The RPA also has analogies with MP theory Mussard et al. (2016). Unlike CC and MP theories, which are applied exclusively as post Hartree-Fock methods, the RPA is most often evaluated from a DFT starting point.
The success of the RPA is largely related to its accurate description of weak van der Waals (vdW) interactions Dobson and Wang (1999); Furche (2001), but this approximation is sometimes less reliable for short range interactions. Although significantly more expensive than traditional DFT approximations, the RPA can be performed for systems with up to a few hundred electrons and a series of applications to condensed phase systems have been reported in the literature Del Ben et al. (2015a); Marini et al. (2006); Lebègue et al. (2010); Harl and Kresse (2008); Lu et al. (2009); Harl et al. (2010); Kaoui and Rocca (2016); Ren et al. (2009); Schimka et al. (2010); Göltl et al. (2012). When extra precision is required, beyond-RPA methods can be used Furche and van Voorhis (2005); Heßelmann and Görling (2011b); Bates and Furche (2013); Olsen and Thygesen (2012); Ren et al. (2013); Lu (2014); Colonna et al. (2014); Mussard et al. (2016); Dixit et al. (2017b); Hellgren et al. (2018).
Whether via RPA, MP or CC theories, correlated wave-function methods hold the promise of a significantly higher level of accuracy, which is of fundamental importance for design of truly predictive approaches to assist experimental and technological developments. However, they are characterized by an unfavorable scaling, slow basis set convergence Marini et al. (2006); Harl and Kresse (2008); Dixit et al. (2018), and slow cell size convergence Liao and Grüneis (2016), making them expensive.
Despite these difficulties, high-level methodologies are of vital importance in many technologically attractive cases involving e.g. nanoporous materials such as zeolites Speybroeck et al. (2015); Chibani et al. (2016); Bučko et al. (2017); Chibani et al. (2018); Grajciar et al. (2018) or metal-organic frameworks Chibani et al. (2018); Grajciar et al. (2018); Babaei and Wilmer (2016); Adil et al. (2017). These materials are intensively investigated for their ability to selectively adsorb target compounds and this property is used in numerous applications such as depollution Speybroeck et al. (2015); Chibani et al. (2016); Bučko et al. (2017); Chibani et al. (2018); Grajciar et al. (2018); Babaei and Wilmer (2016) or separation of chemicals Adil et al. (2017); Sholl and Lively (2016). Reliable predictions of adsorption properties of these materials can be made only if the electronic structure method used in simulations accurately describes all interactions within the system of interest, and if the thermal effects are properly accounted for by using a suitable statistical mechanics method such as molecular dynamics (MD) or Monte Carlo. Unfortunately, simulations combining high-level methodologies with MD or Monte Carlo approaches, usually requiring ensembles of tens of thousands of configurations, are thus far beyond the reach of the computational power available today. In fact, the forces for correlated methods in the condensed phase have only recently been implemented Del Ben et al. (2015b); Ramberger et al. (2017) and employed in MD simulations Del Ben et al. (2015a); Bokdam et al. (2017).
In this work we present a methodology based on perturbation theory that extends the applicability of correlated methods to compute finite temperature properties at a reasonable computational cost. Our approach starts from MD simulations based on computationally inexpensive DFT functionals. Then, a small number (a few tens) of significant configurations are chosen from the ensemble of structures generated by MD to represent the full probability distribution. Finally, correlated wave-function calculations are performed on these configurations and the results are used to reconstruct the corresponding probability distribution and ensemble averages. Below we will apply this method to the computation of the finite temperature adsorption enthalpies of CH4 and CO2 in the zeolite chabazite, using RPA.
The new approach introduced in this letter will be referred to as perturbation theory on selected configurations (selPT). A detailed discussion of perturbation theory in the context of molecular dynamics and free energy calculations can be found in Ref. Chipot and Pohorille, 2016. The main ideas and working equations will be summarized here.
Within the canonical ensemble, the expectation value of a certain observable is defined as follows: , for . Here and and denote nuclear positions and momenta, respectively. In this work the notation represents the canonical ensemble average corresponding to the classical Hamiltonian with and being the nuclear kinetic and potential energies, respectively.
Exploring all and is generally impossible. An alternative is to assume the ergodic hypothesis. Then, the ensemble average can more usefully be expressed as
[TABLE]
a temporal average over an AIMD trajectory , . For infinite time, is independent of trajectory, since the system explores all configurations. However, for finite time, the choice of trajectory is crucial.
Ideally, we could use any method to obtain long AIMD trajectories that ensure a sufficient level of statistical precision. AIMD simulations are time-consuming, however, and thus long trajectories require a numerically efficient model electronic energy Hamiltonian , with potential
[TABLE]
at nuclear configuration and repulsive nuclear energy . Unfortunately, this leaves us with a compromise between accuracy (quality of electronic energies) and efficiency (length of trajectories), for a given precision.
In such cases, perturbation theory can be used to transform from an efficient/inaccurate Hamiltonian (e.g. PBE+D2) to an expensive/accurate, Hamiltonian (e.g. RPA), via the perturbation,
[TABLE]
The trick is to write thermal averages of potentials as,
[TABLE]
where describes the energy distribution of some potential energy of interest. The subscript indicates that it is calculated in the trajectory described by Hamiltonian . Then, one uses the perturbation to obtain from
[TABLE]
then gives properties of the trajectory , but is calculated on the trajectory .
This lets us define properties of the system driven by the primed Hamiltonian (e.g. RPA) in terms of trajectories (Eq. 1) on the system described by the original Hamiltonian (e.g. PBE+D2). Typical calculations employing perturbative approaches based on this formula involve large (albeit reduced) numbers of configurations computed in a given MD or Monte Carlo simulation, which are implausible for the expensive methods we seek to use. In this work, we thus introduce an innovation that lets the perturbation be evaluated with very few additional calculations, but without significant loss of precision. We thereby break the compromise between speed and accuracy that hampers the application of AIMD to difficult problems, such as the ones presented below.
To understand the challenges involved in the direct application of perturbation theory, we now present a specific numerical example that will also be used later to discuss our new methodological developments. Let us consider the finite temperature adsorption of methane on protonated chabazite (later, results will also be presented for the adsorption of carbon dioxide). The model used for this system, which involves 200 valence electrons per unit cell, is shown in Fig. 1 and further discussed in Supplemental Material. Using MD in the NVT ensemble, the adsorption enthalpy can be computed as
[TABLE]
where subscripts and denote the clean substrate (protonated chabazite), and the adsorbate (methane), respectively, while indicates the system of substrate interacting with adsorbate. In order to evaluate the ensemble averages in Eq. 6, MD simulations at 300 K have been performed with the VASP code Kresse and Hafner (1993) using two different approximations, namely PBE and PBE+D2 (a vdW corrected version of the PBE functional Grimme (2006)). Computational details are provided in Supplemental Material. These two MD simulations will be denoted as MD@PBE and MD@PBE+D2, respectively.
In Tab. 1 the PBE (PBE+D2) enthalpy of adsorption corresponding to MD@PBE (MD@PBE+D2) is an “exact” estimate, namely an estimate where the averaged potential and the Hamiltonian used for the sampling are at the same level of theory; the PBE (PBE+D2) value evaluated from MD@PBE+D2 (MD@PBE) is based on the perturbative approach in Eq. 5. Both PBE+D2 and PBE values evaluated perturbatively from MD@PBE and MD@PBE+D2, respectively, are close to their “exact” values, demonstrating the reliability of our approach. The PBE+D2 perturbative estimate is slightly worse, showing the importance of starting from an approach that correctly samples the key adsorption region of phase-space.
From a practical point of view, this brute force application of perturbation theory involves the reevaluation of the energy for thousands of configurations (specifically, the reevaluation of the electronic part of Eq. 2). If, starting from MD@PBE or MD@PBE+D2, we applied this perturbative procedure to compute the RPA enthalpy of adsorption, the corresponding computational time would amount to several millions of CPU hours in our supercomputer center. While technically feasible, such high demands are of limited practical interest.
We now proceed to show that it is possible to reduce the number of times the electronic part of Eq. 2 is recomputed to a very small fraction of the total number of configurations. This is achieved by finding a semi-analytic model of , and using it to build an approximate model of by assuming that the distributions generated by and (e.g., and , as used here) are similar. We employ the following procedure to approximately reconstruct the correct high-level energy distribution and corresponding average values:
Step 1: MD driven by the Hamiltonian is performed to generate the distribution (eq. 4).
Step 2: is approximated by a set of Gaussian functions with fitting parameters
[TABLE]
with normalization constant . Parameters are selected to represent key statistical features of the distribution. The tilde introduced in Eq. 7 emphasizes the approximate character of the probability distribution. It is crucial here to forbid any single or a limited number of Gaussians to dominate; for this reason the contribution of each single Gaussian is limited to a certain threshold. We note that this representation of potential energy density is related to the kernel density estimation method of probability density functions Parzen (1962). Technical details on the fitting with Gaussian functions are further discussed in Supplemental Material.
Step 3: configurations with positions are selected from the ensemble of step 1 to satisfy
[TABLE]
where we use kcal/mol. Since several time-uncorrelated configurations can satisfy Eq. 8, the specific configuration is randomly chosen with uniform probability. This is a reasonable procedure since configurations with the same energy () have equal probability to be sampled. For the selected configurations a different (higher) level of theory is used to evaluate the energies to obtain .
Step 4: Next, we calculate
[TABLE]
and use them to obtain:
[TABLE]
where . This equation follows from Eqs. 5 and 7, with reweighting factor .
Step 5: Finally, we determine the estimate of by combining eq. 10 with eq. 4. The accuracy of this ensemble average can be evaluated by computing the standard error according to Eq. S17 (see Supplemental Material). If a higher level of accuracy is required the original set of selected configurations can be extended to decrease the statistical error of the final result, as done later.
We label the five step procedure described above by the acronym “selPT”, which stands for perturbation theory on selected configurations. In order for this approach to be useful, statistical convergence should be achieved for , where denotes the total number of MD steps. This is not an unreasonable expectation, as the configurations produced within a certain MD simulation are not completely independent due to the finite correlation time. Below, we show that even configurations can be sufficient for a reasonable estimate and can provide highly accurate results – both are vast improvements on the runs required to recompute a full MD trajectory. Such a dramatic reduction is possible because is a smooth, representative function that captures the important properties of the true distribution .
To establish the accuracy and the efficiency of the selPT procedure we come back to the example of CH4 adsorbed on chabazite. Before computing RPA adsorption energies, the approach is validated using PBE and PBE+D2: First, the “exact” distribution generated by the previously described MD@PBE+D2 simulation is fitted by 60 Gaussian-shaped functions (eq. 7); second, 60 configurations are chosen in order to satisfy the condition in Eq. 8; finally, Eq. 10 is used to evaluate and the corresponding ensemble averages. The procedure is repeated three times to generate three uncorrelated sets containing 60 configurations each, that will be called SET1@PBE+D2, SET2@PBE+D2, and SET3@PBE+D2. This is helpful to study the behavior of the selPT method as a function of the number of configurations. Indeed, the adsorption enthalpies have been computed using the three sets with 60 configurations, from the three sets with 120 configurations obtained by merging pairs of the original sets (SET1+2@PBE+D2, SET2+3@PBE+D2, and SET1+3@PBE+D2), and with a set that includes all the configurations (SET1+2+3@PBE+D2). Since the enthalpy of adsorption Eq. 6 requires ensemble averages for the three different systems (S, A, S+A), the selPT procedure is applied independently on each one of them.
Fig. 1 demonstrates the fitting quality. The middle panel demonstrates the ability of selPT to reproduce key features of the sampling data, and to perturb them using a more accurate method, by comparing from the full MD@PBE+D2 run, with its parametrized distribution and RPA counterpart approximated from Eq. 10 using SET2@PBE+D2. The right panel shows a phase space diagram for the full MD run, and the 60 samples selected by selPT, reported in terms of variables representing the speed of CH4 and its distance from the surface of the chabazite. Note that selPT samples the probability distribution accurately, by design; as shown in the right panel of Fig. 1 a representative part of phase space is also covered for the system under consideration.
The row corresponding to PBE in Table 1 shows the practical reliability of selPT in computing enthalpies of adsorption . Specifically, the similarity of the energies across all calculations highlights the success of the approach. If our trajectories sampled an overly-limited set of configurations (whether through a too-short trajectory or too-inaccurate functional), or if our selections sets were too poor, the numbers would differ significantly across columns. The consistency of the values justifies the length and reasonableness of our initial AIMD calculations, and the accuracy of the selection process.
The bottom row of Table 1 shows RPA enthalpies of adsorption, which would be basically impossible to obtain using a full AIMD treatment due to the inherent cost of RPA (full results considering all the possible combinations of sets are provided in Tables S9 and S10 of Supplemental Material). Numerical results are based on the RPA implementation in the VASP code Harl and Kresse (2008) and computational parameters are provided in the Supplemental Material. The statistically most accurate selPT estimate of the RPA enthalpy of adsorption is obtained from the largest set SET1+2+3@PBE+D2 () and provides a value of 4.38 0.39 kcal/mol in excellent agreement with the experimental value of kcal/mol Piccini et al. (2015); Luo et al. (2016). From Table 1 (and Tables S9-S10) it can be noticed that the errors present a clear trend to decrease as a function of the number of configurations included in the selPT set. In Table S9 it is shown that RPA results from as few as 60 samples are all within 0.92 kcal/mol of one another, and within 0.83 kcal/mol of experiment. This is already a considerable improvement with respect to the PBE and PBE+D2 estimates, which present deviations of at least 2 kcal/mol well beyond the chemical accuracy.
Key for this high level of accuracy is that the selPT calculations are based on a reasonable PBE+D2 trajectory which, unlike PBE, samples phase space similarly to RPA. This ensures that the perturbation from PBE+D2 to RPA is correctly biased to the weak vdW interaction between the zeolite and molecule. This intuitive point is further discussed in a quantitative way in Supplemental Material where RPA results from a PBE starting point are presented (Secs. SIII and SIV) and it is shown how PBE leads to significantly different sampling of the configurational space with respect to PBE+D2 and RPA (Sec. SV).
To further establish the accuracy and reliability of the selPT approach, a second system is discussed, CO2 adsorbed in protonated chabazite. Computational details are analogous to those of CH4 in chabazite. The adsorption enthalpy of CO2 is sizeably larger than that of CH4, with an experimental value of -8.41 kcal/mol at room temperature Pham et al. (2012). While the PBE+D2 result deviates from this value by 1.3 kcal/mol, our most accurate estimate from the largest set SET1+2+3@PBE+D2 provides an highly accurate enthalpy of adsorption of 8.74 0.39 kcal/mol. Detailed results for all the test sets are presented in Table S9 () and S10 () in Supplemental Material. The errors are analogous to those for adsorbed CH4; by considering that the adsorption enthalpy of CO2 is about double than that of CH4, the relative error in this case decreases much more rapidly as a function of the set size and provides already an accuracy within .
From a practical point of view, the selPT approach has been instrumental to achieve the computation of RPA enthalpies of adsorption. By decreasing the number of configurations to 60, the RPA results are obtained in a CPU time that is only about 3-4 times the CPU time involved in the full MD@PBE calculation, or around ten times for the more accurate simulation with 180 configurations.
In conclusion, we introduced the selPT method to compute ensemble averages of energies and, more generally, energy distributions for correlated wave-function methods. As proof of principle, this methodology was applied to compute the finite-temperature adsorption energy of CH4 and CO2 in protonated chabazite at the RPA level of theory, where it successfully reproduced known experimental values within chemical accuracy. selPT is not limited to the RPA but could be interfaced with other sophisticated approximations (e.g. MP2, double-hybrids or coupled-cluster theory), as long as computations on few tens of configurations can be afforded.
Beyond providing an accurate and efficient methodology to address challenging problems in physics and material science, selPT could be also used to develop finite-temperature test sets from high level theories, to test and develop new DFT functionals and semi-classical models.
Acknowledgements.
This work was supported by Agence Nationale de la Recherche under grant number ANR-15-CE29-0003-01. T.B. is grateful to University of Lorraine for invited professorships during the academic year 2015-16 and he acknowledges support from Project 643 No. APVV-15-0105. D.R., T.G. and S.L. acknowledge support from the French PIA project “Lorraine Université d’Excellence. The results of this research have been achieved using GENCI-CCRT/CINES computational resources under grants A0030805106 and A0040910433, the DECI resource ARCHER based in the United-Kingdom with support from the PRACE aisbl, and the Computing Center of the Slovak Academy of Sciences acquired in projects ITMS 26230120002 and 26210120002 supported by the Research and Development Operational Program funded by the ERDF.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Car and Parrinello (1985) R. Car and M. Parrinello, Phys. Rev. Lett. 55 , 2471 (1985).
- 2Kohn and Sham (1965) W. Kohn and L. J. Sham, Phys. Rev. 140 , A 1133 (1965).
- 3Cohen et al. (2011) A. J. Cohen, P. Mori-Sánchez, and W. Yang, Chemical reviews 112 , 289 (2011).
- 4Møller and Plesset (1934) C. Møller and M. S. Plesset, Physical Review 46 , 618 (1934).
- 5Bartlett and Musial (2007) R. J. Bartlett and M. Musial, Reviews of Modern Physics 79 , 291 (2007).
- 6Szabo and Ostlund (2012) A. Szabo and N. S. Ostlund, Modern quantum chemistry: introduction to advanced electronic structure theory (Courier Corporation, 2012).
- 7Marsman et al. (2009) M. Marsman, A. Grüneis, J. Paier, and G. Kresse, The Journal of chemical physics 130 , 184103 (2009).
- 8Dixit et al. (2017 a) A. Dixit, J. Claudot, S. Lebègue, and D. Rocca, The Journal of chemical physics 146 , 211102 (2017 a).
