Multi-Layer Free Energy Perturbation
Ying-Chih Chiang, Frank Otto

TL;DR
This paper introduces a multi-layer free energy perturbation method that improves sampling efficiency by allowing environmental relaxation during molecular dynamics, enabling faster and more accurate free energy calculations for drug design.
Contribution
The paper proposes a novel multi-ensemble ansatz, MLFEP, addressing sampling inefficiency in FEP by incorporating environmental relaxation during MD simulations.
Findings
MLFEP allows environmental reorganization during sampling.
It reduces the need for enhanced sampling methods.
Enables faster, accurate free energy calculations.
Abstract
Free energy perturbation (FEP) is frequently used to evaluate the free energy change of a biological process, e.g. the drug binding free energy or the ligand solvation free energy. Due to the sampling inefficiency, FEP is often employed together with computationally expensive enhanced sampling methods. Here we show that this sampling inefficiency, which stems from not accounting for the environmental reorganization, is an inherent property of the single-ensemble ansatz of FEP, and hence simply prolonging the MD simulation can hardly alleviate the problem. Instead, we propose a new, multi-ensemble ansatz -- the multi-layer free energy perturbation (MLFEP), which allows environmental reorganization processes (relaxation) to occur automatically during the MD sampling. Our study paves the way toward a fast but accurate free energy calculation that can be employed in computer-aided drug…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsProtein Structure and Dynamics · Microfluidic and Capillary Electrophoresis Applications · Analytical Chemistry and Chromatography
Multi-Layer Free Energy Perturbation
Ying-Chih Chiang
Frank Otto
Department of Physics, Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Abstract
Free energy perturbation (FEP) is frequently used to evaluate the free energy change of a biological process, e.g. the drug binding free energy or the ligand solvation free energy. Due to the sampling inefficiency, FEP is often employed together with computationally expensive enhanced sampling methods. Here we show that this sampling inefficiency, which stems from not accounting for the environmental reorganization, is an inherent property of the single-ensemble ansatz of FEP, and hence simply prolonging the MD simulation can hardly alleviate the problem. Instead, we propose a new, multi-ensemble ansatz – the multi-layer free energy perturbation (MLFEP), which allows environmental reorganization processes (relaxation) to occur automatically during the MD sampling. Our study paves the way toward a fast but accurate free energy calculation that can be employed in computer-aided drug design.
Accurately evaluating the free energy change of a ligand binding to its receptor has a very practical use in computational drug design, i.e. determining the relative binding free energies between two drug candidates for lead- optimization. One of the most frequently employed method for this purpose is the so-called free energy perturbation (FEP) method Zwanzig (1954), which states that the free energy change between the final target state T and the initial reference state R can be evaluated via a single ensemble average, i.e.
[TABLE]
where denotes the free energy change and , with the Boltzmann factor denoted by and temperature by . The term denotes the perturbation introduced to the initial reference state, and its value is given by the potential difference between the target state T and the reference state, . Finally, the symbol represents that the canonical ensemble average is performed over the reference state R. In other words, the sampling is performed using the Hamiltonian of the reference state. Similarly, one can also sample the target state T for calculating , this leads to the so-called backward FEP calculation, i.e. . Although Eq. 1 is theoretically exact, numerically evaluating the ensemble average often suffers from a problem of the sampling inefficiency. While plenty of methods, e.g. stratification (multi-step FEP) Valleau and Card (1972); Pohorille et al. (2010), confine-and-release method Boresch et al. (2003); Woo and Roux (2005); Mobley et al. (2007), or replica-exchange molecular dynamics (with solute tempering) Sugita and Okamoto (1999); Sugita et al. (2000); Liu et al. (2005); Wang et al. (2011); Moors et al. (2011); Wang et al. (2012), have been developed to improve the sampling efficiency and hence advance the FEP convergence, the current computational cost of using enhanced sampling methods combined with FEP is still rather prohibitive to be regularly applied in drug design Wang et al. (2015); Lin et al. (2016). Hence, further pursuing an accurate but fast free energy method is still desirable.
Previously we have shown that the insufficient sampling comes from missing the environmental reorganization Chiang et al. (2016), e.g. allowing the water to move or reorient to accommodate the inserted ligand (perturbation). This process is a type of relaxation process, which is well studied in gas phase reactions. For instance, consider the quantum nuclear dynamics Pahl et al. (1996); Scheit et al. (2004); Chiang et al. (2011) during the interatomic/intermolecular Coulombic decay process (ICD) Cederbaum et al. (1997); Sisourat et al. (2010); Jahnke et al. (2010); Mucke et al. (2010); Gokhberg et al. (2014); Stumpf et al. (2016), in the neon dimer Jahnke et al. (2004); Scheit et al. (2004). After introducing a strong perturbation to the system (ionizing an inner valence electron on Ne), the system quickly responds to this perturbation by emitting one electron on the neighboring Ne, resulting in a Ne+-Ne+ state that undergoes Coulomb explosion to lower the system (free) energy. Clearly, the nuclear motion in the electronic decay process is always governed by the corresponding Hamiltonian of a specific electronic state Pahl et al. (1996). Similarly, in the classical molecular dynamics, the molecular motion is also governed by the Hamiltonian of the simulated system. The only difference is that the classical system is described by Newtonian mechanics Phillips et al. (2005) with force fields Vanommeslaeghe et al. (2010).
Let us now consider a common illustrative example in free energy calculations, namely, the ligand solvation process. According to Eq. 1, collecting the ensemble governed by the Hamiltonian of reference state R (ligand and water solvent are separated) is sufficient for correctly evaluating . However, as illustrated in Fig. 1, the two end states can have very different potential energy landscapes so that their associated probability distributions center at different geometries, as indicated by the dotted curves in Fig. 1.
Consequently, when sampling the distribution via MD simulation in order to sample all possible conformations of the reference state R, one faces the sampling inefficiency because the relevant microstates belonging to the target state T are generally missed, leading to a non-converged free energy result. This problem can be solved by introducing the reorganization process (relaxation) into the sampling procedure by starting at the same conformation as reference state R but performing the MD simulation based on the Hamiltonian of target state T, see e.g. the orange dotted curve in Fig. 1. While this idea may not be so familiar to the native biophysics society, its quantum version is regularly performed in studying gas phase molecular dynamics involving multiple electronic states Pahl et al. (1996); Scheit et al. (2004); Chiang et al. (2011); Köppel et al. (1984); Domcke et al. (2004). Furthermore, our approach is very different from contemporary enhanced sampling schemes, e.g. increasing temperature to overcome the potential barrier, adding a biasing potential to flatten the potential landscape, or even using the “adiabatic” potential (black curve) for sampling Christ and van Gunsteren (2007). These schemes focus on forcing the MD sampling to explore a larger conformational space but continue using Eq. 1 to evaluate . Rather, we believe that the insufficient sampling problem is an inherent property such that the best way to solve it is to use a different working equation than Eq. 1.
Does such a new equation, which allows the system to relax automatically during the simulation, exist? Exploiting the fact that is a constant under the given NVT ensemble, further imposing one additional sampling over a normalized distribution will not change its value, e.g. , as long as the sampling is sufficient. Hence we have,
[TABLE]
where the definitions of all symbols are identical with Eq. 1. In Eq. 2, we further imposed the sampling over the distribution of target state T, which does not affect , since its value is already determined at the sampling of the reference state R. While Eq. 2 seems to introduce more effort in MD sampling to evaluate , this equation actually allows the environmental reorganization. Let us explain. When evaluating Eq. 2, one first performs a short equilibrium sampling to collect the microstates that belongs to state T, and then from each microstate (each frame of the collected trajectory) one performs an MD sampling using the Hamiltonian of state R to evaluate the free energy change within this simulation. Thus, each microstate of state T gives one that will later participate in the ensemble average over state T. Interestingly, for each microstate, the sampling now always begins at a non-equilibrium high energy conformation. This conformation will then undergo a relaxation process automatically due to the governing Hamiltonian, and hence the sampling is more efficient than waiting for rare events to happen. For practical purposes, Eq. 2 can also be expressed in a reversed sampling form that reads,
[TABLE]
This new format describes the process in Fig. 1: start the sampling under the reference state R, and then introduce the environmental reorganization via the relaxation process governed by the target state T. One additional advantage of Eq. 3 is that we can now assign a common reference state R and save the trajectory for evaluating between the reference state and different target states. This can further save some computational effort. Finally, since Eqs. 2-3 already go beyond the usual FEP theory, we will term our new approach as the multi-layer free energy perturbation (MLFEP), in order to distinguish it from the virtual substitution scan (VSS) Chiang and Wang (2016); Chiang et al. (2016) which is purely based on a single-ensemble approach but also has a dual sampling format.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Zwanzig (1954) R. W. Zwanzig, J. Chem. Phys. 22 , 1420 (1954).
- 2Valleau and Card (1972) J. P. Valleau and D. N. Card, J. Chem. Phys. 57 , 5457 (1972).
- 3Pohorille et al. (2010) A. Pohorille, C. Jarzynski, and C. Chipot, J. Phys. Chem. B 114 , 10235 (2010).
- 4Boresch et al. (2003) S. Boresch, F. Tettinger, M. Leitgeb, and M. Karplus, J. Phys. Chem. B 107 , 9535 (2003).
- 5Woo and Roux (2005) H.-J. Woo and B. Roux, Proc. Natl. Acad. Sci. USA 102 , 6825 (2005).
- 6Mobley et al. (2007) D. L. Mobley, J. D. Chodera, and K. A. Dill, J. Chem. Theory Comput. 3 , 1231 (2007).
- 7Sugita and Okamoto (1999) Y. Sugita and Y. Okamoto, Chem. Phys. Lett. 314 , 141 (1999).
- 8Sugita et al. (2000) Y. Sugita, A. Kitao, and Y. Okamoto, J. Chem. Phys. 113 , 6042 (2000).
