Determining Free Energy Differences Through Variational Morphing

Martin Reinhardt; Helmut Grubmüller

arXiv:1906.12124·physics.comp-ph·July 19, 2023


TL;DR

This paper introduces a generalized variational morphing approach for free energy calculations that improves sampling efficiency and accuracy, especially with small sample sizes, by optimizing Hamiltonian transformation sequences.

Contribution

It develops a non-linear Hamiltonian transformation framework that enhances free energy estimation methods like BAR, extending their applicability and efficiency.

Findings

01 An order of magnitude less sampling is needed than with established methods (Lennard-Jones gas test case)

02 The derived sequences are optimal for both FEP and BAR

03 The framework generalizes BAR to small sample sizes and non-Gaussian error distributions

Abstract

Free energy calculations based on atomistic Hamiltonians and sampling are key to a first principles understanding of biomolecular processes, material properties, and macromolecular chemistry. Here, we generalize the Free Energy Perturbation method and derive non-linear Hamiltonian transformation sequences for optimal sampling accuracy that differ markedly from established linear transformations. We show that our sequences are also optimal for the Bennett Acceptance Ratio (BAR) method, and our unifying framework generalizes BAR to small sampling sizes and non-Gaussian error distributions. Simulations on a Lennard-Jones gas show that an order of magnitude less sampling is required compared to established methods.

Equations

$$\Delta G_{1,N}=-\ln\left\langle e^{-[H_{N}(\mathbf{x})-H_{1}(\mathbf{x})]}\right\rangle_{1}, \tag{1}$$

$$H_{s}(\mathbf{x})=(1-\lambda_{s})H_{1}(\mathbf{x})+\lambda_{s}H_{N}(\mathbf{x}),\quad\lambda_{s}\in[0,1], \tag{2}$$

$$\Delta G_{1,N}=\sum_{s=1}^{N-1}\Delta G_{s,s+1}. \tag{3}$$

$$\sigma^{2}=E\left[\left(\Delta G_{1,N}-\sum_{\substack{s=1\\ s\,\mathrm{odd}}}^{N-2}\left(\Delta G_{s\rightarrow s+1}^{(n)}-\Delta G_{s+2\rightarrow s+1}^{(n)}\right)\right)^{2}\right]. \tag{4}$$

$$E\left[\Delta G_{s\rightarrow s+1}^{(n)}\right]=-\int p_{s}(\mathbf{x}_{1})\,d\mathbf{x}_{1}\ldots\int p_{s}(\mathbf{x}_{n})\,d\mathbf{x}_{n}\,\ln\left[\frac{1}{n}\sum_{i=1}^{n}e^{-(H_{s+1}(\mathbf{x}_{i})-H_{s}(\mathbf{x}_{i}))}\right], \tag{5}$$

$$E\left[\left(\Delta G_{s\rightarrow s+1}^{(n)}\right)^{2}\right]=\int p_{s}(\mathbf{x}_{1})\,d\mathbf{x}_{1}\ldots\int p_{s}(\mathbf{x}_{n})\,d\mathbf{x}_{n}\left(\ln\left[\frac{1}{n}\sum_{i=1}^{n}e^{-(H_{s+1}(\mathbf{x}_{i})-H_{s}(\mathbf{x}_{i}))}\right]\right)^{2}. \tag{6}$$

$$\Delta G_{s,s+1}=-\ln\int e^{-(H_{s+1}(\mathbf{x})-H_{s}(\mathbf{x}))}p_{s}(\mathbf{x})\,d\mathbf{x}. \tag{7}$$

$$\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}=\Delta G_{s\rightarrow s+1}^{(n)}-C_{s+1}+C_{s}, \tag{8}$$

$$E\left[\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}\right]=\Delta G_{s^{\prime},(s+1)^{\prime}}. \tag{9}$$

$$E\left[\Delta G_{s^{\prime}\rightarrow t^{\prime}}^{(n)}\cdot\Delta G_{u^{\prime}\rightarrow v^{\prime}}^{(n)}\right]=E\left[\Delta G_{s^{\prime}\rightarrow t^{\prime}}^{(n)}\right]E\left[\Delta G_{u^{\prime}\rightarrow v^{\prime}}^{(n)}\right]=\Delta G_{s^{\prime},t^{\prime}}\,\Delta G_{u^{\prime},v^{\prime}}, \tag{10}$$

$$E\left[\left(\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}\right)^{2}\right]=\frac{1}{n}\int e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s}^{\prime}(\mathbf{x}))}p_{s}(\mathbf{x})\,d\mathbf{x}+f_{s^{\prime}}\left(\Delta G_{s^{\prime},(s+1)^{\prime}}\right). \tag{11}$$

$$\sigma^{2}=\sum_{\substack{s=1\\ s\,\mathrm{odd}}}^{N-2}\frac{1}{n}\left(\int p_{s}(\mathbf{x})\,e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s}^{\prime}(\mathbf{x}))}\,d\mathbf{x}+\int p_{s+2}(\mathbf{x})\,e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s+2}^{\prime}(\mathbf{x}))}\,d\mathbf{x}+g_{s^{\prime}}\left(\Delta G_{s^{\prime},(s+1)^{\prime}},\Delta G_{(s+2)^{\prime},(s+1)^{\prime}}\right)\right), \tag{12}$$

$$\frac{\partial}{\partial H_{s}(\mathbf{x})}\left(\sigma^{2}+\nu\int\left(e^{-H_{s}(\mathbf{x})}-Z_{s}\right)d\mathbf{x}\right)\overset{!}{=}0 \tag{13}$$

$$H_{s}(\mathbf{x})=-\frac{1}{2}\ln\left(e^{-2(H_{s-1}(\mathbf{x})-C_{s-1})}+e^{-2(H_{s+1}(\mathbf{x})-C_{s+1})}\right), \tag{14}$$

$$H_{s}(\mathbf{x})=\ln\left(e^{H_{s-1}(\mathbf{x})-C_{s-1}}+e^{H_{s+1}(\mathbf{x})-C_{s+1}}\right). \tag{15}$$

$$e^{-2H_{s}(\mathbf{x})}=e^{-2H_{s-1}(\mathbf{x})}\cdot r_{s-1,s}^{-2}+e^{-2H_{s+1}(\mathbf{x})}\cdot r_{s+1,s}^{-2} \tag{16}$$

$$e^{H_{s}(\mathbf{x})}=e^{H_{s-1}(\mathbf{x})}\cdot r_{s-1,s}+e^{H_{s+1}(\mathbf{x})}\cdot r_{s+1,s} \tag{17}$$

$$\Delta G_{1,3}^{(n)}=\Delta G_{1\rightarrow 2}^{(n)}-\Delta G_{3\rightarrow 2}^{(n)} \tag{18}$$

$$=-\ln\left\langle e^{-[H_{2}(\mathbf{x})-H_{1}(\mathbf{x})]}\right\rangle_{1}+\ln\left\langle e^{-[H_{2}(\mathbf{x})-H_{3}(\mathbf{x})]}\right\rangle_{3}. \tag{19}$$

$$e^{-(\Delta G_{1,3}-C)}=\left\langle\frac{1}{1+e^{H_{3}(\mathbf{x})-H_{1}(\mathbf{x})-C}}\right\rangle_{1}\Big/\left\langle\frac{1}{1+e^{H_{1}(\mathbf{x})-H_{3}(\mathbf{x})+C}}\right\rangle_{3}, \tag{20}$$

$$\hat{H}_{s}(\mathbf{x})=-\frac{1}{2}\ln\left[(1-\zeta_{s})e^{-2H_{1}(\mathbf{x})}+\zeta_{s}e^{-2(H_{\widetilde{N}}(\mathbf{x})-C)}\right], \tag{21}$$

$$H(r)=4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12}-\left(\frac{\sigma}{r}\right)^{6}\right]$$


Full text

Martin Reinhardt

Helmut Grubmüller

Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany


Free energy calculations provide essential insights into numerous physical and biochemical systems. Examples of applications range from predicting binding processes of biomolecules for drug design Williams-Noonan et al. (2018); Cournia et al. (2017); Christ and Fox (2014) to determining thermodynamic properties of crystalline materials Swinburne and Marinica (2018); Freitas et al. (2018); de Koning et al. (1999). For large and complex systems with slow relaxation rates and typically $10^{5}$ to $10^{7}$ particles, only limited accuracy is achieved Zuckerman and Woolf (2002), despite substantial methodological progress Jarzynski (1997); Vaikuntanathan and Jarzynski (2008); Valsson and Parrinello (2014); Shirts et al. (2003) and immense computational effort. Besides force field inaccuracies, insufficient sampling is the main bottleneck Aldeghi et al. (2018). Here, we develop and evaluate a variational approach for optimal sampling that minimizes the sampling error.

Given the Hamiltonians $H_{1}(\mathbf{x})$ and $H_{N}(\mathbf{x})$ of two states $1$ and $N$, where $\mathbf{x}\in\mathbb{R}^{3M}$ denotes the positions of all $M$ particles of the simulation system, the free energy difference $\Delta G_{1,N}$ between these states is given by the Zwanzig formula Zwanzig (1954),

$$\Delta G_{1,N}=-\ln\left\langle e^{-[H_{N}(\mathbf{x})-H_{1}(\mathbf{x})]}\right\rangle_{1}, \tag{1}$$

where $\langle\rangle_{1}$ denotes an ensemble average defined by $H_{1}(\mathbf{x})$, which is approximated by averaging over a finite sample of size $n$ obtained from atomistic simulations or Monte Carlo sampling. For ease of notation, $k_{B}T=1$.
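As an illustration of the finite-sample estimator behind Eq. (1), the following is a minimal sketch in Python with $k_{B}T=1$. The two one-dimensional harmonic end states, the sample size, and the function name `fep_estimate` are assumptions chosen for the example (they are not taken from the paper); for this pair the exact result is $\Delta G_{1,N}=\ln 2$.

```python
import math
import random

def fep_estimate(delta_h):
    """Return -ln(mean(exp(-dH))) for energy differences dH = H_N(x) - H_1(x)."""
    shift = min(delta_h)  # after shifting, the largest Boltzmann factor is exp(0)
    mean_w = sum(math.exp(-(d - shift)) for d in delta_h) / len(delta_h)
    return shift - math.log(mean_w)

# Hypothetical end states: H_1 = x^2/2 and H_N = 2 x^2, so Delta G = ln 2 exactly.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # independent draws from p_1
delta_g = fep_estimate([2.0 * x * x - 0.5 * x * x for x in samples])
print(delta_g, math.log(2.0))
```

Subtracting the minimum energy difference before exponentiating only guards against overflow; it does not change the estimate.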

Alchemical transformations substantially reduce sampling errors Lu and Kofke (2001a, b) by introducing $N-2$ intermediate states $s$ ,

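$$H_{s}(\mathbf{x})=(1-\lambda_{s})H_{1}(\mathbf{x})+\lambda_{s}H_{N}(\mathbf{x}),\quad\lambda_{s}\in[0,1], \tag{2}$$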

and accumulating small free energy differences between all adjacent states $s$ and $s+1$ ,

$$\Delta G_{1,N}=\sum_{s=1}^{N-1}\Delta G_{s,s+1}. \tag{3}$$

This technique is also employed in other fields, for example in the context of Bayesian statistics, where the plausibility of two different models is compared by calculating their marginal likelihood ratio Gelman and Meng (1998); Habeck (2012). With few exceptions Christ and Van Gunsteren (2007); Pham and Shirts (2012), only the linear interpolation between $H_{1}$ and $H_{N}$ of Eq. (2) is used, which is illustrated for a simple one-dimensional case in Fig. 1(a).
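The stratification of Eqs. (2) and (3) can be sketched as follows, again as a hedged toy example rather than the authors' setup: the harmonic end states $H_{1}=x^{2}/2$ and $H_{N}=2x^{2}$, the $\lambda$ grid, and the helper names are assumptions. For this choice every linearly interpolated state is Gaussian and can be sampled exactly, and the accumulated steps should approach $\ln 2$.

```python
import math
import random

def fep_estimate(delta_h):
    """Return -ln(mean(exp(-dH))) with a shift for numerical stability."""
    shift = min(delta_h)
    mean_w = sum(math.exp(-(d - shift)) for d in delta_h) / len(delta_h)
    return shift - math.log(mean_w)

def stratified_fep(lambdas, n, rng):
    """Sum forward FEP steps along H_s = (1 - l) H_1 + l H_N, Eqs. (2) and (3)."""
    total = 0.0
    for lam, lam_next in zip(lambdas[:-1], lambdas[1:]):
        # H_s = (0.5 + 1.5 l) x^2, so p_s is Gaussian with variance 1 / (1 + 3 l).
        sd = 1.0 / math.sqrt(1.0 + 3.0 * lam)
        xs = [rng.gauss(0.0, sd) for _ in range(n)]
        # H_{s+1} - H_s = (l_{s+1} - l_s) (H_N - H_1) = (l_{s+1} - l_s) * 1.5 x^2
        total += fep_estimate([(lam_next - lam) * 1.5 * x * x for x in xs])
    return total

rng = random.Random(0)
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]  # N = 5 states, three intermediates
delta_g = stratified_fep(lambdas, 5000, rng)
print(delta_g, math.log(2.0))
```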

Here, we will generalize this linear interpolation for two of the most established methods, the Free Energy Perturbation (FEP) Zwanzig (1954) and the Bennett Acceptance Ratio (BAR) method Bennett (1976). Specifically, we ask which sequence $H_{2}(\mathbf{x})\ldots H_{N-1}(\mathbf{x})$ amongst all possible functionals $\{H_{s}[H_{1},H_{N}]\}$ yields, on average, the highest accuracy. Figure 1(b) and 1(c) show such a general interpolation sequence, which we refer to as the Variational Morphing Free Energy (VMFE) method. Unexpectedly, the result will also turn out to be a generalization of BAR to any $n$ and $N$.

Note that our approach differs from previous attempts, such as soft-core potentials Steinbrecher et al. (2007), where ad hoc functionals are used. For linear interpolations (Eq. (2)), the distribution of $\lambda$ points has been optimized Naden et al. (2014), which is also not the general solution we aim for.

To solve the above variational problem and to find the optimal sequence of $H_{s}$, we consider the FEP scheme, displayed in Fig. 2(a), as one possible implementation of Eq. (3) using Eq. (1). In this particular variant, which is symmetric with respect to exchange of the two end states to avoid hysteresis effects, sample points are solely drawn from the odd-numbered 'sampling states', and not from the even-numbered 'target states'. The average accuracy of this scheme is the average over all sampling realizations of the mean-squared deviation (MSD) of the free energy difference $\Delta G_{1,N}^{(n)}$ from the exact difference $\Delta G_{1,N}$,

$$\sigma^{2}=E\left[\left(\Delta G_{1,N}-\sum_{\substack{s=1\\ s\,\mathrm{odd}}}^{N-2}\left(\Delta G_{s\rightarrow s+1}^{(n)}-\Delta G_{s+2\rightarrow s+1}^{(n)}\right)\right)^{2}\right]. \tag{4}$$

As in Fig. 2, the arrows point from sampling to target states.
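The averaging over sampling realizations that defines the MSD can be mimicked numerically. The sketch below is a simplified illustration, not the symmetric scheme of Fig. 2(a): it uses a single forward FEP step between the hypothetical harmonic states $H_{1}=x^{2}/2$ and $H_{2}=2x^{2}$ (exact difference $\ln 2$), and the names `msd_of_fep`, the sample sizes, and the number of realizations are assumptions.

```python
import math
import random

def fep_estimate(delta_h):
    """Return -ln(mean(exp(-dH))) with a shift for numerical stability."""
    shift = min(delta_h)
    mean_w = sum(math.exp(-(d - shift)) for d in delta_h) / len(delta_h)
    return shift - math.log(mean_w)

def msd_of_fep(n, realizations, rng):
    """Mean-squared deviation of n-sample FEP estimates from the exact value."""
    exact = math.log(2.0)
    total = 0.0
    for _ in range(realizations):
        # One realization: n independent points from p_1, dH = 1.5 x^2.
        delta_h = [1.5 * rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
        total += (fep_estimate(delta_h) - exact) ** 2
    return total / realizations

rng = random.Random(1)
msd_small = msd_of_fep(10, 400, rng)
msd_large = msd_of_fep(1000, 400, rng)
print(msd_small, msd_large)
```

Because the MSD is taken with respect to the exact value, it contains both the variance and the finite-sample bias of the estimator.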

Assuming for each sample state $s$ a set of $n$ independent sample points $\{\mathbf{x}_{i}\}$ , drawn from ${p_{s}(\mathbf{x})=e^{-H_{s}(\mathbf{x})}/Z_{s}}$ , with partition function $Z_{s}$ , the terms arising from expanding Eq. (4) will be considered one by one. For the linear term, the average over all sample realizations reads

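$$E\left[\Delta G_{s\rightarrow s+1}^{(n)}\right]=-\int p_{s}(\mathbf{x}_{1})\,d\mathbf{x}_{1}\ldots\int p_{s}(\mathbf{x}_{n})\,d\mathbf{x}_{n}\,\ln\left[\frac{1}{n}\sum_{i=1}^{n}e^{-(H_{s+1}(\mathbf{x}_{i})-H_{s}(\mathbf{x}_{i}))}\right], \tag{5}$$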

and for the quadratic term

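$$E\left[\left(\Delta G_{s\rightarrow s+1}^{(n)}\right)^{2}\right]=\int p_{s}(\mathbf{x}_{1})\,d\mathbf{x}_{1}\ldots\int p_{s}(\mathbf{x}_{n})\,d\mathbf{x}_{n}\left(\ln\left[\frac{1}{n}\sum_{i=1}^{n}e^{-(H_{s+1}(\mathbf{x}_{i})-H_{s}(\mathbf{x}_{i}))}\right]\right)^{2}. \tag{6}$$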

Similar expressions are obtained for $\Delta G_{s+2\rightarrow s+1}^{(n)}$ . The exact free energy differences are

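$$\Delta G_{s,s+1}=-\ln\int e^{-(H_{s+1}(\mathbf{x})-H_{s}(\mathbf{x}))}p_{s}(\mathbf{x})\,d\mathbf{x}. \tag{7}$$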

For shifted Hamiltonians ${H_{s}^{\prime}(\mathbf{x})=H_{s}(\mathbf{x})-C_{s}}$ and ${H_{s+1}^{\prime}(\mathbf{x})=H_{s+1}(\mathbf{x})-C_{s+1}}\,$ , Eq. (1) yields

$$\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}=\Delta G_{s\rightarrow s+1}^{(n)}-C_{s+1}+C_{s}, \tag{8}$$

which also holds for $\Delta G_{s^{\prime},(s+1)^{\prime}}$. Because these offsets cancel out in Eq. (4), the accuracy $\sigma$ is invariant under any choice of offsets $C_{s}$ and $C_{s+1}$. Choosing $C_{s}$ and $C_{s+1}$ such that the term in the logarithm of Eqs. (5) and (6) is close to one, and thus all $\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}$ are small with respect to $k_{B}T=1$, a first-order expansion of the logarithm allows the integrals to be factorized, and therefore

$$E\left[\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}\right]=\Delta G_{s^{\prime},(s+1)^{\prime}}. \tag{9}$$

For the cross terms in Eq. (4), note that the estimated free energy differences of the individual steps are based on uncorrelated sample sets, and therefore

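$$E\left[\Delta G_{s^{\prime}\rightarrow t^{\prime}}^{(n)}\cdot\Delta G_{u^{\prime}\rightarrow v^{\prime}}^{(n)}\right]=E\left[\Delta G_{s^{\prime}\rightarrow t^{\prime}}^{(n)}\right]E\left[\Delta G_{u^{\prime}\rightarrow v^{\prime}}^{(n)}\right]=\Delta G_{s^{\prime},t^{\prime}}\,\Delta G_{u^{\prime},v^{\prime}}, \tag{10}$$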

for $(s^{\prime}\rightarrow t^{\prime})\neq(u^{\prime}\rightarrow v^{\prime})$ . Using Eq. (9), Eq. (6) yields

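$$E\left[\left(\Delta G_{s^{\prime}\rightarrow(s+1)^{\prime}}^{(n)}\right)^{2}\right]=\frac{1}{n}\int e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s}^{\prime}(\mathbf{x}))}p_{s}(\mathbf{x})\,d\mathbf{x}+f_{s^{\prime}}\left(\Delta G_{s^{\prime},(s+1)^{\prime}}\right). \tag{11}$$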

Inserting Eqs. (9) and (11) into Eq. (4),

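$$\sigma^{2}=\sum_{\substack{s=1\\ s\,\mathrm{odd}}}^{N-2}\frac{1}{n}\left(\int p_{s}(\mathbf{x})\,e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s}^{\prime}(\mathbf{x}))}\,d\mathbf{x}+\int p_{s+2}(\mathbf{x})\,e^{-2(H_{s+1}^{\prime}(\mathbf{x})-H_{s+2}^{\prime}(\mathbf{x}))}\,d\mathbf{x}+g_{s^{\prime}}\left(\Delta G_{s^{\prime},(s+1)^{\prime}},\Delta G_{(s+2)^{\prime},(s+1)^{\prime}}\right)\right), \tag{12}$$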

where $f_{s^{\prime}}$ and $g_{s^{\prime}}$ denote expressions that only depend on exact free energy differences and thus are dropped for the optimization below.

With these expressions, the variational problem can be solved analytically. For the odd-numbered states $s$ , variation of $\sigma^{2}$ , Eq. (12),

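$$\frac{\partial}{\partial H_{s}(\mathbf{x})}\left(\sigma^{2}+\nu\int\left(e^{-H_{s}(\mathbf{x})}-Z_{s}\right)d\mathbf{x}\right)\overset{!}{=}0 \tag{13}$$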

yields

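$$H_{s}(\mathbf{x})=-\frac{1}{2}\ln\left(e^{-2(H_{s-1}(\mathbf{x})-C_{s-1})}+e^{-2(H_{s+1}(\mathbf{x})-C_{s+1})}\right), \tag{14}$$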

where $Z_{s}=\int e^{-H_{s}(\mathbf{x})}d\mathbf{x}$ is the (finite) partition sum and $\nu$ is a Lagrange multiplier.

Similarly, for the even-numbered states,

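$$H_{s}(\mathbf{x})=\ln\left(e^{H_{s-1}(\mathbf{x})-C_{s-1}}+e^{H_{s+1}(\mathbf{x})-C_{s+1}}\right). \tag{15}$$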

An additive term $C_{s}$ in Eqs. (14) and (15) was omitted, as it cancels in ${\Delta G_{s-1\rightarrow s}^{(n)}-\Delta G_{s+1\rightarrow s}^{(n)}}$ . The result is a set of equations for all states $s$ for which each Hamiltonian $H_{s}(\mathbf{x})$ depends only on the two adjacent states. The initial requirement for small $\Delta G^{(n)}_{s^{\prime}\rightarrow(s+1)^{\prime}}$ is fulfilled by setting $C_{s}=-\ln Z_{s}\,$ , as in this case, all $Z_{s}^{\prime}$ are one. Rearranging terms for odd $s$ ,

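$$e^{-2H_{s}(\mathbf{x})}=e^{-2H_{s-1}(\mathbf{x})}\cdot r_{s-1,s}^{-2}+e^{-2H_{s+1}(\mathbf{x})}\cdot r_{s+1,s}^{-2} \tag{16}$$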

and for even $s$ ,

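$$e^{H_{s}(\mathbf{x})}=e^{H_{s-1}(\mathbf{x})}\cdot r_{s-1,s}+e^{H_{s+1}(\mathbf{x})}\cdot r_{s+1,s} \tag{17}$$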

with $r_{s,t}=Z_{s}/Z_{t}$ . The first main result of this letter is the resulting sequence of Hamiltonians that yields the best accuracy for FEP free energy calculations.
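To make the point-wise character of Eqs. (16) and (17) concrete, the following minimal sketch evaluates an intermediate Hamiltonian at a single configuration from the energies of its two neighbors. The function names and argument conventions are assumptions made for illustration; `r_prev` and `r_next` stand for $r_{s-1,s}$ and $r_{s+1,s}$, and the sums of exponentials are evaluated in log space to avoid overflow.

```python
import math

def sampling_state_h(h_prev, h_next, r_prev, r_next):
    """Eq. (16), odd s: exp(-2 H_s) = exp(-2 H_{s-1}) r_prev^-2 + exp(-2 H_{s+1}) r_next^-2."""
    a = -2.0 * h_prev - 2.0 * math.log(r_prev)
    b = -2.0 * h_next - 2.0 * math.log(r_next)
    m = max(a, b)
    return -0.5 * (m + math.log(math.exp(a - m) + math.exp(b - m)))

def target_state_h(h_prev, h_next, r_prev, r_next):
    """Eq. (17), even s: exp(H_s) = exp(H_{s-1}) r_prev + exp(H_{s+1}) r_next."""
    a = h_prev + math.log(r_prev)
    b = h_next + math.log(r_next)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Equal neighbor energies and r = 1: the two rules shift H in opposite directions.
h_odd = sampling_state_h(1.0, 1.0, 1.0, 1.0)   # 1 - ln(2)/2
h_even = target_state_h(1.0, 1.0, 1.0, 1.0)    # 1 + ln(2)
print(h_odd, h_even)
```

In this form a sampling state is dominated by the lower-energy neighbor at each configuration, whereas a target state is dominated by the higher-energy one.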

The second main result is that Eq. (15) serves to generalize the BAR method. The latter follows from Eq. (15) for $N=3$ with one intermediate state: Applied to the two involved free energy differences, the Zwanzig formula yields

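$$\Delta G_{1,3}^{(n)}=\Delta G_{1\rightarrow 2}^{(n)}-\Delta G_{3\rightarrow 2}^{(n)} \tag{18}$$

$$=-\ln\left\langle e^{-[H_{2}(\mathbf{x})-H_{1}(\mathbf{x})]}\right\rangle_{1}+\ln\left\langle e^{-[H_{2}(\mathbf{x})-H_{3}(\mathbf{x})]}\right\rangle_{3}. \tag{19}$$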

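Inserting Eq. (15) as the target state Hamiltonian $H_{2}(\mathbf{x})$ yields the BAR formula

$$e^{-(\Delta G_{1,3}-C)}=\left\langle\frac{1}{1+e^{H_{3}(\mathbf{x})-H_{1}(\mathbf{x})-C}}\right\rangle_{1}\Big/\left\langle\frac{1}{1+e^{H_{1}(\mathbf{x})-H_{3}(\mathbf{x})+C}}\right\rangle_{3}, \tag{20}$$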

with $C=C_{3}-C_{1}$ .
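A minimal sketch of how the BAR formula, Eq. (20), can be evaluated from samples is given below. It assumes equal sample sizes in both end states and determines $C$ by the self-consistent iteration $C\leftarrow\Delta G$; the harmonic end states $H_{1}=x^{2}/2$ and $H_{3}=2x^{2}$ (exact result $\ln 2$), the tolerance, and the function names are assumptions for illustration only.

```python
import math
import random

def fermi(t):
    """1 / (1 + exp(t)), written so that large |t| cannot overflow."""
    if t > 0.0:
        e = math.exp(-t)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(t))

def bar_estimate(dh_fwd, dh_rev, c=0.0, tol=1e-10, max_iter=200):
    """Iterate Eq. (20). dh_fwd: H_3 - H_1 at x ~ p_1; dh_rev: H_1 - H_3 at x ~ p_3."""
    for _ in range(max_iter):
        num = sum(fermi(d - c) for d in dh_fwd) / len(dh_fwd)
        den = sum(fermi(d + c) for d in dh_rev) / len(dh_rev)
        delta_g = c - math.log(num / den)   # from exp(-(dG - C)) = num / den
        if abs(delta_g - c) < tol:
            return delta_g
        c = delta_g                          # self-consistent update C <- dG
    return c

rng = random.Random(0)
x1 = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # p_1: variance 1
x3 = [rng.gauss(0.0, 0.5) for _ in range(5000)]   # p_3: variance 1/4
delta_g = bar_estimate([1.5 * x * x for x in x1], [-1.5 * x * x for x in x3])
print(delta_g, math.log(2.0))
```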

Notably, the above derivation yields the more general result that Eq. (20) provides the most accurate free energy estimate also for finite and small $n$, even down to $n=1$ given sufficient configuration space density overlap between adjacent states, which is fulfilled, for instance, in the limit of many intermediates. In contrast, because the derivation by Bennett Bennett (1976) strictly holds only for infinite sampling, so far $n$ was required to be large, and proper convergence had to be assumed. Further, in the original derivation Bennett (1976) the error distribution of the free energy estimates had to be assumed to be Gaussian, which in our above result is also not required. In the context of the Overlap Sampling method Lu et al. (2004), it has been shown that an FEP intermediate can be defined that yields the weighting function from Bennett's derivation; the above results prove that this intermediate is indeed optimal for the FEP scheme.

Further generalizing the BAR result, Eqs. (16) and (17) yield optimal VMFE intermediates for any (odd) number $N-2$ of intermediate states, as illustrated in Fig. 2: For any two sampling states, using BAR and using FEP with the optimal target state of Eq. (17) is equivalent. Applied recursively, therefore, the $\widetilde{N}=(N+1)/2$ sampling states from any sequence of $N$ FEP-optimal Hamiltonians $\{H_{s}(\mathbf{x})\}$ are also optimal for multistate BAR (MBAR) Shirts and Chodera (2008), where so far, too, only empirically determined linear interpolations have been used as intermediate states. This result, therefore, is a generalization to MBAR.

Conversely, for the setup of one sampling state between two given target end states $1$ and $3$ , with remarkable intuition an empirical potential has been proposed Christ and Van Gunsteren (2007) in the Envelope Distribution Sampling (EDS) method, which is similar to Eq. (14) except for a factor of two in the exponent. In summary, both BAR/MBAR and EDS are special cases of, or approximations to, our more general variational VMFE result that also requires fewer assumptions.

To solve Eqs. (16) and (17) for the optimal intermediate Hamiltonians $H_{s}(\mathbf{x})$, note that the unknown free energy differences $\Delta G_{s,t}=-\ln r_{t,s}$ are part of the equations which, therefore, have to be solved iteratively. With an initial guess for all $r_{s,t}$, the set of equations is solved in a point-wise fashion for any given $\mathbf{x}$. After sampling all odd-numbered states, the $r_{s,t}$ values are updated iteratively, such that the sequence of intermediate states converges towards the optimum. For a typical biomolecular many-body system, the additional computational effort is small compared to computing $H_{1}(\mathbf{x})$ and $H_{N}(\mathbf{x})$.

For the above illustrative example, Fig. 1(b) and (c) show the optimized Hamiltonians and the configuration space densities, respectively, of the converged sequence of intermediate states. To this end, initial values $r_{s,t}=1$ were used and Eqs. (16) and (17) were iterated until convergence, using numerical integration over $\mathbf{x}$ and updating the $r_{s,t}$ during the process. Unlike the linear interpolations shown in Fig. 1(a), the variational morphing sequence leads to a probability density, which gradually decreases in the region of $A$ and increases in the region of $B$ , while remaining almost constant at the point of maximum configuration space overlap.
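The iteration described above can be sketched on a one-dimensional grid. The example below is a hedged illustration under several assumptions: the end states are hypothetical harmonic wells (not the potentials of Fig. 1), $N=5$, the initial guess consists of mixtures of the end-state densities rather than $r_{s,t}=1$, and a fixed number of sweeps stands in for a convergence check. It works with normalized densities, for which Eq. (14) with $C_{s}=-\ln Z_{s}$ reads $p_{s}^{2}\propto p_{s-1}^{2}+p_{s+1}^{2}$ and Eq. (15) reads $1/p_{s}\propto 1/p_{s-1}+1/p_{s+1}$; renormalizing after each update fixes the additive constants that these equations leave free.

```python
import math

def normalize(p, dx):
    """Rescale a tabulated density so that it integrates to one on the grid."""
    z = sum(p) * dx
    return [v / z for v in p]

def target_density(p_a, p_b):
    """Even s, Eq. (15) in density form: 1/p_s = 1/p_a + 1/p_b (unnormalized)."""
    return [a * b / (a + b) for a, b in zip(p_a, p_b)]

def sampling_density(p_a, p_b):
    """Odd s, Eq. (14) in density form: p_s^2 = p_a^2 + p_b^2 (unnormalized)."""
    return [math.sqrt(a * a + b * b) for a, b in zip(p_a, p_b)]

dx = 0.02
xs = [-6.0 + i * dx for i in range(701)]                       # grid on [-6, 8]
p1 = normalize([math.exp(-0.5 * x * x) for x in xs], dx)       # H_1 = x^2 / 2
p5 = normalize([math.exp(-2.0 * (x - 2.0) ** 2) for x in xs], dx)  # H_5 = 2 (x - 2)^2

# Assumed initial guess: mixtures of the two end-state densities.
p2 = [0.75 * a + 0.25 * b for a, b in zip(p1, p5)]
p3 = [0.50 * a + 0.50 * b for a, b in zip(p1, p5)]
p4 = [0.25 * a + 0.75 * b for a, b in zip(p1, p5)]

for _ in range(100):  # fixed number of sweeps over the intermediates
    p2 = normalize(target_density(p1, p3), dx)
    p3 = normalize(sampling_density(p2, p4), dx)
    p4 = normalize(target_density(p3, p5), dx)

mean3 = sum(x * p for x, p in zip(xs, p3)) * dx
print(mean3)
```

The intermediate Hamiltonians follow from the tabulated densities as $H_{s}(x)=-\ln p_{s}(x)$ up to an additive constant.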

Figure 3(a) shows the results of numerical simulations using the one-dimensional test case shown in Fig. 1. Different minimum distances $x_{0}$ are used, thereby varying configuration space overlaps $K=\int_{-\infty}^{\infty}\min\left(p_{1}(\mathbf{x}),p_{N}(\mathbf{x})\right)\mathrm{d}\mathbf{x}$ between the end states, indicated by the yellow area in Fig. 1(b). Sets of $n=100$ uncorrelated sample points are drawn from $p_{s}(\mathbf{x})$ through rejection sampling. $\widetilde{N}=3$ sampling states are used with BAR. For each $K$, the accuracy (Eq. (4)) is calculated by averaging over 600,000 realizations.
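The overlap $K$ is straightforward to evaluate on a grid. The sketch below is an illustration under assumed end states, two unit-variance Gaussian densities separated by $x_{0}$ (not the potentials used for Fig. 1), for which the closed form $K=\mathrm{erfc}\left(x_{0}/(2\sqrt{2})\right)$ is available as a check.

```python
import math

def overlap(p_a, p_b, dx):
    """Grid approximation of K = integral of min(p_a, p_b) dx."""
    return sum(min(a, b) for a, b in zip(p_a, p_b)) * dx

def gaussian(x, mu):
    """Unit-variance normal density centered at mu."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

dx = 0.01
xs = [-10.0 + i * dx for i in range(2401)]   # grid on [-10, 14]
x0 = 2.0                                      # assumed separation of the minima
k = overlap([gaussian(x, 0.0) for x in xs], [gaussian(x, x0) for x in xs], dx)
k_exact = math.erfc(x0 / (2.0 * math.sqrt(2.0)))
print(k, k_exact)
```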

VMFE (blue curve) yields the smallest MSD for all $K$, compared to both the first linear interpolation variant (light green) using a linearly spaced $\lambda_{2}=\frac{1}{2}$, as in a typical free energy calculation, and even compared to the second variant (dark green) using the empirically determined $\lambda_{2}$ value that yields the best accuracy that can be achieved by linear interpolation. For more details, see Supplementary Material. The largest improvements of VMFE are seen for small configuration space density overlaps, which notoriously cause the largest uncertainties.

Figure 3(b) shows how the accuracy of VMFE improves with increasing number of states $\widetilde{N}$ , keeping the total number of sample points, and hence the total computational effort, constant. For this example, the accuracy increases up to $\widetilde{N}=5$ , beyond which no further improvement appears.

In the above VMFE scheme, Eqs. (16) and (17) couple all intermediates, so the sampling simulations cannot be run in parallel in a straightforward way. This limitation is overcome by two approximations. First, the sampling states are coupled directly using only Eq. (16). Therefore, while BAR is still used between two adjacent sampling states, the corresponding target states are not used to derive them. Second, Eq. (16) is solved recursively, i.e., the optimal sampling state $H_{\widetilde{N}/2}$ is determined first from $H_{1}$ and $H_{\widetilde{N}}$, then $H_{\widetilde{N}/4}$ from $H_{1}$ and $H_{\widetilde{N}/2}$, as well as $H_{3\widetilde{N}/4}$ from $H_{\widetilde{N}/2}$ and $H_{\widetilde{N}}$, and so on. As a result, the approximate intermediate Hamiltonians read

$$\hat{H}_{s}(\mathbf{x})=-\frac{1}{2}\ln\left[(1-\zeta_{s})e^{-2H_{1}(\mathbf{x})}+\zeta_{s}e^{-2(H_{\widetilde{N}}(\mathbf{x})-C)}\right], \tag{21}$$

with prefactors $\zeta_{s}$ recursively determined, using Eq. (16), such that all $\hat{H}_{s}(\mathbf{x})$ are a functional of only $H_{1}$ and $H_{\widetilde{N}}$ . As above, $C\approx\Delta G$ is determined iteratively. Consequently, no prior knowledge of the differences between the individual states is required, and therefore, the sampling simulations for each state can be run in parallel without communication.
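The approximate intermediate of Eq. (21) depends only on the two end-state energies at a configuration, which the following minimal sketch makes explicit. The function name `vmfe_hat_h` and the test values are assumptions for illustration; the sum is evaluated in log space so that very large end-state energies do not cause overflow.

```python
import math

def vmfe_hat_h(h1, hn, zeta, c=0.0):
    """Eq. (21): -1/2 ln[(1 - zeta) exp(-2 H_1) + zeta exp(-2 (H_N - C))]."""
    terms = []
    if zeta < 1.0:
        terms.append(math.log(1.0 - zeta) - 2.0 * h1)
    if zeta > 0.0:
        terms.append(math.log(zeta) - 2.0 * (hn - c))
    m = max(terms)
    return -0.5 * (m + math.log(sum(math.exp(t - m) for t in terms)))

# The end points recover the end states (shifted by C at zeta = 1).
h_start = vmfe_hat_h(3.0, 7.0, 0.0)
h_end = vmfe_hat_h(3.0, 7.0, 1.0, c=0.5)

# Where one end state is strongly repulsive, the intermediate follows the other.
h_mid = vmfe_hat_h(1.0e6, 0.0, 0.5)
h_linear = 0.5 * 1.0e6 + 0.5 * 0.0   # linear interpolation, Eq. (2), for comparison
print(h_start, h_end, h_mid, h_linear)
```

For comparison, the linear interpolation of Eq. (2) at the same configuration remains at half of the repulsive energy, whereas the exponential form stays close to the lower-energy end state.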

Figure 4 shows a comparison between the configuration space densities $p(x)$ of the approximate intermediate Hamiltonians $\hat{H}_{s}(\mathbf{x})$ (dashed lines) with those of the optimal ${H}_{s}(\mathbf{x})$ (solid lines), corresponding to the densities in Fig. 1(c). The two sequences are indeed very similar. Even if the optimal $\zeta_{s}$ values are not known a priori, the approximated VMFE sequence covers the transition behavior of the optimal sequence well, particularly for larger numbers of intermediates.

As a more high-dimensional test case, we calculate the free energy difference between an Argon and a Helium Lennard-Jones (LJ) gas (parameters from White (1999)) with $M=20$ atoms. Fig. 5 compares the accuracy obtained by approximated VMFE with that of linearly interpolated intermediates, where the accuracy is determined through comparison to the result of a converged reference simulation. For more details, see Supplementary Material. At 5 ns, an over 4-fold improved accuracy is achieved by VMFE (green) compared to a conventional linear interpolation (red). Conversely, the accuracy achieved by linear interpolation at 5 ns is already obtained at 0.56 ns by VMFE, which thus requires almost 10 times less sampling.

Interestingly, apart from different factors in the exponent, the intermediates of the approximated sequence resemble those suggested in the context of thermodynamic integration (TI) Kirkwood (1935). Using approximations to the solution of the optimization problem for TI for several special cases Gelman and Meng (1998), an expression similar to the approximate Eq. (21) was obtained Pham and Shirts (2012); Blondel (2004). These results require a proper choice of $\lambda$ , and it is unclear if the optimal $\lambda$ states are the same for the different methods. Nevertheless, the similarity is striking and suggests that our result may also allow further improvements of TI.

In summary, we derived the optimal accuracy sequence of intermediate Hamiltonians for free energy perturbation calculations. Compared to the established linear intermediates, the accuracy improvement is substantial, especially for the critical small configuration space density overlap of the end states that are a hallmark of complex systems. The optimal sequences are fundamentally different from the linear ones, suggesting potential improvement, also for other methods that rely on intermediate states, e.g., TI or non-equilibrium methods Jarzynski (1997); Shirts et al. (2003).

VMFE was derived assuming statistically independent sampling points $\mathbf{x}_{i}$. For sampling based on atomistic simulations, however, and to a lesser extent for MC sampling, subsequent sampling points are correlated, particularly when the relevant configuration space densities are separated by large barriers. In these cases, when combined with enhanced sampling techniques, such as Hamiltonian replica exchange Swendsen and Wang (1986); Liu et al. (2005); Tan (2017), appropriate biasing potentials Grubmüller (1995); Steiner et al. (1998); Laio and Parrinello (2002), or a combination thereof, VMFE should also yield improved accuracy, although the obtained intermediate Hamiltonians will not be optimal due to the neglected time correlations. On a more fundamental level, the equivalence of FEP and BAR established here implies that advances in either of these will benefit the other.

Appendix A Supplementary Material

A.0.1 One-dimensional Test Case - Highest Accuracy Linear Interpolation

Figure 3(a) shows a comparison of the accuracy obtained by VMFE with two variants of a linearly interpolated sequence. As $\widetilde{N}=3$ , sampling is conducted in one intermediate state and the two end states. Sets of $n=100$ sample points are drawn from the corresponding $p_{s}(\mathbf{x})$ through rejection sampling, based on which a free energy estimate between the end states is calculated.

For the linearly interpolated sequence, $\lambda_{2}$ can be chosen by the user. To empirically obtain the $\lambda_{2}$ that yields the highest accuracy (dark green), we loop over the allowed range between zero and one in steps of 0.01. To reliably calculate the MSD with respect to the exact value, for each $\lambda_{2}$ 150,000 free energy estimates are calculated. Once the highest accuracy $\lambda_{2}$ is determined, the corresponding MSD is calculated once again using 600,000 repetitions. The result of these is shown in the figure. The procedure is repeated for each value of $K$ (42 values). We note that the $\lambda_{2}$ yielding the highest accuracy varies for different $K$ , and is inaccessible in practice for high-dimensional systems.

A.0.2 Lennard-Jones Gas Simulation

To compare the accuracy of the free energy estimate using a linearly interpolated sequence of states to the approximated VMFE sequence, a set of free energy calculations between an Argon and a Helium Lennard-Jones gas is conducted.

In each state, $M=20$ atoms are placed at random positions without overlap inside a cubic box. The atoms are assigned velocities drawn from the Boltzmann distribution corresponding to the temperature of $T=298$ K. The simulations are conducted in the NVT ensemble using periodic boundary conditions. The volume of the box is set to (43.5 ${\rm\AA})^{3}$ , corresponding to a pressure of about 10 bar. The atomic interaction at a distance $r$ between the centers of two atoms is described through the Lennard-Jones potential,

$$H(r)=4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12}-\left(\frac{\sigma}{r}\right)^{6}\right],$$

with parameters $\sigma=3.405$ ${\rm\AA}$ , $\epsilon=1.0446$ kJ $/$ mol and $m=39.95$ u for Argon, and $\sigma=2.64$ ${\rm\AA}$ , $\epsilon=0.0906$ kJ $/$ mol and $m=4$ u for Helium White (1999).
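For reference, the pair potential with the quoted parameters can be written as a short function. This is a minimal sketch: the function and variable names are assumptions, and the conversion to units of $k_{B}T$ uses the molar gas constant at $T=298$ K.

```python
import math

def lennard_jones(r, sigma, epsilon):
    """Lennard-Jones pair energy 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# Parameters quoted in the text (Angstrom, kJ/mol).
ARGON = {"sigma": 3.405, "epsilon": 1.0446}
HELIUM = {"sigma": 2.64, "epsilon": 0.0906}

KT = 8.314462618e-3 * 298.0   # k_B T in kJ/mol at 298 K

r_min = 2.0 ** (1.0 / 6.0) * ARGON["sigma"]          # position of the minimum
well_depth = lennard_jones(r_min, **ARGON)            # equals -epsilon
zero_crossing = lennard_jones(ARGON["sigma"], **ARGON)
print(well_depth / KT, lennard_jones(r_min, **HELIUM) / KT)
```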

At the start, an equilibration run of 1 ns is conducted. The leap-frog algorithm is used with a time step of 5 fs, and velocities are rescaled at every 20th time step. For both sequences, 800 free energy simulations are conducted with 5 ns simulation time in each state. Five intermediate states, i.e., seven states in total, are used. In the absence of further knowledge, equal spacing of $\lambda_{s}$ and $\zeta_{s}$, i.e., $\{0,0.17,0.33,0.5,0.67,0.83,1\}$, is used. For the approximated VMFE sequence, $C=0$ is used throughout the whole simulation. The difference of the Hamiltonians between adjacent states is recorded at every 400th step. Free energy differences are subsequently calculated using BAR.

A reference free energy difference is determined by conducting a long simulation with each method using 12 states with linearly spaced $\lambda_{s}$ and $\zeta_{s}$ values and computation runs of 10 $\mathrm{\mu s}$ in each state. At this length, the relative difference has decreased below $10^{-5}$ ( $\Delta G$ = 0.23252 $\mathrm{k_{B}T}$ ). Using this reference value, we calculate the MSD of the distribution of 800 free energy differences depending on the simulation time in each state.

Bibliography

The reference list from the paper itself.

1. Williams-Noonan et al. (2018): B. J. Williams-Noonan, E. Yuriev, and D. K. Chalmers, Journal of Medicinal Chemistry 61, 638 (2018).
2. Cournia et al. (2017): Z. Cournia, B. Allen, and W. Sherman, Journal of Chemical Information and Modeling 57, 2911 (2017).
3. Christ and Fox (2014): C. D. Christ and T. Fox, Journal of Chemical Information and Modeling 54, 108 (2014).
4. Swinburne and Marinica (2018): T. D. Swinburne and M. C. Marinica, Physical Review Letters 120, 135503 (2018).
5. Freitas et al. (2018): R. Freitas, R. E. Rudd, M. Asta, and T. Frolov, Physical Review Materials 2, 093603 (2018).
6. de Koning et al. (1999): M. de Koning, A. Antonelli, and S. Yip, Physical Review Letters 83, 3973 (1999).
7. Zuckerman and Woolf (2002): D. M. Zuckerman and T. B. Woolf, Physical Review Letters 89, 16 (2002).
8. Jarzynski (1997): C. Jarzynski, Physical Review Letters 78, 2690 (1997).