A microscopic description of acid-base equilibrium
Emanuele Grifoni, GiovanniMaria Piccini, Michele Parrinello

TL;DR
This paper introduces a novel computational approach combining ab initio molecular dynamics and metadynamics with new collective variables to study microscopic mechanisms of acid-base reactions in aqueous solutions.
Contribution
It presents a new method with innovative descriptors for enhanced sampling, enabling detailed microscopic insights into acid-base equilibria.
Findings
Successfully applied to acetic acid, ammonia, and bicarbonate solutions
Revealed molecular mechanisms underlying acid, base, and amphoteric behaviors
Demonstrated effectiveness of new collective variables in sampling free energy barriers
Abstract
Acid-base reactions are ubiquitous in nature. Understanding their mechanisms is crucial in many fields, from biochemistry to industrial catalysis. Unfortunately, experiments only give limited information without much insight into the molecular behaviour. Atomistic simulations could complement experiments and shed precious light on microscopic mechanisms. The large free energy barriers connected to proton dissociation however make the use of enhanced sampling methods mandatory. Here we perform an ab initio molecular dynamics (MD) simulation and enhance sampling with the help of methadynamics. This has been made possible by the introduction of novel descriptors or collective variables (CVs) that are based on a conceptually new outlook on acid-base equilibria. We test successfully our approach on three different aqueous solutions of acetic acid, ammonia, and bicarbonate. These are…
| ACETIC ACID | AMMONIA | BICARBONATE | |
| Reactive molecule | 1 | 1 | 1 |
| Water molecules | 31 | 31 | 31 |
| Ensemble | NVT | NVT | NVT |
| Temperaure (K) | 300 | 300 | 300 |
| Thermostat | CSVR31 | CSVR31 | CSVR31 |
| Cell length (Å) | 9.97 | 9.80 | 10.47 |
| Basis sets | TZV2P-GTH | TZV2P-GTH | TZV2P-GTH |
| Potential | GTH-PBE | GTH-PBE | GTH-PBE |
| Energy cutoff (Ry) | 600 | 600 | 800 |
| Relative cutoff (Ry) | 60 | 60 | 80 |
| EPS SCF | 1.0E-6 | 1.0E-6 | 1.0E-6 |
| XC Functional | SCAN 29 | SCAN 29 | SCAN 29 |
| Time step (fs) | 0.5 | 0.5 | 0.5 |
| Length time (ps) | 258 | 285 | 461 |
| ACETIC ACID | AMMONIA | BICARBONATE | |
| Gaussian hills heights | 0.25 | 0.25 | 0.5 |
| Gaussian hills widths () | 0.2 | 0.2 | 0.2 |
| Gaussian hills widths () | 0.4 | 0.4 | 0.4 |
| Bias factor | 10 | 10 | 15 |
| Temperature (K) | 300 | 300 | 300 |
| Hills deposition rate | 100 | 100 | 100 |
| () | 5 | 5 | 5 |
| () | 8 | 8 | 8 |
| () | 12 | 12 | 12 |
| () | 1.0E-4 | 1.0E-4 | 1.0E-4 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpectroscopy and Quantum Chemical Studies · Free Radicals and Antioxidants · Computational Drug Discovery Methods
A microscopic description of acid-base equilibrium
Emanuele Grifoni
[
GiovanniMaria Piccini
[
Michele Parrinello
[
Abstract
Acid-base reactions are ubiquitous in nature. Understanding their mechanisms is crucial in many fields, from biochemistry to industrial catalysis. Unfortunately, experiments only give limited information without much insight into the molecular behaviour. Atomistic simulations could complement experiments and shed precious light on microscopic mechanisms. The large free energy barriers connected to proton dissociation however make the use of enhanced sampling methods mandatory. Here we perform an ab initio molecular dynamics (MD) simulation and enhance sampling with the help of methadynamics. This has been made possible by the introduction of novel descriptors or collective variables (CVs) that are based on a conceptually new outlook on acid-base equilibria. We test successfully our approach on three different aqueous solutions of acetic acid, ammonia, and bicarbonate. These are representative of acid, basic, and amphoteric behaviour.
keywords:
acid-base metadynamics collective variables enhanced sampling
ETH] Department of Chemistry and Applied Biosciences, ETH Zurich, c/o USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano, Ticino, Switzerland \alsoaffiliation[USI] Institute of Computational Science, Università della Svizzera italiana (USI), Via Giuseppe Buffi 13, CH-6900, Lugano, Ticino, Switzerland
ETH] Department of Chemistry and Applied Biosciences, ETH Zurich, c/o USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano, Ticino, Switzerland \alsoaffiliation[USI] Institute of Computational Science, Università della Svizzera italiana (USI), Via Giuseppe Buffi 13, CH-6900, Lugano, Ticino, Switzerland
ETH] Department of Chemistry and Applied Biosciences, ETH Zurich, c/o USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano, Ticino, Switzerland \alsoaffiliation[USI] Institute of Computational Science, Università della Svizzera italiana (USI), Via Giuseppe Buffi 13, CH-6900, Lugano, Ticino, Switzerland \alsoaffiliation[IIT] Italian Institute of Technology, Via Morego 30, 16163 Genova, Italy
1 Introduction
Acid-base reactions play a key role in many branches of chemistry. Inorganic complexation reactions, protein folding, enzimatic processes, polymerization, catalytic reactions and many other transformations in different areas are sensitive to changes in pH. Understanding the pH role in these reactions implies having control over their reactivity and kinetics.
The crucial importance of pH has stimulated the collection of a large amount of data on acid-base equilibria. These are typically measured in gas and condensed phases using spectroscopic and potentiometric techniques. However, there are practical limitations to the accuracy of these methods especially in condensed phases1. Furthermore it is very difficult to extract from experimental data a microscopic picture of the processes involved. It is thus not surprising that acid-base equilibrium has been the subject of intense theoretical activity2, 3, 4, 1, 5, 6, 7, 8, 9, 10, 11, 12.
The acidity of a chemical species in water can be expressed in terms of , the negative logarithm of the acid dissociation constant. There are two ways of calculating these values, one static and the other dynamic.
The most standard approach is the static one in which solution-phase free energies, and consequently s, are obtained closing a Born-Haber cycle composed by gas phase and solvation free energies3, 4, 1, 5, 6, 7. While extremely successful in many cases, the static approach has some limitations. A solvation model needs to be chosen and continuum solvent models have a limited accuracy. This is particularly true in systems like zeolites or proteins characterized by irregular cavities in which an implicit description of the solvent is challenging. Obviously from such an approach dynamic information cannot be gained. Furthermore, there can be competitive reactions that cannot be taken into account unless explicitly included in the model.
In principle these limitations could be lifted in a more dynamical approach based on MD simulations in which the solvent molecules are treated explicitly. If one had unlimited computer time such simulation would explore all possible pathways and assign the relative statistical weight to the different states. Unfortunately the presence of kinetic bottlenecks frustrates this possibility trapping the system in metastable states, since different protonation states are separated by large barriers. Furthermore in acid-base reactions chemical bonds are broken and formed. This requires the use of ab initio MD in which the interatomic forces are computed on the fly from electronic structure theories. This makes the calculation more expensive and reduces further the time scale that can be explored.
To overcome this difficulty, the use of enhanced sampling methods 13 that accelerate configurational space exploration becomes mandatory. A very popular class of enhanced sampling methods is based on the identification of the degrees of freedom that are involved in the slow reaction of interest. These degrees of freedom are usually referred to as collective variables (CVs) and are expressed as explicit functions of the atomic coordinates . Sampling is then enhanced by adding a bias that is a function of the chosen CVs14, 15, 16. Furthermore, designing a proper set of good CVs has also a deeper meaning. Successful CVs capture in a condensed way the physics of the problem, identify its slow degrees of freedom and lead a useful modellistic description of the process.
In standard chemical reactions, this is relatively simple since well defined structures can be assigned to reactants and products17, 18, 19. This is not the case for acid-base reactions in which a proton is added to or subtracted from the solute. Once this process has taken place, water ions (\ceH+ or/and \ceOH-) are solvated and their structure becomes elusive. In fact water ions can rapidly diffuse in the medium via a Grotthuss mechanism20. They became highly fluxional and the identity of the atoms taking part in their structure changes continuously. The nature of these species is thus difficult to capture in an explicit analytic funcion of . However, given the relevance of acid-base reactions, many attempts have been made at defining these entities8, 9, 10, 11, 12. Unfortunately these CVs have an ad-hoc nature and, while successful in this or that case, cannot be generally applied.
In order to build general and useful CVs we make two conceptual steps. One is to look at the acid-base process as a reaction involving only a few moieties. Namely the whole solvent and the reacting residues in the solvated molecule. For example when there is only one type of dissociating residue we think of the acid-base equilibrium as a reaction of the type
[TABLE]
where is the number of water molecules, and are a generic acid-base molecule in solution and its conjugate species respectively, and are integers that can assume values and according to the acid-base behaviour of the species and .
This implies that we do not look at the solvent as a set of molecules that compete to react with the acid-base species. Rather we consider the solvent in its entirety as one of the two adducts. Taking this point of view is especially relevant in polar solvents like water that are characterized by highly structured networks. In this case the presence of an excess or a deficiency of protons changes locally the network structure and this distortion propagates along the entire network.
Since the very early days of Eigen and Zundel21, 22, researchers have struggled with how many molecules should be included in the definition of the perturbation23, 24, 25. Given the absence of physical parameters capable of giving a clear and unequivocal answer to this question, the idea of considering the solvent as a whole circumvents this problem. Thus the solvent is not just a medium with a passive role, but it is looked at as an ensemble of molecules that contribute collectively to the formation of the conjugate acid-base pair. This point of view is much closer to the original one proposed by Brønsted and Lowry in which the reaction can be seen as a simple exchange of an hydrogen cation between an acid-base pair.
For the reaction to take place the center of the perturbation has to move away from the solute. Thus the second important step is to monitor the center of the perturbation. Due to Grotthuss-like mechanisms the perturbation moves along the network. This can lead to different definitions of the defect center. However, if we tessellate the whole space using Voronoi polyhedra centered on water oxygen atoms we can assign unequivocally every hydrogen atom to one and only one of these polyhedra. The site whose Voronoi polyhedron contains an anomalous number of protons is taken as the center of the perturbation (see Fig. 1).
This new point of view gives the method a very general nature making it applicable to every acid-base system, without the need of fixing beforehand the reacting pairs. Thus it is possible to explore all the relevant protonation states even in systems composed by more than one acid-base pair.
This general approach allows defining CVs without having to impose specific structures or select the identity of the atoms involved. We test our method by performing metadynamics simulations in a weak acid case (acetic acid), a weak base (ammonia) and in an amphoteric species (bicarbonate) chosen as benchmark because of their comparable strength, but different acid-base behaviour.
Methods
As discussed above we introduce two CVs, one related to the protonation state and the other that locates the charge defects and measures their relative distance. Both of these CVs need a robust definition for assigning the hydrogen atoms to the respective acid-base site. In order to achieve this result we partition the whole space into Voronoi polyhedra centered on the acid-base sites located at . The sites include all the atoms able to breaking and forming bonds with an acid proton. The standard Voronoi space partition is described by a set of index functions centered on the different s such that if the -th atom is the closest to , and equal to 0 otherwise. For their use in enhanced sampling methods CVs need to be differentiable. To this effect we introduce a smooth version of the index functions, . These are defined using softmax functions:
[TABLE]
where and run all over the acid-base sites and controls the steepness with which the curves decays to 0, that is the selectivity of the function. With an appropriate choice of this definition achieves the desired result as shown in Fig. 2. In such a way, an hydrogen atom in a position is assigned to the polyhedron centered on the site with the weight . Then, the total number of hydrogen atoms assigned to the -th acid-base site is:
[TABLE]
where the summation on runs all over the hydrogen atoms.
One can associate to each acid-base site a reference value that counts the number of bonded hydrogen atoms in the neutral state. The difference between the instantaneous value of hydrogen atoms and the reference one is
[TABLE]
When different from zero will signal whether the -th site has gained or lost a proton. In the case of water oxygen atoms, a hydronium ion has a while a hydroxyde ion has .
We then group the acid-base sites in species. For instance in the case of the simplest amino acid glycine in aqueous solition the number of species will be equal to 3. All water oxygen atoms belong to one species, then one counts in another species the two carboxylic oxygen atoms and finally one considers as the third species the nitrogen atom of the amino group.
In the spirit of this work we count the total excess or defect of proton associated to each species,
[TABLE]
This implies that we are not interested in the specific identity of the reacted site, but whether or not the -th species in its entirety has gained (), lost () or has not changed its number of protons. If we consider a solute with only one reactive moiety then each possible state of the system can be described by one of the three two dimensional vectors (0,0), (-1,1) or (1,-1).
In the general case each protonation state can be described by a vector with dimension equal to the number of inequivalent reactive sites, . A more exhaustive explanation is provided in the S.I.
For use in enhanced sampling these vectors need to be expressed as a scalar function such that, for each physically relevant , attains values able to distinguish the different overall protonation states. There are infinite many ways of constructing a scalar from a vector. Possibly the simplest choice is to write and, in order to distinguish between different protonation states, to choose .
This leads to the following definition for the CV, that is used to describe the protonation state of the system:
[TABLE]
where are the indexes used to label the respective reactive site groups. In the appendix an example is worked out in detail. Of course the CV is made continuous by the use of in the calculation of the needed to evaluate in Eq. 5.
The second CV is a summation of distances between every acid-base sites multiplied for their partial charge .
[TABLE]
where the indexes and run all over the acid-base sites belonging to different groups, and is the distance between the two atoms. In this way, just the acid-base pair that has exchanged a proton gives a contribution different from zero. Eq. 7 is valid only when one single conjugate acid-base pair is present. However, due to the action of bias it may occur occasionally that several acid-base pairs may be formed. In order to avoid sampling these very unlikely events we apply a restraint on the number of pairs. Further details are provided in the S.I.
Results
We have applied our method to three aqueous solution of acetic acid, ammonia and bicarbonate as representations of a weak acid, a weak base and an amphoteric compound respectively. The setup of all three simulation is identical except for the identity of the solvated molecules. This ensures that the outcome reflects the different chemistry of these three systems and that there is no bias due to the initial condition.
Each simulation of the systems has been performed with Born-Oppenheimer MD simulations combined with well-tempered metadynamics 14, 26 using CP2K package 27 patched with PLUMED 2 28 and SCAN functional 29 for the xc energy, . See the S.I. for details.
In Fig. 3 we plot the Free Energy Surfaces (FESs) as a function of and . These FESs vividly reproduce the expected behaviour. They all have a minimum at that correspond to the state in which no charges are present in the solvent. In the acetic acid FES (Fig. 3-a) a second minimum close to reflects its acid behaviour. By contrast, the ammonia FES (Fig. 3-b) shows a second minimum close to . The shape of ammonia and acetic acid FES are approximately related by a mirror symmetry reflecting their contrasting behaviour. Similarly the bicarbonate symmetric FES (Fig. 3-c) mirror its amphoteric character.
As the conjugate pair is formed starts to assume positive values corresponding to the separation and diffusion of the conjugate pair. Compared to the undissociated state in which only is allowed, states where a conjugate pair is present show an elongated shape of the basins along this variable. This is caused by the diffusive behaviour of hydronium e hydroxide ion in solution that makes accessible a continuum range of distances. Moreover, along this CV we can observe a barrier around 1.5 corresponding to the breaking of the covalent bond between the hydrogen atom and the acid-base site.
Conclusions
The general applicability of this method to systems with different nature is an important step made in their understanding and description. The scheme can be extended to include quantum nuclear effects with the use of path integrals molecular dynamics30. This would be of quantitative significance since for instance values are affected by deuteration. Moreover, the absence of assumptions or impositions about reactive candidates or reaction paths allows extending this method to systems of increasing complexity which cannot be addressed with traditional methods. Examples of questions that can now be answered are tautomeric equilibria in biochemical processes, acid behaviour in Zeolites and on the surface of oxides exposed to water.
{acknowledgement}
This research was supported by the European Union Grant No. ERC-2014-AdG-670227/VARMET. Calculations were carried out on the ETH Euler cluster and on the Mönch cluster at the Swiss National Supercomputing Center (CSCS).
2 Supporting Information
CV1:
As described in the main text, the CV adopted to described the protonation state is
[TABLE]
Here, every component can assume a value equal to for the -th group whose chemical behaviour is acid, for a basic one, and equal to zero for unreacted groups. Then, these values are summed with a different weight given by the power of two of the group index . The prefactor allows to linearly combine a multiplet of values with a mathematical trick reducing the vector (,,…,) in a single unambiguous scalar number. Assuming we don’t know anything about the reactivity of a system composed by 3 different groups able to react, a priori we cannot exclude any of its 7 different protonation states (see Tab. 1).
As shown, all of these protonation states occupy different positions in the CV space without overlap among the states. This ensures the possibility to explore all of them starting from the most energetically accessible until the highest one in energy. Moreover, this approach allows to address systems in which multiple and unknown competitive reactions are present without beforehand fix the reactive pairs.
CV2:
This CV returns a value proportional to the distance between the fully formed conjugate acid-base pair. Once the proton transfer has taken place, two acid-base sites will have an anomalous number of hydrogen atoms within their Voronoi polyhedra. We can define the partial charge assigned to the -th Voronoi polyhedron as
[TABLE]
where and are constants indicating respectively the total number acid-base sites belonging to the -th group and the hydrogen atoms bonded to them in the equilibrium state, while is the instantaneous number of hydrogen atoms assigned to the -th acid-base site.
This means, for example, that in an system composed by a molecule of acetic acid and 31 molecules of water, every water oxygen atoms has and while the acid ones have and . In the initial frame all the water sites have a values of close to 2, and therefore close to zero. After having subtracted a proton by the acid molecule, one of the sites of the solvent will assume a value of close to 3 and to 1. The two acetic acid oxygen atoms have only one hydrogen assigned to them and then . In the undissociated species is 1 for the site bonded to the hydrogen atom and 0 to the other one making the values equal to +0.5 and -0.5 respectively. After the dissociation these sites must be indistinguishable and the acid molecule able to capture again a proton with one of them without any preference. Then, the partial charges will be -0.5 for both of the sites. This ensures that the opposite sign terms cancels each other in the undissociated case (Fig. 4-A) and gives an averaged contribution in the dissociated one (Fig. 4-B).
Restraint:
A restraint has been applied in order to avoid the formation of more then one conjugate acid-base pair.
[TABLE]
where run all over the acid-base site indexes and is a positive number much less than 1. With a proper value of the square root term is a good approximation of the absolute value that allows to avoid the singularity for (see Fig. 5)
This CV returns the summation of the all the partial charge moduli. This function can be restrained limiting at the given time the number of reacted pairs simultaneously present.
Ab initio MD setup
As reported in the main text, all the simulations have been performed with Born-Oppenheimer MD simulations using CP2K package 27 patched with PLUMED2 28. Details and parameters adopted in the ab initio MD simulations are reported in Tab. 2 .
Box thermalization
Each system, composed by 31 molecules of water and 1 of solute, has been thermalized as reported in Tab. 3.
The reason for this somewhat odd looking schedule is that the NPT ensemble module of CP2K does not support SCAN.
Well-tempered metadynamics setup
Parameters adopted for PLUMED2 settings are reported in Tab.4
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Ho and Coote 2011 Ho, J.; Coote, M. L. First-principles prediction of acidities in the gas and solution phase. Wiley Interdiscip Rev Comput Mol Sci 2011 , 1 , 649–660
- 2Elstner et al. 2001 Elstner, M.; Hobza, P.; Frauenheim, T.; Suhai, S.; Kaxiras, E. Hydrogen bonding and stacking interactions of nucleic acid base pairs: A density-functional-theory based treatment. J. Chem. Phys. 2001 , 114 , 5149–5155
- 3Saracino et al. 2003 Saracino, G. A.; Improta, R.; Barone, V. Absolute p Kadetermination for carboxylic acids using density functional theory and the polarizable continuum model. Chem. Phys. Lett. 2003 , 373 , 411–415
- 4Schüürmann et al. 1998 Schüürmann, G.; Cossi, M.; Barone, V.; Tomasi, J. Prediction of the p Ka of Carboxylic Acids Using the ab Initio Continuum-Solvation Model PCM-UAHF. J. Phys. Chem. A 1998 , 102 , 6706–6712
- 5Ho and Coote 2009 Ho, J.; Coote, M. L. A universal approach for continuum solvent p Ka calculations: Are we there yet? Theor. Chem. Acc. 2009 , 125 , 3–21
- 6Silva et al. 2000 Silva, C. O.; da Silva, E. C.; Nascimento, M. A. C. Ab Initio Calculations of Absolute p K a Values in Aqueous Solution II. Aliphatic Alcohols, Thiols, and Halogenated Carboxylic Acids. J. Phys. Chem. A 2000 , 104 , 2402–2409
- 7Rebollar-Zepeda and Galano 2016 Rebollar-Zepeda, A. M.; Galano, A. Quantum mechanical based approaches for predicting p K a values of carboxylic acids: evaluating the performance of different strategies. RSC Adv. 2016 , 6 , 112057–112064
- 8Davies et al. 2002 Davies, J. E.; Doltsinis, N. L.; Kirby, A. J.; Roussev, C. D.; Sprik, M. Estimating p Ka values for pentaoxyphosphoranes. J. Am. Chem. Soc. 2002 , 124 , 6594–6599
