Assessing Nonbonded Aggregates Populations: Application to the Concentration-Dependent IR O–H Band of Phenol

J. Pablo Gálvez; José Zúñiga; Javier Cerezo

PMC · DOI:10.1021/acs.jctc.5c00281·April 14, 2025

Assessing Nonbonded Aggregates Populations: Application to the Concentration-Dependent IR O–H Band of Phenol

J. Pablo Gálvez, José Zúñiga, Javier Cerezo

PDF

Open Access

TL;DR

This paper compares two methods for calculating nonbonded aggregate populations in phenol solutions and finds that molecular dynamics better predicts IR spectra trends.

Contribution

A new graph-theory-based framework for identifying aggregate conformations and a comparative assessment of MD and quantum mechanical methods.

Findings

01

Molecular dynamics better captures experimental IR trends due to improved entropic contributions.

02

The graph-theory framework enables consistent comparison of aggregate population methods.

03

The protocol provides a benchmark for evaluating intermolecular force fields.

Abstract

In this work, we present two alternative computational strategies to determine the populations of nonbonded aggregates. One approach extracts these populations from molecular dynamics (MD) simulations, while the other employs quantum mechanical partition functions for the most relevant minima of the multimolecular potential energy surfaces (PESs), identified by automated conformational sampling. In both cases, we adopt a common graph-theory-based framework, introduced in this work, for identifying aggregate conformations, which enables a consistent comparative assessment of both methodologies and provides insight into the underlying approximations. We apply both strategies to investigate phenol aggregates, up to the tetramer, at different concentrations in phenol/carbon tetrachloride mixtures. Subsequently, we simulate the concentration-dependent OH stretching IR region by averaging the…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Chemicals2

phenol

carbon tetrachloride

Figures5

Click any figure to enlarge with its caption.

Molecular graphs used to classify interaction patterns in phenol dimers. Red and gray circles represent hydroxyl and phenyl groups, respectively. Black edges indicate the existence of covalent bonds between atoms within each node, while gray edges represent noncovalent interactions.

Diagram of the two protocols adopted in this work to obtain the populations of the different interaction-pattern conformers using C/d (above) and Q/s (below) strategies.

Overall population of aggregates (up to tetramers) with respect to phenol concentration, computed with Q/s (solid continuous lines) and C/d approach (stars). The experimental populations for the monomer from Schaefer et al. are included for reference (crosses).

IR spectrum of phenol–tetrachloride mixtures at different concentrations, simulated through C/d (bottom panels) and Q/s (top panels) strategies. The contributions of monomers to tetramers are depicted individually. Inset: zoom of the plot to make evident the small bands appearing in the intermediate region in the case of a concentration of 0.115 M with the Q/s approach.

IR spectrum of phenol-tetrachloride mixtures at different concentrations (0.115, 0.450, and 1.100 M), simulated through C/d (bottom panel) and Q/s (top panel) strategies, with a phenomenological broadening in the O–H region using a Gaussian with a HWHM of 60 cm–1. The experimental spectra for these concentrations are included in the central panel.

Tables1

1. Populations of the Different Conformers Computed with QM and Classical MD Methodologies

	dimer
	C/d	Q/s
OH-bonded	68.65	51.00
π–π	8.58	33.43
OH-π	22.76	15.56

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsSpectroscopy and Chemometric Analyses · Water Quality Monitoring and Analysis · Dye analysis and toxicity

Full text

Introduction

1

Noncovalent interactions, such as dispersion, hydrogen or halogen bonding, and π–π stacking, play a pivotal role in the condensed phase. These interactions are fundamental to the supramolecular organization of biologically and technologically relevant systems, including protein structures, DNA base-pairing, and liquid crystals.? Detailed understanding of these interactions is, therefore, crucial for advancements in material design, drug discovery, or protein folding.

Molecular aggregates, which are held together by such nonbonded interactions, may present a remarkable challenge, as their relative populations are governed by a delicate balance of enthalpic and entropic contributions. Indeed, predicting these populations requires (i) an accurate description of nonbonded interactions, which determine the enthalpic component, and (ii) a thorough accounting of the extensive range of conformers arising from complex potential energy surfaces (PESs) with numerous energetically accessible minima, which govern the entropic contribution.

Accurate modeling of nonbonded interactions necessitates electronic structure methods that incorporate nonlocal correlation effects, often requiring computationally demanding post-Hartree–Fock techniques. However, density functional theory (DFT), enhanced by suitable correction schemes such as Grimme’s empirical dispersion corrections, ?,? provides a computationally feasible alternative for systems dominated by noncovalent interactions. Combined with implicit solvent models,? DFT has significantly expanded its applicability to biologically and technologically important systems. Still, such approaches become impractical for large systems requiring numerous potential evaluations, such as in molecular dynamics (MD) simulations. In such cases, interatomic interactions are typically described using empirical molecular mechanics (MM) potentials, where nonbonded interactions are represented by Coulombic terms for electrostatics and Lennard–Jones (LJ) potentials for van der Waals forces. While polarizable force fields offer improved descriptions, ?,? simpler point charges and LJ potentials still remain the standard in MD simulations, whose quality can be improved by reparameterizing them against quantum mechanical (QM) energy databases.?

Moving to the exploration of conformational flexibility, automated conformational searches are commonly employed to identify the most relevant solute configurations, ?,? particularly when interactions are described at QM level. The PES around these minima is typically approximated using harmonic models, with specific adaptations available to account for nonharmonic regions, such as torsional degrees of freedom. Solvent effects are generally included through implicit solvation models, and aggregate populations are ultimately derived by constructing partition functions that incorporate quantum mechanical treatment of vibrations.

Alternatively, classical MD simulations, which explicitly represent the solvent, provide a highly effective approach for exploring the configurational space of systems, encompassing both the solute and solvent. However, few studies have applied MD simulations to directly calculate the populations of clusters of varying sizes within the context of organic aggregates. ?−? ? ? Unbiased MD simulations have primarily been limited to obtaining statistical insights into aggregate sizes and structural characteristics. These analyses typically involve tracking the average number of noncovalent interactions throughout the trajectory, ?,?−? ? ? ? ? computing radial or pair distribution functions, ?−? ? ? or monitoring the evolution of structural parameters. ?,?

A critical step in these strategies is the identification of unique conformers from the vast number of molecular configurations typically generated. This challenge can be effectively addressed using graph theory tools. Specifically, adjacency matrix analysis has been successfully applied to classify molecular configurations into different topological conformers in MD snapshots of individual solute molecules in solution ?,? or to distinguish topologically distinct, QM-optimized isomers.? However, a comprehensive top-down approach that identifies individual aggregates coexisting in solution potentially within the same snapshot, in terms of their isomorphic conformations, is still lacking. Such a framework is essential for accurately calculating populations in systems involving thousands of particles.

A notable strength of MD approaches for thermodynamic predictions is their independence from prior knowledge of aggregate structures, unlike strategies that compute partition functions that require the identification of the relevant minima. Nevertheless, several challenges persist for MD-based approaches. Accurate determination of population distributions requires a dynamic evaluation of chemical identities at each simulation step, which typically relies on geometric criteria that may involve some arbitrariness. Furthermore, for species present at low concentrations, achieving sufficient conformational sampling demands extended simulation times and large system sizes. In contrast, first-principles quantum chemistry calculations are well suited to these scenarios, employing efficient reaction equilibrium algorithms to compute species concentrations without the need for extensive sampling. ?−? ? Still, a comprehensive and reliable framework to compare the performance of MD-based and partition-function-based approaches remains an open challenge.

The effects of noncovalent interactions, which influence the populations of aggregates, are ultimately manifested in the spectroscopic signals of the system. Infrared (IR) spectra, in particular, contain exquisite information about microscopic characteristics and provide detailed insights into key noncovalent interactions, notably hydrogen bonds. ?,? Recent advances have reported simulations of IR and vibrational circular dichroism (VCD) spectra based on aggregate populations for pure solvents ?−? ? ? ? ? ? ? ? and solvent mixtures with various solutes. ?−? ? ? ? ? ? ? ? ? These approaches involve averaging the spectra of optimized clusters and calculating their populations using either the quantum cluster equilibrium theory ?−? ? ? ? or the clusters-in-a-liquid framework.? While most studies focus on the fingerprint region, simulations of the O–H stretching region for hydrogen-bonded aggregates ?,?,?,?,?,? have revealed limitations in addressing the challenging broadband that is observed. The formation of hydrogen bonds significantly affects the vibrational spectrum, shifting the O–H stretching bands of the acceptor to lower wavenumbers. Indeed, this band is highly sensitive to the number and arrangement of hydrogen bonds, whether intra- or intermolecular. Consequently, these characteristics make the simulations of such an O–H stretching region (3100–3800 cm^–1^) a stringent test to compare populations derived from MD and conformational search-based methods.

Building on these ideas, we first develop two methodologies to evaluate the populations of nonbonded aggregatesone based on the analysis of MD trajectories and the other on conformational searches followed by a statistical thermodynamic treatment using model potential energy surfaces, and subsequently assess their accuracy by simulating the IR spectrum, focusing on the O–H stretching region. Our methodologies are applied to phenol/carbon tetrachloride mixtures, which presents marked changes in the O–H IR band with concentration. In these mixtures, phenol monomers coexist with various aggregates, whose distinct peaks in the 3100–3800 cm^–1^ range reflect the structure and conformational landscape of the system.? Self-association of phenol in CCl_4_ indeed remains a challenging problem and, despite extensive experimental studies over the past decades, ?−? ? ? ? ? ? ? ? ? ? ? ? no general consensus exists regarding the degree of aggregation or the structures of the resulting clusters.

In this study, we analyze the populations of phenol clusters in solution through atomistic simulations, extending our analysis to tetramers. The two developed strategies are applied to compute the populations of phenol aggregates, which are eventually used to simulate the average IR spectra at varying phenol/CCl_4_ ratios, focusing on the challenging O–H broadband. Our results offer valuable insights into the strengths and limitations of both approaches in capturing enthalpic and entropic contributions to aggregate formation.

This paper is organized as follows: Section provides a detailed description of the newly developed approaches for determining aggregate populations and outlines the method for calculating averaged IR spectra. Section presents the computational details. Section discusses the main findings, and the conclusions are presented in Section.

Theoretical Methods

2

This section begins by introducing the graph-theory-based approach used to identify equivalent conformers, which applies to both methodologies for determining aggregate populations. The two strategies, termed Q/s and C/d, are then described in detail, eventually providing a concise comparison between them. Finally, we outline the procedure to compute IR spectra based on the populations derived from these two approaches.

Unique Classification of Conformers Based

on Graph Theory

2.1

In this study, we have developed and implemented a systematic procedure for identifying and classifying macromolecular aggregates based on powerful algorithms of graph theory. Our method is primarily intended to analyze MD trajectories, which usually contain thousands of particles. Indeed, their analysis is far from straightforward, which explains why traditional MD analysis tools do not directly provide information about these aggregates that may appear for short periods of time during the simulation. Moreover, as we will show, the top-down nature of our approach makes it generally applicable to any set of configurations, for instance, those arising from a systematic conformational search.

A molecular graph is a two-dimensional (2D) representation of a molecule where atoms and bonds are typically simplified as vertices and edges of the graph, respectively. Graphs can be represented in matrix form, leading to computationally efficient methods to extract information from the system. In this work, we resort to the adjacency matrix, which contains topological information, i.e., the covalent and noncovalent bonds within the atoms. The knowledge of this matrix at each snapshot ultimately provides us with the information about the isolated molecular clusters present during the simulation.

In the initial step, the analysis of the adjacency matrix of the entire system determines the number of aggregates of each size, n. This is achieved by transforming the adjacency matrix of the system into a block-diagonal form, where each diagonal block represents the individual adjacency matrix of a distinct aggregate in the system. The dimensions of this sub-matrix determine the size of the aggregate. In this way, from the number of blocks of identical dimensions and the total number of blocks, we can calculate the population of aggregates of varying sizes.

For the sake of simplicity, we consider only the connectivity between the functional groups of the molecules, represented as the nodes in an undirected graph. In the case of phenol, we focus on the hydroxyl group (OH) and the phenyl group (π) of the monomer. Aggregate configurations, up to tetramers, are classified according to their interaction patterns, i.e., the topology according to the functional groups. Although interactions are determined based on the position (or center of mass) of atoms (or groups of atoms), we employ a simplified representation based on functional groups to classify the conformations explored during the simulation according to their interaction patterns in topologically distinct isomers. While using this simplified graphical representation does not retain information about the specific donor and acceptor molecules forming, e.g. a hydrogen bond, it offers a streamlined approach for evaluating the effectiveness and limitations of this clustering-like method in analyzing molecular configurations.

The interaction patterns considered for the phenol dimer and the associated molecular graphs are illustrated in Figure. Hereafter, we refer to the set of geometric configurations that can be associated with the same interaction pattern as conformers of a given isomer. One interaction (edge) between hydroxyl groups corresponds to a hydrogen bond (H-bond) between them. The presence of an edge between two phenyl groups can indicate either a pure π···π interaction between the aromatic rings or C–H···π interactions between a C–H bond in one molecule and the π-electron cloud of an aromatic ring. Finally, OH–π bonded isomers correspond to either an H-bond between an aromatic hydrogen atom and the hydroxyl group or a hydroxyl proton with the π-electron cloud.

Molecular graphs used to classify interaction patterns in phenol dimers. Red and gray circles represent hydroxyl and phenyl groups, respectively. Black edges indicate the existence of covalent bonds between atoms within each node, while gray edges represent noncovalent interactions.

Naturally, the number of possible molecular graphs increases substantially in higher-order clusters. To simplify our analysis, trimer and tetramer configurations are classified to attend to the topology of the H-bonds among molecules. If no H-bond is present, they are categorized as either π–π or OH–π bonded clusters, depending on whether any monomers interact solely through OH···π interactions or not. This results in five distinct interaction patterns for trimers and ten for tetramers, which are schematized in Figures S15 and S16 in the Supporting Information (SI) and listed in the first column of Table.

In our approach, intermolecular interactions are defined based on specific geometrical criteria (detailed information on the selection of the criteria used to determine the noncovalent interactions in the system is provided in Section A of the SI). Two phenol molecules are considered to form a hydrogen bond when the O···H distance is less than 2.6 Å and the O···H–O angle is between 135 and 180°. The π-electronic clouds of two phenyl groups are considered to interact when the distance between the centers of mass of the carbon atoms in each molecule is less than 4.0 Å. C–H···O interactions are present if the H···O distance is lower than 2.8 Å, and the angle with the adjacent C atom is between 120 and 180°. Finally, X–H···π interactions are defined when the H···π distance is lower than 3.0 Å, with a C–H···π or O–H···π angle in the range of 120–180° or 135°–180°, respectively. The effect on the populations of using different thresholds has been proven to be rather limited, as shown in the SI (Section D), which supports the robustness of our approach.

This protocol can be applied to the snapshots generated through MD simulations, which may include all or just part of the interaction patterns. The total population would be given by the average over all snapshots, which would generally include all of the interaction patterns for sufficiently long trajectories. The same protocol can be applied to an ensemble of configurations generated through a systematic conformational search. In this case, the number of interaction patterns identified will depend on the extent of the automated search. All algorithms described here have been implemented in aggregate,? a code developed by one of us to identify and classify interaction patterns.

Quantum/Static (Q/s) Approach

2.2

The Q/s strategy starts from a systematic conformational search to locate all possible conformers for monomers to tetramers, which are subsequently analyzed by means of statical thermodynamics, in line with quantum cluster equilibrium theory ?−? ? ? ? or the clusters-in-a-liquid framework.? Here, we used AutoMeKin

?−? ? software, with some adaptations as explained in Section, to generate the conformational space. This results in a set of unique isomers for each aggregation size, containing 1 monomer, 38 dimers, 174 trimers, and 144 tetramers. Each of them is optimized, and their second derivatives (frequencies) are subsequently computed at the DFT/PCM level (see Section for details).

All of the aforementioned isomers may be grouped into macroscopically differentiable conformers. Specifically, we here adopt the interaction-pattern conformers identified with the graph-theory tool described in Section, aggregate, in terms of hydrogen bonds, π–π interactions, and π–OH contacts. Therefore, each interaction-pattern conformer includes, in general, several isomers. We then account for such a multimolecular description of the conformer treating the molecular system as an equilibrated mixture of all of the minimum-energy structures with a total free energy given by eq

?,?−? ?

[eqn]

where the sum runs over the set of all energy-minimized molecular conformations (isomers) that are characterized by a given interaction pattern, {X}. In the previous expression, ΔG(X _ i _) = G(X _ i _) – G(X ref), with X ref being some arbitrarily chosen reference configuration and G(X _ i _) the absolute free energy corresponding to the ith conformation. m _ i _ is the number of equivalent energy wells in the PES of the system compatible with species i (conformers and rotamers), which can be obtained by using either a direct counting method or automatic conformational search algorithms. When dealing with noncovalent complexes, this PES degeneracy must also account for the indistinguishability of monomer molecules.?

At this point, we still have to determine the Gibbs free energy of each individual isomer (G(X _ i )). This corresponds to the standard free energy at temperature T and pressure *p, G(X_i) ≡ G* _ i _ ^*^(T, p), of solute i, which is determined as the sum of the following terms?

[eqn]

In this equation, E _ i,gas_ ^elec^ is the gas-phase total electronic energy of the solute and ΔG _ i _ ^solv^(T, p°) is its solvation Gibbs free energy (defined to transfer a solute from an ideal gas at a concentration of 1 mol L^–1^ to an ideal solution at a concentration of 1 mol L^–1^). The third term of this equation is usually termed the vibrational–rotational–translational contribution to the gas-phase Gibbs free energy. It can be written as $[eqn]$ , and it includes the free energy change associated with bringing the solute from its ideal gas reference state at p° = 1 bar to a density of unit molarity,? i.e., a reference state concentration of 1 mol L^–1^. Note that omitting this factor would lead to a systematic error of 1.89 kcal mol^–1^ at 298.15 K to the absolute free energies.? The last term involves the density and, despite being systematically omitted in the literature, has a notable impact on the free energies at high temperatures, especially for nonaqueous solvents, as shown by Noroozi and Smith.?

Eq provides the absolute free energies in solution at the standard state of 1 mol L^–1^ used in this work. Thermal contributions to solution-phase free energies (δG**VRT*) are computed by referencing the partition function to the ground-state electronic energy, which inherently includes the zero-point vibrational energy (ZPE). These contributions are calculated using ideal gas partition functions under the quasi-rigid rotor harmonic oscillator (quasi-RRHO) approximation, ?,? employing scaled? solution-phase harmonic vibrational frequencies, with the temperature and pressure fixed at 298.15 K and 1 bar, respectively. The vibrational frequencies were scaled for thermal correction calculations using scaling factors derived in this work from experimentally measured fundamental frequencies of the phenol monomer in the gas phase and in CCl_4_ solution (see Section C in the SI).

Once the free energies of each interaction pattern, α, are evaluated, we can now obtain the populations, which we here define as the molar fraction over the total amount of the aggregate with the same size, n, as

[eqn]

where G _ nα_ is computed according to eq including all of the isomers classified within the interaction pattern α.

Eqation provides the populations of each conformer relative to the aggregate of size n. The complete elucidation of the system composition thus requires the macroscopic concentration of each aggregate, which is computed from the self-association equilibria

[eqn]

The thermodynamic constant of the above equilibrium is calculated from the global free energy of the aggregate of size n

[eqn]

where the free energies include the contribution of all interaction-pattern conformers of this size through eq.

As dictated by the self-aggregation equilibrium, eq, the concentrations of the monomer (C 1) and that of the n-th aggregate (C _ n _) at equilibrium must obey

[eqn]

where we have included the standard concentration, C°, which is equal to 1 mol L^–1^ and can be effectively omitted. In turn, the stoichiometric concentration of phenol, C 0, assuming that only aggregates up to tetramers exist, is given by C 0 = ∑_ n = 1_ ^4^ nC _ n _. This mass-conservation condition, together with all three self-association equilibria, eq, completes a system of four equations whose solution provides the concentration of the monomer and the three aggregates.

Classical/Dynamic (C/d) Approach

2.3

The C/d approach leverages the sampling provided by an MD simulation (details of the simulation are given in Section). Our novel methodology determines aggregate populations directly comparable to the Q/s approach by identifying the same interaction pattern conformers also using the aggregate tool.

Namely, the proposed protocol provides the number of conformational pattern conformers at each step, N _ nαj _, from the block-diagonalization adjacency matrix and subsequent classification of the aggregate. The equilibrium concentrations are then computed according to the following equation

[eqn]

The population of the α-th interaction pattern is then given by

[eqn]

The global concentration of each aggregate is, in this case, given by the sum over all of the conformers of the same size, ⟨C _ n ⟩ = ∑α⟨C _ nα⟩. Naturally, during the simulation, aggregation states larger than those of tetramers are also encountered, which will make the sum from the monomer to the tetramer lower than the stoichiometric phenol concentration, C 0. However, as their fraction in the mixture is below 5%, they can be safely disregarded, therefore keeping the same monomer–dimer–trimer–tetramer equilibrium model as in the Q/s case. In this sense, to maintain consistency with the Q/s protocol, the equilibrium concentrations considered here are not those measured directly from the MD simulation but rather those obtained from the mass-conservation law (assuming species up to the tetramer) and the self-association equilibria, as shown in eq. The required equilibrium constants are obtained directly from the ratio of the average equilibrium concentrations of the aggregates within the simulation box

[eqn]

In Table S8 of the SI, we report the concentrations computed directly from the MD simulations and those eventually used in this work. Although the overall trends are consistent, the MD populations reveal the noticeable presence of higher-order aggregates. In the most concentrated solution, these account for up to 15% of the total phenol population. Nevertheless, the molar concentrations of these larger species remain a small fraction of all aggregates in solution. Indeed, our results suggest that beyond the tetramer, the influence of higher-order aggregates is quite limited. Since the focus of this study is on comparing the Q/s and C/d approaches, we have decided to maintain our simplified treatment, including only aggregates up to the tetramer. We acknowledge, however, that the presence of higher-order aggregates raises certain limitations, which could be addressed in future work by adopting generalized methodologies that randomly select individual phenol molecules and explicitly consider their surrounding environments.

Summary of the Proposed Methodologies

2.4

The two strategies described above are summarized in the diagram presented in Figure. On the one hand, the C/d approach (top) applies the aggregate program to the snapshots generated along the MD trajectory. These snapshots typically contain aggregates of varying sizes, each spanning multiple interaction patterns. The aggregate program identifies the size and interaction pattern of each aggregate, ultimately providing their populations. On the other hand, the Q/s approach (bottom) implies an initial systematic conformational search that generates isomers for aggregates of different sizes. The aggregate program is subsequently run over these isomers to map them to their corresponding interaction-pattern conformers, and their populations are determined by accounting for the free energies of all of the isomers contributing to each interaction pattern.

Diagram of the two protocols adopted in this work to obtain the populations of the different interaction-pattern conformers using C/d (above) and Q/s (below) strategies.

It is important to emphasize that both approaches yield populations for the same set of interaction-pattern conformers, as identified by the aggregate program. This ensures a consistent and reliable comparison between the populations derived from MD simulations (C/d) and those obtained via systematic conformational searches (Q/s).

Additionally, to maintain consistency, in all cases, we provide the populations for each interaction pattern corresponding to a specific aggregate size (up to tetramers). Then, the relative concentrations of monomers, dimers, trimers, and tetramers are computed from the equilibrium constants for aggregate formation. These constants are determined from either free energies (Q/s method, eq) or average populations (C/d method, eq).

Calculation of Averaged IR Spectra

2.5

Once the populations of the interaction-pattern conformers are obtained using both methodologies, their performance is assessed by evaluating the averaged broad IR O–H stretching band arising from the contributions of these conformers. A key step in this process is computing the IR spectra associated with each interaction-pattern conformer.

In this work, the IR spectra for each conformer are derived from the Q/s approach, assuming that the flexibility within the interaction pattern is properly captured by this methodology. Specifically, the spectra are obtained by averaging the harmonic IR spectra of all isomers mapping to a given interaction-pattern conformer. Vibrational frequencies were scaled using the scaling factor computed in this work (0.96) to correct for errors stemming from anharmonicity and the inherent approximations of the DFT functional. Further details on the calculation of the scaling factor are provided in Section C of the SI. The Boltzmann-averaged molar absorption coefficient for the conformer α, of size n, ε _ nα_(ω), is computed as

[eqn]

where i runs over all isomers contributing to the interaction pattern, and the relative weight of each isomer, w _ nαi _, is given by

[eqn]

These molar absorption coefficients are then used to compute the total absorbance of the sample by exploiting the additivity of the Beer–Lambert law. The total absorbance is the sum of contributions from all conformers, weighted by their populations and macroscopic concentrations. The absorbance spectrum is expressed as

[eqn]

Here, l is the cuvette path length, set to the experimental value;? C _ n _ is the concentration of the n-th aggregate, related to the stoichiometric concentration via the self-association equilibria (eq), which accounts from the monomer (n = 1) to the tetramer (n = 4), and N IP ^(n)^ represents the number of interaction-pattern conformers for each aggregate size. *P_nα_

is the population of the conformer evaluated with either Q/s or C/d methods. This approach enables a quantitative comparison to experimental data at different concentrations. To facilitate this comparison, the computed absorbance spectra are converted into transmittance to match the experimental format.

The individual IR spectra of conformers are convoluted using a Lorentzian function with a half-width at half-maximum (HWHM) of 10 cm^–1^ for frequencies below 3200 cm^–1^ and above 3650 cm^–1^, a commonly adopted value. However, for frequencies in the 3200–3650 cm^–1^ range, corresponding to the O–H stretching of phenol aggregates, a Gaussian function with a larger HWHM of 60 cm^–1^ is applied. This increased broadening accounts for the inhomogeneous effects caused by the complex conformational flexibility of the aggregates. Frequencies below 3200 cm^–1^ correspond to C–H stretching vibrations, while those above 3650 cm^–1^ arise from free O–H stretching in bare phenol and chain- or branched-like aggregates. The large Gaussian broadening in the O–H region reflects the structural diversity of the aggregates, which induces significant inhomogeneous broadening.

Computational Details

3

Electronic Structure Calculations

3.1

All electronic structure calculations were performed with Gaussian? using the B3LYP functional including D3 Grimme’s dispersion correction for improved noncovalent interactions. Geometry optimizations were carried out using the N07D? basis set, including the environment effect within the IEF-PCM? implicit solvation model. Thermal corrections are computed from the geometries and frequencies obtained at this level of theory. Final energies are refined by increasing the basis set size (N07Tdiff), computing the electronic gas phase and solvation-free energy contributions in eq independently. Moreover, the basis set superposition error was corrected using the counterpoise correction in the calculation of the total gas-phase energies. For the standard state correction in eq, we used the value of the experimental density of CCl_4_.

Systematic Conformational Search

3.2

The PES of noncovalent complexes is characterized by the presence of many local, low-lying energy minima, which need to be taken into account effectively, as an insufficient conformational sampling could easily contribute several kcal mol^–1^ to the free-energy differences.? In this work, we have conducted such conformational sampling with AutoMeKin package,? with some modifications to ensure an efficient and complete as possible search. We note that a reliable screening is essential to reduce as much as possible the computation of vibrational frequencies for the clusters, which consume most of the computational time and represent by far the computational bottleneck of the entire procedure. The specific strategy employed to minimize the computational cost of conformational exploration is described in detail in the SI (Section B).

Of course, this modeling strategy based on partial optimizations can be coupled to other automated conformational search methods from the very beginning, ?,?,? or even MD simulations to obtain the candidate structures. In fact, after analyzing the MD trajectories, some interaction pattern conformers listed in Table could not be identified using the procedure described above. To address this, we conducted a second round of conformational sampling post hoc, starting from a set of C/d structures classified as the missing conformers. Still, some isomers could not be optimized, but they can be safely ignored, because their low populations make their overall contribution to the spectrum negligible. The effect of expanding the aggregate set on the Q/s populations and the computed IR spectra is discussed in the SI (see Section E).

Molecular Dynamics Settings

3.3

The C/d approach involves unbiased MD simulations of phenol in carbon tetrachloride mixtures that were conducted with Gromacs 5.0.7.? The simulation box contains a mixture of both compounds, with approximately 11,000 molecules in total. Specifically, we conducted MD simulations of three solutions with increasing solute concentrations (0.115, 0.45, and 1.1 M) incorporating 10,000 solvent molecules and 111, 434, and 1060 solute molecules into a starting cubic simulation box with edges of 12.0, 12.5, and 13.0 nm, respectively.

Over the simulation, bonds containing hydrogen atoms were constrained using the LINCS algorithm, allowing a time step of 1 fs for the integration of the equations of motion (using the leapfrog algorithm). Particle-mesh Ewald was considered for the calculation of long-range electrostatic forces using a cutoff of 1.1 nm. The same cutoff was used for the short-range van der Waals forces. The general AMBER force field (GAFF) was used to model solvent and solute molecules. GAFF parameters were taken directly from virtualchemistry.org.

After constructing and minimizing the simulation box, the system is equilibrated by subsequent 2 ns of NVT simulation (using the modified Berendsen–velocity rescale–thermostat with a time constant for coupling of 0.1 ps to keep the temperature constant at 298.15 K) followed by 15 ns of NpT simulation. Production trajectories were then extended for a total of 100 ns, with an NpT ensemble at 298 K and 1 atm. The Nose–Hoover thermostat and Parrinello–Rahman barostat were used in the NpT simulations with relaxation times of 1.0 and 5.0 ps, respectively. Configurations used in the subsequent analysis were extracted from the MD trajectory every 1 ps. Such an export rate was assessed by comparing the populations derived from a larger export frequency of 5 ps. The results obtained with both export frequencies show no significant differences beyond statistical uncertainty, thus validating our setup. The corresponding concentrations calculated using both snapshot frequencies are provided in Table S8 of the Supporting Information.

Results and Discussion

4

In this work, we have deployed two alternative strategies to account for the populations of different phenol aggregates, i.e., the Q/s and C/d methodologies described in Sections and 2.3. In the following sections, we first introduce and compare the populations of aggregates provided by each of them and eventually use them to compute the average IR spectra, focusing on the O–H stretching region.

Conformational Sampling and Populations

4.1

The graph theory formalism adopted to identify the configurations of the aggregates provides a unified framework for analyzing the populations computed with both classical (C/d) and quantum (Q/s) methodologies. Namely, we characterize the different molecular configurations of aggregates based on connectivity patterns, allowing us to systematically classify aggregates of varying sizes (monomers, dimers, trimers, and tetramers), each with distinct interaction pattern conformers. These conformers, in turn, lead to different minima (isomers) that arise from the molecular flexibility.

In the Q/s approach, automated sampling generates many of these isomers, with all of them (and potentially more) appearing during MD sampling. However, tracking every possible isomer from the trajectory is infeasible. Instead, by using interaction-pattern conformers, which will generally encompass various minima in the Q/s approach and multiple configurations in the C/d MD sampling, we establish a common descriptor to classify configurations and enable a meaningful comparison of their populations with both methodologies.

As described in detail in the previous section, in the Q/s approach, the population of each conformer is derived from quantum mechanical partition functions, assuming a harmonic model around each minimum and summing the populations of all minima associated with the corresponding conformer. Conversely, in the C/d approach, the population is determined by counting the occurrences of the different conformers, as identified through MD sampling.

The population of aggregates with a given size can then be determined by summing the populations of all of the conformers corresponding to each aggregate. This enables us to determine the concentrations from the monomer to the tetramer, which can then be compared with experimental results, offering an initial assessment of the performance of both Q/s and C/d approaches. Our findings are summarized in Figure, which presents the concentrations of monomers and aggregates (up to tetramers) in terms of the stoichiometric concentration of phenol. Notably, the Q/s approach provides a continuous curve representing populations across concentrations, while the C/d approach is limited to the discrete systems simulated.

1: Populations of the Different Conformers Computed with QM and Classical MD Methodologies

Overall population of aggregates (up to tetramers) with respect to phenol concentration, computed with Q/s (solid continuous lines) and C/d approach (stars). The experimental populations for the monomer from Schaefer et al. are included for reference (crosses).

Our results clearly highlight the different performances of the methodologies. In both cases, the monomer is the most populated species; however, the relative populations of the different aggregates differ between the two strategies. Specifically, in the C/d approach, aggregate populations follow the order dimer > trimer > tetramer over the entire range of concentrations. In the Q/s approach, this order holds only at low concentrations, but the tetramer population increases more sharply, surpassing the trimer population at concentrations above approximately 0.1 M and the dimer population at concentrations above approximately 0.5 M. Although no specific evidence on the relative populations of the aggregates is experimentally available, the monomer population derived from experiments? aligns more closely with the results from the C/d method. Experimental estimations are available, ?−? ? ? ? ? ? ? ? ? ? ? ? however, for the self-association equilibrium constants, which can be compared with those computed with Q/s and C/d. Notably, equilibrium constants calculated with the C/d approach align better with the experimental values. In contrast, those derived from Q/s consistently shift the self-association equilibria toward the monomer compared to experimental findings. This explains the overpopulation of the monomer reported in Figure with Q/s. The equilibrium constants used to generate the distribution curves shown in Figure, along with their experimental counterparts, are provided in Table S6 of the SI. Moreover, to facilitate the analysis, in Table S7 we also provide the equilibrium constants referred to reactions normalized to a stoichiometric coefficient of one for the monomer, which show a steady increase with aggregate size. The relative values of the equilibrium constants between both approaches also explain the observed trends for each method reported in Figure. Namely, the ratio between the C/d and Q/s equilibrium constants decreases as aggregate size increases: from 4.8 for the dimer to 3.9 for the trimer and down to 2.1 for the tetramer. This trend leads to a relative enhancement of the tetramer’s population in the Q/s approach compared to that of the smaller aggregates. As a result, the tetramer becomes the dominant species in the Q/s predictions at much lower concentrations compared to the C/d ones. For example, Figure shows that the tetramer surpasses the trimer at approximately 0.2 mol L^–1^, and the dimer surpasses the trimer at around 0.6 mol L^–1^. In contrast, with the C/d approach, although the population of the tetramer increases and approaches those of the trimer and dimer, it does not surpass those within the concentration range considered.

We now turn our attention to the relative distribution of interaction-pattern conformers for each aggregate. Table lists all such interaction patterns in the first column, along with the populations of each species obtained using both the Q/s and C/d approaches, allowing for a more detailed comparison of the performance of each strategy.

For dimers, the populations of different conformers are relatively similar across both methods, with the hydrogen bonded conformer being the most populated in both cases. However, notable differences emerge as the aggregate size increases. For trimers, the most populated conformer changes from dimer-like with Q/s to linear with C/d. This trend, favoring linear conformers in the C/d approach, continues with tetramers, where linear conformers remain the most populated with C/d, whereas the cyclic conformer becomes dominant with Q/s.

Notably, the cyclic conformer represents one of the most striking differences between the two methodologies: it is significantly populated in both trimer and tetramer aggregates with Q/s, while C/d predicts a relatively low population. This high population of cyclic structures observed with Q/s may, in fact, be overestimated, as previously noted in similar studies.? The relative accuracy of each strategy will be evaluated further in the next section, where these populations are used to simulate the broad IR O–H stretching band of phenol as a function of concentration.

Simulation of the IR Spectrum

4.2

We focused on simulating the OH region of the IR spectra for the mixtures. The spectral shapes in this region arise from the variety of aggregates formed, each exhibiting distinct IR features whose populations depend on the relative concentrations of phenol and carbon tetrachloride, as shown in the previous section. This makes calculating this spectral region particularly challenging, as it requires careful consideration of the interplay of nonbonding interactions, especially hydrogen bonding, that influence the relative populations of the aggregates.? For this reason, such a simulation serves as a valuable benchmark to assess the accuracy of the populations computed using the two methodologies proposed in this work, Q/s and C/d.

To isolate the effects of relative populations, we evaluated the spectra from a weighted average over the interaction-pattern conformers, using identical individual spectra for each of them. In turn, these individual spectra are computed as weighted averages of the IR spectra for each optimized isomer, calculated at the DFT level, with relative weights provided by the Q/s method. The spectra for each conformer are included in Section F of the SI.

Let us begin by analyzing the shape and position of the different aggregates displayed in Figure for both the Q/s and C/d strategies. In general, we find the expected behavior, ?,?−? ? with the position of the band going toward lower frequencies as the size of the aggregate increases. It should be noted, however, that the average bands for each aggregate generally consist of a complex superposition of different conformers. In this sense, since not all conformers are solely dictated by hydrogen bonds, e.g., those with π-π stacking interactions, there are some features that lie at higher frequencies. Moreover, it is worth noting that the shape of each aggregate also depends on the adopted strategy. Namely, C/d generally results in broader spectra, suggesting the inclusion of a wider variety of species in the average, which follows the distribution of relative populations displayed in Table.

IR spectrum of phenol–tetrachloride mixtures at different concentrations, simulated through C/d (bottom panels) and Q/s (top panels) strategies. The contributions of monomers to tetramers are depicted individually. Inset: zoom of the plot to make evident the small bands appearing in the intermediate region in the case of a concentration of 0.115 M with the Q/s approach.

We continue by identifying the contributions of monomers, dimers, trimers, and tetramers to the final spectra. It is noteworthy that the spectra for each aggregate differ not only in intensity between methods but also in shape. The relative weight of each aggregate aligns with the populations shown in Figure. For example, at the lowest concentration, the spectrum is clearly dominated by the monomer in the Q/s method, while the contribution of the dimer, although lower than that of the monomer, is still notable in the C/d approach.

As the phenol concentration increases to 0.450 M, Q/s reveals a predominant contribution of the tetramer, even if the monomer remains as the major component. In contrast, C/d shows a more balanced contribution across the different aggregates. Finally, in the most concentrated sample, the tetramer band is the most intense with Q/s, while C/d again provides a more evenly distributed contribution among the aggregates with an increased contribution of the tetramer compared to lower concentrations.

The spectra in Figure display a complex band structure with numerous well-resolved peaks in the 3150–3600 cm^–1^ region. This effect, observed with both the Q/s and especially the C/d strategies, contrasts with the broad, structureless band seen in experimental data.? This discrepancy likely arises from the conformational flexibility associated with a relatively flat potential surface and the sensitivity of OH stretching frequencies to hydrogen bonding in this region. Accurately capturing this effect would require incorporating contributions from out-of-equilibrium structures, which, while accessible through the C/d protocol, lies beyond the scope of this study. Instead, we address this phenomenon phenomenologically by applying larger Gaussian broadening specifically in this region to account for inhomogeneous broadening. This adjustment results in smoother curves that more closely resemble the experimental band. Specifically, we apply Gaussian broadening with a half-width at half-maximum (HWHM) of 60 cm^–1^ in the 3150–3600 cm^–1^ region while retaining a Lorentzian broadening with HWHM of 10 cm^–1^ in the rest of the spectral region.

The resulting spectra, simulated with both Q/s and C/d, alongside the experimental one,? are shown in Figure. Focusing first on the experimental data, it is evident that for all concentrations there are a number of bands that remain nearly unchanged. These can be associated with the monomer, at approximately 3600 cm^–1^, and CH stretching vibrations (the double-peaked band around 3000 cm^–1^). Both of these features are identifiable in the simulated spectra with the Q/s and C/d methods, with slight deviations in position that can be attributed to the usual inaccuracies in DFT-computed frequencies at the harmonic level even if using a scale factor.

IR spectrum of phenol-tetrachloride mixtures at different concentrations (0.115, 0.450, and 1.100 M), simulated through C/d (bottom panel) and Q/s (top panel) strategies, with a phenomenological broadening in the O–H region using a Gaussian with a HWHM of 60 cm–1. The experimental spectra for these concentrations are included in the central panel.

The main concentration-dependent changes in the spectrum occur in the intermediate region between these relatively stable bands. Signals in this region correspond to the aggregates as discussed above, and in the following, we will focus our discussion on them, comparing the experiments with the theoretical simulations at each concentration.

Starting with the most diluted sample, the experimental spectrum shows a band in the intermediate region located just below the band of the monomer. While appreciable, this band is significantly less intense than the monomer one. In the simulations, the Q/s method does not produce any significant signal in this region, whereas C/d displays a small contribution from the dimer, which is consistent with the experimental data, although shifted to lower frequencies.

Moving to the intermediate concentration, the experimental band in the intermediate region broadens and increases in intensity. In this case, the Q/s approach produces a signal mainly associated with the tetramer, which is again notably less intense than the experimental band. In contrast, the C/d method is able to capture the broadening and intensity increase in this intermediate region, resulting in a simulated band that aligns quite well with the experimental data, although it is shifted to higher frequencies, suggesting a possible underestimation of tetramer-related bands.

Finally, for the most concentrated sample, the experimental band in the intermediate region shows a moderate increase in intensity compared to that of the previous concentration, but its shape remains mostly the same, with a slight enhancement on the low-frequency side. The Q/s method displays a significant increase in intensity, dominated by the tetramer-related band. Although the intensity is similar to that in the experiment, the band is narrower and overly centered on the tetramer. In contrast, the C/d method provides a simulated band that follows the experimental trend, showing an increase in intensity compared to the previous concentration. Regarding the shape, there is a change induced by the increased contribution of the tetramer, resulting in a band that aligns more closely with the experimental data.

Following analysis of the OH region in the IR spectrum (Figure), it is evident that the C/d strategy outperforms the Q/s approach, indicating that the populations derived from the classical MD trajectory better reflect the experimental behavior. But which factors account for the superiority of C/d populations over those from Q/s? To address this question, we need to consider the two thermodynamic contributions to population distributions: the enthalpic and entropic terms associated with the formation of aggregate.

The enthalpic contribution is primarily dictated by intermolecular interactions, including hydrogen bonds and π-π interactions. In this sense, it is expected that the dispersion-corrected DFT calculations in the Q/s approach will outperform the force field calculations applied in C/d. Regarding the treatment of the environment, Q/s employs an implicit solvation model (PCM), which although may face limitations in describing specific interactions such as hydrogen bonds, provides a reasonable account of polarization effects, which are likely more relevant in the case of carbon tetrachloride solutions. Indeed, the potential inaccuracies in the FF, which, despite offering an explicit description of the solvent, completely neglects key effects such as polarization, could partly explain some of the deficiencies in C/d, such as its apparent underestimation of tetramer populations. Therefore, enthalpy differences cannot explain the better performance of the C/d strategy. The key to understanding this difference would necessarily lie in the entropic term, which is linked to the vast number of configurations arising from the significant molecular flexibility in loosely bound aggregates. This implies that many configurations are necessary to achieve population distributions in the experimental system, which becomes unmanageable for the automated conformational search of the Q/s approach. Conversely, this conformational flexibility is, by construction, effectively captured by the MD simulation adopted in the C/d strategy.

The entropy-based analysis further elucidates the prevalence of cyclic conformers in the Q/s approach, whereas linear conformers are more common with the C/d strategy, as shown in Table. The significantly larger number of configurations available to linear conformers makes it challenging to identify all possible configurations via the automated search used in the Q/s approach. This limitation leads to an underestimation of the entropic contributions that favor linear conformers over cyclic conformers, which, while being enthalpically favorable, display a much lower entropy. Thus, the apparent overestimation of cyclic structures with Q/s-like approaches previously reported,? would rather correspond to an underestimation of chain-like configurations.

Such an analysis in terms of entropy also helps rationalize the trends observed in the equilibrium constants for aggregation discussed in relation to the populations in Figure. While the constants are consistently lower with the Q/s approach, they tend to converge toward the C/d values as aggregate size increases. This can be understood by considering the balance between enthalpic and entropic contributions to the free energy: as the aggregate size grows, the number of stabilizing interactions (e.g., hydrogen bonds or π-stacking) increases, making the enthalpic term, similarly described in both approaches, more dominant. Since the entropic term, which is poorly captured in Q/s, becomes less relevant for larger aggregates, the agreement between Q/s and C/d improves with aggregate size.

It is important to note that once an optimized configuration of an isomer is identified in the Q/s method, conformational motions around that configuration are included through the partition function, which is based on harmonic model surfaces and a quantum mechanical treatment of vibrations. For intramolecular vibrations, the entropic terms are better described by this quantum description compared to the classical treatment of MD trajectories in the C/d approach. However, aggregate formation is primarily governed by intermolecular nonbonded interactions, which lead to loose nuclear motions that are effectively described at the classical level. For the study of nonbonded species, the C/d strategy thus emerges as a very suitable approach.

Moreover, the C/d approach demonstrates remarkably good agreement with experimental data. This is a notable achievement, as the shape of the OH band results from a complex interplay of nonbonding interactions governing aggregate formation, which in turn defines the band structure.

Conclusions

5

In this study, we have introduced and evaluated two distinct methodologies for sampling the nuclear degrees of freedom in a system governed primarily by nonbonded interactions: a liquid mixture of phenol and carbon tetrachloride. Our aim was to determine which protocol provides a more reliable description of the aggregate populations. Subsequently, we assess their performance by comparing the average IR spectra, computed with the populations resulting from each methodology, against the experimental one at varying concentrations, particularly focusing on the OH stretching region.

The first method employs a systematic conformational search followed by a potential energy surface (PES) expansion around each identified structure. This approach, which we termed Q/s, computes the nuclear partition function by assuming a harmonic PES and treats vibrational modes at a quantum mechanical level (with rotational and translational contributions considered semiclassically and classically, respectively). The second method, termed C/d, calculates populations by conducting a classical molecular dynamics (MD) simulation, thus treating nuclear motion fully classically.

A rigorous framework was essential to identify and categorize aggregates consistently across both methodologies. We accomplished this through a graph-theory classification based on nonbonded interaction patterns that define each aggregate, including hydrogen bonds, π-π stacking, and π–OH bonds. The Q/s approach identifies several optimized structures for each conformer during the automated search, capturing some degree of conformational flexibility. The C/d approach inherently includes this flexibility through MD trajectories.

Upon comparing the aggregate populations obtained with the Q/s and C/d approaches, we found that they align closely for smaller aggregates, such as dimers, but diverge significantly as the aggregate size increases. Specifically, Q/s tends to favor cyclic conformers, while C/d predicts a predominance of linear aggregates. This divergence arises due to the different entropic and enthalpic contributions captured by each method: while Q/s benefits from quantum mechanical accuracy to evaluate the interactions, it under-represents the vast configurational entropy associated with loosely bound aggregates due to the limitations of systematic conformational search. In contrast, C/d captures a more realistic entropic profile by exploring a broader ensemble of conformers through MD, even though it approximates intermolecular interactions by using a molecular mechanics force field.

The ability of the Q/s method to align with experimental observations is limited by the extent of its conformational search. While an exhaustive search could yield results closer to those of C/d, nonbonded aggregates exhibit high configurational entropy, imposing practical limits on systematic search-based approaches. Aggregates with such loosely bound, flexible structures inherently challenge even sophisticated automated search techniques as capturing their full entropy becomes infeasible. Our findings thus indicate that the C/d approach provides a more reliable representation of aggregate populations in this system, as demonstrated by its superior performance in simulating IR spectral regions sensitive to nonbonded interactions, particularly, the broad OH stretching band.

To achieve such complete conformational sampling within Q/s-based strategies, we recommend carrying out the screening of the structures not solely based on energy criteria (relative population of the aggregates) but on entropy-like criteria based on the convergence of the multiconformational contributions to the free-energy, or equivalently, the conformational entropy.? Moreover, when possible, this convergence must be achieved for each isomer group. Naturally, this is associated with a significant increase in computational time, particularly for noncovalent complexes, where large basis sets or the counterpoise correction are required to mitigate basis set superposition error.

It is noteworthy that the C/d approach not only outperforms Q/s but also achieves an impressive absolute accuracy. This method closely reproduces the experimental trends of the broad O–H stretching band in IR as phenol concentration changes. This is indeed a challenging task, given that the observed broadband reflects the complex network of nonbonding interactions that give rise to different populations of conformers as the concentrations increase. With minor deviations, C/d captures these experimental variations nearly quantitatively. In essence, its capacity to handle loosely bound, entropically driven conformers makes the C/d approach a compelling choice for systems governed by weak, noncovalent interactions. Of course, it is also possible to adapt Q/s approaches to incorporate the entropic effects in an effective manner, e.g., through Bayesian inference.? In this context, our results can also be helpful in improving such methodologies.

The observed deviations with the C/d approach are likely due, in part, to inaccuracies in the FF used to evaluate the intermolecular interactions. In this sense, this system may serve as an effective benchmark to assess refined intermolecular FF quality. In this study, we adopted standard GAFF parameters for the intermolecular potential, though we are currently pursuing refined parametrizations based on quantum mechanical (QM) calculations, as implemented in Picky. ?,? These adjustments may further improve the agreement with experimental data. Moreover, another source of potential inaccuracies arises from the presence of aggregates larger than the tetramer, whose influence increases with the concentration. One route for improving the simulations involves extending the C/d approach to avoid strict reliance on predefined interaction patterns, e.g., by randomly selecting individual phenol molecules and subsequently incorporating their environment in the analysis. This strategy is currently being investigated by our group.

Another limitation is that the method does not directly account for the inhomogeneous broadening that affects the region where aggregate signals emerge. In this work, we included this effect phenomenologically, adding specific Gaussian broadening to this region to mimic the experimental band closely. An alternative to incorporate such effects from first principles could involve mixed quantum-classical strategies similar to those used in electronic spectroscopy,? where the QM spectra of stiff solute degrees of freedom are computed across configurations generated via MD simulations. For IR spectra, this would require spectral frequency calculations at nonstationary points, which is feasible today using curvilinear internal coordinates.? These strategies are currently being explored to achieve a fully ab initio simulation of the challenging broad OH IR band.

Supplementary Material

Bibliography97

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Jena S.Dutta J.Tulsiyan K. D.Sahu A. K.Choudhury S. S.Biswal H. S.Noncovalent interactions in proteins and nucleic acids: beyond hydrogen bonding and π-stacking Chem. Soc. Rev.2022514261428610.1039/D 2CS 00133 K 35560317 · doi ↗ · pubmed ↗
2Grimme S.Accurate description of van der Waals complexes by density functional theory including empirical corrections J. Comput. Chem.2004251463147310.1002/jcc.2007815224390 · doi ↗ · pubmed ↗
3Antony J.Grimme S.Density functional theory including dispersion corrections for intermolecular interactions in a large benchmark set of biologically relevant molecules Phys. Chem. Chem. Phys.200685287529310.1039/b 612585 a 19810407 · doi ↗ · pubmed ↗
4Tomasi J.Mennucci B.Cammi R.Quantum Mechanical Continuum Solvation Models Chem. Rev.20051052999309410.1021/cr 990400916092826 · doi ↗ · pubmed ↗
5Liu C.Piquemal J.-P.Ren P.AMOEBA+ Classical Potential for Modeling Molecular Interactions J. Chem. Theory Comput.2019154122413910.1021/acs.jctc.9b 0026131136175 PMC 6615954 · doi ↗ · pubmed ↗
6Inakollu V. S.Geerke D. P.Rowley C. N.Yu H.Polarisable force fields: what do they add in biomolecular simulations?Curr. Opin. Struct. Biol.20206118219010.1016/j.sbi.2019.12.01232044671 · doi ↗ · pubmed ↗
7Prampolini G.Livotto P. R.Cacelli I.Accuracy of Quantum Mechanically Derived Force-Fields Parameterized from Dispersion-Corrected DFT Data: The Benzene Dimer as a Prototype for Aromatic Interactions J. Chem. Theory Comput.2015115182519610.1021/acs.jctc.5b 0064226574315 · doi ↗ · pubmed ↗
8Ferro-Costas D.Fernández-Ramos A.New computational tools for chemical kinetics: the Cathedral Package Theor. Chem. Acc.20231427610.1007/s 00214-023-03012-w · doi ↗