Determination of Nucleotide–Nucleotide and Nucleotide–Amino Acid Binding Interactions from All-Atom Potential-of-Mean-Force Calculations
Alejandro Feito, Eduardo Pedraza, Estefania Cuesta, Alejandro Castro, Antonio Rey, Ignacio Sanchez-Burgos, Rosana Collepardo-Guevara, Andrés R. Tejedor, Jorge R. Espinosa

TL;DR
This paper uses detailed simulations to study how RNA and proteins interact at the molecular level, providing insights for modeling biomolecular condensates.
Contribution
The study provides a detailed quantitative analysis of nucleotide–amino acid interactions using all-atom simulations and PMF calculations.
Findings
Lys, Arg, Asp, Glu, Gln, Ser, and Asn amino acids consistently interact with RNA nucleotides through hydrogen bonding and cation−π interactions.
Phosphate–Arg contacts and π–π stacking with aromatic residues are key for RNA–protein condensate cohesion.
Salt concentration reduces electrostatic self-repulsion in phosphate–phosphate interactions.
Abstract
Biomolecular condensates emerge from multivalent interactions between proteins and nucleic acids and are frequently modeled by using coarse-grained molecular dynamics simulations. The parametrization of these models critically depends on atomistic data describing the underlying molecular interactions. In this work, we employ all-atom molecular dynamics simulations and potential-of-mean-force (PMF) calculations to investigate the landscape of interactions between RNA nucleotides and protein amino acids. We begin by characterizing nucleotide–nucleotide binding modes through canonical base-pairing analysis, observing notable agreement in the predictions from both the AMBER03ws and CHARMM36 force fields. Further rationalization of different nucleotide–nucleotide interaction modes involves the calculation of PMFs for ribose–ribose, phosphate–phosphate, and RNA tertiary interactions such as…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
1
2
3
4
5
6
7- —H2020 European Research Council10.13039/100010663
- —Ministerio de Ciencia, Innovaci?n y Universidades10.13039/100014440
- —Ministerio de Ciencia, Innovaci?n y Universidades10.13039/100014440
- —Ministerio de Ciencia, Innovaci?n y Universidades10.13039/100014440
- —HORIZON EUROPE European Research Council10.13039/100019180
- —Engineering and Physical Sciences Research Council10.13039/501100000266
- —Engineering and Physical Sciences Research Council10.13039/501100000266
- —Emmanuel College, University of Cambridge10.13039/501100000609
- —European Social Fund Plus10.13039/501100004895
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsForce Microscopy Techniques and Applications · DNA and Nucleic Acid Chemistry · RNA and protein synthesis mechanisms
Introduction
1
Intracellular compartmentalization is a fundamental process that enables cells to organize their components in space and time to perform diverse biological functions. This biomolecular organization relies on the coordinated dynamics of both membrane-bound and membraneless organelles, the latter also termed biomolecular condensates. These condensates are thought to form through the spontaneous self-assembly of protein and nucleic acid solutions ?−? ? a mechanism that has attracted considerable attention across disciplines due to its broad implications for health and disease. Condensate formation allows cells to establish highly concentrated, yet transient, biomolecular assembliessuch as Cajal bodies,? stress granules,? P bodies,? and nucleoli?that facilitate essential processes including biomolecular compartmentalization,? regulation of gene expression,? cellular stress response,? and signal transduction.? Given that many biomolecular condensates are primarily composed of proteins and nucleic acids (in the form of RNAs or DNAs), it is crucial to understand the central role of their interactions in driving self-assembly and organization through phase-separation. ?−? ? Importantly, transitions from dynamic, functional biomolecular condensates to irreversible, solid-like aggregates are mediated by the local strengthening of interprotein ?−? ? and RNA–RNA interactions. ?,? Some of these transitions have been associated with the development of multiple neurodegenerative disorders, including amyotrophic lateral sclerosis (ALS) or frontotemporal dementia (FTD) ?,? among others,? leading to aberrant condensate formation,? fibril aggregation,? and ultimately, cytotoxicity.?
In the complex network of molecular interactions found in biomolecular condensates, electrostatic interactions play a central role in regulating the condensation of many different biomolecules. ?,? In this context, RNA has been identified as a primary electrostatically driven condensate modulator, ?,?−? ? ? capable of binding protein RNA-recognition motifs rich in positively charged ?,?−? ? and polar? amino acids. More importantly, recent in vitro assays also suggest that RNA alone is able to undergo phase separation, ?,?,? favored by low concentrations of magnesium ions that screen phosphate–phosphate electrostatic repulsion. Furthermore, tertiary interactions such as those in G-quadruplex (G4) structuresnoncanonical DNA/RNA formations composed of two or more guanine tetrads connected by Hoogsteen hydrogen bonds in a square planar arrangement ?,?,? have been associated with RNA aberrant aggregation leading to kinetically trapped states.? The precise interactions underlying the formation of RNA condensates and the role of salt concentration have not yet been fully explored, ?,? although it is well established that π–π stacking,? hydrogen bonding,? and cation−π interactions ?,? modulate RNA and RNA–protein associations, thereby regulating key physicochemical properties of condensates such as viscosity and surface tension. ?−? ?
In the past decade, molecular dynamics (MD) simulations have become a fundamental tool for studying condensate formation, providing mechanistic insights into their phase behavior, time-dependent material properties, and the key residues governing these processes. ?−? ? ? ? In particular, residue-resolution coarse-grained (CG) models such as the CALVADOS2,? HPS-Urry,? or the Mpipi-Recharged ?−? ? have significantly advanced our ability to determine the physicochemical properties of biomolecular condensates in a precise and computationally efficient manner. ?,? The pioneering HPS model and its subsequent parametrizations were mainly developed to reproduce experimental radii of gyration and to qualitatively capture the phase behavior of different proteins. ?,?,? However, the focus of newly designed models has shifted toward bottom-up approaches that leverage atomistic simulations, ?,? bioinformatics,? or machine-learning-based parametrizations. ?,?−? ? ? In particular, potential-of-mean-force (PMF) calculations have gained prominence for evaluating the free energy of binding in various biological systems, including peptides, ?,?,? amino acids, ?,?,?,?−? ? and RNA–protein interactions.? These approaches provide submolecular insights into the mechanisms and interactions underlying condensate formation or aberrant liquid-to-solid transitions. ?,?,?
In this work, we perform a systematic characterization of RNA nucleotide pairs and nucleotide–amino acid interactions by means of all-atom PMF calculations with explicit solvents and ions. We first benchmark representative atomistic force fieldsAMBER03ws? and CHARMM36/TIP4P2005 ?,? for describing canonical base-pairing and verify that both reproduce the expected hydrogen-bonding hierarchy (CG > AU > GU). We analyze different potential interaction modes of nucleotides by calculating their PMF profiles at varying relative orientations. In particular, we estimate binding interaction strengths for base–base, phosphate–phosphate, and ribose–ribose interactions. Moreover, RNA tertiary interactions involving G-quadruplexes are also included in our PMF calculation scheme to further characterize more complex RNA structures. We then explore how specific nucleotide–nucleotide binding affinities depend on the concentration of monovalent salt (0–3 M NaCl), demonstrating that while base–base and ribose–ribose interactions mostly remain unaltered, high salt concentrations effectively screen phosphate–phosphate repulsion. We also extend our calculations to encompass interactions between the 20 natural amino acids and the four RNA nucleotides, covering all possible combinations. To this end, we first validate the AMBER03ws/TIP4P2005 force field ?,? against the a99SB-disp/TIP4P-disp ? model to ensure accurate reproduction of pairwise amino acid interactions. We then determine the binding of amino acids to nucleotides, focusing on those relevant for biomolecular condensate formation, including interactions between phosphate groups and charged residues as well as π–π stacking of aromatic residues with the base and ribose. Overall, our results advance the understanding of nucleotide–amino acid binding, providing an extensive library that serves as a foundation for the future development of improved coarse-grained models of protein and nucleic acid condensates and for rationalizing experimental in vitro and in vivo evidence at the molecular level.
Methodology
2
The primary focus of this study is to rationalize the molecular interactions between the residues and nucleotides involved in biomolecular condensation. To this end, atomistic PMF calculations were performed using the GROMACS simulation package (version 2023).? The molecular interactions of amino acids and nucleotidesincluding nucleotide–nucleotide, amino acid–amino acid, and nucleotide–amino acidwere modeled using the AMBER03ws/TIP4P2005 force field ?,? and validated by contrasting with the CHARMM36/TIP4P2005 ?,? and a99SB-disp/TIP4P-disp ? models for the nucleotide–nucleotide and amino acid–amino acid interactions, respectively. Note that we refer to the water model in a99SB-disp as TIP4P-disp, given that a99SB-disp is a coparameterization of the AMBER99SB*-ILDN-Q? protein force field together with the TIP4P-D water model,? which includes modifications to the C6 dispersion term of the water oxygen. All simulations were performed at 300 K and physiological salt concentration (150 mM NaCl). Additionally, the effect of salt concentration was explored in the range of 0–3 M NaCl for nucleotide–nucleotide and amino acid–amino acid interactions. To build the systems, the phosphate group of each nucleotide was capped with a methyl group, while the N- and C-terminal ends of each amino acid were capped with acetyl and N-methyl groups, respectively.
For nucleotide–nucleotide PMFs, the nucleotide pair was pulled along a dissociation coordinate defined by the distance between the centers of mass (COMs) of the interacting structural regions: the nitrogenous base, the ribose, or the phosphate group. Thus, nucleotide pairs were oriented to face each other via the nitrogenous base, ribose, or phosphate with the COM of the corresponding region serving as the reference in the initial configuration. For nucleotide–amino acid and amino acid–amino acid PMFs, the reaction coordinate was defined as the distance between the COMs of the most relevant regions of both molecules (e.g., bases, riboses, side chains, or aromatic rings). In nucleotide–amino acid PMFs, molecules were positioned such that the base faced the side chain of the amino acid. For amino acid–amino acid PMFs, the pairs were oriented to face through their side chains to enable potential binding. In all cases, the residue pairs were placed in a cubic box of dimensions of 3.4 × 4.4 × 10 nm containing approximately 5000 water molecules. Some water molecules were replaced by Na^+^ and Cl^–^ ions to achieve the desired salt concentration and electroneutrality. Energy minimization was performed with a force tolerance of 1000 kJ mol^–1^ nm^–1^, applying positional restraints of 8000 kJ mol^–1^ nm^–2^ to all heavy atoms of the interacting residues or nucleotides.
During production runs, temperature coupling was achieved using the velocity-rescale (v-rescale) thermostat with a relaxation constant of τ_T_ = 1.0 ps, and pressure coupling was applied via the Parrinello–Rahman barostat with τ_P_ = 1.0 ps at 1 bar. Positional restraints of 1000 kJ mol^–1^ nm^–2^ were applied in directions perpendicular to the pulling axis. The COM distance between nucleotide pairs was controlled using a harmonic umbrella potential with a force constant of 10,000 kJ mol^–1^ nm^–2^, while a lower constant (6000 kJ mol^–1^ nm^–2^) was used for amino acid–amino acid interactions. Bond constraints on hydrogen-containing bonds were enforced using the LINCS algorithm, allowing an integration time step of 2 fs. Periodic boundary conditions (PBC) were maintained in all three spatial directions. Short-range dispersive and electrostatic interactions were calculated using a cutoff of 0.9 nm, and long-range contributions were computed with the Particle-Mesh Ewald method for electrostatic interactions.? Approximately 30 umbrella sampling windows were defined and spaced every 0.05 nm, covering the full COM distance range. Each window was simulated for 10 ns, and the Weighted Histogram Analysis Method (WHAM),? as implemented in GROMACS, was used to reconstruct the free-energy profiles. The initial 2000 ps of each simulation were discarded to ensure equilibration. For RNA tertiary interactions involving G-quadruplexes, we employed a PDB structure as the starting configuration. All parameters and conditions were identical to those described above, except that the pulling was applied to one strand, while the remaining three were treated as a rigid block. The system was solvated in a rectangular box of dimensions 7 × 9 × 10 nm containing approximately 20,000 water molecules. The reaction coordinate was defined as the COM distance between the pulled strand and the rest of the quadruplex, and sampling was performed using a harmonic umbrella potential with a force constant of 55,000 kJ mol^–1^ nm^–2^.
The binding free energy for all PMF simulations was estimated as the global minimum of the resulting free-energy curve. We verified that this approach yields equivalent values to integrating the negative part of the PMF, as confirmed for base–base and amino acid–amino acid systems in Figures S3 and S6. To test the robustness of the positional constraints, additional PMF calculations allowing free rotation of both residues were also performed (Figure S1). The uncertainty of the PMF minimum was estimated using the Bayesian bootstrap method.? To account for local fluctuations, the final error was obtained by combining the Bayesian bootstrap errors of the minimum and adjacent points.
Results
3
Characterization of the Different Binding
Modes in Nucleotide–Nucleotide Interactions
3.1
We systematically performed atomistic PMF calculations for different configurations of nucleotide pairs. With this approach, we unraveled the various binding modes between nucleotides and how they are modulated by their relative orientations and salt concentration. We first computed the PMFs for canonical Watson–Crick base pairs (i.e., CG and AU) and for the noncanonical GU wobble pair, interacting through the base in specific configurations extracted from experimental PDB structures. In particular, we used the PDB codes 1RNA for AU, 259D for CG, and 1A4D for GU. All simulations were performed using the AMBER03ws force field, ?,? which enables intrinsically disordered proteins to explore extended conformational states, reduces nonspecific aggregation, and provides a realistic description of solvation and the structural properties of disordered chains. In addition, AMBER03ws has been shown to realistically capture RNA binding to TDP-43? and recapitulate the free energy of binding between different types of nucleotides. ?,? As a benchmark, we also calculated the PMF profiles for the canonical base pairs using the CHARMM36 force field,? which has been specifically developed and validated for proteins,? lipids,? and nucleic acids.?
In FigureA, we show the PMF profiles of the canonical Watson–Crick pairs (CG, AU) and the noncanonical wobble pair (GU) obtained with both force fields. The results reproduce the expected trend for Watson–Crick base pairing, corresponding to three versus two hydrogen bonds for CG and AU, respectively. The CG pair exhibits the deepest free energy minimum, followed by AU, maintaining a ratio of 0.8 between them, which is reasonably close to the expected 2/3 value derived from the number of hydrogen bonds. Importantly, both force fields yield similar profiles (∼0.5 kcal/mol), although the PMFs from AMBER03ws consistently displayed slightly deeper minima. This behavior can be attributed to the stronger interactions captured by AMBER03ws, as the discrepancies between the two models become more pronounced with an increasing number of hydrogen bonds between the nucleotide pairs. Additionally, the GU wobble pair showed a significantly lower binding strength, as expected, and exhibited a similar shape for both force fields. Crucially, both models reproduced the same relative energetic order (CG > AU > GU), demonstrating the robustness of our approach. In fact, when the absolute minima of the interaction profiles are normalized by the binding energy of CG for each model (FigureB), the relative differences between force fields become negligible. In the following calculations, we therefore employ the AMBER03ws model unless otherwise specified.
(A) PMF profiles as a function of the center-of-mass (COM) distance between the bases of the corresponding pair obtained with AMBER03ws (solid lines) and CHARMM36 (dotted lines) force fields for the canonical pairs adenine–uracil (AU, blue) and cytosine–guanine (CG, red), and the guanine–uracil wobble (GU, green) pair. (B) Free energies estimated from the minimum of the PMF profiles for canonical nucleotide pairs with AMBER03ws (orange bars) and CHARMM36 (green bars) and normalized by the value of the CG pair. Absolute minimum free energies of the PMF profiles for canonical and wobble (C) and noncanonical (D) nucleotide pairs. Above the energy plot, we depict the nucleotide pair configurations used in the PMF calculations, including arrows to indicate the dissociation direction with the atom color code: light blue (carbon), dark blue (nitrogen), red (oxygen), and orange (phosphorus). All calculations were performed at T = 300 K, p = 1 bar, and 150 mM NaCl.
We next extended our free energy calculations to all remaining nucleotide combinations to estimate their interactions through the base (FigureC,D). We report the binding free energies for the canonical and wobble pairs (AU, CG, and GU; FigureC) as well as for the remaining noncanonical base pairs: CC, UU, AA, GG, AC, CU, and AG (FigureD; see Figure S2A in the Supporting Information for the corresponding PMF profiles). For these noncanonical pairs, the configuration was set up by placing the residues in a mirror-image arrangement with the bases in the same plane, as described in Section and illustrated in FigureD. This configuration enables base–base interactions and facilitates hydrogen bonding formation. As shown in FigureD, the most energetically favorable pair is CC, consistent with cytosine’s ability to form hydrogen bonds. ?,? The UU and AC pairs exhibited similar stability, slightly below that of CC, while the remaining pairs displayed interaction energies of the order of thermal fluctuations intrinsic to the system (∼0.5 kcal/mol; see Section for further details on this estimation).
Complementary to our approach, some studies have used the C1′–C1′ (C1′ atom corresponds to the anomeric carbon of the ribose sugar, which is covalently bonded to the nucleobase) distance as a geometric descriptor correlated with nucleotide pairing type, ?−? ? both for canonical Watson–Crick pairs and noncanonical pairs. Evaluating the C1′–C1′ distance provides insight into the stability of the pairs and can be readily computed from our simulations by modifying the reaction coordinate (see Figure S2B in the Supporting Information). We find that as the absolute interaction energy decreases (FigureD), the C1′–C1′ distances tend to increase, indicating reduced stability for noncanonical base pairs relative to Watson–Crick pairs (see Figure S2B), consistent with experimental observations. ?,? Some authors have also proposed calculating the binding free energy from PMFs by integrating the negative part of the profile rather than measuring the depth of the global minimum. ?,?,? In Figure S3, we showed the resulting binding energies obtained using this integration approach for each nucleotide pair (see Supporting Information for further details of this analysis). Importantly, we find that the overall trend remains consistent across all pairs, although the relative binding interaction strength of AA, GG, and CU is slightly increased due to local minima in their PMF profiles at intermediate distances (see Figure S2A). Nevertheless, the relative correlation of binding interaction strength is preserved independent of the applied analysis.
Base–base interactions are crucial for nucleotide binding, particularly for pairs capable of forming persistent hydrogen bonds (i.e., CG, AU, GU). However, the energy landscape of nucleotide interactions is not solely determined by the nitrogenous bases; interactions between other constituent groups, such as ribose and phosphate, can also exert a significant influence. Importantly, a systematic analysis of these interactions allows us to understand how the stability of nucleotide assemblies in various cellular organelles is affected by different nucleotide groups. To this end, we performed PMF calculations evaluating ribose–ribose (FigureA,B) and phosphate–phosphate (FigureC,D) interactions, considering both cis and trans configurations (see snapshots in Figure). In the cis configuration, the nonfacing groups are arranged symmetrically, whereas in the trans configuration, the same groups are oriented to oppose each other (see snapshots in FigureB,D). In general, ribose–ribose interaction profiles exhibit weak attractive energies for both cis and trans orientations (FigureA), particularly when compared to canonical base pairing (Figure). The trans configurations show slightly less attractive interactions than the cis configurations (FigureB), except for the GU pair, likely due to the lower affinity between these nucleotides, which renders their ribose–ribose binding less orientationally dependent. Crucially, ribose–ribose interactions are largely insensitive to the specific nucleotide pair, as the magnitude of the well depth is comparable across the different studied cases. This observation suggests that ribose forms transient interactions rather than driving nucleotide association or RNA phase separation.
(A) PMF profiles as a function of the center-of-mass (COM) distance between the riboses of the canonical AU, CG, and GU wobble pairs in cis (solid line) and trans conformation (dashed line). (B) Free energy interaction from the PMF minimum (panel A) for the configurations shown in the snapshots, including arrows to indicate the dissociation direction. (C) PMF profiles as a function of the center-of-mass (COM) distance between the interacting phosphates of the canonical AU, CG, and GU wobble pairs in cis (solid line) and trans conformation (dashed line). (D) Free energy interaction from the PMF minimum (panel C) for the configurations shown in the snapshots, including arrows to indicate the dissociation direction. (E) Snapshot of the G-quadruplex structure from the experimental PDB code 4XK0 (top) along with the PMF interaction profile as a function of the center-of-mass (COM) distance between the dissociated strand and the ternary cluster (bottom). All calculations were performed at T = 300 K, p = 1 bar, and 150 mM NaCl.
The phosphate–phosphate PMF profiles (FigureC) provide insight into the electrostatic interactions arising from their negative charges. The curves reflect primarily repulsive interactions, with cis configurations exhibiting slightly larger negative energy values, in contrast to the trans configurations (FigureD). It is important to note that the binding energy alone does not fully determine the interaction mode, as the shape of the PMF can also influence how the residues approach each other. In this context, the phosphate–phosphate PMF curves consistently display a repulsive barrier of ∼1–2 kcal/mol for all studied pairs, which hinders effective binding. Interestingly, the trans configuration of the GU pair shows an energy minimum of similar magnitude to that of the cis configuration, supporting the notion that guanine may promote relatively stable interactions between nucleotide pairs.
RNA and DNA secondary structures, governed by tertiary interactions such as G-quadruplexes, are essential for understanding the mechanisms underlying RNA liquid-like condensation versus solid-like pathological aggregation. ?,?,? G-quadruplex (G4) structures are formed through Hoogsteen-type hydrogen bonds between the bases of four guanine nucleotides, creating a tertiary structure that alters the local properties and geometry of the RNA (FigureE). The significance of this structure lies in its association with multiple pathological processes, including cancer, neurodegenerative diseases, and viral replication, which also make G4s promising therapeutic targets. ?,?,? To investigate the binding interaction strength of a G4 structure, we calculated its PMF profile using an experimental PDB structure (code 4XK0;? FigureE) with the following sequence: UGGGGU. One strand forming the G4 was pulled, while the remaining strands were restrained (see Section for details). The resulting PMF curve showed a strong interaction between the strands, with a free energy minimum of −25 kcal/mol at a COM distance of ≈0.85 nm. Each guanine is expected to form two hydrogen bonds via the N1–H donor group and the carbonyl oxygen at C6 as the acceptor. Considering that each G4 strand contains four guanines, the binding energy associated with the dissociation of one guanine from each tetrad is ∼4.2 kcal/mol. These values are alike to those found for canonical base pairs (FigureA,C), consistent with the notion that the global stability of the G4 structure arises from the cooperative accumulation of multiple hydrogen bonds. For reference, intermolecular interactions within protein cross-β-sheets exhibit binding strengths of ∼15–40 kcal/mol in PMF simulations, ?,?,? comparable to the energy estimated for the G-quadruplex, which can also contribute to kinetically arrested condensates as experimentally reported for G_4_C_2_ RNA repeats.?
Importantly, nucleotide interactions are influenced not only by their relative orientation but also by the salt concentration in the medium. Several studies have demonstrated the effect of salinity on the interaction profiles of amino acids. ?,? Salt plays a crucial role in the stability and dynamics of biomolecular condensates by modulating multivalent interactions that can either promote or inhibit condensate formation in a context-dependent manner. ?,?,?,? In particular, RNAs can undergo phase separation at low concentrations of MgCl_2_ (∼10 mM), as the ions screen electrostatic self-repulsion and enable base pairing. ?,?,?,? To investigate the role of salt in modulating nucleotide–nucleotide interactions, we first benchmarked PMF profiles for different amino acid pairs at 0, 1.5, and 3 M NaCl, comparing our results with those of Krainer et al.? (Figure S4 in the Supporting Information). Our configurations were designed to mimic the setup shown in Figure 6 of the original work.? The resulting PMF curves quantitatively matched those reported by Krainer et al., with only slight shifts in the distance scale for some pairs (e.g., Arg–Tyr, Arg–Glu). Importantly, when normalizing the global minimum energies by the highest value (Figure S4G), the trends closely reproduced the original calculations, confirming that our force field implementation accurately captures the effect of salt concentration on residue–residue protein interactions.
We next extended the analysis to nucleotides, examining canonical base pairs at NaCl concentrations ranging from 0 to 3.00 M. In FigureA–C, we show the free energy profiles for the canonical pairs at 0, 0.15, 0.50, 1.50, and 3.00 M, as indicated in the panels. For base–base interactions, the binding strength moderately varies across significant changes in the NaCl concentration, with only a significant decrease observed as salt increases from 0 to 3 M. A closer inspection of the energy minima (see insets in FigureA–C) revealed that the minimum energy becomes less negative compared to the 0 M reference: by 0.8 kcal/mol for CG, 0.5 kcal/mol for AU, and 0.3 kcal/mol for GU. While the trend appears to be more pronounced for the CG pair since its PMF minimum is deeper in absolute value, the relative variation as a function of salt concentration is similar for the three distinct base pairs, as shown in FigureD. When normalizing the free energy minima by the value at 3 M NaCl, the slight progressive decrease in stability with increasing salt is highlighted, confirming that salt has only a moderate effect on hydrogen-bond-mediated base pairing.
PMF profiles as a function of the center-of-mass (COM) distance between the canonical nucleotide pairs AU (A), CG (B), and GU wobble (C), facing their nitrogenous bases, and for several NaCl concentrations from 0 to 3 M using the AMBER03ws force field. Insets show a zoom on the minima of the PMF profiles. (D) Free energies derived from the minima of the PMF profiles shown in panels A–C at various NaCl concentrations, normalized with respect to the minimum value of each pair at 3 M NaCl. PMF profiles as a function of the center-of-mass (COM) distance between the phosphates of the corresponding pair obtained at different salt concentrations for the canonical nucleotide pairs AU (E) and CG (F) oriented via their phosphate groups. PMF profiles as a function of the center-of-mass (COM) distance between the riboses for AU (G) and CG (H) and interacting through the riboses. Alongside the interaction profiles are the nucleotide pair configurations used in the PMF calculations, including arrows to indicate the dissociation pathway.
Increasing the salt concentration is expected to primarily modulate electrostatic interactions, which in nucleotides are largely associated with the negatively charged phosphate groups. To explore this effect, we analyzed phosphate–phosphate interactions for the AU and CG pairs at NaCl concentrations ranging from 0.15 to 3 M, using the cis configuration with phosphate groups aligned and facing each other (Figures and ?E,F). As expected, the electrostatic repulsion is increasingly screened as salt concentration rises. In particular, the PMF for the CG pair (FigureF) shows a notable reduction in the repulsion at concentrations above 0.15 M NaCl. To further investigate this behavior, we constructed an alternative configuration in which the phosphate oxygens are directly aligned (see Figure S5). In this setup, the AU pair exhibits a pronounced decay in the PMF with increasing NaCl, similar to the cis configuration of FigureE. In contrast, the CG pair shows a nearly constant PMF minimum across all salt concentrations, although the curve smooths the energy barrier at intermediate distances (∼0.9 nm) and shifts the repulsive region to shorter distances (from ∼0.7 to ∼0.6 nm). Finally, we examined the influence of salt on ribose–ribose interactions for the canonical AU and CG pairs. Using a configuration in which the ribose groups face each other symmetrically (FigureG,H), we find that short-range interactions become slightly more attractive at higher salt concentrations. The resulting PMF profiles become significantly deeper as salt concentration is added with the free energy minimum increasing by approximately 0.5 kcal/mol, effectively doubling the well depth.
Benchmark of Amino Acid Pairwise Interactions
through Different Force Fields
3.2
Proteins are major building blocks driving the formation of biomolecular condensates, making the calculation of amino acid–nucleotide interactions central to understanding phase separation at the submolecular level. Before analyzing amino acid–nucleotide interactions, we first assessed whether amino acid interactions are similarly described by the AMBER03ws/TIP4P2005 force field, compared to a99SB-disp/TIP4P-disp, a widely used model for evaluating amino acid and peptide interactions. ?,?,?,?,? To this end, we computed PMF profiles for five amino acid pairs abundant in phase-separating proteins: Ala–Ala, Arg–Arg, Arg–Asp, Arg–Tyr, and Tyr–Tyr, arranging the side chains to face each other (FigureA–E; see Section for further details). The resulting PMFs showed energetically consistent profiles between the two force fields and aligned well with the expected behavior. ?,? Specifically, Arg–Tyr (cation−π?) and Tyr–Tyr (π–π ?,? ) exhibited the highest binding affinities. Differences between the force fields are also observed: AMBER03ws predicts deeper minima for Arg–Asp, Arg–Tyr, and Tyr–Tyr pairs compared to a99SB-disp, reflecting distinct capacities to capture intermolecular interactions between side chains. ?,? Conversely, a99SB-disp favored attractive interactions between like-charged residues such as Arg–Arg, whereas AMBER03ws showed a stronger affinity for oppositely charged pairs, such as Arg–Asp (FigureC). FigureF summarizes these results, showing the PMF energy minima for each pair normalized by the Arg–Tyr interaction, the strongest residue–residue interaction stabilizing protein condensates.? Overall, while a99SB-disp yields slightly more attractive interactions for Ala–Ala and Arg–Arg, AMBER03ws predicts stronger binding for Arg–Asp and Tyr–Tyr. Nevertheless, both force fields provide a consistent description of the interaction binding modes relevant for protein condensate formation.
PMF profiles along the COM distance between the side chains of the corresponding interacting amino acids obtained with AMBER03ws (blue curves) and a99SB-disp (pink curves) for the amino acid pairs alanine–alanine (A), arginine–arginine (B), arginine–aspartic acid (C), arginine–tyrosine (D), and tyrosine–tyrosine (E), oriented to face through their side chains, at 300 K, 1 bar, and 150 mM NaCl. The amino acid pair configurations used in the calculations and the arrows to indicate the dissociation direction are shown next to the interaction profiles. (F) Free energy interaction from the minima of the PMF profiles for both models, normalized with respect to the binding energy of the arginine–tyrosine pair.
Characterization of Nucleotide–Amino
Acid Interactions across Distinct Binding Modes
3.3
RNA represents a fundamental modulator of biomolecular condensation by establishing heterotypic interactions with proteins that regulate condensate stability and material properties. ?,?,?,?−? ? ? We therefore extended our PMF calculations to uncover the nucleotide–amino acid interactions underlying multivalent condensation. Specifically, we employed the AMBER03ws force field to characterize the interactions of the four RNA nucleotides (A, C, G, and U) with the 20 natural amino acids (Lys, Arg, Asp, Glu, His, Cys, Met, Phe, Tyr, Trp, Thr, Gln, Ser, Asn, Gly, Pro, Ala, Val, Leu, and Ile). As a first approach, and considering the large number of possible interaction modes, we systematically arranged each amino acid with its side chain oriented toward the nitrogenous base of the nucleotide (see Section for technical details). The PMF interaction profiles and the corresponding free energy minima are shown in Figures and ?, together with representative snapshots of the employed configurations. As shown, nucleotide–amino acid interactions between the base and the side chain are generally weaker than canonical base–base interactions (Figure) but comparable to amino acid–amino acid interactions under physiological conditions (see Figure and refs ?, ?, and ? ). Interestingly, seven amino acids (Lys, Arg, Asp, Glu, Gln, Ser, and Asn) exhibited significant interactions with most nucleotides, especially C and G.
PMF profiles along the COM distance between the base of the nucleotide and the side chain of the amino acid obtained with AMBER03ws for the interactions between the amino acids lysine (A), arginine (B), aspartic acid (C), glutamic acid (D), histidine (E), cysteine (F), methionine (G), phenylalanine (H), tyrosine (I), and tryptophan (J) with the nucleotides adenine (pink curve), cytosine (yellow curve), guanine (blue curve), and uracil (green curve). The amino acids were oriented with their side chains facing the nitrogenous bases of the nucleotides, as shown next to each interaction profile, including arrows to indicate the dissociation direction. (K) Absolute minimum PMF values of each nucleotide–amino acid pair from the interaction profiles shown in (A–J).
PMF profiles as a function of the COM distance between the base of the nucleotide and the side chain of the amino acid obtained with AMBER03ws for the interactions between the amino acids threonine (A), glutamine (B), serine (C), asparagine (D), glycine (E), proline (F), alanine (G), valine (H), leucine (I), and isoleucine (J) with the nucleotides adenine (pink curves), cytosine (yellow curve), guanine (blue curve), and uracil (green curve). The amino acids were oriented with their side chains facing the nitrogenous bases of the nucleotides, as shown next to each interaction profile, including arrows to indicate the dissociation direction. (K) Absolute minimum PMF values of each nucleotide–amino acid pair from the interaction profiles shown in (A–J).
Taking a closer look at the positively charged amino acids (i.e., Arg and Lys; FigureA,B,K), we observed that both arginine and lysine exhibited a strong affinity for C and G, whereas their binding to A and U was significantly weaker (≤1 kcal/mol). Moreover, we find that Arg interacts more strongly than Lys with all nucleotides except uridine. This result is consistent with the expected role of arginine as a sticker, in contrast to lysine. ?,?−? ? ? For example, arginine-rich regions are fundamental for the recruitment of RNA strands in FUS condensates. ?,?,? Indeed, the demonstrated ability of Arg to form persistent cation−π and electrostatic interactions clearly explains its strong binding to nucleotides. ?,? We attribute the stronger attraction between Arg and G/C to the greater effective polarizability of these bases compared to A and U. Interestingly, the interaction of histidinewith an effective charge of approximately +0.5e due to its partial protonation of the imidazole side chain (Figure)is negligible compared to arginine and lysine, with only cytosine showing moderate affinity. Moreover, negatively charged amino acids (Asp and Glu; FigureC,D,K) also exhibited significant binding to certain nucleotides through their base. However, the negative charge of the nucleotide phosphate group can hinder direct interactions between these residues and the nucleotides when the binding does not occur directly through the base. We ascribe the attractive weak interactions to charge screening by solvent, ions, and the exposure of the nucleotide base in this configuration, which likely reduces the electrostatic contrast between the two groups.
Aromatic residues (Tyr, Phe, and Trp) are typically expected to display high affinity toward RNA strands through π–π interactions. ?,?,? However, our results reveal negligible binding (FigureH–K), potentially due to the imposed orientation, which hinders π–π stacking (we will later discuss favorable binding modes which promote π–π interactions). In contrast, polar amino acids (Gln, Ser, Asn; FigureB–D,K) exhibited significant binding energies (∼2–3 kcal/mol). For instance, the amide group in Asn can form hydrogen bonds with all nucleotides, as suggested by our calculations (FigureD,K). Similarly, the amide group in Gln promotes persistent interactions with C, G, and U, whereas interactions with A are more limited due to the absence of carbonyl groups in its structure. Analogously, Ser can form hydrogen bonds between its hydroxyl group and the nucleobase. Interestingly, our results showed that Thr interacts only weakly with the nucleotides (E ≲ 1 kcal/mol; FigureA,K) despite its polar nature. This is consistent with the simpler chemical structure of threonine, which might render its side chain ineffective in stabilizing base interactions.
The remaining amino acids not discussed above (Cys, Met, Gly, Pro, Ala, Val, Leu, and Ile; FiguresF,G,K and ?C,E–K) exhibited negligible interactions (E ≲ 1 kcal/mol), thus contributing little to protein–RNA interactions that stabilize phase separation. From the nucleotide perspective, cytosine (yellow) and guanine (blue) appear to be the most relevant residues in protein–RNA interactions, showing binding affinities of up to ∼3 kcal/mol with several amino acids (e.g., Arg, Glu, and Gln). Uridine (green) can also interact significantly with representative amino acids, although the binding strengths are generally lower than those observed for C and G. In contrast, adenosine (red) displays only weak interactions; primarily, its strongest binding occurs with Asn (FigureK) and Glu (FigureK), while the remaining amino acids exhibited negligible affinity toward it through the nucleobase. Interestingly, polyA is among the most prevalent RNA strands in stress granules, where it recruits RNA-binding proteins such as TDP-43, FUS, or G3BP1. ?,?−? ? ? Therefore, the main interaction mode of adenosine with amino acids is likely modulated through alternative mechanisms (i.e., nonspecific electrostatic interactions through the phosphate group). To further analyze nucleotide–amino acid interactions involving the nucleotide base, we computed the integral of the negative part of the PMF curves (Figure S6 in the Supporting Information). The overall trend observed for the PMF minima is preserved across the integral analysis. Notably, the most significant differences arise for polar residues (Gln, Ser, and Asn), whose effective interactions appear weaker than in Figures and ?. This reduction can be attributed to the shape of the PMF curve at intermediate distances and the narrower width of the potential well at the global minimum.
Furthermore, we focused on alternative specific nucleotide–amino acid interactions that are particularly relevant to biomolecular condensation. In particular, we explored: (i) the interactions between charged amino acids (Arg, Lys, Glu, and Asp) and the phosphate group of the nucleotide (FigureB); (ii) the binding of aromatic residues (Tyr, Phe) to the nucleobase through π–π stacking (FigureA); and (iii) the interaction between tyrosine and the ribose moiety of cytosine and uridine (FigureC). The corresponding PMF curves are shown in Figure S7 of the Supporting Information, together with representative snapshots of the configurations employed. Our calculations for π–π stacking interactions between the aromatic rings of Tyr and Phe and the nitrogenous bases of the nucleotides (FigureA) revealed strong binding energies (5–6 kcal/mol), which are similar across all nucleotide–amino acid pairs. Remarkably, these interaction energies are comparable to those associated with CG base pairing (Figure), underscoring the recognized role of aromatic contacts in the formation and stabilization of RNA–protein condensates ?,?,? and consistent with previous computational PMF calculations.?
(A) PMF values from π–π stacking interactions between the bases of the nucleotides and the side chains of the aromatic amino acids including tyrosine and phenylalanine with adenine (red bars), cytosine (orange bars), guanine (blue bars), and uracil (green bars). (B) Interaction PMF minima of selected nucleotide–amino acid pairs via the phosphate group of the nucleotides and the side chain of the amino acids. (C) Interaction PMF minima between ribose moieties of the nucleotides and the side chain of the amino acids, oriented in t-shaped conformations (solid bars) and π–π stacking conformations (striped bars), for tyrosine interacting with cytosine (green bars) and uracil (purple bars). All the calculations were performed with AMBER03ws at 300 K, 1 bar, and 150 mM NaCl.
To evaluate the electrostatic interactions (FigureB), we employed different representative nucleotides for each of the charged amino acids (Arg, Lys, Glu, and Asp) and also included the aromatic residue Tyr as a cross-check. In principle, the specific identity of the nucleotide should not dramatically affect the interaction between the phosphate group and the amino acid side chain; therefore, only two nucleotides were used for each amino acid type. Our results showed that the interaction with arginine (red) is the strongest, reaching approximately 3.5–4.0 kcal/mol. As expected, the computed binding PMF profiles between the phosphate group and the amino acid side chain suggested unfavorable interactions in several cases (U–Glu, G–Glu, and C–Asp; see Figure S7C). Lysine, despite its positively charged side chain, only showed a moderately attractive interaction for the phosphate group compared to arginine. This might be attributed to the entropic penalty associated with forming a stable contact: lysine must first displace tightly bound water molecules, and the energy gained upon forming the Lys–phosphate bridge might not sufficiently compensate for this cost to represent a strong energetic gain. These findings reinforce previous experimental observations highlighting the dominant role of arginine, as opposed to lysine, in driving RNA–protein phase-separation. ?,?,?,? Finally, our results confirm that the interaction between the phosphate group and different amino acids is rather independent of the specific nucleotide considered.
Finally, we calculated the interaction between tyrosine and the ribose moiety of cytosine and uridine, considering two distinct conformations (see Figure S7 for representative snapshots): one in which both rings are stacked (referred to as π–π stacking), and another where the tyrosine side chain is oriented perpendicular to the ribose plane (t-shaped; Figure S7D). For both nucleotide–Tyr pairs, we observed that the π–π stacking interactions were significantly stronger than the t-shaped ones, although these π–π interactions were weaker than those between the aromatic ring of Tyr and the nitrogenous bases of C and U (FigureA). Importantly, while the t-shaped conformation yielded negligible binding compared to interactions typically considered significant (E ≳ 2 kcal/mol), the π–π stacking between Tyr and the ribose displayed comparable strength to amino acid–amino acid interactions such as Tyr–Tyr interactions, thereby notably contributing to the stabilization of RNA–protein condensates.
Conclusions
4
In this work, we present an extensive analysis of nucleotide and amino acid interactions relevant to the formation of biomolecular condensates. Specifically, we investigated numerous configurations of nucleotide–nucleotide, amino acid–amino acid, and nucleotide–amino acid pairs by applying potential-of-mean-force calculations in atomistic MD simulations. We examined different binding modes of nucleotide–nucleotide pairs by computing PMFs for canonical Watson–Crick pairs (CG and AU) as well as for the GU wobble pair, using both AMBER03ws/TIP4P2005 and CHARMM36/TIP4P2005 force-field combinations (Figure). Our results showed excellent agreement between the two force fields and reproduced the expected energetic hierarchy according to the hydrogen bond number that each canonical base pair can establish (CG > AU ≥ GU). Furthermore, the analysis of non-canonical base pairs revealed that CC interactions are the most energetically favorable among them, while other non-canonical pairs exhibited variable but generally lower stability compared to the canonical ones (Figure). A geometric analysis based on the C1′–C1′ distance further supported these energetic trends (Figure S1), in agreement with previous experimental observations. ?−? ?
We also analyzed alternative binding modes of the nucleotides through their ribose and phosphate groups, considering both cis and trans conformations to assess the effect of the nucleotide orientation (FigureA–D). Ribose–ribose interactions exhibited modest attractive energies, with cis orientations being marginally more favorable. Similarly, phosphate–phosphate interactions displayed PMF profiles with very shallow energy wells, reflecting the significant electrostatic repulsion between negatively charged groups. Remarkably, cis configurations were more stable than trans orientations, and among them, pairs involving guanine exhibited the most favorable (though still weak) binding affinity. RNA tertiary interactions were further characterized by computing the energetic landscape of a G-quadruplex structure (FigureE). Our results demonstrated strong cooperative interactions among the RNA strands, yielding strong binding strengths comparable to those reported for protein cross-β-sheets. ?,? Crucially, this high stability arises from the formation of multiple Hoogsteen-type hydrogen bonds between guanine units. Finally, we examined the effect of salt concentration on various nucleotide interaction modes (Figure). Increasing salt concentration leads to a moderate reduction in the binding between bases and a moderate increase between ribose-like interactions. Notably, a pronounced weakening of phosphate–phosphate repulsive interactions due to screening of short-range electrostatic forces also take place.
We further characterized residue–residue interactions by extending our PMF calculations to amino acids. First, we benchmarked amino acid pair interactions using AMBER03ws against the a99SB-disp force field, finding near-quantitative agreement (Figure). We then estimated the interactions between all 20 amino acids and the four RNA nucleotides by orienting the nucleobase toward the amino acid side chain (Figures and ?). Only a subset of amino acids (Lys, Arg, Asp, Glu, Gln, Ser, and Asn) established significantly attractive interactions with most nucleotides, primarily driven by hydrogen bonding and cation−π interactions. To complement this analysis, we examined specific biologically relevant interactions, including π–π stacking, phosphate–side chain contacts, and ribose–side chain interactions (Figure). We identified π–π stacking between Tyr or Phe and nucleobases as a strong and relevant binding mode involved in phase separation. ?,? Moreover, electrostatic interactions with the phosphate group were particularly significant for arginine, in agreement with experimental observations of protein–RNA condensates. ?,? Finally, ribose−π stacking emerges as a major potential RNA–protein interaction in biomolecular condensates. Taken together, our results revealed specific binding patterns between nucleotides and amino acids, providing insight into how residue-level interactions influence molecular organization on larger scales in biomolecular condensates.
Supplementary Material
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Boeynaems S.Alberti S.Fawzi N. L.Mittag T.Polymenidou M.Rousseau F.Schymkowitz J.Shorter J.Wolozin B.Van Den Bosch L.Protein phase separation: a new phase in cell biology Trends Cell Biol.20182842043510.1016/j.tcb.2018.02.00429602697 PMC 6034118 · doi ↗ · pubmed ↗
- 2Hyman A. A.Weber C. A.Jülicher F.Liquid-liquid phase separation in biology Annu. Rev. Cell Dev. Biol.201430395810.1146/annurev-cellbio-100913-01332525288112 · doi ↗ · pubmed ↗
- 3Brangwynne C. P.Eckmann C. R.Courson D. S.Rybarska A.Hoege C.Gharakhani J.Jülicher F.Hyman A. A.Germline P granules are liquid droplets that localize by controlled dissolution/condensation Science 20093241729173210.1126/science.117204619460965 · doi ↗ · pubmed ↗
- 4Razin S.Gavrilov A.The role of liquid–liquid phase separation in the compartmentalization of cell nucleus and spatial genome organization Biochemistry (Moscow)20208564365010.1134/S 000629792006001232586227 · doi ↗ · pubmed ↗
- 5Molliex A.Temirov J.Lee J.Coughlin M.Kanagaraj A. P.Kim H. J.Mittag T.Taylor J. P.Phase separation by low complexity domains promotes stress granule assembly and drives pathological fibrillization Cell 201516312313310.1016/j.cell.2015.09.01526406374 PMC 5149108 · doi ↗ · pubmed ↗
- 6Luo Y.Na Z.Slavoff S. A.P-bodies: composition, properties, and functions Biochemistry 2018572424243110.1021/acs.biochem.7b 0116229381060 PMC 6296482 · doi ↗ · pubmed ↗
- 7Sabari B. R.Biomolecular condensates and gene activation in development and disease Dev. Cell 202055849610.1016/j.devcel.2020.09.00533049213 · doi ↗ · pubmed ↗
- 8Antifeeva I. A.Fonin A. V.Fefilova A. S.Stepanenko O. V.Povarova O. I.Silonov S. A.Kuznetsova I. M.Uversky V. N.Turoverov K. K.Liquid–liquid phase separation as an organizing principle of intracellular space: Overview of the evolution of the cell compartmentalization concept Cell. Mol. Life Sci.20227925110.1007/s 00018-022-04276-435445278 PMC 11073196 · doi ↗ · pubmed ↗
