Sequence dependent aggregation of peptides and fibril formation
Nguyen Ba Hung, Duy-Manh Le, and Trinh X. Hoang

TL;DR
This study uses Monte Carlo simulations to explore how amino acid sequence influences peptide aggregation and fibril formation, revealing sequence-dependent structural heterogeneity and aggregation thermodynamics.
Contribution
It introduces a coarse-grained model to analyze sequence effects on peptide aggregation, highlighting the role of specific patterns and sequence composition in fibril formation.
Findings
Fibril-like aggregates form for sequences with HPH pattern.
Aggregation transition temperatures vary widely among sequences.
Fibril formation follows nucleation and growth mechanism.
| Sequence name | Sequence | s |
|---|---|---|
| S1 | P P P H H P P P | 1 |
| S2 | P P H P H P P P | 2 |
| S3 | P P H P P H P P | 3 |
| S4 | P H P P P H P P | 4 |
| S5 | P H P P P P H P | 5 |
| S6 | H P P P P P H P | 6 |
| S7 | H P P P P P P H | 7 |
| S8 | P P H H H P P P | 1 |
| S9 | P P H P H H P P | 1 |
| S10 | P H P P H H P P | 1 |
| S11 | P H P H P H P P | 2 |
| S12 | P H P P H P H P | 2 |
Sequence dependent aggregation of peptides and fibril formation
Nguyen Ba Hung
Institute of Physics, Vietnam Academy of Science and Technology, 10 Dao Tan, Ba Dinh, Ha Noi, Viet Nam
Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Ha Noi, Viet Nam
Vietnam Military Medical University, 160 Phung Hung, Ha Dong, Ha Noi, Viet Nam
Duy-Manh Le
Institute of Research and Development, Duy Tan University, K7/25 Quang Trung, Da Nang, Viet Nam
Trinh X. Hoang
Institute of Physics, Vietnam Academy of Science and Technology, 10 Dao Tan, Ba Dinh, Ha Noi, Viet Nam
Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Ha Noi, Viet Nam
Abstract
Deciphering the links between amino acid sequence and amyloid fibril formation is key for understanding protein misfolding diseases. Here we use Monte Carlo simulations to study aggregation of short peptides in a coarse-grained model with hydrophobic-polar (HP) amino acid sequences and correlated side chain orientations for hydrophobic contacts. A significant heterogeneity is observed in the aggregate structures and in the thermodynamics of aggregation for systems of different HP sequences and different numbers of peptides. Fibril-like ordered aggregates are found for several sequences that contain the common HPH pattern, while other sequences may form helix bundles or disordered aggregates. A wide variation of the aggregation transition temperatures among sequences, even among those of the same hydrophobic fraction, indicates that not all sequences undergo aggregation at a presumed physiological temperature. The transition is found to be the most cooperative for sequences forming fibril-like structures. For a fibril-prone sequence, it is shown that fibril formation follows the nucleation and growth mechanism. Interestingly, a binary mixture of peptides of an aggregation-prone and a non-aggregation-prone sequence shows association and conversion of the latter to the fibrillar structure. Our study highlights the role of sequence in selecting fibril-like aggregates and also the impact of a structural template on fibril formation by peptides of unrelated sequences.
I Introduction
The phenomenon in which soluble proteins or protein fragments self-assemble into insoluble aggregates is considered a fundamental issue of protein folding with serious impact on human health Chiti and Dobson (2006). A predominant class of these aggregates, which have a long straight shape and are rich in β-sheets, known as amyloid fibrils, is associated with a range of debilitating human pathologies, such as Alzheimer’s, Parkinson’s, type II diabetes and transmissible spongiform encephalopathies Riek and Eisenberg (2016). These fibrils, formed by numerous proteins and peptides including those unrelated to disease Jimenez et al. (1999), have strikingly similar structural features regardless of the amino acid sequence. A widely adopted view is that the tendency to form amyloid fibrils is a common property of all proteins, supposedly due to their common polypeptide backbone Dobson (1999). It has been shown that poly-amino acids can also form amyloid under appropriate conditions Fändrich and Dobson (2002). However, the propensity of a given polypeptide to form amyloid fibrils, as well as the conditions under which they form, depends very significantly on its amino acid sequence, showing that the problem is much more complex than initially thought but also giving hope for curing amyloid diseases Ventura (2005).
X-ray fiber diffraction data indicate that amyloid fibrils are commonly characterized by cross-β-sheets with strands running perpendicular to the fibril’s longitudinal axis Sunde et al. (1997). Cross-β structures at atomic resolution have been obtained for the fibrils of a few proteins and protein fragments, including those of insulin Jiménez et al. (2002), β-amyloid peptide Petkova et al. (2002), yeast prion protein sup35p van der Wel, Lewandowski, and Griffin (2007), HET-s prion Wasmer et al. (2008), and α-synuclein Tuttle et al. (2016), by using cryo-electron microscopy, X-ray and solid-state NMR. It is found that they are highly ordered and composed of β-strands of the same segments of repetitive protein molecules. Between the mated β-sheets is a completely dry and complementary packing of amino acid side chains with a well-formed hydrophobic core Sawaya et al. (2007). Even though there is evidence of polymorphism Petkova et al. (2005) in amyloid fibrils, the observed packing of side chains in the resolved structures suggests that the amino acid sequence largely dictates the amyloid fold Meier and Böckmann (2015), in the same manner as in protein folding.
The sequence determinants of amyloid formation have been studied with various experimental West et al. (1999); Wurth, Guimard, and Hecht (2002); Chiti et al. (2002); Ventura et al. (2004); de la Paz et al. (2002); de la Paz and Serrano (2004); Alberti et al. (2009) and theoretical de la Paz et al. (2005); Bellesia and Shea (2007); Li et al. (2010); Abeln et al. (2014) approaches. It has been shown that the overall hydrophobicity Chiti et al. (2002) and net charge de la Paz et al. (2002) of a peptide may, to some extent, affect the aggregation rate. There is increasing evidence that the capability of a protein to form amyloids strongly depends on certain short amino acid stretches in the sequence Wurth, Guimard, and Hecht (2002); Chiti et al. (2002); Ventura et al. (2004). To support a proteome-wide search for aggregation-prone peptide segments, a number of predictors have been made available Fernandez-Escamilla et al. (2004); Trovato, Seno, and Tosatto (2007); Maurer-Stroh et al. (2010). However, the problem is still far from fully understood.
In this study, we investigate the selectivity of aggregate structures by the amino acid sequence and the mechanism of fibril formation by using the tube model of proteins developed by Hoang et al. (2004). The latter is a Cα-based model exploiting the tube-like symmetry Maritan et al. (2000) of a polypeptide chain and geometrical constraints imposed by hydrogen bonds Banavar et al. (2004). Such symmetry and geometry considerations lead to a presculpted free energy landscape Hoang et al. (2004) with marginally compact protein-like ground states and low energy minima Trovato et al. (2005); Hoang et al. (2006a). Interestingly, the model also shows a strong tendency of multiple chains to form amyloid-like aggregates Banavar et al. (2004); Hoang et al. (2006b), similar to that found in higher resolution models Nguyen and Hall (2004, 2005); Bellesia and Shea (2007). Extensive simulations have been carried out by Auer and coworkers Auer, Dobson, and Vendruscolo (2007); Auer et al. (2008); Auer and Kashchiev (2010); Auer (2011) to study the fibril formation of 12-mer homo-peptides using the tube model with a slightly different constraint on self-avoidance, providing useful insights into the nucleation mechanism Auer, Dobson, and Vendruscolo (2007); Auer et al. (2008) of fibril formation and into the equilibrium conditions between the fibrillar aggregates and the peptide solution Auer and Kashchiev (2010); Auer (2011). In the present study, we focus on the impact of amino acid sequence on the aggregation properties in the tube model with a renewed consideration of the hydrophobic interaction. In the original tube model, the latter was based on an isotropic contact potential between centroids represented by the Cα atoms. We introduce here a new model for hydrophobic contact between amino acids that takes into account the side chain orientations. We find that the latter can direct the interaction between β-sheets and promote the formation of ordered and elongated fibril-like aggregates.
We restrict ourselves to hydrophobic-polar (HP) sequences and short peptides of length equal to 8 residues. The consideration of HP sequences is a minimalist approach in terms of sequence specificity, but it is well supported in protein folding Dill and Chan (1997); Hoang et al. (2006b). Furthermore, the relative simplicity of amyloid fibril structures also indicates a possible simplification of the amino acid sequence in determining aggregation properties. It will be shown that even with a short length and a few sequences, the systems considered already exhibit a rich behavior in the morphologies of the aggregates and in their thermodynamic properties.
For an aggregation-prone sequence, we have also studied the kinetics of fibril formation. We try to elucidate the nucleation and growth mechanism of this process in molecular detail and show evidence of a lag phase. Finally, we have studied a binary mixture of peptides of two different sequences and find that amyloid formation can be sequence non-specific; that is, a fibril-like template formed by an aggregation-prone sequence may induce aggregation of a non-aggregation-prone sequence for a fraction of all peptides. This strong impact of the template somewhat weakens the sequence determination of aggregation propensity and suggests that amyloid fibrils could be heterogeneous in their peptide composition.
II Models and methods
Details of the tube model, including the values of its energy parameters, can be found in Ref. Hoang et al. (2004). Briefly, it is a Cα-based coarse-grained model, in which the Cα atoms representing amino acid residues are placed along the axis of a self-avoiding tube of cross-sectional radius $\Delta$. The finite thickness of the tube is imposed by requiring that the radius of the circle drawn through any three Cα atoms be larger than $\Delta$ Gonzalez and Maddocks (1999); Maritan et al. (2000). The energy of a given conformation is the sum of the bending energy, the hydrogen bonding energy and the hydrophobic interaction energy. All energies are expressed in terms of an energy unit $\epsilon$. A local bending energy penalty is applied if the local radius of curvature of the chain at a given bead is less than 3.2 Å. Hydrogen bonds between amino acids are required to satisfy a set of distance and angular constraints on the local properties of the chain, as found by a statistical analysis of protein PDB structures Banavar et al. (2004). A local hydrogen bond, which is formed by residues separated by three peptide bonds along the chain, is given an energy of $-\epsilon$, whereas a non-local hydrogen bond is given a separate energy. Additionally, a cooperative energy is given for each pair of hydrogen bonds that are formed by pairs of consecutive amino acids in the sequence. To avoid spurious effects of the chain termini, hydrogen bonds involving a terminal residue are given a reduced energy.
Hydrophobic interaction is based on the pairwise contacts between amino acids, considered to be either hydrophobic (H) or polar (P). It is also assumed that only contacts between H residues are favorable, which fixes the contact energies of the three residue pairings (HH, HP and PP). In the original tube model, a contact is defined if the distance between two residues is less than 7.5 Å. In the present study, we apply an additional constraint on hydrophobic contact by taking into account the side chain orientation Hung and Hoang (2015) (Fig. 1a,b). The side chain orientations are approximately given by the direction opposite to the normal vector Banavar et al. (2009) at the chain’s local position. The new constraint requires that two residues $i$ and $j$ make a hydrophobic contact if two orientation conditions are both satisfied, one involving $\mathbf{n}_i$ and $\mathbf{c}_{ij}$ and the other involving $\mathbf{n}_j$ and $\mathbf{c}_{ji}$, where $\mathbf{n}_i$ and $\mathbf{n}_j$ are the normal vectors of the Frenet frames associated with beads $i$ and $j$, respectively; $\mathbf{c}_{ij}$ is a unit vector pointing from bead $i$ to bead $j$; and $\mathbf{c}_{ji} = -\mathbf{c}_{ij}$. These vectors are given by

$$\mathbf{n}_i = \frac{\mathbf{r}_{i-1} + \mathbf{r}_{i+1} - 2\mathbf{r}_i}{\left|\mathbf{r}_{i-1} + \mathbf{r}_{i+1} - 2\mathbf{r}_i\right|}$$

and

$$\mathbf{c}_{ij} = \frac{\mathbf{r}_j - \mathbf{r}_i}{\left|\mathbf{r}_j - \mathbf{r}_i\right|},$$

where $\mathbf{r}_i$ is the position of bead $i$. The new constraint is in accordance with the statistics drawn from an analysis of PDB structures (Fig. 1b).
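The geometric quantities entering the orientation constraint can be sketched in a few lines. The example below is only an illustration: it assumes the usual discrete Frenet normal for a chain with equal bond lengths (a unit vector along $\mathbf{r}_{i-1} + \mathbf{r}_{i+1} - 2\mathbf{r}_i$), and the function names are hypothetical. It returns the two projections of the normals onto the bead-to-bead direction; the cutoffs applied to these projections are not reproduced here.

```python
import math


def _unit(v):
    """Scale a 3-vector to unit length (fails for a zero vector, e.g. collinear beads)."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def frenet_normal(r_prev, r_mid, r_next):
    """Discrete normal at a bead: unit vector along r_prev + r_next - 2*r_mid.

    Assumption: equal bond lengths, so the vector points toward the local
    center of curvature of the chain.
    """
    return _unit(tuple(p + q - 2.0 * m for p, q, m in zip(r_prev, r_next, r_mid)))


def orientation_projections(chain_a, i, chain_b, j):
    """Return (n_i . c_ij, n_j . c_ji) for bead i of chain_a and bead j of chain_b."""
    n_i = frenet_normal(chain_a[i - 1], chain_a[i], chain_a[i + 1])
    n_j = frenet_normal(chain_b[j - 1], chain_b[j], chain_b[j + 1])
    c_ij = _unit(tuple(b - a for a, b in zip(chain_a[i], chain_b[j])))
    c_ji = tuple(-x for x in c_ij)
    return _dot(n_i, c_ij), _dot(n_j, c_ji)


# Two bent three-bead fragments whose putative side chains (opposite to the
# normals) point toward each other.
chain_a = [(-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(-1.0, 4.0, 0.0), (0.0, 3.0, 0.0), (1.0, 4.0, 0.0)]
p_i, p_j = orientation_projections(chain_a, 1, chain_b, 1)
print(p_i, p_j)
```

Since the side chain is taken to point opposite to the normal, strongly negative projections correspond to side chains facing each other, which is the situation the constraint is meant to favor.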
We consider 12 HP sequences of length $N = 8$ as given in Table I. The sequences, denoted as S1 through S12, are selected in such a way that they contain only 2 or 3 H residues, corresponding to hydrophobic fractions of 25% and 37.5%, respectively. We have chosen sequences that are as symmetric as possible with respect to the two ends, having in mind that the relative positions of the H residues are more important than their absolute positions in the sequence. One characterization of these relative positions is the minimum separation between two consecutive H residues, given by the parameter $s$ in Table I.
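As a quick check of Table I, the parameter $s$ and the hydrophobic fraction can be recomputed directly from the sequences. The sketch below assumes that "separation" means the difference in residue index between successive H residues; the function names are illustrative only.

```python
def min_h_separation(seq):
    """Smallest index difference between successive H residues in an HP string."""
    positions = [k for k, residue in enumerate(seq) if residue == "H"]
    return min(b - a for a, b in zip(positions, positions[1:]))


def hydrophobic_fraction(seq):
    """Fraction of residues that are hydrophobic (H)."""
    return seq.count("H") / len(seq)


# Sequences and s values as listed in Table I.
TABLE_I = {
    "S1": ("PPPHHPPP", 1), "S2": ("PPHPHPPP", 2), "S3": ("PPHPPHPP", 3),
    "S4": ("PHPPPHPP", 4), "S5": ("PHPPPPHP", 5), "S6": ("HPPPPPHP", 6),
    "S7": ("HPPPPPPH", 7), "S8": ("PPHHHPPP", 1), "S9": ("PPHPHHPP", 1),
    "S10": ("PHPPHHPP", 1), "S11": ("PHPHPHPP", 2), "S12": ("PHPPHPHP", 2),
}

for name, (seq, s_listed) in TABLE_I.items():
    print(name, seq, min_h_separation(seq), hydrophobic_fraction(seq), s_listed)
```

With this definition the recomputed values agree with the $s$ column of Table I for all twelve sequences.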
We study systems of $M$ peptides in a cubic box of size $L$ with periodic boundary conditions. For a given peptide concentration $c$, the box size is calculated depending on $M$ as $L = (M/c)^{1/3}$, with $c$ expressed as a number density. For example, for $c = 1$ mM (millimolar) and $M = 10$ one gets $L \approx 255$ Å. Parallel tempering Swendsen and Wang (1986) Monte Carlo schemes with 16-24 replicas at different temperatures are employed for obtaining the ground state and equilibrium characteristics. For each replica $i$, the simulation is carried out with pivot, crankshaft and translation moves and with the Metropolis algorithm for move acceptance at its own temperature $T_i$. A replica exchange attempt is made every 10 MC sweeps (one sweep corresponds to a number of move attempts equal to the number of residues). The exchange of replicas $i$ and $j$ is accepted with a probability $p_{ij} = \min\{1, \exp[(\beta_i - \beta_j)(E_i - E_j)]\}$, where $\beta_i = 1/(k_B T_i)$ is the inverse temperature, $k_B$ is the Boltzmann constant, and $E_i$ and $E_j$ are the energies of the replicas at the time of the exchange.
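The two formulas used in this setup, the box size for a given molar concentration and the standard parallel-tempering exchange probability, can be sketched as follows. The function names and the unit conventions (concentration in mM, length in Å, $k_B = 1$) are assumptions made for the example.

```python
import math

AVOGADRO = 6.02214076e23          # molecules per mole
CUBIC_ANGSTROM_PER_LITER = 1e27   # 1 L = 1e-3 m^3 = 1e27 A^3


def box_size_angstrom(n_peptides, conc_millimolar):
    """Edge length L of a cubic box holding n_peptides at the given concentration."""
    volume_liters = n_peptides / (conc_millimolar * 1e-3 * AVOGADRO)
    return (volume_liters * CUBIC_ANGSTROM_PER_LITER) ** (1.0 / 3.0)


def exchange_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance for swapping replicas i and j:
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    if exponent >= 0.0:
        return 1.0  # also avoids overflow in exp for large positive exponents
    return math.exp(exponent)


print(box_size_angstrom(10, 1.0))                     # about 255 A
print(exchange_probability(5.0, 4.0, -10.0, -12.0))   # swap always accepted
print(exchange_probability(5.0, 4.0, -12.0, -10.0))   # accepted with exp(-2)
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the opposite swap is accepted with a Boltzmann-like weight, which is why neighboring temperatures need overlapping energy distributions.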
The temperature range in the parallel tempering simulations is chosen such that it covers the transition from a gas phase of separated peptides at a high temperature to the condensed phase of the aggregates at a low temperature. The replica temperatures are chosen such that the acceptance rates of replica exchanges for neighboring temperatures are significant, at least about 20%. In practice, one needs to adjust the set of temperatures several times so that more temperatures lie near the specific heat’s peak, where the energy fluctuation is large. For example, for a system of sequence S2, the final set of temperatures for 20 replicas is {0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.212, 0.214, 0.216, 0.218, 0.22, 0.222, 0.224, 0.226, 0.228, 0.23, 0.24, 0.25, 0.26} in units of $\epsilon/k_B$. The number of Monte Carlo attempted moves is of the order of per replica. The weighted multiple histogram technique Ferrenberg and Swendsen (1989) is employed for the calculation of equilibrium properties such as the specific heat and the effective free energy.
For studying the kinetics of fibril growth, we carry out multiple independent Monte Carlo simulations that start from random configurations of dispersed monomers. These initial configurations are equilibrated at a high temperature before being used. We are interested in three quantities: the number of aggregates, the maximum size of the aggregates, and the number of peptides in β-sheet conformation during the time evolution. A peptide is said to be in a β-sheet conformation if it forms at least 4 consecutive hydrogen bonds with another peptide.
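The three kinetic observables can be extracted from a list of inter-peptide hydrogen bonds, as in the sketch below. Two assumptions are made that the text does not spell out: "consecutive" is read as hydrogen bonds on consecutive residues of the peptide, all to the same partner, and two peptides are assigned to the same aggregate when they share at least one hydrogen bond. The data layout and function names are hypothetical.

```python
from collections import defaultdict


def longest_run(indices):
    """Length of the longest run of consecutive integers in indices."""
    best = run = 0
    previous = None
    for k in sorted(set(indices)):
        run = run + 1 if previous is not None and k == previous + 1 else 1
        best = max(best, run)
        previous = k
    return best


def peptides_in_beta_sheet(hbonds, min_run=4):
    """hbonds: iterable of (peptide_a, residue_a, peptide_b, residue_b)."""
    bonded_residues = defaultdict(list)
    for pa, ra, pb, rb in hbonds:
        if pa != pb:  # only inter-peptide bonds count
            bonded_residues[(pa, pb)].append(ra)
            bonded_residues[(pb, pa)].append(rb)
    return {pa for (pa, _), res in bonded_residues.items()
            if longest_run(res) >= min_run}


def cluster_sizes(n_peptides, hbonds):
    """Sizes of connected groups of peptides (union-find), largest first."""
    parent = list(range(n_peptides))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pa, _, pb, _ in hbonds:
        parent[find(pa)] = find(pb)
    sizes = defaultdict(int)
    for p in range(n_peptides):
        sizes[find(p)] += 1
    return sorted(sizes.values(), reverse=True)


# Peptides 0 and 1 share four consecutive bonds; peptides 2 and 3 share one.
hbonds = [(0, k, 1, k) for k in range(1, 5)] + [(2, 3, 3, 3)]
sizes = cluster_sizes(5, hbonds)
n_aggregates = sum(1 for s in sizes if s >= 2)
print(sorted(peptides_in_beta_sheet(hbonds)), sizes, n_aggregates, sizes[0])
```

In this toy configuration only peptides 0 and 1 count as β-sheet, there are two aggregates, and the largest aggregate has size 2.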
III Results
III.1 Sequence dependence of aggregate structures
We first study the dependence of aggregate structure on the amino acid sequence for systems of identical peptides at a fixed concentration of 1 mM. Fig. 2 shows that the lowest energy conformation obtained in the simulations, supposed to be the ground state of a given system, strongly depends on the sequence. Two sequences, S2 and S11, form a double-layer β-sheet structure with characteristics similar to those of a cross-β structure. In these structures, an axis of the aggregate approximately perpendicular to the β-strands can be drawn. A similar but less fibril-like structure is also found for sequence S12, with some parts that are non-β-sheet. Both sequences S3 and S4 form an α-helix bundle. The helix bundle of sequence S4, however, is more ordered and has an approximately cylindrical shape, in which the α-helices are almost parallel to each other. This type of aggregate is akin to non-amyloid filaments formed by globular proteins such as the actin filament Hanson and Lowy (1963). The other sequences form disordered aggregates of various kinds. In these disordered structures one may also find a significant amount of β-sheets. In our model, residues participating in consecutive local and non-local hydrogen bonds are identified as forming α-helix and β-sheet, respectively Hoang et al. (2004).
The role of hydrophobic residues in aggregation can be inferred from the structures of the aggregates. In all cases, one finds the presence of a well-formed hydrophobic core with the putative hydrophobic side chains oriented inwards toward the body of the aggregate. The packing of hydrophobic side chains is best observed for sequences S2 and S11, for which the hydrophobic residues are aligned within each β-sheet and the hydrophobic side chains from the two β-sheets face each other. This packing is possible due to the HPH pattern in these sequences, which positions the hydrophobic side chains on one side of each β-sheet. An alignment of hydrophobic residues is also seen for sequence S12 due to the HPH segment of this sequence. In the aggregate of sequence S4, which is a helix bundle, the hydrophobic side chains are gathered along the bundle axis, thanks to the alignment of hydrophobic side chains along one side of each α-helix. This alignment is due to the HPPPH pattern in the S4 sequence. On the other hand, the S3 sequence with the HPPH pattern also forms a helix, but the hydrophobic side chains are not well aligned in the helix, leading to a less ordered aggregate.
The structure of the aggregate also depends on the number of chains $M$. In Fig. 3 and Fig. 4, the ground states for $M$ varying between 1 and 10 are shown for sequences S2 and S4, respectively. Interestingly, for sequence S2 (Fig. 3), as $M$ increases one sees transitions from a single helix to a two-helix bundle, then to a single β-sheet and finally to double β-sheets. One can also notice that as $M$ increases the β-sheet aggregates become more ordered and more fibril-like as their β-strands become more parallel. For sequence S4 (Fig. 4), only helix bundles are formed for all $M$, but the bundle also becomes more ordered as $M$ increases. Thus, increasing order with the system size is observed for both β-sheet and α-helical aggregates.
III.2 Thermodynamics of aggregation
It can be expected that the thermodynamics of aggregation depends on the aggregate structure due to distinct contributions of intermolecular and intramolecular interactions in different structures. Furthermore, the formation of ordered and non-ordered aggregates can be different from the perspective of a phase transition. We consider the system’s specific heat, $C$, for the analysis of the thermodynamics. We are particularly interested in the temperature of the main peak of the specific heat, $T_{\mathrm{peak}}$, and the peak height, $C_{\max}$. $T_{\mathrm{peak}}$ corresponds to the aggregation transition temperature. A higher $T_{\mathrm{peak}}$ means a more stable aggregate, whereas a higher $C_{\max}$ indicates that the aggregation transition is more cooperative Kaya and Chan (2000). For all multi-peptide systems considered, it is found that the energy distribution at $T_{\mathrm{peak}}$ has a bimodal shape, suggesting that the aggregation transition is first-order like. Note that the discontinuity of the aggregation transition has also been shown for the simple off-lattice AB model without directional hydrogen bonds Junghans, Bachmann, and Janke (2006).
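For orientation, the specific heat at a single temperature can be estimated from the energy fluctuations of a canonical sample, $C = (\langle E^2\rangle - \langle E\rangle^2)/(k_B T^2)$. The sketch below uses this textbook estimator on raw samples, which is simpler than the weighted multiple histogram technique used in the paper; the function names and the choice $k_B = 1$ are assumptions of the example.

```python
def specific_heat(energies, temperature, k_B=1.0):
    """Fluctuation estimate C = var(E) / (k_B * T^2) from canonical energy samples."""
    n = len(energies)
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n
    return variance / (k_B * temperature ** 2)


def main_peak(temperatures, heat_capacities):
    """Return (T_peak, C_max): the temperature and height of the largest C value."""
    c_max, t_peak = max(zip(heat_capacities, temperatures))
    return t_peak, c_max


# Toy data: a two-state energy sample and a short C(T) curve.
print(specific_heat([-1.0, 1.0], 0.5))
print(main_peak([0.20, 0.21, 0.22], [1.0, 3.0, 2.0]))
```

Dividing the result by the number of peptides gives the specific heat per molecule. A bimodal energy sample, as in the two-state toy data, maximizes the variance and therefore produces a pronounced peak.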
We find that the specific heat strongly depends on both the sequence and the system size. Fig. 3 and Fig. 4 show the temperature dependence of the specific heat per molecule for various system sizes for sequences S2 and S4, respectively. For sequence S2, the case in which fibril-like aggregates form, it is shown that as $M$ increases the specific heat’s peak shifts toward higher temperature and its height increases (Fig. 3). This result indicates that the aggregate becomes increasingly stable and the transition becomes more cooperative as the system size increases. The increasing cooperativity of the aggregation transition correlates with the increasing order in the structure of the aggregate. For sequence S4, for which the aggregates are helix bundles, the height of the main peak increases with $M$ but the position of the peak varies non-monotonically (Fig. 4). Note that the aggregation transition for sequence S4 is always found at a slightly lower temperature than the folding transition of an individual chain. This is in contrast with sequence S2, whose aggregation transition temperature is always higher than the folding temperature of a single chain.
In Fig. 5, the results for the maximum specific heat per molecule, $C_{\max}/M$, and the temperature of the peak, $T_{\mathrm{peak}}$, are combined for all sequences considered and for several values of $M$. It is shown that the variation of both $C_{\max}/M$ and $T_{\mathrm{peak}}$ among sequences increases with $M$. Note that, for the multi-peptide systems, the highest specific heat maxima correspond to sequences S2 and S11, whose aggregates are fibril-like (see Fig. 2). Apart from the absolute value of $C_{\max}/M$, the increase of $C_{\max}/M$ with $M$ is also a signature of cooperativity. For sequences S2 and S11, $C_{\max}/M$ is not only the highest among all sequences but also increases with $M$ much faster than for other sequences, suggesting that these sequences have the most cooperative aggregation transitions. Our results indicate that the propensity to form fibril-like aggregates is associated with the cooperativity of the aggregation transition.
The wide variation in the transition temperatures among sequences, as shown in Fig. 5b, suggests another interesting aspect of aggregation. Suppose that we consider the systems at the physiological temperature, $T^*$. In our model, a rough estimate of $T^*$ could be 0.2 $\epsilon/k_B$, which corresponds to a local hydrogen bond energy of 5 $k_B T^*$. For $M = 1$, one finds that all sequences but S10 have $T_{\mathrm{peak}} < T^*$, suggesting that the peptides are substantially unstructured at $T^*$ as a single chain. For the two multi-peptide system sizes considered, only three sequences, S3, S4 and S5, have $T_{\mathrm{peak}} < T^*$, while the others have $T_{\mathrm{peak}} > T^*$. Thus, sequences S3, S4 and S5 do not aggregate at $T^*$ while the other sequences do. This result indicates that the variation of aggregation transition temperatures among sequences is also a reason why protein sequences behave differently towards aggregation at the physiological temperature. Some sequences do not aggregate because aggregation is thermodynamically unfavorable at this temperature.
Note that the ability to form fibril-like aggregates is not necessarily associated with a high aggregation transition temperature. In fact, Fig. 5b shows that sequences S2 and S11 have only a medium value of $T_{\mathrm{peak}}$ among all sequences, for both multi-peptide system sizes. Some sequences with a higher $T_{\mathrm{peak}}$, such as S8, S9 and S10, form disordered aggregates.
The dependence of the specific heat on the system size also reveals a condition for aggregation. Fig. 3 shows that for sequence S2, the smaller systems have the specific heat peaked at a temperature lower than $T^*$, which means that these systems do not aggregate at $T^*$. Only for sufficiently large $M$ is the specific heat peak temperature higher than $T^*$, indicating that the fibril-like aggregates formed by this sequence are stable at $T^*$. Thus, a sufficient number of peptides is needed for aggregation to happen at a given temperature. We also find that the lower peak in the specific heat of one of the small systems (Fig. 3) corresponds to a transition from metastable aggregates at intermediate temperature to the ground state at low temperature. Fig. 6 shows the trajectory of an equilibrium simulation of such a system for sequence S2. The time dependence of the system’s energy in this trajectory indicates that the peptides do not aggregate most of the time, so that the energy is relatively high, but for some short periods they can spontaneously form a metastable aggregate of a much lower energy. This metastable aggregate has a three-stranded β-sheet (Fig. 6, inset) and could act as a template for fibril growth in systems of more peptides.
III.3 Kinetics of fibril formation
It is well established that amyloid fibril formation follows the nucleation-growth mechanism, similar to that found in studies of crystallization and polymer growth Oosawa and Kasai (1962). The time dependence of fibril mass is characterized by an initial lag phase, during which the growth rate is small, before a period of rapid growth, resulting in sigmoidal kinetics Xue, Homans, and Radford (2008); Hellstrand et al. (2009); Knowles et al. (2009). Nucleation gives rise to the lag phase and is a rate-limiting step. A primary nucleation event corresponds to the initial formation of an amyloid-like aggregate from soluble species, which is followed by an elongation of the fibrils through the templated addition of species. Analyses of experimental kinetic data using master equations indicate that amyloid fibril growth can be dominated by secondary nucleation events such as fragmentation Knowles et al. (2009) and surface-catalyzed nucleation Ruschak and Miranker (2007). The nucleated and templated polymerization properties of fibril formation have been shown in coarse-grained Bellesia and Shea (2007); Nguyen and Hall (2004, 2005); Auer, Dobson, and Vendruscolo (2007); Auer et al. (2008) and all-atom Hills and Brooks (2007) simulations of short peptides. Studies of crystal-based lattice models using classical nucleation theory Kashchiev and Auer (2010); Auer (2014) and simulations Zhang and Muthukumar (2009); Irbäck et al. (2013) provide characterizations of the nucleation barriers in terms of β-sheet growth within a layer and intersheet couplings, together with extensive temperature and concentration dependence.
In the following, we investigate the behavior of fibril growth within our tube model for sequence S2. Since the ground state for this sequence is a two-layered β-sheet structure, we do not expect it to display very rich behavior, such as the increase of fibril thickness by multi-step β-sheet layer addition. Nevertheless, the system may be useful for understanding the formation of a single protofilament.
First, we consider a system of $M$ peptides at a fixed millimolar concentration under equilibrium conditions. Fig. 7 shows the dependence of the total free energy of the system on the size of the largest aggregate, $m$, formed at three temperatures slightly below the aggregation transition temperature. This free energy is defined as $F(m,T) = -k_B T \ln P(m,T)$, where $P(m,T)$ is the probability of observing a conformation with the largest aggregate size equal to $m$ at temperature $T$. $P(m,T)$ was determined from parallel tempering simulations with the weighted histogram method Ferrenberg and Swendsen (1989). It is shown that for all these temperatures the free energy has a maximum at the same size $m^*$, suggesting that $m^*$ could be the size of the critical nucleus for fibril formation. Interestingly, $m^*$ is also the system size at which the ground state changes from a helix bundle to a β-sheet on increasing $M$, and this β-sheet is unstable at these temperatures (see Fig. 3). Thus, there is a consistency between the equilibrium data obtained with a small $M$ and a larger $M$ in terms of aggregation properties. The free energy barrier for aggregation in Fig. 7 is found to increase with temperature. This barrier is not large, consistent with the fact that the sequence considered is highly aggregation-prone. For sizes beyond the barrier, Fig. 7 shows that the free energy decreases almost linearly with $m$, which is consistent with the fact that the growth of the aggregate in size is essentially one-dimensional. After a certain size, new peptides join an existing aggregate at either of its two ends and thereby elongate the β-sheets.
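The free energy profile $F(m,T) = -k_B T \ln P(m,T)$ can be illustrated with a direct histogram of largest-aggregate sizes sampled at one temperature. This is a simplified sketch: the paper obtains $P(m,T)$ from parallel tempering with histogram reweighting, whereas the example below uses unweighted samples, made-up data and $k_B = 1$.

```python
import math
from collections import Counter


def free_energy_profile(largest_sizes, temperature, k_B=1.0):
    """Map each observed largest-aggregate size m to F(m) = -k_B T ln P(m)."""
    counts = Counter(largest_sizes)
    total = len(largest_sizes)
    return {m: -k_B * temperature * math.log(count / total)
            for m, count in sorted(counts.items())}


def barrier_position(profile):
    """Size m at which the free energy profile is highest."""
    return max(profile, key=profile.get)


# Hypothetical sample: size 3 is visited least often, so F peaks there.
samples = [1] * 4 + [2] * 2 + [3] * 1 + [4] * 3
profile = free_energy_profile(samples, temperature=0.2)
print(profile)
print(barrier_position(profile))
```

Sizes that are rarely visited correspond to high free energy, so the least populated size between the dispersed and aggregated states marks the candidate critical nucleus.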
We then consider a larger system of peptides and study the time evolution from random configurations of dispersed monomers. Up to 100 independent trajectories are carried out to determine the statistics. We first consider the system at a fixed concentration and temperature. Fig. 8 (a and b) shows three typical trajectories with the total energy and the size of the largest aggregate, $m$, as functions of time. Interestingly, these trajectories show clear evidence of an initial lag time, during which $m$ fluctuates but remains small, before a rapid and almost monotonic growth (Fig. 8b). They also show that nucleation is complete once $m$ exceeds the critical size, consistent with the equilibrium analysis above. A peptide configuration at a nucleation event is shown in Fig. 8d, indicating that a possible nucleus is a three-stranded β-sheet formed by three peptides (Fig. 8e). Fig. 8c shows that the system can form multiple aggregates of various sizes. The distribution of the aggregate size obtained after a sufficiently long time is bimodal, reflecting the fact that the system size is finite and clusters of less than 4 peptides are unstable. Thus, one observes either one large cluster with size close to the system size or several smaller clusters. The largest aggregates have the form of an elongated double β-sheet that strongly resembles a cross-β structure (Fig. 8f).
Consider now the number of peptides in β-sheet conformation, $N_\beta$, which counts all the peptides that have at least 4 consecutive hydrogen bonds with another peptide. Fig. 9 shows the dependence of $N_\beta$ on time $t$, with $t$ measured in number of MC steps, averaged over the trajectories, for two different temperatures and for various concentrations. It is shown in Fig. 9 (a and b) that at the lower temperature, the time dependence of $N_\beta$ can be fitted well by an exponential relaxation function with a characteristic aggregation time $t_0$. This time dependence also depends strongly on the concentration, with $t_0$ increasing more than 3 times when $c$ is changed from 1 mM to 0.5 mM. There seems to be no evidence of a lag phase at this temperature, as $N_\beta$ increases linearly with $t$ for small $t$ (Fig. 9b). This lack of evidence, however, may be due to the fact that the deviation from the exponential growth is too small to be observed. Indeed, we find that if the temperature is increased a little, the lag phase can be observed. Fig. 9c shows that the growth of $N_\beta$ in time deviates significantly from the exponential relaxation function at small time. This growth, when plotted on a log-log scale (Fig. 9c), shows that $N_\beta \sim t^{\gamma}$ at small time with an exponent $\gamma > 1$. This exponent indicates that the time dependence of $N_\beta$ behaves like a convex function, which is evidence of a lag phase at small time. The stronger evidence of the lag phase at the higher temperature compared to that at the lower temperature is consistent with the higher free energy barrier for nucleation at the former temperature, previously shown in Fig. 7. Note that the lag phase has also been observed in the aggregation of homopolymers with a similar model but for a larger system Auer, Dobson, and Vendruscolo (2007).
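The lag-phase diagnostic described above, the slope of the growth curve on a log-log scale at small times, can be sketched as follows. The relaxation form $1 - e^{-t/t_0}$ used for comparison is an assumed standard form, not taken from the paper, and the data are synthetic.

```python
import math


def loglog_slope(times, values):
    """Least-squares slope of log(values) versus log(times)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


early_times = [1.0, 2.0, 4.0, 8.0]

# Assumed exponential relaxation 1 - exp(-t/t0): linear at small t, slope near 1.
t0 = 1000.0
relaxation = [1.0 - math.exp(-t / t0) for t in early_times]

# Synthetic curve with a lag phase: power-law growth with exponent 2.
lagged = [0.5 * t ** 2 for t in early_times]

print(loglog_slope(early_times, relaxation))
print(loglog_slope(early_times, lagged))
```

A fitted exponent close to 1 is what plain exponential relaxation gives at early times, whereas an exponent clearly above 1 means the curve starts convex, which is the signature of a lag phase.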
With the limited system size and time scale considered, we have not observed fragmentation of the fibril-like aggregates. On the other hand, surface-catalyzed nucleation may exist from the perspective of a two-layer β-sheet structure. The exposed hydrophobic side chains of the nucleated three-stranded β-sheet promote the association of other peptides by hydrophobic attraction. We find that clusters of 4 to 6 peptides often transform into a double β-sheet structure before continuing to grow. Thus, this secondary nucleation is surface-catalyzed and follows immediately after the primary nucleation event. The secondary nucleation also helps to stabilize the primary nucleus.
III.4 Aggregation of mixed sequences
Finally, we study the aggregation of a binary mixture of two sequences, S2 and S4. It was shown that in homogeneous systems, the first sequence is strongly fibril-prone, whereas the second one forms only α-helices. Furthermore, the sequence S4 has an aggregation transition temperature lower than , so its aggregate is not stable at . Strikingly, our simulations at show that in a binary system of 10 chains of each sequence, after a sufficiently long time, a fraction of the S4 chains aggregate and convert into β-sheet conformation on an existing aggregate formed by the S2 chains (see Fig. 10). Though this fraction is only about 10% on average, this observation shows that the template-based mechanism for fibril formation can be effective for polypeptides of very different natures. Here, the fibril-like aggregate formed by the aggregation-prone peptides acts as the template for the aggregation of non-aggregation-prone peptides. Note that, due to the mismatch between the hydrophobic patterns of the two sequences, the aggregates formed by the two sequences are more disordered than the homogeneous ones (Fig. 10b). Fig. 10c also shows that the growth of this mixed aggregate at the given temperature remains exponential, but the characteristic time for aggregation is larger than in the corresponding homogeneous system of sequence S2.
IV Discussion
A previous study of the tube model Hoang et al. (2006b) has shown that the hydrophobic-polar sequence can select a protein’s secondary and tertiary structures. In particular, the HPPH and HPPPH patterns have been identified as strong α-formers, whereas the HPH pattern is a β-former. Strikingly, exactly the same binary patterns have been used in experiments that allowed the successful design of de novo proteins Kamtekar et al. (1993); Wei et al. (2003). In the present study, we find that these simple selection rules still hold for the peptides in aggregates, even though the model has been changed by considering the orientations of side chains. The present study shows that the binary pattern also determines the degree of order of the aggregate. In particular, there should be some compatibility between the alignment of hydrophobic side chains and the overall symmetry of the aggregate. Interestingly, the HPH pattern appears to be both a strong β-former and a highly aggregation-prone sequence. Our finding is in full agreement with the experimental design of amyloids West et al. (1999), which shows that segments with an alternating hydrophobic and polar pattern (such as PHPHPHP) can direct protein sequences to form amyloid-like fibrils. The effect of this pattern has also been reported in simulations of an off-lattice model Bellesia and Shea (2007) and in a recent study of a lattice model Abeln et al. (2014). Interestingly, it has been found that nature disfavors this pattern in natural proteins West et al. (1999).
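The selection rules quoted above (HPH as a β-former, HPPH and HPPPH as α-formers) can be expressed as a simple pattern lookup on an HP sequence. The sketch below is only an illustration of these rules, not the simulation model; the names `PATTERN_PROPENSITY`, `find_patterns`, and `propensities` are hypothetical.

```python
# Pattern-to-structure rules quoted in the text. This lookup is only an
# illustration; it says nothing about aggregation thermodynamics.
PATTERN_PROPENSITY = {
    "HPH": "beta",
    "HPPH": "alpha",
    "HPPPH": "alpha",
}

def find_patterns(sequence):
    """Return the sorted list of rule patterns occurring in an HP sequence."""
    seq = sequence.replace(" ", "").upper()
    if set(seq) - {"H", "P"}:
        raise ValueError("sequence must contain only H and P")
    return sorted(p for p in PATTERN_PROPENSITY if p in seq)

def propensities(sequence):
    """Map each matched pattern to the secondary structure it favors."""
    return {p: PATTERN_PROPENSITY[p] for p in find_patterns(sequence)}

s2 = propensities("P P H P H P P P")  # sequence S2
s4 = propensities("P H P P P H P P")  # sequence S4
```

For the two sequences used in the binary mixture, this lookup returns the HPH pattern for S2 and the HPPPH pattern for S4, in line with the fibril-prone and helix-forming behavior reported for them.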
The role of side chains in amyloid fibril formation has been stressed in early all-atom simulations of short peptides. The study by Gsponer, Haberthür, and Caflisch (2003) showed that backbone hydrogen bonds favor the antiparallel β-sheet packing but side-chain interactions stabilize the in-register parallel β-sheet aggregate. The simulations performed by de la Paz et al. (2005) indicated the importance of specific contacts among side chains at specific sequence positions for the formation and stabilization of β-sheet oligomers and ordered fibrils. The excluded volume of side chains alone has been shown to enhance the formation of helices Banavar et al. (2009) and planar sheets Škrbić, Hoang, and Giacometti (2016). A recent lattice model showing the formation of ordered fibrils includes the side chain directionality Abeln et al. (2014). Here, we show that the correlated orientations of hydrophobic side chains are important for both the ordered packing of β-strands within a β-sheet and the stacking of β-sheets in the fibril. In particular, the alternating hydrophobic-polar pattern leads to β-sheets with hydrophobic side chains oriented on one side of the β-sheet. This one-sided orientation stabilizes the two-layered β-sheet aggregate, which is the system’s ground state and can grow into a long fibril, as shown for the case of sequence S2. Note that the asymmetry of hydrophobic β-sheet surfaces has been considered in a lattice model Auer (2014), showing increased stabilities of multi-layered β-sheets that have weak hydrophobic surfaces exposed. Our study shows how this asymmetry is induced by the sequence at the molecular level.
Previous studies Auer and Kashchiev (2010); Auer (2011) have indicated that few-layered β-sheet aggregates can be stable with respect to the peptide solution and to liquid-like oligomers in certain ranges of temperature and concentration, but are metastable with respect to aggregates with a large or infinite number of β-sheet layers. The example given by our sequence S2 shows that it is possible to design a thermodynamically stable fibril with a fixed small number of β-sheet layers by using appropriate amino acid sequences. This result is supported by the common observation of the finite and rather uniform thickness of amyloid fibrils, even though some short peptides are reported to form nanocrystals van der Wel, Lewandowski, and Griffin (2007) at low peptide concentrations.
Our thermodynamic calculations show that the formation of fibril-like aggregates is much more cooperative than that of non-fibril-like aggregates. This cooperativity is indicated by both the height of the specific heat peak and the increase of the maximum specific heat per molecule with the system size. The high cooperativity of fibril formation can be understood as arising from the highly ordered nature of fibril structures and the dominating contribution of intermolecular interactions in these structures. We also find that thermodynamic stability is not a distinguishing feature of fibril-like aggregates. In particular, sequences associated with a very high aggregation transition temperature do not necessarily form fibril-like aggregates. The increased overall hydrophobicity of the sequence is shown to enhance the stability of the aggregates without affecting their fibril characteristics.
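The two cooperativity indicators mentioned above, the height of the specific heat peak and the maximum specific heat per molecule, can be estimated from sampled energies. The sketch below uses the standard fluctuation formula C = (⟨E²⟩ - ⟨E⟩²) / (k_B T²) in reduced units with k_B = 1; whether the specific heat in this study was obtained this way or by another method is not stated here, so this is an assumption. The function names and the toy energy samples are hypothetical.

```python
def specific_heat(energies, temperature, k_b=1.0):
    """Fluctuation estimate C = (<E^2> - <E>^2) / (k_B T^2).

    Assumes equilibrium energy samples at a single temperature and
    reduced units (k_B = 1 by default).
    """
    n = len(energies)
    mean_e = sum(energies) / n
    mean_e2 = sum(e * e for e in energies) / n
    return (mean_e2 - mean_e ** 2) / (k_b * temperature ** 2)

def specific_heat_peak(samples_by_temperature, n_peptides):
    """Return (T_peak, C_max per molecule) over the sampled temperatures."""
    best_t, best_c = None, float("-inf")
    for temperature, energies in samples_by_temperature.items():
        c = specific_heat(energies, temperature) / n_peptides
        if c > best_c:
            best_t, best_c = temperature, c
    return best_t, best_c

# Toy energy samples (illustrative only, not simulation data).
toy_samples = {
    0.5: [-10.0, -10.0, -10.0, -10.0],  # frozen aggregate, no fluctuation
    1.0: [-10.0, 0.0, -10.0, 0.0],      # two-state switching near transition
    2.0: [0.0, -2.0, 0.0, -2.0],        # weakly fluctuating dispersed state
}
t_peak, c_max = specific_heat_peak(toy_samples, n_peptides=5)
```

Repeating the estimate for several system sizes and comparing the peak value per molecule is one way to check whether the maximum grows with the number of peptides, which is the signature of cooperativity referred to in the text.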
It has been suggested Auer, Dobson, and Vendruscolo (2007) that on increasing peptide concentration or peptide hydrophobicity, amyloid fibril nucleation changes from a one-step mechanism, in which the ordered nucleus is formed directly by monomeric peptides from the solution, to a two-step condensation-ordering mechanism, in which nucleation is preceded by the formation of large disordered oligomers. It has also been shown that the nucleation pathway depends on the sequence and its hydrophobicity Luiken and Bolhuis (2015). The sequence S2 in our study shows one-step nucleation, consistent with the scenario suggested in Ref. Auer, Dobson, and Vendruscolo (2007), given that this sequence has a relatively low hydrophobicity and the 1 mM concentration considered in the simulations is not high compared to those considered in Ref. Auer, Dobson, and Vendruscolo (2007). The impact of the HP sequence on nucleation is also associated with a small nucleation barrier and the rapid nucleation with an almost invisible lag phase observed for this sequence. For this fibril-prone sequence, it is found that the non-equilibrium behavior of a larger system is consistent with the equilibrium properties of smaller systems at the same peptide concentration. In particular, the frequent formation and dissolution of the aggregates before nucleation and the growth of the aggregates after nucleation are in accord with their thermodynamic stabilities as isolated systems. Note that in general, fibril formation can be governed by kinetics Ricchiuto, Brukhno, and Auer (2012) rather than by thermodynamics, especially at very low or very high concentrations. Interestingly, the small size of the critical nucleus found in our study agrees with those obtained in homopolymer studies Auer, Dobson, and Vendruscolo (2007); Auer et al. (2008) as well as in lattice heteropolymer Co and Li (2012) and all-atom Hills and Brooks (2007); Nguyen et al. (2007) simulations of short peptides.
In a recent experiment, Ridgley, Ebanks, and Barone (2011) showed that mixtures of aggregation-prone peptides and proteins, including the α-helix-rich myoglobin, self-assemble into amyloid fibers with increased amounts of cross-β content. It was suggested that the β-sheet template formed by the peptides promotes the α to β conversion in the proteins and their involvement in the cross-β structure. Our simulation result on the peptide binary mixture is fully consistent with this experiment and shows that a cross-β-sheet can be heterogeneous in its peptide composition. It is possible that naturally occurring amyloid fibrils possess this heterogeneity due to the templated self-assembly process. A certain degree of heterogeneity can be seen in the fibril structure of the HET-s prion protein Wasmer et al. (2008), which shows that the cross-β-sheets are formed by repeating ‘in-register’ protein segments but neighboring β-strands do not have the same amino acid sequence.
V Conclusion
The present study has highlighted several aspects of amyloid fibril formation, including the sequence determination of fibrillar structures, the role of side chain directionality, the thermodynamics of aggregation, and the nucleation and template-based growth mechanism. In agreement with various experimental findings, our results indicate that fibril-like aggregates form under very much the same principles as in protein folding, such as the alignment of hydrophobic residues in a β-sheet, the packing of hydrophobic side chains, and the cooperativity of the aggregation transition. These principles are mainly associated with the specificity of a sequence. Our simulations also show another feature of amyloid formation that is largely non-specific to a sequence, namely the fibril-induced aggregation of a non-aggregation-prone sequence. This templating property certainly complicates the problem of amyloid formation, as it suggests that the cross-β structure can be heterogeneous in its sequence or peptide composition. Our study provides a basis for finding routes to deal with the problem.
This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 103.01-2016.61. The use of the computer cluster at CIC-VAST is gratefully acknowledged.
References
- [1] Chiti and Dobson (2006): F. Chiti and C. M. Dobson, Annu. Rev. Biochem. 75, 333 (2006).
- [2] Riek and Eisenberg (2016): R. Riek and D. S. Eisenberg, Nature (London) 539, 227 (2016).
- [3] Jimenez et al. (1999): J. L. Jimenez, J. I. Guijarro, E. Orlova, J. Zurdo, C. M. Dobson, M. Sunde, and H. R. Saibil, EMBO J. 18, 815 (1999).
- [4] Dobson (1999): C. M. Dobson, Trends Biochem. Sci. 24, 329 (1999).
- [5] Fändrich and Dobson (2002): M. Fändrich and C. M. Dobson, EMBO J. 21, 5682 (2002).
- [6] Ventura (2005): S. Ventura, Microb. Cell Fact. 4, 11 (2005).
- [7] Sunde et al. (1997): M. Sunde, L. C. Serpell, M. Bartlam, P. E. Fraser, M. B. Pepys, and C. C. Blake, J. Mol. Biol. 273, 729 (1997).
- [8] Jiménez et al. (2002): J. L. Jiménez, E. J. Nettleton, M. Bouchard, C. V. Robinson, C. M. Dobson, and H. R. Saibil, Proc. Natl. Acad. Sci. USA 99, 9196 (2002).
