Conceived and designed the experiments: JNL MM PD GC NM. Performed the experiments: JNL MM. Analyzed the data: JNL MM PD GC NM. Contributed reagents/materials/analysis tools: JNL MM PD GC NM. Wrote the paper: JNL MM PD GC NM.
The authors have declared that no competing interests exist.
The self-organization of peptides into amyloidogenic oligomers is one of the key events for a wide range of molecular and degenerative diseases. Atomic-resolution characterization of the mechanisms responsible for the aggregation process and the resulting structures is thus a necessary step to improve our understanding of the determinants of these pathologies. To address this issue, we combine the accelerated sampling properties of replica exchange molecular dynamics simulations based on the OPEP coarse-grained potential with the atomic resolution description of interactions provided by all-atom MD simulations, and investigate the oligomerization process of the GNNQQNY peptide for three system sizes: 3-mers, 12-mers and 20-mers. Results for our integrated simulations show a rich variety of structural arrangements for aggregates of all sizes. Elongated fibril-like structures can form transiently in the 20-mer case, but they are not stable and easily interconvert into more globular and disordered forms. Our extensive characterization of the intermediate structures and their physico-chemical determinants points to a high degree of polymorphism for the GNNQQNY sequence that can be reflected at the macroscopic scale. Detailed mechanisms and structures that underlie amyloid aggregation are also provided.
The formation of amyloid fibrils is associated with many neurodegenerative diseases such as Alzheimer's, Creutzfeldt-Jakob, Parkinson's, prion disease and diabetes mellitus. In all cases, proteins misfold to form highly ordered insoluble aggregates called amyloid fibrils that deposit intra- and extracellularly and are resistant to proteases. All proteins are believed to have the intrinsic capability of forming amyloid fibrils that share common specific structural properties that have been observed by X-ray crystallography and by NMR. However, little is known about the aggregation dynamics of amyloid assemblies, and their toxicity mechanism is therefore poorly understood. It is believed that small amyloid oligomers, formed on the aggregation pathway of full amyloid fibrils, are the toxic species. A detailed atomic characterization of the oligomerization process is thus necessary to further our understanding of the toxicity of amyloid oligomers. Our approach here is to study the aggregation dynamics of a 7-residue amyloid peptide, GNNQQNY, through a combination of numerical techniques. Our results suggest that this amyloid sequence can form fibril-like structures and is polymorphic, which agrees with recent experimental observations. The ability to fully characterize and describe the aggregation pathway of amyloid sequences numerically is key to the development of future drugs to target amyloid oligomers.
The aggregation of soluble peptides and proteins first into soluble oligomeric assemblies and then into insoluble amyloid fibrils is associated with the onset of misfolding diseases such as Alzheimer's disease, Parkinson's disease, type II diabetes and transmissible spongiform encephalopathies
Many studies have shown that soluble oligomeric intermediates are more toxic than the full fibrils themselves
One important way for investigating amyloid fibril formation, polymorphism and cytotoxicity is offered by short protein fragments. Among them, GNNQQNY, from the N-terminal prion-determining domain of the yeast protein Sup35, is a paradigmatic example of a short sequence with the same properties as its corresponding full-length protein
Computer simulations have proved useful complements to experiments for looking at the initial aggregation steps providing information, for example, about the presence of amorphous states in dynamic equilibrium with fibrillar and annular states
In this paper, we push the boundaries of the GNNQQNY oligomer size and investigate, through a multi-scale simulation approach, the aggregation and polymorphism of three GNNQQNY oligomer sizes: 3-mers, 12-mers and 20-mers. Our approach takes advantage of the accelerated sampling properties of replica exchange molecular dynamics (REMD) simulations
Simulations and analyses presented here couple a number of approaches, which are described briefly in this section. The first set of simulations uses the coarse-grained OPEP potential with replica-exchange molecular dynamics (REMD). These are followed by all-atom simulations using GROMACS with MD and REMD. All simulations are labeled as follows: a number, which indicates the number of monomers, two letters indicating the force field (OP for OPEP and GR for GROMACS), a letter or number indicating the simulation and a label for the specific conformation studied (when appropriate) giving, for example: 01OP2-A1.
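The labeling convention described above can be illustrated with a small parser. This is only a sketch: the regular expression is inferred from labels that appear in this paper (for example 03OP1-A, 20GRp-B3 or 03-GR1-A), and the names `SimLabel` and `parse_label` are hypothetical, not part of any tool used by the authors.

```python
import re
from typing import NamedTuple, Optional


class SimLabel(NamedTuple):
    n_monomers: int
    force_field: str             # "OPEP" or "GROMACS"
    simulation: str              # e.g. "1", "2" or "p" (preliminary)
    conformation: Optional[str]  # e.g. "A" or "B3"; None when absent


# Assumed pattern: digits, optional hyphen, OP or GR, one simulation
# character, then an optional "-<letter><digits>" conformation tag.
_LABEL_RE = re.compile(r"^(\d+)-?(OP|GR)([0-9a-z])(?:-([A-Z]\d*))?$")
_FORCE_FIELDS = {"OP": "OPEP", "GR": "GROMACS"}


def parse_label(label: str) -> SimLabel:
    """Split a simulation label into its components."""
    match = _LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized simulation label: {label!r}")
    n_monomers, ff_code, simulation, conformation = match.groups()
    return SimLabel(int(n_monomers), _FORCE_FIELDS[ff_code], simulation, conformation)
```

For instance, `parse_label("20GRp-B3")` separates the 20-mer size, the GROMACS force field, the preliminary simulation index and the B3 conformation tag.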
REMD is a thermodynamical sampling method that requires the running of N MD trajectories (or replicas) in parallel at N different temperatures selected in order to optimize thermodynamical sampling
This broadly used method allows for conformations in a deep local minimum to explore other regions of the energy landscape by migrating to higher temperatures. While thermodynamical properties converge faster than with single temperature standard MD, dynamical information is lost due to temperature exchanges. It is still possible, however, to derive thermodynamically putative aggregation pathways by following the continuous trajectories through temperature space.
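As an illustration of how conformations migrate between temperatures, the standard Metropolis criterion for swapping two neighboring replicas can be sketched as follows. This is the textbook temperature-REMD acceptance rule, not code taken from the simulations reported here; the function names and the choice of kcal/mol units are assumptions made for the example.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping the configurations of replicas i and j.

    energy_i is the potential energy of the configuration currently at temp_i,
    energy_j that of the configuration currently at temp_j.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # p = min(1, exp(delta)); avoid evaluating exp() for large positive delta
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for this attempt."""
    p = exchange_probability(energy_i, temp_i, energy_j, temp_j)
    return rng.random() < p
```

A swap that hands the lower-energy configuration to the colder replica is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability, which is what lets a trapped conformation occasionally climb to higher temperatures.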
OPEP is a coarse-grained protein model that uses a detailed representation of all backbone atoms (N, H, Cα, C and O) and reduces each side-chain to one single bead with appropriate geometrical parameters and van der Waals radius. The OPEP energy function, which includes implicit effects of aqueous solution, is expressed as a sum of local potentials (taking into account the changes in bond lengths, bond angles, improper torsions of the side-chains and backbone torsions), non-bonded potentials (taking into account the hydrophobic and hydrophilic properties of each amino acid) and hydrogen-bonding potentials (taking into account two- and four- body interactions)
REMD simulations were carried out using a 1.5 fs time-step, periodic boundary conditions with box sizes depending on the systems and a weak coupling to an external temperature bath
The concentration for both systems is set at 4.15 mM. The random coil monomers are placed 15 Å apart.
The concentration is also 4.15 mM. The monomers are randomly placed 12 to 50 Å apart.
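To give a sense of what a 4.15 mM concentration means for the simulation cell, the following sketch converts a peptide count and a molar concentration into the edge of a cubic box. It assumes a cubic cell and that the concentration counts peptide chains; the actual box dimensions used in the simulations are not stated here, so the numbers are indicative only.

```python
AVOGADRO = 6.02214076e23      # particles per mole
ANGSTROM3_PER_LITRE = 1.0e27  # 1 L = 1 dm^3 = (1e9 Angstrom)^3


def cubic_box_edge(n_peptides, concentration_mM):
    """Edge length (in Angstrom) of a cubic box holding n_peptides at the given concentration."""
    molar = concentration_mM * 1.0e-3                 # mol/L
    volume_litres = n_peptides / (molar * AVOGADRO)   # volume that contains n_peptides chains
    return (volume_litres * ANGSTROM3_PER_LITRE) ** (1.0 / 3.0)


if __name__ == "__main__":
    for n in (3, 12, 20):
        print(f"{n:2d} chains at 4.15 mM: edge of about {cubic_box_edge(n, 4.15):.0f} Angstrom")
```

Under these assumptions, the 3-, 12- and 20-chain systems correspond to cubic edges of roughly 106, 169 and 200 Å, respectively.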
System | Length of OPEP simulations (ns) | OPEP label of structures extracted | Temperatures min–max (K) and number of temperatures | GROMACS label of reconstructed OPEP structures | Length of GROMACS simulations (ns) | Total number of atoms | Temperature (K) |
3-mer | 50×16 | 03OP1-A | 222.5–525 16 | 03-GR1-A | 100 | 6504 | 300 |
03OP1-B | 222.5–525 16 | 03-GR1-B | 100 | 5997 | 300 | ||
03OP1-C | 222.5–525 16 | 03-GR1-C | 100 | 5619 | 300 | ||
03OP1-D | 222.5–525 16 | 03-GR1-D | 100 | 5949 | 300 | ||
03OP1-E | 222.5–525 16 | 03-GR1-E | 100 | 6621 | 300 | ||
12-mer | 125×16 | 12OP1-A | 222.5–525 16 | 12-GR1-A | 100 | 56271 | 300 |
12OP1-B | 222.5–525 16 | 12-GR1-B | 100 | 40476 | 300 | ||
12OP1-C | 222.5–525 16 | 12-GR1-C | 100 | 15381 | 300 | ||
12OP1-D | 222.5–525 16 | 12-GR1-D | 100 | 16134 | 300 | ||
12OP1-E | 222.5–525 16 | 12-GR1-E | 100 | 17157 | 300 | ||
20-mer OPp | 200×20 | 20OPp-A | 234.6–447.6 20 | 20GRp-A1 | 100 | 25620 | 300 |
20OPp-A | 234.6–447.6 20 | 20GRp-A2 | 100 | 25620 | 300 | ||
20OPp-B | 234.6–447.6 20 | 20GRp-B1 | 100 | 27816 | 300 | ||
20OPp-B | 234.6–447.6 20 | 20GRp-B2 | 100 | 27816 | 300 | ||
20OPp-B | 234.6–447.6 20 | 20GRp-B3 (REMD) | 10×12 | 27816 | see text | ||
20OPp-C | 234.6–447.6 20 | 20GRp-C1 | 100 | 49674 | 300 | ||
20OPp-C | 234.6–447.6 20 | 20GRp-C2 | 100 | 49674 | 300 | ||
20OPp-D | 234.6–447.6 20 | 20GRp-D1 | 100 | 23746 | 300 | ||
20OPp-D | 234.6–447.6 20 | 20GRp-D2 | 100 | 23746 | 300 | ||
20-mer OP2 | 400×22 | 20OP2-A | 223.8–425.9 22 | 20GR2-A | 100 | 21586 | 300 |
20OP2-B | 223.8–425.9 22 | 20GR2-B | 100 | 24640 | 300 | ||
20OP2-C | 223.8–425.9 22 | 20GR2-C | 100 | 23554 | 300 | ||
20OP2-E | 223.8–425.9 22 | 20GR2-E | 100 | 26065 | 300 | ||
20OP2-N | 223.8–425.9 22 | 20GR2-N1 | 100 | 32206 | 300 | ||
20OP2-N | 223.8–425.9 22 | 20GR2-N2 (REMD) | 10×12 | 32206 | see text |
This table presents simulations done with OPEP (coarse-grained potential) and GROMACS (all-atom potential).
The total simulation time for OPEP REMD simulations in the format time_per_replica x number_of_replicas.
The label of the OPEP/GROMACS structures extracted. The label indicates the number of monomers, the potential used (OP for OPEP and GR for GROMACS), the simulation index (1,2 or p (preliminary)) and the letter ID of the structure.
The range of temperatures (in K) used for OPEP REMD simulations.
The total simulation time for GROMACS simulations. MD simulations are indicated by only one number while, for REMD simulations, the total simulation time is given in the format time_per_replica x number_of_replicas.
The total number of atoms in the system, including protein and solvation water atoms.
The temperature used in GROMACS simulations (in K).
Determining whether equilibrium has been reached, even for the trimer, is difficult. It is always possible that a system is stuck in a minimum and thermodynamical properties will then appear as though they are converged. Here, we use the specific heat to track convergence. This quantity, the second derivative of the free energy, is very sensitive to convergence at all temperatures, and provides a very stringent test even near transitions. Because we are mostly interested in the qualitative properties of the systems under study here, we consider that a system is converged when the overall shape of the specific heat near the transition is converged. This ensures that the dominant structures are found with the proper weight, within the limits of our simulations.
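The convergence test described above can be sketched with the simple fluctuation estimate of the specific heat, C_v = (⟨E²⟩ − ⟨E⟩²)/(k_B T²), evaluated independently at each temperature. Note that this is a simpler estimator swapped in for illustration: the specific heat curves in this work are generated with PTWHAM, which reweights data from all temperatures. The function names and the 10 K tolerance are assumptions.

```python
import statistics

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def specific_heat(energies, temperature):
    """Fluctuation estimate of C_v from energies sampled at a single temperature."""
    variance = statistics.pvariance(energies)  # <E^2> - <E>^2
    return variance / (K_B * temperature ** 2)


def melting_temperature(cv_by_temperature):
    """Temperature at which the specific heat curve peaks."""
    return max(cv_by_temperature, key=cv_by_temperature.get)


def windows_agree(cv_window_a, cv_window_b, tolerance_K=10.0):
    """Crude convergence check: do two time windows place the peak at the same temperature?"""
    shift = abs(melting_temperature(cv_window_a) - melting_temperature(cv_window_b))
    return shift <= tolerance_K
```

Here each `cv_window_*` argument is a dictionary mapping temperature (K) to the specific heat computed over one time interval; comparing two intervals mirrors the qualitative criterion used in the text, where only the overall shape near the transition is required to be stable.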
Analysis for these simulations was performed, in part, using a new clustering code that enables us to identify the dominant configuration types in terms of clusters formed in β-sheet structures based on strand attachment. The criterion set to define a hydrogen bond between two given strands is similar to the one used in the DSSP algorithm.
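The idea of classifying structures by strand attachment can be illustrated by grouping strands into sheets with a union-find pass and reporting the sheet sizes in decreasing order, which yields configuration types such as "7 4 1". This is a minimal sketch and not the authors' clustering code: the list of attached strand pairs is taken as given, whereas the actual analysis derives it from a DSSP-like hydrogen-bond criterion that is not reproduced here.

```python
def configuration_type(n_strands, attached_pairs):
    """Return sheet sizes (largest first) given pairs of hydrogen-bonded strands.

    n_strands: total number of peptide chains, indexed 0..n_strands-1.
    attached_pairs: iterable of (i, j) index pairs considered attached.
    """
    parent = list(range(n_strands))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in attached_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    sizes = {}
    for strand in range(n_strands):
        root = find(strand)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

For a dodecamer in which strands 0 to 6 form one sheet, strands 7 to 10 form a second sheet and strand 11 is free, the function returns `[7, 4, 1]`.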
In all cases, structures for all-atom simulations were selected among the lower-energy OPEP structures that best resisted temperature increases during replica exchanges. For one preliminary simulation (20OPp), however, the structures were selected based on their frequency of occurrence.
The initial structures for all-atom, explicit solvent Molecular Dynamics (MD) simulations were built by reconstructing the atomic detail of selected conformations from the OPEP coarse-grained runs. Reconstruction was carried out using the MAXSPROUT server
The resulting minimized systems were then solvated in a cubic-shaped box large enough to contain 1.0
For all MD simulations, aggregates were simulated at 300 K for 100 ns. REMD simulations were also used to investigate the stability and the conformational preferences of two 20-mer aggregates. The replica exchange simulations were carried out using the Solute Tempering REMD
The aggregation process for the three types of GNNQQNY oligomers – containing 3, 12 and 20 chains, respectively – was studied by a multi-scale approach consisting of a preliminary, thorough exploration of the phase space through REMD with the OPEP coarse-grained potential, followed by the refinement of the most representative aggregate structures obtained via all-atom MD or REMD simulations in explicit solvent. The initial concentration for the OPEP runs was around 4.15 mM. This concentration is 10 times higher than the concentration at which amyloid GNNQQNY fibrils form in a few hours according to Nelson et al.
For clarity, we first present and discuss results for the trimeric and dodecameric systems as they will serve as basis for understanding the results observed for the 20-mer presented in the last part of this section.
Coarse-Grained REMD simulations were performed with 16 replicas for 50 ns at temperatures discussed in the materials and methods section. Although the system is not fully converged for the very low-temperature replicas, the PTWHAM-generated specific heat computed over two different time intervals shows that the melting temperature, Tm, is well-established at ∼294 K (
The specific heat is calculated over two time intervals for each system (trimer on the left panel and dodecamer on the right panel). Both systems have converged over the time windows displayed here.
Population | |||||||||
222.5 K | 235.8 K | 250.8 K | 266.7 K | 283.4 K | 300 K | 313.8 K | |||
Trimer | Configuration types (%) |
3 | 96.3 | 100 | 98.2 | 98.2 | 92.6 | 13 | 1.9 |
2 1 | 3.7 | 0 | 1.8 | 1.8 | 1.9 | 22.2 | 9.2 | ||
1 1 1 | 0 | 0 | 0 | 0 | 5.5 | 64.8 | 88.9 | ||
% parallel |
45.4 | 39.8 | 22.2 | 12 | 6.5 | 7.4 | 0 | ||
% antiparallel |
54.6 | 60.2 | 77.8 | 88 | 88 | 24.1 | 5.6 | ||
% fully parallel sheets |
0 | 0 | 1.9 | 0 | 0 | 21.1 | 0 | ||
% fully antiparallel sheets |
9.3 | 20.4 | 57.4 | 75.9 | 86.3 | 68.4 | 50 | ||
% mixed sheets |
90.7 | 79.6 | 40.7 | 24.1 | 13.7 | 10.5 | 50 | ||
β-sheet content (%) |
72.4 | 70.6 | 66.6 | 61.6 | 55.1 | 33.3 | 28.3 | ||
% Strands in-register/out-of-register by 1 residue |
51.9/46.3 | 47.2/48.2 | 47.2/38.9 | 38.0/43.5 | 17.7/45.1 | 21.1/42.1 | 33.3/41.7 | ||
Dodecamer | Configuration types (%) |
7 5 | 0 | 4.5 | 9.8 | 54.1 | 75.2 | 34.6 | 0 |
7 4 1 | 30.8 | 33.1 | 12.8 | 1.5 | 0.8 | 1.5 | 0 | ||
8 4 | 0 | 11.3 | 38.4 | 36.8 | 5.3 | 1.5 | 0 | ||
12 | 66.9 | 7.5 | 9 | 0.8 | 1.5 | 0.8 | 0 | ||
11 1 | 2.3 | 15 | 9 | 3 | 1.5 | 0.8 | 0 | ||
10 1 1 | 0 | 9 | 6 | 0 | 3 | 1.5 | 0 | ||
% parallel |
32.3 | 31.3 | 40.9 | 47.9 | 41.9 | 37.9 | 17.1 | ||
% antiparallel |
60.2 | 48.4 | 45.8 | 46.7 | 49.5 | 41.1 | 16 | ||
% fully parallel sheets |
0 | 0.4 | 1.2 | 0.4 | 0.8 | 15.3 | 22.6 | ||
% fully antiparallel sheets |
6.3 | 10.8 | 6.2 | 2.3 | 2.7 | 11.4 | 21.8 | ||
% mixed sheets |
93.7 | 88.8 | 92.6 | 97.3 | 96.5 | 73.3 | 55.6 | ||
β-sheet content (%) |
75.8 | 59.1 | 61.7 | 63.5 | 61.8 | 43.8 | 11.4 | ||
% Strands in-register/out-of-register by 1 residue |
52.5/27.1 | 33.5/46.3 | 39.2/41.7 | 44.7/43.4 | 38.8/54.0 | 30.6/45.6 | 29.5/39.7 |
Temperatures above 313.8 K are not displayed here since they are populated essentially by conformations with random coil monomers with no secondary structure. The percentages are calculated over all the structures obtained in the last 40 ns (trimer) and in the last 100 ns (dodecamer) of the OPEP REMD simulations, where the systems have converged.
The dominant configuration types (as described in the OPEP Analysis and Structure Selection section).
The average amount of parallel and anti-parallel strands in the β-sheets formed. The sum of parallel and antiparallel strands in a structure does not always total 100% if the structure sees strands in an undefined orientation, i.e. attached by only one hydrogen bond.
The average amount of fully parallel, fully antiparallel and mixed sheets.
The average amount of residues in a β conformation.
The average amount of strands in-register and out-of-register (by one residue) in β-sheets.
Structurally, the trimer displays a strong tendency to form ordered planar β-sheets below Tm (
We show, on the left-hand side panel, representative structures obtained from the OPEP simulations and, on the right-hand side panel, the representative structures obtained after all-atom MD refinements. 03OP1-A, -B, -C, -D and -E were extracted respectively at 222.5 K (probability of occurrence for this β-strand organization: 91%), 235.7 K (80%), 250.8 K (41%), 266.7 K (76%) and 283.4 K (86%). 03OP1-A to -C are mixed β-sheets while 03OP1-D and -E are fully antiparallel β-sheets. The all-atom structures are represented in secondary structure cartoon and only the tyrosines (most hydrophobic residues in the sequence) are shown in blue sticks (hydrogen atoms are omitted).
Five representative OPEP-generated structures, labeled 03OP1-A, 03OP1-B, 03OP1-C, 03OP1-D and 03OP1-E (
As seen in the final structures of the all-atom simulations displayed in
Even though all-atom simulations cannot capture fully disordered chains within 100 ns at 300 K, the coarse-grained and all-atom simulations indicate that both parallel and anti-parallel arrangements can be found in multiple meta-stable two-stranded and three-stranded structures, with various registers of hydrogen bonds contributing to the structural richness and conformational variability of the trimeric aggregates.
Our trimeric results point to the existence of three minima associated with parallel, antiparallel and mixed parallel/antiparallel β-sheet structures, and are consistent with previous computational studies at the all-atom level on the GNNQQNY trimer
OPEP-REMD was performed with 16 replicas, as for the trimer, but for 125 ns each. Within the first 25 ns, the system converges at low temperature to β-sheet rich structures where the strands prefer an antiparallel orientation, as for the trimer, but with a lower melting temperature of 283 K (see
Kinetically, the aggregation tendency for the dodecamer is to first form one or two stable four-stranded β-sheets that show little dissociation and that trigger the transient formation of one or two longer β-sheets. The formation of a trimer that precedes the four-stranded β-sheet shows, however, a higher dissociation/association rate. Interestingly, the tendency of the GNNQQNY sequence to form stable tetrameric aggregation nuclei had already been noticed in a previous investigation on the system
We show, on the left-hand side panel, representative structures obtained from the OPEP simulations and, on the right-hand side panel, representative structures obtained after all-atom MD refinements. 12OP1-A, -B, -C, -D and -E were extracted respectively at 222.5 K, 235.7 K, 250.8 K, 266.7 K and 283.4 K. 12OP1-A (top left structure) is a long flat β-sheet. 12OP1-B to -E (second left to bottom left structures) are made of 2 β-sheets facing each other. Monomers forming β-sheets in the initial state are colored red or green. These colors are kept in the final structure. The tyrosines are shown in blue sticks for the all-atom structures. During the all-atom MD simulation the structures tend to be more globular but the strands see no exchange between the β-sheets, i.e. the red and green β-sheets do not dissociate for the 12-mer system.
As would be expected, a rich set of ordered configurations is visited for the 12-mer (
The 5 most representative structures obtained from OPEP REMD (labeled 12OP1-A to 12OP1-E) were further studied by all-atom MD. Representative structures obtained from the latter simulations are shown in
β-Sheet Content (based on the DSSP program) |
% Parallel - Antiparallel |
% Res Align (% 100 Align - % Shift by 1 Res - % Shift by 2 Res…) |
||||||||||||
GROMACS - GR | OPEP - OP | GROMACS - GR | OPEP - OP | GROMACS - GR | OPEP - OP | |||||||||
Configuration Type |
Structure | First Cluster | Final Structure | CG | Min | First Cluster | Final Structure | CG | Min | First Cluster | Final Structure | CG | Min | |
Trimers | 3 | 03_1 - A | 38% | 19% | 71% (100%) | 67% | 50 - 50 | 50 - 0 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 |
3 | 03_1 - B | 38% | 24% | 19% (27%) | 67% | 0 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 0 - 50 | 50 - 50 | 50 - 50 | |
3 | 03_1 - C | 57% | 62% | 62% (87%) | 67% | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | |
3 | 03_1 - D | 48% | 57% | 52% (73%) | 67% | 0 - 50 | 0 - 50 | 0 - 50 | 0 - 100 | 50 - 50 | 50 - 50 | 50 - 50 | 50 - 50 | |
3 | 03_1 - E | 48% | 52% | 57% (80%) | 53% | 50 - 50 | 0 - 50 | 0 - 50 | 0 - 100 | 50 - 0 - 50 | 50 - 0 - 50 | 0 - 50 - 50 | 0 - 50 - 50 | |
12-mers | 12 | 12_1 - A | 49% | 35% | 64% (90%) | 60% | 60 - 20 | 40 - 10 | 36 - 64 | 36 - 64 | 80 - 20 | 50 - 50 | 64 - 18 - 18 | 64 - 18 - 18 |
6 4 1 | 12_1 - B | 14% | 14% | 45% (63%) | 44% | 13 - 25 | 20 - 40 | 22 - 22 | 33 - 33 | 35 - 25 - 37 | 0 - 40 - 60 | 33 - 56 - 11 | 33 - 56 - 11 | |
8 4 | 12_1 - C | 35% | 29% | 49% (68%) | 42% | 10 - 20 | 40 - 20 | 60 - 40 | 40 - 30 | 40 - 40 - 20 | 50 - 20 - 30 | 60 - 20 - 20 | 60 - 20 - 20 | |
7 5 | 12_1 - D | 33% | 24% | 54% (75%) | 55% | 11 - 22 | 0 - 30 | 40 - 60 | 10 - 60 | 45 - 44 - 11 | 50 - 50 | 50 - 50 | 50 - 50 | |
7 5 | 12_1 - E | 56% | 51% | 46% | 51% | 60 - 40 | 50 - 20 | 60 - 40 | 60 - 40 | 60 - 30 - 10 | 50 - 50 | 60 - 30 - 10 | 50 - 40 -10 | |
20-mers | 11 7 1 1 | 20_p - A1 | 43% | 38% | 38% (53%) | 37% | 44 - 6 | 56 - 13 | 76 - 24 | 44 - 25 | 56 - 31 - 13 | 63 - 19 - 18 | 34 - 33 - 33 | 44 - 37 - 19 |
20_p - A2 | 31% | 34% | 38% (53%) | 37% | 33 - 0 | 50 - 0 | 76 - 24 | 44 - 25 | 40 - 40 - 20 | 63 - 31 - 6 | 34 - 33 - 33 | 44 - 37 - 19 | ||
11 7 1 1 | 20_p - B1 | 52% | 49% | 36% (50%) | 66% | 69 - 19 | 59 - 12 | 71 -24 | 75 - 25 | 31 - 44 - 25 | 35 - 47 -18 | 33 - 39 - 28 | 31 - 44 - 25 | |
20_p - B2 | 47% | 38% | 36% (50%) | 66% | 40 - 13 | 38 - 13 | 71 -24 | 75 - 25 | 34 - 53 - 13 | 31 - 56 - 13 | 33 - 39 - 28 | 31 - 44 - 25 | ||
11 7 1 1 | 20_p - C1 | 39% | 35% | 44% (44%) | 59% | 44 - 19 | 47 - 20 | 72 - 22 | 69 - 25 | 31 - 50 - 19 | 54 - 33 -13 | 33 - 39 - 28 | 31 - 44 - 25 | |
20_p - C2 | 59% | 36% | 44% (44%) | 59% | 69 - 25 | 69 - 8 | 72 - 22 | 69 - 25 | 27 - 47 - 26 | 46 - 54 | 33 - 39 - 28 | 31 - 44 - 25 | ||
12 7 1 | 20_p - D1 | 44% | 46% | 44% (62%) | 64% | 24 - 35 | 24 - 35 | 35 - 47 | 41 -47 | 59 - 35 - 6 | 65 - 33 -6 | 59 - 35 - 6 | 59 - 35 - 6 | |
20_p - D2 | 27% | 27% | 44% (62%) | 64% | 29 - 24 | 22 - 22 | 35 - 47 | 41 - 47 | 63 - 31 - 6 | 61 - 33 -6 | 59 - 35 - 6 | 59 - 35 - 6 | ||
9 6 5 | 20_2 - A | 44% | 44% | 34% (47%) | 28% | 69 - 13 | 61 - 11 | 72 - 22 | 73 - 13 | 31 - 44 -25 | 39 - 39 - 22 | 56 - 39 - 5 | 20 - 53 - 27 | |
8 7 5 | 20_2 - B | 45% | 39% | 39% (55%) | 71% | 41 - 41 | 17 - 22 | 71 - 18 | 41 - 59 | 41 - 18 - 41 | 56 - 28 - 16 | 35 - 53 - 12 | 41 - 24 - 35 | |
7 7 6 | 20_2 - C | 36% | 37% | 52% (73%) | 65% | 44 - 17 | 41 - 35 | 76 - 24 | 59 - 41 | 50 - 39 - 11 | 35 - 47 - 18 | 30 - 41 - 29 | 41 - 47 - 12 | |
8 8 4 | 20_2 - E | 32% | 33% | 51% (72%) | 63% | 28 - 11 | 42 - 16 | 59 - 41 | 50 - 39 | 44 - 39 - 17 | 42 - 37 - 21 | 54 - 23 -23 | 39 - 33 - 28 | |
10 8 2 | 20_2 - N | 41% | 39% | 42% (59%) | 57% | 39 - 22 | 41 - 24 | 31 - 56 | 31 - 56 | 44 - 28 - 28 | 29 - 47 - 24 | 44 - 25 -31 | 44 - 25 - 31 |
“First Cluster” means the most representative structure of the GROMACS simulations. “Final structure” is the final conformation obtained at the end of the GROMACS simulations. “CG” is the structure extracted at the end of the OPEP simulations before the reconstruction of the side chains. “Min” indicates the structure resulting from the reconstruction of the side chains after a minimization step.
The configuration type (as described in the OPEP Analysis and Structure Selection section).
The average amount (percentage) of residues in a β conformation. For OPEP, the percentage in brackets has been calculated without taking the Glycines into account.
The average amount (percentage) of parallel and anti-parallel strands in a structure. The sum of parallel and antiparallel strands in a structure does not always total 100% if the structure sees strands in an undefined orientation, i.e. attached by only one hydrogen bond.
The average amount of strands in-register and out-of-register (by one residue).
Structure 12OP1-B is characterized by a mainly parallel twisted β-sheet, with four strands packed on top. This structure is not stable in the all-atom MD setting, simulation 12GR1-B, and evolves towards a compact globular structure as shown by the evolution of the radius of gyration in time (
Such a supramolecular organization of the peptides may be representative of one of the soluble intermediates on the pathway to fibril formation. Solubility is favored by the presence of hydrophilic side chains on the external surface of the aggregate. At the same time, the packing of the interior is not optimal, so that the resulting structure may not be in the most favorable arrangement to ensure lasting stability. Water can also access the interior of the globular aggregate, disrupting inter-strand hydrogen bonds, eventually favoring conformational changes.
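The radius of gyration used above to follow the compaction of the aggregate is the mass-weighted root-mean-square distance of the atoms from their center of mass. A minimal sketch is given below; it operates on plain coordinate lists and uses uniform masses when none are supplied, which is an assumption of this example rather than the procedure used for the reported curves.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: sequence of (x, y, z) tuples, in Angstrom.
    masses: optional sequence of per-atom masses; uniform if omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass, one component at a time
    center = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances to the center of mass
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)
```

Evaluating this quantity on successive trajectory frames gives the kind of time series that reveals a transition from an extended sheet to a compact globular arrangement: the value drops as the aggregate collapses.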
Structures 12OP1-C and 12OP1-D are similar to 12OP1-B: the main difference is that four strands pack with their long axis almost perpendicular to the long axis of the extended β-sheet. The main difference between 12OP1-C and 12OP1-D is that the planes defined by the four strands have different inclinations with respect to the plane of the long extended β-sheet. In the all-atom MD setting (simulations 12GR1-C and 12GR1-D), these structures evolve to less globular, but more compact final arrangements than that observed above, with most of the Tyr side-chains in contact with the solvent (
Finally, structure 12OP1-E is characterized by two orthogonal twisted β-sheets. The OPEP structure is very stable: it does not undergo significant rearrangement during the all-atom MD, contrary to the previous cases, and the β-sheet content remains constant (
Overall, the combined results indicate that the configurational richness increases from the trimer to the 12-mer and that the critical nucleus has not yet been found. However, the strands do not exchange much between sheets, as seen in
Next, we turned to the study of 20-mers in order to assess the importance of the number of chains on the final supra-molecular organization and determine whether new structural motifs can emerge.
Three REMD simulations with OPEP were thus generated for the GNNQQNY 20-mer systems: 20OPp, 20OP1 and 20OP2. A preliminary run, 20OPp, was performed to identify the four most common low-energy clusters, from which we extracted the central structure of each: 20OPp-A, 20OPp-B, 20OPp-C and 20OPp-D (
The stable 20-mer structures obtained from OPEP's preliminary simulation 20OPp are shown on the left-hand side panel. The final primary clusters obtained from the OPEP structures with all-atom MD or all-atom REMD are displayed on the right-hand side panel. 20OPp-A,-B,-C and -D were extracted at 283.4 K. The color code is the same as in
Following this preliminary run, we performed two additional 400 ns-long simulations, 20OP1 and 20OP2 (
The specific heat is calculated over two time intervals for the systems 20OP1 (left panel) and 20OP2 (right panel) during the last 200 ns.
As for the 12-mer, aggregation is extremely favorable energetically. The melting temperature for 20OP1 varies between 280.4 K and 289.2 K during the last 200 ns of simulation and the energy of ordered structures at the lowest temperature, 223.8 K, is on average −27.8 kcal/mol/monomer for 20OP1, as calculated from the PTWHAM analysis. For 20OP2, the transition occurs between 260.2 K and 290.5 K and the potential energy of aggregated structures at the lowest temperature, 223.8 K, is on average −28.1 kcal/mol, which is comparable to the energies of aggregated structures for 20OP1. Those energies are about 10 kcal/mol/monomer above the energies of the dodecamer structures at 222.5 K: clearly, the structures generated for the 20-mer are not as ordered as those found for the 12-mer, due to the much longer time needed to sample these energetically favorable conformations, but also because the entropic loss associated with full ordering is larger for the 20-mer. For both the 20OP1 and the 20OP2 simulation sets, random coil structures dominate at temperatures above 280 K.
By following specific trajectories as they move through temperature space, it is possible to identify sequences of steps leading to low-energy ordered structures. In the more than 25 such events observed in 20OP1 and 20OP2, the aggregation process is systematically triggered by the formation of a few dimer, trimer and/or tetramer seeds. The conformations obtained from both 20OP1 and 20OP2 are structurally similar in the sense that they almost always consist of three sheets of 5 to 9 strands each, either facing each other in a triangle-like arrangement or organized in a propeller-like or β-sandwich conformation (
The final primary clusters obtained from the OPEP structures with all-atom MD or all-atom REMD are displayed on the right-hand side panel. 20OP2-A,-B,-C,-E and -N were extracted respectively at 260.1 K, 249.2 K, 254.7 K, 265.9 K and 292.2 K. The different sheets are distinguished by either a green, red or yellow color and the tyrosines are shown in blue sticks for the all-atom structures. Structures 20OP2-A,-B,-C and -E are composed of 3 sheets twisted around each other while structure 20OP2-N is a 2-sheet fibril-like conformation. During the all-atom MD simulation the structures tend to be more globular with the strands seeing some exchange between the β-sheets, i.e. the red, green and yellow β-sheets from the OPEP structures dissociate and re-associate during the all-atom MD simulations except for the fibril-like structures 20GR2-N1 and -N2.
Population | ||||||||||
223.8 K | 249.3 K | 260.1 K | 265.9 K | 270.3 K | 273.8 K | 277.1 K | 280.1 K | |||
20OP1 | Configuration types (%)(a) | 8 7 5 | 93.2 | 12.4 | 7.5 | 7.1 | 9.8 | 6.8 | 7.1 | 9.4 |
8 7 3 2 | 0.0 | 44.7 | 2.6 | 0.8 | 0.4 | 0.8 | 0.0 | 0.0 | ||
7 7 6 | 2.6 | 12.4 | 13.9 | 9.8 | 12.8 | 10.5 | 12.4 | 7.9 | ||
10 6 4 | 0.4 | 0.0 | 7.5 | 8.7 | 5.3 | 11.3 | 5.3 | 3.4 | ||
11 5 4 | 0.0 | 0.8 | 7.5 | 6.4 | 6.0 | 3.4 | 1.5 | 0.0 | ||
11 5 2 2 | 0.0 | 10.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||
14 6 | 0.0 | 0.8 | 5.3 | 6.8 | 7.1 | 7.5 | 3.4 | 4.9 | ||
15 5 | 0.0 | 0.0 | 3.4 | 7.1 | 3.8 | 4.5 | 6.8 | 8.3 | ||
10 10 | 0.0 | 0.4 | 4.5 | 8.7 | 6.0 | 5.3 | 7.9 | 2.3 | ||
10 5 5 | 0.0 | 0.0 | 1.9 | 3.8 | 8.3 | 7.1 | 3.8 | 2.6 | ||
11 9 | 0.0 | 0.0 | 3.8 | 9.0 | 11.7 | 6.8 | 7.1 | 4.5 | ||
20 | 0.0 | 1.1 | 3.4 | 4.9 | 5.3 | 5.3 | 3.4 | 0.0 | ||
% parallel(b) | 73.9 | 48.8 | 53.4 | 55.1 | 56.2 | 56.8 | 58.4 | 46.8 | ||
% antiparallel(b) | 25.1 | 48.2 | 41.3 | 37.6 | 36.7 | 35.6 | 34.5 | 29.4 | ||
% fully parallel sheets(c) | 26.2 | 22.4 | 19.7 | 15.1 | 11.8 | 14.5 | 11.1 | 16.1 | ||
% fully antiparallel sheets(c) | 0.9 | 7.9 | 3.1 | 4.6 | 1.9 | 2.5 | 1.0 | 6.5 | ||
% mixed sheets(c) | 72.9 | 69.7 | 77.2 | 80.3 | 86.3 | 83 | 87.9 | 77.4 | ||
β-sheet content (%)(d) | 59.1 | 61.3 | 60.5 | 60.5 | 60.5 | 59.8 | 57.1 | 45.7 | ||
% Strands in-register/out-of-register by 1 residue(e) | 32.1/38.0 | 43.1/31.2 | 42.6/33.8 | 41.2/36.9 | 40.1/38.4 | 38.8/38.6 | 39.7/38.6 | 38.7/39.8 | ||
20OP2 | Configuration types (%)(a) | 9 6 5 | 29.3 | 29.3 | 24.4 | 24.1 | 18.4 | 15.4 | 8.7 | 8.3 |
9 7 4 | 0.4 | 1.5 | 4.5 | 4.5 | 7.5 | 4.9 | 6.0 | 3.8 | ||
8 8 4 | 1.1 | 0.0 | 1.9 | 1.1 | 3.8 | 6.0 | 4.5 | 4.1 | ||
7 7 6 | 4.5 | 6.0 | 8.7 | 12.0 | 12.8 | 16.2 | 15.0 | 12.0 | ||
15 5 | 12.4 | 9.4 | 7.5 | 2.3 | 6.8 | 3.4 | 1.9 | 1.5 | ||
13 7 | 0.8 | 0.4 | 0.4 | 1.9 | 1.5 | 2.3 | 4.9 | 7.1 | ||
14 6 | 3.4 | 3.8 | 3.0 | 4.9 | 10.9 | 6.4 | 8.7 | 6.4 | ||
11 9 | 9.4 | 14.7 | 6.8 | 3.8 | 2.3 | 4.5 | 5.6 | 3.8 | ||
11 5 4 | 16.5 | 12.4 | 3.8 | 0.0 | 0.4 | 1.5 | 0.0 | 0.0 | ||
8 7 5 | 0.4 | 1.9 | 1.5 | 1.9 | 5.6 | 6.8 | 6.4 | 6.0 | ||
10 10 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||
20 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 0.0 | 0.4 | ||
% parallel(b) | 53.1 | 56.6 | 57.1 | 56.8 | 56.7 | 55.7 | 56.7 | 55.1 | ||
% antiparallel(b) | 38.1 | 35.7 | 36.5 | 37.5 | 37.9 | 39.3 | 38.4 | 38.5 | ||
% fully parallel sheets(c) | 7.4 | 16.9 | 15.0 | 14.2 | 14.1 | 14.1 | 18.7 | 17.1 | ||
% fully antiparallel sheets(c) | 0.6 | 4.7 | 3.2 | 4.2 | 3.9 | 4.4 | 5.1 | 8.1 | ||
% mixed sheets(c) | 92.0 | 78.5 | 81.8 | 81.6 | 82.0 | 81.5 | 76.2 | 74.8 | ||
β-sheet content (%)(d) | 66.5 | 63.4 | 63.0 | 62.5 | 61.5 | 60.6 | 58.7 | 56.0 | ||
% Strands in-register/out-of-register by 1 residue(e) | 51.0/36.3 | 50.0/34.0 | 48.1/35.1 | 44.6/37.0 | 42.4/37.2 | 42.3/37.2 | 39.9/39.1 | 39.7/39.4 |
Temperatures above 280.1 K are not displayed here since they are populated essentially by conformations with random coil monomers with no secondary structure. The percentages are calculated over all the structures obtained in the last 200 ns of both OPEP REMD simulations. For details on (a)–(e), see
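The three sheet-type rows of the table (fully parallel, fully antiparallel, mixed) sum to 100% at each temperature, which suggests a per-sheet classification. A minimal sketch of such a classification follows. It assumes that each sheet is described by the orientation of every pair of neighbouring strands; the input format and the function names `classify_sheet` and `sheet_type_percentages` are hypothetical and do not come from the analysis code used for the table.

```python
from collections import Counter

SHEET_LABELS = ("fully parallel", "fully antiparallel", "mixed")


def classify_sheet(pair_orientations):
    """Label one sheet from the orientations of its neighbouring strand pairs.

    pair_orientations is a non-empty list of "P" (parallel) or "A"
    (antiparallel), one entry per pair of adjacent strands. This input
    format is an assumption made for illustration.
    """
    if not pair_orientations:
        raise ValueError("a sheet needs at least one strand pair")
    kinds = set(pair_orientations)
    if kinds == {"P"}:
        return "fully parallel"
    if kinds == {"A"}:
        return "fully antiparallel"
    return "mixed"


def sheet_type_percentages(sheets):
    """Return the percentage of sheets of each type over a set of sheets."""
    counts = Counter(classify_sheet(sheet) for sheet in sheets)
    total = sum(counts.values())
    return {label: 100.0 * counts[label] / total for label in SHEET_LABELS}


# Four hypothetical sheets pooled from several structures.
example = [["P", "P"], ["A"], ["P", "A", "P"], ["P"]]
print(sheet_type_percentages(example))
```

Under this assumed definition, a single antiparallel pair in an otherwise parallel sheet is enough to count the sheet as mixed, so long sheets would be expected to fall mostly in the mixed category.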
In both REMDs, the β-sheets have a high probability of being in a mixed anti-parallel/parallel orientation state due to their length (over 70% below Tm) (
Consistent with the protocols described above for the 3-mers and 12-mers, the dynamical properties of 9 selected oligomeric conformations generated by the OPEP simulations were refined by all-atom MD simulations in explicit solvent (see
The first set of all-atom MD simulations was run on the structures selected from the 20OPp calculations. OPEP runs identified two main types of 3D organization for the 20-mer: extended β-sheet and globular-like structures. The former are characterized by the presence of two parallel sheets, while the latter are characterized by a circular organization of the strands, in a mostly parallel arrangement. The major representatives of the extended β-sheet-like structures obtained from OPEP are labeled 20OPp-B and 20OPp-C; the globular structures are represented by 20OPp-A (which shows a compact part packed against a more extended sheet) and 20OPp-D, see
After the first MD run (20GRp-B1) the two sheets are oriented anti-parallel to each other, forming a tight and elongated structure (
In the second MD simulation (20GRp-B2), starting from the same initial structure with a different set of velocities, the β-sheet content decreases due to a lower degree of packing of the Tyr side chains and of interdigitation of the Gln and Asn side chains (
This structure is representative of the 20GRp-B2 simulation. It shows the inter-sheet space filled by Asn and Gln side chains, which in some cases interdigitate.
During both 20GRp-B1 and 20GRp-B2 MD simulations, the strands are dynamically interchanged between the two β-sheets.
Two elongated sheets may, however, evolve towards very different supramolecular organizations. In 20OPp-C two elongated β-sheets, with mainly parallel β-strands, are in contact through the terminal Tyr aromatic chains in an extended and non-compact structure (
Starting from 20OPp-A, simulation 20GRp-A1 evolves towards a compact globular structure, in which parts of the ordered β-sheets are lost and strands are dynamically interchanged between sheets. The aromatic Tyr side chains cluster in the hydrophobic core of the structure and Asn side chains align on the surface. Most of the conserved sheets are still in parallel orientation.
Strikingly, in simulation 2 (20GRp-A2) the initial structure evolves to form two twisted antiparallel sheets in which the constitutive strands are parallel to each other. This structure resembles the twisted conformation observed for 20OPp-B and was observed in previous simulations
These results suggest that the sheet organization in the twisted antiparallel conformation(s) may be accessible on the aggregation pathway, once two sheets are formed and docked upon each other. Interestingly, we have observed the formation of elongated, twisted antiparallel structures in MD only in the 20-mer system. The latter appear to evolve preferentially towards globular structures, suggesting that elongated, fibril-like conformations of the oligomers may be accessible only in the presence of a higher number of monomers. At the atomic level, sheet-locking is favored by the packing of Asn and Tyr side-chains. The Tyr aromatic packing and the initial formation of steric-zipper-like structures also provide important contributions in determining the ordering and stabilization of the growing aggregate and, possibly, its evolution to a stable fibril.
The detailed role of side-chains in determining the conformational characteristics of compact aggregate structures was further evaluated by analyzing at atomic resolution a set of diverse OPEP structures: 20OP2-A, 20OP2-B, 20OP2-C, 20OP2-E and 20OP2-N (
The starting structures of the MD simulations for aggregates 20OP2-A, 20OP2-C and 20OP2-E all consist of three extended β-sheets, organized in different tertiary arrangements (see
In the case of 20OP2-C, the starting structure, which consists of three β-sheets lined up and twisted along a common axis, is not stable in the all-atom MD setting. It immediately evolves to a more compact globular structure that, however, does not display specific supramolecular properties or preferential orientations of the strands within the aggregate (
In the case of 20OP2-E, all-atom MD evolution at 300 K leads to a large decrease in the degree of ordered β-structure, resulting in the formation of disordered, amorphous conformations (
The remaining two representative clusters obtained from OPEP simulations display different three-dimensional arrangements. In the case of 20OP2-B, the structure is characterized by parallel β-sheet motifs that form a less compact conformation than the one observed above. All-atom MD evolution leads to a globular structure with a global reorganization of the β-strands (
Finally, we simulated the structure of cluster 20OP2-N at all-atom resolution as this aggregate forms an elongated structure with two facing β-sheets. MD evolution at 300 K for this system shows no reorganization of the β-strands (
Summarizing, as shown in
The all-atom simulations of 100 ns starting from 20OPp-B and 20OP2-N showed the possibility for the aggregates to remain elongated partially ordered oligomers whose structures are reminiscent of the arrangements observed by X-rays of micro-crystals. In order to gain more insights into the stability and conformational evolution properties of these structures, we set out to run all-atom REMD simulations starting from OPEP structures 20OPp-B (
In the REMD simulation labeled 20GRp-B3, and similarly to what is seen in OPEP REMDs, we observe that structures interconvert between compact and elongated conformations with a pair of sheets facing each other. The main representative structures for simulation 20GRp-B3, and their relative stabilities, are reported in
The number identifying each structure represents the cluster rank (1 being the most populated cluster). The value of the GB/SA energy in water of the complex is reported. Arrows represent transitions between clusters, indicating possible paths between cluster structures.
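The cluster ranks and transition arrows described in this caption can be illustrated with a short sketch. It assumes that a clustering step has already assigned one cluster label to each frame of a time-ordered trajectory. The label list and the function names `rank_clusters` and `count_transitions` are hypothetical, and the clustering procedure itself is not reproduced here.

```python
from collections import Counter


def rank_clusters(labels):
    """Map each cluster label to its population rank (1 = most populated).

    Ties are broken by the string form of the label, an arbitrary choice
    made only to keep the ranking deterministic.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda c: (-counts[c], str(c)))
    return {cluster: rank for rank, cluster in enumerate(ordered, start=1)}


def count_transitions(labels):
    """Count direct jumps between different clusters in consecutive frames."""
    transitions = Counter()
    for current, following in zip(labels, labels[1:]):
        if current != following:
            transitions[(current, following)] += 1
    return transitions


# Hypothetical per-frame cluster labels along one trajectory.
frames = ["x", "x", "y", "x", "z", "z", "x"]
ranks = rank_clusters(frames)
for (src, dst), n in count_transitions(frames).items():
    print(f"cluster {ranks[src]} -> cluster {ranks[dst]}: {n} transition(s)")
```

Each non-zero entry of the transition count would correspond to one arrow in a diagram of this kind. Whether an arrow is drawn for a single observed jump or only above some threshold is a choice this sketch does not address.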
Several conformational transitions among different structural clusters are observed, and highlighted by direction arrows in
All-atom REMD simulations were also used to analyze the structural evolution of cluster 20OP2-N. The representative structures obtained from the all-atom simulation 20GR2-N2 are reported in
Overall, we observe that the elongated structure reminiscent of the one observed by the Eisenberg group in the fibril microcrystals is accessible at room temperature but is not stable and can easily interconvert into globular or more disordered conformations, even in the presence of 20 monomers (see
The self-organization process of peptides and proteins into oligomeric soluble and insoluble aggregates that eventually evolve to fibrils is still difficult, if not impossible, to study at atomic resolution using experimental approaches. In this paper, we have carried out an extensive and comprehensive study of the formation of oligomers of the model peptide GNNQQNY under different conditions, combining coarse-grained and all-atom simulation approaches. Different numbers of peptides were used in several simulations. In the smallest systems, composed of three strands, a diverse set of structural motifs is accessible at room temperature. When bigger systems consisting of 12 chains are analyzed, compact and globular structures begin to appear. Interestingly, in some cases, globular oligomers expose hydrophilic side chains to contact with the water solvent, providing a viable model for soluble intermediates that have been observed on the pathway to the formation of the final fibril. In parallel, at 300 K, globular structures exposing a large amount of hydrophobic surface also appear. These may represent possible nuclei for the growth of bigger supramolecular structures.
In simulations conducted using 20 monomers, we have noticed the appearance of elongated structures characterized by the juxtaposition of two mainly parallel β-sheets with partial interdigitation of amidic side chains reminiscent of the zipper-spine observed in fibril microcrystals. It is important to note, however, that these structures are not stable in water solvent and evolve towards more globular conformations. This observation suggests that while ordered fibril-like structures are accessible on the energy landscape, they need further stabilization by establishing contacts with multiple copies of similar structures in order to evolve to a fully fibrillar geometry. In this context, the formation of this geometry would require the constructive interplay of many factors, and the entropic expense of such a process would clearly be very high, explaining the long lag phase times and very slow kinetics of amyloid fibril formation. Moreover, the rich variety of structures and conformational changes observed for the aggregates may also be reflected in the fibril polymorphism observed at the experimental scale.
In summary, our data and structural models represent valid complements to experimental approaches in the attempt to shed light on the supramolecular arrangements of amyloidogenic oligomers, and lead to the following conclusions.
First, the 20-mers of GNNQQNY are polymorphic and endowed with a high degree of structural plasticity. Polymorphism of the fibrillar products of amyloid aggregation has been observed for many sequences by X-ray diffraction and solid-state NMR experiments
Second, the 20-mers of GNNQQNY in explicit water are in dynamic equilibrium, within at least 100 ns, between amorphous structures (high probability), configurations with three β-sheets in various orientations (medium probability) and configurations with two β-sheets (low probability). These two β-sheets, reminiscent of the cross-β structures and the dry steric zipper observed experimentally for mature fibrils, are not parallel, however, suggesting the existence of a free energy barrier preventing the formation of a perfectly packed steric zipper.
Third, there is a reorientation of the β-strands between the GNNQQNY oligomers and fibrils. We find that an anti-parallel β-strand alignment dominates over the parallel one in the 3 and 12 peptide systems. This contradiction with the fibrillar parallel β-strand orientation
Fourth, a common observation is that short amyloid peptide fragments assume antiparallel β-strand geometries whereas longer peptides, and proteins, often assume parallel geometries. Our simulations, along with other recent studies, show that this geometrical property is more complex and depends strongly on the amino acid composition. The dependence of β-strand orientation on oligomer size occurs in the GNNQQNY (Sup35) peptide and the VQIVYK (PH6) peptide, as reported by another computational study
In addition, antiparallel β-sheets allow a higher potential variability of the inter-chain H-bond geometry
Fifth, in terms of experimental relevance, it is important to note that evidence exists showing that aggregation pathways can be manipulated by the use of molecular chaperones. In the case of the Sup35 prion protein, the chaperone Hsp104 catalyzes the polymerization of seeds that are crucial for efficient amyloid formation
On this basis, the hydrophobic-hydrophilic profile of the chaperone interaction surfaces could, for instance, be changed by means of site-directed mutagenesis, affecting their activity and ultimately the properties of the remodeled oligomers. This would allow a rational manipulation of the amyloidogenic pathways, helping to shed light on a very complex biological phenomenon.
A final consideration helpful to put our results in a biological perspective is related to the importance of the knowledge of oligomeric structures in the design of amyloidogenic inhibitors. In this context, we are currently exploring the characterization of the solvent accessible hydrophobic surface area of the 20-mers to guide docking experiments of small-molecule compounds (Congo Red and EGCG in particular), in order to derive possible rules for the rational selection of aggregation inhibitors. Preliminary data and results show that this could be helpful in alleviating the difficulties associated with drug design when dealing with amyloid targets. Indeed, compared to classical drug-design efforts where the target is an active site, with well-defined structure and cavities, the variety of structures, mechanisms and conformational plasticity of oligomers shown here confirm that rational design of aggregation inhibitors is a daunting challenge. However, careful characterization of oligomeric structures provides useful suggestions for the design of possible inhibitors. Selective compounds or peptidomimetics could be designed/selected to target the oligomer conformations characterized by the presence of aromatic groups on their external surface. These compounds would actually target intermediates that are more prone to be insoluble or to favor the addition of monomers through hydrophobic interactions. Interestingly, most of the existing inhibitors of amyloidogenic pathways are small molecules rich in aromatic functionalities, which can target more than one single aggregating species, showing a general mechanism of action
Alternatively, one could design peptidomimetic-based or small molecule chaperones that can stabilize soluble species, subtracting them from the amyloidogenic pathway. This would lead to a redirection of otherwise amyloidogenic peptides into non-amyloidogenic species.
Time evolution of the radius of gyration of the 12-mer oligomers. From top to bottom: structures 12GR1-A, 12GR1-B, 12GR1-C, 12GR1-D and 12GR1-E. The structures shown are the final structures of the all-atom MD simulations with GROMACS.
(TIF)
Time evolution of the radius of gyration of the 20-mer oligomers for the preliminary simulation. Structures 20GRp-B1, 20GRp-B2, 20GRp-C1 and 20GRp-C2. The structures shown are the final structures of the all-atom MD simulations with GROMACS.
(TIF)
Time evolution of the radius of gyration of the 20-mer oligomers for the preliminary simulation. Structures 20GRp-A1, 20GRp-A2, 20GRp-D1 and 20GRp-D2. The structures shown are the final structures of the all-atom MD simulations with GROMACS.
(TIF)
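For reference, the radius of gyration tracked in these supporting figures is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below uses the standard textbook definition in plain Python. It is not the analysis tool used to produce the figures, and the function name `radius_of_gyration` and the coordinate format are assumptions made for illustration.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords is a sequence of (x, y, z) tuples; masses is an optional
    sequence of the same length (equal masses are assumed when omitted).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Center of mass, computed one Cartesian component at a time.
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    ) / total_mass
    return math.sqrt(msd)


# Two equal-mass points one unit on either side of the origin.
print(radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]))
```

A compact globular aggregate gives a smaller value than an elongated one built from the same atoms, which is why the time evolution of this quantity is a convenient way to follow transitions between the two.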