Conceived and designed the experiments: ZZ IM. Performed the experiments: ZZ IM. Analyzed the data: ZZ IM. Contributed reagents/materials/analysis tools: ZZ IM. Wrote the paper: ZZ IM.
The authors have declared that no competing interests exist.
An HD amino acid duplex has been found in the active center of many different enzymes. The dyad plays remarkably different roles in their catalytic processes, which usually involve metal coordination. An HD motif is positioned directly on the amyloid beta fragment (Aβ) and on the carboxy-terminal region of the extracellular domain (CAED) of the human amyloid precursor protein (APP) and a taxonomically well-defined group of APP orthologues (APPOs). In human Aβ, HD is part of a presumed RGD-like integrin-binding motif, RHD; however, neither RHD nor RXD is well conserved in APPOs. The sequences of CAEDs and the position of the HD are not particularly conserved either, yet we show with a novel statistical method using evolutionary modeling that the presence of HD on CAEDs cannot be the result of neutral evolutionary forces (p<0.0001). The motif is positively selected along the evolutionary process in the majority of APPOs, despite the fact that the HD motif is underrepresented in the proteomes of all species of the animal kingdom. Position migration can be explained by the high probability that multiple copies of HD occur on intermediate sequences, of which only one is kept by selective evolutionary forces, similarly to “transcription factor binding site turnover.” The CAEDs of all APP orthologues and homologues, including amyloid-like protein 1 (APLP1) and amyloid-like protein 2 (APLP2), are predicted to bind metal ions. Our results suggest that HDs on the CAEDs are most probably key components of metal-binding domains, which facilitate and/or regulate inter- or intra-molecular interactions in a metal ion-dependent or metal ion concentration-dependent manner. The involvement of naturally occurring mutations of HD (the Tottori (D7N) and English (H6R) mutations) in early-onset Alzheimer's disease gives additional support to our finding that HD has an evolutionarily preserved function on APPOs.
An HD amino acid duplex can be found in the active center of different metalloenzymes. An HD motif is positioned directly on the amyloid beta (Aβ) fragment and on the carboxy-terminal region of the extracellular domain (CAED) of the human amyloid precursor protein (APP) and a taxonomically well-defined group of APP orthologues (APPOs). The conservation of the HD dyad is not position specific and cannot be seen in a multiple alignment. Yet we show with a novel statistical method using evolutionary modeling that the HD motif is positively selected by evolution on APPOs, despite the fact that the HD dyad is underrepresented in the proteomes of all species of the animal kingdom. The CAEDs of all APP orthologues and homologues, including amyloid-like protein 1 (APLP1) and amyloid-like protein 2 (APLP2), are predicted to bind metal ions. Our results suggest that HDs on the APPOs are most probably key components of metal-binding domains, which facilitate and/or regulate inter- or intra-molecular interactions in a metal ion-dependent or metal ion concentration-dependent manner. The involvement of naturally occurring mutations of HD (Tottori (D7N) and English (H6R)) in early-onset Alzheimer's disease gives additional support to our finding that HD has an evolutionarily preserved function on APPOs.
The human amyloid precursor protein (APP) gene was brought to the forefront of scientific interest in the late 1980s, when protein sequencing of the major component of the amyloid plaques, the amyloid β peptide (Aβ), implicated APP in the development of Alzheimer's disease (AD)
APP is a Type-I transmembrane protein with a complex domain organization. So far, eight domains have been identified on mammalian APPs: the growth factor-like domain, the copper-binding domain, the Kunitz-type protease inhibitor (KPI) domain, the OX2 domain, the glycosylated E2 domain, the unstructured carboxy-terminal region of the APP extracellular portion, the transmembrane domain, and the short cytoplasmic tail that is involved in transcriptional signaling
The human APP gene is ubiquitously expressed, not only in glial and neuronal cells but also in almost all tissues that have been examined. The pre-mRNA contains 19 exons and is alternatively spliced to produce several isoforms. In the brain, APP695 is the major isoform; compared to the longer versions, it lacks the KPI and OX2 domains on its extracellular portion
APP is localized to many membranous compartments within the cell. It travels through the endoplasmic reticulum and the Golgi apparatus to reach the cell membrane, from which it is re-internalized into the lysosomes. During this journey the majority of APP is processed on the cell surface by α-secretases, which results in the membrane-bound C83 fragment and the soluble APPsα fragment. C83 is cleaved further in its transmembrane region by γ-secretases, which leads to the release of the P3 fragments (Aβ17–40/42) and the APP intracellular domain (AICD) into the intercellular space and the cytosol, respectively. However, a minority of APP may be processed on the amyloid pathway by the β-secretase complex, which generates APPsβ and a version of C83 that is 16 amino acids longer, the so-called C99 fragment; to a lesser extent, it can also cleave within the Aβ domain between Tyr10 and Glu11. The cleavage of the β-secretase-generated fragments by γ-secretase leads to the release of the AICD into the intracellular compartment and to the generation of Aβ1–40, the more neurotoxic Aβ1–42, and Aβ11–40/42 (see
The predicted transmembrane domains are shown in red. Of the CAEDs, only the regions homologous to the predicted metal-binding site of the human APP are shown. The HD dyads are highlighted in gray. Cleavage sites of the human α-, β-, and γ-secretases are marked by arrows.
Several hypotheses have sprung up to explain Aβ's contribution to the etiology of AD, including the amyloid cascade
An HD amino acid duplex has been found in the active center of many different enzymes. Most of these are metalloenzymes like phospholipase A2
In α-secretases, which are members of the adamalysin/ADAM metalloproteinase family, the fully conserved Asp-416 is involved in intramolecular hydrogen bond interactions and directly follows the last histidine of the zinc-binding consensus motif HEXXHXXGXXH
In cycJ/ccmE proteins, the H of the conserved HD covalently binds and releases the heme prosthetic group
We have identified an HD dyad on the Aβ domain of the mammalian APP proteins. Although position-specific conservation is not observed, we show with a novel statistical method using evolutionary modeling that the motif is positively selected along the evolutionary process in the majority of the APP orthologues (APPOs), despite the fact that no other sequence conservation can be recognized on the carboxy-terminal region of the APPOs' extracellular domains (CAEDs). In addition, we show that HD dyads are underrepresented in the proteomes of various organisms, which further supports the hypothesis that the prevalence of HD in CAEDs is the result of evolutionary selection rather than arbitrary events. The conservation of HD in CAEDs strongly suggests a functional role for this motif, which most likely involves metal coordination or chelation.
APP orthologues have been collected from the NCBI protein databank with the Blast-P program. The following proteins have been found: Acyrthosiphon pisum, XP_001947569.1; Culex quinquefasciatus, XP_001864483.1; Brugia malayi, XP_001899252.1; Caenorhabditis briggsae, XP_002644641.1; Loligo pealei, ABI84193.2; Aplysia californica, AAT07668.3; Aedes aegypti, EAT42567.1; Drosophila simulans, EDX16764.1; Drosophila yakuba, EDX00795.1; Anopheles gambiae str. PEST, XP_312126.4; Nematostella vectensis, EDO45291.1; Drosophila willistoni, XP_002067462.1; Drosophila persimilis, XP_002027785.1; Drosophila pseudoobscura pseudoobscura, XP_001354498.2; Drosophila virilis, XP_002055698.1; Drosophila grimshawi, XP_001992447.1; Drosophila erecta, XP_001982404.1; Drosophila ananassae, XP_001966309.1; Nasonia vitripennis, XP_001601635.1; Culex quinquefasciatus, XP_001864483.1; Pediculus humanus corporis, XP_002426948.1; Manduca sexta, AAY25024.2; Rattus norvegicus, NP_062161.1; Mus musculus, Q53ZT3; Monodelphis domestica, XP_001373948.1; Equus caballus, XP_001499900.2; Sus scrofa, ABB82034.1; Gallus gallus, AAG00594.1; Canis lupus familiaris, AAX81908.1; Macaca fascicularis, BAD51938.1; Ailuropoda melanoleuca, XP_002920108.1; Oryctolagus cuniculus, XP_002716819.1; Pan troglodytes, AAV74286.1; Callithrix jacchus, XP_002761374.1; Stenella coeruleoalba, AAX81912.1; Xenopus (Silurana) tropicalis, AAH75266.1; Xenopus laevis, AAH70668.1; Pongo abelii, NP_001127014.1; Cricetulus griseus, AAB86608.1; Chelydra serpentina serpentina, AAN04908.1; Apis mellifera, XP_624124.3; Ixodes scapularis, XP_002400744.1; Schistosoma mansoni, CAZ32701.1; Hydra magnipapillata, XP_002154415.1; Neohelice granulata, ACO59955.1; Paracentrotus lividus, CN53783.1; Strongylocentrotus purpuratus, XP_790315.2; Saccoglossus kowalevskii, P_002741027.1; Branchiostoma floridae, XP_002613121.1; Narke japonica, BAA24230.1; Takifugu rubripes, O93279.1; Tetraodon fluviatilis, O73683.1; Danio rerio, NP_690842.1; Tetraodon nigroviridis, CAG05838.1.
The retrieved proteins were globally aligned with MultAlin
The tree was calculated from region 1–70 of the alignment shown in
We implemented a program in the Java 1.6 language that takes an evolutionary tree and a sequence labeling its root, and evolves the sequences on the tree according to a substitution model represented with a continuous time Markov model. The Markov model is given by its rate matrix,
Each site in the sequence is evolved independently of the other sites, and the evolution on each branch of the evolutionary tree is also independent of the evolution on other branches. We took the evolutionary tree generated by SplitsTree 4.0, put the
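The site-by-site, branch-by-branch simulation described above can be illustrated with a minimal Python sketch. It is not the Java program used in this study. As a loudly stated simplifying assumption, it replaces the rate matrix with an equal-rates model over the 20 amino acids (one expected substitution per site per unit branch length), for which the transition probabilities have a closed form. The tree encoding, the function names, and the toy root sequence are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def change_probability(branch_length):
    """Probability that a site differs at the end of a branch.

    ASSUMPTION: equal-rates continuous-time Markov model with total
    substitution rate 1 per site, so
    P(change) = (k-1)/k * (1 - exp(-k/(k-1) * t)) with k = 20.
    """
    k = len(AMINO_ACIDS)
    return (k - 1) / k * (1.0 - math.exp(-k / (k - 1) * branch_length))


def evolve_along_branch(sequence, branch_length, rng):
    """Evolve every site independently along one branch."""
    p_change = change_probability(branch_length)
    evolved = []
    for residue in sequence:
        if rng.random() < p_change:
            # Under equal rates, every other amino acid is equally likely.
            residue = rng.choice([a for a in AMINO_ACIDS if a != residue])
        evolved.append(residue)
    return "".join(evolved)


def simulate_on_tree(tree, node, sequence, rng):
    """Return {leaf name: simulated sequence} for the subtree below `node`.

    `tree` maps a node name to a list of (child name, branch length) pairs;
    nodes that are absent from the mapping are treated as leaves.
    """
    children = tree.get(node, [])
    if not children:
        return {node: sequence}
    leaves = {}
    for child, branch_length in children:
        # Each branch evolves independently from the parent's sequence.
        child_sequence = evolve_along_branch(sequence, branch_length, rng)
        leaves.update(simulate_on_tree(tree, child, child_sequence, rng))
    return leaves


# Hypothetical toy tree and root sequence, for illustration only.
toy_tree = {"root": [("A", 0.1), ("inner", 0.05)],
            "inner": [("B", 0.2), ("C", 0.2)]}
simulated = simulate_on_tree(toy_tree, "root", "DAEFRHDSGY", random.Random(1))
```

Repeating the last call with different random seeds yields independent simulated sequence sets. A realistic analysis would substitute an empirical amino acid rate matrix for the equal-rates assumption.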
The occurrences of every amino acid dyad in the generated sequences were counted, and for each dyad the distribution of sequence sets according to the number of sequences containing that dyad was calculated. The observed number of CAEDs with HD dyads (41 altogether) was compared with the empirical distribution obtained for the HD dyad from the 10 000 simulated sequence sets to test the following hypothesis:
The
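The comparison of the observed count with the simulated distribution can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the one-sided empirical p value is taken as the fraction of simulated sequence sets that contain at least as many dyad-bearing sequences as observed.

```python
def count_sequences_with_dyad(sequences, dyad="HD"):
    """Number of sequences in one set that contain the dyad at least once."""
    return sum(1 for sequence in sequences if dyad in sequence)


def empirical_p_value(simulated_counts, observed_count):
    """Fraction of simulated sets with a count >= the observed count.

    ASSUMPTION: a one-sided test; a result of 0.0 only means that the
    p value is below 1 / len(simulated_counts).
    """
    if not simulated_counts:
        raise ValueError("at least one simulated set is required")
    at_least_as_extreme = sum(1 for c in simulated_counts if c >= observed_count)
    return at_least_as_extreme / len(simulated_counts)


# Toy usage with made-up counts (not data from this study).
toy_counts = [0, 1, 2, 3]
toy_p = empirical_p_value(toy_counts, observed_count=2)
```

With 10 000 simulated sets, a p value of 0.0 from this function would be reported as p<0.0001.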
We downloaded the UniProt/Swiss-Prot database, 2011-05-31 release (
For secondary structure prediction the PSIPRED
Jpred also runs a neural network predictor, Jnet v3.0, which combines PSI-BLAST position-specific scoring matrices with hidden Markov model profiles and achieved a secondary structure prediction score of 81.5% in blind experiments. In a validation test the two programs produced largely overlapping results; however, on a portion of the data set one program or the other gave a more accurate prediction
Metal-binding was predicted by the SVMProt server
MetalloPred classifies proteins from sequence-derived features (such as amino acid composition, physicochemical properties and pseudo-amino acid composition) by using a three-level cascade of neural networks. The first layer of the cascade identifies metalloproteins, the second layer assigns the main functional class (e.g. transition metal), and the third layer identifies the bound metal (e.g. zinc). The accuracy of the program at the first level is reported to be >80%, while the overall accuracy for correct metal recognition is higher than 60%.
The SVMProt server runs support vector machine prediction systems to predict metal-binding proteins in 10 metal-binding classes (e.g. sodium-binding and zinc-binding). It recognized metal-binding domains and multi-domain metal-binding proteins with more than 80% accuracy in validation tests.
Screening of protein databases revealed an HD dyad on the mammalian APP proteins. The motif is positioned directly on the amyloid beta fragment of human APP (amyloid beta conventional numbering H6, D7; APP conventional numbering H677, D678) and it seems to be conserved not only among mammals but also among four-legged vertebrates (Tetrapoda). No APLP1 or APLP2 orthologue of any animal contains the motif in its functionally homologous region (data not shown).
To clarify the functional and evolutionary significance of the HD motif in APP proteins, first the available APPOs were collected from protein databanks and their CAEDs and the transmembrane (TD) domains were aligned (
Despite the lack of conservation in the animal kingdom, the majority of the CAEDs contain an HD dyad in their last 70 amino acid region (CAEDC70). Despite an intensive search, we were unable to find any local conservation around the dyad or a conserved amino acid pattern in the CAEDs that would include the HD motif. In human Aβ HD is part of a presumed RGD-like integrin-binding motif RHDS
Dissimilar sequences frequently share similar secondary structures and folds
The presence of HD on different APPOs is far from arbitrary and shows a progressive taxonomic distribution in certain animal groups. Though it appears first in the Anthozoa class of the Cnidaria phylum, it is absent from the APPOs of the Hydrozoa class and of the Platyhelminthes and Echinodermata phyla. Besides tetrapods, it can be found in all the available APPOs of insects, mollusks and nematodes, and it is completely missing from the members of the related primordial taxa of the Deuterostomia and Arthropoda lineages, such as cephalochordates, bony and cartilaginous fishes, crustaceans and chelicerate arthropods (
First we examined whether the frequent occurrence of the HD on CAEDs could be the result of overrepresentation of HD in the biota and most importantly in the animal proteomes.
The log-odds of the HD motif as defined in Equation (5) in the vast majority of investigated proteomes resulted in negative values. This indicates that the HD motif is underrepresented in the biota in general, as well as in specific species. Among single-cell organisms HD frequency fluctuates and the log-odds can even take positive values, while in multi-cell organisms, regardless of their taxonomical place, it always remains negative. Based on the presently available data, there seems to be an inverse correlation between the log-odds and the evolutionary development of the phyla in the animal kingdom. In the investigated species the values decrease from primitive Bilateria towards modern Bilateria, and they reach the minimum in vertebrates (
Orange, green and blue columns represent prokaryotic, single-cell eukaryotic and multicellular eukaryotic organisms, respectively.
On the other hand, the log-odds value of the HD motif in the CAEDC70s is 1.355. That is, the HD motif is overrepresented in the CAEDC70s: it occurs much more frequently than would be expected if the amino acids were distributed independently.
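Equation (5) is not reproduced in this section, so the following Python sketch shows only one common way to score dyad over- or underrepresentation; it should not be read as the exact formula used here. It assumes the score is the natural logarithm of the observed dyad frequency divided by the product of the two single-residue frequencies. The function name and the toy input are hypothetical, and the logarithm base is an assumption, so values computed this way need not match those reported in the text.

```python
import math
from collections import Counter


def dyad_log_odds(sequences, dyad):
    """Log-odds of a dyad against independent residue frequencies.

    ASSUMPTION: score = ln( f(ab) / (f(a) * f(b)) ), where f(ab) is the
    fraction of adjacent residue pairs equal to the dyad and f(a), f(b)
    are single-residue frequencies over the same sequences.
    """
    first, second = dyad
    residue_counts = Counter()
    dyad_count = 0
    pair_positions = 0
    for sequence in sequences:
        residue_counts.update(sequence)
        pair_positions += max(len(sequence) - 1, 0)
        dyad_count += sum(
            1 for i in range(len(sequence) - 1) if sequence[i:i + 2] == dyad
        )
    if dyad_count == 0:
        # The dyad never occurs: maximally underrepresented.
        return float("-inf")
    total = sum(residue_counts.values())
    expected = (residue_counts[first] / total) * (residue_counts[second] / total)
    observed = dyad_count / pair_positions
    return math.log(observed / expected)


# Toy usage: a positive score means the dyad is overrepresented.
toy_score = dyad_log_odds(["HDHD"], "HD")
```

A negative score indicates that the dyad occurs less often than independence of the two residues would predict, which is the sense in which underrepresentation is used above.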
We have also developed a computer program to study whether neutral evolutionary forces are able to produce such overrepresentation of HDs on the CAEDs. The program takes an evolutionary tree (e.g. the tree is shown in
As an input tree, first the evolutionary tree of the CAEDC70s was chosen (generated from region 1–70 of the alignment (see
As shown in
Every column represents a group of sequence sets with a certain number (0–41) of HD-containing sequences. Numbers on the X axis correspond to the number of HD-containing sequences in the group. Groups with 39–41 HD-containing sequences have no members and are not indicated on the X axis. Labels on the Y axis show the size of the groups as a percentage of the total 10 000 sets.
Therefore we conclude that we can reject the neutral evolution hypothesis with a
Though a large number of CAEDC70 were tried as input (among them the
The lack of position-specific conservation raises the question of how the HD dyad can be kept by a selective evolutionary force without position-specific conservation. In our computer simulations, more than 20% of the simulated sequence sets contained at least one sequence with 2 or more HD dyads, hence, we conclude that the probability for the emergence of a new HD dyad is relatively high. If the evolutionary force is only for maintaining at least one of the HD dyads, then the deletion of the old HD dyad will not be prevented by selection just as it happened in the majority of the modern sequences.
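The proportion of simulated sequence sets that contain at least one sequence with two or more HD dyads can be tallied as in the sketch below. The function names and toy data are hypothetical, and the sketch is not the program used in this study.

```python
def has_multi_dyad_sequence(sequences, dyad="HD", minimum=2):
    """True if any sequence in the set carries `minimum` or more dyads.

    str.count counts non-overlapping matches; this is exact for HD, which
    cannot overlap itself, but would undercount dyads such as EE.
    """
    return any(sequence.count(dyad) >= minimum for sequence in sequences)


def fraction_of_sets_with_multi_dyad(sequence_sets, dyad="HD", minimum=2):
    """Fraction of sequence sets with at least one multi-dyad sequence."""
    if not sequence_sets:
        raise ValueError("at least one sequence set is required")
    hits = sum(
        1 for sequences in sequence_sets
        if has_multi_dyad_sequence(sequences, dyad, minimum)
    )
    return hits / len(sequence_sets)


# Toy usage with made-up sequence sets (not data from this study).
toy_sets = [["HDHD"], ["HD"], ["AA"], ["HDAAHD", "HD"]]
toy_fraction = fraction_of_sets_with_multi_dyad(toy_sets)
```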
To test further the hypothesis that HD motifs might appear by random mutations, we repeated the simulation of neutral evolution, but now we put the
From a structural point of view, an important recognizable common feature of the CAEDC70s is that, despite the lack of sequence similarity, they are rich in metal-coordinating amino acids (His, Glu, Gln, Asp, Asn, Tyr, Ser, Thr, Arg, Lys). Certain combinations of these amino acids, such as EE, ED and DD, are also relatively frequent on CAEDC70s (they occur on 34, 37 and 17 protein segments of the 53 CAEDs, respectively), though their occurrences stay below that of the HD. These dyads are also able to coordinate metal ions. EE was reported as part of manganese and nickel coordinating motifs
Taken together, these data indicate that H and D may be involved in metal coordination not only in human Aβ but also on APPOs, and their evolutionary selection can be related to this function.
Investigation of the CAEDs of evolutionarily distant animals revealed that, despite the low sequence similarity, the majority of them contain an HD dyad and their membrane-proximal 60–70 amino acid regions are predicted to bind metals. The HD-containing CAEDs belong to the species of well-defined taxonomic groups such as tetrapods, insects, mollusks and nematodes. We have shown using an evolution model system that although HD is negatively selected in the proteome of different animals, the presence of the HD dyad on CAEDs is most likely the result of positive selection. We want to emphasize that the conservation of the HD dyad is not position specific; hence its conservation cannot be seen in a multiple alignment. However, the positive evolutionary selection of HD has been demonstrated by statistical testing of the sequences. Under the neutral evolution hypothesis, namely, assuming no selection force for maintaining the HD dyads, the probability of the observed abundance of the HD dyads is less than 0.0001.
Computer simulations showed that the emergence of an HD dyad in the CAED sequences is likely even under neutral evolution. Two of the CAEDs contain more than one HD dyad. According to the simulation, the probability of observing such multiple occurrences in at least one of the modern sequences is over 20% even in the case of neutral evolution. The probability that at least one of the evolving intermediate sequences contained multiple dyads is even higher. Without a selection force, these newly appearing HD dyads could mutate, so the sequence could lose at least one, or even all, of them. With a selection force, at least one of the HD dyads is preserved in the sequence. However, the preserved dyad might be the one that appeared by random mutation, and the older HD dyad might be deleted. This scenario could explain why we see HD dyads in the majority of sequences without site-specific conservation. Similar migration of functional elements has already been described for transcription factor binding sites in DNA promoter regions as the so-called binding site turnover
The evolutionary selection of HD strongly suggests a functional role of the motif in APPOs, which is probably related to metal coordination. As the HD motif is part of the extended catalytic network of several enzymes, the contribution of the CAEDs' HD to the formation of a catalytic center through inter- or intramolecular interaction cannot be excluded. However, the lack of conservation in the vicinity of HD, the variable position of the motif on the CAEDs and the multiple-copy occurrences in certain proteins make this unlikely. The same arguments that oppose enzymatic activity, together with the fact that the membrane-proximal regions of the CAEDs seem to have metal coordination capabilities, rather suggest that HDs on CAEDs could be key components of metal ion-coordinating domains which facilitate and/or regulate inter- or intramolecular interactions in a metal ion-dependent or metal ion concentration-dependent manner. This notion is supported by the findings that Aβ has metal-binding capability
APP and its derivatives interact with a large number of proteins. Aβ binds its homologous sequence on APP and also facilitates the oligomerization of the β-secretase-cleaved APP C-terminal fragment, C99
If HD is involved in any molecular interaction, which influences APP processing or has any other function that influences the development of AD, then certain mutations of HD may facilitate the manifestation of AD. In fact, there are reports supporting the biological significance of these amino acids in the development of AD. Naturally occurring mutations of HD are involved in the early onset of familial AD in cases from Japan (D7N)
Although a substantial amount of data has already accumulated about the pathogenesis and development of AD, finding a cure may require greater knowledge about the physiological role of APP. We hope that our results can stimulate new investigations and contribute to a better understanding of APP's involvement in the development of AD.
Secondary structure predictions of selected CAEDC70s.
(PDF)
The log-odds values of the amino acid dyads in the proteomes of several organisms from the Biota. Table A shows the number of amino acids in the proteomes; Table B shows the log-odds values, which are calculated by Equation (5). The first amino acids of the dyads are represented on the vertical axis, while the second amino acids are represented on the horizontal axis.
(PDF)
The p values of the different amino acid dyads in the neutral evolution simulation. The values of Panels A, B, and C were calculated from different regions (1–70, 10–54 and 1–96, respectively) of the alignment shown in
(PDF)
Metal-binding prediction results on the CAEDs of APPOs.
(PDF)