Supporting Information for Hadzipasic, et al. A Horizontal Alignment Tool for Numerical Trend Discovery in Sequence Data.
Text S1: Significant HePCaT matches are robust to different hydrophobicity scales.
Introduction
The purpose of this Supporting Material is to document the robustness of two significant HePCaT matches discovered using the Kyte-Doolittle Hydropathy scale [1] with respect to other common, yet diverse, hydrophobicity scales. The original matches reported in the main text were between human adenosine receptor A2a (gi|5921992) and human taste receptor type 2, member 19 (gi|28882035), and between the pore-forming domain of E. coli colicin A (SCOP [2] domain d1cola_) and ORFan protein [3] TC0624 from C. muridarum (gi|7190664). These matches are displayed in Figures 4 and 6 of the main text, respectively.
Materials and Methods
These two pairwise comparisons were repeated using hydrophobicity profiles made from four scales other than Kyte-Doolittle. These scales were subjectively selected based on visual inspection of the comprehensive clustering of most known hydrophobicity scales, shown in Figure 4 of AAindex (http://www.genome.jp/aaindex) [4]. The selected scales, unlike the Kyte-Doolittle scale, are centrally located in said clustering, but are nonetheless dispersed from each other and from the Kyte-Doolittle scale. These criteria were used in an attempt to bias the comparisons away from those reported in the main text, thereby testing whether the matches were strongly dependent on one particular scale.
As described, these profiles were all constructed using a 15-residue averaging window. The scales were taken from the literature and were derived either from different experimental methods or from computational analysis of protein structures. For visual comparison to the Kyte-Doolittle scale, the signs of the published values were uniformly negated where necessary. The values used in these calculations are given in Table S1.
Table S1. Four diverse hydrophobicity scales.
Amino Acid   Radzicka-Wolfenden [5] (a)   Nozaki-Tanford [6,7] (a)   Fauchere-Pliska [8] (a)   Rose, et al. [9]
W            -1.39                        -3.4                       -2.25                     0.85
F            -2.04                        -2.5                       -1.79                     0.88
Y             1.08                        -2.3                       -0.96                     0.76
M            -1.41                        -1.3                       -1.23                     0.85
L            -3.98                        -1.8                       -1.70                     0.85
I            -3.98                        -1.8                       -1.80                     0.88
V            -3.10                        -1.5                       -1.22                     0.86
A            -0.87                        -0.5                       -0.31                     0.74
C            -0.34                        -1.0                       -1.54                     0.91
G             0.0                          0.0                        0.0                      0.72
P             0.0                         -1.4                       -0.72                     0.64
T             3.51                        -0.4                       -0.26                     0.70
S             4.34                         0.3                        0.04                     0.66
N             7.58                         0.2                        0.60                     0.63
Q             6.48                         0.2                        0.22                     0.62
D             9.66                         2.5                        0.77                     0.62
E             7.75                         2.5                        0.64                     0.62
H             5.60                        -0.5                       -0.13                     0.78
R            15.86                         3.0                        1.01                     0.64
K             6.49                         3.0                        0.99                     0.52
(a) The numerical values for these three scales are taken from Creighton [10].
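The window averaging described above can be illustrated with a minimal Python sketch. It uses the Rose, et al. column of Table S1 and a 15-residue window. The function name, the example sequence, and the treatment of the termini (only complete windows are averaged) are assumptions made for illustration and are not taken from the HePCaT implementation.

```python
# Rose, et al. values copied from Table S1.
ROSE_SCALE = {
    "W": 0.85, "F": 0.88, "Y": 0.76, "M": 0.85, "L": 0.85,
    "I": 0.88, "V": 0.86, "A": 0.74, "C": 0.91, "G": 0.72,
    "P": 0.64, "T": 0.70, "S": 0.66, "N": 0.63, "Q": 0.62,
    "D": 0.62, "E": 0.62, "H": 0.78, "R": 0.64, "K": 0.52,
}


def windowed_profile(sequence, scale, window=15):
    """Average scale values over every complete window of `window` residues.

    Assumption (not stated in the text): incomplete windows at the termini
    are skipped, so the profile has len(sequence) - window + 1 points.
    """
    values = [scale[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Hypothetical 20-residue sequence, used only to exercise the function.
example = "MKTLLVAGGFWYCPSNQDEH"
profile = windowed_profile(example, ROSE_SCALE)
print(len(profile))  # 20 - 15 + 1 = 6 windows
```

Passing a different dictionary as `scale` produces the profile for another column of Table S1 without changing the averaging step.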
Results and Discussion
The effects of different hydrophobicity scales were first assessed by comparing how they affected the profiles of each protein individually. This assessment was quantified using Pearson correlations [11] of the average hydrophobicity values at each position in the protein, computed over all possible pairs of scales. The results are displayed in Table S2, and suggest that the Rose, et al. scale is most highly correlated with Kyte-Doolittle, while the Nozaki-Tanford scale is among the least correlated. However, the lowest correlation coefficient in Table S2 is 0.75, indicating that all scales contain similar information to Kyte-Doolittle when the averaging over 15 residues is performed. Examples of this worst correlation, as well as the best, are given in Figure S1 to give a visual sense of the differences in information content.
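The position-by-position comparison of two profiles can be sketched as follows. This is a plain-Python Pearson correlation written for illustration. The function name and the toy profiles are assumptions, and the sketch is not the code used to produce the reported values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / math.sqrt(spread_x * spread_y)


# Toy profiles (hypothetical values): the second is a linear rescaling of
# the first, so the two carry the same positional information.
profile_a = [0.1, 0.4, 0.3, 0.8, 0.6]
profile_b = [2 * value + 1 for value in profile_a]
r = pearson(profile_a, profile_b)
```

Because two scales applied to the same protein with the same window yield profiles of equal length, the coefficient can be computed directly over matching positions.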
Table S2. Pearson correlation coefficients (R²) between Kyte-Doolittle hydropathy profile and profiles computed from other hydrophobicity scales for identical proteins.
Protein Name                            Radzicka-Wolfenden   Nozaki-Tanford   Fauchere-Pliska   Rose, et al.
Human A2a Receptor                      0.88                 0.75             0.85              0.90
Human Taste Receptor Type 2 Member 19   0.95                 0.79             0.89              0.92
E. coli Colicin                         0.92                 0.77             0.86              0.91
ORFan TC0624                            0.93                 0.89             0.97              0.97
Figure S1. Examples of best and worst correlations between different hydrophobicity scales suggest little loss of hydropathy profile information.
Next, the optimal HePCaT matches described in the main text were recomputed using the different profiles. Importantly, all HePCaT parameters were unchanged in these calculations except for the scale, and the calculations were executed on the public site at http://www.best.bio.jhu.edu/HePCaT. In all cases, the results indicated that the identical regions matched using the Kyte-Doolittle scale were also matched when different scales were used (Figure S2). The identical regions were matched regardless of the strength of the correlation of the aligned positions (Table S3). Table S3 shows that, although the Kyte-Doolittle scale happens to exhibit the nominally most similar matches, other scales, such as Rose, et al. and Radzicka-Wolfenden, approach that quality. These results are interpreted as evidence for the robustness of location and quality of optimal HePCaT matches, with respect to the exact details of hydrophobicity scale used.
Figure S2. Examples of best and worst optimal matches using different hydrophobicity scales demonstrate that these matches are independent of scale.
Table S3. Pearson correlation coefficients (R²) between HePCaT optimally aligned positions using different hydrophobicity scales.
Protein Pair                                                   Radzicka-Wolfenden   Nozaki-Tanford   Fauchere-Pliska   Rose, et al.   Kyte-Doolittle
Human A2a Receptor vs. Human Taste Receptor Type 2 Member 19   0.86                 0.84             0.85              0.87           0.92
E. coli Colicin vs. ORFan TC0624                               0.92                 0.79             0.90              0.93           0.96
References
1. Kyte J, Doolittle RF (1982) A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology 157: 105-132.
2. Andreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, et al. (2008) Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 36: D419-425.
3. Yomtovian I, Teerakulkittipong N, Lee B, Moult J, Unger R (2010) Composition bias and the origin of ORFan genes. Bioinformatics 26: 996-999.
4. Tomii K, Kanehisa M (1996) Analysis of amino acid indices for prediction of protein structure and function. Protein Engineering 9: 27-36.
5. Radzicka A, Wolfenden R (1988) Comparing the polarities of the amino acids: side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution. Biochemistry 27: 1664-1670.
6. Nozaki Y, Tanford C (1971) The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. Establishment of a hydrophobicity scale. Journal of Biological Chemistry 246: 2211-2217.
7. Levitt M (1976) A simplified representation of protein conformations for rapid simulation of protein folding. Journal of Molecular Biology 104: 59-107.
8. Fauchere J, Pliska V (1983) Hydrophobic parameters pi of amino acid side chains from the partitioning of N-acetyl-amino-acid amides. European Journal of Medicinal Chemistry 18: 369-375.
9. Rose GD, Geselowitz AR, Lesser GJ, Lee RH, Zehfus MH (1985) Hydrophobicity of amino acids in globular proteins. Science 229: 834-838.
10. Creighton TE (1993) Proteins: Structures and Molecular Properties. New York: W.H. Freeman and Company.
11. Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in C: the art of scientific computing. New York: Cambridge University Press.