Conceived and designed the experiments: SB BP. Performed the experiments: SB. Analyzed the data: SB. Wrote the paper: SB.
BOP and UCSD have a financial interest in Genomatica, Inc. The findings in this manuscript may not benefit Genomatica, Inc.
Reconstructions of cellular metabolism are publicly available for a variety of different microorganisms and some mammalian genomes. To date, these reconstructions are “genome-scale” and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence. Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type. Methods to tailor these comprehensive genome-scale reconstructions into context-specific networks will aid predictive in silico modeling for a particular situation. We present a method called Gene Inactivity Moderated by Metabolism and Expression (GIMME) to achieve this goal. The GIMME algorithm uses quantitative gene expression data and one or more presupposed metabolic objectives to produce the context-specific reconstruction that is most consistent with the available data. Furthermore, the algorithm provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective. We show that this algorithm produces results consistent with biological experiments and intuition for adaptive evolution of bacteria, rational design of metabolic engineering strains, and human skeletal muscle cells. This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available.
Systems biology aims to characterize cells and organisms as systems through the careful curation of all components. Large models that account for all known metabolism in microorganisms have been created by our group and by others around the world. Furthermore, models are available for human cells. These models represent all possible biochemical reactions in a cell, but cells choose which subset of reactions to use to suit their immediate purposes. We have developed a method to combine widely available gene expression data with presupposed cellular functions to predict the subset of reactions that a cell uses under particular conditions. We quantify the consistency of subsets of reactions with existing biological knowledge to demonstrate that the method produces biologically realistic subsets of reactions. This method is useful for determining the activity of metabolic reactions in
Commonly, genome-scale metabolic networks are reconstructed to contain all known metabolic genes and reactions in a particular organism
Knowledge of transcriptional regulation of metabolism comes from different sources. At a low level, from the bottom up, some of the regulatory proteins that control the transcription of sets of metabolic genes are known
There are three ways to study how regulation tailors gene expression under a specific condition. First, if a transcriptional regulatory network (TRN) is available, then the transcription state of the cells can be computed for a given input
The third approach to studying regulation relies on the availability of expression profiling data. If such data is available for the conditions being examined, we can directly examine the expression of the ORFs accounted for in a genome-scale reconstruction. Metabolic network reconstructions can be combined with gene expression data from different states to identify regulatory principles in organisms
The results of these methods are dependent on the quality of the expression data that is used as input. Expression data is known to be noisy, and the variety of methods for converting the fluorescence intensity of thousands of spots on a chip to semi-quantitative readings of mRNA molecule counts do not produce equivalent results
Here we use gene expression data in combination with objective functions to create functional models despite potentially noisy data. We describe the use of genome-scale transcriptomic data to constrain reactions in both bacteria and human cells, enabling context-specific metabolic networks to be reconstructed and compared. We quantitatively define the consistency of gene expression data with assumed functional states of a cell, demonstrating agreement with physiological data. Context-specific metabolic networks will be virtually essential to accurately model human metabolism due to the variety of cell types and their corresponding metabolic processes.
The approach to the construction of context-specific metabolic networks is termed Gene Inactivity Moderated by Metabolism and Expression (GIMME) and is illustrated in
The GIMME algorithm takes three inputs: gene expression (or any other data type) mapped to reactions, a metabolic reconstruction, and one or more required metabolic functionalities (RMFs). The metabolic reconstruction is filtered through the data set: reactions that the data mark as unavailable are removed, creating a reduced model. Reactions are reinserted into the reduced model as needed to achieve the RMFs (such as growth and/or ATP production), resulting in a functional, context-specific model that features minimal disagreement with the data. The inconsistency score quantifies the disagreement with the data as the minimal sum of fluxes, each weighted by the deviation of its reaction's data from the threshold.
Simply speaking, reactions that correspond to mRNA transcript levels below a specified threshold are tentatively declared inactive. If the cell cannot achieve the desired functionality without at least one of these reactions, linear optimization is used to find the most consistent set of reactions to reactivate. Inconsistency scores are calculated based on the product of distance from threshold and necessary flux for each reaction required to be reactivated, as illustrated in
Find the maximum possible flux through each RMF (allowing usage of all reactions).
Constrain the RMFs to operate at or above some minimum level (generally a percentage of the maximum found in [A]) and identify the set of available reactions that best fit a quantitative data set.
Inconsistency scores for each reaction are computed by multiplying the deviation from a threshold by the required flux through a reaction. In the example here, the green reactions have data above the threshold, set to 12 (this is a parameter; see text). The red reactions have data below the threshold (11.4 and 8.2). The calculation of the inconsistency score corresponding to each reaction is shown numerically as flux multiplied by the deviation from the cutoff. They each increase the inconsistency score, implying that the data are less consistent with the objective of growing on lactate. Greater required fluxes and greater deviation from the threshold both increase the inconsistency scores. The total inconsistency score is the sum of all individual reaction scores.
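The score calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Matlab implementation: the function name, the reaction identifiers, and the flux values are hypothetical, while the cutoff of 12 and the expression values 11.4 and 8.2 are taken from the example. Reactions without data are skipped, which matches treating them as if they were above the cutoff.

```python
def inconsistency_score(fluxes, expression, cutoff):
    """Sum of |flux| * (cutoff - data) over reactions whose data fall below the cutoff."""
    total = 0.0
    for rxn, flux in fluxes.items():
        value = expression.get(rxn)  # None when no data are mapped to the reaction
        if value is None:
            continue  # absent data incur no penalty
        deviation = max(cutoff - value, 0.0)  # zero for reactions at or above the cutoff
        total += abs(flux) * deviation
    return total


# Hypothetical required fluxes; expression values and cutoff follow the example above.
fluxes = {"R1": 10.0, "R2": 5.0, "R3": 10.0, "R4": 2.0}
expression = {"R1": 11.4, "R2": 8.2, "R3": 13.0}  # R4 has no data
score = inconsistency_score(fluxes, expression, cutoff=12.0)
print(round(score, 6))
```

With these assumed fluxes, R1 contributes 10 × 0.6 and R2 contributes 5 × 3.8, while R3 (above the cutoff) and R4 (no data) contribute nothing.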
Part A is achieved through flux balance analysis (FBA)
The above optimization problem would generally be difficult to solve due to the presence of an absolute value operator, but in this case, a trivial simplification converts the above problem to a standard LP problem. Each reaction defined as possibly reversible (containing a negative lower bound) is converted to two irreversible reactions, thus restricting all fluxes to be positive, and removing the need for the absolute value.
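The reversible-to-irreversible conversion can be illustrated with a short Python sketch. It assumes a stoichiometric matrix stored as a list of rows (one per metabolite), per-reaction `(lower, upper)` bounds, and per-reaction penalties (deviation below the cutoff); these data structures and the function name are illustrative choices, not part of the published implementation. After the split, every flux is non-negative, so the objective is a plain weighted sum that any LP solver can minimize.

```python
def split_reversible(stoich, bounds, penalties):
    """Replace each reaction that has a negative lower bound with a forward and a
    reverse irreversible reaction, so that all fluxes are non-negative."""
    new_cols, new_bounds, new_penalties = [], [], []
    for j, (lb, ub) in enumerate(bounds):
        column = [row[j] for row in stoich]
        # Forward direction carries the non-negative part of the original range.
        new_cols.append(column)
        new_bounds.append((max(lb, 0.0), max(ub, 0.0)))
        new_penalties.append(penalties[j])
        if lb < 0:
            # Reverse direction: negated stoichiometry, same penalty per unit flux.
            new_cols.append([-c for c in column])
            new_bounds.append((max(-ub, 0.0), -lb))
            new_penalties.append(penalties[j])
    new_stoich = [[col[i] for col in new_cols] for i in range(len(stoich))]
    return new_stoich, new_bounds, new_penalties


# Toy network: A -> B (irreversible), B <-> C (reversible), C -> (irreversible).
stoich = [[1, -1, 0],
          [0, 1, -1]]
bounds = [(0, 10), (-5, 5), (0, 10)]
penalties = [0.0, 0.6, 3.8]
S2, b2, c2 = split_reversible(stoich, bounds, penalties)
```

Here the single reversible reaction becomes two columns with opposite signs, and both inherit its penalty, so flux in either direction is charged identically, as the absolute value required.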
In general, some reactions will not have available data. The algorithm takes a conservative approach and designates these reactions as active; hence the term “gene inactivity” is part of the method name. In practice, the algorithm treats these reactions as if they had data that surpassed the cutoff, so absent data never incur a penalty. The lack of data does have implications for the interpretation of results. It is entirely possible that given better data, these reactions would be determined to be absent, perhaps necessitating the activation of other reactions. Clearly, with limited data, the results must be considered with caution. In general, this is far more of a concern for human metabolic networks than for
We have used the GIMME algorithm to produce context-specific metabolic networks for
Adaptive evolution has been used in the laboratory to improve the growth rate of
The gene expression data used to construct the models consists of CEL files containing the data described in
Normalized consistency scores are computed directly from the inconsistency scores, as described in the text. A higher normalized consistency score indicates that the gene expression data is relatively more consistent with the RMF. Thus, here the gene expression data from the glycerol-evolved strains are more consistent with highly efficient growth on each of the carbon sources tested. The p values, determined by permutation testing, are less than 0.01 in all cases here.
This figure demonstrates the same result as
Metabolic engineering seeks to optimize bacterial strains to produce a valuable product from a less expensive set of molecules. Rational design of strains for metabolic engineering is possible with genome-scale metabolic models
The normalized consistency score for an
We used this data set to verify the robustness of the algorithm to two different factors. First, we tested the effect of altering the cutoff by recalculating the results for cutoffs ranging from 8 to 14, in increments of 0.1. The consistency scores were significantly different for every cutoff in this range: the p value varied with the cutoff, but p<0.01 for all choices tested. Second, we verified the robustness of the algorithm with a jackknife test. We randomly removed 5% of the expression values mapped to reactions, repeated this 100 times, and recomputed the context-specific networks and consistency scores each time. All repetitions reached the same conclusion, although in some cases the p values were not quite as low as when using all of the data. In all cases, the conclusion was reached with p<0.02, which demonstrates slightly lower performance without all of the data available. This suggests that the algorithm should be expected to have greater statistical power when as many reactions as possible are assigned data.
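The resampling step of the jackknife test might look like the following Python sketch. Only the 5% removal fraction and the 100 repetitions come from the text; the function name, the dictionary representation of reaction-mapped expression values, and the fixed seed are assumptions made for illustration. Each yielded subset would then be passed to the algorithm in place of the full data.

```python
import random


def jackknife_subsets(expression, fraction=0.05, repeats=100, seed=0):
    """Yield copies of the reaction-to-expression mapping, each with a random
    fraction of the entries removed."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    keys = sorted(expression)
    n_drop = round(fraction * len(keys))
    for _ in range(repeats):
        dropped = set(rng.sample(keys, n_drop))
        yield {k: v for k, v in expression.items() if k not in dropped}


# Hypothetical data: 40 reactions with placeholder expression values.
expression = {f"R{i}": 8.0 + 0.1 * i for i in range(40)}
subsets = list(jackknife_subsets(expression))
```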
The growth of
A graphical representation of the log2 transform of the difference between inconsistency scores. A green box indicates that the sample on the
A graphical representation of the log2 transform of the difference between inconsistency scores. A green box indicates that the sample on the
A graphical representation of the log2 transform of the difference between inconsistency scores. A green box indicates that the sample on the
The wide variety of human cell types in the body do not share a simple objective such as cellular growth, but rather have a multiplicity of functions necessary for multi-cellular life. Accordingly, understanding the metabolism of any particular cell type requires a model that contains only the reactions present in that cell type, without potentially thousands of extraneous reactions. Human Recon 1
We used three publicly available sets of gene expression data for skeletal muscle cells, as depicted in
Abbreviation | Description | Reference | GEO Accession Number
GB | 3 patients before and 1 year after gastric bypass surgery (vastus lateralis) | | GDS2089
GI | 6 subjects before glucose/insulin infusion via clamp and 2 hours after beginning (vastus lateralis) | | GSE7146
FO | 24 subjects divided into 3 groups of eight: morbidly obese (MO), not obese (NO), and obese (O) (rectus abdominis) | | GDS268
These three datasets were originally gathered for purposes completely distinct from creating context-specific metabolic networks, just as the
Reactions in the white area have no usable gene chip data on either platform. Reactions in grey have usable data only on the 133+ 2.0 platform. Reactions in black have usable data for both the 133+ 2.0 and the 133A platform. Importantly, 5% (179) of the reactions are only represented on the 133+ 2.0 chip, potentially inflating difference scores between models built from different chips. The average difference score is 340 reactions, so a difference of 179 reactions amounts to more than 50% of that average.
Each of the 42 (6+12+24) gene expression datasets was used with the GIMME algorithm to create a model that produces ATP at no less than half the optimal efficiency and matches the data as closely as possible. These models were compared on a pairwise basis by finding the number of reactions that are different in the two models under comparison. On average, two models differ by 340 reactions, which is approximately 10% of the reactions in the global model. The pairwise distances are shown graphically in
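The pairwise comparison can be expressed compactly if each model is represented as a set of active reaction identifiers. The sketch below assumes that representation; the function names, sample labels, and reaction names are hypothetical. The distance between two models is the size of the symmetric difference of their reaction sets, that is, the number of reactions present in one model but not the other.

```python
from itertools import combinations


def model_distance(model_a, model_b):
    """Number of reactions present in exactly one of the two models."""
    return len(set(model_a) ^ set(model_b))


def pairwise_distances(models):
    """Map each unordered pair of model names to its reaction-set distance."""
    return {
        (a, b): model_distance(models[a], models[b])
        for a, b in combinations(sorted(models), 2)
    }


# Hypothetical toy models keyed by sample label.
models = {
    "GB_before": {"PGI", "PFK", "PYK", "CS"},
    "GB_after": {"PGI", "PFK", "PYK", "LDH"},
    "GI_before": {"PGI", "CS", "ACONT"},
}
distances = pairwise_distances(models)
mean_distance = sum(distances.values()) / len(distances)
```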
This heat map displays the level of difference in each pair of models. Darker squares represent models that are more similar to each other than lighter squares. A black square (as on the diagonal) indicates identical models, and a white square indicates the most different pair of models. The three darker blocks that surround the main diagonal are the comparisons of samples within each dataset to each other. These darker blocks show that the models within each dataset tend to be more similar to each other than to models from other datasets. The models from a particular expression array type also appear to be more similar to each other than to models from a different array type, but the data available do not allow us to show that this is actually true, as is shown in
This figure is the same as
In spite of the difficulties of comparing models derived from different sources, two statistically significant differences emerge from the analysis. First, the models of a given patient before and after either gastric bypass or glucose/insulin infusion are more similar to each other than to the models of other patients. We took the similarity scores for the GB and GI patients and created two separate groups: (A) all matched patients before and after and (B) all unmatched patients from the same dataset. Permutation testing demonstrated that group A has a smaller mean distance than group B (p<0.01). Second, we looked at consistency scores, asking if any group was more consistent with high ATP production than any other. Only one statistically significant (p<0.01) result emerged: the after-GI patients are more consistent with high ATP production than the before-GI patients. Again, this result is exactly as expected; muscle cells that have been given a substantial dose of glucose and insulin in the bloodstream should be more consistent with high ATP production.
The work reported herein details the first available method to both produce a guaranteed functional metabolic model specific to a set of gene expression data and quantify the agreement between gene expression data and one or more metabolic objectives. We have demonstrated the functionality of this GIMME method with gene expression data from
Initially, we expected that the results for human models would be more interesting than those for any other organism reconstructed to date, principally because we expected that human cells would show the most variability across conditions. However, the lack of available data for a substantial number of human metabolic reactions confuses attempts at comparison. We showed that reducing the number of reactions considered by 5% can change the apparent differences between different datasets. In addition, the lack of replicates in human gene expression data sets and the difficulty in obtaining high quality biological controls complicate matters and reduce the statistical power of comparisons. We have higher confidence in the results presented for
With metabolic reconstructions growing in size and becoming available for more and more organisms, tools to filter global reaction lists into context-specific reaction lists will be highly useful. Meaningful analysis of the human metabolic network will require procedures such as GIMME in order to accurately predict phenotypes.
The metabolic networks for
The gene expression data was obtained as CEL files and processed using Bioconductor
The GIMME algorithm is implemented in Matlab, using functions in the COBRA Toolbox. In general, any robust linear programming solver should work; we used Tomlab (Tomlab Optimization, Pullman, WA).
The output from the GIMME algorithm is an inconsistency score, and a higher score means that the gene expression data is less consistent with the model achieving the desired objective. For visualization purposes only, these scores are converted into normalized consistency scores, with a higher score indicating greater consistency between the data and the model achieving the objective. For a given set of scores, each inconsistency score is subtracted from 1.02 * (maximum inconsistency score) to produce a set of consistency scores. Each consistency score is divided by the maximum consistency score to produce a set of normalized consistency scores. The 1.02 factor assures that the smallest consistency score is slightly greater than zero and easy to visualize on a graph.
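The normalization just described translates directly into code. The sketch below follows the stated steps; the function name and the example scores are illustrative, and it assumes at least one inconsistency score is positive (otherwise the final division is undefined).

```python
def normalized_consistency(inconsistency_scores, factor=1.02):
    """Convert inconsistency scores to normalized consistency scores in (0, 1]."""
    ceiling = factor * max(inconsistency_scores)
    # Subtract each score from 1.02 * max so that higher means more consistent.
    consistency = [ceiling - s for s in inconsistency_scores]
    top = max(consistency)
    return [c / top for c in consistency]


# Hypothetical inconsistency scores for three samples.
scores = [100.0, 50.0, 0.0]
normalized = normalized_consistency(scores)
```

The sample with the lowest inconsistency score maps to exactly 1, and the sample with the highest maps to a small positive value rather than zero.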
Permutation testing with 10,000 randomizations was used to determine the statistical significance of all results with regard to consistency scores. This testing was implemented in Matlab.
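A generic two-group permutation test of this kind can be sketched as follows. The text specifies only the number of randomizations (10,000); the choice of test statistic (absolute difference in group means), the add-one correction, the function name, and the example values are assumptions for illustration and may differ from the Matlab implementation.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical consistency scores for two clearly separated groups.
group_a = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02]
group_b = [5.0, 5.1, 4.9, 5.05, 4.95, 5.02]
p_separated = permutation_p_value(group_a, group_b)
p_identical = permutation_p_value([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```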
Heat-map type representations were produced in Matlab. Other graphs were produced in Excel (Microsoft, Redmond, WA).
We thank Neema Jamshidi and Monica Mo for testing and discussions regarding the algorithm. We thank Markus Hergard and Andrew Joyce for discussions regarding gene expression analysis. We thank Shankar Subramaniam for advice regarding statistics and algorithm validation.