Conceived and designed the experiments: VSK CDM. Performed the experiments: VSK. Analyzed the data: VSK. Contributed reagents/materials/analysis tools: VSK. Wrote the paper: VSK CDM.
The authors have declared that no competing interests exist.
Genome-scale metabolic reconstructions are typically validated by comparing
Over the past decade, mathematical models of cellular metabolism have been constructed to describe existing metabolic processes. The gold standard for testing the accuracy and completeness of these models is to compare their cellular growth predictions (i.e., cell life/death) across different scenarios with available experimental data. Although these comparisons have been used to suggest model modifications, the key step of identifying these modifications has often been performed manually. Here, we describe an automated procedure, GrowMatch, that addresses this challenge. When the model overpredicts the metabolic capabilities of the organism by predicting growth in contrast with experimental data, we use GrowMatch to restore consistency by suppressing growth-enabling biotransformations in the model. Alternatively, when the model underpredicts the metabolic capabilities of the organism by predicting no growth (i.e., cell death) in contrast with available data, we use GrowMatch to restore consistency by adding growth-enabling biotransformations to the model. We demonstrate the use of GrowMatch by reconciling growth prediction inconsistencies of the latest
There are currently 700 completely sequenced genomes along with extensive compilations of data
As shown in
The proposed method makes use of gene essentiality data sets currently available for many microorganisms
The need to develop automated procedures to improve the accuracy of existing metabolic reconstructions has been recognized and has led to the development of a number of computational procedures. To this end, Reed et al.
In this paper, we supplement previous efforts
Similarly, NGG inconsistencies are corrected one-by-one to GG by identifying the minimal set of model modifications (i.e., through reaction or transport mechanism addition or reaction reversibility allowance) that enable biomass formation (above a pre-specified cutoff). If none of these modifications affect any of the consistent NGNG cases, we refer to them as
Here, we demonstrate the use of GrowMatch to resolve growth prediction inconsistencies between the latest
Cutoff Value | GNG mutants | NGNG mutants | NGG mutants | GG mutants |
1% | 45 | 112 | 96 | 1027 |
10% | 55 | 135 | 53 | 1017 |
50% | 107 | 160 | 28 | 965 |
Values are a percentage of
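The four mutant classes counted in the table can be illustrated with a minimal sketch. It assumes that both the predicted and the observed growth of a mutant are expressed as fractions of wild-type growth and that each is compared against its own cutoff. The function name, the gene labels and the numerical values are hypothetical and are not taken from the study.

```python
from collections import Counter

def classify_mutant(predicted, observed, model_cutoff=0.10, in_vivo_cutoff=0.10):
    """Label a single gene deletion mutant as GG, GNG, NGG or NGNG.

    The first part of the label is the model prediction, the second part is
    the experimental observation. Both inputs are fractions of wild-type
    growth (an assumption made for this sketch).
    """
    model = "G" if predicted >= model_cutoff else "NG"
    experiment = "G" if observed >= in_vivo_cutoff else "NG"
    return model + experiment

# Hypothetical mutants: (predicted growth fraction, observed growth fraction)
mutants = {
    "geneA": (0.95, 0.90),
    "geneB": (0.80, 0.02),
    "geneC": (0.00, 0.60),
    "geneD": (0.00, 0.00),
}

labels = {gene: classify_mutant(p, o) for gene, (p, o) in mutants.items()}
counts = Counter(labels.values())
```

Raising either cutoff moves borderline mutants between classes, which is why the counts in the table change from row to row.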
GNG Mutant | Associated Essential Reaction (Pathway) |
SHK3Dr (Tyrosine, Tryptophan and Phenylalanine metabolism) | |
HCO3E (Unassigned) | |
ALAAlAr (Cell Envelope Biosynthesis) | |
12 reactions (Cell Envelope Biosynthesis) | |
DHFR (Cofactor and Prosthetic Group Biosynthesis) | |
MCTP1App (Murein Biosynthesis) | |
GLNS (Glutamate metabolism) | |
THRD_L (Valine, Leucine and Isoleucine metabolism) | |
CYSTL (Methionine Metabolism) | |
METS (Methionine metabolism) | |
ASPK or HSDY (Threonine and Lysine metabolism) | |
MCTP1App (Murein Biosynthesis) | |
ASPK or HSDY (Threonine and Lysine metabolism) | |
OPHBDC (Cofactor and Prosthetic Group Biosynthesis) | |
H2Otex (Transport, Outer Membrane) |
We define complementary (non-complementary) isozymes as pairs of isozymes that satisfy the following two conditions: (a) at least one of the isozymes is encoded by a gene associated with a GG (GNG) mutant and (b) the isozymes catalyze an essential reaction (under aerobic glucose conditions). We checked the sequence similarity of complementary and non-complementary isozymes using the BlastP algorithm. The results are available in
To see if the genes that code for non-complementary isozymes are inactive under aerobic minimal glucose, we checked their expression levels. Specifically, we examined the relative expression levels for these pairs of genes (deleted gene and gene associated with non-complementing isozyme) available at Covert et al.,
All abbreviations are taken from the
The deleted genes in the second group (i.e., 26 GNG mutants) encode for enzymes that catalyze blocked reactions in the metabolic network. Blocked reactions are defined as reactions that cannot carry any flux under given substrate conditions
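One common way to make the definition of a blocked reaction operational is to maximize and minimize the flux through each reaction subject to the steady-state mass balances. The sketch below uses assumed notation (stoichiometric coefficients $S_{ik}$, fluxes $v_k$, bounds $LB_k$ and $UB_k$); it is not necessarily the notation or the exact procedure used in this work.

```latex
\begin{align*}
v_j^{\max} \;(\text{resp. } v_j^{\min}) \;=\; \max \;(\text{resp. } \min) \quad & v_j \\
\text{subject to} \quad & \sum_{k} S_{ik}\, v_k = 0 \quad \forall\, i \\
& LB_k \le v_k \le UB_k \quad \forall\, k
\end{align*}
% Reaction j is blocked under the given substrate conditions if
% v_j^{\max} = v_j^{\min} = 0.
```

Under this reading, deleting a gene associated only with blocked reactions cannot change the predicted biomass flux, so the model predicts growth for such mutants regardless of the experimental outcome.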
GrowMatch resolved 23 of these 26 inconsistencies by suitably adding biomass components to the biomass equation. Specifically, consistency to six GNG mutants (
The third group of GNG mutants involves deleted genes that do not encode isozymes and are not associated exclusively with blocked reactions. We used GrowMatch to identify reaction suppressions that drop the biomass production below the predefined growth cutoff. We allowed for up to
GNG Mutant | Deleted Reaction(s) | Additionally Suppressed Reaction(s) |
GHMT2r | ||
IMPD | ||
PGCD | ||
PSP_l | ||
G5SD | ||
GLU5K | ||
CBPS | CBMKr (unassigned) and OXAMTC (unassigned) | |
CBPS | CBMKr (unassigned) and OXAMTC (unassigned) | |
13 reactions (8 with isozyme) | PPM or PRPPS or R15BPK | |
CYSabc2pp, GTHRDabc2pp | (GLYAT AND GLYCL) or (AACTOOR and GLYCL) | |
PRPPS | ||
GAPD | PPS | |
RNDR1, RNDR2, RNDR3, RNDR4 | TRDR or GTHOr or GRXR | |
RNDR1, RNDR2, RNDR3, RNDR4 | TRDR or GTHOr or GRXR | |
ENO | PPS | |
PGK | PPS | |
14 reactions | FBA and TPI | |
URIDK2r | ( |
Suppressions in bold are valid when the growth medium is changed from minimal glucose to minimal glycerol.
We tested the sensitivity of the identified suppressions to the growth medium by changing the medium from minimal glucose to minimal glycerol. Based on the data available in
All abbreviations are taken from the
Restoring growth for the NGG predictions requires that production routes be established in the metabolic model for all 63 precursor metabolites to biomass.
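The requirement that every biomass precursor have a production route, combined with a preference for the fewest model modifications, can be sketched on a toy network. This example replaces the flux-balance growth test with a much simpler reachability check (a metabolite counts as producible if some reaction makes it from already producible substrates), so it is an illustration of the idea and not the GrowMatch formulation. All reaction and metabolite names are hypothetical.

```python
from itertools import combinations

def producible(reactions, seeds):
    """Return the metabolites reachable from the seed metabolites.

    Each reaction is a (substrates, products) pair. A reaction can fire once
    all of its substrates are available. This is a reachability proxy for
    growth, not a flux-balance calculation.
    """
    have = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= have and not set(products) <= have:
                have |= set(products)
                changed = True
    return have

def minimal_additions(model, candidates, seeds, precursors, max_size=3):
    """Return every smallest set of candidate reactions that makes all
    precursors producible, or an empty list if none exists within max_size."""
    for size in range(max_size + 1):
        hits = []
        for names in combinations(sorted(candidates), size):
            network = list(model.values()) + [candidates[n] for n in names]
            if set(precursors) <= producible(network, seeds):
                hits.append(names)
        if hits:
            return hits
    return []

# Toy mutant network and a toy database of candidate reactions
model = {"R1": (("glc",), ("g6p",)), "R2": (("g6p",), ("pyr",))}
candidates = {
    "C1": (("pyr",), ("ala",)),
    "C2": (("g6p",), ("ser",)),
    "C3": (("ser",), ("ala",)),
}

one_fix = minimal_additions(model, candidates, {"glc"}, {"pyr", "ala"})
two_fix = minimal_additions(model, candidates, {"glc"}, {"pyr", "ala", "ser"})
```

In the second call two different pairs of additions restore all three precursors, which mirrors the situation where several alternative hypotheses of equal size resolve the same mutant.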
We next use GrowMatch to resolve the NGG inconsistencies by adding pathways using one or more of the three mechanisms discussed previously. GrowMatch identified consistency-restoring hypotheses for 5/38 mutants. Interestingly, one NGG mutant
The first three NGG resolutions were corrected by adding
The other three resolutions (see
NGG | Secreted Metabolite |
glycoaldehyde | |
S-Ribosyl-L-homocysteine | |
3,4-dihydroxy-2-butanone 4-phosphate |
Here we have developed an automated procedure, GrowMatch, to resolve
GrowMatch resolved 18 GNG inconsistencies by suggesting suppressions in the mutant metabolic networks, whereas 15 inconsistencies were resolved by suppressing isozymes in the metabolic network. The remaining 23 inconsistencies, which correspond to genes associated with blocked reactions, were repaired by simply adding component(s) of the associated blocked reactions to the biomass equation (
In this study, we were able to pinpoint missing functionalities that may have been overlooked during model reconstruction. In one such example, we were able to resolve an NGG mutant by adding a reaction (i.e., sulfate adenylyltransferase) with documented evidence of its being present in
In line with recent explanations for GNG inconsistencies in
It is important to note that GrowMatch makes use of parsimony criteria to prioritize alternative model correcting hypotheses. Therefore, biologically relevant hypotheses that involve more than the selected maximum allowed limit of model modifications will be missed. Also, using alternate cellular objectives such as MOMA
In summary, we believe that GrowMatch, in conjunction with GapFill, is a useful model-refinement tool for reconstructing new metabolic models or testing and curating existing ones. In addition to the use of GrowMatch to resolve growth inconsistencies for the latest
First, we define the sets, parameters and variables that are common to the mathematical procedures formulated to resolve NGG and GNG inconsistencies. To this end, we define the index sets, {
These definitions imply that if there exist two isozymes
Upper and lower bounds,
A GNG single gene deletion mutant occurs when the model predicts growth whereas no growth is observed
The suppressions required to ensure that the maximum biomass formation is below the imposed cut-off
The aim of GrowMatch is to identify the minimal number of reaction suppressions needed to zero the maximum biomass formation. We do this by ensuring that there is no biomass formation even when fluxes in the network are systematically re-apportioned so that biomass formation is maximized. This leads to a
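The nested structure described here, in which suppressions are chosen to drive biomass down while the network re-routes flux to drive it up, can be written as a min-max problem. The following is only a sketch under assumed notation: $y_j$ is a binary variable equal to zero when reaction $j$ is suppressed, $S_{ij}$ are stoichiometric coefficients, $LB_j$ and $UB_j$ are flux bounds, and $K$ is the maximum number of suppressions allowed. None of these symbols is recoverable from the text above.

```latex
\begin{align*}
\min_{y} \;\; \max_{v} \quad & v_{\text{biomass}} \\
\text{subject to} \quad & \sum_{j} S_{ij}\, v_j = 0 \quad \forall\, i \\
& LB_j\, y_j \le v_j \le UB_j\, y_j \quad \forall\, j \\
& v_j = 0 \quad \text{for reactions removed by the gene deletion} \\
& \sum_{j} (1 - y_j) \le K, \qquad y_j \in \{0, 1\}
\end{align*}
```

In this sketch a GNG mutant would be considered resolved when the optimal value falls below the growth cutoff, and the smallest $K$ for which this happens gives the minimal number of suppressions.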
For GNG mutants associated with genes encoding isozymes, we check if simply deleting the associated reaction prohibits
NGG mutants are characterized by the lack of growth
Based on these definitions, we next identify the minimal number of modifications required to correct a single NGG mutant corresponding to the
In GrowMatch, the objective function minimizes the number of modifications (addition of reactions or activation of secretion of metabolites) in the metabolic model. The first constraint enforces zero flux through reactions that are rendered absent through the elimination of the genes that are knocked out in experiment
We test the hypotheses generated to resolve the NGG mutant using the following two criteria. For reactions added from the database, we check the two-way protein-protein BLAST expectation value between the enzyme that catalyzes that reaction and the genome of interest (in this case
In our simulations, we set the glucose uptake rate to 10 mmol/gDW hr, ATP maintenance to 8.39 mmol/gDW hr and oxygen uptake rate to 15 mmol/gDW hr. We also turn off the reactions given in
Blattner numbers of genes associated with GNG mutants
(0.03 MB XLS)
Blattner numbers of genes associated with NGG mutants
(0.44 MB XLS)
BLAST scores and expression data for complementary and non-complementary isozymes
(0.05 MB XLS)
Sequence similarity between genes associated with NGG mutants and alternative genes in the
(0.04 MB XLS)
Components added to
(0.02 MB XLS)
We would like to thank Dr. Patrick F. Suthers and Dr. Anthony P. Burgard for the many useful discussions and comments during the preparation of the manuscript. We would also like to thank the anonymous reviewers for the detailed suggestions that enabled us to improve the manuscript substantially.