Conceived and designed the experiments: LGA. Performed the experiments: INM ADC LGA. Analyzed the data: INM PS ADC LGA. Contributed reagents/materials/analysis tools: AM JSR LGA. Wrote the paper: AM LGA. Conceived and formulated the integer linear program for the more efficient presentation of the Boolean model: AM. Implemented the formulation and linked to commercial solvers: AM. Provided access to commercial solvers (license-based): AM. Secondary inventor of the methodology for drug effects via drug-induced pathway alterations: AM. Wrote parts of the manuscript (mostly the computational part): AM. Ran the computational case studies under the supervision of Alexopoulos and Mitsos: INM PS. Prepared the figures of the manuscript under the supervision of Alexopoulos: INM PS. Provided data/tools of previous approach for optimization of Boolean networks with genetic algorithm (GA): JSR. Assisted in the reformulation of the Boolean model and in the comparison with the GA approach: JSR. Edited the manuscript: JSR. Initiated and led the project and collaborations: LGA. Conceived the idea of using more sophisticated computational methods for fitting of experimental data after his initial discussion with AM: LGA. Provided the facilities and materials for the experiments: LGA. Main inventor of the methodology for finding drug effects via drug-induced pathway alteration: LGA. Wrote most of the paper: LGA.
The authors have declared that no competing interests exist.
Understanding the mechanisms of cell function and drug action is a major endeavor in the pharmaceutical industry. Drug effects are governed by the intrinsic properties of the drug (i.e., selectivity and potency) and the specific signal transduction network of the host (i.e., normal vs. diseased cells). Here, we describe an unbiased, phosphoproteomic-based approach to identify drug effects by monitoring drug-induced topology alterations. With our proposed method, drug effects are investigated under diverse stimulations of the signaling network. Starting with a generic pathway made of logical gates, we build a cell-type specific map by constraining it to fit 13 key phosphoprotein signals under 55 experimental conditions. Fitting is performed via an Integer Linear Program (ILP) formulation solved with standard ILP solvers, a procedure that drastically outperforms previous fitting schemes. Then, knowing the cell's topology, we monitor the same key phosphoprotein signals in the presence of a drug and re-optimize the specific map to reveal drug-induced topology alterations. To prove our case, we construct a topology for the hepatocytic cell line HepG2 and evaluate the effects of 4 drugs: 3 selective inhibitors of the Epidermal Growth Factor Receptor (EGFR) and a non-selective drug. We confirm effects easily predictable from the drugs' main target (i.e., EGFR inhibitors block the EGFR pathway) but we also uncover unanticipated effects due to either drug promiscuity or the cell's specific topology. An interesting finding is that the selective EGFR inhibitor Gefitinib inhibits signaling downstream of the Interleukin-1alpha (IL1α) pathway, an effect that cannot be extracted from binding affinity-based approaches. Our method represents an unbiased approach to identify drug effects on small to medium size pathways, and it is scalable to larger topologies with any type of signaling intervention (small molecules, RNAi, etc.).
The method can reveal drug effects on pathways, the cornerstone for identifying mechanisms of drug efficacy.
Cells are complex functional units. Signal transduction refers to the underlying mechanism that regulates cell function, and it is usually depicted on signaling pathway maps. Each cell type has distinct signal transduction mechanisms, and several diseases arise from alterations in the signaling pathways. Small-molecule inhibitors have emerged as novel pharmaceutical interventions that aim to block certain pathways in an effort to reverse the abnormal phenotype of the diseased cells. Although compounds have been well designed to hit certain molecules (i.e., targets), little is known about how they act on an “operative” signaling network. Here, we combine novel high-throughput protein-signaling measurements and sophisticated computational techniques to evaluate drug effects on cells. Our approach comprises two steps: build pathways that simulate cell function and identify drug-induced alterations of those pathways. We employed our approach to evaluate the effects of 4 drugs on a cancer hepatocytic cell type. We were able to confirm the main target of the drugs but also uncover unknown off-target effects. By understanding the drug effects in normal and diseased cells we can provide important information for the analysis of clinical outcomes in order to improve drug efficacy and safety.
Target-based drug discovery is a predominant focus of the pharmaceutical industry. The primary objective is to selectively target protein(s) within diseased cells in order to ameliorate an undesired phenotype, e.g., unrestrained cell proliferation or inflammatory cytokine release. Ideally, other pathways within the diseased cells, as well as similar phenotypes in other cell types, should remain unaffected by the therapeutic approach. However, despite the plethora of new potential targets that have emerged from the sequencing of the human genome, rather few have proven effective in the clinic
Finding drug's targets is traditionally based on high-throughput
To address drug effects in more physiological conditions, novel genomic and proteomic tools have recently been developed
Here, we describe a significantly different approach to identify drug effects where drugs are evaluated by the alterations they cause on signaling pathways. Instead of identifying binding partners, we monitor pathway alterations by following key phosphorylation events under several treatments with cytokines. The workflow is presented in
(A) A Boolean generic map is assembled from pathway databases and includes stimuli (green squares), key measured phosphoproteins (brown circles), and the neighboring proteins (yellow circles). (B) Cells are treated with a combination of cytokines and selective inhibitors (red circles) of known effects and an ILP formulation is used to fit the data to the Boolean pathway. (C) A cell-type specific pathway is constructed. (D) Cells are treated with a combination of cytokines and drugs (their effects are assumed unknown) and ILP is used for the second time to fit the drug-induced phosphorylation data. (E) Alterations of the cell-type specific topology reveal drug effects (red arrows).
In contrast to previously developed techniques, our method is based on the actual effect on phosphorylation events carefully spread across the signaling network. Theoretically, it can be applied to any type of intracellular perturbation, such as ATP-based and allosteric kinase inhibitors, RNAi, shRNA, etc. On the computational front, our ILP-based approach is faster and more efficient than current algorithms for pathway optimization
High-throughput bead-based ELISA-type experiments using xMAP technology (Luminex, Texas, USA) are performed as briefly described in the
The generic pathway map is constructed in the neighborhood of the 5 stimuli and the 13 measurements. The ubiquitous presence of conflicting reports on pathway maps and alternative protein names makes this step a highly nontrivial one. We explored several pathway databases including STKE, Pathway Interaction Database, KEGG, Pathway Commons, Ingenuity, and Pathway Studio
A detailed description of Boolean representation of pathways can be found elsewhere
The ILP algorithm uses a subset of the postulated reactions, denoted with black and gray arrows in a generic pathway, to construct a HepG2 pathway map (black arrows in pathway diagram). Gray triangles show phosphoprotein activation level upon stimuli (columns in top and bottom panels) and inhibitors (subcolumns in top and bottom panels). Red background denotes an error between experimental and pathway-inferred responses. The generic topology can hardly represent the HepG2 signaling responses (red background in top panel) and pathway optimization is critical to obtain a pathway topology that captures HepG2 function (limited red background in bottom panel). Pathways are visualized using Cytoscape
The formulation for the optimal pathway identification is a 0–1 Integer Linear Program, i.e., an optimization problem with binary variables and linear constraints (see
The ILP is solved with the state-of-the-art commercial code (CPLEX
To validate our model, we also examine three scenarios where we remove 20% of our experimental data, and then we try to predict them. Specifically, we create three training datasets, each time by removing all cases where one inhibitor is present (either MEKi, PI3Ki, or p38i) and then we calculate how well our ILP-optimized map can predict each of the inhibitor cases (see
In order to compare the ILP algorithm with the previously published genetic algorithm (GA) we use the same initial topology and the same normalized dataset
The notable differences between the proposed method and the method used in
For the identification of the drug effects we make use of the second dataset in HepG2s, where drugs are applied together with the same set of ligands. In this case, the ILP formulation is used with the HepG2-specific topology (the topology obtained from the previous step) and not the generic map. We also do not impose inhibitor constraints the way we do for pathway optimization (e.g., PI3K inhibitor blocks the signal downstream of PI3K); instead, we let the optimization algorithm decide which reaction(s) should be removed in order to fit the drug-induced data.
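The search for reactions to remove can be illustrated with a minimal sketch. It is not the ILP formulation: it swaps in brute-force enumeration over reaction removals, uses plain OR logic, and runs on a hypothetical five-reaction toy topology with made-up binary data. The function names (`simulate`, `mismatches`, `find_drug_effects`) and the toy network are assumptions introduced only for illustration.

```python
from itertools import combinations


def simulate(reactions, stimuli):
    """Propagate activation with OR logic: a species is active if it is a
    stimulus or the target of a kept reaction whose source is active."""
    active = set(stimuli)
    changed = True
    while changed:
        changed = False
        for src, dst in reactions:
            if src in active and dst not in active:
                active.add(dst)
                changed = True
    return active


def mismatches(reactions, experiments):
    """Count measured species whose predicted state differs from the data."""
    total = 0
    for stimuli, measured in experiments:
        active = simulate(reactions, stimuli)
        total += sum((sp in active) != bool(val) for sp, val in measured.items())
    return total


def find_drug_effects(topology, experiments):
    """Return (removed_reactions, error) for the best fit, preferring the
    smallest number of removals among equally good fits."""
    best = None
    for k in range(len(topology) + 1):
        for removed in combinations(topology, k):
            kept = [r for r in topology if r not in removed]
            err = mismatches(kept, experiments)
            if best is None or err < best[1]:
                best = (removed, err)
        if best[1] == 0:  # perfect fit found with k removals; stop early
            break
    return best


# Hypothetical cell-type specific topology (toy example, not the HepG2 map).
TOPOLOGY = [
    ("TGFa", "EGFR"),
    ("EGFR", "MEK"),
    ("MEK", "ERK"),
    ("IL1a", "JNK"),
    ("JNK", "cJUN"),
]

# Made-up drug-treated data: (stimuli applied, measured binary states).
DRUG_DATA = [
    ({"TGFa"}, {"MEK": 0, "ERK": 0}),
    ({"IL1a"}, {"cJUN": 1}),
]

removed, error = find_drug_effects(TOPOLOGY, DRUG_DATA)
print("removed reactions:", removed, "remaining mismatches:", error)
```

In this toy case more than one single removal fits the data equally well (removing EGFR→MEK would also give zero mismatches); the sketch simply reports the first one it encounters.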
The effect of Lapatinib (
(A–D) Red arrows denote drug effects, i.e., reactions that are removed from the HepG2 topology by the ILP algorithm in order to fit the drug-altered phosphoprotein dataset. (E–H) Raw data that correspond to drug effects. Lines indicate the signal between 0 minutes (untreated) and “early response” (average signal of 5 and 25 minutes post stimuli). (I) Off-target effect of Gefitinib. Dose response curve shows that the EGFR inhibitor reduces cJUN activation upon IL1α treatment. R² corresponds to the linear fit.
Gefitinib, an EGFR tyrosine kinase inhibitor, alters the topology in a very similar pattern as Lapatinib, but, interestingly enough, it also results in the removal of the JNK→c-JUN branch (
The “dirty” Raf inhibitor Sorafenib shows a very different profile: it also blocks the JNK→c-JUN branch (
In this article, we present an unbiased phosphoproteomic-based approach and an optimization formulation to construct cell-type specific pathways and to identify drug effects on those pathways. For the pathway construction, we track 13 key phosphorylation signals in 55 different conditions generated by the combinatorial treatment of stimuli and inhibitors. Using Integer Linear Programming (ILP) for pathway optimization, we take a generic network of 74 proteins and 105 reactions and construct a cell-type specific network of 49 proteins and 44 reactions that spans between the 5 stimuli and the 13 measured phosphorylated proteins. In this network, we monitor 4 cases of drug-induced pathway alterations using a similar computational scheme.
In comparison to all other protein-based target identification approaches, our method is not based on measurements of drug affinities either by
An important aspect of the current approach is the construction of pathway maps. Pathway construction is a major endeavor in biology and a variety of experimental
Optimizing pathway topologies is a relatively new approach for the construction of cell-type specific pathways. Using Boolean topology and Genetic Algorithm (GA) for an optimization scheme, Saez-Rodriguez et al.
When applied in HepG2s, our approach identifies both known and unanticipated results. As a positive control, it removes the TGFα branch upon EGFR drug treatments. Another easily understandable effect is Sorafenib's inhibition of the pathway downstream of p38 which can be explained by the drug's target affinity to p38α and p38β
Understanding the interplay between
HepG2 cells were purchased from ATCC (Manassas, VA), and seeded on 96-well plates coated with collagen type I (BD Biosciences, Franklin Lakes, NJ) at 30,000 cells/well in DME medium containing 10% Fetal Bovine Serum (FBS). The following morning, cells were starved for 4 hours and treated with inhibitors and/or drugs. Kinase inhibitors were used at concentrations sufficient to inhibit at least 95% of the phosphorylation of the nominal target as determined by dose-response assays (presented in
A major improvement in the present dataset as compared to
From each lysate we measured 13 phosphorylation activities that we considered “key phosphorylation events” using a Luminex 200 system (Luminex Corp, Austin, TX). The 13-plex phospho-protein bead set from Bio-Rad was used to assay p70S6K (Thr421/Ser424), CREB (Ser133), p38 (Thr180/Tyr182), MEK1 (Ser217/Ser221), JNK (Thr183/Tyr185), HSP27 (Ser78), ERK1/2 (Thr202/Tyr204, Thr185/Tyr187), c-JUN (Ser63), IRS-1 (Ser636/Ser639), IκB-α (Ser32/Ser36), Histone H3 (Ser10), Akt (Ser473), and IR-β (Tyr1146). Data were normalized and plotted using DataRail
Here, we describe how the Boolean model described in
While the set of species is typically known, the set of reactions is not. Rather, only a superset of potential reactions is postulated. The goal of the proposed formulation is to find an optimal (in some sense) set of reactions out of such a superset. To that end, binary variables
A set of experiments is performed, indexed by the superscript
The last group of variables
For the case that a species is measured, the measurement is defined as
Note also that alternative norms, such as least-squares errors, could also be used. The resulting optimization problem would still be an ILP, since the objective function involves only integer variables. For instance, for the least-squares error objective function the following linear reformulation is valid:
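One way to see why a squared error stays linear is the following short derivation. The symbols are assumptions introduced here, not the notation of the formulation: x ∈ {0,1} stands for the model-predicted state of a measured species in one experiment, and m ∈ [0,1] for the corresponding normalized measurement, which is a constant in the optimization. Because x is binary, x² = x, so the squared error is linear in x:

```latex
\[
  (x - m)^2 \;=\; x^2 - 2mx + m^2 \;=\; (1 - 2m)\,x + m^2,
  \qquad x \in \{0,1\},\; m \in [0,1].
\]
```

As a check, x = 0 gives m² and x = 1 gives (1 − m)², which are the two possible squared errors.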
The secondary objective is to minimize the weighted number of possible reactions
The ILP proposed can be summarized as:
In formulation (1)–(11), binary decision variables are introduced for the manipulated species, along with the constraints (9) and (10). This simplifies notation. In the implementation, these variables are replaced by constants. Alternatively, the preprocessor of the optimization solver can be used to exclude these trivial variables.
In the following, the reasoning for the formulation is given. The first set of constraints, i.e., (2), allows the modeler to limit the combinations of connectivities considered. For instance, suppose that two reagents
The constraints (7) ensure that a species will be formed if some reaction in which it is a product occurs. Note that multiple reactions can give the same species; mathematically this will result in redundant constraints. In contrast, the constraints (8) enforce that a species will not be present if all reactions in which it appears as a product do not occur. Recall that manipulated species are not considered as products in reactions. Note also, that it would be possible to combine the constraints (7) into a single constraint for each species, e.g.,
In the present study, our ILP formulation was utilized in two different circumstances. For the creation of the cell-type specific pathway using combinations of inhibitors and stimuli our ILP formulation included 27887 constraints and 9732 variables. For each drug case, where the reduced and optimized pathway was utilized, we had 2477 constraints and 947 variables.
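The logic that the formulation encodes, and the two-level objective (fit error first, number of reactions second), can be illustrated with a small sketch. It is not the ILP itself: exhaustive enumeration over reaction subsets is swapped in for the solver, reaction weights are all taken as 1, and the four-reaction superset and the data are hypothetical. The names `steady_state`, `fit_error`, and `optimize` are assumptions made for illustration.

```python
from itertools import product


def steady_state(reactions, forced_on, forced_off):
    """Species present at steady state. A reaction occurs when all of its
    reagents are present; a species is present if some kept reaction
    produces it. Manipulated species (forced on or off) are never treated
    as products."""
    present = set(forced_on) - set(forced_off)
    changed = True
    while changed:
        changed = False
        for reagents, prod in reactions:
            if prod in present or prod in forced_on or prod in forced_off:
                continue
            if all(r in present for r in reagents):
                present.add(prod)
                changed = True
    return present


def fit_error(reactions, experiments):
    """Sum of absolute differences between predicted (0/1) and measured values."""
    err = 0.0
    for forced_on, forced_off, measured in experiments:
        present = steady_state(reactions, forced_on, forced_off)
        err += sum(abs((sp in present) - m) for sp, m in measured.items())
    return err


def optimize(superset, experiments):
    """Enumerate every subset of the postulated reactions and keep the one
    with the smallest (fit error, number of reactions), compared in order."""
    best_key, best_kept = None, None
    for mask in product([0, 1], repeat=len(superset)):
        kept = [r for r, keep in zip(superset, mask) if keep]
        key = (fit_error(kept, experiments), len(kept))
        if best_key is None or key < best_key:
            best_key, best_kept = key, kept
    return best_kept, best_key[0]


# Hypothetical superset of postulated reactions: (reagents, product).
SUPERSET = [
    (("TGFa",), "MEK"),
    (("MEK",), "ERK"),
    (("IL1a",), "ERK"),  # postulated, but contradicted by the toy data
    (("IL1a",), "JNK"),
]

# Made-up experiments: (stimuli forced on, inhibited species forced off, data).
EXPERIMENTS = [
    ({"TGFa"}, set(), {"ERK": 1, "JNK": 0}),
    ({"IL1a"}, set(), {"ERK": 0, "JNK": 1}),
    ({"TGFa"}, {"MEK"}, {"ERK": 0, "JNK": 0}),
]

kept, err = optimize(SUPERSET, EXPERIMENTS)
print("kept reactions:", kept, "fit error:", err)
```

Enumeration scales as 2^n in the number of postulated reactions, so it is only viable for toy networks like this one.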
For the goodness of fit, we calculated the percentage error as:
Note that for binary
Raw data for the construction of the cell-type specific map and the evaluation of the drug effects. The signals in the Y-axis correspond to the measurements of the phosphorylated residues listed in
(0.52 MB PDF)
Model Validation. The first panel shows the optimization results when the full dataset (shown in
(0.91 MB PDF)
Comparison between genetic algorithm and ILP. Both algorithms performed well and achieved very similar solutions. Red background denotes inconsistency between predicted values and experimental data: ILP matched all but 98 out of 880 experimental data points, as opposed to 110 mismatches in the topology furnished by the GA. The computational time for ILP was 14.3 sec as opposed to approximately one hour for GA.
(0.63 MB PDF)
Equivalent reformulation as MILP
(0.03 MB PDF)
We would like to thank Steffen Klamt and Regina Samaga for helpful discussions regarding the Boolean model.