InTech uses cookies to offer you the best online experience. By continuing to use our site, you agree to our Privacy Policy.

Technology » "New Trends in Technologies", book edited by Blandna Ramov, ISBN 978-953-7619-62-6, Published: January 1, 2010 under CC BY-NC-SA 3.0 license. © The Author(s).

Chapter 3

Toward Personalized Therapy Using Artificial Intelligence Tools to Understand and Control Drug Gene Networks

By Alexandru G. Floares
DOI: 10.5772/7591

Article top


Ornithine decarboxylase mRNA - processed mRNA experimental data fitted with a cubic spline interpolant and the 1st derivative with respect to time (explanation in text).
Figure 1. Ornithine decarboxylase mRNA - processed mRNA experimental data fitted with a cubic spline interpolant and the 1st derivative with respect to time (explanation in text).
Neural networks feedback linearization model identification for ornithine decarboxylase mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−3 (un-scaled data; explanation in text).
Figure 2. Neural networks feedback linearization model identification for ornithine decarboxylase mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−3 (un-scaled data; explanation in text).
Neural networks feedback linearization model identification for α2-Macroglobulin mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−4 (scaled data; explanation in text).
Figure 3. Neural networks feedback linearization model identification for α2-Macroglobulin mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−4 (scaled data; explanation in text).
Neural networks feedback linearization control for ornithine decarboxylase mRNA; the missing temporal-series of drug-receptor complex in the nucleus, DR(N), are reconstructed such as to constraint the mRNA output of the model to follow the measured mRNA (processed). The accuracy of the control is very good (explanation in text).
Figure 4. Neural networks feedback linearization control for ornithine decarboxylase mRNA; the missing temporal-series of drug-receptor complex in the nucleus, DR(N), are reconstructed such as to constraint the mRNA output of the model to follow the measured mRNA (processed). The accuracy of the control is very good (explanation in text).

Toward Personalized Therapy Using Artificial Intelligence Tools to Understand and Control Drug Gene Networks

Alexandru G. Floares1

1. Introduction

The real implementation of individualized therapy and gene therapy of diseases, which are most often multi-gene disorders, is an important goal of modern personalized medicine, but needs a solid rational foundation. The deluge of complex, high-dimensional biomedical data is continuously increasing; however, our modeling capacity is much smaller and increasing only slowly - particularly in fields using high-throughput techniques such as genomics, transcriptomics, proteomics, and pharmacogenomics. Knowing which genes are expressed, when, where, and to what extent is important for understanding organisms, as well as for controlling genes through adequate drugs and dosage regimens development. The regulation of gene expression is achieved through complex regulatory systems—gene regulatory networks (GRNs) - which are networks of interactions among DNA, RNA, proteins, and small molecules.

Let us remark that not only a key ingredient but a whole dimension is missing from this view. A large variety of external molecular species interfere with gene networks, but we will focus only on drugs, drug discovery being one of the important routes to personalized medicine. A more general concept of drug gene regulatory networks (DGRN), or simply drug gene networks (DGN), first introduced in (Floares, 2007b), is presented together with some mathematically definitions.

Besides the high-throughput experimental approaches, allowing to simultaneously monitor thousands of genes or other molecular species, mathematical modeling is essential for understanding and controlling gene networks by drugs or gene replacements. Various formalisms, such as Bayesian networks, Boolean networks, differential equation models, qualitative differential equations, stochastic equations, and rule-based systems, have been used (see Jong, 2002; Gardner & Faith, 2005; Bansal, 2007 for reviews).

The ordinary differential equations (ODE) approach tries to elucidate a deeper understanding of the exact nature of the regulatory circuits and their regulation mechanisms. In a pharmacogenomic context, it allows the design of controls that are optimal, individualized drug dosage regimens (Floares, 2005; Floares, 2006). Unfortunately, this is also the most difficult, tedious, expensive, and time-consuming approach. The models are high-dimensional systems of nonlinear-coupled stiff ODEs. The number of parameters is extremely large, and many of them have unknown values. Although in principle one can find the best set of parameter values by sampling the whole parameter space, many degenerate solutions may be expected. These are due to the correlations between parameters, and to the fact that biological systems have built-in regulation mechanisms that make them robust to changes in many of their parameter values. These facts suggest that it is the network structure rather than the precise value of the parameters that confers stability to the system.

There is a need for algorithms to automatically infer such models from high-throughput time-series data, and artificial intelligence is better suited than conventional modeling. We proposed a series of reverse engineering algorithms for drug gene networks (Floares, 2007b), based on artificial intelligence methods - neural networks (NN) for identification and control, and linear genetic programming (GP) (M. Brameier & W. Banzhaf, 2007) for symbolic regression. The algorithms take as inputs high-throughput (e.g., microarray) time-series data and automatically infer an accurate ordinary differential equations model, revealing the networks structure and parameter and giving insights into the molecular mechanisms involved.

RODES algorithms, from reversing ordinary differential equations systems, decouple the systems of differential equations, reducing the problem to that of revere engineering individual algebraic equations. Using GP involve evaluating the fitness of hundreds of models (computer programs) every generation, in the simulated evolution. Our approach drastically reduces the complexity of the problem and the execution time, because for evaluating the fitness function is not necessary to integrate the ODE system. In addition, the possibility of incorporating common domain knowledge in RODES reduces the structure search space and further speeds up the algorithms.

Other studies, proposing GRN reverse-engineering algorithms based on evolutionary computation, require integration of the ODE systems hundreds or thousands of times for each generation ((Sakamoto & Iba, 2001); (Kikuchi et al., 2003); (Noman & Iba, 2005); (Cho et al., 2006)). Similar methods have been proposed in the past, but most of them require a predefined model structure, however, and are limited to parameter estimation. For example, the S-system model refers to a particular type of ODE system in which the component processes are power-law functions (Savageau, 1976), (Voit, 2000). Despite the elegance and computational simplicity of the S-system model, this formalism has its limitations for biochemical networks (e.g., (Beard, 2004)).

Usually, due to various experimental constraints, essential information is missing from data, and even the most powerful artificial intelligence techniques are not creating information, but just extracting it from data. Missing information from data is an important data mining and artificial intelligence problem, by no means restricted to the problem investigated here. To our knowledge, the problem of missing information from data, in the form of variables or features missing, does not have adequate technical solutions in the data mining and computational intelligence literature. In the present context, not all variables or time-series are simultaneously measured, as it is required to reconstruct the drug gene networks, as systems of ordinary differential equations. RODES algorithms can reveal if some information from the input set is either missing or not related to the output. This is possible because the genetic programming version of RODES (GP RODES) requires the temporal series of all variables of the system to infer an accurate mechanistic model. It also means that it does not discover false input–output relations.

One of the unique features of RODES is its ability to deal with the common but challenging situations of information as variables missing from data. Thinking in a systemic way one can conjecture that, due to the interactions in these networks, information must be implicitly present in the data. Therefore we used some ideas and techniques from control theory, mainly feedback linearization. To automate the algorithm the neural networks counterpart of the conventional feedback linearization (Garces, 2003) was used. Applied to drug gene networks the neural networks version of RODES algorithm (NN RODES) enable and automate the reconstruction of the time-series of the transcription factors, microRNAs, or drug related compounds which are usually missing in microarray experiments.

The tricky solution consists of transforming the modeling problem in a tracking control problem. The measured mRNA temporal series become the desired or reference trajectories.

The problem is to find the control(s) such that the plant output - the solution of the mRNA ODE - tracks the desired trajectory with an acceptable accuracy level. These controls are the missing variables of the (D)GRNs that are identified in this way. To the best of our knowledge, this are the first realistic reverse-engineering algorithm, based on linear GP and NN FBL, for large gene networks including pharmacogenomic variables and interactions, capable to deal with missing information as variables from data.

2. Methods

2.1. (D)GRN Fundamental ODE Patterns and Building Blocks

The rate at which the concentration of a protein changes inside a cell depends mainly on the following:

  1. the rate at which its mRNA is produced and degraded;

  2. the rates at which the mRNA molecules are translated;

  3. the rate at which the protein itself degrades.

Because these are (bio)chemical reactions the corresponding rates are described by (bio)chemical kinetic equations. There are two main frameworks for modeling and simulations (bio)chemical reactions, a deterministic and a stochastic one.

The deterministic modeling is based on the construction of a set of rate equations to describe the biochemical reactions. These rate equations are non-linear ordinary differential equations with concentrations of chemical species as variables. Deterministic simulation produces concentrations by integrating the ODEs. The stochastic modeling involves the formation of a set of chemical master equations with probabilities as variables (van Kampen, 1992) [17]. Stochastic simulation produces counts of molecules of the chemical species as realizations of random variables drawn from the probability distribution described by the chemical master equations. Which framework is appropriate for a given biological system depends on the investigated biological phenomena, and is influenced by the simplifying assumptions of the analysis (Wolkenhauer et al., 2004). The present report is focused on the deterministic approach.

Usually, these rates have the same mathematical form as the pharmacokinetic (PK) blocks describing the drug movement into, within, and out of the body:

  1. Zero order: dX/dt=k , where k is a zero-order rate constant and X is the concentration of the drug;

  2. First order: dX/dt=k·X , where k is a first-order rate constant and X is as above; and

  3. Michaelis–Menten: dX/dt=Vm·X/(Km+X) , where Vm is a maximum rate, Km is the Michaelis constant, and X is as above.

Some of the rates of mRNA production, degradation, and translation are regulated by TFs and microRNAs (miRNA) respectively, in GRN.

In a pharmacogenomic context, new regulatory interactions, or exogenous control factors represented by drugs, are added. We introduced the more general concept of drug gene regulatory networks or drug gene networks in (Floares, 2007b). If the regulation is restricted to transcription factors and microRNAs, the network is a GRN. If the regulation is exerted by transcription factors, microRNAs and by drugs or drugs related compounds, e.g., drug–receptor complexes, the network is a DGRN. Thus, GRN could be considered a subset of DGRN.

The mathematical descriptions of the mechanisms of regulation, by TFs, microRNAs, and drug-related compounds, are the same. They have the form of the most common pharmacodynamic (PD) blocks, describing the relationship between drug doses or concentration and effects:

  1. Linear (stimulation [+] or inhibition [-]) model: E=E0±S·Ce

  2. Log-linear (stimulation [+] or inhibition [-]) model: E=E0±S·log (Ce)

  3. Ordinary (γ= 1) ( or sigmoid (γ 1)Emax (stimulation [+] or inhibition [-]) model: E=E0±EmaxCeγ/(Ceγ+EC50γ)

where E is the effect variable; E0 is the baseline effect; Emax is the maximum drug-induced effect, also called capacity; EC50 (sometimes IC50 [50% inhibitory concentration] is used instead of EC50 for inhibitory effect) are the plasma concentration at 50% of maximal effect, also called sensitivity; S is the slope of the line relating the effect to the concentration; Ce is the concentration to which the effect is related, and γ is the sigmoidicity factor (Hill exponent).

GRN ODE systems models have one ODE for each mRNA, microRNA and protein, corresponding to transcription and translation, respectively. The protein can be a transcription factor too. DGRN ODE systems have some additional equations for the drug related compounds, which can act as transcription factors, e.g., a drug-receptor complex in the cellular nucleus. The corresponding ODE describes the translocation of the drug-receptor complex from cytoplasm to nucleus and its degradation. The ODE systems for DGRN and GRN results from the following:

  1. summing up the pharmacokinetic blocks and

  2. multiplying the rate constants of the regulated processes by pharmacodynamic blocks.

Usually, it is assumed that other processes, such as diffusion and transport, are fast with respect to transcription and translation and may thus be ignored.

In words, the structure of these equations, for any molecular specie, is simple:

Rate of Change =

Production Rate x Production Regulation - Degradation Rate x Degradation Regulation

Thus, the rates of change in a specific mRNA concentration (mRNA), and in the translated product concentration (e.g., a transcription factor, TF, in our case) are


where ksm is the rate at which mRNA is produced and kdm is the mRNA degradation rate constant; kdTF is the TF degradation rate constant, and ksTF is the average TF translation rate constant. Rs and Rd are generic notations for different regulatory factors of mRNA synthesis and degradation, respectively. Usually, Rs represents TFs regulating mRNA synthesis and RsTF represents microRNAs regulating translation and probably drugs related compounds; Rd and RdTF could represent drug-related compounds, e.g., a drug–receptor complex. A regulatory factor Rs,d= 1 indicates no regulation, and an Rs,d having the form of one of the pharmacodynamic blocks indicates the action and the mechanism of action of a regulatory factor.

Equation (1)is a simple description for both mRNA and miRNA rates and their regulation. Equation (2) is a simple description of translation rate and its regulation for any protein, including the special case of transcription factors, were the protein also regulates transcription. It is worth mentioning that, while many molecular species could act as transcription or translation regulators, embedding all these regulatory interactions in a single variable or function is just a highly accurate but first approximation, as our results will show.

Equations (1) and (2) together with the above PK and PD blocks form a fundamental ODE patterns or building blocks of the (D)GRN models. This common domain knowledge, together with the information obtained via the data and knowledge mining approach, can be simply used to reduce the structure search space of the algorithm, and to identify the biochemical mechanisms involved, in the resultant model. As we will show bellow, the information concerning the direction, sign and mechanisms of such interactions can be at least partially extracted from data by RODES algorithms, in a data mining and network discovery from data approach. It possible and very useful to integrate the data mining approach using numbers, with a knowledge mining approach, extracting information from processed published literature databases, using dedicated systems biology software (e.g., IPA™ from Ingenuity, or GeneGo™ from GeneGo).

Three cases, of increasing complexity, are possible for both equation (1)and equation (2), and for simplicity, they will be presented only for equation (1), and for mRNA:

  1. unregulated mRNA transcription and unregulated mRNA degradation,

  2. regulated mRNA transcription and unregulated mRNA degradation, and

  3. regulated mRNA transcription and regulated mRNA degradation.

For unregulated transcription and degradation ( Rs= 1, Rd= 1 ), all variables (mRNAs) are available and one can use GP RODES, the RODES algorithm based on Genetic Programming (see (Floares, 2008) for details) to automatically infer the corresponding ODE.

For regulated transcription and missing information about the TFs or drug related compounds (variable missing), RODES was extended in (Floares, 2008), using neural networks and simulated data. The application of this NN RODES version to real experimental microarray data is a central theme of this contribution. Usually, while the equations’ structure is known – it should be a version of equation (1))and the parameters’ values can be found in the literature or in public databases - only the temporal series of the mRNAs are available from microarray experiments, but not those of the TFs.

It is important to emphasize that this is an important and difficult data mining or knowledge discovery in data problem. Remember that even the most sophisticate artificial (computational) intelligence methods are just extracting information from data and not producing information. Information could be missing from data in two major distinct ways:

  1. some variables from data have missing values, or

  2. some variables are missing from data.

Both missing values and missing variables from data are encountered very frequently in practice. The first one is very easy to indentify just by carefully examining the data. The second one is not so evident, because in the most interesting data mining experiments one does not know exactly the number of relevant variables. Heuristically, missing variables manifests itself by a relatively low accuracy, which is very similar for different algorithms. Without entering into details, a typical situation could be for example a medical data mining problem -one has a dataset of say 150 patients, 10 input variables, and a binary diagnosis output variable, and (almost) no missing values; despite that, the prerequisites for a high accuracy are present, one only obtains say 80% accuracy. More than this, the accuracy is almost the same +/- 3% for all the algorithms tried - neural networks, support vector machines and decision trees - with the best settings for the algorithms. In addition, from the 10 input variables, only 7 proved relevant for the diagnosis problem. In such a situation, it is clear that information is missing from data and this is related to variables missing. To our knowledge, this is the first time a solution based on artificial intelligence is proposed, for the important problem of information missing, in the form of variables missing from data. Our solution allows the reconstruction of missing variables, is accurate, and is by no means limited to the biomedical problems reported here.

2.2. RODES Algorithm: No Missing Information, All variables measured

The goal of the proposed algorithm for reverse engineering is threefold:

  1. to automatically identify the structure of accurate ODE systems models of GRN and DGRN,

  2. to automatically estimate their parameters, and

  3. to identify the biochemical and pharmacological mechanisms involved.

The RODES algorithm starts from complex time-series data. The name of the algorithm is related to its results, not to the biological systems investigated. This is because we successfully applied it to various biological networks: the subthalamopallidal neural network of the basal ganglia (Floares, 2008) and the vascular networks of tumors (work in progress). The result is an ODE system, dX/dt=f(X) .

In the time-series data, at any given discrete time point, t , where t = 1, 2,..., T , dX/dt is equal to f(x) at the same time point t . Equivalently, for any individual ODE of the system, dXi=dt (at t ) = fi(X) (at t ), where i = 1, 2,..., n is the number of variables. Thus, each equations of the ODE system can be reconstructed one by one, via a simple data mining approach, as algebraic relations fi between the inputs X and output dXi/dt . The algorithm can be used for experimental or simulated data. For simulated data, the true structure of the (D)GRN models is known, and this allows a faithful evaluation of the predicted models. We therefore used simulated time-series data to illustrate our algorithm; the accuracy for experimental data is similar, as will be shown for the more difficult problem of missing information (variables) from data (see next section). The RODES version with no missing information from data is based on genetic programming, as a machine learning method, and consists of the following steps:

  1. Compute the time derivative of each variable, dXi/dt , at all discrete time points t

  2. Build input-output pairs, ( Xi;dXi=dt , at the corresponding discrete time points t:

  3. Build training, validation (optional, to avoid overfitting), and testing sets from the input-output pairs.

  4. Initialize a population of randomly generated programs, coding mathematical models relating the inputs Xi to the output(s) dXi=dt

  5. Perform a tournament contest:

  6. Randomly select four programs and evaluate their fitness (mean squared error) - how well they map the input data Xi to the output data dXi=dt

  7. Select two programs as winners and the other two as losers.

  8. Copy the two winner programs and transform them probabilistically by:

  9. Replace the loser programs with the transformed winner programs. The winners of the tournament remain in the population unchanged.

  10. Repeat steps 5(a) - 5(d) until a program is developed that predicts the behavior sufficiently.

  11. Extract the ODE model from the resultant program or directly use it.

Steps 1 - 3 reduce the problem of reversing a system of coupled ODEs, dX/dt=f(X) , in that of reversing individual, decoupled, algebraic equations, dXi/dt=fi(X) . Even though the output is in reality a time derivative, dXi=dt , the algorithm is simply searching for an algebraic equation relating the inputs to the output, at each discrete time point t . The corresponding relation is the predicted function, f^i(X) , for the right-hand side of each differential equation of the system.

This approach drastically reduces the CPU time of the algorithm, by orders of magnitude, because in step 5(a) the fitness evaluation does not require the integration of the ODE system. More precisely, one can use a fitness function based on (e.g., (Spieth et al., 2006)):


where j is the number of programs, n is the number of variables, T is the number of sampling points, X^i(t) is the numerically calculated time course of the variable Xi at time t from the ODE system predicted by the program j , and Xi(t) represents the experimentally or simulated time course of Xi at time t . Therefore, for every programs fitness calculation, at each generation, the ODE system must be numerically integrated. We used a fitness function of the form


where j and T are as above, dX^i(t)/dt is the time derivative at time point t of the variable Xi predicted by the program j, and dXi(t)/dt represents the time derivative at time t of the experimental or simulated variable Xi calculated in step 1 of the algorithm.

While the time needed to integrate a system of ODE seems negligible, during fitness evaluation the integration has to be executed hundreds or thousands of times per generation. These, and the results of our previous studies (Floares, 2005), (Floares, 2006), (Floares, 2007b), suggest that RODES will scale up well, as required by modern high-throughput biomedical techniques.

We used a linear version of a steady-state genetic programming proposed by Banzhaf (see (Brameier & Banzhaf, 2007)) for a detailed introduction, and the literature cited there). In linear genetic programming the individuals are computer programs represented as a sequence of instructions from an imperative programming language or machine language. Nordin introduced the use of machine code in this context (cited in (Brameier & Banzhaf, 2007)). The major preparatory steps for GP consist of determining

  1. the set of terminals (see below),

  2. the set of functions (see below),

  3. the fitness measure (see equation (4)),

  4. the parameters for the run (see below),

  5. the method for designating a result, and

  6. the criterion for terminating a run.

The function set, also called instruction set in linear GP, can be composed of standard arithmetic or programming operations, standard mathematical functions, logical functions, or domain-specific functions.

We used the following Genetic Programming parameter setting:

  • Population size 500

  • Mutation frequency 95%

    • Block mutation rate 30%

    • Instruction mutation rate 30%

    • Instruction data mutation rate 40%

  • Crossover frequency 50%

  • Homologous crossover 95%

  • Program Size 80-128

  • Demes

  • Dynamic Subset Selection

  • Function set {+, -, *, /}

  • Terminal set 64 = j + k

Using simple and common domain knowledge, such as the set of mathematical functions that appear in the models, e.g., arithmetic functions but not trigonometric function, is enough for RODES to find the proper structure of the reconstructed equations, also greatly increasing execution speed. The terminals are variables and parameters. In microarray experiments, the number of mRNAs is usually of the order of 102 or 103 after filtering, but the number of the clusters of genes with similar temporal signatures is small. One needs only to discover this small number of prototype ODE structures.

All the equations have one of these prototype structures, and the equations in the same cluster have the same structure but different parameter values. We still do not know which are the input variables for each mRNA ODE equation. From the fundamental ODE patterns of DGRN, we know that the equations for each mRNA (see equation (1)) contains a synthetic and a degradation term.

The inputs variables for these mRNA equations are

  • the mRNA concentration - in a degradation term proportional with mRNA concentration - for unregulated transcription and degradation,

  • the concentration of a transcription factor (for GRN) and/or of a drug related compound (for DGRN) - in a PD block (Rs in eqn (1))multiplying a PK block (the constant mRNA synthesis) - for regulated transcription and unregulated degradation, and

  • as above but also the concentration of a drug-related compound (for DGRN), contained in a PD block (Rd in eqn (1)) multiplying a PK block (the linear mRNA degradation) - for regulated transcription and regulated degradation.

The RODES version described in this section requires all inputs to be available. This condition is certainly true for the first situation but is usually false for the second and the third. The next section will extend RODES to cope with the second situation.

Because we know the structure of this ODE (see eqn (1)), this is also the route to automate the discovery of the biochemical and pharmacological molecular mechanisms involved. Analyzing the resultant equations, one can easily identify

  1. cellular processes such as syntheses and degradations and their mechanisms as PK blocks,

  2. the presence of regulation and

  3. the regulation mechanisms - by looking at the corresponding PD blocks and at the rate constants they are multiplying.

There are situations in which the PK/PD mechanisms in the resultant mathematical model need to be clarified. When we have the product of two or more constants, in the symbolic form of the model, the algorithm will find only one numerical value. Using elementary domain knowledge, one can easily and clearly identify the PK/PD mechanisms (see (Floares, 2006) for details).

2.3. NN RODES - Neural Network Feedback Linearization for Missing Variable Identification

We focused only on the genes which expression is affected by the synthetic glucorticoid methylprednisolone treatment, the goal being to reverse engineer this drug gene regulatory network. The genes response to glucorticoid treatment can be classified into three categories:

  1. genes stimulated by methylprednisolone,

  2. genes inhibited by methylprednisolone,

  3. genes with biphasic behavior - stimulation followed by inhibition or inverse.

While these three categories can be discriminated even by simple visual inspection of the temporal series of the microarray data, this approach does not entail objective criteria for selection of probes for further consideration. To screen for the probe sets objectively, the entire dataset was filtered with various filters (Almon, et al., 2007). We selected only the genes belonging to the first two aforementioned categories, stimulated and inhibited. For the categories of stimulated/inhibited genes, we tested two mechanisms - linear stimulation/inhibition and ordinary and sigmoid Emax stimulation/inhibition (see the pharmacodynamic blocks in section 2.1).

From the point of view of our theory of drug gene regulatory networks, we investigated the case of regulated mRNA transcription and unregulated mRNA degradation (see equation (1))where the missing variable is either the temporal series of the regulatory transcription factor (in GRN and DGRN) or those of the drug–receptor complex (in DGRN).

This requires the neural networks feedback linearization version of RODES (NN RODES) which can cope with missing information. NN RODES was first introduced in (Floares, 2008) and applied to simulated data in order to test it on equations with known structure and parameters. The tricky solution we proposed in (Floares, 2008) consists in transforming the modeling problem in a tracking control problem:

  • The measured mRNA temporal series becomes the desired or reference trajectory.

  • The ODE with known structure (see equation (1)) and missing variable(s) becomes the plant to be controlled.

  • The missing variables become the control inputs; they are PD blocks incorporated in the position of the regulatory factor Rs of the transcription in equation (1).

The problem is to find the control(s) such that the plant output - the solution of the mRNA ODE - tracks the desired trajectory with an acceptable accuracy, while all the states and the control remain bounded in a physiological range.

It is tempting to speculate that this might be similar to the problem faced by the real living systems during evolution. This idea is corroborated by the fact that regulation appears to evolve on a faster time scale than the coding regions of the genes. For example, related animals, such as mice and humans, have similar genes, but the transcription regulation of these genes is quite different.

Also, an approach like this could offer a rational foundation for gene therapy, based on understanding and controlling (D)GRN, which are complex networks of interactions, instead of the pedestrian prevailing approach based on a ”one gene - one disease” rule.

Feedback linearization can be considered one of the most important nonlinear control design strategies developed in the last few decades (Garces et al., 2003). This approach algebraically transforms a nonlinear dynamic system into a linear dynamic system, by using a static-state feedback and a nonlinear coordinate transformation, based on differential geometric analysis of the system.

Because our goal is to automate the modeling process, we intended to use a computational intelligence version of feedback linearization. The massive parallelism, natural fault tolerance, and implicit programming of neural networks suggest that they may be good candidates. We successfully applied neural network feedback linearization, based on multilayer perceptrons (MLPs), to complex pharmacogenomic systems to find adequate drug dosage regimens (Floares, 2005); (Floares, 2006).

Owing to the reformulation of the modeling problem as a control problem, a NN FBL approach seems adequate and feasible. We used the NARMA-L2 version of input–output feedback linearization (Narendra, cited in (Garces et al., 2003); see also the literature cited in (Garces, et al., 2003), in which the output becomes a linear function of a new control input.

Fortunately, the prerequisites of the approach, represented by the equations’ structure and parameters, are usually known. In the particular situation investigated here the following domain knowledge and features of the experimental design are taking into account:

  1. The equations structure - there is a strong theoretical foundations for building kinetic equations like equations (1) and (2).

  2. The equations parameter - for the control untreated subjects there is no drug regulation (R) in equation 1, supposed at steady state; at time t = 0 the concentration of mRNA is mRNA0 = 1, because all expression profiles are normalized to that of the control subjects, and thus ksm and kdm are easily estimated.

  3. The regulation and its mechanisms

The control input is the unknown regulator: a pharmacodynamic block (see section 2.1) containing the TF concentration in GRN, a drug-related compound, or both in DGRN. In this approach a neural network model of the ”plant” is first identified, even if, as in our situation, the mathematical model of the ”plant” is known. The mathematical model of the ”plant” is simply the equation 1, where only the mRNA synthesis is regulated, and the degradation is not:


As we previously stated, the values of ksm and kdm are usually known. For example, for the ornithine decarboxylase gene we have ksm = kdm = 0.30. It is known from the experiments that the drug stimulates this gene, but we do not know exactly the mechanism and the corresponding formula for Rs. One can try a linear stimulation mechanism, followed by an Emax one if the first failed, these being the most common mechanisms in the order of increasing complexity. Thus, the Rs for these two simulation experiments are:


respectively; the so called basal effect of the drug, E0 = 1, and DR(N) is the drug-receptor complex in the nucleus, the control input which we are trying to find.

A random input control, between zero and the estimated maximal value, is injected into the model at random intervals. The NN model structure is the standard nonlinear autoregressive moving average (NARMA) model, adapted to the feedback linearization of affine systems - the controller input is not contained in the nonlinearity.

We want the system output represented by the mRNA to track a reference trajectory. This reference trajectory could be related to the measured level of genes expression, as it will be shown bellow, or to a therapeutic objective in a gene therapy context, for example. In the last situation, the idea is that we can in a similar way constrain the genes expression, via the drug inputs to follow some desired or normal trajectories, when they are pathologically perturbed.

We need the time-series data of the mRNA of the investigated genes. Usually, the expression levels in microarray experiments are measured at a couple of specific time points, on a small number of subjects for each time point (see (Almon et al., 2005); (Almon et al. 2007) for the particular datasets used in this investigation).

In order to apply the approach previously described, we first calculated the mean expression level for each time point, and then interpolated using a cubic spline interpolant. Because taking the derivative of noisy data can increase the noise, the result of the interpolation is differentiated instead, with respect to time. The number of hidden layers is one for all neural networks. The number of neurons in the hidden layer, of the two MLPs, is between 5 and 9, depending on the complexity of the problem and the results of the simulation experiments. The activation functions are tangent hyperbolic for the hidden layer and linear in the output layer for all NNs. The parameters and their values are the following:

  • Network Architecture

  • Training Data

We investigate the prediction errors by cross-validation on a test set. We used Bayesian regularization (MacKay, 1992), a training function that updates the weight and bias values according to Levenberg–Marquardt optimization. It minimizes a combination of squared errors and weights and then determines the correct combination to produce a network that generalizes well. We start with different random initial conditions to avoid ending in “bad” local minima. In NN FBL the controller is simply a rearrangement of the neural network plant model. The time-series of the missing variables are identified as the control inputs, and the complete equation is thus reconstructed.

Three important problems can be approached by the proposed methods:

  • finding the unknown transcription factor profile in a drug-gene regulatory network, using the measured mRNA profile as a reference trajectory;

  • finding the unknown drug-receptor complex profile in a drug-gene regulatory network, using again the measured mRNA profile as a reference trajectory; and

  • finding the optimal/individualized drug–receptor complex profile, corresponding to the optimal drug dosage regimen, capable of constraining the mRNA profile to track a desired therapeutic objective, in a pharmacogenomic context.

Because of the mechanistic and mathematical similarities between transcription regulation by TFs and by drug–receptor complexes, the example is illustrative for both situations.

3. Results

Most often, the temporal series of the TF or of the drug-receptor complex is not known for regulated genes. This is also true for the temporal series of the mRNA of the TF in proteomics experiments, when one wants to reconstruct the TF ODE (see eqn (2)). In these situations, RODES clearly indicates that information is missing from the input set (see (Floares, 2008)), and one has to use the extension based on neural network feedback linearization.

For illustration we tested the NN RODES algorithm on reconstructing equation (3)for two genes: the ornithine decarboxylase gene and the α-2Macroglobulin gene. The parameters for the two genes are: ksm = kdm = 0.30 for ornithine decarboxylase, and ksm = kdm = 0.038 for α-2Macroglobulin.

There are three temporal series in equation 3:

  1. one for the mRNA - obtained by fitting the mean expression level at each time point of the experiment, using a cubic spline interpolant (see Fig. 1)

  2. one for the dmRNA/dt - obtained by differentiating with respect to time the result of the interpolation (see Fig. 1 )

  3. one for the regulator - the drug-receptor complex in the cellular nucleus, DR(N) , which we are trying to reconstruct with the aid of NN FBL.


Figure 1.

Ornithine decarboxylase mRNA - processed mRNA experimental data fitted with a cubic spline interpolant and the 1st derivative with respect to time (explanation in text).

The settings of the neural networks feedback linearization experiments are presented at the end of section 3, above.

In our previous studies (see (Floares, 2008)) the performance of the identification step of NN FBL was very good, but also somehow expected because the data were simulated. Here, with real microarray time-series data the performance were very high too, and the order of magnitude of the error is 10−3 - 10−4 for both genes (see Fig. 2 and Fig. 3).


Figure 2.

Neural networks feedback linearization model identification for ornithine decarboxylase mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−3 (un-scaled data; explanation in text).


Figure 3.

Neural networks feedback linearization model identification for α2-Macroglobulin mRNA; the NN model have an almost identical output for the same input as the plant (mathematical model), the order of magnitude of the error is 10−4 (scaled data; explanation in text).

Also, the performance of the identification of the drug-receptor complex concentration profile in the nucleus, DR(N), is very good for both genes; the results are shown just for ornithine decarboxylase, those for the α2-macroglobulin being similar (see Fig. 4).


Figure 4.

Neural networks feedback linearization control for ornithine decarboxylase mRNA; the missing temporal-series of drug-receptor complex in the nucleus, DR(N), are reconstructed such as to constraint the mRNA output of the model to follow the measured mRNA (processed). The accuracy of the control is very good (explanation in text).

Thus, NN RODES using neural networks feedback linearization is able to reconstruct with high accuracy a system of ordinary differential equations, modeling (drug) gene regulatory networks, in the very difficult but common situation of having only the time-series of the gene expression levels, while the time-series of the regulators - transcription factors and drug related compounds - are missing.

Because of the very low tracking error, the reference trajectory, which is the measured mRNA profile (after the processing previously described) the mRNA output of the controlled system cannot be distinguished.

4. Conclusion

ODE systems are one of the most sophisticated approaches to modeling gene regulatory networks and drug gene regulatory networks, the superset of GRN we have recently proposed; it is also one of the most difficult. This contribution showed the applications of RODES, our algorithm for reverse-engineering gene networks, based on linear genetic programming and neural networks control by feedback linearization method, to real microarray time-series data (NN RODES), and simulated data (GP RODES).

Common to both neural networks and genetic programming component is the proposed method for decoupling the ordinary differential equations system. This allows reversing the ordinary differential equations of the system one by one. The neural networks component enables RODES to deal with the very difficult but common situation in which only the microarray time-series data are available, but the regulators time-series, transcription factors and drug related compounds, are missing.

Here we focused on the case of regulated mRNA transcription and unregulated mRNA degradation, when either the temporal series of the regulatory transcription factor (in GRN) or those of the drug–receptor complex (in DGRN) are missing. The tricky solution consists of transforming the modeling problem in a tracking control problem. The measured mRNA temporal series becomes the desired, or reference, trajectory. The problem is to find the control(s) such that the plant output - the solution of the mRNA ODE - tracks the desired trajectory with an acceptable level of accuracy. These control inputs are the missing variables that can be identified in this way, thus completing the automatic reconstruction of the ODE equation. To the best of our knowledge, RODES is the only reverse-engineering algorithm based on neural network feedback linearization, that has been applied to reverse engineer gene networks, as a highly accurate system of ordinary differential equations, in the very difficult but common situation of missing information from data, as missing variables - having only the time-series of the gene expression levels, but not the time-series of transcription factors and drug related compounds. In addition, the algorithm is by no means restricted to the biomedical field, automating the ODE modeling of complex time series, even when information is missing from data in the form of variable missing, in any scientific and technical field.


1 - R. R. Almon, D. C. Dubois, J. Y. Jin, W. Jusko, 2005 Y. & Jusko, W. J. 2005. Pharmacogenomic responses of rat liver to methylprednisolone: An approach to mining a rich microarray time series. The AAPS Journal, 7 1 156 194
2 - R. R. . Almon, D. C. Du Bois, W. J. Jusko, 2007 A microarray analysis of the temporal response of liver to methylprednisolone: A comparative analysis of two dosing regimens. Endocrinology, 148 5 2209 2225
3 - M. . Bansal, V. Belcastro, A. Ambesi-Impiombato, D. di Bernardo, 2007 How to infer gene networks from expression profiles. Molecular Systems Biology, 3 78 1 10
4 - D. A. . Beard, H. Qian, J. Bassingthwaighte, 2004 Stoichiometric foundation of large-scale biochemical system analysis, in Modelling in Molecular Biology, ser. Springer Natural Computing Series, G. Ciobanu and G. Rozenberg, Eds. Springer, 1 19
5 - M. Brameier, W. Banzhaf, 2007 Linear Genetic Programming, ser. Genetic and Evolutionary Series. Springer
6 - D. Y. Cho, K. H. Cho, B. T. Zhang, 2006 Identification of biochemical networks by s-tree based genetic programming. Bioinformatics, 22 13 1631 1640
7 - A. G. Floares, 2005 Genetic programming and neural networks feedback linearization for modeling and controlling complex pharmacogenomic systems, in Fuzzy Logic and Applications, 6th International Workshop, WILF 2005, Revised Selected Papers, ser. Lecture Notes in Computer Science, I. Bloch, A. Petrosino, and A. Tettamanzi, Eds., 3849 Crema, Italy: Springer, Sep. 15-17, 178 187
8 - A. G. Floares, 2006 Computational intelligence tools for modeling and controlling pharmacogenomic systems: Genetic programming and neural networks, in Proceedings of the 2006 IEEE World Cogress on Computational Intelligence, G. C. Yen, L. Wang, P. Bonissone, and S. M. Lucas, Eds. Vancouver, CA: IEEE Press, 7510 7517
9 - A. G. Floares, 2007a Reverse engineering algorithm for neural networks applied to the subthalamopallidal network of the basal ganglia, in Proceedings of the International Joint Conference on Neural Networks, Orlando, Florida, USA, August
10 - A. G. Floares, 2007b Automatic reverse engineering algorithm for drug gene regulating networks, in Proceedings of The 11th IASTED International Conference on Artificial Intelligence and Soft Computing, Palma de Mallorca, Spain
11 - A. G. Floares, 2008 Automatic inferring drug gene regulatory networks with missing information using neural networks and genetic programming,” in IEEE World Congress on Computational Intelligence
12 - A. G. Floares, 2008 A reverse engineering algorithm for neural networks, applied to the subthalamopallidal network of basal ganglia. Neural Networks, 21 2-3 , March/April, 379 386
13 - F. R. Garces, V. M. Becerra, C. Kambhampati, K. Warwick, 2003 Strategies for Feedback Linearisation: A Dynamic Neural Network Approach. Springer-Verlag.
14 - T. S. Gardner, J. J. Faith, 2005 Reverse-engineering transcription control networks. Physics of Life Reviews, 2 65 68
15 - H. D. Jong, 2002 Modeling and simulation of genetic regulatory systems: A literature review. Journal of Computational Biology, 9 1 67 103
16 - S. Kikuchi, D. Tominaga, M. Arita, K. Takahashi, M. Tomita, 2003 Dynamic modeling of genetic networks using genetic algorithm and s-system. Bioinformatics, 19 5 643 650
17 - D. J. C. Mac Kay, 1992 Bayesian interpolation. Neural Computation, 4 3 415 447 , [Online]. Available:
18 - N. Noman, H. Iba, 2005 Reverse engineering genetic networks using evolutionary computation. Genome Informatics, 16 2 205 214
19 - E. Sakamoto, H Iba, 2001 “Inferring a system of differential equations for a gene regulatory network by using genetic programming,” in Proceedings of the 2001 Congress on Evolutionary Computation CEC2001. COEX, World Trade Center, 159 Samseong-dong, Gangnam-gu, Seoul, Korea: IEEE Press, 27-30, 720 726 . [Online]. Available:
20 - M. A. Savageau, 1976 Biochemical System Analysis: a Study of Function and Design in Molecular Biology. Reading, MA: Addison-Wesley
21 - C. Spieth, R. Worzischek, F. Streichert, 2006 Comparing evolutionary algorithms on the problem of network inference in GECCO’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation. New York, NY, USA: ACM Press, 305 306 .
22 - N. G. van Kampen, 1992 Stochastic processes in physics and chemistry. North-Holland
23 - E. O. Voit, 2000 Computational Analysis of Biochemical Systems. Cambridge University Press
24 - O. . Wolkenhauer, M. Ullah, W. . Kolch, Cho. Kwang-Hyun, 2004 Modelling and simulation of intra cellular dynamics: Choosing an appropriate framework. IEEE Trans. NanoBioSci., 3 3 200 207
25 - J. Wang, Ed. 2008 IEEE Computational Intelligence Society. Hong Kong: IEEE Press, 1 6 Jun.