Using Matlab and Simulink for High-Level Modeling in Biosystems

Advances in biology, bioinformatics, programming technologies and computer systems have made it possible to store and analyze large amounts of biological data, and to explore and visualize these data globally with sequence browsers. It is now possible to select features in genomic and proteomic data, to identify transcription factors, to analyze RNA-Seq data in order to identify differentially expressed genes, to run common bioinformatics workflows, and to identify copy number variants and SNPs in microarray data. These advances open the possibility of elaborating different categories of models, applicable both to a deeper understanding of biological phenomena (analyzing, interpreting, and even predicting the genotype-phenotype relationship) and to the practical application of research results.


Introduction
Beyond the storage and retrieval of biological data, it is important to study and understand the biochemical processes underlying the existence of living organisms. Such studies are extremely difficult to carry out because of the high cost of experiments and the uniqueness of the phenomena that characterize living organisms.
Living systems are characterized by thousands of simultaneous, highly integrated and intersecting threads of information processing, biochemical reactions and mass transfer processes (metabolic networks, gene regulatory networks, signaling pathways).
High-level modeling of biological systems allows a better understanding of informational workflows and of the control and regulatory mechanisms generated by the biochemical processes involved in the metabolic and signaling pathways of living systems. MATLAB® and SIMULINK® are well suited to describing and modeling the control processes that exist within biosystems. Software systems built with languages such as Python, Perl, or SBML are targeted more at pattern matching in large databases of genomic information, at static representation of the components of a signaling pathway, and at identifying the transcription factors and proteins involved in signaling pathways. In contrast, MATLAB® and SIMULINK® offer the possibility of studying, through computer simulation, the dynamic behavior of the regulatory mechanisms involved in the signaling pathways of living systems.

High-level modeling in biosystems
High-level modeling in biology is more complex than modeling in other fields of science and technology.
The first reason for this complexity is the indeterminacy and uniqueness of biological systems. Another reason, specific to biological systems, is the high level of integration of biochemical processes and signaling pathways.
Signaling pathways do not operate in isolation. The main characteristic of cellular control mechanisms that makes high-level modeling in biosystems more difficult is the extensive cross-talk between signaling pathways. These highly integrated signaling mechanisms act through different effectors (e.g. muscle proteins, secretory vesicles, transcription factors, ion channels and metabolic pathways) in order to control the activity of cellular processes such as development, proliferation, neural signaling, stress responses and apoptosis [1].
Another particularity of biological systems that is difficult to quantify and to represent when modeling metabolic and signaling pathways is that one and the same signaling pathway may control different processes in young cells and in mature cells. For example, the extracellular signal-regulated kinase 1 and 2 (ERK1 and 2) cascade regulates a number of processes at different stages of cellular evolution: proliferation, differentiation, survival, migration, stress responses and apoptosis [2].

Control systems theory and high level modeling in biosystems
There is a multitude of databases worldwide that store information about signaling pathways and the factors involved in biochemical processes, but the results of research in this field highlight that the mechanisms underlying bioprocesses are still not fully elucidated.
Control systems theory may provide an engineering approach to the bioprocesses that characterize living systems.
Applied in systems biology, control systems theory has made it possible to study the dynamic behavior of biological pathways (metabolic pathways, cell signaling pathways) and of the informational workflows defined by biochemical reactions.
Depending on how the evolution laws describing the biological system are expressed, and on the methodology applied, many types of models are used for modeling biological systems:
• Differential models:
  • Models defined by autonomous differential equations
  • Models defined by differential equations with restrictions
  • Models defined by differential equations with delay (dead time)
  • Integro-differential models that include the effects of delay
  • Models defined by partial differential equations
  • Mixed models containing all of the above equations
• Models based on the formalism of point transformations
• Models using automata theory
• Models described by general-purpose modeling languages (Matlab, Simulink)
• Models described by bioinformatics-specific modeling languages [3]

On the other hand, in order to be valid, a model must fulfill the following requirements:
1. The biological behavior of the modeled system must be understood in depth, based on observations and experimental data.
2. The properties of the various equations that constitute the model must be known in depth.
3. The problem must be correctly formulated (well posed) in the Hadamard sense, i.e. it must satisfy the following restrictions:
• The solution should exist
• The solution should be unique
• The solution should depend continuously on the boundary conditions or on the initial conditions

Developing systemic models of bioprocesses shows some particularities compared to other areas of science and technology. Building models in the classical branches of science and technology starts from basic concepts, definitions and laws. For example, in the physico-chemical sciences, modeling chemical reactions starts with basic concepts such as chemical potential, fundamental rate equations such as the diffusion equation, and the basics of electrochemistry such as the Nernst equation. These equations are based on fundamental physical theories and concepts, and contain a defined number of parameters, most of which can be individually measured.
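The point about individually measurable parameters can be illustrated with the Nernst equation, whose inputs (temperature, ion valence, and the concentrations on each side of the membrane) can each be measured separately. The following Python sketch is an illustration only: the function name `nernst_potential` and the potassium concentrations are assumed example values, not data from this chapter.

```python
import math

R = 8.314462618      # gas constant, J/(mol*K)
F = 96485.33212      # Faraday constant, C/mol


def nernst_potential(c_out, c_in, z=1, temp_k=310.15):
    """Equilibrium potential in volts for an ion of valence z.

    c_out and c_in are the extracellular and intracellular concentrations
    (same units); temp_k is the absolute temperature in kelvin.
    """
    return (R * temp_k) / (z * F) * math.log(c_out / c_in)


# Assumed textbook-style values for K+ (mM); purely illustrative.
e_k = nernst_potential(c_out=5.0, c_in=140.0, z=1)
print(f"E_K = {e_k * 1000:.1f} mV")
```

With these assumed concentrations the result is roughly -89 mV. Every quantity in the formula has a clear physical meaning and can be measured on its own, which is the property that most models of signaling pathways lack.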
In the biosciences there are no such stringent concepts, definitions and laws, because of the indeterminacy, uniqueness, complexity and specificity that characterize biological processes. As a result, the parameters or factors involved in signaling pathways are difficult to measure, since experiments on living systems are expensive and cannot be repeated an unlimited number of times, as is possible in other areas of science (physics, mechanics, chemistry, etc.).
In addition, biological systems are characterized by uniqueness: even under identical experimental conditions, two experiments are never absolutely identical, because living systems behave differently.
Another aspect that makes it difficult to develop high-level models of biosystems is the high degree of indeterminacy of living systems, together with the fact that the biological mechanisms underlying the regulation and control processes are not yet fully elucidated.
In the biosciences, computer-assisted models of complex living organisms must include information both on the properties of each component of the system and on their interconnectivity and interdependency.
Research in the life sciences shows that, in spite of impressive bioinformatics databases, not all of the information needed to build a computer model of a whole cell at a detailed subcellular level of description is available yet. Moreover, introducing all the information that characterizes a cell into a single model, so as to integrate all cellular mechanisms, would require a computational effort that is impossible to achieve and manage at this moment.

Signaling pathways in biosystems
Biological systems are complex systems with a hierarchical structure, characterized by a high degree of robustness. The stability of living systems and their adaptability to environmental changes are provided by control mechanisms and feedback loops, coordinated by complex genetic programs.
The genetic programs and algorithms that control the evolution and activity of the entire cell are expressed through the metabolic and signaling pathways. The metabolic and signaling pathways determine all cell processes: cell proliferation, cell differentiation into specific cell types, cell functions and cell apoptosis [1].
Inter- and intra-cellular signaling through biochemical signals (biochemical components) is the most common signaling method in living systems.
The biochemical signal released by a cell is captured by the membrane receptors of the target cell and converted into specific second messengers. The second messengers trigger specific processes in the nucleus, the cytoplasm and the endoplasmic reticulum, leading to DNA transcription and the synthesis of specific proteins.
Proteins are macromolecular organic substances with an essential role both in the cell and in the entire body. In the cell, proteins are classified as:
• Structural proteins, involved in building resistant cell structures
• Functional proteins, which are divided into:
  • permanent functional proteins
  • temporary functional proteins
Permanent functional proteins fulfill different roles inside the cell in which they were synthesized: for example, enzyme proteins with a biocatalytic role, and nucleoproteins, which are part of the genome and are involved in synthesis, in cellular functions and in the transmission of hereditary characters [3].
Temporary functional proteins are present for a limited time in the cell in which they were synthesized. They are released into the extracellular environment, travel through the body, and fulfill their function in the target organs or cellular structures. This category includes hormones, which can be classified into:
• Classical hormones: pituitary hormones, thyroid hormones, etc., produced by certain specialized organs, which circulate through the body and act at great distances.
• Neurotransmitters: acetylcholine, norepinephrine, dopamine, and so on, which are released at the synapses and provide local nerve transmission.
• Local hormones: histamine, plasmakinins, etc., released by cells of various tissues, which ensure tissue and organ homeostasis.
Intensive research has shown that certain substances long considered to be only local hormones also play the role of neurotransmitters, with important implications for the evolution of the human body. This category includes, for example, serotonin and histamine [3]. It has also been shown that a number of classical hormones, such as insulin and adrenaline, fulfill specialized roles as synaptic neurotransmitters, and that at many synapses two or more active substances are released simultaneously, some of which fulfill other roles in the body besides their local neurotransmitter function. Research in the field has also revealed the existence of a genuine diffuse endocrine system, consisting of nerve cells scattered throughout the body and able to secrete a variety of substances involved in the functioning of many tissues and organs (central and peripheral nervous system, digestive system, etc.).
Once these biochemical signals reach their targets, they use a number of cell signaling pathways to control cellular activity (Fig. 1). Some biochemical components of signaling pathways are packaged into vesicles, where they are stored before being released by exocytosis (hormones with a role in neuronal transmission). These stimuli then have different modes of action (juxtacrine, autocrine, paracrine and endocrine), which may be defined by the distance they travel in order to reach their cellular receptor targets. The information transfer mechanisms involved in cell signaling pathways differ according to physiological function and to the biochemical structure of the messenger. Experimental studies have highlighted the following mechanisms for inter- and intra-cellular information transfer: diffusion, direct protein-protein interactions, and covalent modifications (protein phosphorylation, acetylation, nitrosylation) [1].
Experimental studies have also revealed that each cell type has a unique repertoire of cell signaling components, the signalsome, determined by the cell phenotype and expressed during the final stages of cell development in order to control the particular physiology of the cell. Signalsomes are characterized by a high degree of plasticity, being constantly remodeled in order to adjust cell responses to environmental changes. This plasticity provides the cell, and the entire organism, with robust mechanisms for regulating their response to the environment. On the other hand, abnormal remodeling of cellular signalsomes creates signaling defects that have great significance for the onset of many diseases [1].
Moreover, linear signaling cascades involve progressive signal amplification, initiated by ligands and governed by phosphorylation, while proteins associated with other pathways and an array of modifications fine-tune the signal. Cells also use scaffolding proteins, phosphatases, and second messengers to refine complex signaling programs that connect extracellular stimuli to intracellular responses [2].
One way to describe this diversity is to consider the nature of the stimuli that feed into different cell signaling pathways. Some signaling mechanisms are used by many different signaling pathways, whereas other pathways respond to a specific set of stimuli [2].

Signaling pathways and cellular regulating mechanisms
As mentioned above, the cell is characterized by a multitude of control mechanisms.
Signaling pathways do not operate in isolation, and a key element of cellular control mechanisms is the extensive cross-talk between signaling pathways.
The role of control mechanisms is to keep the cell within its normal operating parameters. Some of these mechanisms are presented below.

The ERQC control mechanism
As mentioned above, abnormal remodeling of cellular signalsomes creates signaling defects that have great significance for the onset of many diseases. The Endoplasmic Reticulum-mediated Quality Control (ERQC) mechanism is a protein quality control mechanism integrated with an adaptive stress response. Its role is to ensure that the proteins synthesized by the cell have the correct three-dimensional structure, which is essential for normal cell function [4]. In essence, cellular response and cellular activity consist in the synthesis of the specific proteins required to fulfill the various physiological functions of the cell.
The biochemical structure and spatial geometry of a protein determine whether the protein is in an active or inactive state, and the impact it will have on other protein-specific mechanisms of cellular metabolism. Sometimes, during the synthesis process, some parts of functional proteins may remain unfolded or may be misfolded. Scientific studies reveal that failure to fold into the native structure generally produces inactive proteins or, in some instances, may generate modified or toxic functionality.
The ERQC pathway is characterized by a number of factors located in the endoplasmic reticulum (ER) lumen, in the ER membrane and in the cytosol. The ERQC factors located in the ER lumen detect and retain proteins that the cell has produced with folding aberrations. The factors located in the ER membrane are involved in the retrotranslocation of misfolded polypeptides, and the enzymes located in the cytosol degrade the retrotranslocated proteins.
The integrated stress response (termed ER stress or unfolded protein response, UPR) contains several signaling branches elicited from the ER membrane, which fine-tune the rate of protein synthesis and entry into the ER to match the ER folding capacity.

ER-associated protein degradation (ERAD)
Elimination of misfolded proteins from the ER by the ERQC program counteracts the production of aberrant proteins from various folding mishaps. Misfolded proteins are exported from the ER and subsequently destroyed by the ubiquitin-proteasome system in the cytosol, by a process called retrotranslocation or ER-associated protein degradation (ERAD) [5]. The systemic model of the ERQC pathway is presented in Fig. 3.

Cellular mechanisms for protein degradation
Protein synthesis processes are subject to interference from both the intracellular and the extracellular environment. The effects of intracellular and extracellular disturbances materialize as aberrations in the structure and folding of proteins. There are two mechanisms used to eliminate nonfunctional proteins or proteins with structural aberrations, as well as other subcellular components: proteasomal degradation and autophagy.

Proteasomal degradation
The 26S proteasome is the major degradation machinery in the cell for dysfunctional or damaged proteins [5].
Proteasomal degradation also regulates a variety of cellular processes, such as cell cycle progression. The proteasome is a large multi-subunit enzyme composed of two sub-complexes, the 19S regulatory complex and the 20S proteolytic complex [5].

Autophagy
Autophagy is a lysosomal degradation pathway that degrades damaged or superfluous cell components into basic biomolecules, which are then recycled back into the cytosol. In this respect, autophagy drives a flow of biomolecules in a continuous degradation-regeneration cycle. Autophagy is a non-selective, bulk degradation pathway and it is generally considered a pro-survival mechanism protecting cells under stress or poor nutrient conditions. Current research clearly shows that autophagy fulfills numerous functions in vital biological processes.
It is implicated in development, differentiation, innate and adaptive immunity, ageing and cell death [5].

Specialized forms of autophagy, such as lipophagy and xenophagy, have also been described, but the mechanisms through which the autophagic machinery regulates these diverse processes are not entirely understood [6].
If each individual perturbation that acts on a signaling pathway is denoted Z_i', where i = 1, ..., n, their cumulative effect is expressed relative to I(s), the input biochemical signal, and defines the perturbation transfer function.
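As an illustration only, and assuming that the individual perturbations superpose linearly in the Laplace domain (an assumption, since other ways of combining them are possible), the cumulative perturbation and a corresponding perturbation transfer function could be written as follows. The symbols Z'(s) and H_Z(s) are hypothetical names introduced for this sketch.

```latex
Z'(s) = \sum_{i=1}^{n} Z_i'(s), \qquad
H_{Z}(s) = \frac{Z'(s)}{I(s)}
```

Under this assumption, each perturbation can be characterized separately and its contribution added to the block diagram as an extra summing input.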

Current state of the art in high-level modeling of biosystems
Presently, a variety of software systems are available for high-level modeling of biosystems, developed in miscellaneous programming languages: Python, Mathematica, Perl, SBML (Systems Biology Markup Language), Matlab®. The vast majority of these systems are oriented towards pattern recognition in large volumes of genomic and proteomic data, the identification of transcription factors, RNA sequence analysis, modeling of the biochemical factors involved in the generation and transmission of information in signaling pathways, and other similar activities that can be classified as static modeling of bioprocesses.
Mathematica (Wolfram Mathematica) [7] is a dedicated software system built around a multi-paradigm programming language, with a huge built-in library of both general-purpose and highly specialized functions. Programming in Mathematica consists of finding the right combination of primitives. It works with Element Data, Chemical Data, Particle Data, Genetic Data and Protein Data [8]. In contrast to Matlab, Mathematica does not offer a module with features similar to those of Simulink.
Python is an interpreted programming language [9] with good numerical support, provided by the Numerical Python (numpy) package. It also makes it possible to define specific bioinformatics functions for tasks such as data management, file parsing, string processing, and interaction with databases. Models designed in Python are applied to pattern matching or pattern identification in genomic and proteomic data. Python makes a distinction between matching and searching, which other languages do not: matching looks only at the start of the target string, whereas searching looks for the pattern anywhere in the target [10].
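The distinction between matching and searching can be seen in a short example using Python's standard `re` module. The DNA fragment and the TATA-box-like motif below are hypothetical values chosen for illustration.

```python
import re

# Hypothetical DNA fragment; "TATAAA" is used as a TATA-box-like motif.
sequence = "GGCTATAAAGGCTATAAACC"
motif = re.compile(r"TATAAA")

# match() only succeeds if the motif occurs at position 0.
at_start = motif.match(sequence)

# search() scans the whole string and returns the first occurrence.
first_hit = motif.search(sequence)

# finditer() reports every non-overlapping occurrence.
all_hits = [m.start() for m in motif.finditer(sequence)]

print(at_start)            # None: the sequence starts with "GGC"
print(first_hit.start())   # 3
print(all_hits)            # [3, 12]
```

This kind of scan locates where a motif occurs in a sequence, which is a static description; it says nothing about how the pathway behaves over time.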

Using MATLAB® and SIMULINK® for high-level modeling in biosystems
Biosystem models developed using languages such as Python, Perl, or SBML are targeted more at pattern matching in large databases of genomic information, at static representation of the components of a signaling pathway, and at identification of the transcription factors and proteins involved in signaling pathways. In contrast, MATLAB® and SIMULINK® provide the possibility of studying, through computer simulation, the dynamic behavior of the regulatory mechanisms involved in the signaling pathways of living systems.
During the last decade the use of MATLAB® has increased steadily in scientific and academic institutions, as well as in several branches of industry that deal with topics ranging from economics to spacecraft orbit simulations and bioinformatics.
The MATLAB® and SIMULINK® software package and the additional specific toolboxes have proven to be very efficient and robust for numerical data analysis, modeling, programming, simulation and computer graphic visualization.
SIMULINK® is an important extension of MATLAB® that greatly simplifies computation and model building. SIMULINK® is a modeling environment in which systems are represented as block diagrams, which are most often a convenient way to show process actions and interactions. SIMULINK® provides a set of predefined blocks that can be combined in order to create a detailed block diagram of the studied biological process, system or sub-system. The dynamic model of a biological system may be created by dragging blocks from the block libraries into a new window, connecting them, and running the model. In order to implement complex mathematical functions or algorithms that describe the dynamic behavior of the modeled biological system, SIMULINK® can interface with C, Fortran, MATLAB® m-file scripts and Java™ applications. The developed algorithms and custom interfaces can be deployed as standalone applications, MATLAB® algorithms can be converted into Microsoft® .NET or COM components that can be accessed from any COM-based application, and Microsoft Excel® add-ins can be created [11].
The computation underlying SIMULINK® models is handled by the set of solvers included in the MATLAB® package. The "From Workspace" block can be used to input MATLAB® data to a SIMULINK® model.
Main features of the Bioinformatics Toolbox:
• Next Generation Sequencing analysis and browser
• Sequence analysis and visualization, including pairwise and multiple sequence alignment and peak detection
• Microarray data analysis, including reading, filtering, normalizing, and visualization
• Mass spectrometry analysis, including preprocessing, classification, and marker identification
• Phylogenetic tree analysis
• Graph theory functions, including interaction maps, hierarchy plots, and pathways
• Data import from genomic, proteomic, and gene expression files, including SAM, FASTA, CEL, and CDF, and from databases such as NCBI and GenBank [12]

The Matlab Bioinformatics Toolbox™ is a library of functions and algorithms dedicated to the analysis and processing of bioinformatics data. It provides algorithms and specific applications to explore and visualize these data with sequence browsers, spatial heatmaps, and clustergrams; to analyze whole genomes while performing calculations at base-pair resolution; to impute values for missing data; to select features for genomic and proteomic data; to identify transcription factors; to analyze RNA-Seq data to identify differentially expressed genes; to run common bioinformatics workflows; to identify copy number variants and SNPs in microarray data; to normalize microarray data with several methods, including lowess, global mean, median absolute deviation (MAD), and quantile normalization; and to classify protein profiles using mass spectrometry [11].
The toolbox also makes it possible to manipulate and analyze DNA or RNA sequences. DNA or RNA sequences can be converted to amino acid sequences using the genetic code, statistical analysis can be performed on the sequences, and specific patterns can be searched for within a sequence. The toolbox also provides protein sequence analysis techniques, including routines for calculating properties of a peptide sequence such as atomic composition, isoelectric point, and molecular weight. It is possible to determine the amino acid composition of protein sequences, to cleave a protein with an enzyme, and to create backbone plots and Ramachandran plots of PDB data.
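To make two of the sequence operations described above concrete (conversion to amino acids with the genetic code, and amino acid composition), here is a minimal pure-Python sketch. It is not the Bioinformatics Toolbox API: the function names `translate` and `composition` and the example sequence are hypothetical, and only the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    RNA input is accepted by mapping U to T; translation stops at the
    first stop codon, and trailing bases that do not form a codon are ignored.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def composition(protein):
    """Count how often each amino acid occurs in a protein sequence."""
    return Counter(protein)


peptide = translate("ATGGCCAAGGCCTAA")  # hypothetical sequence
print(peptide)
print(composition(peptide))
```

For the example sequence the peptide is `MAKA`, with alanine occurring twice. Like pattern matching, these operations describe the sequence statically and do not capture the dynamics of the pathway in which the protein takes part.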

Using MATLAB® and SIMULINK® for modeling the dynamical behavior of signaling pathways
In contrast with other systems dedicated to analysis, data processing and modeling in the field of bioinformatics, MATLAB® and SIMULINK® offer not only genome-wide data analysis, analysis of protein databases, protein pattern recognition, identification and classification of genes and DNA sequences, and the highlighting of signaling pathways and the factors involved, but also the possibility of analyzing inter- and intra-cellular processes from the points of view of engineering systems theory and automatic regulation. MATLAB® and SIMULINK® enable researchers to understand the dynamic behavior of signaling pathways in the specific regulatory mechanisms of bioprocesses.
Because biological systems are highly complex structures and the mechanisms that control signaling pathways are not very well known, the multilevel hierarchical models built using MATLAB® and SIMULINK® can be developed to describe only certain categories of functions or biological processes, or the whole system or subsystem, using the black box model for the processes or sub-processes that are not very well known.
Considering the above issues, it is imperative when designing the model of a biological system to make the block diagrams resemble the logical flow of information through the biological system.
Additionally, biological systems are characterized by extremely rapid adjustment and response mechanisms to disturbing factors from the environment. From this point of view, SIMULINK® offers enhanced capabilities to capture time-based events with high fidelity, which is essential for the study of the processes that ensure the stability and adaptability of biological systems over time.
The overall goal in designing a high-level model for biosystems is to use the principle of feedback loops so that the controlled variable follows a desired command variable accurately, regardless of the path of the command variable, and so that the effect of any external disturbances or changes in the dynamics of the system is minimized. Designing such a control structure is a relatively complex task if the basic requirements listed below must be met:
• The minimum requirement is that the closed loop is stable.
• The disturbances should be rejected, or they must have only a small influence on the controlled variable.
• The controlled variable should track the command input as precisely and as fast as possible.
• The closed loop should be as insensitive as possible to changes in the plant parameters.
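The requirements above can be illustrated with a minimal Python sketch of a feedback loop, written here instead of a Simulink block diagram so that it is self-contained. It assumes a first-order plant, a PI controller and a constant disturbance switched on partway through the run; the function name and all parameter values are hypothetical and are not taken from the model in this chapter.

```python
def simulate_closed_loop(kp=2.0, ki=1.0, tau=1.0, setpoint=1.0,
                         disturbance=0.5, t_disturb=10.0,
                         t_end=30.0, dt=0.001):
    """Forward-Euler simulation of a PI loop around a first-order plant.

    Plant:      tau * dy/dt = -y + u + d(t)
    Controller: u = kp * e + ki * integral(e), with e = setpoint - y
    A constant disturbance d is switched on at t_disturb.
    """
    y, integral, t = 0.0, 0.0, 0.0
    history = []
    while t < t_end:
        error = setpoint - y
        integral += error * dt
        u = kp * error + ki * integral
        d = disturbance if t >= t_disturb else 0.0
        y += dt * (-y + u + d) / tau
        t += dt
        history.append((t, y))
    return history


trace = simulate_closed_loop()
final_time, final_output = trace[-1]
peak_after_disturbance = max(y for t, y in trace if t >= 10.0)
print(f"output at t = {final_time:.0f}: {final_output:.3f}")
print(f"largest output after the disturbance: {peak_after_disturbance:.3f}")
```

In this sketch the loop is stable, the output tracks the set point, and the disturbance causes only a temporary deviation before the integral action removes it. Changing `tau` while keeping the controller fixed gives a quick check of sensitivity to plant parameters.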
Based on the system model shown in Figure 1, MATLAB® and SIMULINK® were used to build the following model for studying the dynamic behavior of intracellular regulating processes. The model details the membrane receptors and highlights how the specific binding of a biochemical compound to the membrane receptor generates the dynamic response of intracellular effectors and regulatory mechanisms.
Delay elements correspond to the bioprocesses occurring in the membrane when the biochemical signaling structure (biochemical factor) binds to the specific receptor (specific factor receptor) in the target cell membrane.
Feedback control loops correspond to the processes that recognize aberrant protein structures and folding aberrations, and to their destruction processes in the mitochondria.
The simulation results for different parameters are presented in Figures 5 (a, b, c, d).
The simulations used the 5th-order Runge-Kutta method (Fig. 5 a, c) and the 3rd-order Runge-Kutta method (Fig. 5 b, d).
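The effect of the solver order can be sketched in Python on a simple test problem. The sketch assumes the Bogacki-Shampine third-order and Dormand-Prince fifth-order formulas (the pair behind Simulink's fixed-step ode3 and ode5 solvers); the text does not state which Runge-Kutta variants were used, so this choice is an assumption. The first-order lag used as the test system is hypothetical.

```python
import math


def rk3_step(f, t, y, h):
    """One step of the Bogacki-Shampine third-order Runge-Kutta formula."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + 3 * h / 4, y + 3 * h * k2 / 4)
    return y + h * (2 * k1 + 3 * k2 + 4 * k3) / 9


def rk5_step(f, t, y, h):
    """One step of the Dormand-Prince fifth-order Runge-Kutta formula."""
    k1 = f(t, y)
    k2 = f(t + h / 5, y + h * k1 / 5)
    k3 = f(t + 3 * h / 10, y + h * (3 * k1 + 9 * k2) / 40)
    k4 = f(t + 4 * h / 5, y + h * (44 * k1 / 45 - 56 * k2 / 15 + 32 * k3 / 9))
    k5 = f(t + 8 * h / 9, y + h * (19372 * k1 / 6561 - 25360 * k2 / 2187
                                   + 64448 * k3 / 6561 - 212 * k4 / 729))
    k6 = f(t + h, y + h * (9017 * k1 / 3168 - 355 * k2 / 33
                           + 46732 * k3 / 5247 + 49 * k4 / 176
                           - 5103 * k5 / 18656))
    return y + h * (35 * k1 / 384 + 500 * k3 / 1113 + 125 * k4 / 192
                    - 2187 * k5 / 6784 + 11 * k6 / 84)


def integrate(step, f, y0, t_end, h):
    """Apply a fixed-step method from t = 0 to t_end and return y(t_end)."""
    n_steps = round(t_end / h)
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y


# Hypothetical first-order lag: TAU * dy/dt = u - y, unit step input u = 1.
TAU = 0.5


def lag(t, y):
    return (1.0 - y) / TAU


exact = 1.0 - math.exp(-2.0 / TAU)
err3 = abs(integrate(rk3_step, lag, 0.0, 2.0, 0.2) - exact)
err5 = abs(integrate(rk5_step, lag, 0.0, 2.0, 0.2) - exact)
print(f"RK3 error: {err3:.2e}, RK5 error: {err5:.2e}")
```

With the same step size, the fifth-order method gives a much smaller error than the third-order one on this test problem, at the cost of six function evaluations per step instead of three.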
By adjusting the parameters of the transfer functions, the entire dynamic behavior of the model may be adjusted.
From the point of view of dynamic behavior, the simulation results revealed that, after the transient regime has passed, the model shows a stable behavior, similar to the behavior of living systems observed in experimental data (pSOS, pJAK2).
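The qualitative behavior described above (a transient followed by a stable regime) can be reproduced with a very small Python sketch that combines the two ingredients of the model: a delay element and a negative feedback loop. This is not the Simulink model of the chapter. The structure, the function name `simulate_pathway` and all parameter values are assumptions chosen for illustration.

```python
from collections import deque


def simulate_pathway(gain=4.0, feedback=0.5, tau=1.0, delay=0.5,
                     t_end=40.0, dt=0.001):
    """Euler simulation of a delayed first-order block with negative feedback.

    tau * dy/dt = -y(t) + gain * (u - feedback * y(t - delay)), with u = 1.
    The delay stands in for receptor binding, the feedback term for a
    regulatory loop; all values are assumed for illustration.
    """
    n_delay = round(delay / dt)
    past = deque([0.0] * n_delay, maxlen=n_delay)  # y over the last `delay` seconds
    y = 0.0
    outputs = []
    for _ in range(round(t_end / dt)):
        y_delayed = past[0]          # oldest stored value, y(t - delay)
        past.append(y)               # store y(t); drops the oldest entry
        error = 1.0 - feedback * y_delayed
        y += dt * (-y + gain * error) / tau
        outputs.append(y)
    return outputs


outputs = simulate_pathway()
steady_state = 4.0 / (1.0 + 4.0 * 0.5)   # gain / (1 + gain * feedback)
print(f"peak output: {max(outputs):.3f}")
print(f"final output: {outputs[-1]:.3f} (expected steady state {steady_state:.3f})")
```

With these assumed values the output overshoots during the transient and then settles at the steady state gain / (1 + gain * feedback). Changing `gain`, `feedback` or `delay` changes the shape of the transient, which mirrors the role of the transfer function parameters in the model.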
The hierarchical Simulink model of the cellular signaling pathways is presented in Fig. 6.
In this model the regulating processes of the nucleus and the control mechanisms of the mitochondria are detailed. The simulation outputs are presented in Figures 7 (a, b).
The differences between Figures 5 (a, b, c, d) and 7 (a, b), which show the shape of the model output signal, indicate that detailing the model at the subcellular level and properly adjusting the transfer functions that describe each subcellular component ensure a high accuracy of the model.

Conclusions
Living systems are complex systems, characterized by a multitude of control mechanisms. Signaling pathways do not operate in isolation, and a key element of the control mechanisms of living systems is the extensive cross-talk between the signaling pathways that control all biological regulating mechanisms.
High-level modeling in biosystems consists in modeling the signaling pathways.
Experimental studies have revealed that each cell type has a unique repertoire of cell signaling components, the signalsome, determined by the cell phenotype and expressed during the final stages of cell development in order to control the particular physiology of the cell. Signalsomes are characterized by a high degree of plasticity, being constantly remodeled in order to adjust cell responses to environmental changes.
High-level modeling in biosystems is a laborious process for a number of reasons: the uniqueness of the behavior of living systems, insufficient knowledge of bioprocesses, the high cost of experiments and the impossibility of repeating them under absolutely identical conditions, the large number of unknowns involved in the models, and the extensive cross-talk between signaling pathways.
MATLAB® and SIMULINK® provide the possibility of studying, through computer simulation, the dynamic behavior of the regulatory mechanisms involved in the signaling pathways of living systems.