Cross-Talk Categorisations in Data-Driven Models of Signalling Networks: A System-Level View

Data-driven models of signalling networks are becoming increasingly important in systems biology because they reflect the dynamic patterns of signalling activities in a context-specific manner. State-of-the-art approaches for categorising and detecting signalling cross-talks may not be suitable for such models since they rely on static topologies of cell signalling networks and prior biological knowledge. In this chapter, we review state-of-the-art approaches that categorise all possible cross-talks in signalling networks and propose a novel categorisation specific to data-driven network models. Considering such models as undirected networks, we propose two categories of signalling cross-talks between any two given signalling pathways. In a Type-I cross-talk, a signalling link {g_i, g_j} connects two signalling pathways, where g_i and g_j are signalling nodes that belong to two distinct pathways. In a Type-II cross-talk, two signalling links {g_i, g_j} and {g_j, g_k} meet at the intersection of two signalling pathways at a shared signalling node g_j. We compare our categorisation with others and find that all the types of cross-talks defined by those approaches can be mapped to Type-I and Type-II cross-talks when the underlying signalling activities are considered as non-causal relationships. Next, we provide a simple but intuitive algorithm called XDaMoSiN (cross-talks in data-driven models of signalling networks) to detect both Type-I and Type-II cross-talks between any two given signalling pathways in a data-driven network model. By detecting cross-talks in such network models, our approach can be used to analyse and decipher latent mechanisms of various cell phenotypes, such as cancer or acquired drug resistance, that may evolve due to the highly adaptable and dynamic nature of signal transduction networks.


Introduction
A signal transduction network is a collection of all cell signalling pathways where each pathway is a series of biochemical events, transmitting input signals from receptor proteins to intracellular target proteins (e.g., transcription factors). The outcomes mediated by signalling pathways include various cellular activities such as cell growth, proliferation, differentiation, migration, adhesion, and apoptosis [1,2]. Interactions among distinct signalling pathways are called signalling cross-talks and may also play vital roles in mediating or modulating cellular activities [3] under different disease-related cell conditions such as cancer and acquired drug resistance.
Models of signal transduction networks often take a qualitative approach that relies on prior biological knowledge obtained from experimental findings in various cell lines [4,5]. However, the pattern of cell signalling activities is not static and can vary in different cell lines [4,5]. Moreover, different cell lines for which the underlying network architectures of signalling activities are conserved may yield different responses even in similar experimental settings [5]. In the same cell, different ligands can produce different signalling connections [5,6]. Moreover, different drugs and different treatment conditions may also induce different signalling dependencies and thus create a dynamic re-wiring in the signalling network topology [6][7][8]. Therefore, understanding a signalling network topology demands a data-driven modelling approach in order to reflect its context-specific nature in a particular cell type, and a particular experimental configuration. Here, data-driven models of signalling networks are models in which network edges are inferred solely based on signalling data [4] using machine learning approaches such as least square regression [9], Bayesian networks [10][11][12], and time-lag correlation [13]. In contrast, static models of signalling networks are based on canonical signalling mechanisms obtained from the literature [4]. Recent advancements in high-throughput data generation techniques facilitate the quantification of signalling responses, thereby producing large volumes of data measuring protein abundances and activities [4].
Detecting signalling cross-talks using data-driven models of signalling networks is an important task in systems biology since such cross-talks may reveal novel mechanistic details underlying perturbed cellular conditions. Receptor tyrosine kinase (RTK) heterodimerisation is one form of signalling cross-talk (also known as receptor function cross-talk) [14], which has been reported to be involved in tumourigenesis and in the development of acquired drug resistance in many cancers [6]. Usually, epidermal growth factor receptor (EGFR) strongly activates extracellular signal-regulated kinase (ERK) signalling, but it is only a weak activator of the phosphatidylinositol 3-kinase (PI3K) signalling pathway. Interestingly, when EGFR cross-talks with human epidermal growth factor receptor 2 (HER2) through heterodimerisation, it activates both signalling pathways significantly [15], thereby contributing to tumourigenesis by stimulating proliferation and preventing cell death [6]. In another example, the expression of the RTK AXL was found to be a mechanism of acquired resistance to EGFR inhibitors [16], and AXL is found to be transactivated by EGFR through heterodimerisation (cross-talk) [6].
In this chapter, we review existing approaches that have been used in the literature to categorise cross-talks in signalling networks. However, all these methods are limited in application to static models of signalling networks and cannot be used to categorise cross-talks when the types of signalling activities (e.g., reaction, catalysis, or inhibition) are not known. We therefore introduce a novel cross-talk categorisation for a single cell model to resolve such issues. We also compare our categorisation with the existing approaches. Finally, we present an algorithm to computationally detect all signalling cross-talks that are included in our proposed categorisation. Natarajan et al. [17] reported that a global analysis of both known and novel cross-talks can reveal system-level insights into context-dependent signalling: many ligand stimuli converge on a relatively small number of signalling molecules to produce unique responses. Thus, we hypothesise that our approach will be useful to elucidate similar novel system-level aspects of signalling networks derived from context-specific signalling data through the identification of cross-talks.

Existing methods for categorising cross-talks
Only a few studies have attempted to categorise types or modes of cross-talks between two signalling pathways [6,14,18]. In reviewing signalling cross-talks between transforming growth factor-β/bone morphogenic protein (TGF-β/BMP) and other signalling pathways, Guo and Wang [18] distinguished three different modes of signalling cross-talks. According to that study, two pathways, pathway 1 and pathway 2, cross-talk when (1) some component of pathway 1 physically interacts with some component of pathway 2 (Mode-A), (2) some component of pathway 1 plays a role as an enzymatic or transcriptional target of some component of pathway 2 (Mode-B), or (3) signals from pathway 1 modulate or compete for a key modulator or mediator protein that is shared between pathway 1 and pathway 2 (Mode-C). Donaldson and Calder [14] proposed five types of signalling cross-talk between any two signalling pathways, pathway 1 and pathway 2. They are as follows:
• Signal-flow cross-talk: an alternative reaction that enhances the signalling in pathway 1 by producing, or catalysing, or inhibiting the production of a protein mediated by the signalling of pathway 2 . For example, there exists signal-flow cross-talk between mitogen-activated protein kinase (MAPK) and integrin signalling pathways [19], where the increased rate of activation of some key protein in the integrin pathway is mediated by signalling through the MAPK pathway.
• Receptor function cross-talk: an alternative reaction to activate/inhibit the receptor of pathway 1 by some enzyme of pathway 2 without the need of a ligand (a protein that activates a receptor protein). For example, oestrogen receptor may become activated without the need of oestrogen ligand by other signalling pathways [20].
• Gene expression cross-talk: a component (typically, a protein) of pathway 1 inhibits or modifies the transcription or protein production of genes in pathway 2 . For example, transcription factor glucocorticoid receptor (GR) of hormone signalling pathways translocates to the nucleus and inhibits the transcriptional activities of the transcription factor nuclear factor-κB (NF-κB) that is activated in response to inflammatory stimuli and environmental stressors [21].
• Substrate availability cross-talk: pathway 1 and pathway 2 share a protein (or a set of proteins) and both pathways compete for the activation of the shared protein(s). For example, two MAPK pathways in the yeast S. cerevisiae that share the mitogen-activated protein kinase kinase kinase (MAPKKK) protein STE11 (Sterility gene 11) and possess homologous mitogen-activated protein kinase kinase (MAPKK) and MAPK proteins compete for the activation of the MAPK cascade [22].
• Intracellular communication cross-talk: the gene products of pathway 1 act as ligands for the receptor of pathway 2 . For example, TGF-β and Wnt (Wingless-related integration site) signalling regulate the production of ligands of one another [18].
Donaldson and Calder [14] also reviewed some computational models that deal with cross-talks between specific pathways, including the MAPK pathway, AKT pathways, and protein kinase C (PKC) pathways. These models [22][23][24] use ordinary differential equations (ODEs) in which the notion of cross-talk is part of the system of equations, without any explicit way of detecting or categorising it [14].
Kolch et al. [6] described three types of cross-talks: heterodimerisation between signalling proteins, node sharing, and competition for nodes. Signalling protein heterodimerisation is a biochemical process in which a protein complex is formed by two different macromolecules, and RTK heterodimerisation is a common example of this type of cross-talk [6]. For example, EGFR heterodimerisation with ErbB2 (erythroblastic leukaemia viral oncogene B2, also known as HER2) or ErbB3 (erythroblastic leukaemia viral oncogene B3, also known as HER3, human epidermal growth factor receptor 3) activates both ERK and PI3K signalling pathways [15] and thereby mediates proliferation and cell survival signals in tumourigenesis [6]. In another example, the transactivation of AXL (an RTK) is caused by EGFR heterodimerisation, and the expression of AXL was found to be a mechanism of resistance to EGFR inhibitors [16].
An example of node (i.e. protein) sharing cross-talk is the scaffolding protein (a protein that binds with multiple members of a signalling pathway) GRB2-associated binding partner (GAB), which is shared by two signalling pathways: the EGFR and insulin receptor (IR) pathways [25]. Lastly, an example of cross-talk in the form of competition for nodes (i.e. proteins) was recently identified, consisting of a switch-like coordination between proliferation and apoptotic signalling through rapidly accelerated fibrosarcoma (RAF)-ERK signalling and mammalian STE20-like protein kinase (MST2) signalling [26]. In mammalian cells, rapidly accelerated fibrosarcoma1 (RAF1) inhibits MST2-induced apoptosis (promotes proliferation) [27], whereas Ras association domain-containing protein 1A (RASSF1A) activates MST2 (promotes apoptosis) [28]. Romano et al. [26] showed that this signalling coordination is switch-like, since MST2 binds mutually exclusively with its inhibitor RAF1 and activator RASSF1A by changing its binding affinities from low to high.
Identifying the above cross-talk categories requires prior biological knowledge of the nature of signalling links. An essentially static model of signal transduction networks is thus assumed. However, in data-driven models of signalling networks, connectivity among signalling nodes may differ from cell to cell [6]. In order to reveal novel signalling dynamics in cell-specific, ligand-specific, or treatment-specific contextual data, we define a novel cross-talk categorisation in the following section.

Approaches for inferring data-driven signalling networks
Although our main focus in this chapter is to propose a cross-talk categorisation, here we briefly mention some approaches that fit data-driven models of signalling networks to quantitative signalling datasets. High-throughput proteomics techniques that quantitatively measure phosphorylation activities of phosphoproteins (signalling proteins) include mass spectrometry, flow cytometry, ribonucleic acid interference (RNAi) screening, and reverse-phase protein array (RPPA) [13,29]. Apart from proteomics data, some approaches use gene expression measurements of phosphoproteins as a proxy for protein expression (i.e. protein activity) [30][31][32] in order to fit data-driven models of signalling networks. Inference methods model either causal [9-12, 29, 33] or non-causal (simple correlation) relationships [13,34] among phosphoproteins. To identify causal relationships in a signalling network topology, various approaches have been applied, such as least square regression [9], various models based on Bayesian networks [10][11][12] and dynamic Bayesian networks [29], and maximum entropy [33]. Correlation-based approaches include measuring the simple Pearson correlation [34] and time-lag correlation [13]. The rationale behind applying such simple correlation-based approaches to infer signalling network structure is that individual signals may co-vary with respect to one another [4]. Figure 1 presents a schematic diagram of a possible framework that can use our proposed cross-talk detection algorithm to find cross-talks in data-driven models of signalling networks.
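To make the non-causal case concrete, the following is a minimal sketch of how an undirected edge set could be obtained by thresholding absolute Pearson correlations. It is not the procedure of any of the cited studies: the function names, the threshold of 0.8, and the toy activity profiles are all assumptions made purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / (sx * sy)


def infer_undirected_links(profiles, threshold=0.8):
    """Return links {g_i, g_j} whose |correlation| reaches the threshold.

    `profiles` maps a node name to its measured activity profile.
    The threshold is an illustrative assumption, not a recommended value.
    """
    links = set()
    for gi, gj in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[gi], profiles[gj])) >= threshold:
            links.add(frozenset((gi, gj)))
    return links


# Toy profiles invented for this example only (not measured data).
profiles = {
    "EGFR": [0.1, 0.4, 0.8, 1.0],
    "ERK": [0.2, 0.5, 0.9, 1.1],
    "AKT": [0.9, 0.1, 0.7, 0.2],
}
links = infer_undirected_links(profiles)
```

Each link is stored as a frozenset so that {g_i, g_j} and {g_j, g_i} denote the same undirected edge, which matches the undirected view of the network taken in this chapter.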

Proposed cross-talk categorisation
In order to generalise our cross-talk categorisation for both causal and non-causal network models, we consider a signalling network as an undirected network. Let G(V,E) be an undirected graph that represents an entire signalling network containing a set of signalling pathways, where V is a set of n signalling components (typically proteins or protein complexes, denoted g_i, for i = 1, 2, …, n) and E is a set of unordered pairs of signalling components of the form {g_i, g_j} representing signalling links inferred from data. We propose two types of signalling cross-talks between any two signalling pathways, denoted pathway 1 and pathway 2, which are shown in Figure 2. Here, a pathway is defined merely as a list of signalling components, usually obtained from databases such as KEGG [35], WikiPathways [36], and Reactome [37].

Type-I cross-talk
{g_i, g_j} ∈ E is a Type-I cross-talk between pathway 1 and pathway 2 if (g_i ∈ pathway 1 ∧ g_j ∈ pathway 2) ∧ (g_i ∉ pathway 2 ∧ g_j ∉ pathway 1).
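The Type-I condition can be written as a small predicate. This is a sketch under stated assumptions: a link is represented as a frozenset of two node names, each pathway as a set of node names, and the function name is_type_i is introduced here for illustration. Because the link is unordered, the condition is tested for both assignments of its endpoints to g_i and g_j. Membership of the link in E is assumed to be checked by the caller.

```python
def is_type_i(link, pathway_1, pathway_2):
    """True if the undirected link connects pathway_1 and pathway_2
    through two nodes that are each exclusive to one of the pathways."""
    if len(link) != 2:
        return False  # ignore self-loops or malformed links
    a, b = tuple(link)

    def condition(gi, gj):
        return (gi in pathway_1 and gj in pathway_2
                and gi not in pathway_2 and gj not in pathway_1)

    # The link is unordered, so try both endpoint assignments.
    return condition(a, b) or condition(b, a)


# Toy pathways with one shared node "S" (names are illustrative only).
pathway_1 = {"A", "B", "S"}
pathway_2 = {"C", "D", "S"}
print(is_type_i(frozenset(("B", "C")), pathway_1, pathway_2))
```

A link with an endpoint in the shared node "S" is rejected by this predicate, since "S" belongs to both pathways.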

Type-II cross-talk
A pair of links {g_i, g_j} ∈ E and {g_j, g_k} ∈ E is a Type-II cross-talk between pathway 1 and pathway 2 if (g_i ∈ pathway 1 ∧ g_j ∈ pathway 1) ∧ (g_j ∈ pathway 2 ∧ g_k ∈ pathway 2).
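Reading the Type-II definition as a condition on two links of E that meet at a shared node g_j, a corresponding predicate can be sketched as follows. The representation (frozensets for links, sets for pathways) and the name is_type_ii are assumptions for illustration; the check that both links belong to E is left to the caller.

```python
def is_type_ii(link_a, link_b, pathway_1, pathway_2):
    """True if two distinct undirected links share exactly one node g_j
    that lies in both pathways, with one remaining endpoint in
    pathway_1 and the other in pathway_2."""
    shared = link_a & link_b
    if len(link_a) != 2 or len(link_b) != 2 or len(shared) != 1:
        return False  # need two distinct links meeting at one node
    (gj,) = shared
    (gi,) = link_a - shared
    (gk,) = link_b - shared

    def condition(x, y):
        return (x in pathway_1 and gj in pathway_1
                and gj in pathway_2 and y in pathway_2)

    # Either link may play the role of {g_i, g_j}.
    return condition(gi, gk) or condition(gk, gi)


# Toy pathways with one shared node "S" (names are illustrative only).
pathway_1 = {"A", "B", "S"}
pathway_2 = {"C", "D", "S"}
print(is_type_ii(frozenset(("A", "S")), frozenset(("S", "C")),
                 pathway_1, pathway_2))
```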

An algorithm for detecting proposed cross-talks
In Figure 3, we present a simple but intuitive algorithm for identifying Type-I and Type-II cross-talks in data-driven signalling network models. We refer to our algorithm as XDaMoSiN (cross-talks in data-driven models of signalling networks). Note that our approach considers data-driven models of signalling networks as undirected networks in order to generalise our categorisation for both causal and non-causal network models. The only assumption we make here is that pathway annotations of signalling pathways are known from pathway databases such as KEGG [35], Reactome [37], and WikiPathways [36]. In these annotations, a pathway is defined as a list of signalling nodes. Note that the signalling links among these nodes are modelled using data-driven relationships. Therefore, a data-driven model of a signalling network is defined as G(V,E), where V is a list of n signalling nodes and E is a list of signalling links {g_i, g_j} inferred from data. This algorithm takes two inputs, G (the network) and PathwayDB (a pathway database), and produces two outputs, Type_I_crosstalk and Type_II_crosstalk, which are two lists containing all Type-I and Type-II cross-talks (Figure 3). Here, we consider PathwayDB as a list in which each element is itself a list containing the signalling nodes of a particular pathway and is indexed by the corresponding pathway ID (typically, the pathway name).
Figure 2. Each of the pathways is a collection of signalling nodes (typically proteins or protein complexes). A Type-I cross-talk is a signalling link {g_i, g_j} that connects two signalling pathways where neither of the two pathways contains both signalling nodes, g_i and g_j. A Type-II signalling cross-talk is a pair of signalling links {g_i, g_j} and {g_j, g_k} residing at the intersection of two signalling pathways with a shared node g_j.
In the first part of the algorithm, we find all the Type-I cross-talks among all the pathways in PathwayDB. At first, we initialise the list Type_I_crosstalk, which collects all such Type-I cross-talks. […] (PathwayDB, pathway_id) exists, which constructs a list of the signalling nodes in the pathway with ID pathway_id in PathwayDB.
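The first part can be sketched as below. This is a reconstruction under assumptions rather than the listing in Figure 3: the helper name get_pathway_nodes is hypothetical (the text only indicates that it takes PathwayDB and pathway_id), PathwayDB is modelled as a dictionary from pathway ID to a list of nodes, and E is a set of frozensets.

```python
from itertools import combinations


def get_pathway_nodes(pathway_db, pathway_id):
    """Hypothetical helper: the nodes of one pathway, as a set."""
    return set(pathway_db[pathway_id])


def find_type_i(edges, pathway_db):
    """Collect every Type-I cross-talk between every pair of pathways."""
    type_i_crosstalk = []
    for id_1, id_2 in combinations(sorted(pathway_db), 2):
        p1 = get_pathway_nodes(pathway_db, id_1)
        p2 = get_pathway_nodes(pathway_db, id_2)
        only_1, only_2 = p1 - p2, p2 - p1  # nodes exclusive to each pathway
        for link in sorted(edges, key=sorted):  # deterministic order
            if len(link) != 2:
                continue
            gi, gj = sorted(link)
            if (gi in only_1 and gj in only_2) or (gi in only_2 and gj in only_1):
                type_i_crosstalk.append((id_1, id_2, (gi, gj)))
    return type_i_crosstalk


# Toy network and annotations (illustrative names only).
edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("A", "S"), ("S", "C")]}
pathway_db = {"P1": ["A", "B", "S"], "P2": ["C", "D", "S"]}
print(find_type_i(edges, pathway_db))
```

Precomputing the nodes exclusive to each pathway reduces the four-part Type-I condition to two set-membership tests per link.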
In the second part of the algorithm, we find all Type-II cross-talks. First, we examine each signalling node g_j individually to determine whether it is shared by more than one pathway and has incident signalling link(s) (from E) in those pathways. For this purpose, for each signalling node g_j, we construct an intermediate list called L_j. This list collects ordered pairs (pathway_id, g_i), where g_i is a signalling node incident to g_j through a link {g_i, g_j} ∈ E and pathway_id labels a pathway that contains g_i. Next, for any combination of pairs in the list L_j, such as (pathway_id_1, g_i) and (pathway_id_2, g_k), if pathway_id_1 and pathway_id_2 are different, then we define {g_i, g_j} ∧ {g_j, g_k} as a Type-II cross-talk between pathway_id_1 and pathway_id_2.
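The second part can be sketched in the same style. Again this is an illustrative reconstruction, not the listing in Figure 3. Two points are assumptions of this sketch: an entry (pathway_id, g_i) is added to L_j only when the pathway contains both g_j and g_i, and pairs with g_i = g_k are skipped so that a Type-II cross-talk always consists of two distinct links.

```python
from collections import defaultdict
from itertools import combinations


def find_type_ii(edges, pathway_db):
    """Collect Type-II cross-talks: two links meeting at a shared node."""
    # Adjacency of the undirected network G(V, E).
    neighbours = defaultdict(set)
    for link in edges:
        if len(link) == 2:
            a, b = tuple(link)
            neighbours[a].add(b)
            neighbours[b].add(a)

    type_ii_crosstalk = []
    for gj in sorted(neighbours):
        # L_j: (pathway_id, g_i) for each neighbour g_i of g_j that lies
        # in a pathway which also contains g_j.
        l_j = [(pid, gi)
               for pid in sorted(pathway_db)
               if gj in pathway_db[pid]
               for gi in sorted(neighbours[gj])
               if gi in pathway_db[pid]]
        for (pid_1, gi), (pid_2, gk) in combinations(l_j, 2):
            if pid_1 != pid_2 and gi != gk:
                type_ii_crosstalk.append((pid_1, pid_2, (gi, gj), (gj, gk)))
    return type_ii_crosstalk


# Toy network and annotations (illustrative names only).
edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("A", "S"), ("S", "C")]}
pathway_db = {"P1": {"A", "B", "S"}, "P2": {"C", "D", "S"}}
print(find_type_ii(edges, pathway_db))
```

In this toy network only the shared node "S" has incident links in both pathways, so the only Type-II cross-talk is the pair of links meeting at "S".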

Type-I and Type-II cross-talks include cross-talks from other state-of-the-art categorisations
We compare the cross-talk categorisation approaches, including our proposed method, in Figure 4. This comparison reveals an interesting aspect of these categorisations: cross-talks between any two pathways can still be identified when their corresponding causal relationships are ignored, that is, when the signalling network is considered as an undirected network only. At the same time, we note that our approach can include all types of cross-talks defined by the other categorisations.
Type-I cross-talks can represent signal-flow cross-talks, receptor function cross-talks, and gene expression cross-talks from Donaldson and Calder [14], Mode-A and Mode-B cross-talks from Guo and Wang [18], and signalling protein heterodimerisation cross-talks from Kolch et al. [6]. In a cross-talking pair {g_i, g_j} in each of these categories, one signalling component g_i belongs to one pathway and g_j belongs to the other pathway, or vice versa, but mutually exclusively (Figure 4). Similarly, Type-II cross-talks represent substrate availability and intracellular communication cross-talks from Donaldson and Calder [14], Mode-C cross-talks from Guo and Wang [18], and signalling node sharing and competition for nodes from Kolch et al. [6], since in all of these categories there exists a component shared between pathway 1 and pathway 2, and the other components of the individual pathways compete for the modification or activation of that shared component (Figure 4).
Moreover, Donaldson and Calder [14] reported that their categorisation comprehensively covered all possible types of signalling cross-talks in a single cell model. Since Type-I and Type-II cross-talks include all cross-talks from Donaldson and Calder [14] (Figure 4), we claim that our categorisation is also comprehensive. Donaldson and Calder also claimed that their approach can be useful for detecting cross-talks in data-driven models of signalling networks. However, we note that their proposed algorithm (see the appendix of [14]) was based on qualitative logic only, and it is not explicit how it could be used with network models derived from high-throughput quantitative signalling data such as mass spectrometry and RPPA data. Furthermore, since they used a modular architecture of signal propagation (receptor function, three-stage cascade, and gene expression [14]) in detecting all signalling cross-talks, their approach is not suitable for models derived from gene expression data only. Some studies [30][31][32] have attempted to infer signalling network topology using gene expression as a proxy for signalling protein activities, since gene expression data are usually cheaper to generate and can be produced at large scale [32].

Discussion and conclusion
The data-driven modelling of signalling networks and the detection of cross-talks in those models provide effective ways to elucidate novel mechanisms of perturbed signalling activities in various disease conditions such as cancer and drug resistance. In this chapter, we reviewed some state-of-the-art approaches that categorise signalling cross-talks and identified a limitation of their applicability to data-driven models, since they rely on a static topology of signalling networks. We proposed a novel cross-talk categorisation (Type-I and Type-II) that is not only applicable to data-driven models but also generalises all types of cross-talks defined by other approaches. We also presented a simple but intuitive algorithm for detecting Type-I and Type-II cross-talks between any two signalling pathways. In combination with other computational and statistical methodologies, our approach can be used in systems biology to generate novel but biologically plausible hypotheses in a data-dependent manner.
The notion of cross-talking is inherently present in biological systems and may involve interactions among signalling and regulatory pathway activities. Yamaguchi et al. [38] reported that in acquired resistance, RTK-mediated signalling pathways cross-talk with downstream effector pathways by altering the activities of effector proteins, including transcription factors and enzymes, and thus cause dysregulation in the expression of multiple target genes, especially those involved in growth and cell survival processes. Therefore, in addition to signalling cross-talks, it is also important to efficiently find cross-talks among signalling and regulatory pathways. Although this chapter primarily focuses on signalling cross-talks, our definition of data-driven models of biological systems as undirected graphs and the categorisation into Type-I and Type-II cross-talks can be generalised. Thus, our proposed algorithm should be able to identify cross-talks among any set of pathways, including cell signalling and regulatory pathways.