A Review on Node-Matching Between Networks

Networks are widely used to describe the relationships between individuals in various systems. Recently, the rapid development of computer science has made it possible to study the structures of very complex networks in many areas, including sociology (Xuan et al., 2009; Xuan, Du & Wu, 2010a), biology (Barabási & Oltvai, 2004; Eguíluz et al., 2005), and physics (Dorogovtsev et al., 2008; Rozenfeld et al., 2010), with the tools of graph theory. Interestingly, many of these complex networks present several similar topological properties, such as small-world (Watts & Strogatz, 1998), scale-free (Barabási & Albert, 1999), self-similarity (Motter et al., 2003), and symmetry (Xiao et al., 2008). In order to explain these properties, a large number of models have been proposed (Barabási & Albert, 1999; Li & Chen, 2003; Mossa et al., 2002; Watts & Strogatz, 1998; Xiao et al., 2008; Xuan, Du, Wu & Chen, 2010; Xuan et al., 2006; 2007; 2008). However, most current research still focuses on understanding the relationships between individuals in a single system, while inter-system relationships are largely ignored.


Introduction
Networks are widely used to describe the relationships between individuals in various systems. Recently, the rapid development of computer science has made it possible to study the structures of very complex networks in many areas, including sociology (Xuan et al., 2009; Xuan, Du & Wu, 2010a), biology (Barabási & Oltvai, 2004; Eguíluz et al., 2005), and physics (Dorogovtsev et al., 2008; Rozenfeld et al., 2010), with the tools of graph theory. Interestingly, many of these complex networks present several similar topological properties, such as small-world (Watts & Strogatz, 1998), scale-free (Barabási & Albert, 1999), self-similarity (Motter et al., 2003), and symmetry (Xiao et al., 2008). In order to explain these properties, a large number of models have been proposed (Barabási & Albert, 1999; Li & Chen, 2003; Mossa et al., 2002; Watts & Strogatz, 1998; Xiao et al., 2008; Xuan, Du, Wu & Chen, 2010; Xuan et al., 2006; 2007; 2008). However, most current research still focuses on understanding the relationships between individuals in a single system, while inter-system relationships are largely ignored.
One such inter-system relationship arises from the fact that an individual may be active in different systems with different identities (Xuan & Wu, 2009), and this type of inter-system relationship may further lead to similar structures in different complex networks. For instance, an ancient protein may evolve into various homologous proteins in different species, a concept may be expressed by different words in different languages, and a person may be active in different communication networks with different identities represented by telephone numbers (Onnela et al., 2007), email addresses (Newman et al., 2002), etc. Therefore, revealing the different identities of an individual in several different systems has practical significance in many areas (Xuan, Du & Wu, 2010b), e.g., revealing homologous proteins, automatically translating languages, filtering information across networks, and so on. By describing complex systems as networks, these different tasks can be transformed into a common node-matching problem between different complex networks, and thus can be solved in the same framework.
However, since many real-world complex networks are always highly symmetric (Xiao et al., 2008), i.e., there are always large numbers of nodes sharing the same neighbors in a network, it seems quite difficult to distinguish them in one network only by comparing their topological properties (Costa et al., 2007), such as degrees, clustering coefficient and

Qi Xuan (1), Li Yu (1), Fang Du (2) and Tie-Jun Wu (3)
(1) Zhejiang University of Technology, China
(2) Johns Hopkins University, USA
(3) Zhejiang University, China

Definitions and data sets

2.1 Definitions
The node-matching problem between two different networks is described as follows (Xuan, Du & Wu, 2010b; Xuan & Wu, 2009). The two networks under study are denoted by $G_i = (V_i, E_i)$, where $V_i = \{v^i_1, \ldots, v^i_M\}$ and $E_i$ represent the node set and the link set of network $i$ ($i = 1, 2$), respectively. Assume that there are $M$ pairs of matched nodes between the two networks, while $P_r$ ($P_r < M$) pairs of them have already been revealed; these are called revealed matched nodes and are denoted by $\{v^i_1, \ldots, v^i_{P_r}\} \subset V_i$ ($i = 1, 2$). The problem is then: can we design a method to find the other $M - P_r$ pairs of matched nodes in these two distinct networks by using the structural information of $G_1$ and $G_2$ and the revealed matched nodes? If such a method is designed and $P_c$ ($P_c \le M - P_r$) pairs of them are finally revealed correctly, the matching precision $\phi$ can be calculated by

$$\phi = \frac{P_c}{M - P_r} \qquad (1)$$
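As a small illustration of the matching precision, the following sketch counts the correctly recovered pairs among the unrevealed ones. The function name and the dictionary-based representation of matchings are assumptions made for this example only; they do not come from the cited papers.

```python
def matching_precision(predicted, truth, revealed):
    """Return phi = P_c / (M - P_r).

    truth:     dict mapping each G1 node to its true matched node in G2 (M pairs)
    revealed:  set of G1 nodes whose matches were revealed beforehand (P_r nodes)
    predicted: dict mapping unrevealed G1 nodes to the G2 nodes chosen by an algorithm
    (All three containers are hypothetical conventions for this sketch.)
    """
    unrevealed = [u for u in truth if u not in revealed]
    if not unrevealed:
        raise ValueError("no unrevealed pairs: M - P_r must be positive")
    # P_c: unrevealed pairs that the algorithm recovered correctly
    correct = sum(1 for u in unrevealed if predicted.get(u) == truth[u])
    return correct / len(unrevealed)
```

For instance, with four true pairs, one of them revealed, and one of the remaining three recovered correctly, the function returns 1/3.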

Co-evolution network models
In order to better understand the interactions between different systems and to test the subsequent node-matching algorithms, two co-evolution network models are first introduced, where the parameters are set to $N_1 = N_2 = M = N$ for convenience. Generally, there are two ways to create a pair of interactional networks, as shown in Fig. 1 (a) and (b), respectively, both of which may occur in reality. Inspired by the evolution of organisms, the first way is to let the pair of interactional networks $G_1$ and $G_2$ evolve from a common original network; in other words, they are derived from the same network (obtained by some model) through random rewiring processes. The other way is to derive the pair of interactional networks from two independent networks by a random interacting process composed of the following two steps (Xuan & Wu, 2009):
• Networks initialization: Two networks $G_1$ and $G_2$, each with $N$ nodes, are created by the same rule, and all the nodes are randomly matched, i.e., $N$ pairs of randomly matched nodes are obtained.
• Networks interaction: Two non-linked nodes in $G_1$ are connected with probability $\eta_2$ if their matched nodes in $G_2$ are linked, while two non-linked nodes in $G_2$ are connected with probability $\eta_1$ if their matched nodes in $G_1$ are linked, as illustrated in Fig. 1 (b).
Here, the second way is adopted to create pairs of artificial interactional networks for testing.

Fig. 1. Two ways to create a pair of interactional networks (Xuan & Wu, 2009). (a) Rewiring model: the pair of interactional networks $G_1$ and $G_2$ are derived from the same original network through random rewiring. The corresponding nodes are matched and connected by brown dashed lines. (b) Interacting model: the pair of interactional networks $G_1$ and $G_2$ are derived from a pair of independent networks by random interacting, i.e., two non-linked nodes in the network $G_1$ are connected by a green line with probability $\eta_2$ if their corresponding matched nodes in $G_2$ were linked, while two non-linked nodes in $G_2$ are connected by a red line with probability $\eta_1$ if their corresponding matched nodes in $G_1$ were linked. $\eta_1$ and $\eta_2$ are named interactional degrees.
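The interacting model of Fig. 1 (b) can be sketched as follows. This is a minimal illustration, assuming that node k in one network is matched to node k in the other and that each network is given as a set of undirected links; the function name and data layout are hypothetical.

```python
import random


def interact(edges1, edges2, eta1, eta2, rng=random):
    """Apply one random interacting step to a pair of matched networks.

    edges1, edges2: sets of frozenset({u, v}) links of G1 and G2, where node k
    in G1 is assumed to be matched to node k in G2 (a simplifying assumption).
    Returns the two new link sets; the inputs are left unchanged.
    """
    new1, new2 = set(edges1), set(edges2)
    # A pair that is non-linked in G1 gets a link with probability eta2
    # if the matched pair was linked in G2 before the interaction.
    for link in edges2:
        if link not in edges1 and rng.random() < eta2:
            new1.add(link)
    # Symmetrically, G2 gains links of G1 with probability eta1.
    for link in edges1:
        if link not in edges2 and rng.random() < eta1:
            new2.add(link)
    return new1, new2
```

Both loops read only the original link sets, so links created during the interaction do not trigger further links in the same step.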

Real-world interactional networks
In reality, when two strangers chat with each other for some reason, e.g., business demand, common interests, curiosity, or warm-heartedness, they may become friends one day if they enjoy each other's company; in other words, the chat network may influence the evolution of the friendship network. On the other hand, there is also a natural tendency for one to chat with friends or acquaintances rather than strangers, i.e., the friendship network determines the chat network to a certain extent. Therefore, the chat network and the friendship network can be considered a pair of real-world interactional networks, which can be recorded on a quite large scale by advanced communication technologies and thus used to test the subsequent node-matching algorithms.
As an example, we collected the communication records and the contact lists over one week from the database of Alibaba trademanager (an instant messenger (IM) mainly used for electronic commerce). We mainly focus on 14,800 employees of the Alibaba company and construct the chat network $G_1$ and the friendship network $G_2$ among them from these records. The two networks were then preprocessed by the following two steps (Du et al., 2010; Xuan, Du & Wu, 2010b):
• Extract the giant cluster (GC): Extract the GCs of $G_1$ and $G_2$, denoted by $G^g_i = (V^g_i, E^g_i)$ ($i = 1, 2$), where $V^g_i$ and $E^g_i$ represent the node set and the link set of the GC $G^g_i$, respectively.
• Calculate the intersection: A pair of matched nodes in the networks corresponds to the same Alibaba user. Select those users appearing in both $G^g_1$ and $G^g_2$, denoted by $V^c = V^g_1 \cap V^g_2$, and construct $G^c_i = (V^c, E^c_i)$ ($i = 1, 2$), where $E^c_i$ represents the set of links between nodes in $V^c$. Set $G_1 = G^c_1$ and $G_2 = G^c_2$, and terminate the preprocessing if both networks $G^c_1$ and $G^c_2$ are connected; otherwise, return to the first step.
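The two preprocessing steps above can be sketched as a loop over plain adjacency dictionaries. This is an illustrative sketch, assuming that both networks label the same user with the same node ID; the function names are hypothetical.

```python
from collections import deque


def giant_component(adj):
    """Return the node set of the largest connected component (found by BFS)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def induced(adj, nodes):
    """Restrict an adjacency dict to the given node set."""
    return {u: adj[u] & nodes for u in nodes}


def preprocess(adj1, adj2):
    """Alternate GC extraction and intersection until both networks are connected."""
    while True:
        # Step 1 and step 2: users present in both giant clusters.
        common = giant_component(adj1) & giant_component(adj2)
        adj1, adj2 = induced(adj1, common), induced(adj2, common)
        # Terminate when both induced networks are connected.
        if (len(giant_component(adj1)) == len(common)
                and len(giant_component(adj2)) == len(common)):
            return adj1, adj2
```

The loop terminates because the common node set shrinks strictly whenever one of the induced networks is disconnected.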
After the preprocessing, both networks $G_1$ and $G_2$ have 9859 nodes and are one-to-one matched, i.e., each node in $G_1$ has a matched node in $G_2$ and vice versa. Moreover, if there is a link between two nodes in $G_1$, a link exists between their matched nodes in $G_2$ with probability 80.8%, while the corresponding probability from $G_2$ to $G_1$ is 18.4%. Their basic topological properties, such as the number of nodes $N$, the average degree $\langle k \rangle$, the average clustering coefficient $\langle C \rangle$, and the average shortest path length $\langle L \rangle$, are presented in Table 1.

Table 1. The basic properties, i.e., the number of vertices $N$, the average degree $\langle k \rangle$, the average clustering coefficient $\langle C \rangle$, and the average shortest path length $\langle L \rangle$, for the chat network and the friendship network derived from the Alibaba trademanager database (Du et al., 2010; Xuan, Du & Wu, 2010b).

Revealed matched nodes selecting strategies
Since the interactional networks under study are usually not completely identical (Xuan & Wu, 2009), it seems impractical to match nodes between different networks just by their local structural properties. As a result, a few pairs of matched nodes should be revealed as references before the node-matching algorithms are implemented.
Recent studies on real-world networks reveal that many of them have a similar heterogeneous structure characterized by a power-law degree distribution (Barabási, 2009; Barrat et al., 2004; Eguíluz et al., 2005; Xuan et al., 2009). This property, first modeled by Barabási and Albert (BA) (Barabási & Albert, 1999), indicates that the connectivity of a heterogeneous network highly depends on hub nodes with quite large degrees, i.e., once these hub nodes are attacked, the average shortest path length of the network will increase quickly (Albert et al., 2000; Crucitti et al., 2004; Motter & Lai, 2002); as a result, the communication efficiency of the network will be largely weakened. For the node-matching problem introduced here, we proved (Xuan & Wu, 2009) that such hub nodes can provide more structural information than normal nodes and thus are more suitable to be revealed matched nodes. Based on the interactional model introduced in Fig. 1 (b), denote the degree of $v^1_i$ by $d^1_i$ and the degree of $v^2_j$ by $d^2_j$. If they are randomly selected as a pair of matched nodes, then, on average, there are $d^1_i d^2_j / N$ other pairs of matched nodes around them before the interaction, since under random matching each of the $d^1_i$ neighbors of $v^1_i$ has its matched node among the $d^2_j$ neighbors of $v^2_j$ with probability about $d^2_j / N$. After the interaction, the degrees of $v^1_i$ and $v^2_j$ are given by Eq. (2) and Eq. (3), respectively, and the number of other pairs of matched nodes around $v^1_i$ and $v^2_j$ is given by Eq. (4). Since real-world complex networks usually have a very large number of nodes and a relatively small average degree, Eqs. (2)-(4) can be further simplified to Eqs. (5)-(7), respectively, from which Eq. (8) for $F_{ij}$ is obtained. Because the matched nodes are unknown beforehand in reality, it is impractical to sort all the pairs of matched nodes by $F_{ij}$ in descending order to improve the final matching precision $\phi$, although a larger $F_{ij}$ corresponds to more pairs of unrevealed matched nodes around a pair of revealed matched nodes $v^1_i$ and $v^2_j$. Fortunately, Eq. (8) suggests a substitute, i.e., selecting nodes with larger degrees in the reference network, revealing their matched nodes in the other network by some dedicated method, and then setting these pairs of matched nodes as the revealed matched nodes.


Large degree priority strategies
Based on this principle, we proposed large degree priority strategies (Xuan & Wu, 2009) for the optimal node-matching algorithm, described as follows:
• Large Degree Priority in $G_1$ (LDP1): $G_1$ is selected as the reference network, where the nodes are sorted by their degrees in descending order, and the top $P_r$ of them as well as their matched nodes in $G_2$ are selected as the revealed matched nodes.
• Large Degree Priority in $G_2$ (LDP2): $G_2$ is selected as the reference network, where the nodes are sorted by their degrees in descending order, and the top $P_r$ of them as well as their matched nodes in $G_1$ are selected as the revealed matched nodes.
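A large degree priority selection can be sketched in a few lines. The sketch below assumes the reference network is an adjacency dictionary and that the true matching is available as a dictionary for the nodes to be revealed; breaking degree ties by node ID is an assumption of this example, not something specified in the text.

```python
def ldp_select(ref_adj, match, p_r):
    """Pick the p_r largest-degree nodes of the reference network.

    ref_adj: dict node -> set of neighbours in the reference network
    match:   dict reference node -> its matched node in the other network
    Returns a list of (reference node, matched node) pairs to be revealed.
    Ties in degree are broken by node ID (an assumption of this sketch).
    """
    ranked = sorted(ref_adj, key=lambda u: (-len(ref_adj[u]), u))
    return [(u, match[u]) for u in ranked[:p_r]]
```

Calling it with the adjacency of $G_1$ corresponds to LDP1, and with the adjacency of $G_2$ (and the inverse matching) to LDP2.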
But which of them brings higher matching precision? Can we answer this question just by comparing the structural properties (in particular, the degree sequences) of the two interactional networks? Without loss of generality, for a pair of interactional networks, suppose $G_1$ has a larger average degree than $G_2$, i.e., $\langle d^1 \rangle > \langle d^2 \rangle$. Multiplying Eq. (5) by $\eta_1$ and subtracting Eq. (6), we get Eq. (9). Since $\eta_1 \eta_2 \ll 1$, the value of $\eta_1$ can be roughly estimated by Eq. (10), while the value of $\eta_2$ cannot be estimated just by comparing the structural properties of the interactional networks. Suppose that the nodes are sorted by their degrees in descending order, and denote by $R_i$ ($i = 1, 2$) the set of top $P_r$ nodes in $G_i$. Then, from Eq. (8), we can see that more structural information may be provided when $G_2$ is selected as the reference network if Eq. (11) is satisfied, which is equivalent to Eq. (12). Because Eq. (13) is always satisfied, Eq. (12) must hold if Eq. (14) holds, where all the parameters are known once the two interactional networks are provided. That is, only when Eq. (14) is satisfied can we say that LDP2 may be superior to LDP1.

New Frontiers in Graph Theory www.intechopen.com

Centralized large degree priority strategies
The above LDP strategies are designed for optimal node-matching algorithms, while for iterative node-matching algorithms these strategies need to be further modified, because in this case the revealed pairwise matched nodes should be centralized in a local world of the networks so as to improve the matching precision in the first round, then the second round, and so on. Correspondingly, we propose two centralized large degree priority strategies specifically for iterative node-matching algorithms (Xuan, Du & Wu, 2010b):
• Centralized Large Degree Priority in $G_1$ (CLDP1). $G_1$ is selected as the reference network, where a set $R_1$ ($|R_1| = P_r$) of nodes is picked according to their degrees by the following process. The node of the largest degree in $G_1$ is first selected as the only member of $R_1$. Denote the neighbor set of $R_1$ by $U_1$, where each node in $U_1$ is connected to at least one node in $R_1$. At each step, the nodes in $V_1 \setminus R_1$ are sorted by the number of neighbors belonging to $U_1$ in descending order and the top one is selected to join $R_1$. Update $R_1$ and $U_1$ and repeat the selecting process until the set $R_1$ contains exactly $P_r$ nodes. Then the nodes in $R_1$ as well as their matched nodes in $G_2$ are selected as the revealed pairwise matched nodes.
• Centralized Large Degree Priority in $G_2$ (CLDP2). $G_2$ is selected as the reference network, where a set $R_2$ ($|R_2| = P_r$) of nodes is picked according to their degrees by the following process. The node of the largest degree in $G_2$ is first selected as the only member of $R_2$. Denote the neighbor set of $R_2$ by $U_2$, where each node in $U_2$ is connected to at least one node in $R_2$. At each step, the nodes in $V_2 \setminus R_2$ are sorted by the number of neighbors belonging to $U_2$ in descending order and the top one is selected to join $R_2$. Update $R_2$ and $U_2$ and repeat the selecting process until the set $R_2$ contains exactly $P_r$ nodes. Then the nodes in $R_2$ as well as their matched nodes in $G_1$ are selected as the revealed pairwise matched nodes.
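The centralized selection process can be sketched as below. This is only one reading of the description above: the neighbor set U is taken to contain the nodes outside R that are adjacent to R, and ties are broken first by degree and then by node ID. Both choices are assumptions of this sketch rather than details given in the text.

```python
def cldp_select(adj, p_r):
    """Grow a centralized set R of p_r nodes in the reference network.

    adj: dict node -> set of neighbours. Returns the nodes of R in selection order.
    """
    if p_r <= 0 or not adj:
        return []
    # Start from the node of the largest degree (ties broken by node ID).
    first = min(adj, key=lambda u: (-len(adj[u]), u))
    chosen, in_r = [first], {first}
    while len(chosen) < min(p_r, len(adj)):
        # U: nodes outside R connected to at least one node in R (assumed reading).
        frontier = {v for u in in_r for v in adj[u]} - in_r
        # Rank nodes outside R by their number of neighbours in U, descending;
        # degree and node ID serve as tie-breakers (assumptions of this sketch).
        best = min((u for u in adj if u not in in_r),
                   key=lambda u: (-len(adj[u] & frontier), -len(adj[u]), u))
        chosen.append(best)
        in_r.add(best)
    return chosen
```

The matched nodes of the returned set in the other network would then be revealed, exactly as in the LDP strategies.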

The similarity between two nodes belonging to different networks can be measured by the number of pairs of revealed matched nodes around them, e.g., the number of common friends they contact in different communication networks, where a common friend is denoted by a pair of revealed matched nodes in the corresponding communication networks. Denote by $n_L(v^1_i)$ and $n_L(v^2_j)$ the numbers of links connected to the nodes $v^1_i$ and $v^2_j$ in the networks $G_1$ and $G_2$, respectively, and by $n_M(v^1_i, v^2_j)$ the number of pairs of revealed matched nodes around them in the corresponding networks. Then the similarity between $v^1_i$ and $v^2_j$ can be calculated by a number of methods (Jaccard, 1901; Lü & Zhou, 2011; Newman, 2001; Ravasz et al., 2002; Salton & McGill, 1983; Sørensen, 1948), as presented in Table 2. Here, we adopt the Jaccard index to calculate the similarities between nodes of interactional networks.
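A possible implementation of the Jaccard similarity between nodes of two networks is sketched below. Since the table of similarity indices is not reproduced here, the formula used, $n_M / (n_L(v^1_i) + n_L(v^2_j) - n_M)$, is an assumption based on the standard Jaccard index (intersection over union), with revealed matched pairs playing the role of common neighbors. The function name and data layout are hypothetical.

```python
def jaccard_similarity(u, v, adj1, adj2, revealed):
    """Jaccard-style similarity between node u of G1 and node v of G2.

    adj1, adj2: dict node -> set of neighbours in G1 and G2
    revealed:   dict mapping each revealed G1 node to its matched G2 node
    """
    # n_M: revealed pairs (a, revealed[a]) with a linked to u and revealed[a] linked to v
    n_m = sum(1 for a in adj1[u] if a in revealed and revealed[a] in adj2[v])
    # Union size n_L(u) + n_L(v) - n_M (assumed form of the Jaccard index here)
    union = len(adj1[u]) + len(adj2[v]) - n_m
    return n_m / union if union else 0.0
```

With three links on each side and two shared revealed pairs, the similarity is 2 / (3 + 3 - 2) = 0.5.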

Optimal node-matching algorithm
When the revealed pairwise matched nodes are selected by the LDP strategies, the similarity of each pair of the remaining nodes from different interactional networks can be calculated by the Jaccard index. Then, recalling the definitions in Section 2.1, the node-matching problem between $G_1$ and $G_2$ can be transformed into a maximum weighted matching problem for the bipartite graph $G_b = (U_1, U_2, W)$, where $U_i = \{v^i_{P_r+1}, v^i_{P_r+2}, \ldots, v^i_N\}$ ($i = 1, 2$), and $W$ denotes the set of links weighted by the similarities between these two groups of nodes. Without loss of generality, under the assumption $N_1 \le N_2$, the task is to find a set of nonadjacent weighted links $\{w_1, w_2, \ldots, w_{N_1 - P_r}\}$ that maximizes the sum of their weights $\sum_{i=1}^{N_1 - P_r} s_i$, which can be solved by the classical Kuhn-Munkres (KM) algorithm (Kuhn, 2005; Munkres, 1957). Note that, although the KM algorithm was developed for the case $N_1 = N_2$, it is also feasible in the case $N_1 < N_2$ through artificially adding $N_2 - N_1$ isolated nodes to $G_1$. For this reason, we suppose $N_1 = N_2 = N$ for simplicity.
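To make the optimization target concrete, the sketch below solves the maximum weighted bipartite matching by brute force over all assignments. It is a stand-in for the KM algorithm, not an implementation of it: it returns the same optimum but takes factorial time, so it is only usable on toy inputs. The function name and the similarity-matrix layout are assumptions of this example.

```python
from itertools import permutations


def optimal_matching(sim):
    """Exhaustive maximum-weight assignment (toy stand-in for the KM algorithm).

    sim[i][j] is the similarity between the i-th unrevealed node of G1 and the
    j-th unrevealed node of G2; sim must be non-empty with rows <= columns.
    Returns (list of (i, j) pairs, total similarity).
    """
    n1, n2 = len(sim), len(sim[0])
    best_total, best_cols = float("-inf"), None
    # Each permutation assigns a distinct column j to every row i.
    for cols in permutations(range(n2), n1):
        total = sum(sim[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_cols = total, cols
    return list(enumerate(best_cols)), best_total
```

The example in the test illustrates why a global optimum matters: picking the single largest similarity first (0.9) would force a poor second pair, whereas the optimal assignment reaches a total of 1.65. For realistic sizes, a polynomial-time solver such as SciPy's `linear_sum_assignment` would be the practical choice.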
Since the KM algorithm has a relatively high complexity of $O(N^3)$, the sizes of the test networks cannot be very large. Here the two interactional networks $G_1$ and $G_2$ are both created by the BA model with $N = 100$ nodes and average degree $\langle k \rangle = 8$. They then interact with each other with different interactional degrees $\eta_1 = 0.9$ and $\eta_2 = 0.1$ by the model shown in Fig. 1 (b). Denoting the sample ratio by $\gamma = P_r / N$, the matching results are shown in Fig. 2, where we can see that, in most cases, LDP1 is superior to LDP2. This result is reasonable because, when $\eta_1 \gg \eta_2$, Eq. (8) suggests that a larger $F_{ij}$ can be expected when nodes with large degrees in $G_1$ and their correspondences in $G_2$ are selected as the revealed matched nodes. Note that, in this experiment, we set $M = N$ for simplicity, that is, every node in one network has its correspondence in the other network. In reality, $M$ may be smaller than $N$, i.e., some individuals may be active in only one of the interactional networks. In this case, we need to further select $M - P_r$ pairs of matched nodes from the $N - P_r$ pairs obtained by the node-matching algorithm. If the value of $M$ is known a priori, we can simply sort the $N - P_r$ pairs of matched nodes by their similarities and select the top $M - P_r$ pairs as the final pairs of matched nodes. However, if $M$ is unknown, we have to set a threshold $\theta \in [0, 1)$ beforehand, and those pairs of matched nodes with similarities larger than $\theta$ are selected as the final pairs of matched nodes; this case is not discussed further here. That is, in the following studies, we always set $M = N_1 = N_2 = N$ for simplicity.

Fig. 2. The matching precision $\phi$ as a function of the sample ratio $\gamma$ for the two revealed matched nodes selection strategies, i.e., LDP1 and LDP2, on scale-free networks created by the BA model with $N = 100$ and $\langle k \rangle = 8$ and different interactional degrees $\eta_1 = 0.9$ and $\eta_2 = 0.1$ (Xuan & Wu, 2009). For each $\gamma$ and each selection strategy, the experiment is carried out on 100 different pairs of scale-free networks, and the average matching precision as well as the error bar is recorded.

Iterative node-matching algorithm
As we can see in Fig. 2, the optimal node-matching algorithm fails to achieve acceptable results when only a relatively small number of pairwise matched nodes are revealed beforehand; e.g., in order to achieve a matching precision of 80%, we have to reveal as many as 60% of the correspondences between nodes of the two networks in advance. This, together with its long running time, hinders its efficient application to node matching between real-world networks of quite large size. Based on the CLDP revealed matched nodes selecting strategies and the Jaccard similarities between nodes of different networks, the iterative node-matching algorithm is simply composed of the following two steps (Xuan, Du & Wu, 2010b):
• Node matching. At each step, the pair of unmatched nodes belonging to different networks with the largest similarity is selected as a pair of matched nodes. This pair is then considered a pair of newly revealed matched nodes, the similarities between the remaining nodes are recalculated, and so forth.
• Termination. The iterative process terminates when all of the nodes in the interactional networks have been matched.
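The two steps above can be sketched as a simple greedy loop. For clarity, this sketch recomputes every similarity in each round instead of updating only the pairs around the newly matched nodes, so it is much slower than the incremental procedure described in the text. The Jaccard form used and the tie-breaking by iteration order are assumptions of this example.

```python
def jaccard(u, v, adj1, adj2, matched):
    """Jaccard-style similarity based on currently matched pairs (assumed form)."""
    n_m = sum(1 for a in adj1[u] if a in matched and matched[a] in adj2[v])
    union = len(adj1[u]) + len(adj2[v]) - n_m
    return n_m / union if union else 0.0


def iterative_matching(adj1, adj2, revealed):
    """Greedy iterative 1-to-1 matching starting from the revealed pairs.

    adj1, adj2: dict node -> set of neighbours; revealed: dict G1 node -> G2 node.
    Returns a dict containing the revealed pairs and all newly matched pairs.
    """
    matched = dict(revealed)
    taken = set(matched.values())
    free1 = [u for u in adj1 if u not in matched]
    free2 = [v for v in adj2 if v not in taken]
    while free1 and free2:
        # Node matching: pick the unmatched pair with the largest similarity
        # (ties resolved by iteration order, an assumption of this sketch).
        u, v = max(((a, b) for a in free1 for b in free2),
                   key=lambda p: jaccard(p[0], p[1], adj1, adj2, matched))
        # The new pair immediately acts as a revealed pair in later rounds.
        matched[u] = v
        free1.remove(u)
        free2.remove(v)
    return matched
```

On two identical path graphs with one revealed end node, the loop recovers the whole matching one neighbor at a time, which illustrates how each newly matched pair supplies information for the next round.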
The time complexity of the above node-matching algorithm mainly depends on the recalculation of the similarities. Generally, once a pair of nodes from different networks is matched at the $(\tau - 1)$th round, we need to recalculate the similarities of about $k^1_\tau k^2_\tau$ pairs of nodes connected to that pair of matched nodes at the $\tau$th round, where $k^i_\tau$ ($i = 1, 2$) represents the degree of the matched node in $G_i$ at the $(\tau - 1)$th round. Provided $N_1 = N_2 = M = N$, the running time of the algorithm, denoted by $\Gamma$, can be calculated statistically by Eq. (15). If the two networks under study are strongly dependent on each other, i.e., in the extreme case $G_1$ and $G_2$ are identical and a node in one network can only be matched to the node of equal degree in the other network, Eq. (15) can be replaced by Eq. (16). For scale-free networks generated by the BA model, the degree distribution follows $p(k) \sim k^{-3}$, and thus the running time can be simplified to Eq. (17). However, if the two target networks are relatively independent of each other, i.e., a node with large degree in one network can be matched to a node with small degree in the other network, which is more common in reality, Eq. (15) can be approximated by Eq. (18), where $\langle k^i \rangle$ represents the average degree of the network $G_i$. In most cases, $\langle k^i \rangle$ can be considered a constant; therefore, Eq. (18) suggests a linear time complexity $O(N)$ of the algorithm (Xuan, Du & Wu, 2010b). Eqs. (17) and (18) mean that the iterative node-matching algorithm has much lower complexity than the optimal node-matching algorithm.
In order to compare with the optimal node-matching algorithm, here we take the same example to test the iterative node-matching algorithm. Since the iterative algorithm is able to solve node-matching problems between networks of quite large size, the two interactional networks $G_1$ and $G_2$ here are also created by the BA model with the same average degree $\langle k \rangle = 8$, but a much larger network size $N = 500$. These two networks then interact with each other with different interactional degrees $\eta_1 = 0.9$ and $\eta_2 = 0.1$ by the same model shown in Fig. 1 (b). The matching results are shown in Fig. 3 (a). This time, in order to correctly reveal most of the matched nodes in the networks (e.g., $\phi \ge 80\%$), we only need a very small percentage of matched nodes revealed beforehand (1% for CLDP1 and 1.6% for CLDP2), i.e., the iterative node-matching algorithm is far more efficient than the optimal node-matching algorithm on interactional artificial scale-free networks.
However, when we test this iterative node-matching algorithm on the real-world interactional chat network and friendship network introduced in Section 2.3, the matching results, as shown in Fig. 3 (b), are not that satisfactory, i.e., the final matching precision between the pair of real-world networks is much lower than that between the artificial networks generated by the BA model when adopting the same proportion of pairwise revealed matched nodes. For example, only about 40% of matched nodes are revealed correctly, even though as many as 10% of matched nodes are revealed beforehand. This phenomenon may be caused by the relatively high symmetry of the chat network and the friendship network. Generally, the local symmetry between two non-linked nodes $v_i$ and $v_j$ in a network is defined by (Xuan, Du & Wu, 2010b)

$$\chi_{ij} = \frac{\chi^c_{ij}}{\chi^t_{ij}} \qquad (19)$$

where $\chi^c_{ij}$ and $\chi^t_{ij}$ are the numbers of their common and total neighbors, respectively. If nodes $v_i$ and $v_j$ are connected, release the link and then calculate the symmetry between them following Eq. (19). Since it is impossible to distinguish two nodes $v_i$ and $v_j$ in a network with the symmetry $\chi_{ij} = 1$ (i.e., they share the same neighbors excluding themselves) just by their topological information, such highly symmetric nodes in one network may be wrongly matched to the nodes in the other network with quite a high probability, and thus one-to-one node-matching algorithms may produce poor results in such situations.

Fig. 3. The matching precision $\phi$ as a function of the sample ratio $\gamma$ for the two revealed matched nodes selection strategies, i.e., CLDP1 and CLDP2. (a) Matching results on the interactional scale-free networks created by the BA model with $N = 500$ and $\langle k \rangle = 8$ and different interactional degrees $\eta_1 = 0.9$ and $\eta_2 = 0.1$. (b) Matching results on the interactional real-world chat network and friendship network (Xuan, Du & Wu, 2010b). For artificial networks, the experiment is carried out on 100 different pairs of scale-free networks for each $\gamma$ and each selection strategy, and the average matching precision as well as the error bar is recorded.
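The local symmetry between two nodes, defined above as the ratio of their common neighbors to their total neighbors, can be computed as in the following sketch. Returning 0.0 when neither node has any other neighbor is a convention chosen for this example only; the text does not specify that case.

```python
def local_symmetry(adj, i, j):
    """Ratio of common to total neighbours of nodes i and j.

    adj: dict node -> set of neighbours. If i and j are linked, the link is
    ignored ("released") by dropping each node from the other's neighbour set.
    """
    n_i = adj[i] - {j}
    n_j = adj[j] - {i}
    total = n_i | n_j
    # 0.0 for two nodes with no other neighbours is an assumption of this sketch.
    return len(n_i & n_j) / len(total) if total else 0.0
```

A value of 1.0 means the two nodes have exactly the same neighbors and therefore cannot be told apart from the network structure alone.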

One-to-many iterative node-matching algorithms
In order to overcome the above limitation of one-to-one node-matching algorithms, we proposed one-to-many node matching (Du et al., 2010) by expanding the number of nodes in each matching step. In fact, one-to-many node matching has practical significance because it can help to quickly narrow down the search range of a target individual in different complex systems. In particular, a 1-to-$M$ algorithm should output $N - P_r$ correspondences as defined by Eq. (20),

$$v^1_i \leftrightarrow Q^2_i = \{v^2_{i_1}, v^2_{i_2}, \ldots, v^2_{i_M}\} \qquad (20)$$

where $v^1_i$ ($i = P_r + 1, P_r + 2, \ldots, N$) is a node in $G_1$, and $Q^2_i$ is a node set including the top $M$ most likely matched nodes of $v^1_i$ in $G_2$. Note that the 1-to-$M$ match here is just a natural generalization of the 1-to-1 match; therefore, Eq. (20) also provides a consistent 1-to-1 match, i.e., $v^1_i \leftrightarrow v^2_{i_1}$. Denoting by $P_M$ ($P_M \le N - P_r$) the number of nodes in $G_1$ that are correctly matched, the matching precision $\phi_M$ for the 1-to-$M$ node-matching algorithm can be calculated by Eq. (21),

$$\phi_M = \frac{P_M}{N - P_r} \qquad (21)$$

and naturally Eq. (22) is always satisfied.
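The precision of a one-to-many match can be evaluated with a short helper. The sketch assumes that a node of $G_1$ counts as correctly matched when its true counterpart appears anywhere in its candidate set; this reading, the function name, and the dictionary layout are assumptions of this example.

```python
def one_to_many_precision(candidates, truth):
    """Fraction of unrevealed G1 nodes whose true match is among their candidates.

    candidates: dict G1 node -> list of candidate G2 nodes (the set Q_i), given
                for every unrevealed G1 node (so its length plays the role of N - P_r)
    truth:      dict G1 node -> true matched G2 node
    """
    if not candidates:
        raise ValueError("at least one unrevealed node is required")
    # P_M: nodes whose true counterpart is contained in the candidate set
    hits = sum(1 for u, q in candidates.items() if truth[u] in q)
    return hits / len(candidates)
```

With candidate sets of size one, this reduces to the ordinary one-to-one matching precision.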
Next, we will introduce two different one-to-many iterative node-matching algorithms (Du et al., 2010).
1) A1: Local mapping. Since the similarity between each pair of nodes may change as the one-to-one iterative algorithm is implemented step by step, it is possible to correct some initially wrongly matched nodes by recalculating their similarities after the one-to-one node-matching algorithm has terminated. This fact leads to the first one-to-many node-matching algorithm, based on local mapping. In particular, Algorithm A1 is defined by the following two steps (Du et al., 2010):
• Iterative 1-to-1 node matching. $N - P_r$ pairs of nodes, i.e., $v^1_i \leftrightarrow Q^2_i = \{v^2_{i_1}\}$ ($i = P_r + 1, P_r + 2, \ldots, N$), are first matched by the iterative 1-to-1 node-matching algorithm.
• Candidate nodes selection. Denote by $X^1_i$ the neighbor set of node $v^1_i$ in $G_1$, which has a matched node set $X^2_i$ in $G_2$ whose nodes are 1-to-1 matched to those in $X^1_i$; then denote by $Y^2_i$ the neighbor set of $X^2_i$, including all the nodes directly connected to those in $X^2_i$. Based on the definition of similarity, only the similarities between node $v^1_i$ ($i = P_r + 1, P_r + 2, \ldots, N$) and the nodes in $Y^2_i$ can be larger than 0, and thus only these are recalculated. Then the top $M - 1$ nodes with the largest similarities are selected as the candidate corresponding nodes of $v^1_i$. Note that $v^2_{i_1}$ is not reconsidered here, and if $Y^2_i$ contains fewer than $M - 1$ nodes, the other $M - 1 - |Y^2_i|$ nodes can be randomly selected from $G_2$ to be consistent with Eq. (20).
2) A2: Ensembling. In the area of machine learning, it is common to improve the generalization performance of an algorithm by combining the results of many different predictors (Breiman, 1996; Freund & Schapire, 1997; Krogh & Sollich, 1997; Miyoshi et al., 2005). However, the above iterative one-to-one node-matching algorithm is totally deterministic, i.e., for a given pair of target networks and certain revealed matched nodes, the algorithm must produce the same matching result. Therefore, it cannot be directly used for ensembling, and a new statistical iterative one-to-one node-matching algorithm has to be introduced first, where a pair of newly revealed matched nodes is adopted only with probability $p$ ($p < 1$) to calculate the similarities between the unrevealed nodes of different networks in the succeeding iterative process. Then a group of different one-to-one matching results can be obtained by running this statistical iterative one-to-one node-matching algorithm for several rounds, and the obtained results can be merged into a unique one-to-many matching result by a voting strategy. In particular, Algorithm A2 is defined by the following three steps (Du et al., 2010):
• Iterative 1-to-1 node matching. $N - P_r$ pairs of nodes, i.e., $v^1_i \leftrightarrow Q^2_i = \{v^2_{i_1}\}$ ($i = P_r + 1, P_r + 2, \ldots, N$), are first matched by the deterministic iterative 1-to-1 node-matching algorithm.
• Implement and vote. The statistical 1-to-1 node-matching algorithm with parameter $p$ ($p < 1$) is implemented for $B$ ($B \gg M$) rounds, and a group of $B$ different 1-to-1 matching results is obtained. All of the correspondences in $G_2$ of $v^1_i$ in $G_1$ are grouped into a node set $Z^2_i$, with its size (the number of nodes) satisfying $|Z^2_i| \le B$. Note that each node in $Z^2_i$ carries a positive integer weight representing the number of times it is matched to $v^1_i$ in the total $B$ rounds, and, similarly, $v^2_{i_1}$ is excluded here.
• Candidate nodes selection. The top $M - 1$ nodes with the largest weights in $Z^2_i$ are selected as the $M - 1$ candidate corresponding nodes of $v^1_i$. Sometimes there may be fewer than $M - 1$ nodes in $Z^2_i$, i.e., $|Z^2_i| < M - 1$; in such a situation, the other $M - 1 - |Z^2_i|$ nodes can be randomly selected from $G_2$ to be consistent with Eq. (20).
Similarly, these two one-to-many iterative node-matching algorithms are tested on the real-world interactional chat network and friendship network introduced in Section 2.3, and the matching results are shown in Fig. 4. As we can see, both of the proposed one-to-many algorithms (especially the randomized algorithm A2) can significantly improve the matching precision, and thus can be considered to partially overcome the limitation of one-to-one node-matching algorithms.

Fig. 4. The matching precision $\phi$ as a function of the sample ratio $\gamma$ for $M = 1, 2, 5$ ($M = 1$ means the one-to-one matching result) between the friendship network and the chat network obtained from the database of Alibaba trademanager (Du et al., 2010). Here, the chat network is taken as the reference network.

Conclusion
Since an individual may appear in different systems with different identities, many real-world complex systems can be considered to interact with each other all the time. Revealing the different identities of the same individual is a common task in many areas such as sociology, linguistics, and biology, each of which has its own dedicated methods. When these complex systems are described by networks, this common task becomes a node-matching problem between different complex networks, and thus can be solved in the framework of graph theory.
In this chapter, we reviewed the overall process for solving such node-matching problems between different networks: we first calculated the similarities between nodes of different networks through their connections to several pairs of preliminarily revealed matched nodes and transformed the node-matching problem between two different networks into a maximum weighted bipartite matching problem; then we proposed several node-matching algorithms to solve this problem. By comparison, the iterative node-matching algorithm has approximately linear complexity and behaves much better than the traditional KM algorithm in graph theory. However, it seems that almost all network structure-based one-to-one node-matching algorithms lose their effectiveness when the target networks are highly symmetric, e.g., the iterative node-matching results are not that good on the real-world chat network and friendship network obtained from the database of Alibaba trademanager. Such a limitation can be partially overcome by the proposed one-to-many node-matching algorithms, which mainly focus on quickly narrowing down the search range rather than revealing the exact one-to-one mapping between nodes of different networks. Meanwhile, we also introduced several degree-based revealed matched nodes selecting strategies for the optimal and iterative node-matching algorithms, respectively, in order to further improve the matching results. In the future, more information about individuals and connections may be adopted to create more efficient node-matching algorithms.