Open access peer-reviewed chapter

Evolving Networks and Social Network Analysis Methods and Techniques

By Mário Cordeiro, Rui P. Sarmento, Pavel Brazdil and João Gama

Submitted: January 18th 2018; Reviewed: May 22nd 2018; Published: October 31st 2018

DOI: 10.5772/intechopen.79041

Downloaded: 630

Abstract

Evolving networks by definition are networks that change as a function of time. They are a natural extension of network science since almost all real-world networks evolve over time, either by adding or by removing nodes or links over time: elementary actor-level network measures like network centrality change as a function of time, popularity and influence of individuals grow or fade depending on processes, and events occur in networks during time intervals. Other problems such as network-level statistics computation, link prediction, community detection, and visualization gain additional research importance when applied to dynamic online social networks (OSNs). Due to their temporal dimension, rapid growth of users, velocity of changes in networks, and amount of data that these OSNs generate, effective and efficient methods and techniques for small static networks are now required to scale and deal with the temporal dimension in case of streaming settings. This chapter reviews the state of the art in selected aspects of evolving social networks presenting open research challenges related to OSNs. The challenges suggest that significant further research is required in evolving social networks, i.e., existent methods, techniques, and algorithms must be rethought and designed toward incremental and dynamic versions that allow the efficient analysis of evolving networks.

Keywords

  • evolving networks
  • social network analysis

1. Introduction

One of the consequences of today’s information society is the rise of the digital network society [1]. Organizations and individuals are increasingly connected through a wide range of online and offline networks at different relational levels: social, professional, interaction, information flow, etc. The analysis and modeling of networks and also networked dynamical systems have been the subject of considerable interdisciplinary interest covering a wide range of areas from physics, mathematics, computer science, biology, economics and sociology, in the so-called “new” science of networks [2]. According to [3], media organizations, media content and audiences are no exception: news articles have hyperlinks to link other content; news organizations disseminate news via online social network (OSN) platforms like Twitter and Facebook; and users comment, share and react directly below online news. At the level of news events, recent observation proves that some events and news emerge and spread first using those media channels rather than other traditional media like the online news sites, blogs or even television and radio breaking news [4, 5]. Natural disasters, celebrity news, products announcements or mainstream event coverage show that people increasingly make use of those tools to be informed, discuss and exchange information [6]. Concerning the novelty and timely dissemination of news events, empirical studies show that the online social networking services like Twitter are often the first mediums to break critical natural events such as earthquakes often in a matter of seconds after they occur [4, 5]. Herewith, social networks’ temporal dimension of information is of crucial importance and follows a time decay pattern, that is, posted messages in social media are exchanged, forwarded or commented in early stages and decrease in importance as time passes. 
The information contained in those messages reaches its peak importance right after being posted or in the following hours or days [7]. Besides, the timing of many human activities, ranging from communication to entertainment and work patterns, is characterized by bursts of rapidly occurring events separated by long periods of inactivity, following non-Poisson statistics [8]. This time decay pattern of communication, with its bursts or peaks and its inactivity periods, reinforces the importance of dynamic network analysis methods and techniques as the best approaches to model these problems.

1.1. Solving problems with evolving networks

Typical tasks of social network analysis involve the identification of the most influential, prestigious or central actors, using statistical measures; the identification of hubs and authorities, using link analysis algorithms; the discovery of communities, using community detection techniques; the visualization of the interactions between actors; or the spreading of information. These tasks are instrumental in the process of extracting knowledge from networks and consequently in the process of problem-solving with network data. Specifically, in the areas of journalism and investigation the following topics have been actively discussed in recent years:

Criminological research. In 2001, according to [9], social network analysis has the genuine potential to uncover the complexities of criminal networks. Ref. [10] introduced the intersection of terrorism studies and what was called the “networked criminology”. In 2011, [9] concluded that the applications of social network analysis in criminology were still insufficient when compared to other research areas like sociology and public health [11]. Nevertheless, since then, social network analysis has become one of the major tools for criminal analysis, with [12] identifying the three main areas of analysis and applications of social networks in criminological research. Other topics include the influence of personal networks on crime and, more generally, on delinquent behavior [13, 11]; the analysis of neighborhood networks and their influence on crime [12]; and the exploration and modeling of the organization of crime, that is, street gangs, terrorist groups and organized crime groups [12]. Concrete examples of criminological research using social network analysis are the Enron company dataset [14]; the Panama Papers analysis in connection with the economic network analysis of Portuguese companies and individuals with connections to offshore companies [15]; and the analysis of terrorist networks [16, 17]. Gill and Malamud [18] presented a broad overview, characterization and visualization of the interaction relationships between natural hazards using social network analysis. In addition to the previous contributions, essential and recent work regarding social network analysis and terrorism was published by Malm et al. [19], while Berlusconi [20] devoted an entire book to social network analysis and crime prevention.

Research on journalism: Fu [21] presented an essay with the purpose of promoting social network analysis in the study of journalism. Starting with a communication network taxonomy [22, 23], the focus was on network relations in the study of journalism. The proposed framework presents the four types of communication relations that characterize different networked journalism phenomena [23]: affinity relations describe the socially constructed relationships between two actors, such as alliances and friendships, with the valence of the relation being either positive or negative; flow relations refer to the exchange and transmission of data, resources and information; representational relations focus on the symbolic affiliation between two entities; and semantic networks describe the associations, or semantic relations, among concepts, words or people’s cognitive interpretations toward some shared objects in the network, aiming to build a knowledge base. Another relevant journalism research example is Ref. [24], which examines the temporal dynamics of reciprocity in the setting of legislative co-sponsorship in the 113th US Congress (2013–2015).

1.2. Related work

Several overviews of social network analysis have appeared in the literature in past years. Ref. [25] provides a general and succinct overview of the essentials of social network analysis for static networks, with emphasis on simple statistical measures, link analysis, properties of real-world networks and community detection. Tabassum et al. [26], in an updated overview of social network analysis based on the work of Oliveira and Gama [25], included a full section devoted to evolving networks. Devoted explicitly to evolutionary network analysis, the survey by Aggarwal and Subbian [27] provides an overview of the vast literature on graph evolution analysis. Although the scope of that work was graph analysis in general, that is, it did not specifically address the social network analysis problem, many of the mentioned applications and particular contexts of applicability of the methods are social networks. The literature analyzed by Aggarwal and Subbian [27] covered both snapshot-based and streaming methods and algorithms, and critical applications of evolutionary network analysis in different domains such as the world wide web, telecommunication and communication networks, recommendation networks and social network events, among others, were given. Spiliopoulou [28] organized the advances on evolution in social networks into a common framework to model a network across the time axis and identified the four dimensions associated with knowledge discovery in social networks: dealing with time, objective of study, definition of community, and evolution as an objective versus assumption. Spiliopoulou [28] also enumerated several challenges of social network streams, although this problem is generally conceived as a stream problem. An exciting application of temporal network theory and temporal networks to functional brain connectivity was presented by Thompson et al. [29]; their work presents the theory and methods that introduce the reader to adding the temporal dimension to network analysis and, specifically, shows how many well-known methods can be transposed from static networks to temporal networks. Thompson et al. [29] included a complete list of network measures adapted or proposed for temporal networks.

2. Representation of social networks

A social network is a social structure consisting of a finite set of social actors, such as individuals or organizations, connected by interpersonal relationships. These relationships, also known as ties, can be of a personal or professional nature and can range from casual friends, acquaintances or co-workers to close family bonds. Besides relations, social networks often represent flows of information, interactions and similarities among the set of social actors. Social network analysis is the investigation of the relationships between actors. In network terminology, vertices, also known as nodes, refer to actors or subjects. Edges, also known as links or ties, describe the relationships between actors. Usually, this social structure, or network structure, is represented by graphs, which are mathematical structures used to model pair-wise relations between objects. A graph in this context is made up of vertices, nodes or points which are connected by edges, arcs or lines.

Therefore, a social network is a graph $G$ composed of two fundamental components: a nonempty set of vertices $V$ and a set of edges $E$. Formally it can be defined as $G = (V, E)$. Vertices represent objects, states, positions or placeholders. No two vertices represent the same object or state, so $V$ can be represented by a set of unique vertices $\{v_1, v_2, v_3, \ldots, v_n\}$. Each graph edge $e \in E$ is associated with a pair of graph vertices $(u, v)$, that is, $e = (u, v)$ with $u, v \in V$ for every $e \in E$. Edges can be directed or undirected and can be weighted (or labeled) or unweighted. An undirected edge $e = \{v_i, v_j\}$, with $v_i, v_j \in V$, indicates that the relationship or connection is bi-directional, that is, it can go from $v_i$ to $v_j$ and vice versa. A directed edge $e = (v_i, v_j)$ specifies a one-directional relationship or connection, that is, it can only go from $v_i$ to $v_j$; this means that $(v_i, v_j) \neq (v_j, v_i)$. The total number of vertices $n$ of graph $G$, mathematically $|V| = n$, is called the graph order. The total number of edges $|E| = m$ is known as the size of the graph $G$. The maximum number of edges in an undirected graph is $m_{max} = \frac{n(n-1)}{2}$, while for a directed graph it is $m_{max} = n(n-1)$.

Graphs are represented via two distinct types of graph-theoretic data structures: list structures and matrix structures. List structures, such as incidence lists and adjacency lists, reduce the required storage space for sparse graphs. Matrix structures, such as incidence matrices, adjacency matrices, sociomatrices, Laplacian matrices and distance matrices, are appropriate for representing full matrices of dimension $n \times n$, where $n$ is the total number of vertices of the graph. Figure 1 shows several types of graphs that can be used to model different kinds of social networks. Graphs are classified according to the direction of their links and according to the values assigned to each link. Graphs whose edges connect unordered pairs of vertices or, in other words, in which each edge connects two vertices in both directions simultaneously, are called undirected graphs or undirected networks. On the other hand, graphs in which every edge, or arc, has an orientation assigned are called directed graphs or directed networks. Formally, a directed graph $D$ is an ordered pair $(V, A)$ consisting of a nonempty set $V$ of vertices and a set $A$ of arcs. These arcs are disjoint from $V$. If $e_{12}$ is an arc and $v_1$ and $v_2$ are vertices such that $e_{12} = (v_1, v_2)$, then $e_{12}$ is said to join $v_1$ and $v_2$, with $v_1$ called the initial vertex and $v_2$ the terminal vertex.

Depending on the presence of values assigned to the edges or arcs, a distinction is made between unweighted and weighted graphs or networks. Unweighted graphs or networks are binary by definition, meaning that they only represent the presence or absence of an edge or arc between two vertices. Unless explicitly stated otherwise, we always assume that graphs are unweighted. In weighted graphs, each edge has an associated weight $w \in \mathbb{R}_0^+$ providing more information about the relation between the two vertices (i.e., the strength of the relation). If $e_{12}$ is an arc between the two vertices $v_1$ and $v_2$, $w_{12}$ defines the strength of the connection. For undirected and unweighted graphs, adjacency matrices are binary as a consequence of being unweighted and symmetric as a consequence of being undirected. The edge between vertices $v_i$ and $v_j$ is the same in both directions, $e_{ij} = e_{ji}$, with $w_{ij} = w_{ji} = 1$. The absence of an edge between vertices $v_k$ and $v_l$ is represented by $w_{kl} = w_{lk} = 0$. For directed and weighted graphs, the matrices are nonsymmetric and take values in the interval $[0, \max(w)]$.

Figure 1.

(a) Types of edge graphs and their representation according to an adjacency matrix (b) or an adjacency list (c).
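The two families of data structures described above can be illustrated with a minimal Python sketch. The function names, the triple format `(u, v, w)` for edges and the four-vertex example are assumptions made for illustration only; they are not taken from Figure 1.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n matrix whose entry [u][v] is the weight of edge u -> v (0 if absent)."""
    matrix = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        matrix[u][v] = w
        if not directed:
            matrix[v][u] = w  # an undirected edge is stored in both directions
    return matrix


def adjacency_list(n, edges, directed=False):
    """Build a dict mapping each vertex to a list of (neighbor, weight) pairs."""
    neighbors = {v: [] for v in range(n)}
    for u, v, w in edges:
        neighbors[u].append((v, w))
        if not directed:
            neighbors[v].append((u, w))
    return neighbors


def max_edges(n, directed=False):
    """Maximum number of edges of a simple graph of order n."""
    return n * (n - 1) if directed else n * (n - 1) // 2


# Hypothetical 4-vertex example; weight 1 marks an unweighted edge.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
undirected = adjacency_matrix(4, edges)
directed = adjacency_matrix(4, edges, directed=True)
```

In this sketch the undirected matrix is symmetric while the directed one is not, and the adjacency list stores only the edges that exist, which is why list structures are preferred for sparse graphs.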

3. From static to evolving networks

Previously, the network types and their representations in a static context were described. In real life, however, many networks are dynamic. As time passes, new nodes are added to the network, existing ones are removed, and edges come and go too. Static networks lack one of the most critical dimensions, the temporal dimension: by definition they are assumed not to change or evolve over time. In this section, we deal with the representation of evolving networks. Evolving networks arise in a wide variety of application domains such as the web, social networks and communication networks. In recent years, the interest in dynamic social networks has led to new research and to the need for analysis of evolving networks. Evolution analysis in graphs has applications in a number of scenarios, such as trend analysis in social networks and dynamic link prediction, to mention two typical examples. Figure 2 shows an example of a contact evolving network with instantaneous interactions between vertices. When the interaction between network peers has a time duration, we are in the presence of interval evolving networks, as shown in Figure 3(b).

Assuming that the time $T$ during which a network is observed is finite, we can consider the start point as $t_{start} = 0$ and the end time as $t_{end} = T$. A dynamic network graph $G^D_{0,T} = (V, E_{0,T})$ on a time interval $[0, T[$ consists of a set of vertices or nodes $V$ and a set of temporal edges $E_{0,T}$. The evolving network is a set of graphs across the time axis at discrete time points $t_1, t_2, \ldots, t_{n-1}, t_n$. At time point $t_n$ a graph instance $G(V_n, E_n)$, also denoted $G_n$, is observed, where $E_n$ is the set of temporal edges: $(u, v, t_n) \in E_{0,T}$ is an edge between vertices $u$ and $v$ at time point $t_n$, or on the time interval $t_n = [t_n^{begin}, t_n^{end}]$ such that $t_n^{begin} \le T$ and $t_n^{end} - t_n^{begin} \ge 0$. Examples of network changes that may occur between two time points $t_{n-1}$ and $t_n$ are the addition of new edges, that is, $E_n \setminus E_{n-1}$, and the appearance of additional nodes, that is, $V_n \setminus V_{n-1}$.

Figure 2.

Example of contact evolving network: (a) shows a labeled aggregate network where the labels denote the times of contact, and (b) shows a time-line plot, where each of the lines corresponds to one vertex and time runs from left to right.

Figure 3.

Example of interval evolving network: (a) shows the labeled aggregate network where the labels denote the time interval of the relation, and (b) shows a time-line plot, where each of the lines corresponds to one vertex and gray zones the time duration between two edges.

3.1. Models of temporal representation

Several models for representing evolving networks are available in the literature. Kim and Anderson [30] introduced the concept of a time-ordered graph, and Thompson et al. [29], with a similar conceptual representation, chose the time graphlet to represent time-varying graphs. Casteigts et al. [31] presented the time-varying graph (TVG) formalism, with the concept of a journey to capture the temporal information in graphs. Santoro et al. [32] used this formalism to describe several network measures. Although there are differences in representation and nomenclature, these models are conceptually equivalent. Figure 4 presents the concept of a time-ordered graph for an example network over the time interval $[0, 3]$. Figure 4(a) shows all the time intervals aggregated into a single graph $G_{1,3}$. The discretization of the network, which converts the temporal information into a sequence of $n$ snapshots, is presented in Figure 4(b). In this example the evolving network is represented as a series of static networks $G_1, G_2, \ldots, G_n$. The time-ordered graph $G = (V, E)$ of Figure 4(c) assumes that at each time step a message can be delivered along a single edge. It is an asymmetric directed graph with a vertex $v_t$ for each $v \in V$ and each $t \in \{0, 1, \ldots, n\}$; for each edge $(u, v) \in G_t$ it has a directed edge from $u_{t-1}$ to $v_t$ and vice versa, that is, from $v_{t-1}$ to $u_t$. Although not represented in the figure, the time-ordered graph also has edges from $v_{t-1}$ to $v_t$ for all $v \in V$ and all $t \in \{1, \ldots, n\}$. The time-ordered graph $G = (V, E)$, constructed from the $n$ static networks of a dynamic network $G^D_{i,j} = (V, E_{i,j})$, is a powerful tool to define network metrics and capture their temporal characteristics. In the example of Figure 4(c), the temporal shortest path from node $u = A$ to node $v = B$ is shown. The temporal shortest path from $A$ to $B$ in the interval $[0, 3]$ is $\langle A_0, A_1, D_2, B_3 \rangle$. The time-ordered graph of Kim and Anderson [30] is the model used in the rest of the document.

Figure 4.

Comparison of aggregated representation (a) and time series representation (b). The corresponding time-ordered graph $G$ (c) is presented for the interval $[0, 3]$.
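The construction of a time-ordered graph from a series of snapshots can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Kim and Anderson [30]: the function names are hypothetical, and the three snapshots form an invented network chosen so that the earliest path from A to B is A at time 0, A at time 1, D at time 2, B at time 3; they are not read off Figure 4. Because every edge of the time-ordered graph advances time by exactly one step, a breadth-first search reaches the target at its earliest possible time step.

```python
from collections import deque


def time_ordered_graph(vertices, snapshots):
    """Return successor lists over (vertex, time) pairs.

    snapshots[t - 1] holds the undirected edges of G_t for t = 1..n.
    """
    succ = {}
    for t, edges in enumerate(snapshots, start=1):
        for node in vertices:
            # waiting edge from v_{t-1} to v_t
            succ.setdefault((node, t - 1), []).append((node, t))
        for u, v in edges:
            succ[(u, t - 1)].append((v, t))  # u_{t-1} -> v_t
            succ[(v, t - 1)].append((u, t))  # v_{t-1} -> u_t
    return succ


def temporal_shortest_path(succ, source, target, i, j):
    """Earliest-arrival path from source at time i to target, no later than time j."""
    start = (source, i)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        name, t = node
        if name == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if t >= j:
            continue  # do not leave the interval [i, j]
        for nxt in succ.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical snapshots G_1, G_2, G_3 (not the network of Figure 4).
vertices = ["A", "B", "C", "D"]
snapshots = [[("B", "C")], [("A", "D")], [("D", "B")]]
succ = time_ordered_graph(vertices, snapshots)
path = temporal_shortest_path(succ, "A", "B", 0, 3)
```

With these snapshots, `path` is `[("A", 0), ("A", 1), ("D", 2), ("B", 3)]`, while the search from B to A over the same interval returns `None`, which illustrates that the time-ordered graph is asymmetric even when every snapshot is undirected.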

3.2. Timescale of evolving networks

Regarding the evolution of networks, not all networks evolve equally. Some networks evolve faster than others or have edges that are added at different rates. Two examples with distinct timescales are email networks, where edges are added at the timescale of seconds, and bibliographic networks, where edges are added at the scale of weeks or months. Different time-evolving scenarios require different types of analysis [27].

Slowly evolving networks: When networks evolve slowly over time, snapshot analysis can be used very effectively. In this case, dynamic networks are discretized in time by converting temporal information into a sequence of $n$ static snapshots. The whole analysis can be done on each snapshot of the network at the different times $t_1, t_2, \ldots, t_n$ using static analysis methods. Usually, in slowly evolving networks, the time axis is discretized into intervals of equal length. For a time discretization in years, days or seconds, the time window size $w$ of each snapshot is set to $T/n$, with $n$ being the number of snapshots. Another solution consists of record buckets of equal size for numerical discretization; the window size $w$, in this case, is set to a given number of network updates. In both cases the dynamic network can be represented as a series of static graphs $G_1, G_2, \ldots, G_n$ with $0 \le n \le t_n$. A time point $t_n$ is the moment at which the network undergoes a set of changes, represented by the addition or removal of sets of edges $(+e_1, +e_2, -e_3, -e_4, \ldots, \pm e_n)$ and/or the appearance or disappearance of vertices $(+v_1, +v_2, -v_3, -v_4, \ldots, \pm v_n)$. The signs $+$ and $-$ represent additions and removals, respectively.

Streaming networks: When networks are built by a never-ending flow of transient interactions, such as email or telecommunications networks, they should be modeled and represented as graph streams. Graph streams typically require real-time analytical methods. The scenario of graph streams is more challenging because of the computational requirements and the inability to hold complete graphs in memory or on disk. Velocity is also an issue because common methods must deal with graph updates arriving at very high edge rates. A time point $t_i$ is the moment at which a single change in the network occurs. Guha et al. [33] defined this as the stream model of computation, with a stream being a sequence of records $x_1, x_2, \ldots, x_n$ arriving in increasing order of the index $i$, where $x_i$ may be a new vertex $v_i$ or a new edge $e_i$.
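The two discretization strategies for slowly evolving networks can be contrasted with a short Python sketch. It assumes that the network is available as a list of timestamped edges `(u, v, t)` with $0 \le t < T$; the function names and the sample events are hypothetical and serve only as an illustration.

```python
def snapshots_by_time(events, T, n):
    """Split timestamped edges (u, v, t) into n windows of equal width w = T / n."""
    w = T / n
    snapshots = [[] for _ in range(n)]
    for u, v, t in events:
        index = min(int(t // w), n - 1)  # guard against t falling exactly on T
        snapshots[index].append((u, v))
    return snapshots


def snapshots_by_count(events, w):
    """Split timestamped edges into record buckets of w network updates each."""
    ordered = sorted(events, key=lambda event: event[2])
    return [
        [(u, v) for u, v, _ in ordered[k:k + w]]
        for k in range(0, len(ordered), w)
    ]


# Hypothetical bursty stream: three early interactions, then a long quiet period.
events = [
    ("a", "b", 0.5), ("b", "c", 1.0), ("a", "c", 1.2),
    ("c", "d", 7.9), ("a", "d", 9.5),
]
by_time = snapshots_by_time(events, T=10, n=5)
by_count = snapshots_by_count(events, w=2)
```

For this bursty input the time-based windows hold 3, 0, 0, 1 and 1 edges, whereas the record buckets hold 2, 2 and 1 edges. Equal-length intervals can therefore produce empty snapshots during inactivity periods, while equal-size buckets keep the amount of change per snapshot constant at the cost of covering time spans of different length.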

3.3. Landmark versus sliding windows

When the temporal dimension is added to the analysis of networks, the strategies for selecting the data to be analyzed vary. Figure 5 shows three types of graph data windowing strategies. Landmark windows by Gehrke et al. [34] encompass all the data from a specific point in time up to the current moment. In the landmark window, the model is initialized at a fixed time point, the so-called landmark, which marks the beginning of the window. In successive snapshots, the data window grows to consider all the data seen so far after the landmark. Sliding windows are better suited when we are not interested in computing statistics over all events of the past but only over the recent past [35]. Datar et al. [36] incorporate a forgetting mechanism by keeping only the latest information inside the window and disregarding all the data falling outside the window. Usually, sliding windows are of fixed size. The time-based length sets the window length as a fixed time span. Sliding windows can be overlapping or non-overlapping, depending on whether two consecutive windows share some data or not. From the several window models presented in the literature [37, 38], two basic types of sliding windows are commonly defined: sequence-based models, where the size of the window is determined by the number of observations, and timestamp-based models, where the size of the window is defined by its duration. A timestamp window of size $t$ consists of all elements whose timestamp is within a time interval $t$ of the current period.

Figure 5.

Types of data windows: landmark window (a), non-overlapping sliding window (b) and overlapping sliding window (c).
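The window models can be made concrete with a small Python sketch over a stream of `(timestamp, item)` records sorted by timestamp. The function names, the half-open convention for the timestamp window and the sample stream are assumptions introduced for illustration; the cited works may define the boundaries differently.

```python
def landmark_window(stream, landmark, now):
    """All records from the landmark up to the current moment."""
    return [record for record in stream if landmark <= record[0] <= now]


def timestamp_window(stream, now, t):
    """Timestamp-based sliding window: records within a time span t of 'now'."""
    return [record for record in stream if now - t < record[0] <= now]


def sequence_window(stream, now, k):
    """Sequence-based sliding window: the k most recent records (k >= 1)."""
    seen = [record for record in stream if record[0] <= now]
    return seen[-k:]


# Hypothetical stream of edge arrivals, sorted by timestamp.
stream = [(1, "e1"), (2, "e2"), (4, "e3"), (7, "e4"), (8, "e5")]
```

At `now = 8`, the landmark window starting at time 2 holds four records and keeps growing, the timestamp window of size 3 holds the two records with timestamps 7 and 8, and the sequence window of size 3 holds the last three records regardless of how far apart in time they are.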

3.4. Types of evolving network analysis

Depending on the timescale of the network and the chosen strategy to cope with network data, distinct evolving network analysis methods are available. These methods are divided into one of the following categories [27]. Maintenance methods: In these methods it is desirable to maintain the results of the data mining process continuously over time. Examples of maintenance methods are classification and clustering. Analytical evolution methods: In this case it is desirable to directly quantify and understand the changes that have occurred in the underlying network. Such models are focused on modeling change. Bridge methods: From a methodological point of view, and in the context of a few key problems, an overlapping of maintenance and analytical evolution methods occurs. These bridge methods, such as community detection, fall into both categories.

4. Elementary network measures

In this section, elementary network measures and popular metrics used in the analysis of social networks are presented.

4.1. Actor-level statistical measures

Actor-level or node-level statistical measures determine the importance of an actor or node within the network. These measures reveal the individuals in which the most important relationships are concentrated and give an idea about their social power within their peers.

4.1.1. Degree or valency

$$D(v) = \sum_{u=1}^{n} a_{u,v}, \quad 0 < D(v) < n \tag{1}$$
$$D(v) = |N(v)|, \quad 0 < D(v) < n \tag{2}$$

The degree, or valency, of a node $v$ is usually denoted $D(v)$ and measures the involvement of the node in the network. It is computed as the number of edges incident on a given node or, equivalently, as the number of neighbors of node $v$. The neighborhood $N(v)$ is defined as the set of nodes directly connected to $v$. The degree is an effective measure to assess the importance and influence of an actor in a network, despite drawbacks such as not taking into consideration the global structure of the network. In static networks, the degree can be computed via the adjacency matrix by (1) or using the neighborhood of a node by (2). Different degree calculations apply depending on the type of network: directed or undirected, weighted or unweighted.

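$$D^{+}_{i,j}(v) = \sum_{t=i}^{j} \sum_{u=1}^{n} a_{u,v} \tag{3}$$
$$D^{-}_{i,j}(v) = \sum_{t=i}^{j} \sum_{u=1}^{n} a_{v,u} \tag{4}$$
$$D^{w}_{i,j}(v) = \sum_{t=i}^{j} \sum_{u=1}^{n} a^{w}_{u,v} \tag{5}$$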

For dynamic networks, and generalizing to a directed and unweighted network, the temporal degree $D_{i,j}(v)$ is the total number of inbound and outbound edges of a node $v \in V$ on a time interval $[i, j]$ where $0 \le i < j \le n$. If we disregard the self-edges from $v_{t-1}$ to $v_t$ for all $t \in \{i+1, \ldots, j\}$, $D_{i,j}(v)$ is equal to $\sum_{t=i}^{j} 2 \cdot D_t(v)$, where $D_t(v)$ is the degree of $v$ in $G_t$ (i.e., the dynamic graph at time $t$). For directed networks, there are two variants of degree centrality: the in degree, denoted by $D^{+}_{i,j}(v)$ in (3), is the number of incoming edges to node $v$, that is, edges that end at $v$; the out degree, denoted by $D^{-}_{i,j}(v)$ in (4), is the number of outgoing edges from node $v$, that is, edges that start at $v$. In both sums the entries $a_{u,v}$ are read from the adjacency matrix of the snapshot $G_t$. For weighted networks, strength is the equivalent of degree but is computed as the sum of the weights of the edges adjacent to a given node (5).
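The temporal in degree, out degree and strength of (3), (4) and (5) can be sketched in Python as plain sums over snapshots. The sketch assumes that each snapshot $G_t$ is stored as a list of directed edges in a dictionary keyed by $t$, and that the adjacency entries in the sums are taken from the snapshot at time $t$; the function names and the sample data are hypothetical.

```python
def temporal_in_degree(snapshots, v, i, j):
    """Number of edges ending at v, summed over snapshots G_i..G_j (Eq. 3)."""
    return sum(
        1
        for t in range(i, j + 1)
        for (_, dst) in snapshots.get(t, [])
        if dst == v
    )


def temporal_out_degree(snapshots, v, i, j):
    """Number of edges starting at v, summed over snapshots G_i..G_j (Eq. 4)."""
    return sum(
        1
        for t in range(i, j + 1)
        for (src, _) in snapshots.get(t, [])
        if src == v
    )


def temporal_strength(weighted_snapshots, v, i, j):
    """Sum of the weights of edges ending at v over snapshots G_i..G_j (Eq. 5)."""
    return sum(
        w
        for t in range(i, j + 1)
        for (_, dst, w) in weighted_snapshots.get(t, [])
        if dst == v
    )


# Hypothetical directed snapshots: time step -> list of (source, target) edges.
snapshots = {
    1: [("a", "b"), ("c", "b")],
    2: [("b", "a")],
    3: [("a", "b"), ("b", "c")],
}
# Hypothetical weighted snapshots: (source, target, weight).
weighted = {1: [("a", "b", 0.5), ("c", "b", 2.0)], 2: [("b", "a", 1.0)]}
```

For node `"b"` on the interval $[1, 3]$ the sketch gives an in degree of 3 and an out degree of 2, and on $[1, 2]$ a strength of 2.5.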

4.1.2. Betweenness

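$$B(v) = \sum_{s,d \in V(G) \setminus \{v\}} \frac{\sigma_{sd}(v)}{\sigma_{sd}} \tag{6}$$
$$B_{i,j}(v) = \sum_{i \le t < j} \; \sum_{\substack{s \neq v \neq d \in V \\ \sigma_{t,j}(s,d) > 0}} \frac{\sigma_{t,j}(s,d,v)}{\sigma_{t,j}(s,d)} \tag{7}$$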

Node betweenness $B(v)$ measures the extent to which a node lies between the other nodes in the network. For static networks, (6) is used, where $\sigma_{sd}$ denotes the number of shortest paths between vertices $s$ and $d$ (usually $\sigma_{sd} = 1$) and $\sigma_{sd}(v)$ expresses the number of shortest paths passing through node $v$. Nodes with high betweenness occupy critical roles in the network structure, since their position allows them to work as an interface between different regions of the network. The temporal betweenness $B_{i,j}(v)$ of a node $v \in V$ on a time interval $[i, j]$, where $0 \le i < j \le n$, is given by (7): it is the sum, over each time interval $[t, j]$ with $i \le t < j$ and over all pairs of nodes, of the proportion between the number of temporal shortest paths passing through the vertex $v$ and the total number of temporal shortest paths between the pair. Examples of betweenness algorithms are the Brandes algorithm [39], the incremental algorithm proposed by Nasre et al. [40] and the algorithm proposed by Kas et al. [41] for evolving graphs.

4.1.3. Closeness

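$$C(v) = \frac{1}{n} \sum_{u \in V \setminus \{v\}} d(u, v) \tag{8}$$
$$C_{i,j}(v) = \sum_{i \le t < j} \; \sum_{u \in V \setminus \{v\}} \frac{1}{\Delta_{t,j}(u, v)} \tag{9}$$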

Closeness measures the overall position of an actor in the network, giving an idea of how long it will take, on average, to reach other nodes from a given starting node. It is represented by the average length of the shortest path between the node and all other nodes in the graph. Thus, the more central a node is, the closer it is to all other nodes. In general, it is only computed for nodes within the largest component of the network, as shown in (8). The temporal closeness is defined by considering $m$ intervals $[t, j]$ with $i \le t < j$, where $m = j - i$, obtained by varying the start time $t$ of each time interval from $i$ to $j - 1$, instead of a single time interval $[i, j]$ with starting time $i$. Formally, the temporal closeness of a node $v$ is given by (9), where $\Delta_{t,j}(u, v)$ is the temporal shortest path distance from $u$ to $v$ on a time interval $[t, j]$. If there is no temporal path from $u$ to $v$ on a time interval $[t, j]$, $\Delta_{t,j}(u, v)$ is defined as $\infty$. Since the time-ordered graph $G$ is a directed graph, $\Delta_{t,j}(u, v)$ is in general different from $\Delta_{t,j}(v, u)$. The update of closeness centrality in evolving graphs was addressed by Kas et al. [42] and Sariyuce et al. [43]. Kas et al. [42] developed incremental closeness centrality algorithms for dynamic networks. An extension of the Ramalingam and Reps [44] algorithm computes the closeness values incrementally, using all-pairs shortest paths for streaming, dynamically changing social networks.
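The temporal closeness of (9) can be sketched in Python as follows. Two assumptions are made and should be read as such: snapshots are undirected edge lists keyed by time step, and the temporal distance $\Delta_{t,j}(u, v)$ is measured as the number of time steps elapsed until $v$ is first reached from $u$, with one edge crossed per time step and waiting allowed, in line with the time-ordered graph model. The function names and the example snapshots are hypothetical, and the sketch recomputes everything from scratch rather than incrementally.

```python
import math


def temporal_distance(snapshots, u, v, t, j):
    """Time steps needed to travel from u (at time t) to v using G_{t+1}, ..., G_j.

    One edge may be crossed per time step, and waiting at a node is allowed.
    Returns math.inf when v cannot be reached within the interval.
    """
    reached = {u}
    for k in range(t + 1, j + 1):
        frontier = set(reached)  # waiting: every reached node stays reached
        for a, b in snapshots.get(k, []):
            if a in reached:
                frontier.add(b)
            if b in reached:
                frontier.add(a)
        reached = frontier
        if v in reached:
            return k - t
    return math.inf


def temporal_closeness(snapshots, vertices, v, i, j):
    """Sum of 1 / distance(u, v) over all start times t in [i, j) and all u != v."""
    total = 0.0
    for t in range(i, j):
        for u in vertices:
            if u != v:
                # 1 / inf evaluates to 0.0, so unreachable pairs add nothing
                total += 1.0 / temporal_distance(snapshots, u, v, t, j)
    return total


# Hypothetical undirected snapshots G_1, G_2, G_3.
vertices = ["A", "B", "C", "D"]
snapshots = {1: [("B", "C")], 2: [("A", "D")], 3: [("D", "B")]}
closeness_b = temporal_closeness(snapshots, vertices, "B", 0, 3)
```

In this example A reaches B in three time steps when starting at time 0, while B never reaches A within the interval, so the two temporal distances differ even though every snapshot is undirected.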

4.1.4. Eigenvector centrality

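$$x_v = \frac{1}{\lambda} \sum_{t \in M(v)} x_t = \frac{1}{\lambda} \sum_{t \in G} a_{v,t} \, x_t \tag{10}$$
$$A x = \lambda x \tag{11}$$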

For a given graph $G = (V, E)$ with $|V|$ vertices, let $A = (a_{v,t})$ be the adjacency matrix of an unweighted network, that is, $a_{v,t} = 1$ if vertex $v$ is linked to vertex $t$, and $a_{v,t} = 0$ otherwise. The relative centrality score of vertex $v$ can be defined by (10), where $M(v)$ is the set of neighbors of $v$ and $\lambda$ is a constant. With a small rearrangement, this definition can be rewritten in vector notation as the eigenvector equation shown in (11). Improvements and variants of the classical eigenvector centrality have been developed to address the problem of evolving graphs. Examples are Google’s PageRank [45] and Katz centrality [46], possible variants of this measure as proposed by Society [47]. Concrete implementations of PageRank variants for evolving networks were developed by several researchers, for example, by Bahmani et al. [48], Desikan et al. [49] and Kim and Choi [50]. These improvements over the original PageRank measure are significantly faster than the original PageRank, which, being an iterative process, did not scale well to large-scale graphs. This algorithm will be discussed in detail in Section 5.1.
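A common way to solve the eigenvector equation (11) is power iteration, which repeatedly applies (10) and rescales the scores. The following Python sketch is a swapped-in textbook technique rather than a method described in this chapter; it assumes a connected, non-bipartite, undirected graph with at least one edge, so that the iteration converges, and the function name and the four-vertex example are hypothetical.

```python
def eigenvector_centrality(A, iterations=200):
    """Power iteration on adjacency matrix A; returns (lambda, scores).

    Scores are rescaled so that the largest entry equals 1.
    """
    n = len(A)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # one application of Eq. (10) without the 1 / lambda factor
        y = [sum(A[v][t] * x[t] for t in range(n)) for v in range(n)]
        lam = max(y)  # converges to the leading eigenvalue because max(x) == 1
        x = [value / lam for value in y]
    return lam, x


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
lam, x = eigenvector_centrality(A)
```

In this example vertex 2, which belongs to the triangle and also connects the pendant vertex, receives the highest score, and the pendant vertex 3 receives the lowest. The full recomputation in every call is exactly the cost that incremental variants for evolving graphs aim to avoid.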

4.1.5. Laplacian centrality

$$E_L(G) = \sum_{i=1}^{n} \lambda_i^2 \tag{12}$$
$$E_L(G) = \sum_{i=1}^{n} x_i^2 + 2 \sum_{i<j} w_{i,j}^2 \tag{13}$$

Laplacian centrality takes into account intermediate environmental information around a vertex or node when computing its centrality measure. The centrality of a vertex $v$ is then characterized as a function of the number of 2-walks in the network in which vertex $v$ takes part. To estimate the centrality of a vertex, we need to reflect not only its first-order connections but also the importance of its neighbors. These results and related calculations describe the so-called Laplacian energy of the network, which is why this strategy is known as the Laplacian centrality. The Laplacian energy $E_L(G)$ of a weighted network $G = (V, E, W)$ with $n$ vertices, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of its Laplacian matrix, is defined by (12). Considering that $x_1, x_2, \ldots, x_n$ are the vertex sum weights calculated by $x_i = \sum_{j=1}^{n} w_{i,j}$, where $w_{i,j}$ is the weight of the edge from vertex $i$ to $j$, the Laplacian energy $E_L(G)$ can be computed by (13). The motivation for the incremental Laplacian centrality is supported by the fact that it is known to be a local measure [17, 51].
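Equation (13) makes the Laplacian energy computable without any eigenvalue decomposition, as the following Python sketch shows. The sketch also computes the drop in Laplacian energy caused by removing a vertex, which is how Qi et al. define the Laplacian centrality of that vertex; this definition is not spelled out in the text above and is stated here as background. The function names, the dictionary representation of weighted undirected edges and the three-vertex path are assumptions for illustration.

```python
def laplacian_energy(weights):
    """Laplacian energy via Eq. (13).

    weights maps each undirected edge (i, j) to its weight w_ij.
    Isolated vertices contribute nothing and can be omitted.
    """
    strength = {}
    for (i, j), w in weights.items():
        strength[i] = strength.get(i, 0.0) + w  # vertex sum weight x_i
        strength[j] = strength.get(j, 0.0) + w
    return sum(x * x for x in strength.values()) + 2 * sum(
        w * w for w in weights.values()
    )


def energy_drop(weights, v):
    """Decrease in Laplacian energy when vertex v and its edges are removed."""
    remaining = {edge: w for edge, w in weights.items() if v not in edge}
    return laplacian_energy(weights) - laplacian_energy(remaining)


# Hypothetical unweighted path 0 - 1 - 2 (all weights equal to 1).
path = {(0, 1): 1.0, (1, 2): 1.0}
```

For this path the Laplacian eigenvalues are 0, 1 and 3, so (12) gives an energy of 10, which matches the value obtained from (13). Removing the middle vertex drops the energy by 10 and removing an end vertex by 6; both values depend only on the vertex's own degree and the degrees of its neighbors, which illustrates why the measure is local.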

4.1.5.1. Locality of the Laplacian centrality

The Laplacian centrality metric is not a global measure, that is, it is a function of the local degree plus the degrees of the neighbors (with different weights for each). Qi et al. [17, 51] show that local degree and the 1-order neighbors’ degree are all that are needed to calculate the metric for unweighted networks (Figure 6).

Figure 6.

Node centralities calculated after the addition of edge (4, 6). Dark gray nodes are directly affected by the edge addition. Light gray nodes need their centralities recalculated because they are neighbors of affected nodes.

4.1.5.2. Dynamic Laplace centrality

Although the original Laplace centrality algorithm proposed by Qi et al. [51] is a static algorithm, it can be used to calculate centralities in changing networks by performing a full calculation of the centralities for each network snapshot. In the proposal of Sarmento et al. [52], the principles of Qi et al. [51] were adapted, resulting in two incremental algorithms. The incremental Laplace algorithm by Sarmento et al. [52] achieves better computational efficiency by performing Laplace centrality calculations only for the nodes affected by the addition and removal of edges in each snapshot. Thus, it reuses information from the previous snapshot to perform the Laplace centrality calculations on the current snapshot, for unweighted networks only.
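
The idea of recalculating only the affected nodes can be sketched as follows. This is not the algorithm of Sarmento et al. [52], only a minimal illustration for unweighted networks under the assumption that the closed form $d_v^2+d_v+2\sum_{u\in N(v)}d_u$ is used: when an edge $(u,v)$ is added, the degrees of $u$ and $v$ change, so only $u$, $v$ and their neighbors can have a different score. The function names and the path graph are hypothetical.

```python
def local_centrality(adj, v):
    """Laplacian centrality of v in an unweighted graph (closed form)."""
    d_v = len(adj[v])
    return d_v ** 2 + d_v + 2 * sum(len(adj[u]) for u in adj[v])


def add_edge_incremental(adj, scores, u, v):
    """Insert edge (u, v) and refresh only the scores that can change.

    Returns the set of nodes that were recalculated.
    """
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
    # The endpoints change degree; their neighbors see a changed neighbor degree.
    affected = {u, v} | adj[u] | adj[v]
    for node in affected:
        scores[node] = local_centrality(adj, node)
    return affected


# Hypothetical snapshot: the path 1-2-3-4-5-6.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
scores = {v: local_centrality(adj, v) for v in adj}

# Next snapshot: edge (1, 3) arrives; nodes 5 and 6 are left untouched.
affected = add_edge_incremental(adj, scores, 1, 3)
print(sorted(affected))
```

Edge removal can be handled symmetrically, by collecting the affected set before deleting the edge.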

4.2. Network-level statistical measures

Before describing network-level statistical measures, it is important to describe three fundamental concepts that are common to static and dynamic networks.

A path is a sequence of non-repeating nodes in which each consecutive pair is linked by an edge. When the temporal dimension of dynamic networks is added, the concept changes slightly, because nodes are required to be non-repeating only within the same snapshot or time step, in what is called a temporal path. Temporal paths can therefore have repeating nodes in different time steps, for example, $A_0, B_0, C_1, B_1, A_2$, where the subscript is the time step.

The geodesic distance between nodes $u$ and $v$, denoted $\delta_{uv}$, is the length of the shortest (minimal) path between $u$ and $v$ in a static graph. For a given time-ordered graph $G$, a temporal path from node $u$ to node $v$ on the time interval $[i,j]$, where $0\le i<j\le n$, is defined as any path $p=\langle u_i,\ldots,v_k\rangle$ where $i<k\le j$, having the path length $|p|=\min_{i<l\le j}\delta(u_i,v_l)$, with $\delta$ being the shortest path distance in a static graph. The temporal shortest path from node $u$ to node $v$ is the temporal path connecting $u$ to $v$ that has minimum temporal length. An example of a temporal shortest path in a time-ordered graph was shown in Figure 4.

The eccentricity of a vertex $v$ is the greatest geodesic distance between $v$ and any other vertex in the graph, that is, $\varepsilon(v)=\max_{i\in V(G)\setminus\{v\}}d(v,i)$, where $d(v,i)$ is the geodesic distance between $v$ and $i$.

4.2.1. Edge bursts

$$B_{ij}=\frac{\sigma(\tau_{ij})-\mu(\tau_{ij})}{\sigma(\tau_{ij})+\mu(\tau_{ij})} \tag{14}$$

A hallmark of a bursty edge is the presence of multiple edges with short intercontact times, followed by longer and varying intercontact times [29]. One of the methods available to quantify bursts is the burstiness coefficient $B$. Presented by Goh and Barabasi [53], it can be formulated for discrete graphs [54], where bursts are computed per edge using (14), with $\tau_{ij}$ being a vector of the intercontact times between nodes $i$ and $j$ through time, $\sigma(\tau_{ij})$ its standard deviation and $\mu(\tau_{ij})$ its mean. Temporal connectivity is considered bursty when $B>0$, which occurs when the standard deviation $\sigma(\tau_{ij})$ is greater than the mean $\mu(\tau_{ij})$.
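
As a minimal sketch of (14), the burstiness of a single edge can be computed from the time stamps at which the edge is active. The example assumes the population standard deviation (the text does not say which estimator is intended), and the two contact sequences are invented for illustration.

```python
from statistics import mean, pstdev


def burstiness(contact_times):
    """Burstiness coefficient of one edge from its sorted contact times."""
    # Intercontact times: gaps between consecutive contacts of the edge.
    taus = [b - a for a, b in zip(contact_times, contact_times[1:])]
    mu = mean(taus)
    sigma = pstdev(taus)  # assumption: population standard deviation
    return (sigma - mu) / (sigma + mu)


# Hypothetical contact sequences for two edges.
periodic = [0, 2, 4, 6, 8]               # perfectly regular contacts
bursty = [0, 1, 2, 3, 20, 21, 22, 60]    # short gaps separated by long pauses

print(burstiness(periodic))  # -1.0: the standard deviation is zero
print(burstiness(bursty))    # positive: the standard deviation exceeds the mean
```

A perfectly periodic edge sits at the lower bound $B=-1$, while the second sequence yields $B>0$ and would be classified as bursty.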

4.2.2. Fluctuability

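$$F=\frac{\sum_{i}\sum_{j}U(A_{i,j})}{\sum_{i}\sum_{j}\sum_{t}A_{i,j}^{t}} \tag{15}$$
$$F_i^N=\frac{\sum_{j}U(A_{i,j})}{\sum_{j}\sum_{t}A_{i,j}^{t}} \tag{16}$$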

As discussed before, centrality measures provide information about the degree of temporal connectivity, while bursts describe the distribution of the temporal patterns of connectivity at the node level. Fluctuability, in turn, can be used to retrieve information about the global state of a temporal network, in this case by quantifying the temporal variability of connectivity [29]. The fluctuability $F$, as shown in (15), is the ratio of the number of distinct edges present in $A$ over the total number of edge occurrences, that is, the sum of $A^t$ over time. $U$ is a function with binary output: $U(A_{i,j})$ is set to 1 if at least one edge occurs between nodes $i$ and $j$ across time $t=1,2,\ldots,T$, and to 0 otherwise, where $T$ is the number of time points. The maximum value of $F$ is 1 and occurs only when every edge is unique, that is, occurs only once. The fluctuability $F_i^N$ at the node level is defined using (16) when $\sum_j U(A_{i,j})>0$; when $\sum_j U(A_{i,j})=0$, $F_i^N$ is equal to 0 (Figure 7).
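
A minimal sketch of (15) and (16) follows, assuming the temporal network is stored as a list of snapshots, each snapshot being a set of undirected edges written as sorted tuples. With this representation every edge is counted once instead of twice as in a symmetric adjacency matrix, which leaves the ratio unchanged. The data and function names are hypothetical.

```python
def fluctuability(snapshots):
    """Global fluctuability: distinct edges over total edge occurrences."""
    distinct_edges = set().union(*snapshots)
    occurrences = sum(len(edges) for edges in snapshots)
    return len(distinct_edges) / occurrences if occurrences else 0.0


def node_fluctuability(snapshots, node):
    """Node-level fluctuability, restricted to the edges incident to `node`."""
    incident = [e for edges in snapshots for e in edges if node in e]
    # By definition the value is 0 for a node that never has an edge.
    return len(set(incident)) / len(incident) if incident else 0.0


# Hypothetical contact network observed at T = 3 time points.
snapshots = [
    {("a", "b")},
    {("a", "b"), ("b", "c")},
    {("a", "b")},
]

print(fluctuability(snapshots))            # 2 distinct edges / 4 occurrences
print(node_fluctuability(snapshots, "c"))  # its only edge occurs once
```

Node `a` repeats the same edge at every time point and therefore has a low fluctuability of 1/3, whereas node `c` reaches the maximum of 1.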

Figure 7.

Variation of fluctuability and volatility measures over three different evolving contact networks.

4.2.3. Volatility

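$$V=\frac{1}{T-1}\sum_{t=1}^{T-1}D\left(A^{t},A^{t+1}\right) \tag{17}$$
$$V_{i,j}^{L}=\frac{1}{T-1}\sum_{t=1}^{T-1}D\left(A_{i,j}^{t},A_{i,j}^{t+1}\right) \tag{18}$$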

The volatility $V$ is a global measure of temporal order that represents how much, on average, the connectivity changes between consecutive time-ordered graphs [29]. This measure indicates how volatile the temporal network is over time and is computed by (17), where $D$ is a distance function and $T$ is the total number of time points. The distance function quantifies the difference between the time-ordered graph $G_t$ and the time-ordered graph $G_{t+1}$, represented by their adjacency matrices $A^{t}$ and $A^{t+1}$. One example of a distance function for volatility is the Hamming distance. Volatility can also be defined at the local level, for example, a per-edge volatility can be computed using (18). An estimate of the volatility centrality of node $i$ can be computed by taking the mean of $V_{i,j}^{L}$ over $j$ (Figure 7).
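
The following sketch illustrates (17) and (18) with the Hamming distance, again assuming that each snapshot is a set of undirected edges stored as sorted tuples. Under that assumption the Hamming distance between two snapshots is the number of edges present in exactly one of them (a symmetric adjacency matrix would count each such edge twice). Names and data are hypothetical.

```python
def hamming(edges_a, edges_b):
    """Number of edges that appear in exactly one of the two snapshots."""
    return len(edges_a ^ edges_b)


def volatility(snapshots):
    """Global volatility: mean distance between consecutive snapshots."""
    steps = len(snapshots) - 1
    return sum(hamming(a, b) for a, b in zip(snapshots, snapshots[1:])) / steps


def edge_volatility(snapshots, edge):
    """Per-edge volatility: how often the edge switches on or off."""
    steps = len(snapshots) - 1
    flips = sum((edge in a) != (edge in b) for a, b in zip(snapshots, snapshots[1:]))
    return flips / steps


# Hypothetical contact network observed at T = 3 time points.
snapshots = [
    {("a", "b")},
    {("a", "b"), ("b", "c")},
    {("a", "b")},
]

print(volatility(snapshots))                   # one edge changes at each step
print(edge_volatility(snapshots, ("a", "b")))  # always present
print(edge_volatility(snapshots, ("b", "c")))  # appears, then disappears
```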

4.2.4. Reachability latency

$$R_r=\frac{1}{TN}\sum_{t}\sum_{i}d_{i,k}^{t} \tag{19}$$
$$R_1=\frac{1}{TN}\sum_{t}\sum_{i}\max_{j}d_{i,j}^{t} \tag{20}$$

Reachability measures, like reachability ratio and reachability time, focus on estimating the time taken to reach the nodes in a temporal network [55]. While the reachability ratio is the percentage of node pairs that have a temporal path connecting them, the reachability time is defined as the average length of all temporal paths. In most real-world networks, if we consider a sufficiently long time interval, any node can reach all the others within that time span. With this assumption in mind, Thompson et al. [29] defined the reachability latency, which quantifies the average time it takes for a temporal network to reach an a priori defined reachability ratio $r$, as defined in (19), where $d_i^t$ is an ordered vector of length $N$ of the shortest temporal paths for node $i$ at time point $t$, and $k$ denotes the $rN$th element of that vector. In case $r=1$, that is, all nodes are reachable, the formula simplifies to (20), which is also known as the temporal diameter of the network [56].

4.2.5. Temporal efficiency

$$E=\frac{1}{T\left(N^{2}-N\right)}\sum_{i,j,t}\frac{1}{d_{i,j}^{t}},\quad i\neq j \tag{21}$$

For static networks, efficiency is computed as the inverse of the average shortest path for all nodes [29]. Temporal efficiency is first calculated at each time point as the inverse of the average shortest path length of all nodes; these values are then averaged across time points to obtain an estimate of global temporal efficiency, as shown in (21).

4.2.6. Diameter and radius

The diameter $D$ is the maximum eccentricity over the set of vertices, $D=\max\{\varepsilon(v):v\in V\}$, and, analogously, the radius $R$ is the minimum eccentricity over the set of vertices, $R=\min\{\varepsilon(v):v\in V\}$.

4.2.7. Average geodesic distance

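$$L(G)=\frac{1}{\frac{1}{2}n(n-1)}\sum_{\{u,v\},\,u\neq v}\delta_{uv} \tag{22}$$
$$L(G)=\frac{1}{n(n-1)}\sum_{u\neq v}d_{uv} \tag{23}$$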

The average geodesic distance $L$ gives an idea of how far apart nodes are, on average. For static networks, it is computed over all combinations of vertex pairs in the network as in (22), where $\delta_{uv}$ is the geodesic distance between nodes $u$ and $v$ and $\frac{1}{2}n(n-1)$ is the number of possible edges in a network of $n$ nodes. Tang et al. [57, 58] defined the characteristic temporal path length as the natural extension of the average geodesic distance to time-varying graphs. It is defined as the average temporal distance over all pairs of nodes in the graph, as shown in (23), where the temporal distance $d_{uv}$ between $u$ and $v$ is the temporal length of the temporal shortest path from $u$ to $v$.
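
For a static, connected, unweighted and undirected graph, eccentricity, diameter, radius and the average geodesic distance of (22) can all be obtained from breadth-first search distances. The sketch below makes exactly those assumptions (it does not handle disconnected graphs or temporal paths), and the four-node path graph is a hypothetical example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def eccentricity(adj, v):
    """Greatest geodesic distance between v and any other node."""
    return max(d for node, d in bfs_distances(adj, v).items() if node != v)


def average_geodesic_distance(adj):
    """Mean distance over the n(n-1)/2 unordered node pairs, as in (22)."""
    n = len(adj)
    ordered_total = sum(d for v in adj for d in bfs_distances(adj, v).values())
    return (ordered_total / 2) / (n * (n - 1) / 2)


# Hypothetical static graph: the path a-b-c-d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

ecc = {v: eccentricity(adj, v) for v in adj}
diameter = max(ecc.values())
radius = min(ecc.values())
print(ecc, diameter, radius, average_geodesic_distance(adj))
```

The end nodes have eccentricity 3 and the inner nodes eccentricity 2, so the diameter is 3 and the radius is 2, while the six pairwise distances sum to 10.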

4.2.8. Average degree

$$D(G)=\frac{1}{n}\,\frac{1}{t}\sum_{i=0}^{t}\sum_{u=1}^{n}\sum_{v=1}^{n}a_{uv}^{i} \tag{24}$$

The average degree is the mean number of edges per vertex in a network over all time steps up to $t$, with $a^{i}$ being the adjacency matrix at time step $i$, as shown in (24).

4.2.9. Reciprocity

$$r=\frac{\#\text{mutual}}{\#\text{mutual}+\#\text{asymmetric}} \tag{25}$$

For static networks, reciprocity $r$ is a quantity specific to directed networks that measures the tendency of pairs of nodes to form mutual connections between each other. The value of reciprocity represents the probability that two nodes in a directed network point to each other. Each of the $n(n-1)/2$ dyads in the network is assigned to one of three types: mutual, that is, node $i$ has a tie to node $j$ and node $j$ has a tie to $i$; asymmetric, that is, either $i$ has a tie to $j$ or $j$ has a tie to $i$ but not both; or null, that is, neither the $i$ to $j$ tie nor the $j$ to $i$ tie is present [59]. Given this, reciprocity can be computed using (25), where #mutual denotes the number of mutual dyads and #asymmetric the number of asymmetric dyads. In an undirected network, reciprocity is always maximum ($r=1$) because all pairs of connected nodes are symmetric, that is, their dyads are of the type mutual. Brandenberger [24] analyzed the temporal dynamics of reciprocity in congressional collaborations using relational event models.
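
The dyad classification behind (25) can be sketched as a simple census over all unordered node pairs. The directed network below is hypothetical, and returning 0 for a network without any tie is a convention chosen for this sketch only.

```python
from itertools import combinations


def reciprocity(nodes, arcs):
    """Reciprocity of a directed network given as a set of (source, target) arcs."""
    mutual = asymmetric = 0
    for i, j in combinations(nodes, 2):   # the n(n-1)/2 dyads
        forward, backward = (i, j) in arcs, (j, i) in arcs
        if forward and backward:
            mutual += 1
        elif forward or backward:
            asymmetric += 1
        # otherwise the dyad is null and does not enter (25)
    connected = mutual + asymmetric
    return mutual / connected if connected else 0.0  # convention for empty networks


# Hypothetical directed network: a<->b, b->c, c<->d.
nodes = ["a", "b", "c", "d"]
arcs = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("d", "c")}

print(reciprocity(nodes, arcs))  # 2 mutual dyads, 1 asymmetric dyad
```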

4.2.10. Density

$$\rho(G)=\frac{m(G)}{m_{\max}(G)} \tag{26}$$

Density $\rho$ describes the general level of connectedness of a network. It is computed as the proportion of edges in the network relative to the maximum possible number of edges, as seen in (26), where $m(G)$ is the total number of edges of network $G$ and $m_{\max}(G)$ is the number of possible edges of network $G$, which is $\frac{n(n-1)}{2}$ for undirected networks and $n(n-1)$ for directed ones.

4.2.11. Global clustering coefficient

Cui et al. [60] propose two definitions of the clustering coefficient of a temporal network: the temporal-delayed clustering coefficient and the temporal-weighted clustering coefficient.

5. Link analysis

In network theory, link analysis is a data analysis technique used to evaluate relationships (connections) between nodes. Link analysis has been used in fraud detection, the investigation of terrorist networks, computer security analysis, search engine optimization (SEO), market research and medical research, among others. Link analysis algorithms were devised to find the most valuable, authoritative or influential node, or list of nodes, in a network. By exploring the relationship between links and the content of web pages, the PageRank algorithm [45] became one of the seminal methods employed in building modern and efficient search engines and information retrieval systems for the web.

5.1. Incremental PageRank algorithm

There have been several attempts to improve the original PageRank algorithm [45]. The purpose of several of these improvements was to adapt the algorithm to streaming data. In the original algorithm, the rank of each page depends on the ranks of the pages pointing to it. The PageRank value of a page $p$ is mathematically written as:

$$PR(p)=\frac{d}{n}+(1-d)\sum_{(q,p)\in G}\frac{PR(q)}{OutDegree(q)} \tag{27}$$

where $n$ is the number of vertices/pages in the graph, $d$ is the damping parameter and $OutDegree(q)$ is the number of hyperlinks on the node/page $q$. Figure 8(a) illustrates an example of computing the PageRank of a page $P$ from the pages $P_1$, $P_2$ and $P_3$ pointing to it, using (28):

$$PR(P)=\frac{d}{n}+(1-d)\left(\frac{PR(P_1)}{OutDeg(P_1)}+\frac{PR(P_2)}{OutDeg(P_2)}+\frac{PR(P_3)}{OutDeg(P_3)}\right) \tag{28}$$
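
The iterative nature of the static computation can be illustrated with a small power-iteration sketch that applies (27) repeatedly until the values stop changing. It assumes every page has at least one outgoing link (dangling pages are not handled), and the value $d=0.15$, the tolerance and the four-page graph are illustrative choices, not values taken from the text.

```python
def pagerank(out_links, d=0.15, tol=1e-12, max_iter=1000):
    """Power iteration for PR(p) = d/n + (1 - d) * sum(PR(q) / OutDegree(q)).

    Assumes every page has at least one outgoing link.
    """
    pages = list(out_links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}          # uniform starting point
    for _ in range(max_iter):
        new = {p: d / n for p in pages}
        for q, targets in out_links.items():
            share = (1 - d) * pr[q] / len(targets)
            for p in targets:
                new[p] += share
        delta = sum(abs(new[p] - pr[p]) for p in pages)
        pr = new
        if delta < tol:                       # values have stabilized
            break
    return pr


# Hypothetical graph: P1, P2 and P3 all point to P, and P links back to each.
out_links = {"P": ["P1", "P2", "P3"], "P1": ["P"], "P2": ["P"], "P3": ["P"]}
ranks = pagerank(out_links)
print(ranks)
```

Every sweep touches all edges of the graph, which is why recomputing from scratch after each small change does not scale and motivates incremental variants.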

Figure 8.

PageRank computation and the graph partitioning used in the incremental PageRank of Desikan et al. [49].

Desikan et al. [49] provided a solution for updating the PageRank values of nodes in evolving graphs in an incremental fashion. The algorithm exploits the fact that the web evolves incrementally, with small changes between updates. Figure 8(b) shows the two partitions created. Partition P is unchanged since the last computation and has only outgoing edges to the other partition. Partition Q is the rest of the graph, which has changed since the last time the metric was computed. The principal idea is to find a partition P such that it has no incoming links from partition Q, which includes all changed nodes. The PageRank of partition Q can then be computed separately, scaled and merged with the rest of the graph to obtain the updated PageRank values of the vertices in this partition. The PageRank of partition Q is computed taking into account the border vertices that belong to partition P and have edges pointing to vertices in partition Q. The PageRank values of partition P are obtained by simple scaling, which accounts for the addition of new nodes.

Let the graph of Figure 8(b) at the new time be $G(V,E)$, where $V_b$ is a vertex on the border of the left partition (with only outgoing edges to the right partition); $V_{ul}$ is a vertex in the left partition that remains unchanged; $V_{ur}$ is a vertex in the right partition that remains unchanged but whose PageRank is affected by vertices in the changed component; and $V_{cr}$ is a vertex in the right partition that has changed or has been newly added. Desikan et al. [61] proposed a divide and conquer approach for efficient PageRank computation based on these assumptions.

Other PageRank algorithms, designed to estimate the PageRank on a dynamic, evolving graph stream, are available in [62, 63]. Sarma et al. [62] developed a method that can estimate the PageRank distribution, the mixing time and the conductance of the graph. Bahmani et al. [63] developed a method for real-time estimation of the personalized PageRank in graph streams. Zhang et al. [64] proposed a method for approximate personalized PageRank on dynamic graphs. The update of PageRank node values in dynamic graph streams has been used extensively to improve the efficiency of large-scale link analysis, including in applications with known scalability issues such as text streams [65].

5.2. Link prediction

To understand the association between two specific nodes, researchers commonly study the dynamics of evolving graphs. In link prediction, the problem is to predict the likelihood of a future association between two nodes, knowing that there is no association, that is, no edge, between them yet. Link prediction is used in bioinformatics, where potential protein connections are inferred from known connections, and in the investigation of terrorist and criminal networks, where potential criminal connections are inferred from current knowledge of the relationships between criminals. Link prediction is a complex problem. For a social network $G(V,E)$ with $n$ nodes and $m$ edges, there are $n^2-m$ possible edges to infer from the current graph, assuming that we randomly select a non-existing edge. If $G$ is sparse, $m$ is small compared with $n^2$. Thus, in the limit of a large number of nodes, there are about $n^2$ edges to choose from, and the probability of inferring one correctly at random is $1/n^2$. Social networks derived from real-world phenomena are commonly sparse, so inferring random edges is expected to have low accuracy. As networks evolve and nodes are added, the number of possible links grows quadratically, whereas the number of new edges is expected to grow only linearly with the number of added nodes. Thus, the problem gets worse in evolving networks as time goes forward.

5.2.1. Common neighbors

Newman has verified a significant correlation between the number of common neighbors of $u$ and $v$ at time $t$ and the probability that $u$ and $v$ will connect or collaborate at some time after $t$ [66]. The common-neighbors predictor is based on the assumption that two nodes that are not yet connected but have common neighbors will get connected sometime in the future. This introduction between unconnected nodes produces the effect of “closing the triangle”.

5.2.2. Jaccard coefficient

The Jaccard coefficient is a common metric for measuring the similarity between different samples. It is used throughout validation tasks in information retrieval research. It measures the probability that both $u$ and $v$ have a feature $f$, for a randomly selected feature $f$ that either $u$ or $v$ has.

$$JC=\frac{|N(u)\cap N(v)|}{|N(u)\cup N(v)|} \tag{29}$$

where $N(u)$ is the set of neighbors of $u$, and $N(v)$ is the set of neighbors of $v$. Thus, in (29), the numerator is the number of common neighbors, and the denominator is the number of distinct neighbors of $u$ and $v$.
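
Both neighborhood-based predictors reduce to set operations on neighbor sets, as the following minimal sketch shows. The adjacency dictionary and the candidate pair are hypothetical, and returning 0 when both nodes are isolated is a convention of this sketch.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by distinct neighbors of u and v, as in (29)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


# Hypothetical snapshot: a and b are not linked but share neighbors c and d.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

print(common_neighbors(adj, "a", "b"))  # c and d
print(jaccard(adj, "a", "b"))           # 2 shared out of 3 distinct neighbors
```

In practice the scores are computed for every unconnected pair and the highest-ranked pairs are reported as predicted links.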

5.2.3. Adamic/Adar

This measure, also called frequency-weighted common neighbors, refines the simple counting of common features in $JC$ by weighting rarer features more heavily [67]. The Adamic/Adar predictor is based on the intuition that unusual features are more critical in predicting future outcomes. For example, if $u$ and $v$ are to be introduced by a common friend $z$, person $z$ has to choose the pair $(u,v)$ from among the pairs formed by his or her $|N(z)|$ connections.

$$AA=\sum_{z\in N(u)\cap N(v)}\frac{1}{\log|N(z)|} \tag{30}$$

5.2.4. Preferential attachment

Another concept, this time with lower complexity, is preferential attachment [68]. This metric needs only node degree information. The intuition is that nodes with a higher degree are more likely to connect to each other than to a node with a lower degree.

$$PA=|N(u)|\cdot|N(v)| \tag{31}$$
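
The Adamic/Adar and preferential attachment scores can be sketched in a few lines. The natural logarithm is assumed here (the base only rescales the scores), and the adjacency dictionary is a hypothetical example. A common neighbor of two distinct nodes always has degree of at least 2, so the logarithm in the denominator is never zero.

```python
import math


def adamic_adar(adj, u, v):
    """Sum over common neighbors z of 1 / log(degree of z)."""
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def preferential_attachment(adj, u, v):
    """Product of the degrees of u and v."""
    return len(adj[u]) * len(adj[v])


# Hypothetical snapshot: a and b are not linked but share neighbors c and d.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

print(adamic_adar(adj, "a", "b"))              # two common neighbors of degree 2
print(preferential_attachment(adj, "a", "b"))  # 2 * 3
```

Preferential attachment needs no neighborhood intersection at all, which is why it is the cheapest of these predictors to maintain while degrees change over time.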

5.2.5. Katz

The Katz measure [46] is based on the assumption that the more paths connect two nodes in the network, and the shorter those paths are, the more likely the nodes are to connect in the future. The concept is also called “exponentially damped path counts”.

$$Katz_{score}=\sum_{L=1}^{\infty}\beta^{L}\cdot|Path_{u,v}^{L}| \tag{32}$$

where $|Path_{u,v}^{L}|$ is the number of paths of length $L$ between $u$ and $v$, with $L$ being the number of hops, and the factor $\beta^{L}$ damps the contribution of each path exponentially with its length.
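
A truncated version of (32) can be sketched with powers of the adjacency matrix, whose entry $(u,v)$ in the $L$th power counts the walks of length $L$ between $u$ and $v$ (walks may revisit nodes, which is the usual reading of the Katz measure). The cutoff `max_len`, the value of `beta` and the three-node graph are assumptions made for illustration; the infinite sum is simply cut after `max_len` terms.

```python
def katz_score(adj_matrix, u, v, beta=0.1, max_len=5):
    """Truncated Katz score: sum of beta**L times the number of walks of length L."""
    n = len(adj_matrix)
    power = [row[:] for row in adj_matrix]    # A^1
    score = 0.0
    for length in range(1, max_len + 1):
        score += beta ** length * power[u][v]
        # Next matrix power: A^(length + 1) = A^length * A
        power = [
            [sum(power[i][k] * adj_matrix[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return score


# Hypothetical graph: the path 0-1-2, so nodes 0 and 2 are two hops apart.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

print(katz_score(A, 0, 2))  # one walk of length 2, two walks of length 4
```

Because of the damping, the single two-hop connection contributes 0.01 while the two four-hop walks together contribute only 0.0002.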

5.2.6. Recent developments

Ibrahim and Chen [69] present a method for link prediction in dynamic networks that integrates temporal information, community structure and node centrality in the network, giving greater weights to frequently occurring links. Wahid-Ul-Ashraf et al. [70] described the parallelism between Newton’s law of universal gravitation and link prediction tasks. To apply this law, the authors attributed the notions of mass and distance to nodes. Node centrality could be considered as mass, and the authors tested this concept with degree centrality. The distance between nodes can be obtained through several possible methods, that is, by retrieving the shortest path, path count or inverse similarity, using previously stated measures like Adamic/Adar, Katz score or others. Choudhury and Uddin [71] considered the evolutionary aspects of community network structure. They built dynamic similarity metrics, or dynamic features, to measure similarity/proximity between actor pairs.

6. Community detection

As a consequence of both global and local heterogeneity of the edge distribution in a graph, specific regions of the graph, called communities, exhibit a high concentration of edges, whereas the regions between them have low concentrations of edges. In the context of networks, these groups of nodes that are more densely connected internally than with the rest of the network are called community structures. Also known as modules or clusters, communities can therefore be straightforwardly defined as groups of similar nodes. A more complete definition, using the concept of density, is the following: communities are densely connected groups of vertices in the network, with sparser connections between them.

6.1. Finding communities in static networks

Fortunato [72] provides a comprehensive survey of methods and techniques for finding communities. Hierarchical clustering methods can be of two types: agglomerative algorithms, in which clusters are iteratively merged if their similarity is sufficiently high, and divisive algorithms, in which clusters are iteratively split by removing edges connecting vertices with low similarity.

Divisive algorithms: One of the best-known divisive algorithms is the one proposed by Girvan and Newman [73]. The philosophy of divisive algorithms is that a simple way to identify communities in a graph is to detect the edges that connect vertices of different communities and remove them, so that the clusters get disconnected from each other.

Agglomerative algorithms: Examples of agglomerative algorithms are the ones that assume that high values of modularity indicate good partitions, so that the best partition corresponds to the maximum value of modularity on a graph. A modularity measure $Q$ is therefore used to evaluate the quality of the community structure of a graph, and it serves as the objective function during the process of calculating the communities [74]. Higher values of $Q$ mean better community structures, so the objective is to find a community assignment for each node of the network such that $Q$ is maximized. A greedy algorithm based on modularity optimization was introduced by Blondel et al. [75], in which initially all vertices of the graph are put in different communities (Figure 9(b)). The first step consists of a sequential sweep over all vertices, in which each vertex is moved to the community of the neighbor that yields the largest increase of modularity (Figure 9(c)). At the end of the sweep, one obtains the first-level partition. In the second step, communities are replaced by super vertices, and the weight of the edge between two super vertices is the sum of the weights of the edges between the represented communities at the lower level (Figure 9(d)). The two steps of the algorithm are then repeated, yielding new hierarchical levels and supergraphs (Figure 9(f)).
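
To make the objective function concrete, the sketch below evaluates the modularity of a given partition. The text cites the measure [74] without stating it, so the standard Newman-Girvan form for unweighted, undirected graphs is assumed here: $Q=\sum_c\left[\frac{L_c}{m}-\left(\frac{D_c}{2m}\right)^2\right]$, where $L_c$ is the number of edges inside community $c$, $D_c$ the sum of the degrees of its nodes and $m$ the total number of edges. The graph and partitions are hypothetical.

```python
def modularity(adj, community):
    """Modularity Q of a partition of an unweighted, undirected graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    internal = {}    # L_c: edges with both endpoints in community c
    degree_sum = {}  # D_c: total degree of the nodes in community c
    for u, nbrs in adj.items():
        c = community[u]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        for v in nbrs:
            if community[v] == c:
                internal[c] = internal.get(c, 0) + 0.5  # each edge is seen twice
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical graph: two triangles (0, 1, 2) and (3, 4, 5) joined by edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}

by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
all_together = {v: "A" for v in adj}

print(modularity(adj, by_triangle))   # 5/14, the natural split
print(modularity(adj, all_together))  # 0.0, a single community
```

A greedy method of the kind described above would evaluate the change in this quantity for each candidate move of a vertex and keep the move with the largest positive gain.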

Figure 9.

Example of an agglomerative community detection algorithm: the original Louvain method [75], showing all steps of the algorithm.

6.2. Finding communities in dynamic networks

When discussing methods for finding communities in dynamic networks, it is generally agreed that they divide into methods for slowly evolving networks and methods for streaming networks [27]. In the following sections, algorithms for both scenarios are presented and analyzed.

6.2.1. Slowly evolving networks

When moving from static to dynamic community detection, static techniques are often used to detect communities in evolving networks. The Louvain algorithm by Blondel et al. [75] is no exception, and it is still one of the fastest ways to perform community detection on evolving networks by considering individual static snapshots. Although it is frequently employed in dynamic scenarios by running the algorithm individually on each snapshot of the network, this approach is computationally inefficient and does not allow communities to be tracked in a fine-grained way between static snapshots.

The community detection work referenced in the Fortunato [72] survey was later complemented by an incremental community detection algorithm based on modularity, proposed by Shang et al. [76]. The algorithm applies the principles of events in the life of communities (growth, contraction, merging, splitting, birth and death) as defined by Palla et al. [77] and, in each iteration, calculates the modularity gain of the affected communities. This makes it possible to detect and track communities over time in incremental networks. The algorithm only considers the addition of new edges and relies on the original two-step approach used in community detection for static networks.

The QCA [78], presented as a fast and adaptive algorithm, efficiently identifies the community structure of dynamic social networks by allowing nodes and edges to be added and removed dynamically. The algorithm starts with initial communities calculated via the Louvain method and then applies adaptive node community changes by considering each node as an autonomous agent demonstrating flocking behavior toward its preferable neighboring groups [79]. The AFOCS [80] community detection algorithm for dynamic networks shares the same principles as QCA and is modified only to allow the detection of overlapping communities. A detailed comparison between QCA and AFOCS was presented by Nguyen et al. [80].

Label propagation techniques, and specifically speaker-listener label propagation (SLPA), have been used in community detection over large networks. LabelRank [81] and GANXiSw [81, 82] used the SLPA technique to perform static network community detection, while LabelRankT [83] was designed to handle dynamic networks. Although designed for overlapping community detection, all of these algorithms also work in a non-overlapping mode, with satisfactory performance for networks with low overlapping density [84].

Cordeiro et al. [85] presented a modularity-based dynamic community detection algorithm. The algorithm is a modification of the original Louvain method in which dynamically added and removed nodes and edges only affect their related communities. In each iteration, the algorithm leaves unchanged all the communities that were not affected by modifications to the network. By reusing the community structure obtained in previous iterations, the local modularity optimization step operates on smaller networks in which only the affected communities are disbanded into their original nodes. The stability of communities is also an improvement over the original algorithm (Figure 10). Given that only parts of the network change between iterations, the non-determinism of the algorithm has a reduced effect on the community assignment. Most node-community assignments remain unchanged between snapshots, providing better community stability than its counterparts.

Figure 10.

Example of the dynamic Louvain of Cordeiro et al. [85] for the addition of a cross-community edge (1–4). The top figures show the lower-level network. The bottom figures show the corresponding upper-level network with aggregated communities.

6.2.2. Streaming networks

When a large number of edges, representing interactions, arrive continuously, in some cases at high or very high rates, and are superposed over much larger networks, streaming graph algorithms should be preferred to perform community detection. In streaming scenarios, the ability of community detection algorithms to delete edges is important. In short, as discussed in Section 3.3, it dictates whether the analysis is performed over a sliding window of edges, in which edges are deleted from the tail end of the window, or over a landmark window, when there is no possibility to delete or forget old edges.

Several methods were proposed for dynamic community discovery in graph streams. Wang et al. [86], motivated by the variability of the underlying social behavior of individuals over different graph regions, modeled the problem according to the so-termed local heterogeneity, where a local weighted-edge-based pattern (LWEP) summary is efficiently maintained and used afterward to cluster the graph stream and perform dynamic community detection in weighted graph streams. Raghavan et al. [87] investigated a simple label propagation algorithm that runs in near-linear time, uses the network structure alone as its guide and requires neither optimization of a predefined objective function nor prior information about the communities. Analyzing the problem of real-time community detection in large networks, Leung et al. [88] took as baseline the algorithm of Raghavan et al. [87], label propagation or “epidemic” community detection, which runs in linear time $O(m)$ on a network with $m$ edges, and proposed a method for near-linear time community detection in graphs. Leung et al. identified the characteristics and drawbacks of the base algorithm [87] and extended it by incorporating different heuristics to facilitate reliable and multifunctional real-time community detection.

Yun et al. [89] proposed two efficient streaming memory-limited clustering algorithms for community detection based on spectral methods. Yun and Proutière [90] proposed community detection via random and adaptive sampling. Sariyuce et al. [91] proposed SONIC, a find-and-merge type of overlapping community detection algorithm that can efficiently handle streaming updates. Recently, Hollocou et al. [92] proposed SCoDA, a linear streaming algorithm for community detection in very large networks.
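
The label propagation idea on which several of these methods build can be sketched in a few lines: every node starts with its own label and repeatedly adopts the label that is most frequent among its neighbors. The sketch below is a simplification, not the algorithm of Raghavan et al. [87] as published: it visits nodes in a fixed order and breaks ties deterministically (largest label) so that the run is reproducible, whereas random order and random tie-breaking are the usual choices. The resulting communities can depend on these choices. The graph is hypothetical.

```python
from collections import Counter


def label_propagation(adj, max_sweeps=100):
    """Asynchronous label propagation with deterministic tie-breaking."""
    labels = {v: v for v in adj}              # every node starts alone
    for _ in range(max_sweeps):
        changed = False
        for v in sorted(adj):                 # assumption: fixed visiting order
            if not adj[v]:
                continue                      # isolated nodes keep their label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = {lab for lab, c in counts.items() if c == best}
            if labels[v] not in candidates:   # keep the current label on ties
                labels[v] = max(candidates)   # assumption: largest label wins
                changed = True
        if not changed:                       # every node agrees with its neighbors
            break
    return labels


# Hypothetical graph: two 4-cliques (0-3 and 4-7) joined by the edge 3-4.
adj = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4},
    4: {3, 5, 6, 7}, 5: {4, 6, 7}, 6: {4, 5, 7}, 7: {4, 5, 6},
}

labels = label_propagation(adj)
print(labels)
```

Each sweep only inspects the neighbors of every node, so its cost is proportional to the number of edges, which is what makes the approach attractive for large and frequently updated networks.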

7. Visualization of evolving networks

The visualization of networks is known to be challenging, and the task gains additional complexity when moving from static to evolving networks. This section presents an overview of the methods and techniques currently used for the visualization of evolving networks.

7.1. Challenges of evolving networks’ visualization

The dynamics of social networks remain a challenge regarding visualization [93]. Many researchers argue that traditional graph visualization methods have issues when applied to evolving networks. Additionally, the application of conventional node-link methods to large-scale networks provides low-quality, cluttered insights. The overlap of nodes in these conditions is not appropriate when trying to extract information from the network. Zaidi et al. [94] and Aggarwal and Subbian [27] presented an overview of the different techniques and methods that exist for the analysis and visualization of dynamic networks, including a discussion of the basic definitions, formal notations and a set of the most important and recent work regarding the analysis and visualization of dynamic networks. While static graph visualizations are often divided into node-link and matrix representations, Beck et al. [95, 96] presented a hierarchical taxonomy of dynamic graph visualization techniques. This survey of the state of the art in visualizing dynamic graphs identified the representation of time as the major distinguishing feature of dynamic graph visualizations. Two major visualization categories were found: in the first, graphs are represented as animated diagrams; in the second, visualizations are a set of static charts based on a timeline. Similar conceptual categories were devised by Moody et al. [97], who divide dynamic network visualizations, also called network movies, into static flip books, where node positions remain constant but edges cumulate over time, and dynamic movies, where nodes move as a function of changes in relations. Graph animation is often used to lower the cognitive effort required to follow the transition from one visualization to the next, according to Brandes and Corman [98].
To facilitate the simultaneous analysis of state and change, a layered three-dimensional network visualization was proposed by Brandes and Corman [98], in which the evolution of the network is unrolled and each step is represented as a layer. A complex network with a large number of links may prevent users from recognizing salient structural patterns. To overcome this common visualization problem, two widely known link reduction algorithms, namely minimum spanning trees (MSTs) and pathfinder networks (PFNETs), were analyzed and compared by Chen and Morris [99]. Bender-deMoll and McFarland [100] proposed a framework for visualizing social networks and their dynamics and presented a tool that enables debate and reflection on the quality of visualizations used in empirical research. Focusing on the evolution of communities over time, Falkowski et al. [101] proposed two approaches to analyze the evolution of two different types of online communities at the level of subgroups. This analysis was conducted by observing changes in the interaction behavior of the members of the communities. Chen [102] devised a generic approach for detecting and visualizing emerging trends and transient patterns in scientific literature. Other recent work of interest is presented by Beck et al. [103, 104] and by Burch [105], who visualizes evolving graphs with multiple visual metaphors. Dynamic network visualization is also often combined with graph sampling techniques [106].

8. Conclusion and future trends

This chapter provided an overview of the methods and techniques for modeling, analyzing, measuring and visualizing evolving social networks. In the past, static techniques were adapted to dynamic networks with relative success, but nowadays, with the scale and velocity brought by the advent of social media, most of those static techniques reveal weaknesses that can only be addressed by methods and techniques designed for dealing with evolving data. After presenting two areas of direct applicability of evolving network analysis, criminological research and research on journalism, the chapter discussed how dynamic networks can be represented and modeled according to their timescale, windowing strategies and methods of analysis. These theoretical aspects were then used to present elementary network measures, link analysis methods, community detection methods and visualization techniques. Although this area of research has developed significantly in recent years and is expected to continue to do so, several problems are still unsolved and many existing solutions can be significantly improved. The areas of applicability of evolving networks and social network analysis are also becoming broader, with many of the abovementioned techniques moving from well-established areas like the World Wide Web, communication, telecommunication and mobile networks to newer areas like social network recommendations, news and blog analysis and social network event detection. Specifically in the area of social network event detection, the detection of unusual patterns, anomalies or changes in trends in social streams can lead to valuable information, which can be used in a timely manner in many real-world scenarios [107]. Cordeiro [108] addressed the monitoring and tracking of the dynamics of social network communities with the objective of unveiling real-world events, whereas Cordeiro [109] was devoted to the problem of mining the Twitter stream to unravel events, interactions and communities in real time.
Future trends in social network analysis will continue to be driven by the characteristics of network data, such as its rapidly growing size and its changes in space and time. On the one hand, there is an urgent need for scalable and efficient social network analysis methods; on the other hand, there is a need for methods that study the dynamics and evolution of social networks and can cope with the velocity and timescale dimensions of future network data. Stray [110] focused on network analysis as a tool to bridge the “research to reporting” gap in journalism, starting with two use cases (Seattle Art World [111] and Hot Wheels [112]) and then covering recent state-of-the-art network analysis and visualization applied to the Panama Papers case [113], where graph databases and entity recognition were used to build interactive network maps from structured data and raw documents. Therefore, it is expected that the study of evolving networks will continue to be a significant strand of research in the context of social network analysis in the near future.

Acknowledgments

This work was fully financed by the Faculty of Engineering of the Porto University. Rui Portocarrero Sarmento also gratefully acknowledges funding from FCT (Portuguese Foundation for Science and Technology) through a PhD grant (SFRH/BD/119108/2016). The authors also want to thank the reviewers for the constructive reviews provided in the development of this publication.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Mário Cordeiro, Rui P. Sarmento, Pavel Brazdil and João Gama (October 31st 2018). Evolving Networks and Social Network Analysis Methods and Techniques, Social Media and Journalism - Trends, Connections, Implications, Ján Višňovský and Jana Radošinská, IntechOpen, DOI: 10.5772/intechopen.79041. Available from:
