## Abstract

Evolving networks are, by definition, networks that change as a function of time. They are a natural extension of network science, since almost all real-world networks evolve by adding or removing nodes or links: elementary actor-level network measures like network centrality change as a function of time, the popularity and influence of individuals grow or fade depending on processes, and events occur in networks during time intervals. Other problems such as network-level statistics computation, link prediction, community detection, and visualization gain additional research importance when applied to dynamic online social networks (OSNs). Because of their temporal dimension, the rapid growth in users, the velocity of network changes, and the amount of data that OSNs generate, methods and techniques that were effective and efficient for small static networks must now scale and handle the temporal dimension in streaming settings. This chapter reviews the state of the art in selected aspects of evolving social networks and presents open research challenges related to OSNs. These challenges suggest that significant further research is required: existing methods, techniques, and algorithms must be rethought and redesigned as incremental and dynamic versions that allow the efficient analysis of evolving networks.

### Keywords

- evolving networks
- social network analysis

## 1. Introduction

One of the consequences of today’s information society is the rise of the digital network society [1]. Organizations and individuals are increasingly connected through a wide range of online and offline networks at different relational levels: social, professional, interaction, information flow, etc. The analysis and modeling of networks and of networked dynamical systems have been the subject of considerable interdisciplinary interest, covering areas from physics, mathematics, computer science, biology, economics and sociology, in the so-called “new” science of networks [2]. According to [3], media organizations, media content and audiences are no exception: news articles use hyperlinks to link to other content; news organizations disseminate news via online social network (OSN) platforms like Twitter and Facebook; and users comment, share and react directly below online news. At the level of news events, recent observations show that some events and news emerge and spread first through those media channels rather than through traditional media like online news sites, blogs or even television and radio breaking news [4, 5]. Natural disasters, celebrity news, product announcements and mainstream event coverage show that people increasingly use those tools to be informed, discuss and exchange information [6]. Concerning the novelty and timely dissemination of news events, empirical studies show that online social networking services like Twitter are often the first media to report critical natural events such as earthquakes, often within seconds of their occurrence [4, 5]. The temporal dimension of information in social networks is therefore of crucial importance and follows a time decay pattern: messages posted in social media are exchanged, forwarded or commented on in their early stages and decrease in importance as time passes. 
The information contained in those messages reaches its peak importance right after being posted or in the following hours or days [7]. Besides, the timing of many human activities, ranging from communication to entertainment and work patterns, is characterized by bursts of rapidly occurring events separated by long periods of inactivity, following non-Poisson statistics [8]. The time decay pattern of communication, with its bursts or peaks and its periods of inactivity, reinforces the importance of dynamic network analysis methods and techniques as the best approaches to model these problems.

### 1.1. Solving problems with evolving networks

Typical tasks of social network analysis involve the identification of the most influential, prestigious or central actors, using statistical measures; the identification of hubs and authorities, using link analysis algorithms; the discovery of communities, using community detection techniques; the visualization of the interactions between actors; or the analysis of the spreading of information. These tasks are instrumental in the process of extracting knowledge from networks and consequently in the process of problem-solving with network data. Specifically, in the areas of journalism and investigation, the following topics have been actively discussed in recent years.

**Criminological research.** In 2001, according to [9], social network analysis had the genuine potential to uncover the complexities of criminal networks. Ref. [10] introduced the intersection of terrorism studies and what was called the “networked criminology”. In 2011, [9] concluded that the applications of social network analysis in criminology were still insufficient when compared to other research areas like sociology and public health [11]. Nevertheless, since then, social network analysis has become one of the major tools for criminal analysis, with [12] identifying the three main areas of analysis and applications of social networks in criminological research: the *influence of personal networks* on crime and, more generally, on delinquent behavior [13, 11]; the analysis of *neighborhood networks* and their influence on crime [12]; and the exploration and *modeling of the organization* of crime, that is, street gangs, terrorist groups and organized crime groups [12]. Concrete examples of criminological research using social network analysis are the Enron company dataset [14]; the Panama Papers analysis in connection with the economic network analysis of Portuguese companies and individuals with connections to offshore companies [15]; and the analysis of terrorist networks [16, 17]. Gill and Malamud [18] presented a broad overview, characterization and visualization of the interaction relationships between natural hazards using social network analysis. In addition to the previous contributions, essential and recent work regarding social network analysis and terrorism was published by Malm et al. [19], while Berlusconi [20] devoted an entire book to social network analysis and crime prevention.

**Research on journalism.** Fu [21] presented an essay with the purpose of promoting social network analysis in the study of journalism. Starting with a communication network taxonomy [22, 23], the focus was on network relations in the study of journalism. The proposed framework presents the four types of communication relations that characterize different networked journalism phenomena [23]: *affinity relations* describe the socially constructed relationships between two actors, such as alliances and friendships, with the valence of the relation being either positive or negative; *flow relations* refer to the exchange and transmission of data, resources and information; *representational relations* focus on the symbolic affiliation between two entities; and *semantic networks* describe the associations, or semantic relations, among concepts, words or people’s cognitive interpretations toward some shared objects in the network, aiming to build a knowledge base. Another relevant journalism research example is Ref. [24], which examines the temporal dynamics of reciprocity in the setting of legislative co-sponsorship in the 113th US Congress (2013–2015).

### 1.2. Related work

Several overviews of social network analysis have appeared in the literature in recent years. Oliveira and Gama [25] provide a general and succinct overview of the essentials of social network analysis for static networks, with emphasis on simple statistical measures, link analysis, properties of real-world networks and community detection. Tabassum et al. [26], in an updated overview of social network analysis based on the work of Oliveira and Gama [25], included a full section devoted to evolving networks. Devoted explicitly to evolutionary network analysis, the survey by Aggarwal and Subbian [27] provides an overview of the vast literature on graph evolution analysis. Although the scope of their work was graph analysis in general, that is, it did not specifically address social network analysis, many of the applications and contexts of applicability they mention are social networks. The literature analyzed by Aggarwal and Subbian [27] covered both snapshot-based and streaming methods and algorithms, and presented critical applications of evolutionary network analysis in domains such as the world wide web, telecommunication and communication networks, recommendation networks and social network events, among others. Spiliopoulou [28] organized the advances on evolution in social networks into a common framework to model a network across the *time axis* and identified four dimensions associated with knowledge discovery in social networks: dealing with time, objective of study, definition of community, and evolution as an objective versus assumption. Spiliopoulou [28] also enumerated several challenges of social network streams, although this problem is generally conceived as a stream problem. An exciting application of temporal network theory and temporal networks to functional brain connectivity was presented by Thompson et al. [29]; their work introduces the reader to the theory and methods for adding the temporal dimension to network analysis and shows how many well-known methods can be transposed from static networks to temporal networks. Thompson et al. [29] also included a complete list of network measures adapted or proposed for temporal networks.

## 2. Representation of social networks

A social network is a social structure consisting of a finite set of social actors, such as individuals or organizations, connected by interpersonal relationships. These relationships, also known as ties, can be of a personal or professional nature and can range from casual friends, acquaintances or co-workers to close family bonds. Besides relations, social networks often represent flows of information, interactions and similarities among the set of social actors. Social network analysis is the investigation of the relationships between actors. In network terminology, vertices, also known as nodes, refer to actors or subjects. Edges, also known as links or ties, describe the relationships between actors. Usually, this social structure, or network structure, is represented by graphs, which are mathematical structures used to model pair-wise relations between objects. A graph in this context is made up of vertices, nodes or points which are connected by edges, arcs or lines. Therefore, a social network can be represented as a graph *G* = (*V*, *E*), where *V* is the set of vertices and *E* is the set of edges. The total number of vertices |*V*| is the *order* of the graph. The total number of edges |*E*| is the *size* of the graph. Two types of data structures are commonly used to represent graphs: *list structures* and *matrix structures*. List structures, such as incidence lists and adjacency lists, reduce the required storage space for sparse graphs. Matrix structures, such as incidence matrices, adjacency matrices, sociomatrices, Laplacian matrices and distance matrices, are appropriate to represent full matrices. Graphs whose edges have no orientation are called *undirected graphs* or undirected networks. On the other hand, graphs in which all edges, or arcs, have an orientation assigned are called *directed graphs* or directed networks. Formally, in a directed graph each arc is an ordered pair of vertices, the first being the *initial vertex* and the second the *terminal vertex*. Depending on the presence of values assigned to the edges or arcs, a distinction is made between *unweighted* and *weighted* graphs or networks. Unweighted graphs or networks are binary by definition, which means that they only represent the presence or absence of an edge or arc between two vertices. Unless explicitly stated otherwise, we always assume that graphs are unweighted. In weighted graphs, each edge has an associated weight.
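
The two families of data structures can be compared on a small example. The sketch below is illustrative only: the function names and the toy edge list are assumptions, not part of the chapter. It builds an adjacency list and an adjacency matrix for the same undirected, unweighted graph and reads off the number of vertices (order) and edges (size).

```python
def build_adjacency_list(edges, directed=False):
    """List structure: map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())  # make sure the target node is registered
        if not directed:
            adj[v].add(u)
    return adj


def build_adjacency_matrix(edges, nodes, directed=False):
    """Matrix structure: n x n table with 1 where an edge exists."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1
    return matrix


# Hypothetical toy network with four actors.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
nodes = sorted({node for edge in edges for node in edge})

adj_list = build_adjacency_list(edges)
adj_matrix = build_adjacency_matrix(edges, nodes)

order = len(nodes)  # number of vertices
size = len(edges)   # number of edges

# The list stores 2 * size entries; the matrix always stores order ** 2 cells.
list_entries = sum(len(neighbours) for neighbours in adj_list.values())
matrix_cells = order * order
```

For sparse graphs the gap between `list_entries` and `matrix_cells` grows quickly, which is the storage argument made above for list structures.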

## 3. From static to evolving networks

Previously, the network types and their representations in a static context were described. In real life, however, many networks are dynamic. As time passes, new nodes are added to the network, existing ones are removed, and edges come and go too. Static networks lack one of the most critical dimensions, that is, the temporal dimension of a network: by definition, static networks are assumed not to change or evolve over time. In this section, we deal with the representation of *evolving networks*. Evolving networks arise in a wide variety of application domains such as the web, social networks and communication networks. In recent years, interest in dynamic social networks has led to new research and to the need for analysis of evolving networks. The evolution analysis in graphs has applications in a number of scenarios like trend analysis in social networks and dynamic link prediction, to mention two typical examples. Figure 2 shows an example of a contact evolving network with instantaneous interactions between vertices. When the interaction between network peers has a time duration, we are in the presence of interval evolving networks, as shown in Figure 3(b). Assuming that the time

### 3.1. Models of temporal representation

Several models for representing evolving networks are available in the literature. Kim and Anderson [30] introduced the concept of a *time-ordered graph*, and Thompson et al. [29] with a similar conceptual representation chose the *time graphlet* to represent time-varying graphs. Casteigts et al. [31] presented the *time-varying graph* formalism (TVG) with the concept of a journey to catch the temporal information on graphs. Santoro et al. [32] used this formalism to describe several network measures. Although there are differences in the representation and nomenclature, these models are conceptually equal. Figure 4 presents the concept of a time-ordered graph for an example network for the time interval

### 3.2. Timescale of evolving networks

Not all networks evolve at the same pace. Some networks evolve faster than others or have edges that are added at different rates. Two examples with distinct timescales are email networks, where edges are added at the timescale of seconds, and bibliographic networks, where edges are added at the scale of weeks or months. Different time-evolving scenarios require different types of analysis [27].

**Slowly evolving networks**: When networks evolve slowly over time, *snapshot* analysis can be used very effectively. In this case, dynamic networks are discretized in time by converting temporal information into a sequence of snapshots.

**Streaming networks**: When networks are built by a never-ending flow of transient interactions, such as email or telecommunications networks, they should be modeled and represented as graph streams. Graph streams typically require real-time analytical methods. The scenario of graph streams is more challenging because of the computational requirements and the inability to hold complete graphs in memory or on disk. Velocity is also an issue because common methods must deal with graph updates at very high edge rates. Time point
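
The contrast between the two scenarios can be sketched in a few lines. This is a minimal illustration under stated assumptions: edges are `(source, target, timestamp)` tuples, the snapshot width is a free parameter chosen by the analyst, and both function names are hypothetical. The first function discretizes the edges into snapshots; the second processes them one at a time and keeps only a small running summary instead of the whole graph.

```python
from collections import defaultdict


def to_snapshots(timed_edges, width):
    """Snapshot model: bucket (u, v, t) edges into windows of `width` time units."""
    snapshots = defaultdict(set)
    for u, v, t in timed_edges:
        snapshots[int(t // width)].add((u, v))
    return dict(snapshots)


def stream_interaction_counts(edge_stream):
    """Streaming model: update per-node interaction counts edge by edge.

    Only the counters are kept in memory; the edges themselves are discarded.
    Repeated contacts between the same pair are counted each time.
    """
    counts = defaultdict(int)
    for u, v, _t in edge_stream:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)


# Hypothetical timestamped interactions.
timed_edges = [
    ("a", "b", 0),
    ("b", "c", 3),
    ("a", "c", 12),
    ("c", "d", 14),
    ("a", "b", 25),
]

snapshots = to_snapshots(timed_edges, width=10)
counts = stream_interaction_counts(iter(timed_edges))
```

With a width of 10 time units the five interactions fall into three snapshots, each of which can then be analyzed with static methods.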

### 3.3. Landmark versus sliding windows

When the temporal dimension is added to the analysis of networks, different strategies are available for selecting the data to be analyzed. Figure 5 shows three types of graph data windowing strategies.

**Landmark windows** by Gehrke et al. [34] encompass all the data from a specific point in time up to the current moment. In the landmark window, the model is initialized at a fixed time point, the so-called landmark, which marks the beginning of the window. In successive snapshots, the data window grows to include all the data seen so far after the landmark.

**Sliding windows** are better suited when we are not interested in computing statistics over all events of the past but only over the recent past [35]. Datar et al. [36] incorporate a forgetting mechanism by keeping only the latest information inside the window and disregarding all the data falling outside the window. Usually, sliding windows are of fixed size. A time-based length sets the window length as a fixed time span. Sliding windows can be overlapping or non-overlapping, depending on whether two consecutive windows share some data or not. From the several window models presented in the literature [37, 38], two basic types of sliding windows are commonly defined: sequence-based models, where the size of the window is determined by the number of observations, and timestamp-based models, where the size of the window is defined by its duration. A timestamp window of size
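
The windowing strategies described above can be sketched as simple filters over a time-ordered list of edges. The sketch assumes `(source, target, timestamp)` tuples sorted by timestamp; the function names and the sample data are illustrative only.

```python
from collections import deque


def landmark_window(events, landmark, now):
    """Everything from the fixed landmark up to the current time."""
    return [e for e in events if landmark <= e[2] <= now]


def sliding_time_window(events, now, span):
    """Timestamp-based sliding window: only events in the last `span` time units."""
    return [e for e in events if now - span < e[2] <= now]


def sliding_count_window(events, size):
    """Sequence-based sliding window: only the last `size` observations."""
    window = deque(maxlen=size)  # older events fall out automatically
    for e in events:
        window.append(e)
    return list(window)


# Hypothetical time-ordered interactions.
events = [
    ("a", "b", 1),
    ("b", "c", 4),
    ("a", "c", 7),
    ("c", "d", 9),
    ("a", "d", 12),
]

landmark = landmark_window(events, landmark=4, now=12)
recent_by_time = sliding_time_window(events, now=12, span=5)
recent_by_count = sliding_count_window(events, size=3)
```

The landmark window keeps growing as `now` advances, whereas both sliding variants forget old edges, which is why methods that run over sliding windows need to support edge deletion.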

### 3.4. Types of evolving network analysis

Depending on the timescale of the network and the chosen strategy to cope with network data, distinct evolving network analysis methods are available. These methods are divided into one of the following categories [27].

**Maintenance methods**: In these methods it is desirable to *maintain* the results of the data mining process continuously over time. Examples of maintenance methods are classification and clustering.

**Analytical evolution methods**: In this case it is desirable to directly *quantify* and *understand* the changes that have occurred in the underlying network. Such models are focused on modeling change.

**Bridge methods**: From a methodological point of view, and in the context of a few key problems, an overlapping of maintenance and analytical evolution methods occurs. These *bridge* methods, such as community detection, fall into both categories.

## 4. Elementary network measures

In this section, elementary network measures and popular metrics used in the analysis of social networks are presented.

### 4.1. Actor-level statistical measures

Actor-level or node-level statistical measures determine the importance of an actor or node within the network. These measures reveal the individuals in which the most important relationships are concentrated and give an idea about their social power within their peers.

#### 4.1.1. Degree or valency

The degree or valency of a node

For dynamic networks, and generalizing to a directed and unweighted network, the temporal degree splits into the *in degree*, which counts the edges arriving at a node, and the *out degree*, which counts the edges leaving it. In weighted networks, *strength* is the equivalent of degree but is computed as the sum of the weights of the edges adjacent to a given node (5).
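
As a rough illustration of these quantities, the sketch below accumulates in degree, out degree and strength over a list of timestamped, weighted, directed edges. It assumes that the temporal degree is obtained by summing contacts over all time steps, which is one common convention and not necessarily the exact definition used in the chapter's equations; the tuple layout and the function name are hypothetical.

```python
from collections import defaultdict


def temporal_degrees(timed_edges):
    """Accumulate degree measures over (source, target, time, weight) edges."""
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    strength = defaultdict(float)
    for source, target, _time, weight in timed_edges:
        out_degree[source] += 1      # edge leaves the source
        in_degree[target] += 1       # edge arrives at the target
        strength[source] += weight   # strength sums adjacent edge weights
        strength[target] += weight
    return dict(in_degree), dict(out_degree), dict(strength)


# Hypothetical contacts over two time steps.
timed_edges = [
    ("a", "b", 0, 2.0),
    ("a", "c", 0, 1.0),
    ("b", "a", 1, 0.5),
    ("a", "b", 1, 2.0),
]

in_degree, out_degree, strength = temporal_degrees(timed_edges)
```

Restricting `timed_edges` to a single time step before calling the function gives the per-snapshot values instead of the totals.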

#### 4.1.2. Betweenness

Node betweenness

#### 4.1.3. Closeness

Closeness measures the overall position of an actor in the network, giving an idea of how long it will take, on average, to reach other nodes from a given starting node. It is represented by the average length of the shortest path between the node and all other nodes in the graph. Thus, the more central a node is, the closer it is to all other nodes. In general, it is only computed for nodes within the largest component of the network, as shown in (8). The temporal closeness is defined by considering

#### 4.1.4. Eigenvector centrality

For a given graph

#### 4.1.5. Laplacian centrality

The Laplacian centrality permits to consider *intermediate* environmental information around a vertex or node to compute its centrality measure. The centrality of some vertex

##### 4.1.5.1. Locality of the Laplacian centrality

The Laplacian centrality metric is not a global measure, that is, it is a function of the local degree plus the degrees of the neighbors (with different weights for each). Qi et al. [17, 51] show that local degree and the 1-order neighbors’ degree are all that are needed to calculate the metric for unweighted networks (Figure 6).
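
The locality property can be checked numerically. The sketch below assumes the usual definition for unweighted graphs, in which the Laplacian energy is the sum of squared degrees plus twice the number of edges and the (unnormalized) Laplacian centrality of a node is the drop in energy when that node is removed. Under that assumption the drop reduces to a closed form that uses only the node's degree and its neighbors' degrees. Function names and the toy graph are illustrative.

```python
def laplacian_energy(adj):
    """Sum of squared degrees plus twice the number of edges (unweighted graph)."""
    squared_degrees = sum(len(neighbours) ** 2 for neighbours in adj.values())
    twice_edges = sum(len(neighbours) for neighbours in adj.values())
    return squared_degrees + twice_edges


def remove_node(adj, node):
    """Return a copy of the graph without `node` and its incident edges."""
    return {u: neighbours - {node} for u, neighbours in adj.items() if u != node}


def laplacian_centrality_global(adj, node):
    """Definition: energy of the graph minus energy without the node."""
    return laplacian_energy(adj) - laplacian_energy(remove_node(adj, node))


def laplacian_centrality_local(adj, node):
    """Closed form using only the node's degree and its neighbours' degrees."""
    degree = len(adj[node])
    neighbour_degrees = sum(len(adj[u]) for u in adj[node])
    return degree * degree + degree + 2 * neighbour_degrees


# Hypothetical undirected graph: a triangle a-b-c with a pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

local_scores = {node: laplacian_centrality_local(adj, node) for node in adj}
global_scores = {node: laplacian_centrality_global(adj, node) for node in adj}
```

Both dictionaries agree on every node, so the global recomputation is unnecessary for unweighted graphs.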

##### 4.1.5.2. Dynamic Laplace centrality

Although the original Laplace centrality algorithm proposed by Qi et al. [51] is a static algorithm, it can be used to calculate centralities in changing networks by fully recalculating the centralities for each network snapshot. Sarmento et al. [52] adapted the principles of Qi et al. [51] into two incremental algorithms. The incremental Laplace algorithm by Sarmento et al. [52] is more computationally efficient because it performs Laplace centrality calculations only for the nodes affected by the addition and removal of edges in each snapshot. It thus reuses information from the previous snapshot to perform the Laplace centrality calculations on the current snapshot, for unweighted networks only.
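
The idea of recomputing only the affected nodes can be illustrated with a simplified sketch. This is not the algorithm of Sarmento et al. [52]; it only assumes the local closed form for unweighted graphs (degree squared, plus degree, plus twice the sum of the neighbors' degrees) and handles edge additions. When an edge is added, the only scores that can change are those of its two endpoints and of their neighbors.

```python
def local_centrality(adj, node):
    """Unnormalized Laplacian centrality from local degree information."""
    degree = len(adj[node])
    return degree * degree + degree + 2 * sum(len(adj[u]) for u in adj[node])


def add_edge_incremental(adj, scores, u, v):
    """Insert edge (u, v) and refresh only the scores that can have changed."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
    # Degrees of u and v changed, so their own scores and those of all their
    # neighbours need recomputation; every other node is untouched.
    affected = {u, v} | adj[u] | adj[v]
    for node in affected:
        scores[node] = local_centrality(adj, node)
    return affected


# Hypothetical previous snapshot: triangle a-b-c with pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
scores = {node: local_centrality(adj, node) for node in adj}

# New snapshot: a new node e attaches to d.
affected = add_edge_incremental(adj, scores, "d", "e")

# Reference result from a full recomputation on the updated graph.
full_recompute = {node: local_centrality(adj, node) for node in adj}
```

In this example only three of the five nodes are recomputed, and the incrementally maintained scores match a full recomputation.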

### 4.2. Network-level statistical measures

Before describing network-level statistical measures, it is important to describe three fundamental concepts that are common to static and dynamic networks. A **path** is a sequence of non-repeating nodes in which each consecutive pair is linked by an edge. When adding the temporal dimension of dynamic networks, the concept of path changes slightly, because nodes are required to be non-repeating only within the same snapshot or time step, in what is called a temporal path. Temporal paths can therefore have repeating nodes in different time steps. The **geodesic distance** between two nodes is the length of the shortest path between them. The **eccentricity** of a given vertex is the greatest geodesic distance between that vertex and any other vertex in the graph.

#### 4.2.1. Edge bursts

A hallmark of a bursty edge is the presence of multiple edges with short interconnect times, followed by longer and varying interconnect times [29]. One of the methods available to quantify bursts is the burstiness coefficient

#### 4.2.2. Fluctuability

As discussed before, centrality measures provide information about the degree of temporal connectivity while bursts describe the distribution of the temporal patterns of connectivity at the node level. Also, *fluctuability* can be used to retrieve information about the global state of a temporal network, in this case, the quantification of the temporal variability of connectivity [29]. The fluctuability

#### 4.2.3. Volatility

The *volatility*

#### 4.2.4. Reachability latency

Reachability measures, like *reachability ratio* and *reachability time*, focus on estimating the time taken to reach the nodes in a temporal network [55]. While the reachability ratio is the percentage of edges that have a temporal path connecting them, the reachability time is defined as the average length of all temporal paths. When applying these reachability measures to most real-world networks, if we consider a sufficient time interval, any vertex or node of the networks can reach all the others within that time span. With this assumption in mind, Thompson et al. [29] defined the *reachability latency*, which quantifies the average time it takes for a temporal network to reach an a-priori-defined reachability ratio as defined in (19), where *temporal diameter* of the network [56].

#### 4.2.5. Temporal efficiency

For static networks, efficiency is computed as the inverse of the average shortest path for all nodes [29]. *Temporal efficiency*, at first, is calculated at each time point as the inverse of the average shortest path length of all nodes; subsequently, these values are averaged across time points to obtain an estimate of global temporal efficiency as shown in (21).

#### 4.2.6. Diameter and radius

The diameter

#### 4.2.7. Average geodesic distance

The average geodesic distance *characteristic temporal path* length as the natural extension of the average geodesic distance to time-varying graphs. It is defined as the average temporal distance over all pairs of nodes in the graph as shown in (23), with the temporal distance

#### 4.2.8. Average degree

The average degree is the mean of the degrees of all vertices in a network for all time steps

#### 4.2.9. Reciprocity

For static networks, reciprocity

#### 4.2.10. Density

Density

#### 4.2.11. Global clustering coefficient

Cui et al. [60] propose two definitions of the temporal clustering coefficient of a temporal network. The definitions are temporal-delayed clustering coefficient and the temporal-weighted clustering coefficient.

## 5. Link analysis

In network theory, link analysis is a data analysis technique used to evaluate relationships (connections) between nodes. Link analysis has been used for fraud detection, the investigation of terrorist networks, computer security analysis, search engine optimization (SEO), market research and medical research, among others. Link analysis algorithms were devised to find the most valuable, authoritative or influential node, or list of nodes, in a network. By exploring the relationship between links and the content of web pages, the PageRank algorithm [45] became one of the seminal methods used to build modern and efficient search engines and information retrieval systems on the web.
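
As a point of reference for the static case, a textbook power-iteration version of PageRank can be sketched as follows. The damping factor of 0.85, the fixed number of iterations, the uniform handling of dangling nodes and the function name are all assumptions chosen for illustration; they are not taken from the chapter.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping node -> list of linked nodes."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links.get(node, [])
            if targets:
                # A page passes its rank evenly to the pages it points to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web graph.
out_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(out_links)
```

Page `c` ends up with the highest rank because both other pages point to it, and `a` follows because it receives all of the rank passed on by `c`.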

### 5.1. Incremental PageRank algorithm

There have been several attempts to improve the original PageRank algorithm [45]. The purpose of several of these improvements was to adapt the algorithm to streaming data. In the original algorithm, the rank of each page depends on the ranks of the pages pointing to it. The PageRank value of a page

where

Desikan et al. [49] provided a solution for the update of nodes’ PageRank values in evolving graphs in an incremental fashion. The algorithm explores the fact that the web evolves incrementally and with small changes between updates. Figure 8(b) shows the two partitions created; one of them, partition P, is unchanged since the last computation, and it has only outgoing edges to the other partition. The other partition, partition Q, is the rest of the graph, which has changed since the last time the metric was computed. The principal idea is to find a partition P in a way that there are no incoming links in the graph from the other partition Q (includes all changed nodes). Then the computation of the PageRank of partition Q can be done separately, scaled and merged with the rest of the graph to get the updated PageRank values of the vertices in this partition. The PageRank of partition Q is computed, taking the border vertices that belong to partition P and have edges pointing to the vertices in partition Q. The PageRank values of partition P are obtained by simple scaling, due to the addition of new nodes. Let the graph of Figure 8(b) at the new time be

### 5.2. Link prediction

To understand the association between two specific nodes, researchers commonly study the dynamics of evolving graphs. In link prediction, the problem we wish to solve is the prediction of the likelihood of a future association between two nodes, knowing that there is no association yet between the nodes, that is, no edge between the nodes. Link prediction is used in bioinformatics, where potential protein connections are inferred from known connections, and during the research of terrorist/criminal networks, where potential criminal connections are inferred from current knowledge of the relationships between criminals. Link prediction is a complex problem. For a social network *m* ≋ *n*. Thus, in limit situations, with a high amount of nodes, we have a

#### 5.2.1. Common neighbors

Newman has verified a significant correlation between the number of common neighbors of

#### 5.2.2. Jaccard coefficient

The Jaccard coefficient is a common metric in measuring the similarity between different samples. It is used throughout validation tasks in information retrieval research. It measures the probability that both

where

#### 5.2.3. Adamic/Adar

This measure—also called frequency-weighted common neighbors—refines the simple counting of common features in

#### 5.2.4. Preferential attachment

Another concept, this time with lower complexity, is preferential attachment [68]. This metric needs only node degree information. The intuition is that nodes with a higher degree are more likely to connect to each other than to a node with a lower degree.
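
The four neighborhood-based scores discussed so far can be written compactly using their standard textbook definitions. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the function names and the toy graph are illustrative.

```python
import math


def common_neighbors(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """Shared neighbours divided by all neighbours of either node."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


def adamic_adar(adj, x, y):
    """Shared neighbours, each weighted by the inverse log of its degree."""
    return sum(
        1.0 / math.log(len(adj[z]))
        for z in adj[x] & adj[y]
        if len(adj[z]) > 1  # guard against log(1) = 0
    )


def preferential_attachment(adj, x, y):
    """Product of the two node degrees."""
    return len(adj[x]) * len(adj[y])


# Hypothetical friendship network; a and d are not yet connected.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

cn = common_neighbors(adj, "a", "d")
jc = jaccard(adj, "a", "d")
aa = adamic_adar(adj, "a", "d")
pa = preferential_attachment(adj, "a", "d")
```

In practice each score is computed for every unconnected pair and the pairs are ranked, with the top of the ranking taken as the predicted new links.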

#### 5.2.5. Katz

The Katz measure [46] is based on the assumption that the more paths connect two nodes, and the shorter those paths are, the more likely the nodes are to connect in the future. The measure is also called “exponentially damped path counts”.

where

#### 5.2.6. Recent developments

Ibrahim and Chen [69] present a method for link prediction in dynamic networks that integrates temporal information, community structure and node centrality in the network, giving greater weights to frequently occurring links. Wahid-Ul-Ashraf et al. [70] described the parallelism between Newton’s law of universal gravitation and link prediction tasks. To apply this law, the authors attributed the notions of mass and distance to nodes. Node centrality could be considered as mass, and the authors also tested this concept with degree centrality. The distance between nodes was considered obtainable through several possible methods, that is, by retrieving the shortest path, path count or inverse similarity, or by using previously stated measures like Adamic/Adar, the Katz score or others. Choudhury and Uddin [71] considered the evolutionary aspects of community network structure. They built dynamic similarity metrics, or dynamic features, to measure similarity or proximity between actor pairs.

## 6. Community detection

As a consequence of both global and local heterogeneity in the edge distribution of a graph, some regions of the graph show a high concentration of edges, whereas the areas between those regions have low concentrations of edges. In the context of networks, these groups of nodes that are more densely connected internally than with the rest of the network are called *communities* or *community structures*. Also known as *modules* or *clusters*, communities can therefore be loosely defined as groups of similar nodes. A more complete definition using the concept of density is the following: communities are densely connected groups of vertices in the network, with sparser connections between them.

### 6.1. Finding communities in static networks

Fortunato [72] provides a comprehensive survey of methods and techniques for finding communities. Hierarchical clustering methods can be of two types: agglomerative algorithms, in which clusters are iteratively merged if their similarity is sufficiently high, and divisive algorithms, in which clusters are iteratively split by removing edges connecting vertices with low similarity.

**Divisive algorithms**: One of the best-known divisive algorithms is the one proposed by Girvan and Newman [73]. The philosophy of divisive algorithms is that a simple way to identify communities in a graph is to detect the edges that connect vertices of different communities and remove them, so that the clusters become disconnected from each other.

**Agglomerative algorithms**: Examples of agglomerative algorithms are those that assume that high values of modularity indicate good partitions, so that the best partition corresponds to the maximum value of modularity on a graph. Therefore a modularity measure
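
To make the notion of modularity concrete, the sketch below evaluates a partition using the widely used Newman-Girvan form, in which each community contributes its fraction of internal edges minus the squared fraction of edge endpoints it holds. This standard formula is assumed here and may differ in notation from the chapter's own equation; the function name and the example graph are illustrative.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    `edges` is a list of (u, v) pairs and `community` maps node -> label.
    """
    m = len(edges)
    internal_edges = defaultdict(int)  # edges with both ends in the community
    degree_sum = defaultdict(int)      # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal_edges[community[u]] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical graph: two triangles joined by a single bridge edge c-d.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the graph at the bridge scores higher than keeping everything in one community, which is the behavior that modularity-maximizing methods exploit.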

### 6.2. Finding communities in dynamic networks

When discussing methods for finding communities in dynamic networks, the division into methods for slowly evolving networks and methods for streaming networks is widely accepted [27]. In the following sections, algorithms for both scenarios are presented and analyzed.

#### 6.2.1. Slowly evolving networks

When moving from static to dynamic community detection, static techniques are often used to detect communities in evolving networks. The Louvain algorithm by Blondel et al. [75] is no exception, and it is still one of the fastest ways to perform community detection on evolving networks by considering individual static snapshots. Although frequently employed in dynamic scenarios by running the algorithm separately on each snapshot of the network, this approach is computationally inefficient and does not allow fine-grained tracking of communities between static snapshots. The community detection work referenced in the Fortunato [72] survey was later complemented by an incremental, modularity-based community detection algorithm proposed by Shang et al. [76]. The algorithm applies the principles of events in the life of communities (growth, contraction, merging, splitting, birth and death) as defined by Palla et al. [77] and, in each iteration, calculates the modularity gain of the affected communities. This makes it possible to detect and track communities over time in incremental networks. The algorithm only considers the addition of new edges and relies on the original two-step approach used in community detection for static communities. The QCA [78], presented as a fast and adaptive algorithm, efficiently identifies the community structure of dynamic social networks by allowing nodes and edges to be added and removed dynamically. The algorithm starts with the initial communities calculated via the Louvain method and then applies adaptive node community changes by treating each node as an autonomous agent that demonstrates flocking behavior toward its preferred neighboring groups [79]. 
The AFOCS [80] community detection algorithm for dynamic networks shares the same principles of QCA being only modified in order to allow the possibility of detection of overlapping communities. A detailed comparison between QCA and AFOCS was presented by Nguyen et al. [80]. Label propagation techniques and specifically speaker-listener label propagation (SLPA) were used in community detection over large networks. LabelRank [81] and GANXiSw [81, 82] used the SLPA technique to perform static network community detection while LabelRankT [83] was designed to handle dynamic networks. Being designed for overlapping community detection, all of the previous algorithms also work in a non-overlapping mode, with satisfactory performance for low overlapping density networks [84].
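
The modularity-based reasoning above can be illustrated with a minimal sketch. It is not the algorithm of Shang et al. [76]; it only shows, for an undirected and unweighted graph, how Newman modularity can be computed for a partition and how the modularity change of merging two communities can be evaluated after a new edge arrives. The function names `modularity` and `merge_gain` and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition (undirected, unweighted graph).

    edges: list of (u, v) pairs; community: dict mapping node -> community id.
    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where L_c is the number of internal edges and d_c the degree sum of c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def merge_gain(edges, community, a, b):
    """Change in modularity if community b were merged into community a."""
    merged = {n: (a if c == b else c) for n, c in community.items()}
    return modularity(edges, merged) - modularity(edges, community)


# Hypothetical toy data: two triangles, each forming one community.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q_before = modularity(edges, community)

# A new edge arrives between the two communities; should they merge?
edges_after = edges + [(2, 3)]
gain = merge_gain(edges_after, community, "A", "B")
```

In this toy case the gain is negative, so the two triangles would remain separate communities after the new edge. An incremental method would evaluate such gains only for the communities touched by the change rather than recomputing the whole partition.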

Cordeiro et al. [85] presented a modularity-based dynamic community detection algorithm. The algorithm is a modification of the original Louvain method in which dynamically added and removed nodes and edges only affect their related communities. In each iteration, the algorithm leaves unchanged all the communities that were not affected by modifications to the network. By reusing the community structure obtained in previous iterations, the local modularity optimization step operates on smaller networks where only affected communities are disbanded to their origin. The stability of communities is also improved with respect to the original algorithm (Figure 10). Given that only parts of the network change between iterations, the non-determinism of the algorithm has a reduced effect on the community assignment. Most node-community assignments remain unchanged between snapshots, providing better community stability than its counterparts.
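
The idea of touching only the affected communities can be sketched as follows. This is one possible reading of the description above, not the implementation of Cordeiro et al. [85]: it assumes that a community is "affected" when it contains an endpoint of an added or removed edge, and that disbanding means returning its nodes to singleton communities before local optimization is rerun. The function name `disband_affected` and the `("single", node)` identifiers are hypothetical.

```python
def disband_affected(community, changed_edges):
    """Split communities touched by edge changes into singletons.

    community: dict node -> community id from the previous iteration.
    changed_edges: iterable of (u, v) pairs added or removed since then.
    Returns (updated assignment, set of affected community ids).
    """
    changed_edges = list(changed_edges)

    # Assumption: a community is affected if it holds an endpoint of a change.
    affected = set()
    for u, v in changed_edges:
        for node in (u, v):
            if node in community:
                affected.add(community[node])

    # Nodes of affected communities restart as singletons; the rest are reused.
    updated = {}
    for node, comm in community.items():
        updated[node] = ("single", node) if comm in affected else comm

    # Nodes seen for the first time also start in their own community.
    for u, v in changed_edges:
        for node in (u, v):
            updated.setdefault(node, ("single", node))

    return updated, affected


# Hypothetical previous result with three communities.
previous = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C"}
# A new node 8 attaches to node 2, so only community "A" is affected.
updated, affected = disband_affected(previous, [(2, 8)])
```

After this step, a local modularity optimization pass would only need to move the singleton nodes, while communities "B" and "C" keep their previous assignment. This reuse is what limits both the cost of each iteration and the effect of non-determinism on unchanged parts of the network.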

#### 6.2.2. Streaming networks

When a large number of edges, representing interactions, arrive continuously, in some cases at high or very high rates, and are superposed over much larger networks, streaming graph algorithms should be preferred to perform community detection. In streaming scenarios, the ability of a community detection algorithm to delete edges is important. In short, as discussed in Section 3.3, this dictates whether the analysis is performed over a sliding window of edges, in which case edges are deleted from the tail end of the window, or over a landmark window, in which case there is no possibility to delete or forget old edges. Several methods have been proposed for dynamic community discovery in graph streams. Wang et al. [86], motivated by the variability of the underlying social behavior of individuals over different graph regions, modeled the problem according to the so-termed *local heterogeneity*, where a local weighted-edge-based pattern (LWEP) summary is efficiently maintained and used afterward to cluster the graph stream and perform dynamic community detection in weighted graph streams. Raghavan et al. [87] investigated a simple label propagation algorithm that runs in near-linear time, uses the network structure alone as its guide and requires neither optimization of a predefined objective function nor prior information about the communities. By analyzing the problem of real-time community detection in large networks and having by baseline the algorithm proposed by Raghavan et al. [87] with linear time
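
The label propagation idea attributed to Raghavan et al. [87] above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than a faithful reproduction of the published algorithm: the graph is static, undirected and unweighted, nodes are updated asynchronously in random order, and a node keeps its label when that label is already among the most frequent ones in its neighborhood. The function name `label_propagation` and its parameters `max_sweeps` and `seed` are hypothetical.

```python
import random
from collections import Counter


def label_propagation(adjacency, max_sweeps=100, seed=0):
    """Asynchronous label propagation on an undirected graph.

    adjacency: dict node -> set of neighbouring nodes.
    Returns a dict node -> label; nodes sharing a label form a community.
    """
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # every node starts alone
    nodes = list(adjacency)

    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # random update order in each sweep
        changed = False
        for node in nodes:
            if not adjacency[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[nb] for nb in adjacency[node])
            top = max(counts.values())
            best = [lab for lab, c in counts.items() if c == top]
            # Keep the current label on ties that include it; otherwise
            # adopt one of the most frequent neighbouring labels at random.
            if labels[node] not in best:
                labels[node] = rng.choice(best)
                changed = True
        if not changed:
            break  # every node already holds a majority label
    return labels


def clique(nodes):
    """Adjacency of a complete graph on the given nodes (helper for the demo)."""
    nodes = list(nodes)
    return {n: {other for other in nodes if other != n} for n in nodes}


# Hypothetical toy graph: two disconnected 4-cliques.
graph = {**clique(range(4)), **clique(range(4, 8))}
labels = label_propagation(graph)
```

On this toy graph each clique ends up with a single label of its own. No objective function is optimized and no prior information about communities is supplied; only the neighborhood structure drives the result, which is why the procedure is attractive as a baseline for large or fast-changing networks.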

## 7. Visualization of evolving networks

The visualization of networks is known to be challenging, and the task gains additional complexity when moving from static to evolving networks. This section presents an overview of the methods and techniques currently used for the visualization of evolving networks.

### 7.1. Challenges of evolving networks’ visualization

The dynamics of social networks remain a challenge regarding visualization [93]. Many researchers argue that traditional graph visualization methods have issues when applied to evolving networks. Additionally, the application of conventional node-link methods to large-scale networks yields cluttered, low-quality insights, and the overlap of nodes under these conditions hinders the extraction of information from the network. Zaidi et al. [94] and Aggarwal and Subbian [27] presented overviews of the different techniques and methods that exist for the analysis and visualization of dynamic networks. These overviews include a discussion of the basic definitions, formal notations and a set of the most important and recent work regarding the analysis and visualization of dynamic networks. While static graph visualizations are often divided into node-link and matrix representations, Beck et al. [95, 96] presented a hierarchical taxonomy of dynamic graph visualization techniques. This survey of the state of the art in visualizing dynamic graphs identified the representation of time as the major distinguishing feature of dynamic graph visualizations. Two major visualization categories were found: in the first, graphs are represented as animated diagrams; in the second, visualizations are a set of static charts based on a timeline. Similar conceptual categories were devised by Moody et al. [97], who divide dynamic network visualizations, also called network movies, into static flip books, where the node positions remain constant but edges accumulate over time, and dynamic movies, where nodes move as a function of changes in relations. According to Brandes and Corman [98], graph animation is often used to lower the cognitive effort required to follow the transition from one visualization to the next.

To facilitate the simultaneous analysis of state and change, Brandes and Corman [98] proposed a layered three-dimensional network visualization in which the evolution of the network is unrolled and each step is represented as a layer. A complex network with a large number of links may prevent users from recognizing salient structural patterns. To overcome this common visualization problem, two widely known link reduction algorithms, namely minimum spanning trees (MSTs) and pathfinder networks (PFNETs), were analyzed and compared by Chen and Morris [99]. Bender-deMoll and McFarland [100] proposed a framework for visualizing social networks and their dynamics and presented a tool that enables debate and reflection on the quality of visualizations used in empirical research. Focusing on the evolution of communities over time, Falkowski et al. [101] proposed two approaches to analyze the evolution of two different types of online communities at the level of subgroups. This analysis was conducted by observing changes in the interaction behavior of the members of the communities. Chen [102] devised a generic approach for detecting and visualizing emerging trends and transient patterns in scientific literature. Other recent work of interest includes that of Beck et al. [103, 104] and the visualization of evolving graphs with multiple visual metaphors by Burch [105]. Dynamic network visualization is also often combined with graph sampling techniques [106].
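
Link reduction by a minimum spanning tree, one of the two approaches mentioned above, can be sketched with Kruskal's algorithm. The sketch is illustrative and is not taken from Chen and Morris [99]; it assumes that edge weights are distances, so that lower values denote stronger links, and it does not cover pathfinder networks. The function name `minimum_spanning_tree` and the toy data are hypothetical.

```python
def minimum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm: keep the lightest edges that do not close a cycle.

    nodes: iterable of node ids.
    weighted_edges: list of (weight, u, v) tuples for an undirected graph,
    where the weight is assumed to be a distance (lower = stronger link).
    Returns the kept edges in increasing order of weight.
    """
    parent = {node: node for node in nodes}  # union-find forest

    def find(node):
        # Follow parent pointers to the root, halving paths on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    kept = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # edge joins two separate components
            parent[root_u] = root_v
            kept.append((weight, u, v))
    return kept


# Hypothetical dense toy network with five weighted links.
nodes = ["a", "b", "c", "d"]
links = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c"),
         (4, "c", "d"), (5, "b", "d")]
backbone = minimum_spanning_tree(nodes, links)
```

The reduced network keeps three of the five links, (a, b), (b, c) and (c, d), while every node stays reachable. For a connected network with n nodes the result always has n - 1 links, which removes much of the clutter at the price of hiding redundant paths.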

## 8. Conclusion and future trends

This chapter provided an overview of the methods and techniques for modeling, analyzing, measuring and visualizing evolving social networks. In the past, static techniques were adapted to dynamic networks with relative success, but nowadays, with the scale and velocity brought by the advent of social media, most of those static techniques reveal weaknesses that can only be addressed by methods and techniques designed for dealing with evolving data. After presenting two areas of direct applicability of evolving network analysis, namely criminological research and research on journalism, the chapter discussed how dynamic networks can be represented and modeled according to their timescale, windowing strategies and methods of analysis. These theoretical aspects were then used to present elementary network measures, link analysis methods, community detection methods and visualization techniques. It is clear that this area of research has developed significantly in recent years and will continue to do so: several problems are still unsolved, and many existing solutions can be significantly improved. The areas of applicability of evolving networks and social network analysis are also becoming broader, with many of the abovementioned techniques moving from well-established areas like the World Wide Web, communication, telecommunication and mobile networks to newer areas like social network recommendations, news and blog analysis and social network event detection. Specifically in the area of social network event detection, the detection of unusual patterns, anomalies or changes in trends in social streams can lead to valuable information, which can be used in a timely manner in many real-world scenarios [107]. Cordeiro [108] addressed the monitoring and tracking of the dynamics of social network communities with the objective of unveiling real-world events, whereas Cordeiro [109] was devoted to the problem of mining the Twitter stream to unravel events, interactions and communities in real time.

Future trends in social network analysis will continue to be driven by the trends and characteristics of network data, such as its size, which is growing rapidly, and its changes in space and time. On one side, there is the urge for scalable and efficient social network analysis methods, and on the other side, there is the need for methods to study the dynamics and evolution of social networks that are able to deal with the future velocity and timescale dimensions of network data. Stray [110] focused on network analysis as a tool to bridge the “research to reporting” gap in journalism, starting with two use cases (Seattle Art World [111] and Hot Wheels [112]) and the recent state-of-the-art network analysis and visualizations applied to the Panama Papers case [113], where graph databases and entity recognition were used to build interactive network maps from structured data and raw documents. Therefore, it is expected that the study of evolving networks will continue to be a significant strand of research in the context of social network analysis in the near future.

## Acknowledgments

This work was fully financed by the Faculty of Engineering of the Porto University. Rui Portocarrero Sarmento also gratefully acknowledges funding from FCT (Portuguese Foundation for Science and Technology) through a PhD grant (SFRH/BD/119108/2016). The authors also want to thank the reviewers for the constructive reviews provided in the development of this publication.