## 1. Introduction

Ecologists have long been interested in the spatial and temporal dimensions of ecological processes in plant populations. Although data collected in most ecological studies have spatial and temporal aspects, the importance of spatio-temporal analysis has been recognized only recently. As stated in Reference [1], until the 1980s most ecological studies avoided the explicit consideration of space, and most field experiments were designed to remove spatial signals. Techniques such as randomization and block designs were especially common.

During the 1980s, there was a fundamental shift in ecology toward the spatially explicit consideration of relationships between organisms. Three factors favored the use of spatial analysis in ecological studies: the need to include the spatial structure of natural systems in ecological theories; changes in landscapes that alter ecosystems, together with the need to evaluate their spatial heterogeneity; and, most influentially, the development of modern technology that made it possible to analyze large spatio-temporal data sets, together with the development of specific statistical methods (e.g., point process statistics), measurement technology (e.g., LIDAR), and software dedicated to spatial analysis [2]. The third factor made it possible to analyze, model, and visualize complex spatial relationships between organisms even in rather complex biosystems, such as tropical forests. Spatial analysis is therefore now one of the most rapidly growing fields in ecology, which is directly related to a growing awareness among researchers that the spatial structure of populations (e.g., forest trees) is important in ecological thinking.

Important concepts related to biological structures include self-organization, structure relations, and pattern recognition [3]. Self-organization involves a variety of interactions between individuals (e.g., competition, facilitation), which can modify their growing spaces and spatial niches. Ecological processes leave signs in the form of spatial patterns, but the spatial structure of the system can in turn determine its properties. In a forest, for example, population structure affects biomass production, biodiversity, and habitat functions. Pattern recognition thus plays an important role in forest ecology and usually helps to identify spatial patterns and link them with the corresponding properties of the population [1, 4–7].

The questions addressed by spatial analysis often revolve around identifying the potential causes, e.g., ecological processes and mechanisms, behind the observed arrangement of individuals in the population [1, 8]. Historically, spatial analysis based on point pattern statistics only assessed whether the empirical pattern of the studied population could have emerged by chance, meaning that the occurrence of an individual did not depend on the presence of others and that the probability of occurrence was the same across the whole study area. This expectation is called complete spatial randomness (CSR). The two alternatives to CSR are patterns in which specific mechanisms promote either overdispersion (aggregation, clumping) or underdispersion (regularity) of individuals [1, 8]. Modern spatial statistics, e.g., point pattern analysis, provides more detailed information on spatial relationships between individuals in the investigated population. More complex null models, such as Cox and Gibbs processes, are helpful here. In general, the cluster models of Thomas, Neyman-Scott, and Matérn, which are representatives of Cox processes, provide detailed information on the average cluster size and the number of clusters per unit area. The Gibbs class of point process models (e.g., Strauss and Markov processes), on the other hand, can characterize inhibition mechanisms between individuals [8]. These point process models are important tools in spatial analysis: they help to determine whether there is any significant spatial structure in empirical data, they summarize the properties of that structure, and they allow ecological hypotheses about the mechanisms that may have generated the observed spatial structure to be tested [1].

Fundamental ecological questions arising in forestry concern forest structure and its influence on forest dynamics, forest productivity, and biodiversity [9–12]. Forest structure refers to the way in which the attributes of trees (species, sizes) are distributed in the forest. It affects most ecological processes running in the forest ecosystem, among which forest regeneration, tree growth, survival and mortality, seed dispersal, and competition or facilitation between individuals are especially important (**Figure 1**). Moreover, most biological processes themselves generate specific structures. Structure and processes are therefore not independent, and forest dynamics depends to a large degree on forest structure.

This chapter is divided into the following subchapters:

Data types—what should be known before running the spatial analysis.

Patterns and processes—the mutual dependence causes some inferential problems.

Spatial indices—an easy way to describe population structure.

Functional spatial statistics—the most informative way to discover complex structures.

Conclusion

## 2. Data types—what should be known before running the spatial analysis

Generally, the aim of spatial analysis is to describe the structure of the pattern created by objects distributed in space. Each object is usually treated as a point, regardless of its real shape, and point pattern statistics are valuable tools for such analysis.

As mentioned above, most data collected in ecological studies have spatial dimensions. However, data can be of different types, and the selection of the appropriate statistical method (the so-called summary statistics) depends on two things: the data we want to analyze and the ecological questions we want to answer [8, 13]. Individuals that are the subject of spatial analysis are usually characterized by their location (*x*, *y* coordinates) and additionally by quantitative or qualitative attributes (e.g., size, species, sex, quality, health status, and age). Any constructed mark can also be used as a tree attribute [14].

Individuals described only by coordinates can be analyzed as a so-called unmarked point pattern, while data described by any mark can be analyzed as a marked point pattern [8, 15]. The appropriate summary statistics (indices and functions) that quantify the statistical properties of the pattern depend on the type of data collected in the field. Another important issue in point pattern analysis is the heterogeneity of environmental conditions. In ecology, heterogeneity plays an important role, and its quantification is a key task in spatial analysis. To this end, information on environmental covariates (soil quality, slope, aspect, etc.) should be incorporated in the analysis [16].

In unmarked point pattern analysis, one would like to characterize the spatial relationships between objects, e.g., trees in the forest. The unmarked pattern may include one or more types of individuals, and the analysis falls into the following basic categories: univariate, bivariate, and multivariate point patterns [1, 15]. Univariate point pattern analysis focuses on only one type of point, e.g., a particular tree species. The questions to be answered concern the mechanisms (processes) responsible for the distribution of individuals within the study area. The fundamental null model for univariate analyses is complete spatial randomness, also called the (homogeneous) Poisson model. Under CSR, points are distributed with equal probability within the region of interest, and each point is distributed independently of the others. The alternatives to CSR are an aggregated or a hyperdispersed arrangement of points. In a bivariate point pattern, two types of points are the subject of analysis. It is important to keep in mind that these two types of points must be created by two different processes [8]; such points have so-called a priori properties [16]. Good examples of bivariate point patterns in forest studies are analyses of the spatial correlation between two different tree species or life stages (adults and juveniles). For a bivariate pattern, the null model is the spatial independence of the two patterns, and the alternatives are spatial attraction (positive association) and spatial repulsion or segregation (negative association). The main question concerns the role of interactions between the two types of points. Bivariate analysis can support the theory of species coexistence in multispecies forests [17–22]. In multivariate point pattern analysis, several types of points (e.g., tree species) are involved, and each of them is created by a different process. The relevant ecological questions for such data involve detecting and understanding spatial structures in diversity, namely whether tree species tend to form intraspecific and interspecific structures or whether different tree species tend to be well mixed over the study region. According to the hypothesis that spatial segregation promotes species coexistence, for example, intraspecific clusters of a given species are responsible for interspecific segregation [1, 16, 23].
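The CSR null model described above can be illustrated with a minimal sketch. It assumes a rectangular study window and a fixed number of points (so it simulates CSR conditional on the number of individuals rather than a full Poisson process); the function name `simulate_csr` and its parameters are illustrative and do not come from the cited literature.

```python
import random

def simulate_csr(n_points, width, height, seed=None):
    """Draw n_points independent, uniformly distributed locations in a
    width x height rectangle (CSR conditional on the number of points)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n_points)]

# Hypothetical 100 m x 50 m plot with 200 trees.
pattern = simulate_csr(200, 100.0, 50.0, seed=1)
```

Patterns simulated in this way can serve as a reference against which an index computed from an observed stem map is compared.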

In the spatial analyses mentioned above, points of similar or different types were characterized only by their location. If each point is additionally described by a mark (e.g., tree diameter, tree height, or health status), we obtain qualitatively or quantitatively marked patterns, and summary statistics from so-called marked point pattern statistics should be used. Qualitative marks are usually created by an a posteriori marking process acting on a given point pattern. This situation is quite different from the bivariate pattern, which is created by an a priori process. For a qualitatively marked point pattern, one is interested in the characteristics of the process that distributes the marks over the pattern. The relevant null model for qualitative marks is the random labeling (or independent marking) model, in which marks are shuffled at random over the joint pattern [1, 15]. For quantitative marks, the relevant ecological questions likewise concern the spatial correlation of marks created a posteriori [7, 24, 25]. Such analysis can reveal, for example, how the importance of competition (or cooperation) between trees depends on the distance between them.
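The random labeling null model can be sketched as a simple Monte Carlo test: the point locations stay fixed while the marks are reshuffled. The code below is a minimal illustration under stated assumptions. The test statistic (share of points whose nearest neighbor has the same mark), the upper-tail p-value, and all function names are illustrative choices, not the procedure of any cited reference.

```python
import math
import random

def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def same_mark_share(points, marks):
    """Share of points whose nearest neighbor carries the same mark."""
    same = sum(marks[i] == marks[nearest_neighbor(points, i)]
               for i in range(len(points)))
    return same / len(points)

def random_labeling_test(points, marks, statistic, n_perm=999, seed=0):
    """Return the observed statistic and an upper-tail Monte Carlo p-value
    obtained by reshuffling the marks over the fixed point locations."""
    rng = random.Random(seed)
    observed = statistic(points, marks)
    shuffled = list(marks)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(points, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy stem map: two live and two dead trees standing in separate pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
marks = ["live", "live", "dead", "dead"]
observed, p_value = random_labeling_test(points, marks, same_mark_share,
                                         n_perm=199)
```

The nearest neighbors are recomputed in every permutation to keep the sketch short; for a real stem map they would be computed once and reused.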

**Figure 2** presents the major characteristics of forest structure and its important variables.

## 3. Patterns and processes: complex mutual dependence

As mentioned above, natural processes and mechanisms leave traces in the spatial pattern of individuals occupying a certain area [6]. These traces encompass different aspects of population structure: species composition and species mixing, the spatial arrangement of individuals, and the spatial variation of their sizes [26, 27]. To understand the functional processes, it is necessary to identify the structure and the spatial scales at which the processes operate. Spatial patterns in plant populations, e.g., forests, determine their integrity, functionality, and stability to a large extent [1, 5, 9, 10, 16, 26, 28].

In ecological studies, there are numerous attempts to infer the underlying processes from the observed patterns (structures). The spatial pattern of any population can be treated as an “ecological archive” in which past ecological processes are conserved [16]. Decoding the signals from spatial patterns is still challenging because of the complex relationships between the pattern and the structure of a plant population. Some potential problems arise from the fact that different processes can generate the same spatial pattern or may interact. The processes may also themselves be the result of specific spatial patterns (spatial structures). Moreover, nonrandom processes can generate a random pattern [1, 6, 9, 27–31]. The inverse situation, in which a random process creates a structured pattern, is also possible. Finally, different processes do not have to act simultaneously, and a single process can generate exactly one pattern [32].

The appropriate use of null models in spatial analyses, together with a complete description of the properties of the observed spatial pattern, helps to minimize the problems stated above. One way of addressing them is to use several summary statistics simultaneously. The more structured the population, the more summary statistics should be used to describe the pattern [33]. Nevertheless, the use of only one or two summary statistics is most common in the literature [16]. Historically, only a single null model, namely CSR, was used to decide whether a population is randomly distributed. Many more null models are now available for better analysis [8, 34–37].

In forests, the spatial patterns formed by trees are usually the result of three main biological processes: tree growth, mutual interactions, and mortality [14]. All of these influence forest dynamics and also forest structure at subsequent stages of forest development. Tree growth can be impeded or “accelerated” by different ecological processes, among them the neighborhood effect [32]. Competition processes are difficult to measure directly; however, their effects on tree growth and survival can be studied by spatial pattern analysis. Distance-dependent mortality of trees has quite frequently been described as a consequence of density-dependent competition, and in crowded populations this process frequently leads to a more regular distribution of surviving trees [4, 38–40]. The relationships between small and large trees may be more complex. Small trees may tend to aggregate around large trees because of better moisture conditions there, or they may be segregated from large individuals because of light conditions too poor for their growth and development [41]. In multispecies forests, interspecific competition may be reflected in the spatial segregation of different tree species, which is extremely important for weaker competitors because it allows them to survive [42]. Thus, heterospecific segregation promotes species coexistence in mixed forests [1].

## 4. Spatial indices: an easy way to describe population structure

Spatially explicit indices can be divided into three main groups: quadrat counts, distance-based indices, and angle-based indices. A great advantage of spatial indices is that they are easy to calculate and their results are easy to interpret. However, indices usually do not allow conclusions to be drawn about the spatial pattern of individuals at different spatial scales; results can be interpreted only at a single scale, e.g., that of the nearest neighborhood [3].

### 4.1. Quadrat counts

The quadrat counts method is based on counting points in subareas (quadrats) located in the region of interest [8, 43, 44]. It is the oldest and simplest measure of the pattern and intensity of a population. The simplicity results from the fact that only the number of objects (trees) in each quadrat is recorded, so their exact positions do not need to be known. This, however, limits the statistical analysis. A further disadvantage of the quadrat counts method is that the apparent dispersion of the objects may depend on the scale of the study and the size of the sample unit [37, 43].

#### 4.1.1. Variance‐mean index (VM)

The most common index applied to quadrat counts is the index of dispersion, also called the variance-mean ratio, which is based on the Poisson distribution. For a random distribution of points (following the Poisson distribution), VM = 1. If points are aggregated, VM > 1, and if they are evenly scattered, i.e., regularly distributed, VM < 1 [43, 45–47]. In the first case, the variability in the process is greater than in the Poisson process; in the second case, it is smaller. For statistical inference about the significance of the deviation from 1 (randomness), a *χ*^{2} test with *n* − 1 degrees of freedom can be used (*n* is the number of quadrats).
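As a minimal sketch of the calculation described above, the function below computes the variance-mean ratio from a list of quadrat counts, together with the statistic (*n* − 1)·VM that is compared with the *χ*^{2} distribution on *n* − 1 degrees of freedom. The function name and the toy counts are illustrative assumptions; the critical value has to be looked up separately.

```python
def variance_mean_index(counts):
    """Return (VM, chi2): the variance-to-mean ratio of quadrat counts and
    the statistic (n - 1) * VM, to be compared with a chi-square
    distribution on n - 1 degrees of freedom."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    vm = variance / mean
    return vm, (n - 1) * vm

# Toy example: all eight trees fall into one of four quadrats (clumping).
vm_clumped, chi2_clumped = variance_mean_index([0, 0, 0, 8])
# Toy example: every quadrat holds the same number of trees (regularity).
vm_regular, chi2_regular = variance_mean_index([2, 2, 2, 2])
```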

#### 4.1.2. Morisita index (I_{M})

Another easy-to-calculate index related to the quadrat counts method is Morisita's index, *I*_{M}, calculated from the number of objects in the squares, the number of squares, and the total number of individuals [9, 43]. The standardized index takes values in the range −1 ≤ *I*_{M} ≤ 1, using either of two values calculated from the *χ*^{2} test with *n* − 1 degrees of freedom. If *I*_{M} < 0, points within the population are distributed regularly, while *I*_{M} > 0 indicates an aggregated spatial structure [43]. A random distribution of individuals corresponds to *I*_{M} = 0. The standardized index is assumed to be a very good measure of the spatial pattern because it is not affected by population density and sample size. This index was applied in References [48–51].

**Example 1**

To illustrate the application of the Morisita index, a data set from an old-growth oak-dominated (*Quercus robur* L.) forest located in western Poland is used. **Figure 3** presents the stem map of the forest. Only hornbeam (*Carpinus betulus* L.) was taken into consideration for the *I*_{M} calculations.

The dependence of the spatial point pattern on the spatial scale, assessed with the Morisita index, is presented in **Figure 4**. The plot was divided into 2 × 2 quadrats, then 3 × 3, 4 × 4, etc., and the *I*_{M} index was calculated for each division.

Results indicated that trees of this species were distributed in clumps (*I*_{M} > 1), especially at small spatial scales. The larger the spatial scale, the lower the observed clumping intensity.
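The scale dependence explored in Example 1 can be sketched as follows. The code assumes the common unstandardized form of Morisita's index, *q* · Σ *x*(*x* − 1) / (*N*(*N* − 1)), where *q* is the number of quadrats, *x* the count in a quadrat, and *N* the total number of individuals; this form equals 1 on average for a random pattern, which is consistent with the threshold *I*_{M} > 1 used in the example. The standardization step is not shown, and the function names and toy coordinates are illustrative.

```python
def quadrat_counts(points, width, height, k):
    """Count points in each cell of a k x k grid laid over the plot."""
    counts = [0] * (k * k)
    for x, y in points:
        col = min(int(x / width * k), k - 1)   # clamp points on the upper edge
        row = min(int(y / height * k), k - 1)
        counts[row * k + col] += 1
    return counts

def morisita_index(counts):
    """Unstandardized Morisita index: q * sum(x * (x - 1)) / (N * (N - 1))."""
    q = len(counts)
    total = sum(counts)
    return q * sum(c * (c - 1) for c in counts) / (total * (total - 1))

# Toy stem map: four trees clumped in one corner of a 10 m x 10 m plot.
trees = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
index_by_scale = {k: morisita_index(quadrat_counts(trees, 10.0, 10.0, k))
                  for k in (2, 3, 4)}
```

Plotting `index_by_scale` against the quadrat size gives a curve of the kind shown in the figure for the example stand.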

### 4.2. Point pattern statistics

Spatial point pattern analysis is based on data sets consisting of objects with known locations. Modern ecological analyses are mainly based on point pattern (process) statistics, in which the objects under analysis are represented by points and by the marks describing them. In this subchapter, the most common and powerful methods are briefly described and supported by examples based on real data sets from forest ecosystems. For the reader's convenience, mathematical concepts are omitted in this chapter, but they can be found in many textbooks on spatial statistics, e.g., in Refs. [1, 8, 37, 44, 52, 53].

#### 4.2.1. Spatial arrangement

##### 4.2.1.1. Distance‐based indices

The spatial structure of a forest is largely determined by the relationships between close neighbors, so the neighborhood scale is very important. A group of methods called nearest neighbor statistics is based on the relative positions of individuals in the population [27]. Different indices from this group provide information on different aspects of spatial structure: the spatial arrangement of trees, the spatial differentiation of their sizes, the spatial mingling of tree species, etc. Some of them require the exact position of each tree in the population, while others require the positions of only a sample of trees. Distances in this group of methods can be measured from a sample point to the nearest tree or from a tree to its nearest neighbor [54].

###### 4.2.1.1.1. Clumping index of Clark‐Evans with Donnelly's modification (CE)

This index was introduced by Clark and Evans in 1954 and later modified with an edge correction formula [55]. Historically, it has been the most commonly used index in spatial pattern analysis owing to its simplicity and easy interpretation. The index is based on the distances between nearest neighbors, measured for each tree within the population under investigation, and it measures the extent to which the analyzed population deviates from a random one. For a randomly dispersed population, CE = 1. If individuals are distributed in clumps, CE < 1, and if they are dispersed regularly, CE > 1 [56]. The maximum value of the index, CE ∼ 2.15, is reached for a hexagonal arrangement of individuals [55–58]. The significance of departures from 1 can be assessed with a standard, normally distributed test statistic [59]. The author of Reference [59] argued that special care is needed when applying the CE index to populations where clustering is likely to be present; other indices are then assumed to provide more reliable results. Another weakness of the CE index is that it assumes that the process generating tree locations is homogeneous; where point density varies in space, the index will indicate spurious aggregation [37].
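A minimal sketch of the basic calculation is given below. It implements the original, uncorrected Clark-Evans ratio (observed mean nearest-neighbor distance divided by its expectation under CSR, 1 / (2·√density)); Donnelly's edge correction and the significance test are deliberately left out, and the function name and toy pattern are illustrative assumptions.

```python
import math

def clark_evans(points, area):
    """Clark-Evans index WITHOUT edge correction: observed mean
    nearest-neighbor distance divided by its expected value under CSR."""
    n = len(points)
    nn_distances = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nn_distances) / n
    expected = 0.5 / math.sqrt(n / area)   # 1 / (2 * sqrt(density))
    return observed / expected

# Toy regular pattern: a 5 x 5 square lattice with 1 m spacing on 25 m^2.
lattice = [(i + 0.5, j + 0.5) for i in range(5) for j in range(5)]
ce_regular = clark_evans(lattice, 25.0)    # well above 1, i.e., regular
```

Because trees near the plot border have fewer potential neighbors, the uncorrected ratio tends to be biased upward on small plots, which is the reason for the modification mentioned above.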

###### 4.2.1.1.2. Hopkins‐Skellam index of dispersion (HS)

This index, unlike CE, also uses the nearest neighbor distances between randomly sampled locations and the objects of the pattern (e.g., trees). If the pattern is random, points are distributed independently of each other, and the distance from a data point to its nearest neighbor should have the same probability distribution as the distance from a fixed spatial location to the nearest point of the pattern [37, 43]. Like the CE index, this index is dimensionless. For a random population, HS = 1; for an aggregated structure, HS < 1; and for regularly spaced individuals, HS > 1. The HS test compares the value of the index with the *F*-distribution. The Hopkins-Skellam index is less sensitive than CE to edge-effect bias and spatial inhomogeneity [37].
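The comparison of the two kinds of distances can be sketched as below. The code assumes the commonly used form of the index, the ratio of summed squared nearest-neighbor distances (tree to nearest tree) to summed squared distances from random locations to the nearest tree, which matches the interpretation given above (HS < 1 for aggregation). The rectangular window, the sampling scheme, and the function name are illustrative assumptions, and the *F*-test is not shown.

```python
import math
import random

def hopkins_skellam(points, width, height, n_samples, seed=0):
    """Ratio of summed squared nearest-neighbor distances:
    (sampled tree -> nearest tree) / (random location -> nearest tree)."""
    rng = random.Random(seed)
    tree_sq = 0.0
    for i in rng.sample(range(len(points)), n_samples):
        d = min(math.dist(points[i], q)
                for j, q in enumerate(points) if j != i)
        tree_sq += d * d
    location_sq = 0.0
    for _ in range(n_samples):
        loc = (rng.uniform(0.0, width), rng.uniform(0.0, height))
        d = min(math.dist(loc, q) for q in points)
        location_sq += d * d
    return tree_sq / location_sq

# Toy regular pattern: a 10 x 10 lattice with unit spacing.
lattice = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
hs_regular = hopkins_skellam(lattice, 10.0, 10.0, n_samples=50)
```

The number of sampled trees, `n_samples`, must not exceed the number of trees in the pattern.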

##### 4.2.1.2. Angle‐based indices

Both indices described above require the measurement of distances, which is rather time consuming and laborious. For this reason, two indices based on angles between nearest neighbors, namely the contagion index and the mean directional index, were introduced by Corral-Rivas et al. [60] and Aguirre et al. [61], respectively. Their basic idea is to characterize the spatial pattern of trees at the neighborhood scale by the directions in which the *n* neighbors of a so-called reference point are seen. Each point of the pattern serves in turn as the reference point.

###### 4.2.1.2.1. Uniform angle index (also known as contagion index) (UAI)

This index is based on the classification of the angles *α*_{ij} (*i* refers to the reference tree and *j* to its neighbors) between two neighbors. It compares these angles with an appropriate reference angle, *α*_{0}, which is selected so that it yields 360°/*n* [10]. The contagion is defined as the proportion of angles *α*_{ij} between the four neighbors that are smaller than *α*_{0}, and the index takes values between 0 (regularity) and 1 (clumping). In the case of four neighbors, UAI can take five values: 0.0, 0.25, 0.5, 0.75, and 1.0. The mean value for a stand is the arithmetic mean of the UAI values calculated for all trees. Mean values of UAI > 0.6 indicate a clumped distribution and UAI < 0.5 indicate regularity [9, 10, 62]. More informative than the stand average is the distribution of UAI, which shows in detail how many trees are arranged in clumps and how many are distributed randomly or regularly. This index is a suitable tool when the number of points exceeds 100 individuals [61].
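For a single reference tree, the classification of angles can be sketched as follows. The code assumes that *α*_{ij} is the smaller of the two angles between neighbors that are adjacent in direction, and it uses 360°/*n* as the default reference angle, as stated above (other reference angles can be passed in). The function name and toy coordinates are illustrative.

```python
import math

def uniform_angle_index(ref, neighbors, alpha0=None):
    """Share of angles between adjacent neighbors that are smaller than the
    reference angle alpha0 (degrees); defaults to 360 / n as in the text."""
    n = len(neighbors)
    if alpha0 is None:
        alpha0 = 360.0 / n
    # Bearing of each neighbor as seen from the reference tree, in [0, 360).
    bearings = sorted(
        math.degrees(math.atan2(y - ref[1], x - ref[0])) % 360.0
        for x, y in neighbors
    )
    small = 0
    for k in range(n):
        gap = (bearings[(k + 1) % n] - bearings[k]) % 360.0
        angle = min(gap, 360.0 - gap)  # smaller of the two possible angles
        if angle < alpha0:
            small += 1
    return small / n

# Toy case: all four neighbors stand on the same side of the reference tree.
clumped = [(1.0, 0.0), (1.0, 0.2), (1.0, 0.4), (1.0, 0.6)]
uai_clumped = uniform_angle_index((0.0, 0.0), clumped)
```

Averaging the values over all reference trees gives the stand-level mean, and tabulating them gives the distribution discussed above.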

###### 4.2.1.2.2. Mean directional index (MDI)

This index is more conventional than the previous one and requires more accurate angle measurements, but distances still do not have to be measured. Values obtained with the MDI usually correspond well with those obtained with the UAI. If trees are distributed in a regular manner, MDI = 0, and if they are distributed in clumps, MDI takes larger values. A mean MDI for the stand can also be calculated. The value of the MDI for a random population is 1.7999 (∼1.8). Thus, values of MDI > 1.8 indicate an aggregated structure and values of MDI < 1.8 a regular distribution of individuals. This index is suitable for populations of more than 50 individuals [61, 63].
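A minimal sketch for one reference tree is given below. It assumes the usual definition of the index as the length of the sum of the unit vectors pointing from the reference tree to each of its neighbors, which gives 0 for a perfectly regular arrangement, in line with the description above. The function name and toy coordinates are illustrative.

```python
import math

def mean_directional_index(ref, neighbors):
    """Length of the sum of unit vectors pointing from the reference tree
    to each of its neighbors (only directions matter, not distances)."""
    sum_x = sum_y = 0.0
    for x, y in neighbors:
        dx, dy = x - ref[0], y - ref[1]
        length = math.hypot(dx, dy)
        sum_x += dx / length
        sum_y += dy / length
    return math.hypot(sum_x, sum_y)

# Four neighbors in the four cardinal directions: perfectly regular.
cross = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
mdi_regular = mean_directional_index((0.0, 0.0), cross)

# Four neighbors lined up in one direction: strongly clumped.
row = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
mdi_clumped = mean_directional_index((0.0, 0.0), row)
```

With four neighbors the value ranges from 0 to 4, so the random expectation of about 1.8 quoted above lies in the lower half of the range.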

**Example 2**

The application of spatial indices is illustrated with a real data set collected in an old-growth oak-dominated (*Quercus robur* L.) forest located in a nature reserve in western Poland. **Figure 3** presents the stem map of trees in the forest. The forest has been excluded from any human intervention for the last 50 years. The main tree species is pedunculate oak in the overstory, with hornbeam (*Carpinus betulus* L.) in the understory. The age of the oaks was approximately 160 years and that of the hornbeams ca. 70–90 years. Each tree was described by its coordinates and by marks: diameter at breast height (*dbh* in cm) and total tree height (*h* in m). **Table 1** presents the values of the nearest-neighbor indices (CE, HS, UAI, and MDI) for all trees and for each tree species separately.

| Spatial measure | All trees | Pedunculate oak | Hornbeam |
|---|---|---|---|
| CE | 0.90\* | 1.10 | 0.90\* |
| HS | 0.65\* | 0.89 | 0.56\* |
| UAI | 0.51 | 0.50 | 0.51 |
| MDI | 1.89\* | 1.99\* | 1.88\* |

\*Significant departures from CSR at *α* = 0.05.

Both distance-based indices, CE and HS, clearly indicated clustering of all living trees. Of the angle-based indices, only MDI was consistent with the distance-based results; UAI indicated a random distribution of living trees. Oaks showed a random distribution according to CE, HS, and UAI, but not according to MDI, which indicated a clumped distribution. The spatial pattern of hornbeam was also clumped, which most indices confirmed, except UAI. On the basis of these results, one can state that the spatial pattern of trees in the forest was shaped mainly by the density of hornbeam, which regenerates easily from sprouts.

#### 4.2.2. Spatial variation in size: spatially explicit size differentiation indices

Apart from the spatial arrangement of trees, tree size differentiation is assumed to be an important characteristic describing population diversity. Two commonly applied spatial indices seem to be interesting: size differentiation index and (relative) dominance index.

##### 4.2.2.1. Size differentiation index (T)

This index describes the similarity or dissimilarity in size of individuals that are nearest neighbors. The neighborhood of the reference tree consists of its three or four nearest neighbors. The *T* index is a single value calculated for each tree within the population, and its arithmetic mean gives information on the average size differentiation of trees in the forest. In an extremely highly structured population, *T* approaches 1, whereas in a population where individuals are quite similar it is close to 0. The arithmetic mean provides a general insight into the structural diversity of the forest at the stand level. More informative, however, is the share of trees belonging to particular differentiation classes: 0–0.30, very small differentiation; 0.30–0.50, moderate differentiation; 0.50–0.70, high differentiation; 0.70–1.00, very high differentiation [10]. To find out whether departures from the expected value of *T* under random conditions are statistically significant, a permutation procedure can be applied.
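The calculation for one reference tree can be sketched as follows. The formula assumed here, *T* = 1 − mean(min(size_i, size_j) / max(size_i, size_j)) over the neighbors *j*, is a commonly used form of the index that behaves as described above (0 for identical neighbors, values near 1 for very different ones). The assignment of boundary values to classes and the function names are illustrative assumptions.

```python
def size_differentiation(ref_size, neighbor_sizes):
    """T_i = 1 - mean(min(size_i, size_j) / max(size_i, size_j))."""
    ratios = [min(ref_size, s) / max(ref_size, s) for s in neighbor_sizes]
    return 1.0 - sum(ratios) / len(ratios)

def differentiation_class(t):
    """Map T to the classes quoted in the text (upper bounds exclusive,
    which is an assumption for values falling exactly on a boundary)."""
    if t < 0.30:
        return "very small"
    if t < 0.50:
        return "moderate"
    if t < 0.70:
        return "high"
    return "very high"

# Reference tree with dbh 10 cm and three neighbors of 20, 40, and 10 cm.
t_value = size_differentiation(10.0, [20.0, 40.0, 10.0])
t_class = differentiation_class(t_value)
```

A permutation test of the kind mentioned above would reshuffle the sizes over the fixed tree positions and recompute the mean *T* for each reshuffle.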

##### 4.2.2.2. Size dominance index (D)

This index describes the relative dominance of a given tree over its nearest neighbors. It is defined as the proportion of the *n* neighbors of a reference tree that are smaller than the reference tree [62, 64]. If four neighbors are taken into consideration, the *D* index can again take five values, corresponding to different biosocial categories according to Kraft's crown classification: 0.00, very suppressed (none of the neighbors is smaller than the reference tree); 0.25, moderately suppressed; 0.50, codominant; 0.75, dominant; 1.00, strongly dominant (all neighbors are smaller than the reference tree).
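Following the definition of *D* as the proportion of neighbors smaller than the reference tree, a minimal sketch for four neighbors could look as below. The dictionary of labels repeats the categories listed above; the names `dominance_index` and `KRAFT_LABELS` are illustrative.

```python
# Labels for the five possible values of D when four neighbors are used.
KRAFT_LABELS = {
    0.0: "very suppressed",
    0.25: "moderately suppressed",
    0.5: "codominant",
    0.75: "dominant",
    1.0: "strongly dominant",
}

def dominance_index(ref_size, neighbor_sizes):
    """Proportion of neighbors that are smaller than the reference tree."""
    smaller = sum(1 for s in neighbor_sizes if s < ref_size)
    return smaller / len(neighbor_sizes)

# Reference tree of 25 cm dbh among neighbors of 10, 20, 30, and 40 cm.
d_value = dominance_index(25.0, [10.0, 20.0, 30.0, 40.0])
d_label = KRAFT_LABELS[d_value]
```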

**Example 3**

**Figure 5** presents the location of trees in a managed old-growth beech-dominated (*Fagus sylvatica* L.) forest. The main tree species was European beech, with silver fir (*Abies alba* Mill.) as an admixture species. Both tree species occurred in the overstory. The average age of the forest was 145 years. Up to the year of measurement, the stand had been managed according to Polish standards for beech stands. Apart from the location of each live tree in the stand (*x*, *y* coordinates), diameter at breast height (dbh, in cm) and total tree height (*h*, in m) were measured, and tree species were recorded.

The average diameter and height differentiation indices were *T*_{dbh} = 0.33 and *T*_{h} = 0.20, respectively. Results indicated that the diameter of living trees was more differentiated between close neighbors than tree height. The distribution of trees among the differentiation classes showed that the neighbors of ca. 43% of trees differed only slightly in dbh, while 50% of trees were surrounded by more differentiated individuals. For tree height, the trend is similar, but the differences between nearest neighbors are much less pronounced (**Figure 6**).

The average spatial differentiation index calculated for diameter for beech and silver fir was *T*_{dbh} = 0.32 and *T*_{dbh} = 0.37, respectively. In the case of tree height, these indices were *T*_{h} = 0.19 and *T*_{h} = 0.26 for beech and fir, respectively. **Figure 7** shows the distribution of trees in the particular size differentiation classes. Trees of both species showed more or less similar distribution in particular size differentiation classes in the case of both tree attributes.

Trees showed small to moderate diameter differentiation at the neighborhood scale (ca. 90% of trees). At the same time, the height differentiation of nearest neighbors was clearly lower, and most trees (ca. 83%) showed small differentiation relative to their neighbors (**Figure 6**). In general, diameter was more differentiated than tree height for both tree species in the forest.

The dominance criterion is useful for describing the relative dominance of different tree species, for example European beech and silver fir in the data set presented here. The distribution of *D* for beech is skewed toward low values, meaning that the majority of trees of this species are surrounded by at least three bigger neighbors. However, there are a few dominant beech trees. A similar pattern was observed for silver fir (**Figure 8**).

#### 4.2.3. Spatial mixing of species

The third aspect of spatial structure is attributed to the relative mingling of different species in plant community. Two indices can be taken into consideration: species mingling index introduced by von Gadow and species segregation index introduced by Pielou [65].

##### 4.2.3.1. Species mingling index (MI)

This index describes the spatial distribution of different tree species around the reference tree [10, 27, 64, 66]. It is determined for each individual (reference tree) within the population and gives the proportion of the nearest neighbors (e.g., four) that are not of the same species as the reference tree. The index takes values between 0 and 1, and if four neighbors are taken into account, five values of MI can be obtained: 0.0 (all neighbors are of the same species as the reference tree), 0.25, 0.50, 0.75, and 1.0 (all neighbors are of a different species than the reference tree). As with the previously described indices, the distribution of MI provides a more detailed insight into the species composition of the forest. To find out whether departures from random mixing are statistically significant, a permutation procedure can be applied.
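The tree-level values and their distribution can be sketched as follows for a small stem map. The code assumes four nearest neighbors by default and applies no edge correction, so trees near the plot border may have neighbors outside the plot that are not taken into account; the function names and the toy stand are illustrative.

```python
import math
from collections import Counter

def k_nearest(points, i, k=4):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = sorted((j for j in range(len(points)) if j != i),
                    key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def mingling_values(points, species, k=4):
    """Mingling index of every tree: the share of its k nearest neighbors
    that belong to a different species."""
    values = []
    for i in range(len(points)):
        neighbors = k_nearest(points, i, k)
        values.append(sum(species[j] != species[i] for j in neighbors) / k)
    return values

# Toy stand: one oak surrounded by four hornbeams.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
species = ["oak", "hornbeam", "hornbeam", "hornbeam", "hornbeam"]
mi = mingling_values(points, species)
mean_mi = sum(mi) / len(mi)
distribution = Counter(mi)   # number of trees per mingling class
```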

##### 4.2.3.2. Species segregation index (SSI)

This index describes the relative mixing of only two species regardless of their spatial pattern. If there are more than two species in the population, each pair of species should be analyzed separately. The SSI is based on a comparison of the observed number of mixed species pairs with the number expected if the two species were distributed independently of each other [9, 59, 67]. SSI values lie between −1 and 1. Two species are associated (aggregated) if SSI < 0 and segregated if SSI > 0. They are randomly mixed if SSI = 0 [59]. A *χ*^{2} test may be applied to judge the significance of departures from random mixing of the two species.
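
As a rough illustration, the index can be computed from a 2 × 2 table of reference trees and their nearest neighbors. The sketch below assumes the common form SSI = 1 − (observed mixed pairs)/(expected mixed pairs) and searches for nearest neighbors only among trees of the two species; the function name and the data are hypothetical.

```python
import math

def pielou_segregation(points, species, sp_a, sp_b):
    """Pielou's S = 1 - observed / expected number of mixed nearest-neighbor pairs."""
    idx = [i for i, s in enumerate(species) if s in (sp_a, sp_b)]
    counts = {(p, q): 0 for p in (sp_a, sp_b) for q in (sp_a, sp_b)}
    for i in idx:
        # Nearest neighbor of tree i among the trees of the two species.
        nearest = min((j for j in idx if j != i),
                      key=lambda j: math.dist(points[i], points[j]))
        counts[(species[i], species[nearest])] += 1
    a, b = counts[(sp_a, sp_a)], counts[(sp_a, sp_b)]
    c, d = counts[(sp_b, sp_a)], counts[(sp_b, sp_b)]
    observed_mixed = b + c
    # Expected mixed pairs if neighbor species were independent of base species.
    expected_mixed = ((a + b) * (b + d) + (c + d) * (a + c)) / len(idx)
    return 1.0 - observed_mixed / expected_mixed

# Hypothetical data: two well-separated single-species clumps ...
clumped = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
s_clumped = pielou_segregation(clumped, ["oak"] * 3 + ["hornbeam"] * 3, "oak", "hornbeam")
# ... and a row in which the two species alternate.
row = [(0, 0), (1, 0), (2, 0), (3, 0)]
s_row = pielou_segregation(row, ["oak", "hornbeam"] * 2, "oak", "hornbeam")
print(s_clumped, s_row)
```

The two toy patterns mark the extremes of the index: complete segregation for the separate clumps and complete association for the alternating row.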

**Example 4**

Let us return to the oak‐dominated old‐growth forest introduced earlier (see Example 1). Two tree species are present in the stand. The average value of the mingling index is small (MI = 0.13), suggesting that the tree species are distributed in homogeneous patches. For oak, MI = 0.40, indicating that oaks are distributed in heterogeneous clumps, while hornbeams are distributed in homogeneous patches (MI = 0.06).

As shown in **Figure 9**, the trees form mostly homogeneous patches. About 70% of trees are surrounded by trees of the same species. This is mostly due to hornbeam: about 80% of individuals of this species are surrounded by conspecifics. The surroundings of oaks are mostly heterogeneous, and for 70% of oaks three of four neighbors are of a different species.

Applying Pielou's segregation index (SSI), we obtained only limited information on the probability of finding individuals of one species in the neighborhood of individuals of the other species. In this example, the SSI indicated random mixing of oak and hornbeam (SSI = 0.25, *p*‐value = 0.25).

## 5. Functional spatial statistics: the most informative way to discover complex structures

A great advantage of the simple indices described above is their ease of calculation and interpretation. However, the functions of modern point process statistics, which depend on the distances between all points of the pattern or on the distances between nearest neighbors, are now commonly used. These functional summary statistics characterize a pattern as a function of scale. Depending on the data type, the ecological questions to be answered, and the hypotheses to be tested, different functional summary statistics can be selected.

### 5.1. Nearest‐neighbor distance‐based distribution functions

There are a few functions that are able to quantify the spatial distribution of individuals as random, regular, or clumped. This is an important aspect of spatial structure of any population.

#### 5.1.1. Nearest‐neighbor distance distribution function (G‐function)

The *G*‐function is based on the distances from a point of the pattern (e.g., a tree) to its nearest neighbor. The values of the *G*‐function are nondecreasing in the distance *r*, starting from *G*(*r*) = 0 at *r* = 0. The nearest‐neighbor distribution function for CSR is easy to calculate: *G*(*r*) = 1 − exp(−*λπr*^{2}), where *λ* is the intensity of the pattern. The empirical *G*‐function is plotted against the theoretical expectation and indicates how the individuals are spaced in the population. A clustered arrangement is indicated if *G*_{obs} > *G*_{csr}, that is, the nearest‐neighbor distances are smaller than would be expected under randomness. For a regular pattern, *G*_{obs} < *G*_{csr}, that is, the distances between nearest neighbors are larger than under a random distribution [37, 68, 69].
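
The comparison of the empirical *G*‐function with its CSR expectation can be sketched as follows. This is a simplified illustration without edge correction; the function name `g_function` and the grid‐like plantation are hypothetical.

```python
import math

def g_function(points, area, radii):
    """Empirical G(r) (no edge correction) and the CSR value 1 - exp(-lambda*pi*r^2)."""
    n = len(points)
    lam = n / area  # estimated intensity (points per unit area)
    nn_dist = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)]
    g_obs = [sum(d <= r for d in nn_dist) / n for r in radii]
    g_csr = [1.0 - math.exp(-lam * math.pi * r ** 2) for r in radii]
    return g_obs, g_csr

# Hypothetical plantation: 5 x 5 trees on a 1.5 m grid in a 7.5 m x 7.5 m plot.
points = [(0.75 + 1.5 * i, 0.75 + 1.5 * j) for i in range(5) for j in range(5)]
radii = [0.5, 1.0, 1.5, 2.0]
g_obs, g_csr = g_function(points, 7.5 * 7.5, radii)
print(g_obs, g_csr)
```

For this regular grid the empirical function stays at 0 until the grid spacing is reached and therefore lies below the CSR curve at short distances, which is the signature of regularity described above.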

#### 5.1.2. Empty‐space function (F‐function)

The *F*‐function characterizes the empty space in a pattern and is also known in the literature as the spherical contact distribution function. It is based on the distribution of distances from arbitrarily selected locations (so‐called test points), which are not points of the pattern, to the nearest point of the pattern [1]. This statistic is closely related to the *G*‐function, but its interpretation is the opposite. The value of the *F*‐function for CSR is the same as for *G*: *F*(*r*) = 1 − exp(−*λπr*^{2}). The empirical *F*‐function is again plotted against the theoretical values. A clumped distribution is indicated if *F*_{obs} < *F*_{csr}. That is, the distances from an arbitrary location to the nearest point of the pattern are, on average, larger than under CSR because a clustered pattern contains larger gaps than a random one. For a regular pattern, *F*_{obs} > *F*_{csr}, that is, the gaps are smaller and the distance from any location to the nearest point of the pattern is smaller.
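
A minimal sketch of the empty‐space function is given below. It assumes a rectangular plot with its lower left corner at the origin, uses a regular grid of test locations, and applies no edge correction; all names and data are hypothetical. Because a clumped pattern leaves large gaps, its empirical *F* stays below the CSR curve, whereas a regular pattern lies above it.

```python
import math

def f_function(points, width, height, radii, grid=20):
    """Empirical F(r) from a regular grid of test locations (no edge correction)."""
    tests = [((i + 0.5) * width / grid, (j + 0.5) * height / grid)
             for i in range(grid) for j in range(grid)]
    # Empty-space distance: from each test location to the nearest pattern point.
    empty = [min(math.dist(t, p) for p in points) for t in tests]
    lam = len(points) / (width * height)
    f_obs = [sum(d <= r for d in empty) / len(tests) for r in radii]
    f_csr = [1.0 - math.exp(-lam * math.pi * r ** 2) for r in radii]
    return f_obs, f_csr

# Hypothetical 10 m x 10 m plots with 25 trees each.
regular = [(1 + 2 * i, 1 + 2 * j) for i in range(5) for j in range(5)]
clumped = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
f_regular, f_csr = f_function(regular, 10, 10, [1.0])
f_clumped, _ = f_function(clumped, 10, 10, [1.0])
print(f_clumped, f_csr, f_regular)
```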

It is worth noting that both functions have their inhomogeneous versions, which can be applied in cases when the spatial pattern of individuals within the population is not homogeneous.

**Example 5**

**Figure 10** presents the stem map generated from the data set collected in a 30‐year‐old Scots pine (*Pinus sylvestris* L.) monoculture. The stand was planted at an initial spacing of 1.5 m × 1.5 m and has not been managed so far. For each tree, the diameter at breast height (dbh) was measured and the location coordinates (*x*, *y*) were recorded.

The nearest‐neighbor distribution function (*G*) was calculated for the data, and the empirical function was plotted against the function for complete spatial randomness. Both functions are presented in the left panel of **Figure 11**. The graph of the *G*‐function for the data set lies clearly below the expectation, indicating regularity in the tree distribution. Up to a distance of 1.8 m, *G*(*r*) = 0. This distance may be interpreted as the minimum distance between nearest individuals, and it is due to a hard‐core process, the simplest kind of interaction between individuals.

The empty‐space *F*‐function (*F*) is presented in the right panel in **Figure 11**. It confirms regularity in the spatial pattern of pines stated on the basis of the nearest‐neighbor function.

**Figure 12** presents the graphs of the *G*‐ and *F*‐functions (left and right panels, respectively) calculated for hornbeams from an old‐growth oak‐dominated forest. Both functions confirmed the aggregated pattern of this tree species that is inconsistence with results obtained by spatial indices.

### 5.2. Second‐order summary functions

Second‐order statistics rely on the spatial relationships of pairs of trees, not only on nearest neighbor distances [37].

#### 5.2.1. Second‐order functions to discover the spatial arrangement of points

##### 5.2.1.1. Univariate (unmarked) point pattern analysis

It refers to the pattern of points (e.g., trees in the forest) described only by their position (coordinates). Information on additional point attributes (e.g., size, sex, etc.) is not provided.

###### 5.2.1.1.1. Ripley's function (K(r))

It appears to be the most common second‐order summary function [1, 16, 44, 69]. This function is based on the distances between all individuals of the point pattern. Multiplied by the intensity *λ* of the pattern, *K*(*r*) gives the expected number of further points within the distance *r* of a typical point of the pattern. Under CSR, *λπr*^{2} individuals are expected within the distance *r* of the typical point, so that *K*(*r*) = *πr*^{2}. Usually, the *K*‐function is plotted, together with its expectation, against the distance *r* (spatial scale), and its shape provides valuable information on the point pattern. If the empirical *K*(*r*) > *πr*^{2}, the distribution of individuals is consistent with clustering at the distance *r*. Conversely, the pattern is consistent with regularity if *K*(*r*) < *πr*^{2}. Because the *K*‐function increases at the rate of *r*^{2} under CSR, it is better to use its transformation, the *L*‐function, *L*(*r*) = √(*K*(*r*)/*π*), which stabilizes the variance and transforms the CSR expectation to the straight line *L*(*r*) = *r* [37]. The interpretation of the *L*‐function is straightforward: for a regular distribution *L*(*r*) < *r*, and for an aggregated pattern *L*(*r*) > *r*. To infer the scale of spatial interaction in a point pattern, it is tempting to read off the distance at which the function for the observed data lies furthest from the CSR expectation. This is not always correct because the function is cumulative, and effects at smaller distances obscure the effects at larger scales.
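
The following sketch shows a naive estimate of *K*(*r*) and of the transformation *L*(*r*) = √(*K*(*r*)/*π*). It deliberately ignores edge effects, which any serious analysis would have to correct for, and the function name `ripley_l` and both toy patterns are hypothetical.

```python
import math

def ripley_l(points, area, radii):
    """Naive K(r) and L(r) = sqrt(K(r) / pi), ignoring edge effects."""
    n = len(points)
    pair_d = [math.dist(points[i], points[j])
              for i in range(n) for j in range(n) if i != j]
    # K(r) = area / n^2 * number of ordered pairs no further apart than r.
    k_vals = [area * sum(d <= r for d in pair_d) / n ** 2 for r in radii]
    l_vals = [math.sqrt(k / math.pi) for k in k_vals]
    return k_vals, l_vals

# Hypothetical 7.5 m x 7.5 m plots with 25 trees each.
regular = [(0.75 + 1.5 * i, 0.75 + 1.5 * j) for i in range(5) for j in range(5)]
clumped = [(3 + 0.1 * i, 3 + 0.1 * j) for i in range(5) for j in range(5)]
radii = [1.0, 2.0]
_, l_regular = ripley_l(regular, 7.5 * 7.5, radii)
_, l_clumped = ripley_l(clumped, 7.5 * 7.5, radii)
print(l_regular, l_clumped)
```

For the grid, *L*(*r*) lies below *r* at both distances, while for the single tight clump it lies well above *r*. Note that the clump is only about 0.5 m wide, yet *L*(*r*) still exceeds *r* at 2 m, which illustrates the cumulative behavior mentioned above.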

###### 5.2.1.1.2. Pair correlation function (g(r))

An alternative to the *K*‐function is the pair correlation function, a noncumulative summary statistic. This function is closely related to the *K*‐function and is recommended in Refs. [1, 8]. It contains contributions only from interpoint distances equal to *r*. An advantage of *g*(*r*) is that under CSR it is equal to 1, independent of the intensity of the pattern. A tendency toward clustering means that there are, on average, more individuals at small distances *r* than expected under CSR, so that *g*(*r*) > 1. Conversely, for a regular arrangement of individuals there are, on average, fewer individuals at small distances than under CSR, and *g*(*r*) < 1 [1, 37].
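
The close relation between the two functions can be written explicitly. For a stationary and isotropic pattern in the plane, the standard relation (given here as background, in the notation of this section) is

$$
g(r) = \frac{1}{2\pi r}\,\frac{\mathrm{d}K(r)}{\mathrm{d}r},
\qquad
K(r) = 2\pi \int_{0}^{r} g(s)\, s \,\mathrm{d}s .
$$

Substituting the CSR value *K*(*r*) = *πr*^{2} gives *g*(*r*) = 2*πr*/(2*πr*) = 1 for every *r*. The integral form also shows why *K*(*r*) is cumulative: it collects the contributions of *g* from all distances up to *r*.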

##### 5.2.1.2. Bivariate point pattern analysis

Both Ripley's function and the pair correlation function can be extended to discover spatial relationships between points of two types. For example, bivariate point pattern analysis is a suitable tool for discovering the spatial relationships between two tree species mixed in a forest.

###### 5.2.1.2.1. Bivariate Ripley's function (K_{12}(r))

Ripley's function can be extended to the bivariate form; for details on a suitable estimator, see Refs. [1, 8, 37]. The ecological questions here concern detecting possible interactions between two types of objects (e.g., tree species in the forest). The fundamental benchmark is spatial independence, which separates two alternatives: association on the one hand, and repulsion (at small scales) or segregation (at large scales) of the two types on the other. The bivariate *L*_{12}(*r*) is an analog of the univariate *L*(*r*)‐function. If points of type 1 and type 2 are spatially independent, *L*_{12}(*r*) = *r*. If *L*_{12}(*r*) > *r*, the two types of objects show spatial association at the distance *r*, and if *L*_{12}(*r*) < *r*, points of different types show spatial repulsion (separation).
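
A naive version of the bivariate function can be sketched by counting type 2 points around type 1 points. The sketch ignores edge effects and is not one of the estimators from the cited references; the function name `cross_l` and the data are hypothetical.

```python
import math

def cross_l(points_1, points_2, area, radii):
    """Naive L12(r) from counts of type-2 points within r of type-1 points."""
    n1, n2 = len(points_1), len(points_2)
    cross_d = [math.dist(p, q) for p in points_1 for q in points_2]
    l12 = []
    for r in radii:
        k12 = area * sum(d <= r for d in cross_d) / (n1 * n2)
        l12.append(math.sqrt(k12 / math.pi))
    return l12

# Hypothetical 10 m x 10 m plot: oaks in a column at x = 1 m.
oaks = [(1, y) for y in range(1, 10)]
hornbeams_far = [(9, y) for y in range(1, 10)]     # opposite side of the plot
hornbeams_near = [(1.2, y) for y in range(1, 10)]  # right next to the oaks
l_far = cross_l(oaks, hornbeams_far, 100, [2.0])
l_near = cross_l(oaks, hornbeams_near, 100, [2.0])
print(l_far, l_near)
```

With the hornbeams on the opposite side of the plot, no hornbeam lies within 2 m of an oak and *L*_{12}(2) = 0 < 2 (repulsion). With the hornbeams next to the oaks, *L*_{12}(2) > 2 (association).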

###### 5.2.1.2.2. Bivariate pair correlation function (g_{12}(r))

Like the *L*‐function, the *g*(*r*)‐function can easily be extended to the bivariate form, *g*_{12}(*r*), to discover correlations between two types of objects. Here, *g*_{12}(*r*) = 1 indicates spatial independence of two types of points a distance *r* apart. If *g*_{12}(*r*) > 1, spatial association of the two types of objects can be inferred, and if *g*_{12}(*r*) < 1, they are spatially segregated at the distance *r*.

Both functions, Ripley's function and the pair correlation function, can also be calculated for inhomogeneous point patterns, that is, when the intensity of the pattern varies in space [37].

**Example 6**

To present the different shapes of the univariate *L*‐ and *g*‐functions for regular and aggregated patterns, the data sets from the Scots pine stand and the old‐growth oak‐dominated forest described previously were used. Both functions for the empirical data sets are presented in **Figure 13**.

In the left panel, both functions calculated for live trees in the pine stand show clear evidence of regularity. The functions lie below the CSR expectation, and the departures from the expectation are significant up to a distance of 1.8 m (*g*‐function) and 2 m (*L*‐function). Up to these distances both functions are equal to 0, which indicates the minimum distance between trees. Moreover, the shape of the pair correlation function is typical of plantations where trees have been planted in rows, which is reflected in the wave‐like shape of the function. Thus, the spatial pattern of trees can provide important information about the history of establishment of the forest.

In the right panel of **Figure 13**, there is an example of clustering of trees. Both functions lie above the CSR expectation. Because the *L*‐function is cumulative, it is rather hard to determine the distance at which aggregations of trees occur. With the pair correlation function, this distance is clearly visible: the distance at which the *g*‐function reaches its maximum corresponds to the average cluster size. For hornbeams it was about 0.5 m. Such small spatial aggregations of hornbeams are typical of regeneration from sprouts, which is quite frequently observed in this species.

Spatial correlation between oak (subscript: db) and hornbeam (gb), an example of bivariate analysis, is presented in **Figure 14**. The bivariate pair correlation function indicates a negative spatial association (spatial repulsion) between these two tree species in the old‐growth oak‐dominated forest. This means that the two species are spatially separated. In virgin forests, spatial segregation is assumed to decrease interspecific competition, and it is supported by different mechanisms, e.g., different niche requirements of tree species. Thanks to this spatial separation, tree species can coexist in a multispecies forest.

In the oak‐dominated forest, the correlation range between oak and hornbeam was about *r* = 11 m, that is, *g*_{12}(*r*) < 1 up to this distance. The negative association of the two species more likely results from the extremely different abundances of oak and hornbeam as well as their different life stages. The clumped pattern of hornbeam may result from sprouting, while a random distribution of oak is typical of old, large trees. In plant populations, low intraspecific competition and higher interspecific competition favor species coexistence in multispecies forests.

#### 5.2.2. Inhomogeneous point pattern analysis

Inhomogeneous point pattern analysis should be used when point density varies significantly with location. Such cases are frequently observed in natural forests, e.g., due to forest site variation, seed dispersal, etc. Applying a homogeneous second‐order summary function to such a pattern leads to misinterpretation of the results, the so‐called virtual aggregation. To avoid it, one can use inhomogeneous versions of the summary functions mentioned above or the special function introduced by Schiffers et al. [70].

##### 5.2.2.1. K2‐function

This function was developed as an extension of *g*(*r*) that can be used to detect regular or clumped patterns despite spatial variation in the point intensity across the study region [70]. Unlike the *L*‐ and *g*‐functions, the *K*2‐function relates the intensities at a given scale to the intensities at adjacent scales [70]. It allows one to interpret scales of significant deviations from the expectation at distances where transitions from low to high (or high to low) intensities occur. Negative values of the *K*2‐function indicate clustering because the neighborhood density decreases with increasing distance. The function takes positive values for a regular pattern due to the steep increase in neighborhood density at a certain distance.

**Example 7**

**Figure 15** presents a stem map of European yew (*Taxus baccata* L.) in the Kórnik Arboretum, western Poland [71]. The yew population has developed spontaneously during recent decades. The map shows the locations of male individuals only.

Visual inspection shows that the density of males across the study plot was inhomogeneous, with a density gradient from the south (bottom) to the north (top) of the plot. This inhomogeneity can be clearly seen in the graph of the pair correlation function, which lies completely above the value 1. This indicates the so‐called virtual aggregation, which arises from the heterogeneity in tree density because the pair correlation function relates the density in the surroundings of a tree to the global intensity.

Thus, the pair correlation function would lead to the misinterpretation that males have an aggregated structure. The *K*2‐function circumvents this dependence on the global intensity. In the right panel of **Figure 16**, the estimated *K*2‐function lies completely within the confidence region for the CSR expectation. There are only weak, statistically insignificant deviations toward clumping of males at the smallest spatial scale. Thus, the distribution of males did not differ from random.

#### 5.2.3. Marked point pattern analysis: spatial diversity of different plant attributes

A marked point pattern carries marks (attributes) attached to the points. Marks can be qualitative or quantitative. In this section, methods suitable for analyzing correlations among plant attributes (e.g., sizes, health status, etc.) are presented with real data examples.

##### 5.2.3.1. Qualitative marks

Marked point pattern analysis for qualitative marks describes the points in a different way than bivariate pattern analysis (Section 5.2.1.2). Here, the mark is produced by a process acting a posteriori over the univariate pattern. This is a fundamental difference from the bivariate pattern, in which the plant attributes (e.g., plant species) are generated a priori by two different processes [72]. In other words, qualitative marks are created conditional on a given pattern [1].

###### 5.2.3.1.1. Mark connection function (p_{12})

This function is the conditional probability that the first individual is of type 1 and the second one is of type 2, given that there are points of the process at the locations *m* and *n* separated by the distance *r* [8, 37]. If the marks attached to the points (e.g., trees) of the pattern are independent and identically distributed, then *p*_{12}(*r*) = *p*_{1}*p*_{2}, where *p*_{1} and *p*_{2} denote the probabilities that a point is of type 1 or 2, respectively. Larger values, *p*_{12}(*r*) > *p*_{1}*p*_{2}, indicate a positive association between the two types, while *p*_{12}(*r*) < *p*_{1}*p*_{2} indicates a negative association.
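
A crude estimate of the mark connection function can be obtained as the share of point pairs at a given distance that carry a given combination of marks. The sketch below assumes a simple distance band *r* ± *h* instead of a kernel estimator and ignores edge effects; the function name, the bandwidth, and the data are hypothetical.

```python
import math

def mark_connection(points, marks, type_a, type_b, r, h):
    """Share of ordered pairs at distance r +/- h whose marks are (type_a, type_b)."""
    hits = total = 0
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if i != j and abs(math.dist(p, q) - r) <= h:
                total += 1
                hits += int(marks[i] == type_a and marks[j] == type_b)
    return hits / total if total else float("nan")

# Hypothetical yews: healthy trees (1) grow in tight pairs, poor ones (2) stand alone.
points = [(0, 0), (0.5, 0), (10, 0), (10.5, 0), (5, 5), (20, 20)]
marks = [1, 1, 1, 1, 2, 2]
p1 = marks.count(1) / len(marks)
p2 = marks.count(2) / len(marks)
p11 = mark_connection(points, marks, 1, 1, r=0.5, h=0.1)
p12 = mark_connection(points, marks, 1, 2, r=0.5, h=0.1)
print(p11, p1 * p1, p12, p1 * p2)
```

In this toy pattern every pair at about 0.5 m consists of two healthy trees, so *p*_{11} exceeds *p*_{1}*p*_{1} and *p*_{12} falls below *p*_{1}*p*_{2}. Whether such a difference is significant would have to be judged against simulations of random labeling.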

**Example 8**

The mark connection function was applied to test whether there was any spatial correlation between trees of different health status of European yew (*T. baccata* L.). The study plot (**Figure 17**) was established in the Kniazdwor Nature Reserve, western Ukraine [73]. Yew occurred under the canopy of European beech (*Fagus sylvatica* L.) and silver fir (*Abies alba* L.). All individuals of the height *h* > 0.5 m were classified according to the simple general classification: 1, good health status; 2, poor health status. Details on the classification can be found in Ref. [71].

Trees of poor health status showed neither negative nor positive association, that is, *p*_{22}(*r*) ≈ *p*_{2}*p*_{2} (black solid line, **Figure 18**). Because trees of good health status showed a highly clustered structure at small spatial scales, the probability of finding two healthy trees close to each other was higher than expected (*p*_{11}(*r*) > *p*_{1}*p*_{1}). Over the same spatial scale, healthy trees had a lower than expected probability of having a tree of poor health status as a neighbor, that is, *p*_{12}(*r*) < *p*_{1}*p*_{2} (**Figure 18**).

##### 5.2.3.2. Quantitative marks

Quantitative marks are numerical values that additionally describe each tree (e.g., stem diameter, tree height, etc.). One may be interested in finding out whether the sizes of trees growing at the distance *r* from each other show any spatial correlation, conditional on their locations (the unmarked pattern). Appropriate summary statistics for quantitative marks are the mark correlation functions, which differ in the so‐called test function used in the calculation [1, 7, 14, 16, 40, 74]. Two correlation functions seem to be especially important in the structural analysis of a population: the mark correlation function and the mark variogram.

###### 5.2.3.2.1. Mark correlation function (k_{f}(r))

This function measures the dependence between the marks of two individuals of the pattern separated by the distance *r* [8]. The test function of two marks, *m*_{1} and *m*_{2}, is nonnegative and has the form *t*(*m*_{1}, *m*_{2}) = *m*_{1}*m*_{2}. For a random assignment of marks over the pattern (no spatial correlation among marks), the normalized *k*_{f}(*r*) is equal to 1. Values of *k*_{f}(*r*) < 1 at the distance *r* mean that both individuals tend to have smaller marks than the population average. At small distances, this indicates inhibition between the individuals due to their proximity. If *k*_{f}(*r*) > 1, two individuals growing at the distance *r* apart tend to have larger marks than the average. At small distances, this means that they benefit from being close to one another [8]. The function also provides another characteristic of the pattern, namely the correlation range, which is the distance *r* at which the function approaches the value 1. This form of correlation function is used to find out whether the marks of two plants show any correlation in space.
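
A simple way to see how the normalized function behaves is to average the products of marks over all pairs at a given distance and to divide by the squared mean mark. The sketch below assumes a distance band *r* ± *h* instead of a kernel estimator and no edge correction; the function name and the toy data are hypothetical.

```python
import math

def mark_correlation(points, marks, r, h):
    """Naive k_f(r) with t(m1, m2) = m1 * m2, normalized by the squared mean mark."""
    mean_mark = sum(marks) / len(marks)
    products = [marks[i] * marks[j]
                for i, p in enumerate(points) for j, q in enumerate(points)
                if i != j and abs(math.dist(p, q) - r) <= h]
    if not products:
        return float("nan")
    return (sum(products) / len(products)) / mean_mark ** 2

# Hypothetical trees on a line: two close neighbors with small dbh, two isolated large trees.
points = [(0, 0), (1, 0), (20, 0), (40, 0)]
dbh = [10, 10, 40, 40]
k_close = mark_correlation(points, dbh, r=1.0, h=0.2)
print(k_close)
```

The only pair about 1 m apart consists of two thin trees, so the estimate is far below 1, which corresponds to the inhibition case described above.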

###### 5.2.3.2.2. Mark variogram (γ(r))

In this form of correlation function, the test function is *t*(*m*_{1}, *m*_{2}) = 0.5(*m*_{1} − *m*_{2})^{2}. It characterizes the squared differences between the marks of pairs of individuals separated by the distance *r*. If individuals growing at the distance *r* apart have similar marks, the mark variogram takes smaller values than under random conditions. Large values of *γ*(*r*) indicate that the marks of both individuals tend to be different at a certain distance *r*. As with the *k*_{f}(*r*) function, the correlation range can be determined [8, 37, 40].
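
The mark variogram can be sketched in the same naive way, by averaging 0.5(*m*_{1} − *m*_{2})^{2} over all pairs in a distance band *r* ± *h*. For independent marks the expected value of this test function equals the variance of the marks, which serves as the reference here. The function name and the toy data are hypothetical, and edge effects are ignored.

```python
import math

def mark_variogram(points, marks, r, h):
    """Naive gamma(r): mean of 0.5 * (m1 - m2)^2 over pairs at distance r +/- h."""
    halves = [0.5 * (marks[i] - marks[j]) ** 2
              for i, p in enumerate(points) for j, q in enumerate(points)
              if i != j and abs(math.dist(p, q) - r) <= h]
    return sum(halves) / len(halves) if halves else float("nan")

# Hypothetical trees on a line: neighbors 1 m apart have similar dbh.
points = [(0, 0), (1, 0), (20, 0), (21, 0)]
dbh = [10, 12, 40, 38]
mean_dbh = sum(dbh) / len(dbh)
mark_variance = sum((m - mean_dbh) ** 2 for m in dbh) / len(dbh)
gamma_close = mark_variogram(points, dbh, r=1.0, h=0.2)
gamma_far = mark_variogram(points, dbh, r=20.0, h=1.5)
print(gamma_close, mark_variance, gamma_far)
```

Close neighbors have similar diameters, so the value at 1 m is far below the mark variance, while pairs about 20 m apart differ strongly and give a value above it.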

**Example 9**

**Figure 19** presents the mark correlation function for the diameter of trees in the oak‐dominated old‐growth forest from Example 1. Analysis of *k*_{f}(*r*) indicated that pairs of trees growing up to 9 m apart (the correlation range) tended to have smaller diameters than the stand average. Ecologically, this can be interpreted as mutual growth inhibition of neighboring trees. The mark variogram offered another interesting point of view, providing details on the similarity or dissimilarity of pairs of trees depending on the distance *r* between them. In the oak‐dominated forest, live trees close to one another tended to have similar diameters, and the interaction range was about 12 m.

## 6. Conclusion

Spatial analyses have now largely been incorporated into ecological studies owing to the realistic assumption of spatial dependence between the individuals constituting plant populations. Population structure is one of the most important traits of any biosystem, and its description allows deeper insight into the mechanisms and processes responsible for population dynamics. Understanding these natural processes, which modify the structure and dynamics of plant populations, is important from both an ecological (scientific) and a practical (management of natural resources) point of view. As indicated, different methods of spatial point pattern analysis can be applied depending on the ecological questions posed. All of them are suitable for extracting hidden and detailed information on the current state of a population and allow us to make assumptions about its future development. It is important to remember that the key elements of spatial analysis in ecology are the data type, the appropriate choice of summary statistics, and the null models. Combining several summary statistics in a single analysis makes the conclusions more reliable and realistic in a changing world.