Results obtained by CompPlex H_{e}ROI for six land use and land cover categories in the study area (municipality of São Carlos, São Paulo state, Brazil).

## Abstract

The concept of information entropy underlies many measures used to evaluate the complexity of environmental systems. Its application has great potential for evaluating landscape organization and dynamics, especially considering that there is a direct relation between landscape patterns and processes: the spatial arrangement (structure) of units within a mosaic is reflected in system functions. Consequently, changes in structure are reflected in functions and vice versa. Here, we exemplify how three measures based on information entropy (the LMC and SDL complexity measures and the H_{e}/H_{max} variability measure) can be applied to evaluate the degree of complexity of a landscape and its components by associating their heterogeneity with the diversity of information acquired from remote sensing images. For this, we developed two scripts for a Geographic Information System (QGIS): (1) CompPlex H_{e}ROI, which compares the complexity of a landscape patch with that of other patches and of their transition areas; and (2) CompPlex Janus, which analyzes how complexity varies in the landscape over space and time, generating landscape complexity maps. We also use the LMC and SDL complexity measures and the H_{e}/H_{max} variability measure to evaluate the complexity of time series of environmental variables, such as rainfall and temperature, which makes it possible to evaluate how their variations over time and space affect landscape dynamics. Therefore, the application of such metrics in multi-temporal studies of landscape dynamics provides indicators of landscape resilience and of the degree of conservation or degradation of its different fragments due to anthropic impacts related to land use.

### Keywords

- complexity
- information entropy
- landscape metrics

## 1. Introduction

From the perspective of the Complexity Paradigm [1], the landscape can be interpreted as a complex environmental system that is established from the interdependence relationships of the physical-natural system (that is, by the elements and processes present in nature) and the socioeconomic system (that is, the elements and processes linked to human societies in their cultural, economic and social aspects). When these two systems interact, they are considered as subsystems of a system with a higher level of ecological organization: the landscape. Landscape can be considered the 2nd level of ecological organization, characterized by a set of interrelated ecosystems and formed, as the other levels, by the interactions between society and nature (Figure 1).

As is typical of complex adaptive systems, the landscape presents non-linear negative and positive feedback processes generated by the self-organization of its elements in interaction networks. The structure and dynamics of the landscape affect, and are affected by, the other three levels (holons) that constitute this holarchic organization through bottom-up and top-down processes (Figure 1). Evidence of these interactions can be seen in landscape patterns since, in this type of system, there is a direct relationship between patterns and processes [4]: the spatial arrangement (structure) of units within a mosaic influences system functions. Consequently, changes in structure are reflected in functions and vice versa, thereby affecting landscape resilience and integrity.

Thus, the complexity of a landscape and of the units that comprise the mosaic can be associated with the heterogeneities of its spatial, temporal, and structural patterns, with greater complexities being represented by patterns located in regions of intermediate heterogeneity in a gradient that goes from totally ordered patterns up to those completely disordered [5, 6]. To capture this typical signature of complex environmental systems not only qualitatively, it is necessary to use indicators capable of representing it in a quantitative way. This can be done through measures based on information entropy, which can be applied to assess the structure and dynamics of landscapes, as done by Mattos et al. [7] and Piqueira et al. [8] in relation to ecological interactions between their populations or by Piqueira & Mattos [9] for abiotic factors present in them.

Here, we demonstrate the application of measures based on information entropy for two purposes. The first is to assess the complexity of climatic time series. The second is to compare the complexity of a landscape patch with that of other patches and of their transition areas, and to verify how complexity varies in the landscape over space and time.

## 2. Use of measures based on information entropy to evaluate landscape complexity

Remote sensor images can be used to identify landscape patterns in two different ways. The first one is related to the degree of roughness caused by the variability of tone or color, which provides image geometry and texture from targets [10, 11, 12]. The other is by examining the image spectral features, analyzing the values (e.g., surface reflectance, radiance, digital numbers) for each wavelength interval (i.e., band) vs. matrix pixels obtained for a specific target, which provides the spectral signature of a specific target [11, 13].

Several methods have been developed for both texture and spectral analysis to recognize patterns in remote sensing data [14, 15, 16]. Generally, many measures (also called metrics) derived from these methods have the same purposes: identifying similar patterns that occur in different places and distinguishing different patterns within a landscape. Although theoretically simple, in practice, this objective is not always easily achieved. Here, we discuss the use of three metrics derived from information entropy to measure the complexity of landscape patterns and show their applications to some case studies.

### 2.1 Information entropy applied to landscape patterns recognition and the evaluation of their level of complexity

Several metrics applied to textural and spectral analysis try to capture landscape patterns by using approaches derived from theories and methods associated with the complexity paradigm, such as General System Theory, Cybernetics, Theory of Dissipative Structures, Hierarchy Theory, Percolation Theory, Self-Organized Criticality, Catastrophe Theory and Fractal Geometry [17, 18, 19, 20].

Information entropy and other measures derived from it are also extensively used to quantify landscape heterogeneity and, consequently, to evaluate its organization level and complexity [18, 21]. According to Shiner et al. [22], complexity measures based on information entropy may be classified into three broad categories. The first is composed of measures that consider complexity as a direct function of disorder (as is the case of the very popular Shannon diversity index). Measures of this category therefore attribute lower values of complexity to ordered states and higher values to disordered states [22, 23]. The second category inverts this interpretation by associating higher complexity with the most ordered states [22].

However, both measure categories are considered inadequate since there is no real complexity in situations that present zero or maximum entropy [23]. This fact is particularly applicable in Landscape Ecology, since, as mentioned by Parrot [24], more spatially complex landscapes are those in which the spatial pattern is situated in regions of intermediary heterogeneity, between order and disorder patterns. Thus, the maximum complexity would be located between these two extreme situations, which could be mathematically expressed as a convex function of disorder, as are the measures belonging to the third category defined by Shiner et al. [22]. This is the case of the LMC and SDL measures, proposed initially by López-Ruiz et al. [25] and Shiner et al. [22], respectively. In both definitions, the two complementary parameters – disorder and order – are combined to obtain a complexity measure.

#### 2.1.1 H_{e}/H_{max}, LMC and SDL complexity measures and their application to remote sensing images

Shannon’s Information Theory [26] could be applied to reflectance data in a remote sensing image band. These data are represented by their discretization in single digital numbers (DN), each DN representing a pixel value related to the intensity of the radiation in a particular wavelength at the sensor [11]. As the occurrence of certain DN values becomes more likely than other values, the entropy of the image decreases.

The variability measure H_{e}/H_{max} is related to Shannon entropy and belongs to the first category mentioned by Shiner et al. [22]; it therefore assumes that complexity increases as the disorder of the system increases. This measure is useful to verify whether a landscape and its patches are closer to ordered/homogeneous or to disordered/heterogeneous patterns. To use this measure, it is first necessary to define the system extension (*N*), given by the system's total number of possible states. In the case of remote sensing images, *N* corresponds to the number of different DN values present in the region of interest (ROI). As the maximum entropy of a ROI can only be reached when the DN values (i.e., the states) are equiprobable, the maximum entropy (H_{max}) is calculated considering all DN values with the same probability, as follows (Eq.(1)):

$$H_{max} = \log N \tag{1}$$

Dividing the number of pixels that have a given DN value by the total number of pixels in the ROI gives the probability *p*_{i} of occurrence of the i^{th} DN value within the ROI. The Boltzmann-Gibbs-Shannon entropy (H_{e}) of the ROI is then calculated as (Eq.(2)):

$$H_{e} = -\sum_{i=1}^{N} p_{i} \log p_{i} \tag{2}$$

Finally, the variability measure (*V*) is obtained by dividing the calculated information entropy (H_{e}) by the maximum entropy (H_{max}), as follows (Eq.(3)):

$$V = \frac{H_{e}}{H_{max}} \tag{3}$$

It follows that this measure ranges between 0 and 1, with higher values associated with greater disorder (thermodynamic equilibrium).
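The calculation described for Eqs. (1) to (3) can be illustrated with a minimal sketch. It is not the code of the CompPlex plugins: the function name `variability`, the use of base-2 logarithms, and the convention of returning 0 for a ROI with a single DN value are assumptions made here for illustration only.

```python
from collections import Counter
from math import log2


def variability(dn_values):
    """Return (H_e, H_max, H_e/H_max) for the DN values of one ROI.

    Assumptions of this sketch: base-2 logarithms, and a ROI with a
    single DN value is reported as fully ordered (ratio 0).
    """
    counts = Counter(dn_values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the ROI contains no pixels")
    # p_i: share of ROI pixels that carry the i-th DN value
    probs = [c / total for c in counts.values()]
    n_states = len(probs)                      # N: distinct DN values in the ROI
    h_e = -sum(p * log2(p) for p in probs)     # Eq. (2)
    h_max = log2(n_states)                     # Eq. (1)
    ratio = h_e / h_max if h_max > 0 else 0.0  # Eq. (3)
    return h_e, h_max, ratio


# Hypothetical ROIs: one homogeneous, one with four equiprobable DN values.
homogeneous = variability([5, 5, 5, 5])
uniform = variability([1, 2, 3, 4])
```

With these two hypothetical inputs the ratio sits at the two ends of its range: 0 for the homogeneous ROI and 1 for the ROI whose DN values are equiprobable.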

Unlike the variability measure H_{e}/H_{max}, SDL and LMC belong to the third category of complexity measures defined by Shiner et al. [22], which assumes that the highest complexity is situated between ordered/homogeneous and disordered/heterogeneous patterns, that is, in regions of intermediary heterogeneity associated with a high degree of self-organization. This assumption can be represented mathematically by a convex function of information entropy.

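The SDL measure is composed of two terms, disorder and order, i.e. (Eq.(4)):

$$SDL = V\,(1 - V) = \frac{H_{e}}{H_{max}}\left(1 - \frac{H_{e}}{H_{max}}\right) \tag{4}$$

For the LMC measure, the order term is replaced by another term, called disequilibrium (*D*), which measures the distance between the system's probability distribution and the uniform distribution [27] (Eq.(5)):

$$D = \sum_{i=1}^{N}\left(p_{i} - \frac{1}{N}\right)^{2} \tag{5}$$

Consequently, LMC is given by (Eq.(6)):

$$LMC = V \cdot D \tag{6}$$

or by (Eq.(7)):

$$LMC = \frac{H_{e}}{H_{max}} \sum_{i=1}^{N}\left(p_{i} - \frac{1}{N}\right)^{2} \tag{7}$$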

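Zero is the minimum value for both measures, while 0.25 and 0.15 are the maximum values for SDL and LMC, respectively [27]. These maximum values occur at intermediate levels of disorder: when the DN distribution is uniform, the order term and the disequilibrium both vanish, so both measures return to zero.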
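The behavior of the two measures at the extremes can be checked numerically. The sketch below assumes the usual forms of the measures, SDL = V(1 - V) and LMC = V·D with D = Σ(p_i - 1/N)² and V = H_{e}/H_{max}; the function name `sdl_lmc` and the restriction of the scan to a hypothetical two-state system (N = 2) are choices made here for illustration, not part of the original method.

```python
from math import log2


def sdl_lmc(probs):
    """Return (SDL, LMC) for state probabilities that sum to 1.

    Assumed forms: SDL = V * (1 - V) and LMC = V * D, where
    V = H_e / H_max and D = sum((p_i - 1/N) ** 2).
    """
    probs = [p for p in probs if p > 0]   # empty states do not count toward N
    n_states = len(probs)
    if n_states < 2:
        return 0.0, 0.0                   # a single state is fully ordered
    v = -sum(p * log2(p) for p in probs) / log2(n_states)   # disorder
    d = sum((p - 1 / n_states) ** 2 for p in probs)          # disequilibrium
    return v * (1 - v), v * d


# Scan a two-state system from almost fully ordered to uniform and back.
grid = [k / 1000 for k in range(1, 1000)]
results = [sdl_lmc([p, 1 - p]) for p in grid]
max_sdl = max(sdl for sdl, _ in results)
max_lmc = max(lmc for _, lmc in results)
at_uniform = sdl_lmc([0.5, 0.5])
```

In this two-state scan, `max_sdl` comes out close to 0.25 and `max_lmc` roughly 0.15, both at intermediate probabilities, while `at_uniform` is zero for both measures.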

To apply these measures based on information entropy to remote sensing images, we developed two scripts in the Python language to be executed as plugins in QGIS, an open-source Geographic Information System. The first one is CompPlex H_{e}ROI, which calculates the H_{e}/H_{max}, LMC, and SDL measures of a ROI and compares them with those of other patches and of their transition areas. The other plugin is CompPlex Janus, which runs a sliding window through the image and calculates the three measures for the set of pixels inside it. CompPlex Janus then generates complexity maps, making it possible to verify how complexity varies in the landscape over space and time.
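The sliding-window idea can be sketched in a few lines of plain Python. This is not the CompPlex Janus implementation: the function names, the nested-list image, the base-2 logarithm, the assumed forms SDL = V(1 - V) and LMC = V·D, and the choice to evaluate only windows that fit entirely inside the image are all assumptions of this illustration.

```python
from collections import Counter
from math import log2


def window_measures(values):
    """Return (H_e/H_max, SDL, LMC) for the pixel values in one window."""
    counts = Counter(values)
    probs = [c / len(values) for c in counts.values()]
    n_states = len(probs)
    if n_states < 2:
        return 0.0, 0.0, 0.0              # homogeneous window
    v = -sum(p * log2(p) for p in probs) / log2(n_states)
    d = sum((p - 1 / n_states) ** 2 for p in probs)
    return v, v * (1 - v), v * d


def complexity_maps(image, size=3):
    """Slide a size x size window over a 2D list of DN values.

    Only windows lying fully inside the image are evaluated, so the
    output has (rows - size + 1) x (cols - size + 1) cells.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - size + 1):
        out_row = []
        for c in range(cols - size + 1):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(window_measures(window))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 band with a sharp edge between two DN values.
edge_image = [[1, 1, 2, 2] for _ in range(4)]
edge_map = complexity_maps(edge_image, size=3)
flat_map = complexity_maps([[7] * 4 for _ in range(4)], size=3)
```

Every window of the hypothetical edge image straddles the boundary and therefore receives a non-zero variability, whereas the homogeneous image yields zeros everywhere; a larger `size` averages over more pixels at the cost of spatial detail.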

Here we present examples illustrating the application of CompPlex H_{e}ROI to evaluate the complexity of several ROIs (Example 1) and the use of CompPlex Janus to evaluate the spatial distribution of landscape patterns complexity (Example 2), highlighting in both cases the efficiency of the measures based on the information entropy presented.

#### 2.1.1.1 Example 1: CompPlex H_{e}ROI applied to evaluate patterns of different land uses

In this example, we show how CompPlex H_{e}ROI was applied to evaluate, using metrics based on information entropy, the complexity of the spatial patterns of different land uses present in two neighboring river basins located in the municipality of São Carlos (São Paulo state, Brazil – Figure 2), especially as indicators of the resilience of their green areas, to help establish a free space system for this region. Land uses inside these river basins were identified using images from the CBERS 4 remote sensor (Figure 3), and six categories of use were selected to be evaluated by CompPlex H_{e}ROI. The results obtained for the ROIs of these categories are shown in Table 1, where they are compared for each measure and each band used.

The colors associated with the values presented in Table 1 help to identify the tendencies of the complexity pattern of each land use. In general, ROIs of exposed soil, urban areas, and pasture have high values for the H_{e}, H_{max}, and H_{e}/H_{max} measures and low values for the SDL and LMC measures, indicating that these land uses have more disordered patterns. In turn, agricultural use varies from low to relatively high values for the H_{e}, H_{max}, and H_{e}/H_{max} measures but has, in most cases, low values for the SDL and LMC measures. Finally, vegetation areas have low values for the first three measures (H_{e}, H_{max}, and H_{e}/H_{max}) and high values for the SDL and LMC measures, which use a convex function of disorder to associate more complexity with patterns situated in a zone between ordered and disordered patterns. Therefore, these results are consistent with the Landscape Ecology assumption that more complexity is found in patterns of intermediary heterogeneity [5, 6], as is the case of the vegetation areas present in the two neighboring river basins studied. The high SDL and LMC values obtained for these areas could be related to their high levels of self-organization and resilience.

#### 2.1.1.2 Example 2: CompPlex Janus and landscape complexity maps

To exemplify how CompPlex Janus generates landscape complexity maps, we present a case study using sensor images of the Assis Ecological Station and its boundaries (located in Assis, São Paulo State, Brazil – Figure 4). Several tests varying the sliding window size, sensor band, and number of color classes were performed to compare the results obtained with the H_{e}, H_{e}/H_{max}, SDL, and LMC measures. Some of these results are shown in Figures 5 and 6.

Comparing the four example maps of H_{e}, we observe that, for a sliding window of 3x3 pixels (Figure 5A and B), this measure highlights the borders between different land uses and, especially for the band 3 image (Figure 5A), breaks the apparent homogeneity of natural vegetation. Results for the same measure with a sliding window of 9x9 pixels are shown in Figure 5C and D. In these maps the edges are blurred; however, areas with higher values for this measure (shown in more intense red tones) are found around natural vegetation areas, possibly indicating areas of greater risk to their integrity and resilience.

For the H_{e}/H_{max} measure, there is a significant difference between the maps generated with a window of 3x3 pixels (Figure 5E and F) and those generated with a window of 9x9 pixels (Figure 5G and H). Due to the reduced number of pixels in the smaller window, there is less diversity of information, and the interval between minimum and maximum values is high. As with the H_{e} measure, the larger window (9x9 pixels) generated a more extensive range between the minimum and maximum values, highlighting possible areas that represent greater risks to natural vegetation.

Figure 6A-H shows some results obtained by CompPlex Janus for the SDL and LMC measures. For a window of 3x3 pixels (Figure 6A, B, E and F), these measures allow the identification of localized areas with higher values within natural vegetation regions. This effect is best observed in the images generated with windows of 9x9 pixels (Figure 6C, D, G, and H), where areas of natural vegetation have a more variable value gradient and areas with other land uses are ‘homogenized’ with low values for both measures. In general, the SDL and LMC measures assign higher values to natural vegetation, consistent with the Landscape Ecology assumption that more complex patterns are associated with intermediary spatial heterogeneity.

## 3. Measures based on information entropy applied to analyze climatic time series

Information entropy measures are also useful to verify complexity (in the sense of variability) of time series, as shown by Piqueira and Mattos [9]. To exemplify how these measures can be utilized for this purpose, here we show an application of H_{e}/H_{max}, SDL, and LMC measures for a time series corresponding to the maximum daily temperatures that occurred in each January from 1980 to 2017 in the municipality of São Carlos (São Paulo State, Brazil).

To calculate the measures, the daily maximum temperature data for every January of the entire time series were used to define the quartiles. Then, for the January data of each year, we counted the number of days that fell within each quartile, which allowed us to calculate the probability *p*_{i} for each interval. The system extension (*N*) corresponded to the number of quartiles that contained at least one data point.
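The quartile-based procedure can be sketched as follows. The data below are invented toy values, not the São Carlos series, and several details are assumptions of this illustration: the function name `yearly_measures`, the default quartile method of Python's `statistics.quantiles`, the assignment of a value equal to a cut point to the upper interval, base-2 logarithms, and the assumed forms SDL = V(1 - V) and LMC = V·D.

```python
from bisect import bisect_right
from math import log2
from statistics import quantiles


def yearly_measures(series_by_year):
    """Return {year: (H_e/H_max, SDL, LMC)} from daily values per year."""
    pooled = [t for temps in series_by_year.values() for t in temps]
    cuts = quantiles(pooled, n=4)         # three cut points of the whole series
    measures = {}
    for year, temps in series_by_year.items():
        counts = [0, 0, 0, 0]
        for t in temps:
            counts[bisect_right(cuts, t)] += 1   # index of the quartile interval
        probs = [c / len(temps) for c in counts if c > 0]
        n_states = len(probs)             # N: quartiles with at least one day
        if n_states < 2:
            measures[year] = (0.0, 0.0, 0.0)
            continue
        v = -sum(p * log2(p) for p in probs) / log2(n_states)
        d = sum((p - 1 / n_states) ** 2 for p in probs)
        measures[year] = (v, v * (1 - v), v * d)
    return measures


# Toy daily maxima (four "days" per year), purely illustrative.
toy_series = {
    2001: [30, 30, 30, 30],
    2002: [26, 28, 30, 32],
    2003: [25, 25, 25, 33],
}
toy_measures = yearly_measures(toy_series)
```

In the toy series, the year whose days all fall in one quartile is treated as fully ordered, while the year with an uneven spread across quartiles receives intermediate values.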

With these values, it was possible to calculate the H_{e}/H_{max}, SDL, and LMC measures for the January data of each year, according to the equations previously presented. The results obtained are shown in Table 2. To compare the performance of the measures and to try to identify patterns in the results, we sorted the values of each measure in decreasing order and organized them into four classes. Table 3 shows pairwise comparisons between the measures to verify whether or not a given year occupies the same class and the same position for both measures. These comparisons show that H_{e}/H_{max} and SDL have the same behavior, while the behavior of the LMC measure differs from them, revealing the differences in the relations between the order and disorder terms present in the equations of these measures.

## 4. Conclusions

In addition to exploring the results obtained from the case studies presented here, we intended to show why and how measures based on information entropy can contribute to understanding the complexity of landscape patterns and processes. As shown in the first example, H_{e}/H_{max}, SDL, and LMC are complexity measures that represent useful tools for evaluating landscape patterns. H_{e}/H_{max} allows ordered and disordered targets to be identified, while SDL and LMC are related to the intermediary heterogeneity patterns presented by landscape patches. Compared with the spectral decomposition methods proposed by Mustard and Sunshine [13], the landscape metrics used here prove to be quite efficient in comparing the complexity of the patterns of different patches as well as their variation over the entire landscape. Based on this example, the application of such metrics is proposed for multi-temporal studies of landscape dynamics, for evaluating the resilience and the degree of degradation of different fragments, and for estimating the degree of anthropic impact due to changes in land use, among other applications.

In the second example, we highlight the use of these measures to evaluate complexity in climatic time series. Our future studies involve the application of these measures as alternatives to classical statistical analysis, using them to assess the influences of both natural processes, such as El Niño and La Niña, and anthropic processes, such as the increase in temperature and in the frequency of extreme weather events, including severe droughts and heavier rains.