## 1. Introduction

In recent years, hierarchical systems with cascade structure have attracted the attention of many scientists. As a form of organization of complex systems, hierarchy is frequently observed in the natural world and in social institutions [1]. A fractal can be treated as a self‐similar hierarchy because a fractal object comprises many levels, which are systematically arranged according to scaling laws [2, 3–5]. Fractal phenomena can be described with power laws, and a power law can be decomposed into two exponential laws by means of hierarchical structure. Generally speaking, it is difficult to solve an equation based on power laws or spatial networks because of dimensional problems, but it is easy to deal with the problem using exponential models or hierarchies. Using a self‐similar hierarchy, we can transform fractal scaling into hierarchical scaling with characteristic scales, so that many complex problems can be solved in a simple way. If we explore fractal systems such as a system of cities by means of hierarchy, we can use a pair of exponential laws to replace a power law, and the analytical process can be significantly simplified [6, 7]. A fractal is a special case of hierarchical scaling. Hierarchy suggests a new way of understanding fractal organization and exploring complex systems.

In scientific research, three factors increase the difficulty of mathematical modeling: *spatial dimension*, *time lag* (response delay), and *interaction*. Economics is relatively simple because economists do not usually give much consideration to the spatial dimension of economic systems [8]. However, all the difficult problems related to mathematical modeling, especially the spatial dimension, are encountered by geographers. If the spatial dimension is avoided, geography is not real geography. Geographers often study spatial structure by means of hierarchy. One discovery is that hierarchy and network structure represent two different sides of the same coin [2]. Two typical hierarchy theories have been developed in human geography. One is central place theory [9, 10], and the other is the theory of rank‐size distributions [11, 12]. Both theories are related to fractal ideas [2, 13–16]. Fractal theory, scaling concepts, and the related methods have become much more important in geographical analysis such as urban studies [3]. As Batty (2008) once observed [17], “*an integrated theory of how cities evolve, linking urban economics and transportation behavior to developments in network science, allometric growth, and fractal geometry, is being slowly developed.*” In fact, fractals, allometry, and complex networks can be associated with one another by virtue of hierarchical scaling.

Hierarchical scaling suggests a new perspective for examining the simple rules hidden behind complex systems. Many types of physical and social phenomena satisfy the well‐known rank‐size distribution and thus follow Zipf’s law [7, 11, 18]. Today, Zipf’s law is used to describe discrete power law probability distributions in various natural and human systems [3, 19]. However, despite a large amount of research, the underlying rationale of the Zipf distribution is not yet very clear. On the other hand, many types of data associated with Zipf’s law in the physical and social sciences can be arranged in order to form a hierarchy with cascade structure. There is much evidence that the Zipf distribution is inherently related to self‐similar hierarchical structure, but this relationship has yet to be fully unraveled. The Zipf distribution is associated with fractal structure and bears an analogy with 1/f fluctuation [6]. Fractals, 1/f noise, and the Zipf distribution are ubiquitous empirical patterns observed in nature [19]. This article provides scientists with a new way of looking at the relations between these ubiquitous empirical patterns and the complex evolution processes in physical and social systems, and thus of understanding how nature works.

Scientific research includes two elements of methodology, namely description and understanding. Science should proceed first by describing how a system works and then by understanding why [20]. The description process relies on mathematics and measurement, while the understanding process relies on observation, experience, or even artificially constructed experiments [21]. This work is devoted to exploring fractal modeling and spatial analysis based on hierarchy with cascade structure. First of all, we try to describe and understand hierarchy itself; later, we try to use hierarchical scaling to describe and understand complex systems. Two research methods are utilized in this work. One is the *logic analysis method*, including induction and deduction, and the other is the *empirical analysis method*, which fits the mathematical models to observational data. The induction method is based on various regular fractals such as the Cantor set, Koch snowflake curve, Vicsek box, and Sierpinski gasket, while the deduction method is mainly based on mathematical derivation. As for empirical analysis, systems of cities and the population size distribution of languages are taken as examples. The success of the natural sciences lies in their great emphasis on the interplay between quantifiable data and models [22]. The rest of this chapter is organized as follows. In Section 2, a set of hierarchical models of fractals, including monofractals and multifractals, is presented. In Section 3, case studies are made by means of the German system of human settlements and the world population sizes of different languages. In Section 4, several questions are discussed, and the hierarchical‐scaling modeling is generalized. Finally, the discussion is concluded by summarizing the main points of this work.

## 2. Models

### 2.1. Three approaches to estimating fractal dimension

A regular fractal is a typical hierarchy with cascade structure. Let’s take the well‐known Cantor set as an example to show how to describe the hierarchical structure and how to calculate its fractal dimension (**Figure 1**). We can use two measurements, the length (*L*) and number (*N*) of fractal copies in the *m*th class, to characterize the self‐similar hierarchy. Thus, we have two exponential functions such as

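$$N_m = N_1 r_n^{m-1} = N_0 e^{\omega m}, \tag{1}$$

$$L_m = L_1 r_l^{1-m} = L_0 e^{-\psi m}, \tag{2}$$

where *m* denotes the ordinal number of class (*m* = 1, 2, …), *N*_{m} is the number of the fractal copies of a given length, *L*_{m} is the length of the fractal copies in the *m*th class, *N*_{1} and *L*_{1} are the number and length of the initiator (*N*_{1} = 1), respectively, *r*_{n} and *r*_{l} are the **number ratio** and **length ratio** of fractal copies, *N*_{0} = *N*_{1}/*r*_{n}, *L*_{0} = *L*_{1}*r*_{l}, *ω* = ln(*r*_{n}), and *ψ* = ln(*r*_{l}). From Eqs. (1) and (2), it follows the common ratios of number and length, that is

$$r_n = \frac{N_{m+1}}{N_m} = e^{\omega}, \tag{3}$$

$$r_l = \frac{L_m}{L_{m+1}} = e^{\psi}. \tag{4}$$

According to the definitions of *ω* and *ψ*, the logarithms of Eqs. (3) and (4) are

$$\omega = \ln r_n = \ln\frac{N_{m+1}}{N_m}, \tag{5}$$

$$\psi = \ln r_l = \ln\frac{L_m}{L_{m+1}}. \tag{6}$$

From Eqs. (1) and (2), we can derive a power law in the form

$$N_m = k L_m^{-D}, \tag{7}$$

in which *k* = *N*_{1}*L*_{1}^{D} is the proportionality coefficient, and *D* = ln(*r*_{n})/ln(*r*_{l}) is the fractal dimension of the Cantor set (*k* = 1). Thus, three formulae of fractal dimension estimation can be obtained. Based on the power law, the fractal dimension can be expressed (for *k* = 1 and *m* > 1) as

$$D = -\frac{\ln N_m}{\ln L_m}. \tag{8}$$

Based on the exponential models, the fractal dimension is

$$D = \frac{\omega}{\psi}. \tag{9}$$

Based on the common ratios, the fractal dimension is

$$D = \frac{\ln r_n}{\ln r_l}. \tag{10}$$

In theory, Eqs. (8)–(10) are equivalent to one another. Actually, by recurrence, Eq. (7) can be rewritten as *N*_{m+1} = *L*_{m+1}^{−D}, and thus we have

$$\frac{N_{m+1}}{N_m} = \left(\frac{L_m}{L_{m+1}}\right)^{D}. \tag{11}$$

Taking logarithms on both sides of Eq. (11) yields

$$D = \frac{\ln(N_{m+1}/N_m)}{\ln(L_m/L_{m+1})} = \frac{\ln r_n}{\ln r_l} = \frac{\omega}{\psi}. \tag{12}$$

For the Cantor set, *N*_{m} = 2^{m−1}, *L*_{m} = 1/3^{m−1}, *r*_{n} = *N*_{m+1}/*N*_{m} = 2, *r*_{l} = *L*_{m}/*L*_{m+1} = 3, *ω* = ln(2), *ψ* = ln(3), thus we have

$$D = -\frac{\ln N_m}{\ln L_m} = \frac{\omega}{\psi} = \frac{\ln r_n}{\ln r_l} = \frac{\ln 2}{\ln 3} \approx 0.631.$$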

This suggests that, for a regular fractal hierarchy, the three approaches lead to the same result. The fractal dimension can be computed by using exponential functions, a power function, or common ratios, and all these values are equal to one another. In practice, however, there are subtle differences between the results from different approaches because of random noise in observational data. These differences are usually not significant and can thus be neglected.
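The equivalence of the three approaches can be checked numerically. The following is a minimal sketch, assuming the middle‐third Cantor set with *N*_{m} = 2^{m−1} and *L*_{m} = 3^{1−m}; the function names (`ols_slope`, `cantor_hierarchy`, `three_dimension_estimates`) are illustrative and not taken from the original work. The power‐law estimate uses a log‐log regression, the exponential estimate uses two semi‐log regressions, and the ratio estimate uses the mean common ratios.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def cantor_hierarchy(levels):
    """(N_m, L_m) for m = 1..levels: N_m = 2**(m-1), L_m = 3**(1-m)."""
    return [(2.0 ** (m - 1), 3.0 ** (1 - m)) for m in range(1, levels + 1)]


def three_dimension_estimates(hierarchy):
    """Return D from the power law, the exponential laws, and the common ratios."""
    ms = list(range(1, len(hierarchy) + 1))
    ln_n = [math.log(n) for n, _ in hierarchy]
    ln_l = [math.log(length) for _, length in hierarchy]

    # Power law N_m = k * L_m**(-D): D is minus the log-log slope.
    d_power = -ols_slope(ln_l, ln_n)

    # Exponential laws: omega and psi are the semi-log slopes against m.
    omega = ols_slope(ms, ln_n)
    psi = -ols_slope(ms, ln_l)
    d_exponential = omega / psi

    # Common ratios r_n = N_{m+1}/N_m and r_l = L_m/L_{m+1}, averaged over levels.
    pairs = list(zip(hierarchy, hierarchy[1:]))
    r_n = sum(nxt[0] / cur[0] for cur, nxt in pairs) / len(pairs)
    r_l = sum(cur[1] / nxt[1] for cur, nxt in pairs) / len(pairs)
    d_ratio = math.log(r_n) / math.log(r_l)
    return d_power, d_exponential, d_ratio


estimates = three_dimension_estimates(cantor_hierarchy(10))
print([round(d, 3) for d in estimates])
```

For this noise‐free hierarchy all three estimates coincide with ln(2)/ln(3); with observational data the three values would generally differ slightly.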

The mathematical description and fractal dimension calculation of the Cantor set can be generalized to other regular fractals such as Koch snowflake and Sierpinski gasket or even to the route from bifurcation to chaos. As a simple fractal, the Cantor set fails to follow the rank‐size law. However, if we substitute the multifractal structure for the monofractal structure, the multiscaling Cantor set will comply with the rank‐size rule empirically.

### 2.2. Multifractal characterization of hierarchies

Monofractals (unifractals) represent the scale‐free systems of homogeneity, while multifractals represent the scale‐free systems of heterogeneity. In fact, as Stanley and Meakin (1988) [23] pointed out, “*multifractal scaling provides a quantitative description of a broad range of heterogeneous phenomena.*” In geography, multifractal geometry is a powerful tool for describing spatial heterogeneity. A multifractal hierarchy of the Cantor set can be organized as follows. At the first level, the initiator is still a straight line segment of unit length, that is, *S*_{1} = *L*_{1} = 1. At the second level, the generator includes two straight line segments of different lengths. The length of one segment is *a*, and the other segment’s length is *b*. Let *a* = 3/8, *b* = 2/3−*a* = 7/24. The summation of the two line segments’ lengths is 2/3, that is, *S*_{2} = *a*+*b* = 2/3, and the average length of the two segments is *L*_{2} = *S*_{2}/2 = 1/3. At the third level, there are four line segments, and the lengths are *a*^{2}, *ab*, *ba*, and *b*^{2}, respectively. The total length of the four line segments is 4/9, namely *S*_{3} = (*a*+*b*)^{2} = (2/3)^{2}, and the average length is *L*_{3} = *S*_{3}/4 = 1/3^{2}. Generally speaking, the *m*th level consists of 2^{m−1} line segments with lengths of *a*^{m−1}, *a*^{m−2}*b*, *a*^{m−3}*b*^{2}, …, *a*^{2}*b*^{m−3}, *ab*^{m−2}, and *b*^{m−1}, respectively. The length summation is *S*_{m} = (*a*+*b*)^{m−1} = (2/3)^{m−1}, so the average length is

$$L_m = \frac{S_m}{N_m} = \frac{(a+b)^{m-1}}{2^{m-1}} = \frac{1}{3^{m-1}}, \tag{13}$$

where

$$N_m = 2^{m-1}. \tag{14}$$

From Eqs. (13) and (14), it follows a scaling relation as below:

$$N_m = L_m^{-D_0}, \tag{15}$$

which is identical in form to Eq. (7), and the capacity dimension *D*_{0} = ln(2)/ln(3) ≈ 0.631 is equal to the fractal dimension of the monofractal Cantor set (**Figure 1**).

Two sets of parameters are always employed to characterize a multifractal system. One is the set of *global* parameters, and the other is the set of *local* parameters. The global parameters include the generalized correlation dimension and the mass exponent; the local parameters comprise the Lipschitz‐Hölder exponent and the fractal dimension of the set supporting this exponent. For the two‐scale Cantor set, the mass exponent is

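$$\tau(q) = -\frac{\ln\left[p^{q} + (1-p)^{q}\right]}{\ln 3}, \tag{16}$$

where *q* denotes the moment order (−∞ < *q* < ∞), *τ*(*q*) refers to the mass exponent, and *p* is a probability measurement. Taking the derivative of Eq. (16) with respect to *q* yields the Lipschitz‐Hölder exponent of singularity in the form

$$\alpha(q) = \frac{\mathrm{d}\tau(q)}{\mathrm{d}q} = -\frac{p^{q}\ln p + (1-p)^{q}\ln(1-p)}{\left[p^{q} + (1-p)^{q}\right]\ln 3}, \tag{17}$$

in which *α*(*q*) refers to the singularity exponent. Utilizing the Legendre transform, we can derive the fractal dimension of the subsets supporting the exponent of singularity such as

$$f(\alpha) = q\,\alpha(q) - \tau(q), \tag{18}$$

where *f*(*α*) denotes the local dimension of the multifractal set. Furthermore, the general fractal dimension spectrum can be given in the following form:

$$D_q = \begin{cases} -\dfrac{p\ln p + (1-p)\ln(1-p)}{\ln 3}, & q = 1, \\[6pt] \dfrac{\tau(q)}{q-1} = -\dfrac{\ln\left[p^{q} + (1-p)^{q}\right]}{(q-1)\ln 3}, & q \neq 1, \end{cases} \tag{19}$$

where *D*_{q} denotes the generalized correlation dimension. If the moment order *q* ≠ 1, the general dimension can also be expressed as

$$D_q = \frac{1}{q-1}\left[q\,\alpha(q) - f(\alpha)\right]. \tag{20}$$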

Using the above equations, we can describe multifractal Cantor set. For example, if the length of one line segment in the generator is *a* = 3/8 as assumed, then the length of another line segment is *b* = 7/24. Accordingly, the probability measures are *p* = *a*/(2/3) = 9/16 and 1−*p* = 7/16. By means of these formulae, the multifractal dimension spectra and the related curves can be displayed in **Figures 2** and **3**. The capacity dimension is *D*_{0} ≈ 0.631, the information dimension is *D*_{1} ≈ 0.624, and the correlation dimension is *D*_{2} ≈ 0.617. Substituting ln(2) for ln(3) in the equations shown above, we can use the multifractal models of Cantor set to describe multiscaling rank‐size distribution of cities [6, 24].
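The three dimension values quoted above can be reproduced with a few lines of code. This is a hedged sketch that assumes the mass exponent of the two‐scale Cantor set takes the standard binomial form *τ*(*q*) = −ln[*p*^{q} + (1−*p*)^{q}]/ln(3), with *p* = 9/16 as in the text; the function names and the `scale` parameter are illustrative choices, not part of the original work.

```python
import math


def tau(q, p, scale=3.0):
    """Mass exponent of a two-scale (binomial) Cantor set."""
    return -math.log(p ** q + (1 - p) ** q) / math.log(scale)


def alpha(q, p, scale=3.0):
    """Singularity (Lipschitz-Holder) exponent, the derivative of tau with respect to q."""
    w1, w2 = p ** q, (1 - p) ** q
    numerator = w1 * math.log(p) + w2 * math.log(1 - p)
    return -numerator / ((w1 + w2) * math.log(scale))


def f_alpha(q, p, scale=3.0):
    """Local dimension from the Legendre transform f = q * alpha - tau."""
    return q * alpha(q, p, scale) - tau(q, p, scale)


def generalized_dimension(q, p, scale=3.0):
    """Generalized dimension D_q; the case q = 1 is handled as the limit alpha(1)."""
    if abs(q - 1.0) < 1e-12:
        return alpha(1.0, p, scale)
    return tau(q, p, scale) / (q - 1.0)


P = 9 / 16  # probability measure of the longer segment, a / (a + b)
for order in (0, 1, 2):
    print(order, round(generalized_dimension(order, P), 3))
```

Passing `scale=2.0` corresponds to substituting ln(2) for ln(3), as suggested for the multiscaling rank‐size distribution of cities.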

### 2.3. Hierarchical scaling in social systems

Fractal hierarchical scaling can be generalized to model general hierarchical systems with cascade structure. Suppose the elements (e.g., cities) in a large‐scale system (e.g., a regional system) are divided into *M* levels in the top‐down order. We can describe the hierarchical structure using a set of exponential functions as follows:

$$N_m = N_1 r_n^{m-1}, \tag{21}$$

$$P_m = P_1 r_p^{1-m}, \tag{22}$$

$$A_m = A_1 r_a^{1-m}, \tag{23}$$

where *m* denotes the top‐down order (*m* = 1, 2, …, *M*), *N*_{m} represents the element number in a given order, *r*_{n} = *N*_{m+1}/*N*_{m} is the **number ratio**, and *N*_{1} is the number of the top‐order elements (generally speaking, *N*_{1} = 1); *P*_{m} represents the mean size of order *m*, *r*_{p} = *P*_{m}/*P*_{m+1} is the element **size ratio** of adjacent levels, and *P*_{1} is the mean size of the first‐order elements, that is, the largest ones; *A*_{m} is the average area of order *m*, *r*_{a} = *A*_{m}/*A*_{m+1} is the **area ratio**, and *A*_{1} is the area of the first order. Rearranging Eq. (22) yields *r*_{p}^{m−1} = *P*_{1}/*P*_{m}; taking the logarithm to the base *r*_{n} of this equation and substituting the result into Eq. (21) yields a power function as

$$N_m = \mu P_m^{-D}, \tag{24}$$

where *μ* = *N*_{1}*P*_{1}^{D}, *D* = ln(*r*_{n})/ln(*r*_{p}). Eq. (24) is hereafter referred to as the “size‐number law,” and *D* is just the fractal dimension of self‐similar hierarchies measured by city population size. Similarly, from Eqs. (21) and (23), it follows

$$N_m = \eta A_m^{-d}, \tag{25}$$

in which *η* = *N*_{1}*A*_{1}^{d}, *d* = ln(*r*_{n})/ln(*r*_{a}). Eq. (25) is what is called the “area‐number law,” and *d* is the fractal dimension of self‐similar hierarchies measured by urban area. Finally, we can derive the hierarchical allometric‐scaling relationship between area and size from Eqs. (22) and (23), or from Eqs. (24) and (25), and the result is

$$A_m = a P_m^{b}, \tag{26}$$

where *a* = *A*_{1}*P*_{1}^{−b}, *b* = ln(*r*_{a})/ln(*r*_{p}). This is just the generalized allometric growth law on the area‐size relations. Further, a three‐parameter Zipf‐type model on size distribution can be derived from Eqs. (21) and (22) such as

$$P_k = C\,(k + \varsigma)^{-\alpha}, \tag{27}$$

where *k* is the rank among all elements in a given system in decreasing order of size, and *P*_{k} is the size of the *k*th element. As the parameters, we have the constant of proportionality *C* = *P*_{1}[*r*_{n}/(*r*_{n}−1)]^{1/D}, the small parameter *ς* = 1/(*r*_{n}−1), and the power *α* = 1/*D* = ln(*r*_{p})/ln(*r*_{n}). Here *P*_{1} is the size of the largest element, and *α* proved to be the reciprocal of the fractal dimension *D* of city‐size distribution or urban hierarchies, that is, *α* = 1/*D* [7]. By analogy, we can derive a three‐parameter Zipf‐type model on area distribution from Eqs. (21) and (23), that is

$$A_k = G\,(k + \zeta)^{-\beta}, \tag{28}$$

where *G*, *ζ*, and *β* are parameters. In theory, *β* = 1/*d*. From Eqs. (27) and (28), it follows

$$A_k = G\left[\left(\frac{P_k}{C}\right)^{-1/\alpha} + \zeta - \varsigma\right]^{-\beta}, \tag{29}$$

which suggests an approximate allometric relation. If *ζ* = *ς*, then we can derive a cross‐sectional allometry relation between size and area from Eqs. (27) and (28) as below

$$A_k = a P_k^{b}, \tag{30}$$

where *a* = *A*_{1}*P*_{1}^{−b}, *b* = *β*/*α*. Eq. (30) is mathematically equivalent to Eq. (26), that is, the rank‐size allometric scaling is equivalent to hierarchical allometric scaling in theory. Further, if *ζ* = *ς* = 0, then Eqs. (27) and (28) will be reduced to the common two‐parameter Zipf’s models [3]. The fractal models (principal scaling laws), allometric model (the law of allometric growth), and rank‐size distribution model (Zipf’s law) are three basic scaling laws of hierarchical systems such as cities, and all the scaling relations can be derived from the hierarchical models expressed by exponential functions.
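The chain from exponential laws to power laws can be illustrated with a small numerical sketch. It assumes a noise‐free hierarchy with *N*_{1} = 1 and uses the ratios *r*_{n} = 2, *r*_{p} = 1.942, and *r*_{a} = 1.927 purely as illustrative inputs; all function names are hypothetical. The three‐parameter Zipf form is written here as *C*(*k* + *ς*)^{−1/D}, which is the sign convention consistent with *C* = *P*_{1}[*r*_{n}/(*r*_{n}−1)]^{1/D} and *ς* = 1/(*r*_{n}−1) when *k* is counted cumulatively down to the last element of a level; that reading of *k* is an assumption.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def build_hierarchy(levels, r_n, r_p, r_a, n1=1.0, p1=1.0, a1=1.0):
    """Exponential laws: (N_m, P_m, A_m) for m = 1..levels."""
    return [
        (n1 * r_n ** (m - 1), p1 * r_p ** (1 - m), a1 * r_a ** (1 - m))
        for m in range(1, levels + 1)
    ]


def scaling_exponents(hierarchy):
    """Return (D, d, b) from log-log regressions across the levels."""
    ln_n = [math.log(n) for n, _, _ in hierarchy]
    ln_p = [math.log(p) for _, p, _ in hierarchy]
    ln_a = [math.log(a) for _, _, a in hierarchy]
    size_dim = -ols_slope(ln_p, ln_n)   # size-number law
    area_dim = -ols_slope(ln_a, ln_n)   # area-number law
    allometric = ols_slope(ln_p, ln_a)  # area-size allometry
    return size_dim, area_dim, allometric


def zipf_size(k, p1, r_n, r_p):
    """Three-parameter Zipf form, assuming N_1 = 1 and cumulative rank k."""
    dim = math.log(r_n) / math.log(r_p)
    coeff = p1 * (r_n / (r_n - 1.0)) ** (1.0 / dim)
    shift = 1.0 / (r_n - 1.0)
    return coeff * (k + shift) ** (-1.0 / dim)


R_N, R_P, R_A = 2.0, 1.942, 1.927  # illustrative ratios
hierarchy = build_hierarchy(8, R_N, R_P, R_A)
D, d, b = scaling_exponents(hierarchy)
print(round(D, 3), round(d, 3), round(b, 3), round(D / d, 3))
```

In this idealized setting the regression exponents equal ln(*r*_{n})/ln(*r*_{p}), ln(*r*_{n})/ln(*r*_{a}), and ln(*r*_{a})/ln(*r*_{p}), and the allometric exponent equals *D*/*d*.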

## 3. Empirical analysis

### 3.1. A case of Germany “natural cities”

#### 3.1.1. Material and data

First of all, the hierarchy of German cities is employed to illustrate the hierarchical‐scaling method. Recently, Bin Jiang and his coworkers proposed the concept of the “natural city” and developed a novel approach to measuring objective city sizes based on street nodes or blocks, so that urban boundaries can be naturally identified [18, 25]. The street nodes are defined as street intersections and ends, while the naturally defined urban boundaries constitute the region of what are called **natural cities**. The street nodes are significantly correlated with the population of cities as well as with city areal extents. The city data are extracted from the massive volunteered geographic information of the OpenStreetMap databases through data‐intensive computing processes, and three data sets on European cities, covering **France**, **Germany**, and the **United Kingdom** (UK), have been obtained. Among these data sets, the set for Germany is the largest one, encompassing 5160 natural cities. Therefore, German cities are taken as an example for empirical analysis. In data processing, the area variable is divided by 10,000 for comparability.

#### 3.1.2. Method and results

The analytical method is based on the theoretical models shown above. For the natural cities, the population size measurement (*P*) should be replaced by the amount of blocks in the physical areal extent (*A*), which can be treated as a new size measurement of cities. It is easy to use German cities to construct a hierarchy to illustrate the equivalence relation between the rank‐size law and the hierarchical scaling. Empirically, the 5160 German cities and towns follow the rank‐size rule and we have

where *k* is the rank of natural cities, and *P*_{k} denotes the city size defined with urban blocks in objective boundaries. The symbol “^” implies “estimated value,” “calculated value,” or “predicted value.” The goodness of fit is about *R*^{2} = 0.993, and the scaling exponent is around *q* = 1.051, as shown in the equation (**Figure 4**). The Zipf distribution suggests a hierarchical scaling of the urban system.

The models of fractals and allometry can be built for the German hierarchy of cities as follows. Taking the number ratio *r*_{n} = 2, we can group the cities into different classes according to the 2^{n} rule [7, 26]. The results, including city number (*N*_{m}), total amount of urban blocks (*S*_{m}), average size by blocks (*P*_{m}), total area (*T*_{m}), and average area (*A*_{m}) in each class, are listed in **Table 1**. In a hierarchy, two classes, that is, the top class and the bottom class, are always special and can be considered exceptional values (**Figure 5**). In fact, power law relations always break down if the scale is too large or too small [19]. Thus, a scaling range can be found in a log‐log plot of fractal analysis on cities [3]. Two hierarchical‐scaling relations can be tested by least‐squares calculation. For the common ratio *r*_{n} = 2, the hierarchical‐scaling relation between city size and number is

_{n}**Source**: The original data come from Jiang (http://arxiv.org/find/all/). ^{*}**Note**: The last class of each hierarchy is a *lame‐duck class* termed by Davis (1978) [26].

The goodness of fit is about *R*^{2} = 0.996, and the fractal dimension of the self‐similar hierarchy is *D* ≈ 1.025. The average size ratio within the scaling range is about *r*_{p} = 1.942, which is very close to *r*_{n} = 2. Thus, another fractal dimension estimation is *D* = ln(*r*_{n})/ln(*r*_{p}) ≈ 1.045. The average size follows the exponential law, that is, *P*_{m} = 133869.061*exp(−0.674*m*). So, the third fractal dimension estimation is *D* = *ω*/*ψ* ≈ 0.693/0.674 ≈ 1.028. All these results are based on the scaling range rather than the whole set of classes. Similarly, the relation between urban area and city number is as below

The goodness of fit is about *R*^{2} = 0.998, and the fractal dimension of the self‐similar hierarchy is *d* ≈ 1.060. The average area ratio within the scaling range is about *r*_{a} = 1.927. So, another fractal dimension is estimated as *d* = ln(*r*_{n})/ln(*r*_{a}) ≈ 1.056. The average area complies with the exponential law, namely *A*_{m} = 157737.532*exp(−0.653*m*). Thereby, the third fractal dimension is estimated as *d* = *ω*/*ψ* ≈ 0.693/0.653 ≈ 1.061. Further, by means of the datasets of urban size and area, an allometric‐scaling model can be built as follows:

The goodness of fit is around *R*^{2} = 0.999, and the scaling exponent *b* ≈ 0.967 (**Figure 6**). Another estimation of the allometric exponent is *b* ≈ 1.025/1.060 ≈ 0.967. The two results are close to one another. The natural cities of Germany lend further support to the equivalent relationship between the rank‐size distribution and the self‐similar hierarchy.
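The alternative estimates reported for the German hierarchy follow from simple arithmetic on the quoted ratios and decay rates. The sketch below recomputes them; it uses only the rounded values printed in the text (1.942, 1.927, 0.674, 0.653, 1.025, 1.060), so small deviations in the third decimal place are expected, and the helper names are illustrative.

```python
import math

R_N = 2.0  # number ratio fixed by the 2^n rule


def dimension_from_ratio(r_n, ratio):
    """Common-ratio estimate: ln(r_n) / ln(ratio)."""
    return math.log(r_n) / math.log(ratio)


def dimension_from_decay(r_n, decay_rate):
    """Exponential-law estimate: omega / psi with omega = ln(r_n)."""
    return math.log(r_n) / decay_rate


size_estimates = {
    "regression": 1.025,                             # reported value
    "ratio": dimension_from_ratio(R_N, 1.942),       # from r_p
    "exponential": dimension_from_decay(R_N, 0.674)  # from psi of P_m
}
area_estimates = {
    "regression": 1.060,                             # reported value
    "ratio": dimension_from_ratio(R_N, 1.927),       # from r_a
    "exponential": dimension_from_decay(R_N, 0.653)  # from psi of A_m
}
allometric_exponent = size_estimates["regression"] / area_estimates["regression"]

for label, values in (("size", size_estimates), ("area", area_estimates)):
    print(label, {key: round(val, 3) for key, val in values.items()})
print("b =", round(allometric_exponent, 3))
```

The recomputed values agree with the reported ones to within rounding of the inputs, which is the sense in which the three approaches are treated as equivalent in practice.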

### 3.2. A case of language hierarchy in the world

The hierarchical scaling can be used to model the rank‐size distribution of languages by population. Where population size is concerned, there are 107 top languages in the world such as Chinese, English, and Spanish. In data processing, the population size is rescaled by dividing it with 1,000,000 for simplicity. Gleich et al. (2000) [27] gave a list of the 15 languages by number of native speakers (**Table 2**). The rank‐size model of the 107 languages is as below:

**Source:** Ref. [27]. **Note**: If we use the lower limits of population size *s*_{1} = 520, *s*_{2} = 260, *s*_{3} = 130, and *s*_{4} = 65 to classify the languages in the table, the corresponding number of languages is *f*_{1} = 1, *f*_{2} = 2, *f*_{3} = 4, and *f*_{4} = 8, and the scaling exponent is just 1.

Unit: million.

where *k* refers to rank and *P*_{k} to the population speaking the language ranked *k*. The goodness of fit is about *R*^{2} = 0.986 (**Figure 7**). The fractal dimension is estimated as *D* ≈ 0.949.

Using hierarchical scaling, we can estimate the fractal dimension of the size distribution of languages in a better way. According to the 2^{n} rule, the 107 languages fall into eight classes by size (**Table 3**). The top level contains one language, Chinese, with a total Chinese‐speaking population of 885 million; the second level contains two languages, English and Spanish, with a total population of 654 million; and so on. The number ratio is defined as *r*_{n} = 2. The corresponding size ratio is around *r*_{p} = 2.025. Thus, the fractal dimension can be estimated as *D* = ln(*r*_{n})/ln(*r*_{p}) = ln(2)/ln(2.025) ≈ 0.983, which is close to the reciprocal of the Zipf exponent, 0.949. A regression analysis yields a hierarchical‐scaling relation between language number, *N*_{m}, and average population size, *S*_{m}, such as

_{m}**Note:** The source of the original data: http://www.nationmaster.com/. The number ratio is 2. The first class is exceptional, and the last class is a lame‐duck class, which is defined by Davis (1978) [26]. By the way, there is subtle difference of English population between **Tables 2** and **3**, but this error does not influence the conclusions.

The squared correlation coefficient is *R*^{2} = 0.997, and the fractal dimension is about *D* = 1.012, which is close to the above‐estimated value, 0.983 (**Figure 8**). The results suggest that the languages by population and cities by population follow the same hierarchical‐scaling laws.
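The grouping step behind the 2^{n} rule can be sketched in code. This example is an assumption‐laden illustration: it ranks the elements by size and assigns 1, 2, 4, … consecutive ranks to successive classes, which may differ in detail from the classification used for **Table 3**. It runs on synthetic Zipf‐distributed sizes (200 values of 1000/*k*), not on the language data, and it excludes the first class and the incomplete last class from the regression, mirroring the treatment of exceptional and lame‐duck classes.

```python
import math


def two_n_classes(sizes):
    """Group ranked sizes into classes of 1, 2, 4, ... elements.

    Returns a list of (count, mean_size, is_complete) tuples, top class first.
    """
    ranked = sorted(sizes, reverse=True)
    classes, start, level = [], 0, 1
    while start < len(ranked):
        width = 2 ** (level - 1)
        members = ranked[start:start + width]
        mean_size = sum(members) / len(members)
        classes.append((len(members), mean_size, len(members) == width))
        start += width
        level += 1
    return classes


def hierarchical_dimension(classes):
    """Estimate D from ln N_m versus ln P_m over the scaling range.

    The first class and any incomplete (lame-duck) class are left out.
    """
    usable = [c for i, c in enumerate(classes) if c[2] and i > 0]
    xs = [math.log(mean) for _, mean, _ in usable]
    ys = [math.log(count) for count, _, _ in usable]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic rank-size data with Zipf exponent 1 (not the language data).
synthetic_sizes = [1000.0 / rank for rank in range(1, 201)]
classes = two_n_classes(synthetic_sizes)
print(len(classes), round(hierarchical_dimension(classes), 3))
```

For a pure Zipf distribution with exponent 1 the estimate falls slightly below 1 for a short hierarchy and approaches 1 as more levels enter the scaling range.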

## 4. Questions and discussion

### 4.1. Hierarchical scaling: a universal law

A complex system is always associated with hierarchy with cascade structure, which indicates self‐similarity. A self‐similar hierarchy, such as cities as systems and systems of cities, can be described with three types of scaling laws: *fractal laws*, the *allometric law*, and *Zipf’s law*. These scaling laws are expressed by Eqs. (24)–(30). Hierarchical scaling is a universal law in nature and human society, and it can be utilized to characterize many phenomena with different levels. Besides fractals, it can be used to depict the routes from bifurcation to chaos [3]. In geomorphology, hierarchical scaling has been employed to describe river systems [28–31]. In geology and seismology, it is employed to describe the cascade structure of earthquake energy distributions [32, 33]. In biology and anatomy, it is used to describe the geometrical morphology of coronary arteries in humans and dogs [34–36]. In urban geography, it is used to describe central place systems and self‐organized networks of cities [3, 7]. In short, where there is a rank‐size distribution, there is cascade structure, and where there is cascade structure, there are hierarchical‐scaling relations.

Next, hierarchical scaling is generalized to describe fractal complementary sets and quasi‐fractal structure, which represent two typical cases of hierarchical description besides fractals. The basic property of fractals is self‐similarity. For convenience of expression and reasoning, the concept of a self‐similarity point should be defined. A fractal construction starts from an initiator by way of a generator. If a fractal’s generator has two parts indicative of two fractal units, the fractal bears two self‐similarity points; if a fractal’s generator has three parts, the fractal possesses three self‐similarity points, and so on. For example, the Cantor set has two self‐similarity points, the Sierpinski gasket has three, the Koch curve has four, and the box growing fractal has five. The number of self‐similarity points is equal to the number ratio, that is, the common ratio of fractal units at different levels. A real fractal bears at least two self‐similarity points, which suggests that a fractal has cross‐similarity in addition to self‐similarity. Self‐similarity indicates dilation symmetry, whereas cross‐similarity implies translation symmetry. A system can be treated as a real fractal system, and characterized by fractal geometry, if and only if it possesses more than one self‐similarity point. A fractal bears both dilation and translation symmetry. Systems with only one self‐similarity point, such as the logarithmic spiral, can be described with hierarchical scaling but cannot be characterized by fractal geometry. In this case, we can supplement fractal analysis by means of hierarchical scaling.

### 4.2. Hierarchies of fractal complementary sets

A fractal set and its complementary set represent two different sides of the same coin. The dimension of a fractal is always a fractional value, coming between the topological dimension and the Euclidean dimension of its embedding space. Certainly, the similarity dimension is of exception and may be greater than its embedding dimension. The dimension of the corresponding complement, however, is equal to the Euclidean dimension of the embedding space. Anyway, the Lebesgue measure of a fractal set is zero; by contrast, the Lebesgue measure of the fractal complement is greater than zero. Let us see the following patterns. **Figure 9(a)** shows the generator (i.e., the second step) of Vicsek’s growing fractal set [37], which bears an analogy with urban growth; **Figure 9(b)** illustrates the complementary set of the fractal set (the second step). It is easy to prove that the dimension of a fractal’s complement is a Euclidean dimension. If we use box‐counting method to measure the complement of a fractal defined in a two‐dimension space, the extreme of the nonempty box number is

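$$C_m = \frac{1}{L_m^{2}} - N_m, \tag{37}$$

where *C*_{m} denotes the nonempty box number for the fractal complement, and the rest of the notation is the same as in Eqs. (1) and (2). Thus, the dimension of the fractal complement set is

$$d = -\lim_{m\to\infty}\frac{\ln C_m}{\ln L_m} = 2, \tag{38}$$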

which is equal to the Euclidean dimension of the embedding space.

However, a fractal set and its complement are of unity of opposites. A thin fractal is characterized with the fractal parameter, and the value of a fractal dimension is determined by both the fractal set and its complement. Without fractal dimension, we will know little about a fractal; without fractal complement, a fractal will degenerate to a Euclidean geometrical object. This suggests that the fractal dimension of a fractal can be inferred by its complement by means of hierarchical scaling. For example, in fractal urban studies, an urban space includes two parts: one is fractal set and the other fractal complement. If we define a fractal city in a two‐dimensional space, the form of urban growth can be represented by a built‐up pattern, which comprises varied patches in a digital map. Further, if we define an urban region using a circular area or a square area, the blank space in the urban region can be treated as a fractal complement of a city. Certainly, a self‐organized system such as cities in the real world is more complicated than the regular fractals in the mathematical world. The differences between fractal cities and real fractals can be reflected by the models and parameters in the computational world.

A set of exponential functions and power laws can be employed to characterize the hierarchical structure of fractal complementary sets. Suppose the number of fractal units in a generator is *u*, and the corresponding number of the complementary units in the generator is *v*. For example, for Cantor set, *u* = 2, *v* = 1 (**Figure 1**); for Koch curve, *u* = 4, *v* = 1; for Sierpinski gasket, *u* = 3, *v* = 1; for Vicsek fractal, *u* = 5, *v* = 4 (**Figure 9**); and so on. Thus, a fractal complement can be described by a pair of exponential function as below:

$$C_m = \frac{v}{u}\, r_n^{m-1}, \tag{39}$$

$$L_m = L_1 r_l^{1-m}, \tag{40}$$

where the parameter *u* = *r*_{n}. That is, the number of fractal units in the generator is equal to the number ratio of the fractal hierarchy. Obviously, Eq. (39) is proportional to Eq. (1), while Eq. (40) is identical to Eq. (2). From Eqs. (39) and (40), it follows

$$C_m = c L_m^{-D}, \tag{41}$$

in which *c* = (*v*/*u*)*L*_{1}^{D} is the proportionality coefficient, and *D* = ln(*r*_{n})/ln(*r*_{l}) is the fractal dimension. This suggests that we can estimate the dimension value of a fractal by means of its complement. For a fractal defined in a two‐dimensional embedding space, the dimension of the complementary set is *d* = 2. However, we can calculate the fractional dimension of the fractal through the scaling exponent of the complement. For instance, the exponent of the hierarchical‐scaling relation between scale and number in different levels of the complement of the Sierpinski gasket is *D* = ln(3)/ln(2) ≈ 1.585, which is just the fractal dimension of the Sierpinski gasket itself. The other fractals can be understood by analogy (**Table 4**).
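The claim that the complement carries the fractal dimension of the fractal itself can be checked with a short sketch. It assumes that the number of complementary units created at level *m* ≥ 2 is *C*_{m} = (*v*/*u*)*u*^{m−1} and that the linear scale is *L*_{m} = *L*_{1}*r*_{l}^{1−m}; the length ratios used below (3 for the Cantor set and the Vicsek fractal, 2 for the Sierpinski gasket) are the standard ones for these constructions, and the function names are illustrative.

```python
import math


def complement_hierarchy(u, v, r_l, levels, l1=1.0):
    """(C_m, L_m) for m = 2..levels of a regular fractal's complement.

    u: fractal units in the generator (the number ratio r_n)
    v: complementary units in the generator
    r_l: length ratio between adjacent levels
    """
    return [
        (v * u ** (m - 2), l1 * r_l ** (1 - m))
        for m in range(2, levels + 1)
    ]


def complement_scaling_exponent(pairs):
    """Minus the least-squares slope of ln C_m against ln L_m."""
    xs = [math.log(length) for _, length in pairs]
    ys = [math.log(count) for count, _ in pairs]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


examples = {
    "Cantor set": (2, 1, 3.0),
    "Sierpinski gasket": (3, 1, 2.0),
    "Vicsek fractal": (5, 4, 3.0),
}
for name, (u, v, r_l) in examples.items():
    exponent = complement_scaling_exponent(complement_hierarchy(u, v, r_l, 10))
    print(name, round(exponent, 3))
```

Because *v* only rescales the proportionality coefficient, the exponent depends on *u* and *r*_{l} alone, which is why the complement and the fractal share the same scaling exponent.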

Studies on fractal complement hierarchies are useful in urban and rural geography. In many cases, special land uses such as vacant land, water areas, and green belts can be attributed to a fractal complement rather than a fractal set [38]. However, this treatment is not obligatory: sometimes the fractal parameters of vacant land, water areas, green belts, and so on are evaluated in their own right. In particular, the spatial state of a settlement may be reversed: a fractal structure evolves into a fractal complementary structure and *vice versa*. The concepts of fractals and fractal complements can therefore be employed to model the evolution of a settlement. If a fractal settlement is defined in a two‐dimensional space, its fractal dimension lies between 0 and 2 [2, 4, 39]. According to the spatial state and fractal dimension, settlement evolution can be divided into four stages. The first stage is fractal growth. In this stage, geographical space is abundant, and settlement growth has a large degree of freedom. Typical examples are new villages and young cities. The second stage is space filling. In this stage, small fractal clusters appear in the vacant places. Typical examples are mature cities, towns, and villages. The third stage is structural reversal. Settlement growth is a process of phase transition, which can be explained by space replacement dynamics. In this stage, the fractal structure of the central part of the settlement is replaced by a fractal complementary structure. The spatial dimension is near 2, which is a Euclidean dimension. Typical examples are old cities, towns, and villages. Gradually, the central part ages, degenerates, and finally has to be abandoned. Thus, the settlements become hollow cities or hollow villages, from which inhabitants move away. The fourth stage is fractal regeneration. After a period of desolation, the buffer space becomes large, and the central area becomes suitable for reconstruction. Thus, some people settle there again by rebuilding houses. In this stage, the fractal structure may become more complex and should be characterized by multifractal parameters.

### 4.3. Logarithmic spiral and hierarchical scaling

The logarithmic spiral, also termed the equiangular spiral or growth spiral, is treated as a self‐similar spiral curve in the literature and is often associated with fractals such as the Mandelbrot set. It was first described by René Descartes in 1638 and later studied in depth by Jacob Bernoulli, who was so fascinated by the *marvelous spiral* that he wished it to be engraved on his tombstone. Hierarchical scaling can be employed to describe the logarithmic spiral. Where geometric form is concerned, a logarithmic spiral bears an analogy with fractals; where mathematical structure is concerned, it is similar to the rank‐size rule. Sometimes, the logarithmic spiral is treated as a fractal by scientists [40]. In fact, a logarithmic spiral is not a real fractal because it has only one self‐similarity point. For the section around the origin, the part of the logarithmic spiral is strictly similar to its whole. However, there is only self‐similarity and no cross‐similarity (**Figure 10**).

Though a logarithmic spiral is not a fractal, the curve has a mathematical model similar to that of simple fractals. A logarithmic spiral can be expressed as below:

where *x* denotes the distance from the origin, *φ* is the angle from the abscissa axis, *θ* is a constant, and *α* = ln(*a*) and *β* = sin(*θ*) are two parameters. Integrating *x* over *φ* yields

where *L*(*φ*) refers to a cumulative length. Thus, we have

in which *L*<sub>*m*</sub>(*φ*) denotes the length of the curve segment at the *m*th level. From Eqs. (42) and (44), we can derive two common ratios

This suggests that the two common ratios are equal to one another, that is, *r*<sub>*l*</sub> = *r*<sub>*x*</sub>. From Eqs. (45) and (46), we can derive an allometric‐scaling relation such as

where *κ* refers to a proportionality coefficient, and *b* to the scaling exponent. The allometric‐scaling relation indicates a special geometric measure relation. In fact, the allometric‐scaling exponent is

This result suggests a special allometric relation between the two measurements of the logarithmic spiral. The above mathematical process shows that the logarithmic spiral as a quasi‐fractal curve can be strictly described by hierarchical scaling.
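The chain of reasoning can be retraced under a generic parameterization. What follows is a sketch only. It assumes that the spiral is written as *x*(*φ*) = *A*e^(*Bφ*), where *A* and *B* > 0 are placeholder constants introduced here rather than the parameters of the text, that successive levels are separated by a fixed angular step Δ*φ*, and that the cumulative measure is integrated from −∞. The symbols *x*<sub>*m*</sub> and *ψ* are likewise introduced for illustration.

```latex
\begin{align*}
x(\varphi) &= A e^{B\varphi}, \\
L(\varphi) &= \int_{-\infty}^{\varphi} x(\psi)\,\mathrm{d}\psi
            = \frac{A}{B}\, e^{B\varphi}, \\
x_m &= x(m\,\Delta\varphi) = A e^{B m \Delta\varphi}, \\
L_m &= L(m\,\Delta\varphi) - L\big((m-1)\,\Delta\varphi\big)
     = \frac{A}{B}\left(1 - e^{-B\Delta\varphi}\right) e^{B m \Delta\varphi}, \\
r_x &= \frac{x_{m+1}}{x_m} = e^{B\Delta\varphi}, \qquad
r_l = \frac{L_{m+1}}{L_m} = e^{B\Delta\varphi}, \\
L_m &= \kappa\, x_m^{\,b}, \qquad
b = \frac{\ln r_l}{\ln r_x} = 1, \qquad
\kappa = \frac{1 - e^{-B\Delta\varphi}}{B}.
\end{align*}
```

Under these assumptions both level-wise sequences are geometric with the same common ratio, so the allometric exponent equals 1 whatever values *A*, *B*, and Δ*φ* take.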

In urban studies, the logarithmic spiral helps us to understand the central place theory of human settlement systems and the rank‐size distribution of cities. Central place systems are composed of triangular lattices of points and regular hexagonal areas [9]. From regular hexagonal networks, we can derive a logarithmic spiral [41]. On the other hand, the mathematical models of the hierarchical structure of the logarithmic spiral based on systems of golden rectangles are similar to the models of urban hierarchies based on the rank‐size distribution. The logarithmic spiral thus suggests a latent link between Zipf’s law, which indicates hierarchical structure, and Christaller’s central place models, which indicate both spatial and hierarchical structure. By exploring the hierarchical scaling in the logarithmic spiral, we may find a new approach to spatial analysis or a new theory of spatial optimization.
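The hierarchy behind a golden-rectangle spiral can be illustrated numerically. The sketch below assumes a golden spiral of the form *x*(*φ*) = *x*₀·exp(*Bφ*), whose radius grows by the golden ratio every quarter turn, and treats each quarter turn as one level. The function names, the constant `B`, and the choice of the integral of *x* over *φ* as the segment measure are assumptions made for illustration.

```python
import math

GOLDEN = (1 + math.sqrt(5)) / 2  # golden ratio

def golden_spiral_levels(levels: int, x0: float = 1.0):
    """Return (xs, Ls): radius and segment measure at successive quarter turns.

    Assumes x(phi) = x0 * exp(B * phi), with B chosen so that the radius
    grows by the golden ratio per quarter turn. L_m is the integral of x
    over phi across the m-th quarter turn.
    """
    step = math.pi / 2
    B = math.log(GOLDEN) / step
    xs, Ls = [], []
    for m in range(1, levels + 1):
        phi_lo, phi_hi = (m - 1) * step, m * step
        xs.append(x0 * math.exp(B * phi_hi))
        Ls.append(x0 / B * (math.exp(B * phi_hi) - math.exp(B * phi_lo)))
    return xs, Ls

def log_log_slope(u, v):
    """Least-squares slope of ln(v) regressed on ln(u)."""
    lu = [math.log(a) for a in u]
    lv = [math.log(a) for a in v]
    mu, mv = sum(lu) / len(lu), sum(lv) / len(lv)
    num = sum((a - mu) * (c - mv) for a, c in zip(lu, lv))
    den = sum((a - mu) ** 2 for a in lu)
    return num / den

xs, Ls = golden_spiral_levels(8)
r_x = xs[1] / xs[0]        # common ratio of the radii
r_l = Ls[1] / Ls[0]        # common ratio of the segment measures
b = log_log_slope(xs, Ls)  # allometric exponent between the two measures
print(r_x, r_l, b)
```

Both ratios coincide with the golden ratio and the fitted exponent is 1, mirroring the way a pair of geometric sequences defines an urban hierarchy.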

## 5. Conclusions

Conventional mathematical modeling is based on the idea of characteristic scales. If and only if a characteristic length is found in a system can the system be effectively described with traditional mathematical methods. However, complex systems are principally scale‐free systems, and it is hard to find characteristic lengths in a complex system. Thus, conventional mathematical modeling is often ineffectual. Fractal geometry provides a powerful tool for scaling analysis, which can be applied to exploring complexity associated with *time lag*, *spatial dimension*, and *interaction*. However, any scientific method has its limitations, and fractal description has its own sphere of application. In order to strengthen the function of fractal analysis, hierarchical‐scaling theory should be developed. The fractal analytical process can be integrated into hierarchical‐scaling analysis. In this work, three aspects of such studies are presented. *First, hierarchical scaling is a simple approach to describing fractal structure*. Fractal scaling is usually expressed with power laws. Based on hierarchical structure, a power law can be transformed into a pair of exponential laws, and the analytical process is significantly simplified because the spatial dimensional problems can be avoided. *Second, fractal analysis can be generalized to quasi‐fractal phenomena such as the logarithmic spiral*. A real fractal possesses more than one self‐similarity point, while a logarithmic spiral has only one. Using hierarchical scaling, fractals and quasi‐fractals can be modeled in their right perspective. *Third, spatial analysis can be associated with hierarchical analysis*. Spatial dimension is one of the obstacles to mathematical modeling and analysis, and spatial analysis is more difficult than hierarchical analysis. By hierarchical scaling, a spatial network can be transformed into a hierarchy with a cascade structure, and the spatial analysis can be equivalently replaced by hierarchical analysis. According to the abovementioned ideas, we can develop an integrated theory based on fractal and hierarchical scaling to research complex systems such as cities. What is more, fractals reflect optimum structure in nature. A fractal object can occupy its space in the most efficient way. Using concepts from fractals and hierarchical scaling, we can optimize human settlement systems, including cities, towns, villages, and systems of cities and towns.
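The first point, that a power law can be replaced by a pair of exponential laws, can be illustrated with a small numerical sketch. It assumes a hierarchy in which level *m* holds *N*<sub>*m*</sub> = *N*₁·*r*<sub>*n*</sub>^(*m*−1) elements of size *S*<sub>*m*</sub> = *S*₁·*r*<sub>*s*</sub>^(1−*m*). This notation and the function names are illustrative assumptions, and the ratios 3 and 2 are chosen only to echo the Sierpinski gasket.

```python
import math

def hierarchy(levels: int, n1: float, s1: float, r_n: float, r_s: float):
    """Two exponential laws: N_m = n1 * r_n**(m-1), S_m = s1 * r_s**(1-m)."""
    ms = range(1, levels + 1)
    numbers = [n1 * r_n ** (m - 1) for m in ms]
    sizes = [s1 * r_s ** (1 - m) for m in ms]
    return numbers, sizes

def log_log_slope(u, v):
    """Least-squares slope of ln(v) regressed on ln(u)."""
    lu = [math.log(a) for a in u]
    lv = [math.log(a) for a in v]
    mu, mv = sum(lu) / len(lu), sum(lv) / len(lv)
    num = sum((a - mu) * (c - mv) for a, c in zip(lu, lv))
    den = sum((a - mu) ** 2 for a in lu)
    return num / den

numbers, sizes = hierarchy(levels=10, n1=1, s1=1.0, r_n=3, r_s=2)

# Eliminating the level index m gives the power law N = mu * S**(-D),
# so the log-log slope of number against size is -D.
D = -log_log_slope(sizes, numbers)
print(round(D, 3))  # 1.585
```

The exponent recovered from the log-log fit equals ln(*r*<sub>*n*</sub>)/ln(*r*<sub>*s*</sub>), so the scaling exponent can be read directly from the two common ratios without fitting in space.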