Open access peer-reviewed chapter

Fractal Analysis of Time-Series Data Sets: Methods and Challenges

By Ian Pilgrim and Richard P. Taylor

Submitted: June 11th 2018; Reviewed: October 11th 2018; Published: November 20th 2018

DOI: 10.5772/intechopen.81958

Abstract

Many methods exist for quantifying the fractal characteristics of a structure via a fractal dimension. As a traditional example, the fractal dimension of a spatial fractal structure may be quantified via a box-counting fractal analysis that probes the manner in which the structure fills space. However, such spatial analyses generally are not well-suited for the analysis of so-called “time-series” fractals, which may exhibit exact or statistical self-affinity but which inherently lack well-defined spatial characteristics. In this chapter, we introduce and investigate a variety of fractal analysis techniques directed to time-series structures. We investigate the fidelity of such techniques by applying each technique to computer-generated time-series data sets with well-defined fractal characteristics. Additionally, we investigate the inherent challenges in quantifying fractal characteristics (and indeed of verifying the presence of such fractal characteristics) in time-series traces modeled to resemble physical data sets.

Keywords

  • fractal
  • spatial fractal
  • time-series fractal
  • fractal analysis
  • fractal dimension
  • self-similarity
  • self-affinity
  • topological dimension
  • embedding dimension
  • similarity dimension
  • box-counting dimension
  • covering dimension
  • variational box-counting
  • Hurst exponent
  • variance method
  • Dubuc variation method
  • adaptive fractal analysis
  • power-law noise
  • Brownian motion
  • fractional Brownian motion

1. Introduction

In this chapter, we explore a species of fractals known as “time-series” fractals. Such structures generally may be conceived (and visualized) as functions of independent variables whose plots exhibit shapes and patterns that are evocative of the more familiar spatial fractals. However, lacking well-defined spatial characteristics, time-series fractals call for analytical tools that depart from those of the world of spatial fractals. To lay the foundation for a discussion of such analytical tools, we begin with an overview of fractal structures and traditional fractal analysis techniques. We then introduce time-series fractals and investigate the unique analytical tools necessitated by such structures. Finally, we investigate the relative fidelity of these analytical tools, as well as the shortcomings inherent in performing fractal analysis on time-series fractals of limited length and/or fine-scale detail.

2. Motivating the fractal dimension

Mathematician Benoit B. Mandelbrot often is credited with introducing the notion of a fractional, or fractal, dimension in his 1967 paper, “How long is the coast of Britain?” [1]. In fact, however, the curious nature of coastline measurements had been discussed by Lewis Fry Richardson 6 years prior in the General Systems Yearbook [2]. Richardson, a pacifist and mathematician, sought to investigate the hypothesis that the likelihood that war would erupt between a pair of neighboring nations is related to the length of the nations’ shared border. As Richardson and Mandelbrot note, such a hypothesis is difficult to evaluate, since individual records of the length of Britain’s west coast varied by up to a factor of three. Indeed, as the precision of such measurements increases—that is, by decreasing the length of the “ruler” used to trace the profile—the measured total length appears to increase as well. This quality reflects the fact that the outline of the British coastline is an example of a “self-similar” structure—that is, a structure that exhibits the same statistical qualities, or even the exact details, across a wide range of length scales. In light of this apparent fundamental indeterminacy, Mandelbrot posits that familiar geometrical metrics such as length are inadequate for describing the complexity found in nature.

Recognizing Richardson’s prior investigations, Mandelbrot notes that Richardson had indeed produced an empirical relation between a measured coast length $L$ and the smallest unit of measurement $G$, namely $L(G) = M G^{1-D}$, where $M$ is a positive constant and $D \geq 1$, but observes that “unfortunately it attracted no attention” [1]. In Ref. [1], building upon Richardson’s observations, Mandelbrot introduces the formalism of a fractional, or fractal¹, dimension to quantify the nature of such shapes.
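As a brief worked illustration of Richardson's relation $L(G) = M G^{1-D}$, consider what happens when the ruler length is halved. The value $D = 1.25$ used below is assumed purely for illustration:

```latex
\frac{L(G/2)}{L(G)} = \frac{M\,(G/2)^{1-D}}{M\,G^{1-D}} = 2^{D-1},
\qquad D = 1.25 \;\Rightarrow\; 2^{0.25} \approx 1.19 .
```

Under that assumption, each halving of the ruler lengthens the measured coast by roughly 19%, whereas a smooth curve with $D = 1$ would show no change at all.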

Following Mandelbrot’s example, to generalize the concept of a geometrical dimension, we may begin by examining the scaling behavior of such trivially self-similar objects as a line, a square, and a cube. For example, consider a line segment of length $L$, which can be separated into $N$ non-overlapping subsets of length $L/N$, each of which is identical to the whole segment but for a scaling factor $r(N) = 1/N$. Analogously, a square with side length $L$ may be decomposed into $N$ facsimiles of side length $L/N^{1/2}$ (for $N$ a perfect square), each of which is scaled down from the original by a factor $r(N) = N^{-1/2}$, and a cube of side length $L$ can be decomposed into $N$ facsimiles of side length $L/N^{1/3}$ (for $N$ a perfect cube) with corresponding scaling ratio $r(N) = N^{-1/3}$; see Figure 1. To generalize this pattern, we may observe that the scaling ratio $r(N)$ follows the relationship $r(N) = N^{-1/D}$. In this relationship, $D = -\log N / \log r(N)$ is known as the similarity dimension of the structure in question.

Figure 1.

A line, a square, and a cube are examples of trivially self-similar Euclidean shapes. A Euclidean shape in $D$ dimensions may be said to contain $N = (L/L_0)^{-D}$ exact copies of itself scaled by a factor of $L/L_0$. Image provided by R.D. Montgomery.

Applying the concept of a similarity dimension to less trivial shapes is straightforward in the case of exactly self-similar structures, such as structures that are constructed via iteration of a generating pattern. As an example, consider the Koch curve, illustrated in Figure 2. The Koch curve is constructed as follows: Beginning with a line segment of unit length, replace the middle third of the segment with an equilateral triangle whose base has a length of 1/3 and overlies the original line segment, then remove this overlapping base segment. The resulting figure thus consists of four line segments, each of which has a length of 1/3. Iterating this process for each new line segment yields a sequence of figures that exhibit increasingly fine structure, with the limiting state of this series exhibiting exact self-similarity, in the sense that a nontrivial subset of the shape is exactly identical to the whole. This exact self-similarity is illustrated in Figure 2, which shows that the full Koch curve may be described as being formed from four exact copies of itself, each scaled down by a factor of 1/3. Thus, we can apply the above relation to find that the Koch curve has a similarity dimension of $D = -\log 4 / \log(1/3) \approx 1.26$.

Figure 2.

The Koch curve is an example of an exact self-similar figure with a non-integer similarity dimension.
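The similarity dimension can be checked numerically with a few lines of Python. This is a minimal sketch; the function name `similarity_dimension` is hypothetical and chosen only for this illustration.

```python
import math

def similarity_dimension(n_copies, scale):
    """D = -log(N) / log(r) for a shape built from n_copies exact copies
    of itself, each scaled down by the ratio `scale` (0 < scale < 1)."""
    return -math.log(n_copies) / math.log(scale)

# Trivially self-similar shapes, each cut with a scaling ratio of 1/3.
print(similarity_dimension(3, 1 / 3))    # line: close to 1
print(similarity_dimension(9, 1 / 3))    # square: close to 2
print(similarity_dimension(27, 1 / 3))   # cube: close to 3

# Koch curve: four copies, each scaled by 1/3.
print(similarity_dimension(4, 1 / 3))    # approximately 1.2619
```

The integer results hold only up to floating-point rounding, which is why the comments say "close to".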

The similarity dimension described above represents but one example of a plurality of dimensions that can be defined and calculated for a given figure. Indeed, the utility of the similarity dimension is limited by the fact that it applies only to figures that exhibit exact self-similarity; by contrast, the complexity witnessed in natural systems such as coastlines generally exhibits self-similarity only in the statistical sense. As an example, Figure 3 illustrates a structure that exhibits statistical self-similarity. Specifically, Figure 3 illustrates an example of a modified Koch curve formed by randomizing the orientations of the line segments as the structure is generated.

Figure 3.

Introducing randomness into the generating algorithm of the Koch curve produces a statistically self-similar fractal structure.

As a tool for quantifying the nature of such fractal structures that do not exhibit exact self-similarity, we now turn to the (roughly self-explanatory) “box-counting dimension,” also known as the “covering dimension.” Given a structure that extends in two dimensions², the box-counting dimension may be determined as follows: First, superimpose a square grid with individual boxes of size $\ell \times \ell$ over the figure in question, and count the number of boxes $N(\ell)$ within which some portion of the figure in question is present (see Figure 4). Next, repeat this procedure while varying the box size $\ell$ and construct a plot of $\log N(\ell)$ vs. $\log(1/\ell)$; for a self-similar structure, the data should follow a linear trend with a gradient equal to the box-counting dimension $D$. Such a plot is generally known as a scaling plot.

Figure 4.

Applying the box-counting method to the Koch curve. The number of boxes of side length $\ell$ occupied by some portion of the curve follows $N(\ell) \propto \ell^{-D}$, where $D$ is the box-counting dimension of the curve.
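The box-counting procedure can be sketched in Python for the Koch curve. This is an illustrative sketch rather than the authors' implementation: the helper names `koch_points` and `box_count` are hypothetical, the curve is approximated by the vertices of a finite iteration, and the grid is assumed to be anchored at the origin with box sizes that are powers of 1/3.

```python
import numpy as np

def koch_points(level):
    """Vertices of the Koch curve after `level` iterations (complex numbers)."""
    pts = np.array([0 + 0j, 1 + 0j])
    rot = np.exp(1j * np.pi / 3)          # rotation by 60 degrees
    for _ in range(level):
        a, b = pts[:-1], pts[1:]
        d = (b - a) / 3                   # one third of each segment
        # Each segment a->b becomes a, a+d, peak, a+2d (b is added by the next segment).
        new = np.column_stack([a, a + d, a + d + d * rot, a + 2 * d]).ravel()
        pts = np.append(new, pts[-1])
    return pts

def box_count(pts, ell):
    """Number of ell-by-ell grid boxes that contain at least one vertex."""
    ix = np.floor(pts.real / ell).astype(np.int64)
    iy = np.floor(pts.imag / ell).astype(np.int64)
    return len(set(zip(ix.tolist(), iy.tolist())))

pts = koch_points(7)                       # segments of length 3**-7
ells = 3.0 ** -np.arange(1.0, 6.0)         # box sizes 1/3 ... 1/243, all coarser than the segments
counts = np.array([box_count(pts, e) for e in ells])

# Gradient of the scaling plot: log N(ell) versus log(1/ell).
slope, _ = np.polyfit(np.log(1 / ells), np.log(counts), 1)
print(counts, slope)
```

With these settings the fitted gradient should fall in the neighborhood of the similarity dimension of about 1.26, although the exact value depends on the grid placement and on the range of box sizes used.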

The box-counting method also may be described in more geometrically intuitive terms. For example, and as shown in Figure 4, one may observe that the set of all occupied boxes at a given length scale collectively serves as an approximation of the total structure as “observed” at the length scale $\ell$. Stated differently, the set of $\ell \times \ell$ boxes that overlap some portion of the base structure may be seen as representing a snapshot of the base structure as viewed at a resolution corresponding to the length $\ell$. In general, however, the set of boxes covering the base structure cannot be expected to represent the geometric details of the structure at any length scale. For example, as shown in Figure 4, it is evident that the incompatibility of the straight edges of the square boxes and the jagged boundary of the Koch curve leads to a markedly crude representation of the structure at all length scales, as each occupied box will always contain details that cannot be fully represented by that box.

While the box-counting method of estimating fractal dimension is conceptually straightforward, some care must be taken to preserve the utility of the method. For example, one must select an appropriate range of box sizes over which to examine the scaling trend, given that any observed fractal scaling trend will not persist over all possible length scales. That is, for any finite structure, it is possible to encompass the structure in a box of size $L \times L$, for an appropriate value of $L$. In such a case, applying the box-counting method with boxes of size $\ell \geq L$ will always return a value $N(\ell) = 1$ (only one box can be filled when the box contains the entire structure), thus resulting in an apparent fractal dimension of zero. As another example, when considering a range of box sizes $\ell \lesssim L$, nearly all such boxes will be counted as filled, and the box count $N(\ell)$ will scale as the square of the inverse box size $1/\ell$. In this case, the box-counting method will return an apparent fractal dimension of $D = 2$, and we may say that the pattern “looks two-dimensional” when examined at this coarse scale. When dealing with patterns found in nature, the opposite extreme of possible length scales merits consideration as well. For a mathematically-generated fractal figure, such as a figure that exhibits structure at arbitrarily fine length scales, the box-counting method may be applied with arbitrarily small box sizes $\ell$. However, naturally occurring fractal structures invariably exhibit a smallest length scale to which a scaling trend may extend. For example, while the scaling trend certainly must cease at the molecular and atomic scales, such fractal scaling behavior generally breaks down at length scales many times larger than this. In such cases, applying the box-counting method at length scales smaller than the smallest feature size observed in the structure yields a number of filled boxes $N(\ell)$ that scales linearly with the inverse box size $1/\ell$; thus, the figure “looks one-dimensional” to the box-counting analysis at these scales.

Such conditions necessitate careful determination of the appropriate range of length scales over which to assess fractal scaling behavior. This determination may be made empirically, such as by observing the range of length scales over which the scaling plot is sufficiently linear. Alternatively, this determination may be made by convention, such as may be based on statistical arguments. In practice, it is generally not known a priori whether a structure under consideration should even be expected to be a fractal, and hence whether it should be expected to produce a scaling plot with a linear trend between cutoffs defined by appropriate physical and/or measurement limitations. Accordingly, it is preferred to adopt conventions with some degree of universality and that do not presuppose the existence of the fractal scaling behavior under investigation. More specifically, it is common to adopt the following conventions, noting that the ranges may be bounded by physical and/or measurement limitations. The coarse-scale analysis cutoff generally corresponds to a limit of the range of length scales measured, which in turn generally is related to the coarse-scale size of the structure itself. This limit is conventionally set at $\ell = L/5$, where $L$ is the side length of the smallest square that may circumscribe the structure, thus guaranteeing that the grid includes no fewer than 25 boxes. Turning to the fine scale, the physical limit is determined by the smallest (nontrivial) feature size that is observed in the structure, while the fine-scale measurement limit is conventionally chosen to satisfy the requirement that each box contains no fewer than five data points. In practice, the more restrictive of these two limits is chosen (i.e., the larger of the physical fine-scale limit and the fine-scale measurement limit).
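These cutoff conventions can be captured in a small helper. The sketch below is illustrative only: the function name `scaling_range` and its parameters are hypothetical, and the fine-scale measurement limit is passed in directly because how it is obtained depends on the data set.

```python
def scaling_range(L, smallest_feature, measurement_limit):
    """Return (fine, coarse) box-size cutoffs following the conventions above.

    L                 -- side length of the smallest square circumscribing the structure
    smallest_feature  -- physical fine-scale limit (smallest nontrivial feature size)
    measurement_limit -- smallest box size at which each box still holds >= 5 data points
    """
    coarse = L / 5                                    # at least 5 x 5 = 25 boxes in the grid
    fine = max(smallest_feature, measurement_limit)   # the more restrictive fine-scale limit
    if fine >= coarse:
        raise ValueError("no usable scaling range between the cutoffs")
    return fine, coarse

fine, coarse = scaling_range(L=100.0, smallest_feature=0.5, measurement_limit=2.0)
print(fine, coarse)   # 2.0 20.0
```

In this example the measurement limit is the more restrictive of the two fine-scale limits, leaving one decade of box sizes over which to fit the scaling plot.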

As a further consideration in optimizing the performance of the box-counting method, one must select the position and orientation of the box grid relative to the structure in question. To the extent that the box-counting method seeks to probe an inherent quality of a structure, the observed fractal dimension should not be affected by a spatial translation or rotation of the grid with respect to the structure, since the structure itself has no preferred orientation. However, consider the case shown in Figure 5, in which the box-counting method is applied to a fractal profile. In the box-counting scheme discussed above, all boxes that contain any portion of the structure under examination are counted toward the total; applying this to the structure of Figure 5, we find that 35 boxes are filled using this box size $\ell$. Suppose, however, that one is able to reposition the boxes semi-independently of one another, by translating a set of adjacent $\ell \times \ell$ boxes within each column of width $\ell$. Doing so, we find that a careful repositioning of the boxes within these columns results in the box count $N(\ell)$ dropping to 29. This apparent inconsistency serves to motivate a refinement of the box-counting analysis as described above. Specifically, the “variational box-counting method” includes shifting the boxes in columns as described above so as to minimize the number of $\ell \times \ell$ boxes needed to entirely cover the figure in question. The variational box-counting method thus serves to eliminate some of the apparent ambiguity of the traditional box-counting method. Of course, some ambiguity still remains in this amended method, given that the rotational orientation of the columns relative to the examined structure remains arbitrary. To eliminate this residual ambiguity, one may repeat the above-described variational method at a variety of rotational orientations of the grid with respect to the figure and choose the angle that minimizes $N(\ell)$ for each value of $\ell$. However, in practical applications, incorporating this additional variation does not significantly affect the measured dimensions.

Figure 5.

An example of applying the variational box-counting method. When the boxes are constrained in a grid (left), we find a box count $N(\ell) = 50$; however, when the $\ell \times \ell$ boxes are allowed to shift vertically within columns of width $\ell$ (right), the measured box count $N(\ell)$ drops to 47.
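The difference between the fixed-grid and variational counts can be sketched for a sampled profile. The code below is a simplified illustration under stated assumptions: the profile is a synthetic random walk (not the profile of Figure 5), both axes are assumed to share the same units with unit sample spacing, the trace is treated as continuous between samples, and the function names are hypothetical.

```python
import numpy as np

def column_ranges(y, ell):
    """Yield (min, max) of the profile within each column of width ell samples."""
    for start in range(0, len(y) - 1, ell):
        col = y[start:start + ell + 1]    # share the end point so adjacent columns connect
        yield col.min(), col.max()

def fixed_grid_count(y, ell):
    """Boxes of a fixed ell-by-ell grid (rows anchored at y = 0) touched by the profile."""
    return sum(int(np.floor(hi / ell) - np.floor(lo / ell)) + 1
               for lo, hi in column_ranges(y, ell))

def variational_count(y, ell):
    """Fewest ell-by-ell boxes when the stack in each column may slide vertically."""
    return sum(max(1, int(np.ceil((hi - lo) / ell)))
               for lo, hi in column_ranges(y, ell))

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(4096))   # stand-in profile, an assumption of this sketch

for ell in (16, 64, 256):
    print(ell, fixed_grid_count(y, ell), variational_count(y, ell))
```

Within each column the sliding stack can never need more boxes than the fixed grid, so the variational count is always less than or equal to the fixed-grid count.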

3. Time-series fractal structures

The fractal structures discussed above generally represent examples of spatial fractal structures—that is, structures with spatial extent and whose fractal characteristics are embodied in their spatial form. However, many observable structures and phenomena exhibit fractal behavior while lacking spatial form. Another important class of structures to which fractal analysis may be directed is that of “time-series” structures—that is, structures that may be represented as a single-valued function of a single independent variable. As suggested by their name, a time-series structure may refer to some variable quantity—say, stock market prices, or atmospheric pressure—that fluctuates in time, but for the purposes of this work we intend for the term to refer to any data set or plot consisting of a dependent variable that may be represented as a single-valued function of an independent variable.

As with the spatial structures considered above, a time-series structure may exhibit fractal scaling properties in either a statistical or an exact sense, which may be quantified using the formalism of fractal dimensions. Unfortunately, the box-counting methods described above for measuring a fractal dimension are ill-suited to time-series structures. Simply put, this limitation arises from the fact that box-counting methods assess the fractal dimension of shapes that extend in space, while the spatial “shape” of a time-series structure is inherently undefined. That is, since the two axes of a plot representing a time-series data set generally represent variables with distinct units, the geometric aspect ratio of such a plot is fundamentally undefined.

As an example, consider the data set displayed in Figure 6, which plots the daily closing price of a certain technology stock over a period of roughly 16 years. Specifically, Figure 6 illustrates three representations of the same data set, with the respective y-axis of each illustration scaled by a distinct factor. In qualitative terms, one may be tempted to conclude that the data in the top panel appear the most linear and that the data in the bottom panel appear the most space-filling. Accordingly, given that a box-counting fractal analysis technique essentially assesses the space-filling properties of a structure, applying a box-counting analysis to each plot would yield distinct results for each plot.

Figure 6.

Daily closing prices for a single stock from December 1980 to October 1996. Each of the three plots displays the same data, but the y-axis of each plot is scaled by a distinct factor. A box-counting fractal analysis would return unique results for each plot, despite each plot representing the same data set.

The difficulty here lies in the fact that a box-counting fractal analysis necessarily treats a figure as a spatial entity whose orthogonal dimensions have the same units. By contrast, a time-series trace such as the one displayed in Figure 6 lacks this property, but may still exhibit fractal characteristics in the form of either statistical or exact self-affinity. As discussed above, exact and statistical self-similarity describe structures whose precise details or statistical properties (respectively) are repeated as their orthogonal dimensions are rescaled by the same factor. By contrast, exact and statistical self-affinity refer to structures whose precise details or statistical properties (respectively) are repeated as their two orthogonal dimensions are rescaled by independent factors [4].

Due to the incommensurability of the orthogonal axes defining a time-series trace, such structures cannot exhibit self-similarity, only self-affinity. As an example, Figure 7 displays the data set shown in Figure 6 alongside a subset of the data set. When this subset is appropriately rescaled along each of the x- and y-axes, the resulting plot shares the general statistical properties of the original trace, and hence exhibits statistical self-affinity.

Figure 7.

Statistical self-affinity in a fractal time-series trace. Choosing a subset of the stock price data shown in Figure 6 and rescaling the x- and y-axes yields a trace that shares statistical properties with the original.

It also is possible, albeit less common, for a time-series trace to exhibit exact self-affinity. As an example, Figure 8 illustrates three experimentally measured data sets in which rescaling the x- and y-axes of the traces by carefully chosen factors produces structures that share the characteristics of the original traces [5].

Figure 8.

Magnetoresistance fluctuations (MCF) recorded in an electron billiard device can represent examples of exact self-affinity in time-series structures. Each of the three columns in this figure represents a single MCF observed at a coarse scale (bottom) and a fine scale (top). From [5].

4. Fractal analysis of time-series traces: beyond box-counting

As discussed above, when applying a box-counting method to a time-series structure, the measured scaling properties of the structure will depend on the aspect ratio with which the data are presented, which is in turn an arbitrary choice. Accordingly, applying a box-counting method to a time-series trace will return a fractal dimension that is essentially arbitrary. Thus, it is necessary to develop fractal analysis techniques that are insensitive to such artificial geometric parameters. In the following, we survey a sampling of such techniques proposed in the literature.

Returning to the example of Figure 5, above, this figure in fact illustrates the variational box-counting method as applied to a fractal profile in the form of a time-series fractal. Indeed, fractal analyses of such time-series fractal structures have traditionally been performed using the variational box-counting method [6, 7], which does offer performance improvements over the traditional fixed-grid box-counting method. Nonetheless, the variational box-counting method still suffers from a fatal flaw. To see why this is so, consider the plots shown in Figure 9.

Figure 9.

Visualizing a variational box-counting method applied to the stock price data of Figures 6 and 7 with a “resolution” of ℓ = 200 trading days. Displaying the data with a price range of 0–100 USD yields a box count of 37. Displaying the data with a price range of 0–1000 USD yields a box count of 20.

Figure 9 illustrates the stock price data of Figures 6 and 7 represented in two plots with the price axes respectively scaled by two different factors, as well as a visualization of a variational box-counting method applied at a “length” scale $\ell = 200$ trading days. When the prices shown range from 0 to 100 USD (top of Figure 9), we find that a minimum of 37 boxes are needed to entirely cover the trace. However, when the price range is expanded to 0 to 1000 USD (effectively increasing the domain:range aspect ratio of the data; bottom of Figure 9), the number of boxes needed to cover the trace falls to 20. Indeed, the number of boxes $N(\ell)$ needed to cover the “compressed” plot will be proportional to $1/\ell$ for all values of $\ell$ such that the boxes are “taller” than the range of values found within any of the $L/\ell$ columns (with $L$ here denoting the total width of the trace). That is, as long as each box is “taller” than the vertical extent of the trace within each column, the trace will “look” one-dimensional.
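The dependence of the box count on the plotted aspect ratio can be reproduced with synthetic data. The sketch below does not use the stock data of Figures 6, 7, and 9; the trace is a random walk generated for illustration, and `dollars_per_day` is a hypothetical parameter expressing how many dollars occupy the same plotted length as one trading day.

```python
import numpy as np

def variational_count(price, ell_days, dollars_per_day):
    """Variational box count at a fixed box width, for a given plot aspect ratio.

    A box that looks square on the plot is ell_days wide and
    ell_days * dollars_per_day tall.
    """
    box_height = ell_days * dollars_per_day
    count = 0
    for start in range(0, len(price) - 1, ell_days):
        col = price[start:start + ell_days + 1]
        count += max(1, int(np.ceil((col.max() - col.min()) / box_height)))
    return count

rng = np.random.default_rng(1)
price = 50 + np.cumsum(rng.standard_normal(4000))   # synthetic trace (assumption)

ell = 200                                           # box width in "trading days"
for dollars_per_day in (0.05, 0.5, 5.0):
    print(dollars_per_day, variational_count(price, ell, dollars_per_day))
```

Stretching the price axis (a larger `dollars_per_day`) can only lower the count. Once each box is taller than the range of the trace within its column, the count settles at the number of columns, here 4000/200 = 20, which is the one-dimensional limit described above.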

Of course, the fundamental issue is that the concept of an $\ell \times \ell$ “box” on a time-series trace is meaningless, since the enclosed “area” has units of (in this case) days times dollars. While it is entirely reasonable to overlay a spatial figure with boxes of a well-defined area in the case of a box-counting analysis of a spatial fractal, the concept of a square drawn on a plot with incompatible and independently scalable axes is ill-defined. In some cases, this inadequacy is resolved by adopting conventions that eliminate such ambiguity. For example, a time-series trace may be normalized in its x- and y-axes such that the domain and range of the plot each run from 0 to 1, and the structure may be analyzed via a box-counting analysis that utilizes a square grid that just circumscribes the trace. While such a normalization convention may provide a consistent method for investigating the relative scaling properties among a set of related time-series traces, the absolute values of the dimensions produced by such analyses would remain essentially arbitrary.

Developing a fractal analysis technique that is appropriate for time-series structures generally amounts to taking one of two approaches: (1) to treat the time-series structure as a geometric figure without a well-defined aspect ratio, or (2) to treat the time-series structure as an ordered record of a process that exhibits a quantifiable degree of randomness. Following the latter approach, Harold Edwin Hurst introduced a formalism for quantifying the nature of self-affine time-series structures in a 1951 paper on the long-term storage capacity of water reservoirs [8].

In Ref. [8], Hurst introduces the concept of the “Hurst exponent” $H$, which may be understood as quantifying the character of the randomness exhibited in a time-series structure via an autocorrelation measurement. Specifically, a Hurst exponent of $H = 0.5$ describes a process that is purely random, such that the value of the trace at time $t_i$ is entirely independent of the value at time $t_j$, $i \neq j$. By contrast, Hurst exponents in the range $0.5 < H < 1$ represent traces exhibiting positive autocorrelations, while Hurst exponents in the range $0 < H < 0.5$ represent traces exhibiting negative autocorrelations. Intuitively speaking, a positive autocorrelation may be understood as representing a trace in which a “high” value (say, relative to the mean) is more likely than not to be followed by additional “high” values, while a negative autocorrelation may be understood as representing a trace in which “high” and “low” values alternate at short time scales; see Figure 10.

Figure 10.

Examples of time-series traces characterized by Hurst exponents of (bottom to top) $H = 0.25$, $0.50$, and $0.75$. A trace with $H = 0.5$ represents a purely random process, whereas traces with $H = 0.25$ and $H = 0.75$ represent processes whose subsequent increments are negatively and positively correlated, respectively.

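The Hurst exponent of a data set may be calculated by examining the scaling properties of a “rescaled range” of the data, as follows. Consider a data set $\{x_t\}$, $t = 1, 2, 3, \ldots, T$, and let $\{x_i, x_{i+1}, \ldots, x_{i+\tau}\}$, $\tau < T$, $i = 1, 2, 3, \ldots, T - \tau$, represent any sequence of $\tau + 1$ points within the data set. The rescaled range (R/S) statistic is then defined as:

$$\left(\frac{R}{S}\right)_\tau = \frac{1}{s_{i,\tau}} \left[ \sup_{i \leq t \leq i+\tau} \sum_{k=i}^{t} \left( x_k - \bar{x}_{i,\tau} \right) - \inf_{i \leq t \leq i+\tau} \sum_{k=i}^{t} \left( x_k - \bar{x}_{i,\tau} \right) \right], \tag{1}$$

where

$$\bar{x}_{i,\tau} = \frac{1}{\tau + 1} \sum_{t=i}^{i+\tau} x_t \tag{2}$$

is the sample mean and

$$s_{i,\tau} = \left[ \frac{1}{\tau + 1} \sum_{t=i}^{i+\tau} \left( x_t - \bar{x}_{i,\tau} \right)^2 \right]^{1/2} \tag{3}$$

is the sample standard deviation. The quantity

$$\left\langle \left(\frac{R}{S}\right)_\tau \right\rangle_i, \tag{4}$$

in which the angle brackets denote an average over the starting index $i$, is then proportional to $\tau^H$, such that the gradient of a plot of $\log \left\langle (R/S)_\tau \right\rangle_i$ vs. $\log \tau$ is equal to the Hurst exponent $H$.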
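The rescaled-range procedure can be sketched as follows. This is a minimal illustration, not the authors' code: the names `rescaled_range` and `hurst_rs` are hypothetical, the sketch uses the ordinary mean and standard deviation of the $\tau + 1$ points in each window, and the test input (uncorrelated Gaussian noise) is an assumption chosen because its exponent is expected to lie near 0.5.

```python
import numpy as np

def rescaled_range(x, tau):
    """R/S statistic averaged over every window x[i], ..., x[i + tau]."""
    x = np.asarray(x, dtype=float)
    values = []
    for i in range(len(x) - tau):
        window = x[i:i + tau + 1]
        s = window.std()                      # standard deviation within the window
        if s == 0:
            continue                          # skip constant windows to avoid division by zero
        cumdev = np.cumsum(window - window.mean())
        values.append((cumdev.max() - cumdev.min()) / s)
    return np.mean(values)

def hurst_rs(x, taus):
    """Gradient of log<R/S> versus log(tau)."""
    rs = [rescaled_range(x, tau) for tau in taus]
    slope, _ = np.polyfit(np.log(taus), np.log(rs), 1)
    return slope

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)             # uncorrelated values (assumed test input)
taus = np.array([16, 32, 64, 128, 256])
H_rs = hurst_rs(noise, taus)
print(H_rs)
```

For short windows the R/S statistic is known to be biased, so the estimate for uncorrelated noise typically comes out somewhat above 0.5 rather than exactly on it.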

The Hurst exponent also may be described as a measure of long-range correlations within a data set, such that measuring these correlations as a function of interval width may provide another measurement of the Hurst exponent. As an example of such an analysis, the “variance method”³ calculates the scaling properties of the trace’s autocorrelation as a function of time interval⁴ via calculation of the quantity

$$V(\Delta t) = \left\langle \left( x_{t+\Delta t} - x_t \right)^2 \right\rangle_t \tag{5}$$

for a range of values of $\Delta t$. This quantity is then related to the Hurst exponent as $V(\Delta t) \propto \Delta t^{2H}$, such that a plot of $\log V(\Delta t)$ vs. $\log \Delta t$ is expected to be linear (within an appropriate range of values of $\Delta t$) with slope $2H$. In practice, however, the variance method is found to produce a poor estimate of the Hurst exponent.
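A minimal sketch of the variance method is shown below. The function name `variance_method_hurst` is hypothetical, and the test trace is an ordinary random walk, chosen as an assumption because its increments are uncorrelated and an exponent near 0.5 is therefore expected.

```python
import numpy as np

def variance_method_hurst(x, lags):
    """Estimate H from the scaling V(dt) ~ dt**(2H) of the mean squared increment."""
    x = np.asarray(x, dtype=float)
    v = [np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(v), 1)
    return slope / 2                           # the slope of the scaling plot is 2H

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(10000))   # random walk (assumed test input)
lags = np.array([1, 2, 4, 8, 16, 32, 64])
H_var = variance_method_hurst(walk, lags)
print(H_var)
```

This idealized case is favorable to the method; it says nothing about how the estimate behaves on shorter or noisier traces.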

As another means of quantifying the fractal properties of time-series traces, we now turn our attention to a method proposed by Benoit Dubuc in a 1989 paper [9] on the fractal dimension of profiles. Dubuc’s proposed “variation method”⁵ is conceptually similar to the variational box-counting method described above, but improves upon this method by resolving the fundamental arbitrariness of drawing boxes on a time-series trace. In short, Dubuc’s variation method probes the “space-filling” characteristics of a time-series trace through measurement of the scaling behavior of the amplitude of the trace within an $\epsilon$-neighborhood as $\epsilon$ is varied.

In practical terms, Dubuc’s variation method may be implemented as follows: Consider a time-series data set $\{x_t\}$, $t = 1, 2, 3, \ldots, T$. For a given value of $\epsilon$, define the functions $u_\epsilon(t)$ and $b_\epsilon(t)$ as follows:

$$u_\epsilon(t) = \sup_{t' \in R_\epsilon(t)} x_{t'}, \qquad b_\epsilon(t) = \inf_{t' \in R_\epsilon(t)} x_{t'}, \tag{6}$$

where

$$R_\epsilon(t) = \left\{ s : |t - s| \leq \epsilon \ \text{and} \ s \in [1, T] \right\}. \tag{7}$$

That is, for a given value of $\epsilon$ and for each point $t_i$ in the trace, examine the set of points $x_t$ within $\epsilon$ data points of $t_i$, and let $u_\epsilon(t_i)$ and $b_\epsilon(t_i)$ be (respectively) the maximum and minimum values of $x_t$ found in this range. Thus, $u_\epsilon(t)$ and $b_\epsilon(t)$ may be understood as traces that represent (respectively) the upper and lower envelopes of oscillation of a trace at a particular scale set by $\epsilon$. At large values of $\epsilon$, the traces $u_\epsilon(t)$ and $b_\epsilon(t)$ will be slowly varying relative to the variation present in the original data set; reducing the value of $\epsilon$ will produce traces $u_\epsilon(t)$ and $b_\epsilon(t)$ that each resemble the original data set with increasing fidelity (see Figure 11).

Figure 11.

Visualizing the application of Dubuc’s variation method at two distinct values of $\epsilon$. The trace under consideration is a fractional Brownian motion (fBm), whose properties are discussed below.

Having constructed the traces $u_\epsilon(t)$ and $b_\epsilon(t)$, we then define $v_\epsilon(t) = u_\epsilon(t) - b_\epsilon(t)$ and calculate

$$V(\epsilon) = \frac{1}{\epsilon^2} \sum_t v_\epsilon(t). \tag{8}$$

Conceptually, $V(\epsilon)$ may be regarded as representing the (crucially, not necessarily integer) number of $\epsilon \times \epsilon$ “boxes” whose total “area” would be equal to that of the envelope bounded by $u_\epsilon(t)$ and $b_\epsilon(t)$. Of course, the concept of “area” is ill-defined in this context, but this is of no concern, given that we have not implied a geometrical relationship between the x and y dimensions. In continued analogy with spatial box-counting analyses, the fractal dimension of the trace is then determined via the relationship $V(\epsilon) \propto (1/\epsilon)^D$, such that a plot of $\log V(\epsilon)$ vs. $\log(1/\epsilon)$ is expected to follow a linear trend (within an appropriate range of values of $\epsilon$) with a slope corresponding to the fractal dimension $D$.
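Dubuc's variation method can be sketched directly from the definitions of the upper and lower envelopes. The function names below are hypothetical, the implementation is a straightforward (unoptimized) loop, and the test trace is an ordinary random walk chosen as an assumption of this sketch. For such a walk a dimension in the vicinity of 1.5 would be expected from the standard relation $D = 2 - H$, which is not derived in this section.

```python
import numpy as np

def dubuc_variation(x, eps):
    """V(eps) = (1 / eps**2) * sum over t of [u_eps(t) - b_eps(t)]."""
    T = len(x)
    total = 0.0
    for t in range(T):
        lo, hi = max(0, t - eps), min(T, t + eps + 1)   # neighborhood clipped to the trace
        neighborhood = x[lo:hi]
        total += neighborhood.max() - neighborhood.min()
    return total / eps ** 2

def dubuc_dimension(x, eps_values):
    """Slope of log V(eps) versus log(1 / eps)."""
    v = [dubuc_variation(x, e) for e in eps_values]
    inv_eps = 1.0 / np.asarray(eps_values, dtype=float)
    slope, _ = np.polyfit(np.log(inv_eps), np.log(v), 1)
    return slope

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(4096))   # random walk (assumed test input)
eps_values = [4, 8, 16, 32, 64]
D_dubuc = dubuc_dimension(walk, eps_values)
print(D_dubuc)
```

Because the trace is discretely sampled, the estimate at small $\epsilon$ is biased, so the fitted value should be read as approximate.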

As a final means of quantifying the fractal properties of time-series traces, we consider a technique known as “adaptive fractal analysis” (AFA) [10]. Similar to Dubuc’s variation method, AFA may be broadly described as investigating the geometrical properties of a time-series trace (in contrast to the aforementioned analyses that are best understood as probing numerical correlations). For example, and as discussed above, Dubuc’s variation method may be described as quantifying the generalized “area” needed to cover a time-series trace as analyzed at different characteristic time scales; in the case of AFA, approximations to the time-series trace are generated at varying resolutions, and the fidelity of such approximations is recorded as the resolution is varied. The AFA algorithm may be executed as follows: Again, consider a time-series data set xtt=123T. Next, choose a window with a width equal to an odd integer w=2n+1,w<T, and partition the data set into overlapping subsets of length w such that each pair of adjacent subsets overlap by n+1data points. Within each window, the linear best-fit line to the data within that window is calculated, resulting in a series of disconnected straight lines. That is, the series of disconnected best-fit lines overlap in pairs such that each index in the domain of the original data set is matched with respective points on each of two subset fit lines (with the exception of the n data points at either end of the trace). Next, these best-fit lines are “stitched” together to form a single, smoothly continuous curve in the following manner: Label the windows that span the trace with consecutive integers, and label the windows’ corresponding best-fit lines as yjll=12n+1. Then, within each window j, construct the curve

$$y_w(l) = w_1\, y_j(l + n) + w_2\, y_{j+1}(l), \tag{9}$$

$l = 1, 2, \ldots, n + 1$, where $w_1 = 1 - (l - 1)/n$ and $w_2 = (l - 1)/n$. Conceptually, each value $y_w(l)$ may be thought of as representing the weighted average of the values of the two best-fit lines with values at that index, with each weight decreasing linearly with the distance between that index and the midpoint of the corresponding window. Repeating this procedure across all windows produces a trace $y_w(t)$ that is continuous and differentiable, and that may be understood as representing an approximation to the trace $x(t)$ at a length scale, or “resolution,” defined by $w$ (see Figure 12).

Figure 12.

Examples of applying the procedure of AFA at several values of $N$ (corresponding to the window width $w$ discussed in the text). The light blue trace (bottom) is a 16,384-point fractal trace with $H = 0.375$, while the red (top), green (second from top), and purple (third from top) traces represent approximations produced by the AFA technique at $N = 1000$, $N = 500$, and $N = 50$, respectively. Traces are vertically offset for clarity. Note that smaller values of $N$ yield approximations that are increasingly similar to the trace under consideration.

As $w$ is decreased, $y_w(t)$ becomes a better approximation to $x(t)$; the scaling behavior of this fidelity as $w$ is varied is used to determine the Hurst exponent. Specifically,

$$F(w) = \left[ \frac{1}{T} \sum_{i=1}^{T} \left( y_w(t_i) - x(t_i) \right)^2 \right]^{1/2} \propto w^{H}, \tag{10}$$

such that a plot of $\log F(w)$ vs. $\log w$ will be linear (over an appropriate range) with slope $H$.
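The AFA steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of Ref. [10]: the function names `afa_trend` and `afa_hurst` are hypothetical, the trace is trimmed so that an integer number of overlapping windows spans it, and the window half-widths `n` in the usage example are arbitrary choices.

```python
import numpy as np

def afa_trend(x, n):
    """Stitched piecewise-linear approximation for window width w = 2n + 1."""
    x = np.asarray(x, dtype=float)
    n_win = (len(x) - 1) // n - 1           # windows start every n samples
    if n_win < 1:
        raise ValueError("window too wide for this trace")
    length = (n_win + 1) * n + 1            # samples spanned by the windows
    local = np.arange(2 * n + 1)
    fits = []
    for j in range(n_win):
        seg = x[j * n: j * n + 2 * n + 1]
        slope, intercept = np.polyfit(local, seg, 1)
        fits.append(slope * local + intercept)
    trend = np.empty(length)
    trend[:n + 1] = fits[0][:n + 1]         # leading end: first fit only
    trend[length - n - 1:] = fits[-1][n:]   # trailing end: last fit only
    w2 = np.arange(n + 1) / n               # w2 = (l - 1)/n for l = 1..n+1
    w1 = 1.0 - w2
    for j in range(n_win - 1):              # blend fits j and j + 1 (Eq. 9)
        start = (j + 1) * n
        trend[start:start + n + 1] = w1 * fits[j][n:] + w2 * fits[j + 1][:n + 1]
    return trend, x[:length]

def afa_hurst(x, n_values):
    """Slope of log F(w) vs. log w, with F(w) the RMS residual (Eq. 10)."""
    log_w, log_f = [], []
    for n in n_values:
        trend, trimmed = afa_trend(x, n)
        log_w.append(np.log(2 * n + 1))
        log_f.append(np.log(np.sqrt(np.mean((trend - trimmed) ** 2))))
    slope, _ = np.polyfit(log_w, log_f, 1)
    return slope

rng = np.random.default_rng(0)
brown = np.cumsum(rng.standard_normal(16384))   # nominal H = 0.5
h_est = afa_hurst(brown, [4, 8, 16, 32, 64, 128, 256])
print(h_est)
```

Because the weights $w_1$ and $w_2$ sum to one and reduce to a single fit line at either edge of each overlap, the stitched curve has no jumps where consecutive overlap regions meet.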

5. Evaluating fractal analysis techniques

Each of the fractal analysis techniques discussed above is best understood as providing an estimate of the fractal dimension or Hurst exponent that characterizes a given time-series data set. The sections that follow present a method for evaluating the fidelity of these estimates that was developed and applied by the authors to the fractal analysis techniques under consideration. To objectively and quantifiably evaluate the fidelity of each of these techniques, it is desirable to investigate the accuracy of each technique when applied to traces with known Hurst exponents/fractal dimensions. To introduce a method for producing such “control” traces, we begin with a general discussion of noise traces.

A noise trace, as an example of a time-series structure, may be described as a single-valued function of a single independent variable. A variety of methods exist for quantifying the statistical properties of noise traces. For example, in addition to the aforementioned measurements of space-filling characteristics and long-range correlations, a spectral analysis of a noise trace may offer a natural quantification of the trace’s statistical properties.

Power-law noise represents a significant and broad class of noise traces. Specifically, a power-law noise trace has a power spectral density given by $P(f) \propto 1/f^{\beta}$. A noise trace characterized by $\beta = 0$ thus represents noise whose spectral power density is a constant across all frequencies, while $\beta = 1$ corresponds to the “$1/f$ noise” that characterizes many natural systems, and $\beta = 2$ is known as “brown noise.” In principle, $\beta$ can assume any value; however, we begin our investigation by considering the $\beta = 2$ case.

A “brown noise” trace characterized by $\beta = 2$ is so termed owing to its relation to Brownian motion, which describes the net motion of a particle whose individual steps are random and independent. Brownian motion generally may refer to a process extending in any number of dimensions; however, we restrict our attention to brown noises that may be understood as a time-dependent plot of the position of a particle undergoing Brownian motion along one dimension. (As used herein, “Brownian motion” and “brown noise” will be used interchangeably to describe a Brownian motion in one dimension.) Given that a Brownian motion may be described as the cumulative sum of a series of random, independent steps, it is straightforward to generate a Brownian motion trace as a cumulative integral of a white noise trace. For our purposes, we define a white noise trace as a series of values with zero mean taken from a normal distribution (i.e., a Gaussian noise trace; see Figure 13). Because its increments are statistically independent, a brown noise trace is characterized by a Hurst exponent of $H = 0.5$.

Figure 13.

The cumulative sum of Gaussian white noise results in Brownian motion.
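As a brief illustration, the construction of a Brownian motion trace from Gaussian white noise can be sketched in Python. The trace length, the random seed, and the set of lags are arbitrary choices made here, and the check on $H$ simply uses the expectation that the mean squared increment over a lag $\tau$ scales as $\tau^{2H}$.

```python
import numpy as np

rng = np.random.default_rng(1)
white = rng.standard_normal(16384)   # zero-mean Gaussian white noise
brown = np.cumsum(white)             # cumulative sum: 1-D Brownian motion

# Mean squared increment over lag tau is expected to scale as tau**(2H).
taus = np.array([1, 2, 4, 8, 16, 32, 64])
msd = [np.mean((brown[tau:] - brown[:-tau]) ** 2) for tau in taus]
slope, _ = np.polyfit(np.log(taus), np.log(msd), 1)
h_est = slope / 2.0
print(h_est)
```

For independent increments the fitted value should lie close to 0.5, up to sampling fluctuations.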

Relaxing the restriction that the Gaussian noise trace consists of statistically independent increments permits consecutive increments to be positively or negatively correlated, such that the plot formed by the cumulative sum of the noise trace may be characterized by a Hurst exponent that deviates from $H = 0.5$. Such a trace is termed a “fractional Brownian motion” (fBm). Mandelbrot and Van Ness [11] provide a formalism for quantifying the properties of such structures as follows: Consider a conventional Brownian motion trace $B(t, \omega)$, where $t$ denotes time and $\omega$ represents the particular realization of the random function that generated the specific Brownian motion. The data set $B(t, \omega)$ is thus a function whose increments $B(t_2, \omega) - B(t_1, \omega)$ have a mean of zero and a variance of $|t_2 - t_1|$, and whose non-overlapping increments $B(t_2, \omega) - B(t_1, \omega)$ and $B(t_4, \omega) - B(t_3, \omega)$ are statistically independent. A “reduced fractional Brownian motion” $B_H(t, \omega)$, then, is further characterized by the parameter $H$, $0 < H < 1$, and satisfies

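$$B_H(0, \omega) = b_0, \qquad B_H(t, \omega) - B_H(0, \omega) = \frac{1}{\Gamma\!\left(H + \tfrac{1}{2}\right)} \left\{ \int_{-\infty}^{0} \left[ (t - s)^{H - 1/2} - (-s)^{H - 1/2} \right] dB(s, \omega) + \int_{0}^{t} (t - s)^{H - 1/2}\, dB(s, \omega) \right\}. \tag{11}$$

A fractional Brownian motion trace is thus self-affine in the sense that

$$\left\{ B_H(t_0 + \tau, \omega) - B_H(t_0, \omega) \right\} \triangleq \left\{ h^{-H} \left[ B_H(t_0 + h\tau, \omega) - B_H(t_0, \omega) \right] \right\}, \tag{12}$$

where

$$\left\{ X(t, \omega) \right\} \triangleq \left\{ Y(t, \omega) \right\} \tag{13}$$

denotes that the two random functions $X(t, \omega)$ and $Y(t, \omega)$ have identical finite joint distribution functions [11]. Thus, on average, when an interval on an fBm trace is expanded by a factor of $h$, the difference of the values at the endpoints of the interval, $B_H(t_0 + h\tau, \omega) - B_H(t_0, \omega)$, increases by a factor of $h^{H}$. This property represents an example of statistical self-affinity, in which the observed statistical properties within the intervals are preserved when the $x$ and $y$ axes are scaled by distinct factors (specifically, $h$ and $h^{H}$, respectively).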

Quantifying self-affinity using the formalism of the Hurst exponent motivates drawing a parallel between the Hurst exponent and the fractal dimension, as follows. Following the argument of Ref. [4], consider an fBm trace $V_H(t)$ that extends over a total time span $\Delta t = 1$ and a total vertical range $\Delta V_H = 1$. Dividing the time span into $n$ increments of width $1/n$, we expect the vertical range of the portion of the trace within each interval to scale as $\Delta t^{H} = 1/n^{H}$ (see Figure 14). Accordingly, on average, the portion of $V_H(t)$ present in a given interval may be covered by $\Delta V_H/\Delta t = (1/n^{H})/(1/n) = n/n^{H}$ square boxes of side length $1/n$. Thus, the total number of square boxes of side length $1/n$ needed in order to cover the entire trace is expected to be $n \cdot n/n^{H} = n^{2-H}$. If we recall that the spatial box-counting method relates the number $N$ of square boxes of side length $\epsilon$ needed to cover a trace to the fractal dimension of the trace as $N \propto (1/\epsilon)^{D_F}$, we may conclude that $D_F = 2 - H$ (see Note 6).

Figure 14.

Deriving a relationship between the Hurst exponent and fractal dimension. A Brownian motion trace $V_H(t)$ ($H = 0.5$) is normalized in both dimensions to be circumscribed inside a unit square, and subsequently is divided into $n$ intervals of width $1/n$. The self-affinity of an fBm trace leads to an estimation of the number of square boxes needed to cover the trace at a given length scale, motivating a relationship between $H$ and $D_F$. See text for details.
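As a quick numerical check of this counting argument, consider the illustrative values $H = 0.5$ and $n = 100$ (chosen here for convenience; they do not appear in the original discussion):

$$\Delta V_H \approx \frac{1}{n^{H}} = \frac{1}{100^{0.5}} = 0.1, \qquad \frac{\Delta V_H}{\Delta t} = \frac{0.1}{0.01} = 10 \ \text{boxes per interval}, \qquad N = 100 \times 10 = 1000 = 100^{1.5} = n^{2 - H}.$$

Matching $N = n^{2 - H}$ against $N \propto n^{D_F}$ for boxes of side length $1/n$ gives $D_F = 2 - H = 1.5$, the value commonly quoted for a Brownian trace.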

The relationship $D_F = 2 - H$ is appealing in its simplicity, and indeed is frequently found in the literature; however, Ref. [4] is quick to acknowledge the inherent difficulty in assigning a fractal dimension to a self-affine structure, given that such a construction is predicated upon assigning an arbitrary rescaling relationship between incompatible coordinates. Mandelbrot, too, notes the apparent relation $D_F = 2 - H$ [12] and clarifies that this relation holds in the fine-scale limit. This disparity serves to highlight a general distinction between the Hurst exponent and the fractal dimension as descriptors of a time-series trace. Specifically, the Hurst exponent may be understood as a descriptor of global correlations, while the fractal dimension may be understood as describing a trace’s local fine-scale structure [13].

6. Relationship between fractal dimension and spectral exponent

We may continue this exercise of comparing our various statistical parameters by considering the spectral exponent $\beta$ as a means of quantifying the nature of a fractal trace. In practice, spectral analysis is poorly suited to evaluating the fractal properties of a time-series structure, due to the imprecision (relative to the aforementioned fractal analysis techniques) of applying a power-law best-fit curve to characterize a spectral decomposition of a trace. Nevertheless, we may investigate the relationship that exists between the spectral exponent $\beta$, the fractal dimension $D_F$, and the Hurst exponent $H$, so long as we recognize the imprecisions of these comparisons. In particular, the spectral exponent $\beta$ typically is said to relate to the Hurst exponent as $\beta = 2H + 1$, implying the relationship $D_F = (5 - \beta)/2$. This relationship may be derived by observing that the two-point autocorrelation function

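$$G_V(\tau) = \langle V(t)\, V(t + \tau) \rangle - \langle V(t) \rangle^{2} \propto \tau^{\beta - 1} \tag{14}$$

for a trace $V(t)$ is related to the quantity $\langle |V(t + \tau) - V(t)|^{2} \rangle$ as

$$\langle |V(t + \tau) - V(t)|^{2} \rangle = 2 \left[ \langle V^{2} \rangle - G_V(\tau) \right]; \tag{15}$$

comparing this result to the aforementioned relationship

$$\langle |V(t + \tau) - V(t)|^{2} \rangle \propto \tau^{2H} \tag{16}$$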

leads to the expression $\beta - 1 = 2H$ [14]. However, systematic study [15] demonstrates that such a relationship is generally not very robust. Indeed, it is straightforward to test this robustness: In analogy to the investigation performed in Ref. [15], we investigated the relationship between spectral exponent and fractal dimension by generating a set of 20 noise traces, each with a length of 16,384 points and with a $\beta$ value between 0 and 2. Applying each of the previously discussed time-series fractal analysis techniques to each of these traces produced a corresponding set of fractal dimensions (for the variational box-counting analysis and Dubuc’s variation analysis) or Hurst exponents (for the variance analysis); these data are shown in Figure 15, with the Hurst exponents “converted” to fractal dimensions via $D_F = 2 - H$. Plotting these measured parameters as a function of the well-defined spectral exponent used to generate each trace, we see that the relationship $D_F = (5 - \beta)/2$ breaks down for $D_F$ close to 1 or 2.

Figure 15.

Measured fractal dimensions of colored noise traces generated with well-defined spectral exponents $\beta$. Each data point represents the average value of $D_F$ measured with the respective fractal analysis method for the set of 20 traces at the corresponding value of $\beta$. Each error bar represents one standard deviation from the mean value of $D_F$ recorded for each set of 20 traces. Lines connecting the data points are provided as a guide to the eye. The dashed line corresponds to the relationship $D_F = (5 - \beta)/2$.
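A test of this kind needs traces with a prescribed spectral exponent. The following Python sketch shows one common way to produce them, by Fourier filtering white noise, together with a straight-line fit to the log-log periodogram. It is an illustration under stated assumptions rather than the procedure used to produce Figure 15: the function names `power_law_noise` and `spectral_exponent` and the value $\beta = 1.5$ are hypothetical choices.

```python
import numpy as np

def power_law_noise(n_points, beta, rng):
    """Shape Gaussian white noise so that P(f) scales as 1/f**beta."""
    spectrum = np.fft.rfft(rng.standard_normal(n_points))
    freqs = np.fft.rfftfreq(n_points)
    scale = np.ones_like(freqs)
    scale[1:] = freqs[1:] ** (-beta / 2.0)   # amplitude scales as f**(-beta/2)
    scale[0] = 0.0                           # discard the mean (f = 0) term
    return np.fft.irfft(spectrum * scale, n=n_points)

def spectral_exponent(trace):
    """Fit a line to the log-log periodogram and return beta = -slope."""
    power = np.abs(np.fft.rfft(trace)) ** 2
    freqs = np.fft.rfftfreq(len(trace))
    slope, _ = np.polyfit(np.log(freqs[1:]), np.log(power[1:]), 1)
    return -slope

rng = np.random.default_rng(2)
trace = power_law_noise(16384, beta=1.5, rng=rng)
beta_est = spectral_exponent(trace)
d_f_nominal = (5.0 - beta_est) / 2.0         # nominal D_F = (5 - beta)/2
print(beta_est, d_f_nominal)
```

The raw periodogram is very noisy, so the fitted exponent scatters around the generating value; this is one face of the imprecision of spectral fits noted above.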

7. Generating fractional Brownian motions and characterizing fractal analysis techniques

The framework of the investigation summarized in Figure 15 may be applied to a more thorough investigation of the fidelity of each fractal analysis technique discussed above. That is, if we generate an fBm trace with a well-defined Hurst exponent and subject such a trace to the analysis techniques under consideration, we may evaluate the robustness of each analysis technique. In so doing, we may evaluate not only the fidelity of each analysis method, but also may explore how the analysis methods (individually and/or collectively) respond to less-idealized data sets. That is, by generating fBm traces with well-defined Hurst exponents and modifying the traces to better resemble real-world data sets, we may gain insight into how best to interpret our analytical results of experimentally derived data. Specifically, in addition to testing these analysis techniques on “full-size” 16,384-point fBm traces (with 16,384 arbitrarily chosen as a “sufficiently large” number), we additionally tested these analyses on traces of reduced length and/or reduced spectral content, which may better represent experimentally measured data sets.

A variety of methods exist for generating a fractional Brownian motion trace that exhibits a well-defined predetermined Hurst exponent. Examples of such methods include random midpoint displacement, Fourier filtering of white noise traces, and the summation of independent jumps [14]. This chapter considers randomly generated fBm traces that were created using a MATLAB program that generates a fractional Gaussian noise trace with the desired Hurst exponent via a Fourier transform and subsequently computes the cumulative sum of the noise trace to yield a fractional Brownian motion trace with a specified well-defined Hurst exponent.
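For readers who wish to experiment, the following Python sketch generates a fractional Gaussian noise trace with a Fourier-based method and accumulates it into an fBm trace. It is not the MATLAB program referred to above, whose details are not given here; it assumes the circulant-embedding (Davies-Harte) construction, and the function names `fractional_gaussian_noise` and `increment_variance_hurst` are hypothetical.

```python
import numpy as np

def fractional_gaussian_noise(n_points, hurst, rng):
    """Unit-variance fGn via circulant embedding (Davies-Harte)."""
    k = np.arange(n_points + 1)
    # Autocovariance of fGn at integer lags 0..n_points.
    gamma = 0.5 * ((k + 1.0) ** (2 * hurst)
                   - 2.0 * k ** (2 * hurst)
                   + np.abs(k - 1.0) ** (2 * hurst))
    row = np.concatenate([gamma, gamma[-2:0:-1]])    # circulant first row
    eigenvalues = np.clip(np.fft.fft(row).real, 0.0, None)
    m = len(row)
    w = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    return np.fft.fft(np.sqrt(eigenvalues / m) * w).real[:n_points]

def increment_variance_hurst(trace, taus):
    """H from the scaling of the mean squared increment, tau**(2H)."""
    msd = [np.mean((trace[tau:] - trace[:-tau]) ** 2) for tau in taus]
    slope, _ = np.polyfit(np.log(taus), np.log(msd), 1)
    return slope / 2.0

rng = np.random.default_rng(3)
h_in = 0.7
fgn = fractional_gaussian_noise(16384, h_in, rng)
fbm = np.cumsum(fgn)                                 # fBm trace
h_out = increment_variance_hurst(fbm, [1, 2, 4, 8, 16, 32, 64])
print(h_in, h_out)
```

The circulant-embedding approach is chosen here because it reproduces the fGn autocovariance exactly at the sampled lags; spectral synthesis with a $1/f^{\beta}$ filter is a simpler but approximate alternative.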

While such computer-generated fBm traces are accurately described as exhibiting a well-defined Hurst exponent, the inherently finite nature of these traces precludes the traces from being fully “fractal.” That is, as with any natural structure with finite extent, the generated fBm traces necessarily exhibit a fine-scale resolution limit (owing to the point-wise granularity of the traces) as well as a coarse-scale size limit (owing to the finite total length of the traces). With this in mind, we must be content to forge ahead with the simplifying assumption that the effects of these particular limitations on our estimates of the underlying fractal scaling properties are negligible when considering a computer-generated fBm trace whose total length exceeds its step increment by several orders of magnitude. Accordingly, for the purposes of this analysis, we assume that an fBm trace generated with a predetermined Hurst exponent $H_{\mathrm{in}}$ and with a total length well in excess of its resolution limit is a suitable representative of a pure fractal structure characterized by $H_{\mathrm{in}}$. Thus, we assume that such a trace may fairly be used as a control against which the fidelity of the above-mentioned analysis techniques may be evaluated.

The procedure for evaluating each of these analysis techniques is thus as follows: We first generated a set of 50 16,384-point fBm traces as well as 50 512-point fBm traces at each of 39 input Hurst exponents $H_{\mathrm{in}}$ between 0.025 and 0.975. In this manner, we sought to evaluate not only the fidelity of each fractal analysis technique in returning the expected results for the longer 16,384-point traces, but also the effect of performing the same analyses on data sets of limited length. Next, we applied each analysis technique under consideration to each of these traces, returning either a measured Hurst exponent $H_{\mathrm{out}}$ or a measured fractal dimension $D_{\mathrm{out}}$. In the case of the Dubuc variation analysis, which returns a measured fractal dimension, this value was “converted” (see Note 7) to a Hurst exponent via the relation $H_{\mathrm{out}} = 2 - D_{\mathrm{out}}$. Having extracted these values of $H_{\mathrm{out}}$ for each sample fBm trace and for each analysis technique, we produced a plot of $H_{\mathrm{out}}$ vs. $H_{\mathrm{in}}$ representing all fBm traces analyzed with each analysis technique; these results are displayed in Figures 16 and 17 for randomly-generated fBm traces with lengths of 16,384 points and 512 points, respectively. In each of Figures 16 and 17, each data point represents the average $H_{\mathrm{out}}$ value measured via the corresponding analysis method. Each corresponding logarithmic scaling plot was fit to a straight line between a fine-scale cutoff of five data points and a coarse-scale cutoff of 1/5 of the full length of the trace. Each error bar represents one standard deviation in the measured values averaged to yield the corresponding data point. The dashed black line represents the ideal relationship $H_{\mathrm{out}} = H_{\mathrm{in}}$; that is, data points representing traces whose measured $H_{\mathrm{out}}$ values exactly match their generating $H_{\mathrm{in}}$ values would fall on this line.

Figure 16.

Plotting $H_{\mathrm{out}}$ vs. $H_{\mathrm{in}}$ for randomly-generated 16,384-point fBm traces as measured by the variational box-counting method (yellow), adaptive fractal analysis (green), Dubuc’s variation analysis (red), and the variance analysis (blue).

Figure 17.

Plotting $H_{\mathrm{out}}$ vs. $H_{\mathrm{in}}$ for randomly-generated 512-point fBm traces as measured by the variational box-counting method (yellow), adaptive fractal analysis (green), Dubuc’s variation analysis (red), and the variance analysis (blue).
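The straight-line fit between analysis cutoffs that underlies each data point can be sketched in Python as follows. The function names `fit_scaling_slope` and `dimension_to_hurst` are hypothetical, and the synthetic power-law data in the usage example merely stand in for a real scaling plot.

```python
import numpy as np

def fit_scaling_slope(scales, values, trace_length,
                      fine_cutoff=5, coarse_fraction=0.2):
    """Slope of log(values) vs. log(scales) between the two cutoffs."""
    scales = np.asarray(scales, dtype=float)
    values = np.asarray(values, dtype=float)
    keep = (scales >= fine_cutoff) & (scales <= coarse_fraction * trace_length)
    slope, _ = np.polyfit(np.log(scales[keep]), np.log(values[keep]), 1)
    return slope

def dimension_to_hurst(d_out):
    """Nominal conversion H_out = 2 - D_out (an approximation; see Note 7)."""
    return 2.0 - d_out

# Synthetic scaling data with a known exponent of 0.4 (illustrative only).
scales = np.array([2, 4, 8, 16, 32, 64, 128, 256])
values = 3.0 * scales ** 0.4
slope = fit_scaling_slope(scales, values, trace_length=512)
print(slope, dimension_to_hurst(1.6))
```

With a 512-point trace, the default cutoffs retain only the scales between 5 and 102.4 data points, which already hints at how few points a fit may rest on for short traces.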

In the ideal case of a perfectly fractal fBm trace subjected to an analysis technique that produces a precise and accurate value of the Hurst exponent, a plot of $H_{\mathrm{out}}$ vs. $H_{\mathrm{in}}$ is expected to be linear with unity slope. Based on the results of the analyses summarized in Figures 16 and 17, our results may be summarized as follows: the variational box-counting method tends to over-estimate $H$ except in the case of high $H$ values; the variance analysis tends to under-estimate $H$; the Dubuc variation analysis performs well only for H0.5; and AFA provides an accurate estimate of $H$ throughout the range of $H$ values. In the case of the shorter, 512-point traces, the deviations from the ideal relationship $H_{\mathrm{out}} = H_{\mathrm{in}}$ are more pronounced. Additionally, the precision of the estimated $H$ values for these shorter traces suffers as well, as seen in the relatively large error bars on the data points corresponding to the shorter traces.

We also investigated the effect on the measured $H$ values resulting from another common deviation from ideal fractal behavior. Specifically, in experimentally measured time-series data sets, the smallest-scale measured features often are significantly larger than the resolution limit of the trace. Such is very often the case for experimentally measured data sets that are asserted to represent fractal behavior, in which the finest-scale features may exhibit a characteristic scale that is well over an order of magnitude larger than the point-wise resolution of the trace. To probe the effect of this limitation on a fractal analysis of such a trace, we repeated the above technique on a set of randomly-generated 512-point fBm traces that had been spectrally filtered via Fourier transforms to exhibit a well-defined minimum feature size (i.e., a well-defined maximum frequency component). Specifically, each trace was subjected to a Fourier filter that eliminates all frequency components corresponding to periods shorter than 10 data points, such that the resultant traces have a minimum feature size of 10 points. Figure 18 illustrates a characteristic result of this filtering procedure by comparing the original and Fourier-filtered versions of an fBm trace with $H_{\mathrm{in}} = 0.5$.
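A Fourier filter of this kind can be sketched in a few lines of Python. The function name `remove_short_periods` is hypothetical, and a sharp (brick-wall) cutoff is assumed here, since the exact filter shape is not specified.

```python
import numpy as np

def remove_short_periods(trace, min_period=10):
    """Zero every Fourier component whose period is below min_period points."""
    trace = np.asarray(trace, dtype=float)
    spectrum = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(len(trace))        # cycles per data point
    spectrum[freqs > 1.0 / min_period] = 0.0   # period = 1 / frequency
    return np.fft.irfft(spectrum, n=len(trace))

rng = np.random.default_rng(4)
raw = np.cumsum(rng.standard_normal(512))      # 512-point trace, H = 0.5
filtered = remove_short_periods(raw, min_period=10)
print(np.std(raw - filtered))
```

Because the discrete Fourier transform treats the trace as periodic, a filter of this type can introduce ringing near the two ends of a trace whose first and last values differ appreciably.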

Performing a fractal analysis of time-series traces with limited spectral content requires a reassessment of the length scales over which one expects to observe the fractal scaling properties. Whereas our analysis of fBm traces whose spectral content extended to the resolution limit of the traces examined scaling properties to a minimum length scale of five data points, we now cannot expect to see such scaling properties at length scales smaller than our minimum feature size of 10 data points. Given this well-defined minimum feature size, it may be tempting to set our fine-scale analysis cutoff at 10 data points and expect to observe the desired scaling properties at all length scales greater than this. In practice, however, the effect of such spectral filtering is manifest in a fractal analysis even at length scales significantly greater than that of the minimum feature size.

The results of passing the 512-point Fourier-filtered fBm traces through the fractal analysis techniques under consideration are displayed in Figures 19 and 20, which illustrate the results obtained when applying fine-scale cutoffs of 10 data points (i.e., the traces’ minimum feature size) and 20 data points, respectively. In each of Figures 19 and 20, each data point represents the average $H_{\mathrm{out}}$ value measured via the corresponding analysis technique using the aforementioned cutoffs at the fine-scale limit and 1/5 of the entire trace as the coarse-scale cutoff limit. Each error bar represents one standard deviation in the measured values that were averaged to yield the corresponding data point. The dashed black line represents the ideal relation $H_{\mathrm{out}} = H_{\mathrm{in}}$, as discussed above.

Examples of the logarithmic scaling plots that yielded the data summarized in Figures 16–17 and 19–20 are provided in Figures 21–24. For purposes of illustration, each of these figures shows the logarithmic scaling plots produced by applying the corresponding fractal analysis technique to the specific pair of fBm traces illustrated in Figure 18. That is, each fractal analysis technique under consideration quantifies the fractal characteristic of the input trace by determining the slope of a best-fit line to a log–log scaling plot; Figures 21–24 provide examples of these logarithmic scaling plots.

Figure 18.

Comparison of a 512-point fBm trace with $H_{\mathrm{in}} = 0.5$ before (red) and after (blue) Fourier filtering to a minimum feature size of 10 points.

In each of Figures 21–24, the vertical dashed lines indicate the cutoffs between which the scaling plot is fitted with a straight line whose slope is measured to determine $H_{\mathrm{out}}$. For both traces in each of these figures, the coarse-scale analysis cutoff corresponds to the location of the line labeled “1/5 of trace.” The fine-scale analysis cutoff for the raw trace (red points) corresponds to the location of the line labeled “5 points” (corresponding to the data in Figure 17), while the fine-scale analysis cutoff for the filtered trace (blue points) may be chosen as 10 data points (corresponding to the data in Figure 19) or 20 data points (corresponding to the data in Figure 20), as represented by respective dashed vertical lines in Figures 21–24.

Figure 19.

Summarizing the fidelity of four fractal analysis methods in measuring the $H$ value for randomly-generated 512-point fBm traces with a minimum feature size of 10 points, using a fine-scale analysis cutoff of 10 data points. The scaling properties were observed over 1.01 orders of magnitude in length scale.

Figure 20.

Summarizing the fidelity of four fractal analysis methods in measuring the $H$ value for randomly-generated 512-point fBm traces with a minimum feature size of 10 points, using a fine-scale analysis cutoff of 20 data points. The scaling properties were observed over 0.71 orders of magnitude in length scale.

Figure 21.

Comparison of scaling plots produced by the variational box-counting method applied to a 512-point fBm trace with $H_{\mathrm{in}} = 0.5$ before (red) and after (blue) Fourier filtering to a minimum feature size of 10 points.

Figure 22.

Comparison of scaling plots produced by the variance method applied to a 512-point fBm trace with $H_{\mathrm{in}} = 0.5$ before (red) and after (blue) Fourier filtering to a minimum feature size of 10 points.

Figure 23.

Comparison of scaling plots produced by the Dubuc variation method applied to a 512-point fBm trace with $H_{\mathrm{in}} = 0.5$ before (red) and after (blue) Fourier filtering to a minimum feature size of 10 points.

Figure 24.

Comparison of scaling plots produced by the adaptive fractal analysis method applied to a 512-point fBm trace with $H_{\mathrm{in}} = 0.5$ before (red) and after (blue) Fourier filtering to a minimum feature size of 10 points.

8. Conclusions

Contrasting the trends displayed in Figures 19 and 20 with those displayed in Figures 16 and 17 highlights the inherent challenge in assessing the fractal properties of time-series structures that suffer from limited total length and/or limited resolution/spectral content. Indeed, accommodating the impact of a minimum feature size that is significantly in excess of the trace’s resolution limit generally necessitates restricting a fractal analysis to length scales larger still than even this observed minimum feature size. This in turn often restricts an analysis of scaling properties to a consideration of relatively few orders of magnitude in length. For example, performing a fractal analysis of a 512-point Fourier filtered trace using analysis cutoffs corresponding to 10 data points and 1/5 of the trace length corresponds to an analysis of the scaling behavior over barely more than one order of magnitude in length scale; attempting to increase the accuracy of the measurement by raising the fine-scale cutoff to 20 data points further reduces the scaling range to 0.71 orders of magnitude.
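The scaling ranges quoted above follow directly from the analysis cutoffs, as the short Python calculation below illustrates. The helper name `scaling_range_decades` is hypothetical; the third call, for a 16,384-point trace with a five-point fine-scale cutoff, is included only for comparison.

```python
import math

def scaling_range_decades(trace_length, fine_cutoff, coarse_fraction=0.2):
    """Orders of magnitude between the fine- and coarse-scale cutoffs."""
    return math.log10(coarse_fraction * trace_length / fine_cutoff)

print(round(scaling_range_decades(512, 10), 2))    # 1.01
print(round(scaling_range_decades(512, 20), 2))    # 0.71
print(round(scaling_range_decades(16384, 5), 2))   # 2.82
```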

Moreover, Figures 21–24 demonstrate the difficulty in identifying an appropriate fine-scale cutoff for fractal analysis of a time-series trace, even when the minimum feature size found in the trace is easily identifiable and/or well-defined. The examples of Figures 21–24 further highlight an important distinction between the application of fractal analysis techniques to spatial and time-series fractals. In the case of spatial fractals, it often is reasonable to expect to observe fractal scaling behavior between the length scales corresponding to physical constraints (and in particular at length scales sufficiently far from these cutoffs). By contrast, and as seen in Figures 21–24, the effect of imposing (or observing) a finite minimum feature size on a time-series trace is evident at all scales, not just at those smaller than the minimum observed period. Accordingly, and as further illustrated in Figures 21–24, this effect may impact the slope of a best-fit line to a logarithmic scaling plot (and, hence, the measured fractal dimension) even when this slope is evaluated between cutoffs that are expected to compensate for the fine-scale limitation.

In light of these results, one must take care when applying these analysis techniques to data sets limited in length or spectral content, as it may be difficult to make a compelling argument for the empirical presence of fractal behavior when examining such a narrow range of length scales. Nevertheless, it is instructive to examine the behavior of fractal analysis applied to known fractal structures such as fBm traces that have been artificially subjected to such constraints. For example, one may argue that an fBm trace that is Fourier filtered to exhibit a coarser minimum feature size is analogous to a natural structure or phenomenon that has been subjected to exterior influences such as weathering effects or measurement limits: both may be considered examples of structures that are legitimately generated via processes associated with fractal behavior, but whose true fractal nature has been obfuscated by secondary considerations. In the eyes of the authors, such effects do not necessarily render the resulting structures “less fractal” than their idealized counterparts. Nevertheless, such effects demand careful consideration when choosing an analysis method and an acknowledgment of the inherent limitations thereof.

Acknowledgments

The authors wish to thank Drs. Adam Micolich, Rick Montgomery, Billy Scannell, and Matthew Fairbanks for fruitful discussions. Generous support for this work was provided by the WM Keck Foundation.

Notes

  1. Though Mandelbrot discusses the concept of fractional dimension in this 1967 paper, he did not introduce the term “fractal” until 1975 [3].
  2. While the box-counting method is typically applied to structures embedded in two dimensions, it is straightforward to generalize the technique to higher- or lower-dimensional systems.
  3. Not to be confused with the variational box-counting method.
  4. In all discussions of time-series traces, we refer to the independent variable as “time” as a matter of convention unless otherwise specified. Additionally, as a matter of convention, we refer to an interval of the independent variable as a “length” unless otherwise specified.
  5. Not to be confused with the variational box-counting method or the variance method.
  6. Note that this relation only applies to time-series fractals, since the notion of a Hurst exponent is undefined for spatial fractals.
  7. As discussed above, such a conversion is at best an approximation. Nonetheless, utilizing this conversion serves as a self-consistent means of evaluating the response of this analysis technique when applied to fBm traces of a known Hurst exponent, as well as deviations from this behavior.

How to cite and reference

Ian Pilgrim and Richard P. Taylor (November 20th 2018). Fractal Analysis of Time-Series Data Sets: Methods and Challenges, Fractal Analysis, Sid-Ali Ouadfeul, IntechOpen, DOI: 10.5772/intechopen.81958. Available from:
