Table of results for the box counting method application.

## 1. Introduction

In recent years, advances in both texture analysis algorithms and computer technology have boosted the development of new algorithms that quantify the textural properties of an image, particularly in medical imaging. Promising results have shown that texture analysis methods can extract diagnostically meaningful information from medical images obtained with various imaging modalities, such as positron emission tomography (PET) and magnetic resonance imaging (MRI). Among texture analysis techniques, fractal geometry has become a tool in medical image analysis. In fact, the concept of fractal dimension can be used in a large number of applications, such as shape analysis[1] and image segmentation[2]. Interestingly, even though self-similarity can hardly be verified in biological objects imaged at a finite resolution, certain similarities at different spatial scales are quite evident. The fractal dimension makes it possible to describe and characterize the complexity of images or, more precisely, of their texture composition.

## 2. Fractals

### 2.1. Fractal geometry

A fractal is a geometrical object characterized by two fundamental properties: *self-similarity* and the *Hausdorff-Besicovitch dimension*. A self-similar object is exactly or approximately similar to a part of itself and can be continuously subdivided into parts, each of which is (at least approximately) a reduced-scale copy of the whole. Furthermore, a fractal generally shows irregular shapes that cannot be simply described by a Euclidean dimension; a fractal dimension (fd) is needed instead.

Nature presents a large variety of fractal forms, including trees, rocks, mountains, clouds, biological structures, water courses, coastlines and galaxies[3]. Moreover, it is possible to construct mathematical objects that satisfy the condition of self-similarity and that have a fractal dimension fd (Figure 1).

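The objects in Figure 1 are self-similar, since a part of the object is similar to the whole, and the fractal dimension can be calculated by the equation:

$$fd = \frac{\log N}{\log (1/r)}$$

where $N$ is the number of self-similar copies that make up the object and $r$ is the scaling ratio of each copy with respect to the whole.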
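As a quick numerical check of the self-similarity relation, the short Python sketch below evaluates $\log N / \log(1/r)$ for two classical objects. The function name `similarity_dimension` and the choice of examples are illustrative assumptions, not part of the original chapter.

```python
import math

def similarity_dimension(n_copies, scale_ratio):
    """fd = log(N) / log(1/r) for N copies, each scaled by r (0 < r < 1)."""
    if n_copies < 1 or not 0.0 < scale_ratio < 1.0:
        raise ValueError("need n_copies >= 1 and 0 < scale_ratio < 1")
    return math.log(n_copies) / math.log(1.0 / scale_ratio)

# Koch curve: 4 copies, each reduced to 1/3 of the whole
koch = similarity_dimension(4, 1 / 3)
# Sierpinski triangle: 3 copies, each reduced to 1/2 of the whole
sierpinski = similarity_dimension(3, 1 / 2)
```

Rounded to four decimals, the two values agree with the theoretical fd values commonly quoted for these fractals (1.2619 and 1.5849).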

In mathematics, no universal definition of fd exists, and the several definitions of fd may lead to different results for the same object. Among the wide variety of fd definitions that have been introduced, the Hausdorff dimension $D_H$ is considered in the next section.

### 2.2. Hausdorff dimension $D_H$

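The Hausdorff formulation[3] of the dimension $D_H$ is based on the construction of a particular measure, $\mu_d$, which depends on a real exponent $d \ge 0$.

Intuitively we can sum up the construction as follows: let $\{A_i\}$ be a covering of the set $E$ made of elements whose diameters satisfy $\delta_i = \mathrm{diam}(A_i) \le \delta$.

We define the Hausdorff measure as the function

$$\mu_d(E, \delta) = \inf_{\{A_i\}} \sum_i \delta_i^{\,d} \qquad (3)$$

with the infimum taken over all such coverings.

We obtain an approximate measurement of $E$ at the scale $\delta$, the *coarse-grained volume*[4].

In the one-dimensional case ($d = 1$) this is the length measured with a ruler of size $\delta$, which for a fractal curve keeps growing as $\delta$ decreases (*coastline paradox*[3]).

Hence, when $\delta \to 0$, the limit $\mu_d(E) = \lim_{\delta \to 0} \mu_d(E, \delta)$ is infinite for $d < D_H$ and zero for $d > D_H$.

Therefore,

$$D_H = \inf \{\, d : \mu_d(E) = 0 \,\} \qquad (5)$$

with $D_H$ the Hausdorff dimension of $E$.

The *coarse-grained volume* defined by Eq. 3 normally presents a scaling like:

$$\mu_d(E, \delta) \sim \delta^{\,d - D_H}$$

that provides a method to estimate the dimension $D_H$.

In the uni-dimensional case ($d = 1$) the measured length $L(\delta)$ scales as

$$L(\delta) \sim \delta^{\,1 - D_H}$$

from which we derive

$$D_H = 1 - \lim_{\delta \to 0} \frac{\log L(\delta)}{\log \delta}$$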

## 3. Methods

Although the definition of the Hausdorff dimension is particularly useful for defining the fd operationally, it is difficult to implement. In fact, determining the lower bound over all coverings, on which the definition in Eq. 5 relies, can be quite complex. For example, consider the uni-dimensional case, in which we want to compute the fd of a coastline (Koch curve). According to Eq. 3, in the one-dimensional case the measured length grows without bound as the covering elements shrink, whereas in the two-dimensional case the measured area tends to zero.

This discussion implies that our coastline (e.g. the Koch curve) will have an fd value greater than one and less than two. For this reason, the fd is considered as the transition point (the lower bound value in Eq. 5) between a diverging measure and a vanishing one.

Several computational approaches have been developed to avoid the need to determine this lower bound. Many strategies compute the fd from the scaling of the object's bulk with its size. In fact, the object's bulk and its size have a linear relationship on a logarithmic scale, so that the slope of the best-fitting line may provide an accurate estimate of this relationship. By using this log-log graph, called *Richardson's plot*, the requirement of knowing the infimum over all coverings is relaxed.
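The slope of a Richardson's plot can be obtained with an ordinary least-squares fit on the logarithms of the data. The Python sketch below is a minimal illustration using only the standard library; the function name `loglog_slope` and the synthetic power-law data are assumptions made for the example.

```python
import math

def loglog_slope(sizes, measures):
    """Least-squares slope and intercept of log(measure) versus log(size)."""
    if len(sizes) != len(measures) or len(sizes) < 2:
        raise ValueError("need at least two (size, measure) pairs")
    xs = [math.log(s) for s in sizes]
    ys = [math.log(m) for m in measures]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data that follow an exact power law: measure = 5 * size**(-1.26)
sizes = [1, 2, 4, 8, 16, 32]
measures = [5.0 * s ** -1.26 for s in sizes]
slope, intercept = loglog_slope(sizes, measures)
```

On real data the points only approximately follow a line, so the range of sizes included in the fit affects the estimated slope.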

Several approaches have been developed to estimate the fractal dimension of images. In particular, this section introduces two fractal analysis strategies: the *Box Counting Method* and the *Hand and Dividers Method*.

These methods overcome the problem by choosing a simple fixed rectangular grid as the covering, which yields an upper bound on the Hausdorff dimension.

Five algorithms for a practical fd calculation based on these methods will also be presented.

### 3.1. Box counting method

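The most popular method using the best-fitting procedure is the so-called *Box Counting Method*[5][6]. Given a fractal structure $A$ embedded in a $d$-dimensional space, the method covers the space with a grid of boxes of side $r$.

The number $N(r)$ of boxes that contain at least one point of $A$ scales as

$$N(r) \sim r^{-D}$$

The box counting algorithm hence counts the number $N(r)$ of non-empty boxes for different values of $r$ and estimates $D$ as the slope of the best-fitting line of $\log N(r)$ versus $\log (1/r)$.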

Figure 2 shows the Box counting method for the Koch Curve.

Several algorithms[7][8][9] based on the box counting method have been developed and widely used for fd estimation, as the method can be applied to sets with or without self-similarity. However, when computing the fd with this method, a box is either counted or not according to whether it contains some points or none. No provision is made for weighting a box by the number of points of the fractal that fall inside it.

### 3.2. Hand and dividers method

Useful features and information can be deduced from the contours of structures belonging to an image, and there are a number of techniques that can be used to estimate the boundary fractal dimension.

The most popular methods are all based on the *Hand and Dividers Method*, which was originally introduced by Richardson[10] and subsequently developed by Mandelbrot[11].

The Richardson method employs the so-called *walking technique* consisting of "walking" around the boundary of the structure with a given step length.

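The actual structure boundary is thus approximated by a polygon whose length is equal to:

$$P(\varepsilon) = N(\varepsilon)\,\varepsilon$$

In a nutshell, it corresponds to the length $\varepsilon$ of the single step multiplied by the number $N(\varepsilon)$ of steps needed to complete the walk.

The process is then repeated for different step lengths $\varepsilon$.

The object's boundary fd, $D$, is obtained from the scaling

$$P(\varepsilon) \propto \varepsilon^{\,1 - D}$$

where $1 - D$ is the slope of the Richardson's plot of $\log P(\varepsilon)$ versus $\log \varepsilon$.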

The perimeter length of the boundary depends on the step length used so that a large step provides a rough estimation of the perimeter whereas a smaller step can take into account finer details of the contour.

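Consequently, if the step length decreases, the measured perimeter increases.

In practice, the perimeter length is obtained by constructing a generally irregular polygon which approximates the border. Let a starting point be chosen on the boundary; the next vertex of the polygon is the boundary point whose distance from the starting point is as close as possible to the step length.

The reached point then becomes the new starting point and is used to locate the next point on the boundary that satisfies the previous condition. This process is repeated until the initial starting point is reached.

The sum of all distances between consecutive vertices gives the perimeter of the polygon for that step length.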

The perimeters of the polygons obtained at the different step lengths are used to build the Richardson's plot, and the slope of its best linear fit is used to estimate the fd.
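The walking procedure can be sketched in a few lines of Python. The code below is a simplified illustration, not the authors' Matlab implementation: the boundary is assumed to be an ordered list of `(x, y)` points, the vertex chosen at each move is the boundary point whose distance from the current one is closest to the nominal step, and the fd is taken as one minus the slope of the log-log fit. The names `divider_perimeter` and `fit_slope` are hypothetical. A smooth circle is used as a test boundary, for which an fd close to 1 is expected.

```python
import math

def divider_perimeter(points, step):
    """Perimeter of the polygon obtained by walking a closed boundary."""
    n = len(points)
    current = 0
    perimeter = 0.0
    while True:
        j = current + 1
        # advance until the chord from the current point reaches the step
        while j < n and math.dist(points[current], points[j]) < step:
            j += 1
        if j >= n:
            # boundary exhausted: close the polygon on the starting point
            perimeter += math.dist(points[current], points[0])
            return perimeter
        # keep the neighbour (j - 1 or j) whose chord is closer to the step
        if j - 1 > current:
            shorter = math.dist(points[current], points[j - 1])
            longer = math.dist(points[current], points[j])
            if step - shorter < longer - step:
                j -= 1
        perimeter += math.dist(points[current], points[j])
        current = j

def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Test boundary: a finely sampled unit circle (a smooth curve)
n_points = 4000
circle = [(math.cos(2 * math.pi * k / n_points),
           math.sin(2 * math.pi * k / n_points)) for k in range(n_points)]
steps = [0.02, 0.04, 0.08, 0.16, 0.32]
perimeters = [divider_perimeter(circle, s) for s in steps]
slope = fit_slope([math.log(s) for s in steps],
                  [math.log(p) for p in perimeters])
fd_circle = 1.0 - slope
```

For a fractal boundary the perimeter keeps growing as the step shrinks, so the slope becomes clearly negative and the estimated fd exceeds 1.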

## 4. Algorithms

All Hand and Dividers techniques rely on the same principle: approximating the border perimeter with different polygons. However, since the coordinates of the border points are discrete, the implemented methods differ in how they choose the point whose distance best approximates the step length.

The following two methods implement two different choices for overcoming this issue.

### 4.1. HYBRID algorithm

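The HYBRID algorithm is a computer implementation of the Hand and Dividers method developed by Clark[12]. Let the boundary be given as an ordered set of digitized points and let a step length be fixed.

Given an arbitrary *starting point*, which acts as the first *current point*, a *running point* is moved along the boundary away from it.

The program then searches for the running point whose distance from the current point is closest to the step length; this distance may be slightly shorter or longer than the step itself.

Afterward, the computed distance between these two points is added to the perimeter and the running point becomes the new current point.

The procedure continues until the initial starting point is reached. It is likely that after a complete walk the starting point is not reached exactly, so that the last step is generally incomplete.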

### 4.2. EXACT algorithm

The EXACT algorithm was first proposed by Clark in 1986[12]. As will be shown, this method requires a longer computational time while providing a simpler solution to the choice of the best current points.

As for the previous method, the entire perimeter estimation is summarized in the flow chart of Figure 5.

The procedure is very similar to the one used for the previous method. As before (see Figure 5), the end of the step may not coincide with the digitized coordinates of the boundary.

The EXACT method overcomes this problem by assuming piecewise linearity, meaning that all the points on the contour can be joined by a series of straight lines[13, 14] (see Figure 6 (a)).

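The location of the next current point is therefore not restricted to the digitized coordinates: it may lie on the straight line joining two adjacent boundary points.

The procedure starts from an arbitrary starting point, which is taken as the first *current point*, while a *running point* moves along the contour.

The distance from the current point to each point on the contour line is then calculated until the step length is exceeded.

The exact position of the point lying at one step length from the current point is then found on the straight line joining the last two running points.

The process is stopped when we come back to the initial starting point.

The point reached by the last complete step generally does not coincide with the starting point, which leaves a final incomplete step.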

The perimeter length of the polygon is found by adding the final incomplete step length to the sum of the other step lengths needed to entirely cover the boundary.
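Under the piecewise linearity assumption, locating the end of a step reduces to intersecting a circle of radius equal to the step length with a straight segment of the contour. The Python sketch below illustrates this single geometric operation; the function name `point_at_step` and its argument names are hypothetical, and the code assumes that the first segment endpoint lies closer than one step from the current point while the second lies at least one step away.

```python
import math

def point_at_step(current, inside, outside, step):
    """Point on the segment inside -> outside at distance `step` from `current`.

    Assumes |current - inside| < step <= |current - outside|.
    """
    dx, dy = outside[0] - inside[0], outside[1] - inside[1]
    fx, fy = inside[0] - current[0], inside[1] - current[1]
    # solve |inside + t * (outside - inside) - current|^2 = step^2 for t
    a = dx * dx + dy * dy
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - step * step   # negative: `inside` is within the step
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return (inside[0] + t * dx, inside[1] + t * dy)

p1 = point_at_step((0.0, 0.0), (0.5, 0.0), (2.0, 0.0), 1.0)
p2 = point_at_step((0.0, 0.0), (0.0, 0.6), (2.0, 0.6), 1.0)
```

Because the first endpoint lies inside the circle, the quadratic has exactly one positive root, which is the one selected here.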

The procedure is then repeated for different step lengths[15].

The results, i.e., perimeter lengths versus step lengths, are plotted on a log-log Richardson's plot. From the slope of the fitting line on the Richardson's plot we obtain the fd of the examined boundary[17, 12, 16, 18, 19, 1, 20, 21, 4].

### 4.3. Box-counting algorithm

The box-counting algorithm relies on the basic idea of covering a given digital binary image with a set of measuring boxes of different sizes and counting, for each size, the boxes that are not empty.

Figure 8 shows the flow chart for box-counting fd estimation for different box sizes. Moreover, since the procedure of size scaling must divide the image into a whole number of boxes at every scale, the image may need to be enlarged.

Therefore the final image is obtained by padding, for example with the *padarray* Matlab function.
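A minimal Python sketch of this procedure is given below. It is an illustration under stated assumptions rather than the implementation described in the chapter: the binary image is a list of rows of 0/1 values, padding is done with zeros up to a square whose side is a power of two (mimicking what *padarray* would be used for), and the box side is halved at each iteration. All function names are hypothetical. The test image is a discrete Sierpinski pattern, for which the counts follow an exact power law.

```python
import math

def pad_to_power_of_two(image):
    """Zero-pad a binary image to a square whose side is a power of two."""
    rows, cols = len(image), max(len(row) for row in image)
    side = 1
    while side < max(rows, cols):
        side *= 2
    padded = [list(row) + [0] * (side - len(row)) for row in image]
    padded += [[0] * side for _ in range(side - rows)]
    return padded

def box_counts(image):
    """Box sizes side/2, ..., 2, 1 and the number of non-empty boxes for each."""
    img = pad_to_power_of_two(image)
    side = len(img)
    occupied = [(r, c) for r in range(side) for c in range(side) if img[r][c]]
    sizes, counts = [], []
    size = side // 2
    while size >= 1:
        sizes.append(size)
        # a box is counted once if at least one foreground pixel falls in it
        counts.append(len({(r // size, c // size) for r, c in occupied}))
        size //= 2
    return sizes, counts

def box_counting_fd(image):
    """fd as minus the least-squares slope of log N(r) versus log r.

    Assumes a non-empty image with a padded side of at least 4 pixels.
    """
    sizes, counts = box_counts(image)
    xs = [math.log(s) for s in sizes]
    ys = [math.log(n) for n in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Discrete Sierpinski pattern: pixel (r, c) is set when r AND c == 0
side = 64
sierpinski = [[1 if (r & c) == 0 else 0 for c in range(side)]
              for r in range(side)]
fd_sierpinski = box_counting_fd(sierpinski)
```

For this pattern the number of non-empty boxes triples each time the box side is halved, so the fit returns $\log 3 / \log 2$.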

### 4.4. Differential Box-counting algorithm (DBC)

The box counting method is an extremely powerful tool for fd computation; in fact, it is easy to implement as well as flexible and robust.

However, a major limitation lies in the fact that counting nonempty boxes restricts its use to binary images rather than gray scale ones. An extension of the standard approach to gray scale images, called *Differential Box Counting (DBC)*, was proposed in 1994 by N. Sarkar and Chaudhuri[8].

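In the DBC method, a gray level image of size $M \times M$ is regarded as a surface in a three-dimensional space, where $(x, y)$ denotes the pixel position and the third coordinate $z$ denotes the gray level.

As for the standard box counting, the $(x, y)$ plane is partitioned into non-overlapping blocks of size $s \times s$.

Then, the scale of each block is $r = s / M$. On each block there is a column of boxes of size $s \times s \times s'$, where the box height $s'$ is chosen so that $\lfloor G / s' \rfloor = \lfloor M / s \rfloor$, with $G$ the total number of gray levels.

Let numbers $1, 2, \ldots$ be assigned to the boxes of each column, and let the minimum and maximum gray levels of the $(i, j)$-th block fall into the $k$-th and $l$-th boxes, respectively.

The number of boxes covering this block is calculated as:

$$n_r(i, j) = l - k + 1 \qquad (15)$$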

In Figure 9 for example

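Extending to the contribution from all blocks:

$$N_r = \sum_{i, j} n_r(i, j) \qquad (16)$$

Eq. 16 is computed for different box sizes, and the fd is estimated from the least-squares linear fit of $\log N_r$ versus $\log (1/r)$.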

A matlab implementation of DBC can make use of functions such as
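For illustration, the counting step of DBC can be written in plain Python as below. This is a sketch under assumptions that are not taken from the chapter: the image is a square list of rows with integer gray levels in `[0, gray_levels)`, its side is a power of two, the block sizes are the powers of two from 2 to half the side, and the box height is taken as `s * gray_levels / M`. The names `dbc_counts` and `dbc_fd` are hypothetical.

```python
import math
import random

def dbc_counts(image, gray_levels=256):
    """Return the lists of 1/r and N_r for block sizes s = 2, 4, ..., M/2."""
    m = len(image)
    scales, totals = [], []
    s = 2
    while s <= m // 2:
        height = s * gray_levels / m          # assumed box height s'
        total = 0
        for top in range(0, m, s):
            for left in range(0, m, s):
                block = [image[r][c]
                         for r in range(top, top + s)
                         for c in range(left, left + s)]
                k = int(min(block) // height) + 1   # box of the minimum level
                l = int(max(block) // height) + 1   # box of the maximum level
                total += l - k + 1                  # n_r(i, j)
        scales.append(m / s)                  # 1/r, with r = s / M
        totals.append(total)                  # N_r
        s *= 2
    return scales, totals

def dbc_fd(image, gray_levels=256):
    """fd as the least-squares slope of log N_r versus log(1/r)."""
    scales, totals = dbc_counts(image, gray_levels)
    xs = [math.log(v) for v in scales]
    ys = [math.log(v) for v in totals]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A flat image is a smooth surface; uniform noise is a very rough one
flat = [[100] * 32 for _ in range(32)]
rng = random.Random(0)
noise = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
fd_flat = dbc_fd(flat)
fd_noise = dbc_fd(noise)
```

A constant image needs exactly one box per block at every scale, so its estimated fd equals 2, while the noise image gives a larger value.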

The DBC procedure has some weak points in the method used to select an appropriate box height[7], since the values of

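Secondly, the box number calculation may overestimate the number of boxes needed to cover the surface. Let two pixels of a block carry its minimum and maximum gray levels, with the two levels lying on opposite sides of the boundary between two adjacent boxes.

According to the DBC procedure, the two pixels are assigned to boxes 2 and 3. The distance between their gray levels is nevertheless smaller than the box height.

Hence, when calculating Eq. 15, the block could be covered by a single box, but its pixels with minimum and maximum gray levels fall into two different boxes.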

To solve the aforementioned problems, some modifications were proposed by J. Li, Q. Du and C. Sun[7]. Given a digital image

In particular, let

As a result, the errors introduced using

Moreover, the use of

with ceil(·) denoting the function that rounds its argument up to the nearest integer greater than or equal to it.

Eq. 18 provides a new way to count the number of boxes that cover the $(i, j)$-th block.

As an example, suppose that the

According to Eq. 18 the number of counted boxes is

As in the standard box counting method, after determining the number $n_r(i, j)$ for each block, the total number of boxes is obtained by summing the contributions of all blocks.
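The overestimation described above can be reproduced with a tiny numerical example. In the Python sketch below, `boxes_original` follows the rule of Eq. 15, while `boxes_range_based` is an assumed form of a count based on the gray-level range and the ceiling function; it is meant only to illustrate the idea and is not claimed to reproduce Eq. 18 exactly. The gray levels and the box height are hypothetical values.

```python
import math

def boxes_original(g_min, g_max, height):
    """Eq. 15 style count: boxes are numbered 1, 2, ... from gray level 0."""
    k = int(g_min // height) + 1   # box containing the minimum gray level
    l = int(g_max // height) + 1   # box containing the maximum gray level
    return l - k + 1

def boxes_range_based(g_min, g_max, height):
    """Assumed alternative: count from the gray-level range, rounded up."""
    if g_max == g_min:
        return 1
    return math.ceil((g_max - g_min) / height)

# Hypothetical block: box height 10, minimum level 18 (box 2), maximum 23 (box 3)
n_original = boxes_original(18, 23, 10)
n_range = boxes_range_based(18, 23, 10)
```

With these values the original rule counts two boxes, although the gray-level range (5) is smaller than the box height (10), whereas the range-based count returns a single box.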

## 5. Applications and discussion

Each described method has been implemented in Matlab 2010a and applied to either well-known fractals or biomedical images.

The results for the hand and dividers methods are shown in Table 1. The computed values are also compared with the theoretical fd values. The computational time on a 2.50 GHz i5 CPU is also shown.

The value ranges for the step size are not displayed, but they were chosen automatically from the structure's maximum caliper diameter, defined as the major axis of an ellipse in which the structure can be embedded. The range was then running from the 40

In practice, both EXACT and HYBRID methods computed the different step sizes by scaling each time the maximum step by

The uncertainty of the estimated parameter is also shown in Table 1; it is calculated from the fitting accuracy of a standard linear regression.

The number of data points used in the Richardson’s plot was about 60 and two examples of that computation using EXACT and HYBRID are shown in Figure 12.

Table 2 shows the results for the box counting method. The displayed quantities are the same as in Table 1, with the exception of the box counting uncertainty. In fact, the way an image is partitioned into boxes may affect the final count of nonempty boxes.

To investigate the variability of the fd for different box partitioning layouts, random box subdivisions have been applied. Therefore, the results in Table 2 show the mean value and the standard deviation of the computed fds for each fractal at issue. In general, this variability is more pronounced in images with coarser resolution.
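One possible way to set up such a test is sketched below in Python. The details are assumptions made for illustration, since the chapter does not specify how the random subdivisions were generated: here the grid origin is shifted by a random offset at every scale, the foreground is given as a list of pixel coordinates, and 20 repetitions are used. The function names are hypothetical.

```python
import math
import random
import statistics

def shifted_box_count(pixels, size, dx, dy):
    """Number of non-empty boxes when the grid origin is shifted by (dx, dy)."""
    return len({((r + dy) // size, (c + dx) // size) for r, c in pixels})

def fd_with_random_grid(pixels, side, rng):
    """Box counting fd with an independent random grid offset at each scale."""
    xs, ys = [], []
    size = side // 2
    while size >= 1:
        dx, dy = rng.randrange(size), rng.randrange(size)
        xs.append(math.log(size))
        ys.append(math.log(shifted_box_count(pixels, size, dx, dy)))
        size //= 2
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Foreground pixels of a 64 x 64 discrete Sierpinski pattern
side = 64
pixels = [(r, c) for r in range(side) for c in range(side) if (r & c) == 0]
rng = random.Random(0)   # fixed seed so that the run is repeatable
estimates = [fd_with_random_grid(pixels, side, rng) for _ in range(20)]
fd_mean = statistics.mean(estimates)
fd_std = statistics.stdev(estimates)
```

On such a small image the largest boxes are few, so the grid position changes their count noticeably; this is consistent with the larger variability observed at coarser resolution.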

Table 2 (box counting method):

| Fractal | Theoretical fd | Computed fd | Time | Uncertainty | Size |
|---|---|---|---|---|---|
| Apollonian Gasket | 1.3057 | 1.408 | 1.5 | 0.001 | 2000 |
| Sierpinski | 1.5849 | 1.587 | 0.3 | 0.005 | 1000 |
| Dragon | 2.0000 | 1.747 | 7.2 | 0.006 | 3670 |
| Hexaflake | 1.7719 | 1.640 | 1.6 | 0.011 | 1050 |

Table 1 (hand and dividers methods):

| Fractal and method | Theoretical fd | Computed fd | Time | Uncertainty | Size |
|---|---|---|---|---|---|
| Twin Dragon Hybrid | 1.5236 | 1.466 | 8.6 | 0.006 | 117005 |
| Twin Dragon Exact | 1.5236 | 1.465 | 11.5 | 0.006 | 117005 |
| Dragon Hybrid | 1.5236 | 1.474 | 11.1 | 0.005 | 115665 |
| Dragon Exact | 1.5236 | 1.462 | 12.8 | 0.004 | 115665 |
| Koch Hybrid | 1.2619 | 1.276 | 31.2 | 0.004 | 786433 |
| Koch Exact | 1.2619 | 1.260 | 154.9 | 0.003 | 786433 |
| Gosper Hybrid | 1.1292 | 1.133 | 3.8 | 0.001 | 23280 |
| Gosper Exact | 1.1292 | 1.128 | 4.7 | 0.001 | 23280 |

In general, the EXACT and HYBRID methods appeared to be more precise than the box counting method, but they have a narrower range of applicability. This wider applicability is also the reason for the popularity of box counting methods compared with the others. In addition, the HYBRID technique is computationally less expensive than EXACT, especially when the number of border points is large. In HYBRID, however, the use of a variable step length, which can be shorter or longer than the fixed step size, leads to a larger variability and thus to a Richardson's plot with a less accurate fit. This affects the uncertainty of the estimated parameter. Because of that, a more careful choice of the step size range is needed for the HYBRID method.

Importantly, the choice of the starting point may also affect the perimeter value, since all the following current points depend on it. A test on 80 random starting points for the Gosper Island fractal revealed that the fd computed with the HYBRID method was more stable than the one computed with EXACT.

As for the walking method, in box counting the process of scaling down from the maximum box size is limited by the pixel size, so in principle a coarse resolution might be the reason for a bad estimate of the fd. It is noteworthy that the tests performed do not show any correlation between resolution and fd accuracy; this may also be because some fractal images, such as the dragon, do not reproduce the real fractal at small scales.

An application of the DBC method to an X-ray image is also shown in Figure 13, where a breast cancer mammography image has been processed. The method uses a sliding technique as implemented in

The second DBC method shows higher contrast in the area of the cancer and consequently lower fd values. Due to the enormous amount of linear fitting performed for an image size of 3450

## 6. Conclusions

This chapter has described some of the most widely used and robust methods for fractal dimension estimation, together with their performance. For a few of them, a detailed description of the algorithm has also been given, to make it easier for a beginner to start implementing their own Matlab code. Computational time is not so long as to require compiled functions such as C-MEX files, although these can be an advantage with very high resolution images. The use of the described algorithms is not restricted to image processing: with some changes, they can be applied to other kinds of data analysis.