Table 1. Very high resolution (VHR) satellites (1999–2016), with their panchromatic (Px) and multispectral (Mx) spatial resolutions and their multispectral bands. Note that none of these satellites offers VHR (sub-meter) resolution in its multispectral bands.
The need for detailed knowledge of forest resources has led researchers to develop efficient methods for extracting detailed information about trees. Since 1999, orbital remote sensing has been providing very high resolution (VHR) image data. The new generation of satellites allows individual tree crowns to be identified visually. The increase in spatial resolution has also had a profound effect on image processing techniques and has motivated the development of new object-based procedures to extract information. Tree crown detection has become a major area of research in image analysis, given the complex nature of trees in an uncontrolled environment. This chapter is subdivided into two parts. Part I offers an overview of the state of the art in computer detection of individual tree crowns in VHR images. Part II presents a new hybrid approach developed by the authors that integrates geometrical-optical modeling (GOM), marked point processes (MPP), and template matching (TM) to individually detect tree crowns in VHR images. The method is presented for two different applications: isolated tree detection in an urban environment and automatic tree counting in orchards. The average performance rate was 82% for tree detection and above 90% for tree counting in orchards.
- Tree crown detection
- VHR image
- Template matching
- Marked point process
- Watershed segmentation
- Local maxima
- Region growing
1. Introduction
Forest inventories are performed to support management and conservation activities in rural or urban forests, or even in tree plantations. The traditional method of obtaining information on forest communities relies on systematic or random sampling, or on sampling stands, so that the final parameters for the population are obtained by statistical extrapolation [1, 2]. Usually, the following parameters are determined for each tree included in the sampling: location, diameter at breast height (DBH), basal area (BA), height, identification of the species, crown size, and crown closure. Based on these measurements, other parameters such as volume of wood and biomass can be derived for the community stand. Because every sampled tree must be measured in the field, these survey techniques are expensive, time consuming, and unsuited to large areas.
Remote sensing with high spatial resolution is a cost-effective and reliable way to obtain information about trees. It may be the only practical means of supplying the information needed for the sustainable management of forests, such as biochemical and biophysical data on the vegetation, in a synoptic and repetitive manner for large areas and over long periods of time. The tree crown is the basis of the data required for the inventory, for it allows the determination of not only its size but also its position, crown closure, and, in some cases, the species. It also allows the derivation of parameters such as the density of the population, the health condition of the trees, the volume, the biomass, and the carbon sequestration rates [3–6]. This information is crucial to a series of applications such as the inventory and management of forested areas as well as of parks and urban forests. It can also be used for counting and monitoring trees in orchards or under power lines to prevent damage and accidents.
The study of individual trees with remote sensing started with the use of aerial photography with very high spatial resolution (scale greater than 1:10,000), driven mainly by the use of stereoscopy techniques. The task was performed by photointerpreters trained to recognize individual tree species, extract a series of measurements, or evaluate different types of damage. The use of orbital optical remote sensing data for forest studies began in the 1970s, with the development of techniques to separate forested from non-forested areas. The spatial resolution of these satellite images was the main limiting factor for more detailed studies of forests, and as a result, the studies remained focused on the disturbances affecting forests (such as land clearing, burning, diseases, and pests) or on estimating some biophysical parameters of the vegetation [3,8]. It was only toward the end of the 1990s that orbital remote sensing began to provide very high resolution (VHR) data with a spatial resolution under 1 m, allowing the study of individual trees. Launched in 1999, Ikonos was the first of what is now a series of VHR satellites (Table 1), consolidating the use of orbital data for the study of individual trees. However, the increase in spatial resolution was not always accompanied by an increase in spectral resolution: VHR data are often restricted to a single panchromatic band.
The increase in spatial resolution changed the focus of many remote sensing studies, which started to analyze not only classes of objects but also each object individually. Branches and irregularities within the crowns became visible, and as a result, the spectral response of a tree is influenced by variations in the shape of the crown (differential illumination) and by background effects. This increases the intra-class variance and often reduces the accuracy of conventional pixel-based classification. This had a significant effect on the image processing techniques for forest studies and prompted the development of new forms of information extraction.
Within the study of individual objects, the automatic detection and delineation of tree crowns using remote sensing VHR imagery have attracted much attention from researchers in forestry and computer vision [4,7]. Researchers have developed several automatic and semiautomatic methods for extracting individual trees and their characteristics using digital aerial photos of various types and VHR satellite images . The applications range from the identification of tree crowns to their delineation and classification and are often based on image segmentation algorithms and other advanced image processing and analysis techniques [9,12]. Most of these algorithms were specifically developed for the detection and delineation of trees in temperate forests based on the assumption that the trees are cone shaped and round (mostly coniferous) in the images, with the apex of the tree having the highest reflectance of the crown area .
|Satellite||Launch year||Panchromatic (Px) resolution* (m)||Multispectral (Mx) resolution* (m)||Multispectral bands|
|Ikonos II||1999||0.82||3.2||Blue, Green, Red, Near IR (4)|
|QuickBird||2001||0.65||2.62||Blue, Green, Red, Near IR (4)|
|GeoEye-1||2008||0.46||1.84||Blue, Green, Red, Near IR (4)|
|WorldView-2||2009||0.46||1.85||Coastal, Blue, Green, Yellow, Red, Red Edge, Near IR, Near IR2 (8)|
|Pleiades 1A||2011||0.5||2.0||Blue, Green, Red, Near IR (4)|
|Pleiades 1B||2012||0.5||2.0||Blue, Green, Red, Near IR (4)|
|Kompsat-3||2012||0.7||2.8||Blue, Green, Red, Near IR (4)|
|SkySat-1||2013||0.9||2.0||Blue, Green, Red, Near IR (4)|
|WorldView-3||2014||0.31||1.24||Coastal, Blue, Green, Yellow, Red, Red Edge, Near IR, Near IR2 (8)|
|SkySat-2||2014||0.9||2.0||Blue, Green, Red, Near IR (4)|
|Kompsat-3A||2015||0.55||2.2||Blue, Green, Red, Near IR (4)|
|WorldView-4||2016||0.34||1.36||Not available at time of printing|
The analysis of individual trees based on remote sensing images is a complex problem. Images of trees with varied crown size increase the difficulty of the analysis. What is detected as a single object may in fact represent a separate branch or even a group of trees . Other sources of error are caused by the proximity between neighboring trees, trees located under other trees, trees in the shade, or trees that have a low spectral contrast with the background . Consequently, high-level complex algorithms are necessary to exploit this contextual information .
This chapter provides an overview of the state of the art in individual tree crown detection based on optical VHR remote sensing data. An original method developed by the authors is also presented as an alternative approach to the problem of tree crown detection. In Part I, we present the main algorithms developed for the detection of individual trees, be it for tree identification or delineation. The principle of each approach is presented as well as its potential and limitations. Part II is dedicated to outlining the original MPP–TM approach, a hybrid method that combines two methods used in pattern recognition: marked point processes and template matching. The results are shown for tree detection and delineation in an urban environment and for tree counting in orchards.
2. Part I – Review of tree crown detection methods
We present six of the main algorithms used in individual tree detection in high spatial resolution images. The algorithms are summarily described individually, but it should be noted that many approaches use hybrid methods for the detection and delineation of tree crowns. For instance, some authors might use one algorithm for detecting the trees and another to delineate them; some may even use one approach as a first approximation and another to fine-tune the results.
2.1. Local maxima filtering
Local maxima (LM) filtering is a technique used for identifying tree crowns in high spatial resolution imagery which is based on the recognition of the points with the greatest brightness within a search window that scans the entire image [4,14]. The search window, with a fixed size, defines which pixel has the greatest reflectance compared to all the other pixels inside the window. The pixels with the highest digital number are identified as possible tree locations. This method is adequate for trees which have the greatest reflectance at their top, surrounded by lower intensity pixels, and due to its concept, it is widely used for detecting conifers.
When the kernel window passes over the image, it does not take into account the presence of trees with different crown sizes, and the success of the LM tree recognition depends on the careful selection of the size of the search window. If it is too small, errors of commission occur by selecting nonexistent trees or multiple radiance peaks for an individual tree crown; if it is too big, the algorithm is likely to miss some trees (omission errors) .
The identification of trees by LM is affected by false bright pixels that do not belong to the brightest part of a crown. An effective way of dealing with the problem is to apply a Gaussian filter to the image before searching for maxima. This low-pass filter gives more weight to the crown center pixels (surrounded by much lower values) than to those located toward the crown edge, which might belong to other bright objects or noise. Applying a Gaussian filter directly affects the number of local maxima identified and smooths the brightness values on the tree crown edges.
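The effect of the search window size and of smoothing can be illustrated with a minimal sketch. It is written in dependency-free Python for readability (an array library would normally be used), and the function names `smooth` and `local_maxima`, the 3×3 binomial kernel standing in for a Gaussian filter, and the toy image are all assumptions made for illustration, not taken from the cited studies.

```python
def smooth(image):
    """3x3 binomial low-pass filter (a small Gaussian-like kernel, assumed here).
    Border pixels are handled by clamping indices to the image."""
    rows, cols = len(image), len(image[0])
    weights = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += weights[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = total / 16.0
    return out


def local_maxima(image, half_window):
    """Return (row, col) of every pixel strictly brighter than all other
    pixels inside the square search window centred on it."""
    rows, cols = len(image), len(image[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            is_peak = True
            for rr in range(max(r - half_window, 0), min(r + half_window + 1, rows)):
                for cc in range(max(c - half_window, 0), min(c + half_window + 1, cols)):
                    if (rr, cc) != (r, c) and image[rr][cc] >= image[r][c]:
                        is_peak = False
            if is_peak:
                peaks.append((r, c))
    return peaks


# Toy scene: uniform background with two "treetops" of different brightness.
image = [[1] * 7 for _ in range(7)]
image[1][1] = 9
image[5][5] = 7

small = local_maxima(image, half_window=1)  # 3x3 window keeps both treetops
large = local_maxima(image, half_window=4)  # 9x9 window omits the dimmer one
```

With the small window both treetops are reported, whereas the large window drops the dimmer one, which is the omission error described above. In practice `smooth` would be applied before `local_maxima` to suppress isolated bright pixels.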
In order to minimize the problem of the window size with LM, reference  used windows of varying sizes based on the assessment of the spatial structure of the image obtained by analyzing the local semi-variogram with different pixel lags and different window sizes. This results in a personalized window for each pixel, leading to greater accuracy when compared to using a single fixed window size. Reference  used LM to identify the centroid of eucalyptus trees in Australia. The search for the trees is carried out based on the maxima in four linear kernels pertaining to the four main directions (0°, 45°, 90°, and 135°) of the image and by summing the individual maxima found in each pass (Figure 1).
2.2. Template matching
Template matching (TM) is a technique used for object recognition widely cited in the specialized literature which uses quantitative descriptors, such as length, area, and texture to describe recurring patterns in an image [17,18]. Based on a synthetic model or a sample extracted from the image, the correlation coefficient between the model and the image is calculated in order to determine the strength of the match between the two matrices. The object is assumed to be located where the measurement of the match reaches a maximum .
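A commonly used synthetic crown model is the generalized ellipsoid of revolution (Equation 1):
$$\frac{z^n}{a^n} + \frac{(x^2 + y^2)^{n/2}}{b^n} = 1 \qquad (1)$$
where z is the vertical axis of the center of the tree crown in its origin, x and y are the horizontal coordinates, a is half the height of the ellipsoid, b is half its width, and n is the parameter of the shape of the tree crown. Subsequently, the model is illuminated using the acquisition parameters of the image (sun elevation and azimuth) and the characteristics of crown absorption and reflection of light in the chosen spectral band.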
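As a hedged illustration of how such a synthetic crown can be rasterized before illumination, the sketch below assumes the generalized ellipsoid form z^n/a^n + (x² + y²)^(n/2)/b^n = 1, with a as the vertical semi-axis and b as the horizontal semi-axis. That form, the function names, and the one-pixel grid spacing are assumptions for illustration only.

```python
import math


def crown_height(r, a, b, n):
    """Height of the crown surface above the ellipsoid centre at horizontal
    distance r from the trunk axis (0 outside the crown)."""
    if r >= b:
        return 0.0
    return a * (1.0 - (r / b) ** n) ** (1.0 / n)


def crown_template(size, a, b, n):
    """Square grid (size x size pixels) of crown heights, centred on the trunk."""
    centre = (size - 1) / 2.0
    return [
        [crown_height(math.hypot(row - centre, col - centre), a, b, n)
         for col in range(size)]
        for row in range(size)
    ]


# n = 2 gives an ordinary ellipsoid; larger n gives a flatter, blunter crown.
template = crown_template(size=7, a=5.0, b=3.0, n=2.0)
```

The slope and aspect of each cell of such a height grid are what an illumination step would then use to shade the model.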
Because it is based on a physical model (rather than a complex mathematical concept), TM is considered a user-friendly method. Its main limitation is the need for a library of models if many types of trees are present in the image, which may involve a complex model generation phase. Figure 2 shows examples of synthetic tree models and an application in an orchard.
References [20, 21] used this technique to identify tree crowns in aerial images. Other researchers used this technique to recognize individual tree crowns, using templates made from small sub-images of the actual scene to identify the trees [22, 23]. Reference  proposed an improved version by generating separate models for trees and their shade in VHR images acquired by unmanned aerial vehicles (UAV). The authors explored the relation between the tree and shade models separately and then joined them to generate a more powerful object detector.
2.3. Valley-following
Valley-following (VF) is a crown delineation method which identifies the shaded areas between the trees. This methodology was initially described in reference  and makes an analogy with topographic data, where the gray levels of the pixels represent elevation in the third dimension. In this analogy, the bright tree crowns are the hills and the darker zones around the trees are the valleys (Figure 3). This darker zone is what typically helps human interpreters to separate one tree crown from another. In this approach, the shaded areas are eliminated, making it possible to separate the trees in the image. Because this was not sufficient to separate all of the trees, the authors developed a series of rules (e.g., no discontinuity, checking directions, context, gap filling, etc.) to accurately describe the boundaries of each tree, one at a time.
This approach performed well in images with a combination of low solar elevation angle and conical trees. Conversely, it failed to produce good results when the canopy was composed of trees of very different sizes, or when the tree crowns were very large and had internal shadows. The latter case resulted in individual trees being subdivided into two or more parts. Smaller trees, in contrast, tended to be grouped together. Reference  found that this approach causes many false positives (FPs) in open areas (clearings). As a solution, they suggested excluding these areas by retaining only the pixels with high values of the normalized difference vegetation index (NDVI).
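The NDVI masking step can be sketched as follows. NDVI is computed per pixel as (NIR − Red) / (NIR + Red). The threshold of 0.3 and the function names are assumptions chosen for illustration, since the source does not state the value used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def vegetation_mask(nir_band, red_band, threshold=0.3):
    """True where NDVI exceeds the (assumed) threshold, i.e. likely canopy."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Reflectances for a 1 x 3 strip: canopy, bare soil, canopy.
nir = [[0.50, 0.30, 0.45]]
red = [[0.10, 0.25, 0.08]]
mask = vegetation_mask(nir, red)
```

Crown delineation would then be restricted to pixels where the mask is true, so that clearings do not generate false crowns.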
2.4. Watershed segmentation
Like VF, watershed segmentation (WS) is a technique related to thresholding that treats the gray levels in the image as a topographic surface. It is used not only for the delineation of individual tree crowns but also for generic segmentation of images. The watershed concept is based on a 3D image representation, with the third dimension being provided by the gray level intensity. The main objective of the watershed algorithm is to find the “drainage” divide lines. The “relief” in the image is inverted (high gray values become valleys) and progressively filled with a virtual liquid; when the liquid is about to overflow from one basin into another, a virtual dam is built to create the watershed. These lines are considered the limits of each segment. The simplest approach to the construction of the dam is the use of morphological dilation of the minima, without merging the regions.
The images are usually preprocessed before the WS is applied. In fact, this segmentation is frequently applied to the gradient of an image, and not to the image itself, because the relatively homogeneous gray values within objects do not provide sufficient contrast for an effective segmentation. In this formulation, the regional minimum value of the catchment basins usually correlates well with the lower gradient values that match the contours of the objects of interest [17,28]. The direct application of the WS algorithm generally leads to over-segmentation due to noise or other local irregularities of the gradient (Figure 4a). One of the approaches used to limit the number of regions is to use markers. The selection of markers can be based on simple procedures, such as intensity and connectivity between pixels, or on more complex descriptors, such as size, shape, location, relative distances, texture, and others. The use of markers provides prior knowledge to support the segmentation process.
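A marker-controlled watershed can be sketched with a priority queue: flooding starts from the markers and always expands the lowest pixel reached so far. This is a simplified illustration that produces a label image rather than explicit dam lines. The function name `marker_watershed`, the 4-connectivity, and the tie-breaking by insertion order are assumptions, not a specific published implementation.

```python
import heapq


def marker_watershed(surface, markers):
    """Flood `surface` (2D list of elevations, e.g. inverted brightness or
    gradient magnitude) from `markers`, a dict {(row, col): label > 0}.
    Returns a 2D list giving the label of the basin each pixel joined."""
    rows, cols = len(surface), len(surface[0])
    labels = [[0] * cols for _ in range(rows)]
    heap = []
    order = 0  # insertion counter: breaks ties between equal elevations
    for (r, c), label in markers.items():
        labels[r][c] = label
        heapq.heappush(heap, (surface[r][c], order, r, c))
        order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] == 0:
                labels[rr][cc] = labels[r][c]  # pixel joins the flooding basin
                heapq.heappush(heap, (surface[rr][cc], order, rr, cc))
                order += 1
    return labels


# Two bright crowns separated by a dark gap; brightness is inverted so that
# crowns become basins, and one marker is placed on each treetop.
brightness = [[9, 8, 7, 2, 7, 8, 9]]
surface = [[9 - v for v in row] for row in brightness]
labels = marker_watershed(surface, {(0, 0): 1, (0, 6): 2})
```

Because exactly one basin grows from each marker, the number of segments equals the number of markers, which is how markers prevent over-segmentation.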
The approaches that use WS for the delineation of the tree crowns normally use markers representing the center of the tree crown to assist the segmentation process. For instance, reference  used WS to detect and delineate tree crowns in a VHR forest image in Canada but divided the approach into two phases, namely using LM to detect the crown and applying WS for the delineation. The LM image with the detected crowns was produced by using a Laplacian of Gaussian edge detection operator. The tree crowns were modeled based on their geometry and radiometry, resulting in an image of markers. This image then served to guide the WS in delineating the crowns. Reference  developed a bitemporal procedure for the automatic segmentation and reconciliation of groups of pixels (called blobs) within the forest using WS. By using two dates, they were able to increase the probability of properly defining the tree contours. Many problems were encountered in the segmentation of individual trees. For instance, trees with spreading branches were sometimes split into two or more segments; conversely, several crowns were included in the same segment when trees were not sufficiently separated.
2.5. Region growing
Region growing (RG) is another segmentation technique that groups pixels or groups of pixels based on predefined growth criteria in an attempt to separate and recognize objects in the image. Like WS, RG is used as a generic segmentation method and can be adapted for the delineation of individual tree crowns (Figure 4b). Starting with some seed pixels (which can be random if no other information is provided), the neighboring pixels are examined one by one and added to the growth region if their predefined properties are similar to those of the seeds (such as specific intervals of intensity or color). When no more pixels can be added or some predefined limit is reached (e.g., number of pixels), these pixels are labeled as belonging to the specific region of the seed pixel. Additional criteria can increase the power of an RG algorithm by introducing higher-level concepts such as size, similarity between candidate pixels and the pixels already selected, or even the shape of the region [17,31].
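The growth procedure described above can be sketched as a breadth-first search from a seed. The similarity criterion used here (absolute difference from the seed value within a tolerance), the 4-connectivity, and the names `region_grow`, `tolerance`, and `max_pixels` are assumptions for illustration.

```python
from collections import deque


def region_grow(image, seed, tolerance, max_pixels=None):
    """Grow a region from `seed` (row, col): a 4-connected neighbour is added
    when its value is within `tolerance` of the seed value. Growth stops when
    no pixel can be added or when `max_pixels` is reached."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if max_pixels is not None and len(region) >= max_pixels:
                return region
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and (rr, cc) not in region
                    and abs(image[rr][cc] - seed_value) <= tolerance):
                region.add((rr, cc))
                queue.append((rr, cc))
    return region


# A bright crown (top left) and a similar crown (right column) separated by shade.
image = [
    [10, 10, 2, 9],
    [10, 11, 2, 9],
    [2, 2, 2, 9],
]
crown = region_grow(image, seed=(0, 0), tolerance=1)
capped = region_grow(image, seed=(0, 0), tolerance=1, max_pixels=2)
```

The right-hand column is spectrally similar to the seed but is not connected to it through similar pixels, so it stays outside the region. When crowns touch without a shaded gap, the same mechanism merges them, which is the grouping error noted for this family of methods.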
Reference  used RG integrated with LM to identify and delineate tree crowns in Australia. The LM method served to find the center of potential trees, which were then used as seeds for the RG. Reference  tested two different types of segmentation by RG, one by Brownian motion and the other by random walk, to detect conifers in a boreal forest. The methods were capable of detecting about 80% of the illuminated portion of the crowns, with a better performance found in larger crowns (Figure 5).
2.6. Marked point processes
The marked point process refers to a probabilistic method which has been used in recent years for the recognition of objects in high spatial resolution imagery [5,11,32–35]. In an MPP, sets of random points in a given space (x, y) are provided with a mark which is complete and separable, allowing the definition of a topology (defined by the mark) and the attribution of a label. An image is considered a random model where the gray tones are the realization of a random point process . This random configuration of gray levels in the images is then modeled based on geometric figures (ellipses, circles, rectangles, and lines), respecting certain geometric (nature of the objects) and radiometric constraints (type of image).
The laws of density and probability distinguish various types of point processes, which can be Poisson, Strauss, Markov, or Gibbs, among others. The Markov or Gibbs point processes have been used for the recognition of tree crowns by a number of authors [5,32,33]. These processes are defined by a density function using a form of energy expressed as a sum of the a priori energy and the local energy. The process seeks to minimize the global energy of the model by iterating with some optimization scheme (Markov random fields, the multiple births and deaths algorithm, or Markov chain Monte Carlo simulations).
Reference  proposed two different models to serve as marks in an MPP, one in 2D for detection of trees in densely forested zones (Figure 6) and the other in 3D for zones with scattered or isolated trees, based on aerial photos of high spatial resolution in the infrared band. The MPP was integrated with a reversible jump Markov chain Monte Carlo sampler in a simulated annealing scheme. Reference  used an MPP to automatically detect tree crowns in high spatial resolution images by modeling the crowns as 2D circles. The method was successfully tested on mangrove forests and eucalyptus plantations.
In the previous section, we have presented some of the most common algorithms used in the detection of individual trees, be it for their identification, delineation, or both. Table 2 presents a summary of these principles through their main characteristics and limitations.
Trees may differ in shape, size, spectral properties, height, foliage type, and density, and their spatial context varies with illumination, ground type, and inclination. They can also be surrounded by many other objects, especially in an urban setting. As such, the task is not trivial and can become highly complex depending on the number of parameters involved. Conversely, in planted forest and orchards where trees have the same age and species, tree crown extraction can take advantage of their relative uniformity.
|Algorithm||Usage||Principle||Studies||Characteristics / Limitations|
|Local maxima (LM)||Identification of tree crown||Identification of the brightest pixels locally as corresponding to the apex of a treetop within a search window.||Wulder et al. (2000); Pouliot (2002); Wang et al. (2004)||Appropriate for conifers, with a conical shape and high reflectance point at the top of the tree. Simple method to use. Results are affected by the spatial distribution of trees, variation of tree crown size, and search window size (increased omission errors in larger windows and commission errors in smaller windows).|
|Template matching (TM)||Pattern recognition||Quantitative descriptors used to describe patterns. Calculate the correlation between the image and the model. Model may be a sample extracted from the image or not.||Pollock (1996); Larsen and Rudemo (1998); Quackenbush et al. (2000); Hung et al. (2012)||Enables analysis of the tree crown from its spectral, textural and structural characteristics. Allows neighborhood analysis of the tree crown by considering its shadow. Needs a template library, making it impractical in complex forests. Recognition errors increase with irregularity of the tree crowns. Easier to detect larger trees than smaller ones. Performance reduced in very dense environments.|
|Valley-following (VF)||Delineation of tree crown||Derives from an analogy with a topographical surface, programmed to identify the shaded portion between the tree crowns (valleys).||Gougeon (1995, 1999); Leckie and Gougeon (1998); Gougeon and Leckie (2003); Gougeon and Leckie (2006)||Appropriate for trees with conical shape that create shadow areas between individuals. Most successful to delineate populations of the same age without intertwined tree crowns. Best performance for images in mid-low solar elevation angle. Performance reduced when trees are asymmetrical, or from different species, with different tree crown sizes or when shadows of trees protrude over each other. Tendency to group smaller trees together and split larger trees into multiple segments.|
|Watershed segmentation (WS)||Delineation of tree crown||Performed from the image gradient. Image is seen as an inverted topographic surface flooded to determine watershed divides. Commonly uses markers to limit the number of segments.||Wang et al. (2004); Lamar et al. (2005)||Performs best when applied after selection of markers to control segmentation process. More suitable for conifers, which allow preselection of treetops by using another approach (usually LM). Over-segmentation occurs when applied directly to the image or without the use of markers. Can separate tree crowns in different segments when the branches are too spread, or may include several trees in the same segment when there is no spatial separation between them.|
|Region growing (RG)||Delineation of tree crown||Groups pixels or sub-regions based on predefined criteria for the growth of region in order to separate and recognize objects in the image.||Culvenor (2002); Pouliot et al. (2002); Bunting and Lucas (2006); Pu and Landry (2012)||More complex shapes of trees are better delineated. Method more complex as it requires different rules for different environments. Tends to create more than one segment when the tree has branches with dark portions, and tends to group different trees if they are very similar.|
|Marked point processes (MPP)||Pattern recognition||Stochastic process in which unordered points in a space are provided with marks. Marks are modeled from geometric and radiometric characteristics of objects.||Larsen et al.||Performs best with plantations of trees of same species and age and in images of isolated trees. It is less effective to detect trees in more complex environments.|
Reference  compared six different algorithms (valley-following, region growing, template matching, scale-space theory, marked point processes, and Markov random fields) in six different aerial images, ranging from a homogeneous plantation and an area with isolated tree crowns to an extremely dense deciduous forest type. The authors found that none of the algorithms can by itself reach a high rate of success in all of the tested images and concluded that there is no single optimum algorithm for all types of images and forests. They also emphasized that for complex types of forests, monoscopic images are insufficient for a consistent detection of tree crowns, even for human interpreters.
3. Part II – A hybrid approach integrating marked point process and template matching
As shown in our brief review, many methods have been developed for trees in temperate forest environments. In exploratory research , three algorithms were tested in urban tropical environments: region growing, watershed, and template matching. Better results were generally obtained by combining region-growing segmentation and geographic object-based image analysis (GEOBIA) for classification. Although highly effective, the approach requires much parameter setting and experience and is not specifically dedicated to the problem of tree crown detection.
Studies that use marked point processes caught our attention and led us to consider that they could benefit from marks modeled from 3D objects, in an approach different from that developed by reference . We propose to use a geometrical-optical tree model in a manner resembling template matching, which uses some form of correlation between image and model to identify candidate pixels. An MPP that takes advantage of a geometrical-optical 3D model and of similarity measurements to seek tree crowns could represent a significant improvement over simpler marks. Based on this hypothesis, we developed an algorithm for tree crown detection that combines elements from MPP, TM, and tree crown geometrical-optical modeling for the automatic detection and (simplified) delineation of trees in VHR satellite imagery. We have named our algorithm MPP–TM.
In our approach, the TM does not scan the whole image as originally conceived but rather uses an MPP approach to select random locations within the image. Additionally, the 3D marks receive a random diameter within a predetermined range that depends on the type of environment. The geometrical-optical model includes both the sunlit and shaded areas of the crown and a portion of the projected shadow to allow a better match between model and image. Some statistical and spectral parameters are also included in the model-matching phase.
MPP-based algorithms for pattern recognition usually alternate between phases of birth and death, during which the objects are created (placed) and then destroyed when they do not comply with the matching rules. This is also a characteristic of MPP–TM, but we have somewhat deviated from the original concept, in which the destruction phase also incorporated a random process.
The following subsections describe the construction of the 3D geometrical-optical model and the functioning of the algorithm.
3.1. Description of MPP–TM approach
3.1.1. A geometrical-optical 3D tree crown model
The parameters that determine the radiance pattern of a tree crown are direct and indirect radiation, shape of tree, branch pattern, leaf reflectance, multiple reflectances within the canopy, etc. . In creating a valid 3D geometrical-optical model, we have chosen a simplified version in which the crown is represented by a dome of varying skewness, a Lambertian reflectance model with ambient light, and a projected shadow on the ground (or on another tree). Equations 2 and 3 give the formulation of our model in which each pixel is treated as a singular surface.
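$$\cos i = \cos\theta_z \cos s + \sin\theta_z \sin s \cos(\phi_s - \phi_o) \qquad (2)$$
where $i$ is the local solar incidence angle, $\theta_z$ is the solar zenith angle, $s$ is the slope of the object surface, $\phi_s$ is the solar azimuth, and $\phi_o$ is the aspect of the object surface.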
where is the maximum reflectance of the model, "amb" represents the diffuse ambient lighting. The geometrical-optical model is adjusted according to the specific illumination parameters of the image, and the size of the trees present on the scene. Figure 7 shows two examples of tree models with similar reflectance but different solar elevations.
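The shading of a single model pixel can be sketched as below. The incidence angle uses the standard relation between solar zenith, surface slope, solar azimuth, and surface aspect. The way the Lambertian term and the ambient term are combined (`max_reflectance * cos_i + ambient`, with self-shaded facets receiving ambient light only) is an assumption for illustration, as are the function names and the numeric values.

```python
import math


def cos_incidence(sun_zenith_deg, sun_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the local solar incidence angle on a tilted facet."""
    zenith = math.radians(sun_zenith_deg)
    slope = math.radians(slope_deg)
    delta = math.radians(sun_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(delta))


def shaded_value(cos_i, max_reflectance, ambient):
    """Lambertian term plus diffuse ambient light (assumed combination).
    Facets turned away from the sun (cos_i < 0) get ambient light only."""
    return max_reflectance * max(cos_i, 0.0) + ambient


# Sun at 30 degrees elevation (zenith 60), azimuth 135 degrees.
flat = cos_incidence(60, 135, slope_deg=0, aspect_deg=0)
facing_sun = cos_incidence(60, 135, slope_deg=60, aspect_deg=135)
facing_away = cos_incidence(60, 135, slope_deg=60, aspect_deg=315)

sunlit = shaded_value(facing_sun, max_reflectance=200, ambient=30)
shaded = shaded_value(facing_away, max_reflectance=200, ambient=30)
```

Applying these two functions to the slope and aspect of every cell of a crown height grid yields a model with a bright sunlit side and a dark shaded side, which is the pattern compared against the image.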
A parameter of projected shadow clipping has also been added to account for the fact that it was not beneficial to use the whole shadow in situations where it was projected onto another tree and not on the ground. The height of the tree also affects the size of the shadow so that it did not appear wise to set the height to a fixed value. To illustrate this, Figure 8 shows a comparison between the tree model and an actual tree from the image both with whole and clipped shadows.
3.1.2. Algorithm description
According to reference , using MPP to extract objects consists in searching for the “best” possible object configuration in a scene, that is, the one that best respects a certain number of properties of both the objects being sought and the radiometry of the image. In our algorithm, the “best” configuration, be it geometric or radiometric, is given by the model.
The process consists in alternating phases of birth and death. The MPP starts with a birth phase during which tree crowns, represented by circles of varying size (drawn at random within an interval), are inserted on a matrix of equal size to the image being processed. Tree crowns are only inserted where no other crowns are present. Once all the circles have been inserted (their number is determined by a density parameter), a similarity (Sm) value between the image and a version of the model fitted on each circle is computed and stored in a list along with the parameters of the model. A routine then sorts the list by decreasing values of Sm. During the death phase, the circles that do not comply with the acceptance restrictions (similarity threshold and minimum and maximum standard deviation thresholds) are successively deleted from the matrix and the list. At the end of each death phase, the overall parameters of the pixel distribution of the remaining crowns are updated. The crowns that have been found are kept permanently and are carried over into the set of crowns evaluated in the following iterations. If 10 crowns out of 100 are kept after an iteration, then the next iteration randomly places 90 more crowns, and the new set of 100 crowns is evaluated and sorted for the next iteration. All tree crowns are considered “found” when one of three conditions is met: 1) the number of trees found is equal to the number given by the density parameter, 2) one of the interruption criteria has been attained, or 3) the maximum number of iterations has been reached.
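The alternation of birth and death phases can be sketched as follows. This is a deliberately simplified illustration, not the authors' implementation: the similarity measure is passed in as a function, the standard deviation tests and the update of pixel statistics are omitted, the acceptance threshold is fixed, and all names (`detect_crowns`, `overlaps`, `toy_similarity`) and numeric values are assumptions.

```python
import math
import random


def overlaps(candidate, crowns):
    """True if the candidate circle (x, y, radius) intersects any existing crown."""
    x, y, r = candidate
    return any(math.hypot(x - cx, y - cy) < r + cr for cx, cy, cr in crowns)


def detect_crowns(similarity, width, height, density, radius_range,
                  threshold, max_iterations, rng):
    kept = []  # crowns accepted in earlier iterations are never destroyed
    for _ in range(max_iterations):
        # Birth phase: top the configuration up to `density` circles,
        # inserting a circle only where no other crown is present.
        births, attempts = [], 0
        while len(kept) + len(births) < density and attempts < 50 * density:
            attempts += 1
            candidate = (rng.uniform(0, width), rng.uniform(0, height),
                         rng.uniform(*radius_range))
            if not overlaps(candidate, kept + births):
                births.append(candidate)
        # Score the new circles and sort them by decreasing similarity.
        scored = sorted(((similarity(*c), c) for c in births), reverse=True)
        # Death phase: circles below the acceptance threshold are deleted.
        kept.extend(c for score, c in scored if score >= threshold)
        if len(kept) >= density:
            break
    return kept


# Toy stand-in for the model/image comparison: similarity decreases linearly
# with the distance to the nearest (hypothetical) true tree position.
true_trees = [(10.0, 10.0), (30.0, 30.0)]


def toy_similarity(x, y, radius):
    return max(1.0 - math.hypot(x - tx, y - ty) / 5.0 for tx, ty in true_trees)


found = detect_crowns(toy_similarity, width=40, height=40, density=2,
                      radius_range=(3.0, 3.0), threshold=0.6,
                      max_iterations=5000, rng=random.Random(0))
```

Because accepted circles block new births at the same location, each true tree ends up with at most one crown, and the loop stops as soon as the number of crowns reaches the density parameter.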
The value Sm is computed by subtracting the normalized absolute difference from the cross-correlation (Equation 4).
Here γ is the cross-correlation between image and model, the second term is the normalized sum of absolute differences between them, and a constant weight factor (normally approx. 0.5) balances the two. The cross-correlation and the absolute difference are calculated with Equations 5 and 6, respectively.
In Equation 5, the cross-correlation is calculated between the model matrix and the portion of the image that corresponds to the circle of the same radius, using their respective means. In other words, two matrices of the same dimensions are always compared. The values of γ range between −1 and 1. The same logic is used in Equation 6, which computes a normalized difference value between the same two matrices.
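As an illustration of how such a similarity could be evaluated, the following is a minimal sketch in plain Python. It assumes a zero-mean normalized cross-correlation for γ, an absolute difference normalized by the number of pixels and an assumed maximum grey level (`max_value`), and a weight applied to the difference term only. These choices, like all function and parameter names, are assumptions made for the example and are not confirmed by the text.

```python
import math

def _flatten(matrix):
    """Turn a 2D list into a flat list of pixel values."""
    return [value for row in matrix for value in row]

def cross_correlation(patch, model):
    """Zero-mean normalized cross-correlation, in the range [-1, 1]."""
    a, b = _flatten(patch), _flatten(model)
    if len(a) != len(b):
        raise ValueError("patch and model must have the same dimensions")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # flat patches carry no correlation

def normalized_abs_difference(patch, model, max_value=255.0):
    """Sum of absolute differences scaled to [0, 1] (assumed normalization)."""
    a, b = _flatten(patch), _flatten(model)
    return sum(abs(x - y) for x, y in zip(a, b)) / (len(a) * max_value)

def similarity(patch, model, weight=0.5, max_value=255.0):
    """Assumed form: cross-correlation minus weighted absolute difference."""
    return (cross_correlation(patch, model)
            - weight * normalized_abs_difference(patch, model, max_value))
```

With this form, an image patch identical to the model scores 1, and the score drops both when the pattern of light and shade differs (lower γ) and when the overall brightness differs (larger absolute difference).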
In the death phase, tree crowns are kept if their similarity is larger than or equal to a pre-set threshold. Because we found that such a threshold represented a weak element in our algorithm, we implemented a strategy by which it need not be fixed in advance but rather adjusts itself as the number of iterations grows. The threshold is set very high at the beginning and starts to decay once a certain number of iterations (typically 100) have not found any "new" tree crown. Additionally, if a larger number of iterations (say 1000) still does not add any new tree crown, the process is stopped. Ultimately, it is stopped when the maximum number of iterations is reached. A flowchart of our algorithm is presented in Figure 9 and schematically described in Table 3.
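The self-adjusting threshold can be sketched as two small helper functions. This is only an illustration: the multiplicative decay, the floor value, and all names and default values other than the 100 and 1000 iteration counts quoted in the text are assumptions.

```python
def next_threshold(threshold, stale_iterations, decay_after=100,
                   decay=0.995, floor=0.5):
    """Lower the similarity threshold once no new crown has been found
    for `decay_after` consecutive iterations (assumed multiplicative decay)."""
    if stale_iterations >= decay_after:
        return max(floor, threshold * decay)
    return threshold

def should_stop(iteration, stale_iterations, max_iterations=10000,
                stop_after=1000):
    """Stop when too many iterations add no crown or the budget is spent."""
    return stale_iterations >= stop_after or iteration >= max_iterations
```

Here `stale_iterations` is the number of consecutive iterations that did not add a crown; the caller would reset it to zero each time a new crown is accepted.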
1. Task: tree crown detection in very high resolution images.
2. Set parameters:
   a. 3D model: maximum reflectance, ambient light, sun elevation, sun azimuth, tree shape, clip factor.
   b. Descriptors of the objects: minimum and maximum radius, minimum and maximum standard deviation (δ) thresholds, maximum and minimum similarity (Sm) thresholds, tree density.
   c. Adaptation of the process: maximum number of iterations before the similarity threshold is decreased.
   d. Interruption of the process: total iterations, maximum iterations without finding new trees.
3. Approach to tree crown detection:
   a. Repeat while the number of searched trees has not been reached and no interruption criterion (total iterations or minimum similarity threshold) has been met.
4. Birth phase:
   a. Randomly pick a radius within the model catalogue
   b. Randomly pick i and j coordinates within the image space
   c. Check if a crown is already present
   d. If not:
      i. Fill area with circle of radius r
      ii. Extract corresponding area in the image matrix
      iii. Compare image and model matrices
      iv. Calculate and store values: i, j, average, standard deviation, and Sm
5. Death phase:
   a. Input parameters: birth image matrix; crown statistics (sorted by Sm); Sm threshold; tree model catalogue with radius between minimum and maximum radius
   b. While Sm crown < Sm threshold:
      i. Zero crown pixels in birth image matrix
   c. While δ crown < min δ threshold or δ crown > max δ threshold:
      i. Zero crown pixels in birth image matrix
6. Update object and global statistics.
7. Update number of crowns eliminated for next birth phase.
8. When the process finishes: reports, graphs, and image with individual tree crowns.
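The birth and death phases listed above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the `score` callback stands in for fitting the geometrical-optical model and computing Sm, accepted crowns are kept fixed instead of being re-evaluated, and the standard deviation test and threshold decay are omitted. All names and default values are hypothetical.

```python
import random

def overlaps(circle, others):
    """True if circle (i, j, r, ...) intersects any circle in `others`."""
    return any((circle[0] - o[0]) ** 2 + (circle[1] - o[1]) ** 2
               < (circle[2] + o[2]) ** 2 for o in others)

def birth_death(score, height, width, n_trees, r_min, r_max,
                threshold=0.9, max_iterations=200, max_tries=50, seed=0):
    """score(i, j, r) -> similarity Sm of the model fitted on that circle."""
    rng = random.Random(seed)
    kept = []  # accepted crowns as (i, j, r, sm)
    for _ in range(max_iterations):
        if len(kept) >= n_trees:
            break
        # Birth phase: top the configuration up to n_trees circles,
        # inserting only where no other crown is present.
        born = []
        for _ in range((n_trees - len(kept)) * max_tries):
            if len(kept) + len(born) >= n_trees:
                break
            circle = (rng.randrange(height), rng.randrange(width),
                      rng.randint(r_min, r_max))
            if not overlaps(circle, kept + born):
                born.append(circle + (score(*circle),))
        # Death phase: sort by decreasing Sm, delete crowns under threshold.
        born.sort(key=lambda c: c[3], reverse=True)
        kept.extend(c for c in born if c[3] >= threshold)
    return kept
```

A fixed `seed` makes a run reproducible, which is convenient when tuning the thresholds on a test image.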
Figure 10a shows the state of the crown matrix after a single birth phase with 163 circles of random radius (between 3 and 15 m) and randomly located within the image matrix. After the death phase, using a similarity threshold of 0.98, only one tree crown was kept (Figure 10b).
3.1.3. A modified approach for orchards
Because trees in orchards are often individually distinguishable and have similar shape and size, they are perfect candidates for TM with a 3D geometrical-optical model. By using a GOM, the effects of varying illumination (sun elevation and azimuth) become an advantage rather than an obstacle especially when the background is homogeneous. In terms of data, VHR image data such as a large proportion of Google Earth images have sufficient resolution for identifying orchard trees. In this case, however, illumination parameters are not readily available and must be determined.
The objective of this modified approach is to adapt the algorithm described earlier to detect and count trees in orchards of different types. Because it was aimed at a more regional or even global application, Google Earth images were used in an attempt to simulate a generic operational framework. The modified approach uses a similarity measurement between the GOM and the image to calculate, for each pixel, the probability of being the center of a tree and then places trees in nonoverlapping positions (unless some overlapping is allowed). The algorithm also incorporates a module to determine the illumination parameters from a sample.
The algorithm is based on three principles. First, it assumes that the trees have a dome-like shape approximated with a GOM and the right illumination parameters. Second, there is little or no overlapping between trees, and third, the pixel with the highest similarity represents the most likely central position of the tree.
The GOM is a simple dome model for which the height is estimated at 1.5 times the diameter of the crown. To simplify the problem, we have assumed a unique diameter for all trees in the orchard (this can easily be modified to incorporate a range of diameters). The algorithm responsible for the detection of trees is best explained through a list of steps.
Step 1. Get user parameters: percentage of overlapping allowed, minimum similarity value, tree diameter, and coordinates of a sample tree. These parameters cannot be estimated automatically and are entered by the user. The illumination parameters can optionally be entered by the user; otherwise, they are estimated by the program using the tree sample.
Step 2. If sun elevation and azimuth are not provided by the user, estimate them automatically from the coordinates of the single-tree sample. The program computes the similarity between the sample and all combinations of illumination parameters in steps of 10 degrees.
Step 3. Calculate the similarity value for each pixel.
Step 4. Sort pixels by decreasing similarity and store coordinates. If the value is lower than the minimum allowed, the pixel is not stored.
Step 5. Place a temporary tree “stamp” (flat template) at the next pixel location with highest similarity value.
Step 6. Verify whether the space is already occupied by a tree. If some overlapping is allowed, make sure that the proportion of nonzero pixels under the stamp is smaller than the percentage of overlapping allowed. An output image is created to receive a permanent "stamp" of the GOM shape with the associated similarity value.
Step 7. Validate the results. Validation is performed by estimating the overall number of trees using the density of a representative sample and comparing with the number of trees found.
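Steps 2 and 4 to 6 can be sketched as below, assuming the per-pixel similarity of Step 3 is already available as a 2D list. The function names, the `sample_similarity` callback, the 10 to 90 degree elevation range, and the flat circular stamp built from pixel offsets are all assumptions made for the example.

```python
def estimate_illumination(sample_similarity, step=10):
    """Step 2: try every (elevation, azimuth) pair in `step`-degree
    increments and keep the pair that best matches the sample tree."""
    grid = ((elev, azim) for elev in range(step, 91, step)
            for azim in range(0, 360, step))
    return max(grid, key=lambda pair: sample_similarity(*pair))

def place_trees(similarity_map, diameter, min_similarity, max_overlap=0.0):
    """Steps 4-6: greedy placement of tree stamps by decreasing similarity."""
    rows, cols = len(similarity_map), len(similarity_map[0])
    radius = diameter / 2.0
    reach = int(radius)
    # Flat circular stamp expressed as offsets from the center pixel.
    stamp = [(di, dj) for di in range(-reach, reach + 1)
             for dj in range(-reach, reach + 1)
             if di * di + dj * dj <= radius * radius]
    # Step 4: discard pixels under the minimum, sort the rest.
    candidates = sorted(
        ((s, i, j) for i, row in enumerate(similarity_map)
         for j, s in enumerate(row) if s >= min_similarity),
        key=lambda item: -item[0])
    occupied = [[False] * cols for _ in range(rows)]
    trees = []
    for s, i, j in candidates:
        # Step 5: temporary stamp, clipped to the image.
        pixels = [(i + di, j + dj) for di, dj in stamp
                  if 0 <= i + di < rows and 0 <= j + dj < cols]
        # Step 6: reject if too much of the stamp is already occupied.
        taken = sum(occupied[p][q] for p, q in pixels)
        if taken > max_overlap * len(pixels):
            continue
        for p, q in pixels:
            occupied[p][q] = True
        trees.append((i, j, s))
    return trees
```

Because candidates are visited in decreasing order of similarity, a strong peak always claims its space before weaker neighbours, which matches the third principle stated above (the pixel with the highest similarity is the most likely tree center).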
3.2. Testing the MPP–TM approach
3.2.1. Urban trees
Urban trees play an important role in the welfare and quality of life in cities. They contribute to improving air and water quality, mitigate carbon dioxide and other pollutants, moderate the microclimate and air temperature, help control soil erosion, reduce the flow of rainwater, and support biodiversity [37–39]. A good knowledge of the species planted in cities and their health contributes to the inventory and management of these trees. To fulfill their role in the urban environment, trees need to be looked after through maintenance practices such as pruning and monitoring for pests and diseases.
A WorldView-2 (WV-2) image of the campus of the Universidade Federal de Minas Gerais (UFMG) (and surroundings) in Belo Horizonte, Brazil, was used as our test data (Figure 11). The scene was already orthorectified and radiometrically corrected. Although WV-2 offers nine different spectral bands, only the panchromatic band (λ ≈ 450–800 nm) with a ground resolution of 50 cm was used since all other bands have a ground resolution of 2 m.
Three WV-2 sub-images were selected to test the performance of the MPP–TM algorithm (Figure 12). These images were chosen from different contexts with both isolated and grouped trees and with other objects present in the scene. A wide variety of crown radii is also present in these images. The first two images (Figure 12a and b) are from the university campus of UFMG, and the last is from an urban park (Figure 12c).
To assess the quality of the results produced by MPP–TM, we compared them with a visual interpretation of the trees in the image. Only tree counting was used for validation, considering the following situations: 1) true positives (TP) for found trees, 2) false positives (FP) when a detected object is not a tree, and 3) false negatives (FN) for trees not encountered. The success score was computed as follows (Equation 7):

$$\text{Score} = \frac{T_D - FP}{N_T + FN} \times 100$$

where $T_D$ represents the total detected trees (so that $T_D - FP$ is the number of true positives) and $N_T$ is the total number of trees.
| Image | Number of trees | Trees detected | False positives | False negatives | Overall accuracy (%) |
|---|---|---|---|---|---|
| WV image 1 | 47 | 43 | 3 | 8 | 72.73 |
| WV image 2 | 50 | 59 | 8 | 5 | 92.73 |
| WV image 3 | 175 | 161 | 5 | 20 | 80.00 |
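The tabulated accuracies are consistent with a score computed as the detected trees minus the false positives, divided by the total number of trees plus the false negatives, which is the same form given for the orchard tests. The short check below, with hypothetical function and variable names, reproduces the three values from the counts in the table.

```python
def success_score(n_trees, n_detected, false_pos, false_neg):
    """Percentage score: true positives over (total trees + false negatives)."""
    true_pos = n_detected - false_pos
    return 100.0 * true_pos / (n_trees + false_neg)

# (number of trees, trees detected, false positives, false negatives)
counts = {
    "WV image 1": (47, 43, 3, 8),
    "WV image 2": (50, 59, 8, 5),
    "WV image 3": (175, 161, 5, 20),
}
scores = {name: round(success_score(*row), 2) for name, row in counts.items()}
# scores == {"WV image 1": 72.73, "WV image 2": 92.73, "WV image 3": 80.0}
```

The mean of the three scores is about 82%.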
In the two images of the campus, the program was able to find 73% and 93% of the trees, respectively, with very few errors in isolated trees (Figure 13a and b). The presence of other objects (buildings, streets, and sidewalks) did not hinder the identification of trees, and few false positives (3 and 8, respectively) were found. In both images, MPP–TM was able to find most grouped trees, but the crown diameter was often slightly off. It should be noted that some cases are difficult to identify and delineate correctly even visually. Most errors came from dividing a single crown into two or from including two different crowns in a single object.
The WV-2 image 3 is from a protected urban park area with predominantly isolated trees and a relatively homogeneous crown size of about 6 m (Figure 13c). A total of 161 objects were detected with only 5 false positives and 20 false negatives, for an overall success of 80%. Although most deciduous trees were selected, the crown size was often incorrect. Given the highly irregular shape of many of these trees, this was somewhat expected, and similar problems have been reported by reference .
The overall similarity tends to increase over the iterations as the image is progressively occupied by found trees. The standard deviation, however, behaves very differently for each image and is mostly related to the amount of contrast in the original image. Images with highly contrasting objects (e.g., building tops) tend to show a progressively decreasing standard deviation. In images of low contrast, it tends to increase as trees are progressively added because each tree contributes both a sunlit and a shaded side.
3.2.2. Orchards
Orchards are collections of individual trees, often arranged regularly, for which the MPP–TM algorithm could easily be adapted. Tree counting in orchards can be very useful for inventory and management purposes. For instance, the European Union (EU) Common Agricultural Policy (CAP) regulations (EC 73/2009) provide support for permanent crops such as hazelnuts, almonds, walnuts, and fruits in general [40–42]. Eligible orchards need to have a certain size and tree density depending on the type of crop. It has been estimated that orchard fruit production represents approximately 3–4% of the total arable land, so the task of estimating fruit production needs tools for counting trees in a timely fashion. Furthermore, the task can take advantage of the near-global high-resolution image cover provided by Google (Google Earth and Google Maps) and other Internet-based image services.
Orchards are plantations of trees of the same species and often of the same age. Consequently, orchard trees usually have similar size and shape and are regularly spaced. Image processing can easily be adapted to such a task provided VHR images are available. To illustrate the adapted MPP–TM algorithm (which is no longer a real MPP), we tested it on three different types of orchards: a mango plantation in Brazil near Juazeiro, a walnut plantation in France near Grenoble, and an olive plantation in Italy near Bracciano. The three images were extracted directly from Google Earth and were of relatively poor quality, as they appeared to have been enhanced for sharpness. To validate the results, we asked three geography students to manually interpret and mark the trees belonging to orchards in the three test images, and we evaluated the results in the following way:
the total number of trees (NT) was determined by the interpreters;
matched trees were counted as true positives (TP), defined as the number of trees found by the algorithm minus the false positives;
unmatched trees (present on the image but absent from the results) were counted as false negatives (FN);
trees marked by the algorithm but not by the interpreters were counted as false positives (FP);
the final accuracy was computed as TP / (NT + FN).
To be fair, the interpreters were told not to mark the trees that seemed too small or too big for the orchards. In addition, valid trees that were found by the algorithm but did not belong to an orchard were not counted as false positives. As a further improvement, restricting the search to the boundaries of the orchards would increase the accuracy and enable the similarity parameter to be relaxed. The addition of other spectral bands should also improve the results.
|Test Image||Number of trees||True positives||False positives||False negatives||Overall accuracy|
Table 5 shows an overview of the results for the three test images, and Figure 14 shows the graphical results. The top row shows the original images, the center row shows the results of the tree identification (as well as false positives and negatives), and the bottom row displays a detailed section of the image on which the results were overlaid. The Grenoble test image (Figure 14 left column) is characterized by densely arranged walnut trees with large round crowns, so the model correlated well with the trees on the image, but the fact that the trees are close to one another produced a relatively large number of misses (103). This forced us to relax the similarity threshold, which caused a few false positives (69). In the case of the Bracciano image (Figure 14 center column), the olive trees are more irregularly shaped than the walnut trees, and the relaxation of the similarity threshold caused a large number of false positives, especially in the nearby forested areas. Conversely, very few trees were missed. Finally, the last test image from Juazeiro (Figure 14 right column) is populated by mango trees that, like the walnut trees, have large round crowns. Still, the algorithm produced a fair number of both false positives and false negatives, mainly because of the variation in tree crown size and the dirt road at the top of the image, which created a pattern of light and shade similar to the trees (approximately one-third of the false positives came from that road). Despite being very different, the three images produced similar accuracies, between 90 and 93%.
4. Final considerations
The detection of individual tree crowns in images of very high resolution is a growing and challenging field of research within the remote sensing community. In addition to the structural complexity of the forest, many other factors such as the characteristics of the scene (topography, illumination, and other environmental variables) and forest type (season and biodiversity) make the task difficult. According to reference , the ability to achieve individual tree crown delineation of all trees in a forest was recognized as an unrealistic expectation.
In an effort to provide the reader with an overview of the current state of the research in tree crown detection, Part I presented a brief review of some of the most common computerized techniques for detecting and delineating trees in optical VHR images. Part II described the concepts and implementation of a novel approach based on two mathematical/pattern recognition concepts integrated to improve performance. MPP–TM combines marked point processes and template matching so that the former can take advantage of a mark built from a geometrical-optical model.
MPP–TM was highly effective in finding trees in an urban environment with images from the WorldView-2 satellite (ground resolution of 50 cm). A total of 263 trees out of 272 were detected (96%), and taking false positives and false negatives into account, an average success rate of about 82% was still achieved. The algorithm was also adapted for tree counting, an application often needed in large orchards, where the approach works very well when the trees are easily distinguishable. Results from three datasets of different crops show an average success better than 90%. Out of 5806 trees, 5537 were found excluding all false positives.
The growing availability of VHR images from commercial satellites or even from web mapping services opens a wide field of applications, especially as VHR multispectral images are becoming increasingly common. Multi-temporal studies will further strengthen these applications for monitoring purposes.
Finally, we should mention that Lidar (light detection and ranging) data are also becoming widely available, and their integration with VHR images promises to further improve the results of tree detection algorithms. By adding a third dimension to the images, Lidar reduces the probability of errors by strengthening the evidence around the digital representation of trees.
We are grateful to François Gougeon, Donald Leckie, Guillaume Perrin and Mats Erikson for having kindly provided the rights of reproduction of their figures.