Open access peer-reviewed chapter

High-Resolution Object-Based Building Extraction Using PCA of LiDAR nDSM and Aerial Photos

Written By

Alfred Cal

Submitted: 06 February 2020 Reviewed: 23 April 2020 Published: 01 September 2020

DOI: 10.5772/intechopen.92640

From the Edited Volume

Spatial Variability in Environmental Science - Patterns, Processes, and Analyses

Edited by John P. Tiefenbacher and Davod Poreh


Accurate and precise building extraction has become an essential requirement for various applications, such as impact analysis of flooding. This chapter seeks to improve on current and past methods of building extraction by using principal component analysis (PCA) of LiDAR height (nDSM) and aerial photos (four bands: RGB and NIR) in an object-based image classification (OBIA). The approach combines aerial photos at 0.1-m spatial resolution with a LiDAR nDSM at 1-m spatial resolution for precise, high-resolution building extraction. Because the aerial photos contribute four of the bands in the PCA process, the resolution of the image is potentially maintained, and building outlines can therefore be extracted at a high resolution of 0.1 m. A total of five experiments were conducted using combinations of different LiDAR derivatives and aerial photo bands in a PCA. The PCA of the LiDAR nDSM with the RGB and NIR bands produced the best result, with a completeness of 87.644% and a correctness of 93.220%. This chapter addresses known drawbacks of building extraction, such as the extraction of small buildings and the production of smooth, well-defined building outlines.


  • building footprints
  • LiDAR
  • nDSM
  • principal components analysis
  • object-based image classification

1. Introduction

Over the years, high-resolution building extraction has become an essential requirement for various applications such as flood modeling, urban planning, and 3D building modeling. In flood modeling scenarios, buildings are among the most important structures at risk. Buildings house people and other valuable assets; therefore, proper representation of buildings is very important for flood managers. Currently, building extraction is carried out using a mixture of data sources and algorithms. High-resolution aerial imagery and LiDAR are commonly integrated for more accurate building extraction. Aerial images provide spectral information, while LiDAR data provide height and intensity information. By fusing 2D aerial images and 3D information from LiDAR, complementary information can be exploited to improve automatic building extraction and the accuracy of the building roof outline [1].

Several techniques are used for building extraction in the remote sensing field. One such technique is image fusion: the combination of two or more different images, by means of an algorithm, to form a new image that provides more and better information about an object or a study area [2]. Multispectral data such as aerial photos offer spectral information at high spatial resolution, whereas LiDAR data offer height and intensity information. Thus, buildings can be extracted based on their height from LiDAR and their spectral information from aerial photos, which improves the spatial resolution of roof edges [3].

Many image fusion methods are available, including intensity-hue-saturation (IHS), the Brovey transform (BT), and principal component analysis (PCA) [2]. PCA is a technique that reduces the dimensionality of multivariate data while preserving as much of the relevant information as possible. It also transforms a correlated dataset into an uncorrelated one [2]. In this study, a fusion of aerial photos and LiDAR datasets using PCA is expected to help detect and extract buildings accurately at a high spatial resolution. The advantage of using PCA as an image fusion technique for feature extraction is that the resulting variables are uncorrelated with each other while still retaining the most valuable parts of the input variables. By contrast, other transformations have drawbacks: IHS destroys the spectral characteristics of the image data, which are important for feature extraction, and the Brovey transform depresses the image values during image fusion [2].

For building extraction, some form of image classification technique is needed. Pixel-based (spectral pattern recognition) and object-based (spatial pattern recognition) approaches are the two common groups of image classification techniques. Traditionally, image classification has been pixel-based, using different classifiers in supervised and unsupervised classification (e.g., K-Means, Maximum Likelihood, etc.). These pixel-based procedures analyze the spectral properties of every pixel within the area of interest without taking into account the spatial or contextual information related to that pixel [4]. Object-based classification techniques, in contrast, start by grouping neighboring pixels into meaningful areas. Object-based feature extraction is a relatively modern technique for extracting objects such as buildings and roads in urban environments, and its advantage lies in classifying objects represented by a group of pixels. More specifically, image objects are groups of pixels that are similar to one another based on a measure of spectral properties (i.e., color), size, shape, and texture, as well as context from a neighborhood surrounding the pixels [5].

The goal of this chapter is to improve on past and existing methods of building extraction by introducing the use of principal component analysis (PCA) of LiDAR height (nDSM) and aerial photos (RGB and NIR) in an OBIA. This approach was evaluated by comparing the accuracy and quality of building extraction on five PCA datasets: (1) PCA of RGB and nDSM, (2) PCA of RGB, NIR, nDSM and slope, (3) PCA of RGB, nDSM and NDVI, (4) PCA of RGB, NIR, nDSM and NDVI, and (5) PCA of RGB, NIR and nDSM.

By evaluating these band combinations using the PCA approach for building extraction, the author seeks to answer the research questions of this study: which PCA parameter has the most influence on the detection and extraction of buildings, and which band combinations can produce a satisfactory building extraction in terms of completeness, correctness, and quality. This approach extracts buildings at a high spatial resolution of 0.1 m, which allows the building outline to be extracted at the same high spatial resolution. This study introduces a novel approach of using PCA for precise, high-resolution building footprint extraction with an object-based image classification technique in a semi-automated process. Furthermore, the approach was validated by comparing the resultant building footprints using the quantitative and qualitative analyses discussed in the results section.

The proposed method has been tested on a 1 km² area of Vista del Mar, Ladyville village, Belize, Central America. The chosen area has a relatively flat landscape with a combination of different building sizes and shapes, vegetation cover, and waterbodies. The datasets used in this study are a LiDAR point cloud and aerial photos. Both were made available by the Ministry of Works (MOW), Belize, and are described in more detail in Section 3.

The rest of the chapter is organized as follows: Section 2 presents a background of building extraction from LiDAR data and multispectral images. Section 3 details the data and methods used for extraction. The results are discussed in Section 4. Finally, concluding remarks are offered in Section 5.


2. Background and related works

There have been many studies of building extraction techniques using different approaches. Some studies use only LiDAR data, others use only multispectral images, and others use a combination of LiDAR and multispectral images to extract building outlines for various applications. Several methods for building extraction from LiDAR data have been presented during the last decades. Based on the data used, building extraction methods can generally be divided into three categories: 2D (two-dimensional) imagery based, fused 2D-3D information based, and 3D LiDAR based [1].

Studies that use only LiDAR data for building extraction include [1, 6, 7, 8, 9]. A typical procedure for combining geometric features to extract buildings is first to filter the DTM from the LiDAR data and then to separate the DSM data into ground points and non-ground points (including vegetation and buildings) by height difference [7]. In flood modeling applications, the DSM is normally combined with the DEM to derive a difference image, also called a normalized height image. The difference image is the result of subtracting the DEM from the DSM, which gives the absolute height (height above ground) of buildings and trees in the study area. Digital Surface Models (DSMs) offer the possibility of extracting the elevations of surface features to leave the ground surface DEM [9]. Airborne LiDAR laser scanning devices can provide digital surface models that can be used to separate surface features from the ground for modeling flood inundation from rivers in urban and semi-urban environments [9]. However, the 3D information provided by LiDAR cannot solve all automated building extraction problems. A typical example is that extra information, such as color or brightness, is needed to separate nearby trees from buildings [8]. In addition, only those points that are classified as buildings in the point cloud data are extracted, so tiny or small buildings that are not classified in the point cloud will not be extracted. The method reported in [7] likewise still has some deficiency in extracting very small buildings, and further improvement is considered necessary [7]. Furthermore, when a method uses LiDAR data alone, the planimetric accuracy is limited by the LiDAR point density, and the method in [6] does not currently incorporate smoothing of the boundaries of extracted planar segments [6]. Finally, it is hard to obtain a detailed and geometrically precise boundary using only LiDAR point clouds [10].

Studies that use only images for building extraction include [3, 11, 12, 13]. In remote sensing, building extraction from high-resolution imagery has been a common field of research. Many algorithms have been presented for the extraction of buildings from satellite images and aerial photos. These algorithms have mainly considered radiometric, geometric, edge detection, and shadow criteria approaches [13]. Although promising results have been obtained from these 2D information-based methods, shadows and occlusions lead to significant errors that cannot be avoided, especially in densely developed areas. Consequently, these methods are considered insufficiently automated and reliable for practical applications [6].

The third category of building extraction uses a combination of LiDAR data and multispectral images in an image fusion technique. This approach exploits the mutual benefits of both datasets for accurate building extraction. By fusing 2D images and 3D information from LiDAR, complementary information can be exploited to improve automatic building extraction and the accuracy of the building roof outline [1]. This method has been widely studied in [14, 15, 16]. Building detection techniques integrating LiDAR data and imagery can be divided into two groups: techniques that use the LiDAR data as the primary cue for building detection, and techniques that use both the LiDAR data and the imagery as primary cues to delineate building outlines [15]. In this approach, LiDAR height and intensity are usually used along with aerial imagery to improve the classification of buildings. However, the challenge of how to integrate the two data sources for building boundary extraction remains, and few approaches with technical details have thus far been published [14].


3. Data and methods

The study area for this chapter is Ladyville Village, Belize, Central America. Ladyville was once a small coastal settlement separated from other communities, but over the years it has seen an increase in development and in population. Development has turned the village into a sizable town, which is sometimes considered a suburb of Belize City. Belize City, the largest city in Belize, is only a few minutes’ drive from Ladyville, with the Belize River separating the two settlements. Ladyville lies north of Belize City, on the coast, along the Belize River and the Philip Goldson Highway, in the lower reach of the Belize River watershed. The study area map is provided in Figure 1.

Figure 1.

Study area location map.

The topography of Ladyville is mostly flat. It is part of Belize’s coastal lowland and of the natural floodplain of the Belize River. Its natural vegetation is mostly broadleaf lowland forest and marshland with meandering creeks, lagoons, and mangrove forest along the coast. The Ladyville area was also a location where excavations were done to gather fill for sites in Belize City. Ladyville was chosen as a study area because it is one of the communities most vulnerable to natural disasters and because of its strategic importance. Ladyville is located between the Belize River and the Caribbean Sea, which makes it highly vulnerable both to river flooding from the Belize River and to storm surge flooding from the Caribbean Sea caused by hurricanes or other oceanic events.

The datasets used in this study are LiDAR and aerial photos. The LiDAR point cloud data was provided in LAS format version 1.4. LAS is a standard data exchange format for LiDAR point cloud data established by the American Society for Photogrammetry and Remote Sensing (ASPRS). The data has an average point spacing of 0.3 m and was classified into ground, low vegetation, medium vegetation, high vegetation, buildings, and noise. The aerial photos were taken during the same period as the airborne LiDAR surveys, in August 2017. The imagery has a high spatial resolution of 0.1 m (10 cm) and four bands: Red, Green, Blue, and NIR (Near Infrared).

To complete the semi-automated building extraction process, a workflow was developed. The workflow, shown in Figure 2 below, includes the following steps: LiDAR Pre-processing, PCA, OBIA, Segmentation, Feature Extraction, Training Sites, Image Classification, Rule-based Classification, Accuracy Assessment, and Regularize Building Outline.

Figure 2.

OBIA building extraction workflow diagram.

3.1 LiDAR nDSM pre-processing

LiDAR pre-processing involves separating ground points from non-ground points. As a result, two rasters were created at a spatial resolution of 1 m: the digital elevation model (DEM), built from ground points only, and the digital surface model (DSM), built from non-ground points. An nDSM was then created by subtracting the DEM from the DSM. The normalized digital surface model (nDSM) represents the absolute height (height above ground) of objects in the study area such as buildings and trees. The LiDAR nDSM was then combined, as a separate band, with the four aerial photo bands. This adds LiDAR height information to the aerial photos, and height is an essential characteristic for separating buildings from other features.
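The subtraction described above can be sketched in a few lines. This is a minimal illustration rather than the chapter's actual processing chain: the function name `compute_ndsm`, the tiny 2 × 3 grids, and the choice to clamp negative differences to zero are all assumptions made for the example.

```python
def compute_ndsm(dsm, dem):
    """Return nDSM = DSM - DEM, cell by cell, for two equally sized grids.

    Negative differences (small interpolation artefacts) are clamped to 0;
    this clamping is an assumption of the sketch, not stated in the chapter.
    """
    ndsm = []
    for dsm_row, dem_row in zip(dsm, dem):
        ndsm.append([max(s - g, 0.0) for s, g in zip(dsm_row, dem_row)])
    return ndsm


# Hypothetical 1-m cells: elevations in metres above the vertical datum.
dsm = [[2.0, 8.5, 8.4],
       [2.1, 8.6, 2.2]]
dem = [[2.0, 2.0, 2.1],
       [2.1, 2.1, 2.3]]

ndsm = compute_ndsm(dsm, dem)
for row in ndsm:
    print([round(v, 2) for v in row])
```

Cells covered by a roof end up with heights of several metres, while bare-ground cells stay at or near zero, which is what makes the nDSM band useful for separating buildings from the ground.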

3.2 PCA

Principal component analysis is a technique used to reduce the dimensionality of multivariate and multispectral datasets such as images, with the aim of preserving as much of the relevant information as possible. PCA provides a method for reducing the redundant information present in multi-dimensional datasets, so that an object can be represented with much less information than in the original image. The correlation between bands is minimized by mathematically transforming the multi-band data into another vector space with a new basis [17]. PCA was performed on the aerial photos in combination with the LiDAR nDSM raster. The result is a single multiband raster; for the LiDAR nDSM and the four aerial photo bands, this is one raster dataset with five bands [18].
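To make the transformation concrete, the sketch below runs PCA on a small synthetic five-band stack with NumPy. It is an illustration under stated assumptions, not the software routine used in the chapter: the function `pca_band_stack` is hypothetical, the data are random, and all bands are assumed to have been resampled onto one common grid beforehand.

```python
import numpy as np


def pca_band_stack(bands):
    """PCA of a band stack shaped (n_bands, rows, cols).

    Returns the principal-component rasters (same shape, ordered by
    decreasing variance) and the fraction of variance each one explains.
    """
    n_bands, rows, cols = bands.shape
    x = bands.reshape(n_bands, -1).astype(float)   # one row per band
    x -= x.mean(axis=1, keepdims=True)             # centre each band
    cov = np.cov(x)                                # band-to-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues ascending
    order = np.argsort(eigvals)[::-1]              # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = eigvecs.T @ x                            # project onto new basis
    return pcs.reshape(n_bands, rows, cols), eigvals / eigvals.sum()


# Synthetic stand-in for R, G, B, NIR (strongly correlated) plus an nDSM band.
rng = np.random.default_rng(0)
base = rng.random((20, 20))
optical = [base + 0.05 * rng.random((20, 20)) for _ in range(4)]
height = 10.0 * rng.random((20, 20))
bands = np.stack(optical + [height])

pcs, ratio = pca_band_stack(bands)
print(pcs.shape, np.round(ratio, 3))
```

The output raster has as many bands as the input, but the bands are now mutually uncorrelated and sorted by the variance they carry. The sketch does not rescale the bands before the transform; whether the chapter's software does so is not stated.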

3.3 Object-based image classification (OBIA)

Object-based image classification (OBIA) is seen as an advancement in land cover classification, where its advantage lies in the classification of objects represented by a group of pixels. OBIA approaches for analyzing remotely sensed data have been established and investigated since the 1970s. Object-oriented methods of image classification have become more popular in recent years due to the availability of software [19]. Object-based classification techniques start by the grouping of neighboring pixels into meaningful areas. This means that the segmentation and subsequent object topology generation is controlled by the resolution and the scale of the expected objects. In an object-based classified image, the elementary picture elements are no longer the pixels, but connected sets of pixels [20].

3.4 Segmentation

The segmentation process in OBIA is used to recognize, differentiate, and separate features within the image. It involves grouping pixels into regions or areas based on their similar spectral reflectance, texture, and area. Segmentation is defined as the delineation of the entire digital image into a number of segments or sets of pixels; the goal is to turn the objects present in the image into something more meaningful and usable [21]. The segmentation process depends on the scale, shape, and compactness of objects. Several tests are needed to determine the best scale to use for image segmentation.
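The idea of grouping neighboring pixels into segments can be illustrated with a very simple region-growing routine. This is deliberately not the scale/shape/compactness segmentation used in the chapter; it is a simplified stand-in that grows each segment from a seed pixel over 4-connected neighbors whose value lies within a tolerance of the seed. The function name and the sample values are hypothetical.

```python
from collections import deque


def region_grow(image, tolerance):
    """Label 4-connected regions whose pixels are within `tolerance` of the seed."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]      # 0 means "not yet labelled"
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0]:
                continue
            next_label += 1
            seed = image[r0][c0]
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:                             # breadth-first growth
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not labels[rr][cc]
                            and abs(image[rr][cc] - seed) <= tolerance):
                        labels[rr][cc] = next_label
                        queue.append((rr, cc))
    return labels


# Hypothetical single-band brightness values.
image = [[10, 11, 50],
         [10, 12, 52],
         [90, 51, 50]]
labels = region_grow(image, tolerance=5)
for row in labels:
    print(row)
```

A larger tolerance merges more pixels into each segment, which loosely mirrors how a larger scale parameter produces larger image objects.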

3.5 Feature extraction

The feature extraction process is performed after the image is segmented and involves searching for meaningful objects within the image, such as roads, vegetation, and buildings. This process makes it possible to isolate and extract only the object features of interest. The computed features can be statistical, such as mean height, or geometrical, such as shape, elongation, rectangularity, and compactness. These parameters play an important role in the final output of the extraction. The spatial and spectral properties are the two important factors for extraction [21]. The features extracted from the image bands or channels are used in the supervised classification of buildings.
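As a rough illustration of statistical and geometrical segment attributes, the sketch below computes a mean height and a simple rectangularity for each segment. The definitions are assumptions made for the example (rectangularity is taken here as segment area divided by the area of its axis-aligned bounding box); the chapter does not give the exact formulas used by its software.

```python
def segment_features(labels, height):
    """Per-segment area (pixels), mean height and bounding-box rectangularity."""
    stats = {}
    for r, (label_row, height_row) in enumerate(zip(labels, height)):
        for c, (lab, h) in enumerate(zip(label_row, height_row)):
            s = stats.setdefault(lab, {"n": 0, "sum_h": 0.0,
                                       "rmin": r, "rmax": r,
                                       "cmin": c, "cmax": c})
            s["n"] += 1
            s["sum_h"] += h
            s["rmin"], s["rmax"] = min(s["rmin"], r), max(s["rmax"], r)
            s["cmin"], s["cmax"] = min(s["cmin"], c), max(s["cmax"], c)

    features = {}
    for lab, s in stats.items():
        bbox_area = (s["rmax"] - s["rmin"] + 1) * (s["cmax"] - s["cmin"] + 1)
        features[lab] = {
            "area_px": s["n"],
            "mean_height": s["sum_h"] / s["n"],
            "rectangularity": s["n"] / bbox_area,   # 1.0 for a filled rectangle
        }
    return features


# Hypothetical segment labels and nDSM heights (metres) on the same grid.
labels = [[1, 1, 2],
          [1, 1, 2],
          [3, 2, 2]]
height = [[6.0, 6.0, 0.0],
          [6.0, 6.0, 0.0],
          [0.0, 0.0, 0.0]]

features = segment_features(labels, height)
for lab, feats in sorted(features.items()):
    print(lab, feats)
```

Attributes of this kind form the attribute table that the supervised and rule-based classification steps later query.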

3.6 Training sites

The training site step involves selecting training sites for the building classification. The building features selected are those with different characteristics, such as color and shape. During training site selection, other buildings can also be set aside for the accuracy assessment of the OBIA process. Buildings selected as training sites cannot be selected again for the accuracy assessment.

3.7 Classification

Classification involves a supervised classification of the buildings, for example using a support vector machine (SVM). SVM has recently been given much attention as a classification method. In recent studies, support vector machines were compared with other classification methods for remote sensing imagery, such as Neural Networks, Nearest Neighbor, Maximum Likelihood, and Decision Tree classifiers, and surpassed all of them in robustness and accuracy [22].

3.8 Accuracy assessment

After classification, accuracy assessment is needed to determine the reliability of the classification process. This can be done by creating an accuracy assessment report or visually inspecting the results of the classification using the original image of the study area.

3.9 Rule-based classification

No classification or extraction process is 100% accurate; therefore, improvements can be made using rule-based classification. This involves refining the results of the extraction process by using the attributes of the segmented layer. Geometrical rule-based classification involves selecting the desired shape, compactness, rectangularity, and elongation of objects, whereas statistical rule-based classification involves selecting the mean height or mean NIR values from the segmented layer to improve the extraction of buildings.

3.10 Regularize building outlines

After the building extraction, the building outlines appear well defined at a large scale of 1:1000, which is sufficient for various applications and scenarios. Nonetheless, when zooming in to a very large scale of 1:250, some jagged edges can be seen. These minor rough or jagged edges were eliminated by cleaning the edges of buildings, choosing a standard precision and tolerance value to regularize the building outlines.


4. Results

Several experiments were completed to determine the best combination of PCA raster data for the building extraction process. A total of five experiments were completed to determine the best scenario of building/roof extraction on a 1 sq. km area. The aerial photograph of this area shows a total of 584 buildings; therefore, the accuracy of building extraction was measured against this number. A total of 20 buildings were chosen as training sites, and these sites were used in all five approaches. Table 1 below gives a quantitative analysis of the process.

| PCA dataset | Commission percentage | Omission percentage | Completeness | Correctness | Quality |
|---|---|---|---|---|---|
| RGB and nDSM | 55.279% | 4.229% | 42.829% | 95.770% | 42.033% |
| RGB, NIR, nDSM and Slope | 40.336% | 42.055% | 59.663% | 57.944% | 41.634% |
| RGB, nDSM and NDVI | 43.828% | 2.417% | 56.171% | 97.582% | 54.650% |
| RGB, NIR, nDSM and NDVI | 37.608% | 3.563% | 62.391% | 96.436% | 60.985% |
| RGB, NIR and nDSM | 14.097% | 7.270% | 87.644% | 93.220% | 82.392% |

Table 1.

Result of area-based accuracy measures.

For the building segmentation process (Figure 3), a scale of 25, a shape of 0.5, and a compactness of 0.5 were used; these parameters create much smaller segments for smaller objects such as buildings. The object (polygon) layer created by the segmentation is accompanied by an attribute table containing a unique identification field for every object. Segmentation is completed only on the first three bands of the PCA raster, which are the equivalent of the RGB bands in the aerial photos. Figure 4 illustrates the size of the polygons used in the segmentation process. The building segments shown are small, and in some cases seven segments represent one building; this allows for a better building extraction with a well-defined building outline.

Figure 3.

Segmentation parameters of the OBIA.

Figure 4.

The size and scale of the segments used in the building extraction.

The first test was conducted using the PCA of RGB and nDSM. This PCA raster has a total of four bands: red, green, blue, and the height data from the nDSM. Segmentation was completed on the RGB bands only, whereas feature extraction was completed on all bands. Image segmentation is recommended only on the RGB bands, which provide the color, shape, and texture of objects in the study area. After feature extraction, all segments are given attribute information from all four bands, including the mean values of RGB and the height data from the nDSM. Using this approach, most buildings were selected; however, many other features were also selected as buildings, namely features with heights similar to those of buildings, such as vegetation and fences. In addition, the outlines of buildings are not well defined: although the buildings are correctly classified, their shapes do not look like realistic building outlines and would require extensive editing and adjustment (Figure 5).

Figure 5.

Building footprint extraction using PCA of RGB and nDSM.

The second test involves the PCA of RGB, NIR, nDSM and slope; this raster contains six bands. Slope was considered an additional parameter that could aid in roof extraction. However, the result of this approach is poor: many buildings are not classified, and the outlines of those that are selected are neither smooth nor well defined. It was observed that the additional bands in the PCA slightly lower the spatial resolution of the dataset, so objects are not well defined (Figure 6).

Figure 6.

Building footprint extraction using PCA of RGB, NIR, slope and nDSM.

The third approach includes the PCA of RGB, nDSM and NDVI, a total of five bands. The NDVI (normalized difference vegetation index) is used in remote sensing to analyze the health of vegetation and is commonly displayed on a color ramp from green (healthy) to red (not healthy). NDVI was included to try to separate vegetation from non-vegetated features such as buildings. The results (Figure 7) look promising in that all buildings are selected; however, other features such as waterbodies and roads are also classified as buildings, apparently because roads and waterbodies have NDVI values similar to those of buildings. A closer look shows that buildings close to each other are selected as one building and that most of their outlines are not well defined.
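For reference, NDVI is computed per pixel from the red and near-infrared bands as (NIR − Red) / (NIR + Red). The short sketch below applies that standard formula; the reflectance values are hypothetical and chosen only to show the contrast between a vegetated pixel and a roof-like pixel.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel, in [-1, 1]."""
    total = nir + red
    if total == 0:
        return 0.0          # avoid division by zero on empty pixels
    return (nir - red) / total


# Hypothetical reflectances (0 to 1), for illustration only.
samples = {
    "vegetation": {"nir": 0.50, "red": 0.08},
    "roof":       {"nir": 0.30, "red": 0.28},
}
for name, px in samples.items():
    print(f"{name}: NDVI = {ndvi(px['nir'], px['red']):.3f}")
```

Any non-vegetated surface whose red and NIR reflectances are close to each other ends up with an NDVI near zero, which is consistent with the confusion between buildings, roads, and waterbodies reported for this approach.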

Figure 7.

Building footprint extraction using PCA of RGB, nDSM and NDVI.

The fourth approach includes the PCA of RGB, NIR, nDSM and NDVI, a total of six bands. The NIR band was introduced in the PCA to see whether it could improve the separation of buildings from other features. The result is an improvement over the third approach; however, many building outlines are still not well represented and would require tedious editing and adjustment. Editing tasks such as splitting polygons and reshaping building boundaries would be exhausting (Figure 8).

Figure 8.

Building footprint extraction using PCA of RGB, NIR, nDSM and NDVI.

The fifth and final approach was conducted with the PCA of RGB, NIR and nDSM, a total of five bands. The results show a large improvement over all other approaches, both in selecting all features that are buildings and in producing well-defined building outlines, with a 92% extraction accuracy. Very few other features were classified as buildings using this approach (Figure 9).

Figure 9.

Building footprint extraction using PCA of RGB, NIR and nDSM.

Table 2 shows a comparison of the five approaches. The number of all segments is the total number of segments of all features within the 1 sq. km area for each approach. The segments classified as buildings are those segments, out of all segments, that were assigned to buildings. It is important to note that, on average, six segments represent the entire outline of one building, although this depends on the size of the building. Buildings correctly classified are those buildings that are correctly classified as buildings, even if their shape and outline are not properly represented. Buildings properly represented are those buildings that are correctly classified and whose shape or outline is completely represented. The percentage of properly represented buildings was calculated from the total number (584) of actual buildings within the 1 sq. km area.

| PCA dataset | Number of all segments | Segments classified as buildings | Buildings correctly classified | Buildings properly represented | Percentage with actual number of buildings (584) |
|---|---|---|---|---|---|
| RGB and nDSM | 122,200 | 29,676 | 578 | 247 | 42% |
| RGB, NIR, nDSM and Slope | 189,451 | 13,796 | 443 | 163 | 28% |
| RGB, nDSM and NDVI | 164,788 | 22,946 | 584 | 326 | 56% |
| RGB, NIR, nDSM and NDVI | 18,381 | 23,332 | 584 | 357 | 61% |
| RGB, NIR and nDSM | 122,651 | 7471 | 584 | 537 | 92% |

Table 2.

Comparison of the five building footprint extraction approaches.
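The percentages in the last column of Table 2 follow directly from the counts of properly represented buildings and the 584 reference buildings. A short check of that arithmetic, with variable names chosen only for this illustration:

```python
ACTUAL_BUILDINGS = 584  # buildings counted on the aerial photograph

# "Buildings properly represented" column of Table 2.
properly_represented = {
    "RGB and nDSM": 247,
    "RGB, NIR, nDSM and Slope": 163,
    "RGB, nDSM and NDVI": 326,
    "RGB, NIR, nDSM and NDVI": 357,
    "RGB, NIR and nDSM": 537,
}

# Share of the 584 reference buildings, rounded to whole percent.
percentages = {
    name: round(100 * count / ACTUAL_BUILDINGS)
    for name, count in properly_represented.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```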

Of all the approaches discussed above, the second approach, which includes slope, gives the worst result, with an extraction accuracy of 28%. The slope band does not aid the building extraction; instead, the additional band slightly lowers the resolution of the raster data. In OBIA, color, shape, texture, compactness, and high resolution are all needed for a smooth and realistic outline of buildings. It is important to maintain the resolution of the aerial photos, because it is used in the segmentation process. Figure 10 shows the PCA of RGB, NIR and nDSM on the left and the PCA of RGB, NIR, nDSM and slope on the right. The image on the right has a reduced resolution, which can be seen around the edges of buildings, where they become fuzzy.

Figure 10.

Comparison of PCA images resolution.

Segmentation at a high resolution such as 0.1 m will allow a smoother and a more defined outline of the buildings. Image segmentation completed using 1-m spatial resolution such as the LiDAR nDSM and slope will show a jagged and irregular shape of buildings.

Using a visual binary comparison method for building extraction, as shown in Table 2, the PCA of RGB, NIR and nDSM produced the best result of all five approaches. Buildings are correctly classified and properly represented, with a total of 92% accuracy of extraction relative to the total number of buildings within the 1 sq. km area. Table 2 also shows that this approach has the smallest number of segments assigned to buildings, 7471. The smaller number of segments allows for better classification of buildings and leaves very few fragments of other features wrongly classified as buildings.

Another evaluation of the accuracy of the extraction process was conducted using the completeness and correctness method, also known as area-based accuracy measures. This method measures the completeness, correctness, and quality of the building extraction process. The purpose of area-based accuracy measures is to obtain stable accuracy measurements. The area-based accuracy measures (i.e., correctness, completeness, and quality) are designed for OBIA evaluation [23]. In addition, this method can be used to calculate the commission and omission errors of building extraction. To apply this method, reference building polygons and the extracted building polygons are needed. The reference data used are the 584 building polygons within the 1 sq. km area. The equations used are shown in Figure 11.

Figure 11.

Area-based accuracy measures, source: [24].

The completeness is the percentage of entities in the reference data that were detected, and the correctness indicates how well the detected entities match the reference data [25]. The quality of the results provides a compound performance metric that balances completeness and correctness [24]. As used in this chapter, TP (True Positive) denotes those areas correctly classified as buildings, FN (False Negative) denotes those areas that are classified as buildings but are not buildings according to the reference data, and FP (False Positive) denotes those areas that are not classified as buildings during the extraction process but are actual buildings according to the reference data. Error of commission corresponds to FN, that is, areas wrongly classified as buildings, and error of omission corresponds to FP, that is, areas that are buildings but were not extracted. Errors of commission and omission are commonly used in the evaluation of building classification and are presented as percentages. The errors of commission and omission, completeness, correctness, and quality were computed for all five building extraction approaches (Table 1). For illustration purposes, the area-based accuracy measures are worked through below for the PCA of RGB, NIR and nDSM using the equations in Figure 11. The total area of the reference data (584 buildings) is 72,360.357 sq. m. The total extracted area (classified buildings) for the PCA of RGB, NIR and nDSM is 76,963.690 sq. m. The figures are as follows.

  1. TP = 67454.350 sq. m. This is the area correctly classified as buildings in the extraction process.

  2. FN = 9509.340 sq. m. This is the area wrongly classified as buildings during extraction.

  3. FP = 4906.007 sq. m. This is the area of actual buildings that was not detected as buildings.

  4. Completeness = TP/(TP + FN) = 67454.350/(67454.350 + 9509.340) = 0.876 (87.644%).

  5. Correctness = TP/(TP + FP) = 67454.350/(67454.350 + 4906.007) = 0.932 (93.220%).

  6. Quality − TP/(TP + FN + FP) = 67454.350/(67454.350 + 9509.340 + 4906.007) = 0.823 (82.392%).

  7. Commission error = FN/TP = 9509.340/67454.350 = 0.140 (14.097%).

  8. Omission error = FP/TP = 4906.007/67454.350 = 0.072 (7.27%).
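
The arithmetic above can be reproduced with a short script. The following is a minimal sketch, not the tool used in the study. It keeps the chapter's own labeling, in which FN is the area wrongly extracted as buildings and FP is the building area that was missed; other sources may assign these two labels differently. The variable and function names are hypothetical, and deriving FN and FP as differences between the total areas and TP is an assumption that happens to reproduce the values listed above.

```python
# Areas in square metres, taken from the worked example in the text.
reference_area = 72360.357   # 584 reference building polygons
extracted_area = 76963.690   # buildings extracted with the PCA of RGB, NIR and nDSM
tp = 67454.350               # overlap between extracted and reference buildings

# Labels follow the chapter's usage (an assumption kept so the numbers match).
fn = extracted_area - tp     # extracted as building but not a building
fp = reference_area - tp     # building in the reference data but not extracted


def area_based_measures(tp, fn, fp):
    """Return the five area-based measures as fractions between 0 and 1."""
    return {
        "completeness": tp / (tp + fn),
        "correctness": tp / (tp + fp),
        "quality": tp / (tp + fn + fp),
        "commission_error": fn / tp,
        "omission_error": fp / tp,
    }


measures = area_based_measures(tp, fn, fp)
for name, value in measures.items():
    print(f"{name}: {value * 100:.3f}%")
```

Running the sketch should give percentages that agree with the list above to the precision shown there. The same function could be applied to the areas of the other four approaches.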

The calculation illustrated above was repeated for the other four building extraction approaches. The result is shown in Table 1. Under the area-based accuracy measures, a complete and correct building extraction is indicated by low commission and omission percentages and high completeness, correctness, and quality percentages. Of the five approaches, the PCA of RGB, NIR and nDSM best meets these criteria, with low commission and omission percentages, a completeness of 87.644%, a correctness of 93.220%, and a quality of 82.392%. In Table 1, some of the other approaches have a higher correctness value; however, their completeness and quality are poor.

No classification technique is 100% accurate; however, in OBIA, a rule-based classification can be used to improve the classification results. This was applied to the best approach discussed above, the PCA of RGB, NIR and nDSM. The rule-based approach is only applicable once a classification has been completed. In this case, it removes unwanted features that are not buildings by selecting only features that have the characteristics of buildings, such as size, shape, and height. A query built on the attribute information of the classified buildings completes this step. Figure 12 demonstrates the technique. The image on the left shows features classified as buildings; on closer inspection, the fences around these buildings are classified as buildings as well. The rule-based approach improves this by selecting buildings within a certain height range, since in most cases fences are lower than houses. Therefore, a threshold is set between 6 m as the average height and 12 m as the average maximum height. This eliminates the fences, as shown in the image on the right, where they remain in red, while the features that meet the criteria are selected and shown in orange.

Figure 12.

Rule-based extraction of building footprints.
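
The height rule described above can be sketched as a simple attribute filter. This is only an illustration under stated assumptions: the attribute name `mean_height_m`, the sample records, and the reading of the 6 m and 12 m thresholds as an inclusive range are all hypothetical and are not the query used in the study.

```python
# Hypothetical attribute records for segments that were classified as buildings.
segments = [
    {"id": 1, "mean_height_m": 7.4},   # house
    {"id": 2, "mean_height_m": 1.8},   # fence
    {"id": 3, "mean_height_m": 11.2},  # taller building
    {"id": 4, "mean_height_m": 2.1},   # fence
]

# Thresholds quoted in the text, read here as an inclusive range (assumption).
MIN_HEIGHT_M = 6.0
MAX_HEIGHT_M = 12.0


def select_buildings(segments, lo=MIN_HEIGHT_M, hi=MAX_HEIGHT_M):
    """Keep only the segments whose height attribute lies within [lo, hi]."""
    return [s for s in segments if lo <= s["mean_height_m"] <= hi]


kept = select_buildings(segments)
rejected = [s for s in segments if s not in kept]
print("kept:", [s["id"] for s in kept])
print("rejected:", [s["id"] for s in rejected])
```

In a GIS, the same rule would normally be written as an attribute query on the classified polygons; the list comprehension above only mirrors that logic.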

The rule-based classification demonstrated a technique for improving the classification results by removing unwanted features based on their attribute information (Figure 13). However, geometrical information can be used as well, which is important where vegetation is classified as buildings. It was observed that segments of vegetation are mostly circular in shape, whereas segments of buildings are rectangular. A rectangularity threshold can therefore be used to eliminate vegetation that is wrongly classified as buildings.

Figure 13.

Building footprints extraction using attribute information.
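
To make the rectangularity idea concrete, the sketch below uses one common definition: the area of a segment divided by the area of its minimum-area bounding rectangle, so that a perfect rectangle scores 1 and a circular blob scores about 0.79. The chapter does not state which definition or threshold value was used, so the definition, the function names, the sample shapes, and the threshold of 0.85 are all assumptions made for illustration.

```python
import math


def polygon_area(pts):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_bounding_rectangle_area(pts):
    """Smallest bounding-rectangle area; one side always lies along a hull edge."""
    hull = convex_hull(pts)
    best = float("inf")
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(theta), math.sin(theta)
        xs = [x * c + y * s for x, y in hull]    # rotate so the edge is horizontal
        ys = [-x * s + y * c for x, y in hull]
        best = min(best, (max(xs) - min(xs)) * (max(ys) - min(ys)))
    return best


def rectangularity(pts):
    """Polygon area divided by the area of its minimum bounding rectangle."""
    return polygon_area(pts) / min_bounding_rectangle_area(pts)


def rotate(pts, degrees):
    a = math.radians(degrees)
    return [(x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
            for x, y in pts]


MIN_RECTANGULARITY = 0.85  # hypothetical threshold, not taken from the chapter

roof = rotate([(0, 0), (10, 0), (10, 5), (0, 5)], 30)    # rotated 10 m x 5 m roof
canopy = [(math.cos(2 * math.pi * k / 32), math.sin(2 * math.pi * k / 32))
          for k in range(32)]                             # near-circular tree crown

for name, shape in (("roof", roof), ("canopy", canopy)):
    r = rectangularity(shape)
    print(name, round(r, 3), "building" if r >= MIN_RECTANGULARITY else "rejected")
```

Under these assumptions the rotated roof is kept and the near-circular canopy is rejected. A suitable threshold for real segments would have to be tuned against the data, since jagged segment boundaries lower the score of genuine buildings.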

The result of the building extraction was converted to a feature class in ArcGIS, where minimal post-processing was performed. Zooming in on the building polygons to a large scale of 1:250 reveals minor rough edges, as shown in Figure 14. These minor rough edges cannot be seen at a scale of 1:1000.

Figure 14.

Minimal rough edges of building outlines after extraction process.

These rough or jagged edges were eliminated using the Regularize Building Footprint tool with a tolerance of 0.5 m and a precision of 0.25 m; these parameters were observed to produce the best results when cleaning the edges of buildings. The result is shown in Figure 15, where a well-defined, smooth, and realistic building outline polygon is achieved.

Figure 15.

Regularize building footprint outlines.

The building polygon was overlaid on the aerial photo and the results show a well-defined and accurate building or roof boundary (Figure 16).

Figure 16.

Building footprint outlines overlaid on aerial photos.

Accurate building size and shape are important for damage assessment in flood modeling applications, as they are used to determine the impact of a flood disaster on these structures. Ladyville village has a combination of medium and small buildings; however, it is observed that the most vulnerable populations are those that live in flood-prone areas and those that live in tiny or small buildings. These small structures need to be accurately represented for proper analysis of the extent of the damage suffered. Images of tiny houses are provided in Figure 17, which illustrates the size of some of the buildings in the study area. What is not shown are tiny buildings that are poorly constructed and in a very dilapidated condition, which may sometimes house a family of four or five people.

Figure 17.

Example of tiny buildings not classified as buildings in LiDAR.

During field collection and verification, it was observed that some of these small buildings were not classified as buildings in the LiDAR data. This means that their roof outlines cannot be extracted from the LiDAR classification alone. However, with the OBIA process using a combination of aerial photos and LiDAR height information, these small structures were successfully extracted as well. The image on the left in Figure 18 shows small buildings that were not classified, with red points representing buildings. The image on the right is the result of the OBIA building extraction, which clearly shows the extracted roof shapes of these small buildings. The small buildings in the top left illustrate this.

Figure 18.

Tiny building footprints extraction using PCA and OBIA method.

This approach successfully extracts buildings from the study area, and as discussed above, minimal post-processing was required. An average of 2 hours is required to complete this process. With faster computers, this time could be significantly reduced. Most of the time is spent on image preparation, PCA, and conversion between different raster types. This approach is a significant improvement, as approximately 600 buildings can be properly represented within this period. The shape of the buildings is well preserved. Even buildings with a combined roof type of zinc and concrete were well outlined. The extraction process was completed at a high spatial resolution of 0.1 m (10 cm). The high-resolution PCA of aerial photos and LiDAR nDSM allows the buildings to maintain a smooth outline, with a completeness of 87.644%, a correctness of 93.220%, and a quality of 82.392%.


5. Discussion and conclusion

A semi-automated object-based building extraction with limited post-processing using the PCA image fusion technique is presented. The results show a very promising technique for precise and high-resolution extraction of buildings in urban areas using LiDAR-derived height information (nDSM) combined with aerial photos (RGB and NIR). These data complement each other in the extraction process. The RGB bands provide a high-resolution color image, which is very important in the segmentation process of OBIA for grouping pixels into segments; the nDSM provides height information to separate elevated structures such as buildings from other features; and the NIR band provides information to separate vegetation from other objects.

The extraction process was completed at a high spatial resolution of 0.1 m (10 cm). The high-resolution PCA of aerial photos and LiDAR nDSM allows the buildings to maintain a well-defined and smooth shape. The results of this study can be applied to various scenarios where accurate building size and shape are important, such as flood damage assessment.



Acknowledgements

I would like to acknowledge the Ministry of Works, Belize C.A., for providing the LiDAR datasets and the aerial photographs for the study area; my employer, the Ministry of Natural Resources, Belize C.A., for granting me study leave to conduct this research; and the Education Abroad Program at Vancouver Island University for choosing me as a recipient of the Queen Elizabeth II Scholarship. Lastly, I would like to express my sincere gratitude to my research advisor, Dr. Michael Govorov. Thank you for your guidance and support throughout this project.


References

  1. Du S, Zhang Y, Zou Z, Xu S, He X, Chen S. Automatic building extraction from LiDAR data fusion of point and grid-based features. ISPRS Journal of Photogrammetry and Remote Sensing. 2017;130:294-307. DOI: 10.1016/j.isprsjprs.2017.06.005
  2. Gharbia R, Azar AT, El Baz A, Hassanien AE. Image fusion techniques in remote sensing. 2014. [Online]. Available from:
  3. Zeng C. Automated building information extraction and evaluation from high-resolution remotely sensed data. Electronic Thesis and Dissertation Repository. 2014:2076. Available from:
  4. Weih RC, Riggan ND. Object-based classification vs. pixel-based classification: Comparative importance of multi-resolution imagery. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives. 2010:38
  5. Blaschke T. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing. 2010;65(1):2-16. DOI: 10.1016/j.isprsjprs.2009.06.004
  6. Awrangjeb M, Fraser CS. Automatic segmentation of raw LIDAR data for extraction of building roofs. Remote Sensing. 2014;6(5):3716-3751. DOI: 10.3390/rs6053716
  7. Ren Z, Zhou G, Cen M, Zhang T, Zhang Q. A novel method for extracting building from LIDAR data-Fc-S method. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences: ISPRS Archives. 2008;37:283-288
  8. Wang Z, Schenk T. Extracting building information from LIDAR data. International Archives of the Photogrammetry, Remote Sensing. 1998;32(3/1):279-284. Available from:
  9. Priestnall G, Jaafar J, Duncan A. Extracting urban features from LiDAR digital surface models. Computers, Environment and Urban Systems. 2000;24(2):65-78. DOI: 10.1016/S0198-9715(99)00047-2
  10. Yong LI, Huayi WU. Adaptive building edge detection by combining Lidar data and aerial images. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2007;XXXVII(B1):197-202
  11. Tiwari PS, Pande H, Nanda BN. Building footprint extraction from ikonos imagery based on multi-scale object oriented fuzzy classification for urban disaster management. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2006;34:1-7
  12. Feng T, Zhao J. Review and comparison: Building extraction methods using high-resolution images. In: 2nd International Symposium on Information Science and Engineering, ISISE 2009. 2009. pp. 419-422. DOI: 10.1109/ISISE.2009.109
  13. Attarzadeh R, Momeni M. Object-based building extraction from high resolution satellite imagery. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, ISPRS. 2012;XXXIX-B4(September):57-60. DOI: 10.5194/isprsarchives-xxxix-b4-57-2012
  14. Rottensteiner F, Summer G, Trinder J, Clode S, Kubik KKT. Evaluation of a method for fusing LIDAR data and multispectral images for building detection. In: Joint workshop of ISPRS and DAGM-CMRT05. Vol. 36. ISPRS. 2005. No. Part 3/W24, pp. 15-20
  15. Awrangjeb M, Ravanbakhsh M, Fraser CS. Automatic detection of residential buildings using LIDAR data and multispectral imagery. ISPRS Journal of Photogrammetry and Remote Sensing. 2010;65(5):457-467. DOI: 10.1016/j.isprsjprs.2010.06.001
  16. Gerke M, Xiao J. Fusion of airborne laserscanning point clouds and images for supervised and unsupervised scene classification. ISPRS Journal of Photogrammetry and Remote Sensing. 2014;87(March):78-92. DOI: 10.1016/j.isprsjprs.2013.10.011
  17. Moeller MS, Blaschke T. Urban change extraction from high resolution satellite image. ISPRS Technical Commission II Symposium, Vienna. 2006
  18. F. Analysis and S. Econometrics. Spatial autoregressive analysis; spatial econometrics
  19. Adam HE, Csaplovics E, Elhaja ME. A comparison of pixel-based and object-based approaches for land use land cover classification in semi-arid areas, Sudan. IOP Conference Series: Earth and Environmental Science. 2016;37(1):0-10. DOI: 10.1088/1755-1315/37/1/012061
  20. Syed S, Dare P, Jones S. Automatic classification of land cover features with high resolution imagery and Lidar data: An object-oriented approach. Image (Rochester, N.Y.). 2005:512-522
  21. Khatriker S, Kumar M. Building footprint extraction from high resolution satellite imagery using segmentation. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences: ISPRS Archives. 2018;42(5):123-128. DOI: 10.5194/isprs-archives-XLII-5-123-2018
  22. Blaschke T, Lang S, Hay GJ. Object-based image analysis. Object-based Image Analysis Spatial Concepts for Knowledge-Driven Remote Sensing Applications. January 2008;2008:V–VIII. DOI: 10.1007/978-3-540-77058-9
  23. Cai L, Shi W, Miao Z, Hao M. Accuracy assessment measures for object extraction from remote sensing images. Remote Sensing. 2018;10(2). DOI: 10.3390/rs10020303
  24. Rutzinger M, Rottensteiner F, Pfeifer N. A comparison of evaluation techniques for building extraction from airborne laser scanning. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2009;2(1):11-20. DOI: 10.1109/JSTARS.2009.2012488
  25. Uzar M. Automatic building extraction with multi-sensor data using rule-based classification. European Journal of Remote Sensing. 2014;47(1):1-18. DOI: 10.5721/EuJRS20144701
