Land Use/Cover Classification Techniques Using Optical Remotely Sensed Data in Landscape Planning

The observed biophysical cover of the earth's surface, termed land cover, is composed of patterns that arise from a variety of natural and human-derived processes. Land use, on the other hand, is human activity on the land, influenced by economic, cultural, political, historical, and land-tenure factors. Remotely sensed data (i.e., satellite or aerial imagery) can often be used to define land use through observations of the land cover (Brown et al., 2000; Karl & Maurer, 2010). Up-to-date land-use information is of critical importance to planners, scientists, resource managers, and decision makers.

This chapter evaluates classification methods applied to optical remote sensing data, together with ancillary data integration, to improve the accuracy of land use/cover (LUC) mapping.

LUC classification schemes
Standardization is one of the most discussed issues in LUC classification studies; scientists and map developers recognized that maps produced with a common classification scheme are more comparable and more widely usable. The first standardization efforts started in the USA. Today there are several LUC schemes in the world, varying with region and scale. This chapter discusses three widely used schemes: i) the USGS (US Geological Survey) Anderson scheme, ii) CORINE (Coordination of Information on the Environment) and iii) the EUNIS (European Nature Information System) habitat scheme.

USGS Anderson classification schemes
This classification scheme has been used in a large number of models of land physical dynamics and natural risk assessment. The USGS classification scheme is based on James Anderson's system. It includes nine main categories and four levels (Anderson et al., 1976).
Level I is suitable for 1:250,000 to 1:150,000 scale images such as MODIS and Envisat MERIS. Level II is useful for higher spatial resolution satellite sensor images with a scale of 1:80,000. Level III is suitable for 1:20,000 to 1:80,000 scale images such as Landsat 4-7. Level IV is most useful for images at scales larger than 1:20,000 (Ikonos, Kompsat, RapidEye, Formosat, GeoEye, WorldView and aerial photos). Categories are designed to be adaptable to local needs. A sample of Level I categories and forest land levels is shown in table 1.

Table 1. Sample of Level I categories and forest land levels (columns: Level I, Level II, Level III, Level IV).

CORINE classification scheme
The European Council founded the EEA (European Environment Agency) in 1990 to investigate and discuss environmental issues across Europe. LUC of Europe is one of the most

EUNIS habitat classification scheme
The EUNIS habitat classification is a common reporting language on habitat types at European level, sponsored by the EEA. It originated from a combination of several habitat classifications: marine, terrestrial and freshwater. The terrestrial and freshwater classification builds upon previous initiatives, notably the CORINE biotopes classification (Devillers & Devillers-Terschuren 1991), the Palaearctic habitats classification (Devillers & Devillers-Terschuren 1996), the EU Habitats Directive 92/43/EEC, the CORINE Land Cover nomenclature (Bossard et al. 2000), and the Nordic habitat classification (Nordic Council of Ministers 1994). The marine part of the classification was originally based on the BioMar classification (Connor et al. 1997), covering the North-East Atlantic. The EUNIS habitat classification introduced agreed criteria for the identification of each habitat unit, while providing a correspondence with these earlier classification systems.
The habitat classification forms an integral part of EUNIS, developed and managed by the European Topic Centre for Nature Protection and Biodiversity (ETC/NPB in Paris) for the EEA and the European Environmental Information Observation Network (EIONET).
The EUNIS web application (http://eunis.eea.europa.eu) (EEA 2012a) provides access to publicly available data in a consolidated database.
The information includes: i) data on species, habitats and sites compiled in the framework of NATURA2000 (EU Habitats and Birds Directives); ii) data collected from frameworks such as EIONET, data sources or material published by ETC/NPB (formerly the European Topic Centre for Nature Conservation); iii) information on species, habitats and sites taken into account in relevant international conventions or from international Red Lists; and iv) specific data collected in the framework of the EEA's reporting activities, which also constitute a core set of data to be updated periodically.
The resulting classification system is still somewhat transitional. Down to level 3 (terrestrial and freshwater) and level 4 (marine), EUNIS habitats are now based on physiognomic and physical attributes, together with some floristic criteria. There are 10 main habitat categories in this scheme. The main categories, with coastal habitats as an example, are presented in table 2. Detailed information can be found in the revised EUNIS habitat classification report of Davies et al. (2004).

Remotely sensed data sources
Data characteristics are the most important consideration when selecting appropriate data for LUC mapping. Both airborne and spaceborne data have various spatial, radiometric, spectral and temporal resolutions. A large number of studies have focused on the characteristics of remotely sensed data (Barnsley 1999, Lefsky and Cohen 2003). Additionally, scan (swath) width (the area covered by one scene), data availability (accessibility) and launch date (data archive potential) are other important factors (Table 3).

LUC mapping techniques
Suitable remotely sensed data, a classification system, an available classifier and a sufficient number of training samples are prerequisites for a successful classification. Cingolani et al. (2004) identified three major problems when medium spatial resolution data are used for vegetation classification: i) defining adequate hierarchical levels for mapping, ii) defining discrete land-cover units discernible by the selected remote-sensing data, and iii) selecting representative training sites. In general, a classification system is designed based on the user's needs, the spatial resolution of the selected remotely sensed data, compatibility with previous work, image-processing and classification algorithms, and time constraints. Such a system should be informative, exhaustive, and separable (Jensen 1996, Landgrebe 2003). In many cases, a hierarchical classification system is adopted to take different conditions into account (Lu and Weng 2007).

Image pre-processing
Image pre-processing essentially includes geometric correction or image registration, atmospheric correction and radiometric calibration. In addition, topographic correction and noise reduction may be applied if necessary. Optical images from current systems have either already been corrected geometrically (Landsat TM/ETM, MODIS) or can be corrected using freely available software or tools (e.g. BEAM for MERIS and CHRIS, and the MRT toolbox for MODIS).
Accurate geometric rectification or image registration of remotely sensed data is a prerequisite for combining data from different sources in a classification process. Many textbooks and articles have described this topic in detail (Jensen 1996, Toutin 2004). The geometric correction output should have a transformation root-mean-square error (RMSE) of less than 1.0 pixel, indicating that the image is located with sub-pixel accuracy.
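The sub-pixel RMSE criterion can be illustrated with a minimal sketch. The function name `gcp_rmse` and the control-point coordinates are hypothetical, chosen only for illustration; in practice the residuals would come from the transformation fitted in the image-processing software.

```python
import math

def gcp_rmse(predicted, reference):
    """Root-mean-square error of ground control points, in pixel units."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two equally sized, non-empty point lists")
    squared = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical control points: (column, row) after transformation vs. reference.
predicted = [(10.2, 20.1), (55.7, 80.4), (120.0, 33.6)]
reference = [(10.0, 20.0), (56.0, 80.0), (120.5, 34.0)]

rmse = gcp_rmse(predicted, reference)
accepted = rmse < 1.0  # sub-pixel criterion stated in the text
```

With these made-up points the RMSE is roughly half a pixel, so the rectification would pass the criterion.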
Atmospheric and radiometric corrections may not be necessary if a single image is used, but multitemporal or multisensor data need atmospheric and radiometric correction and calibration. A variety of methods, ranging from simple relative calibration such as dark-object subtraction to calibration approaches based on complex models (e.g. MODTRAN, 6S,

Classification techniques
There are two basic approaches to the classification process: supervised and unsupervised classification. With supervised classification, one provides a statistical description of the manner in which expected land cover classes should appear in the imagery, and then a procedure (known as a classifier) is used to evaluate the likelihood that each pixel belongs to one of these classes. With unsupervised classification, a very different approach is used. Here another type of classifier is used to uncover commonly occurring and distinctive reflectance patterns in the imagery, on the assumption that these represent major land cover classes. The analyst then determines the identity of each class by a combination of experience and ground truth (i.e., visiting the study area and observing the actual cover types) (Eastman 2003). The classification stage of LUC mapping has three essential parts: training, classifying and testing (accuracy assessment).

Classifiers
In

Nonparametric classifiers
Nonparametric classifiers require no assumption about the data. They do not employ statistical parameters to calculate class separation and are especially suitable for incorporating non-remote-sensing data into a classification procedure. Examples include artificial neural networks (ANN), decision trees (DT), support vector machines (SVM), evidential reasoning and expert systems.

Per-pixel classifiers
Traditional classifiers typically develop a signature by combining the spectra of all training-set pixels from a given feature. The resulting signature contains the contributions of all materials present in the training-set pixels, ignoring the mixed pixel problem.

Subpixel classifiers
The spectral value of each pixel is assumed to be a linear or non-linear combination of defined pure materials (or endmembers), providing proportional membership of each pixel to each endmember.
Table 4. A taxonomy of image classification methods (Lu and Weng 2007).

Model based classifiers (traditional)
Model based classifiers rely on basic statistics of the dataset such as the mean, variance and standard deviation. The most widely used in the literature are the supervised maximum likelihood (MLC), minimum distance (MD) and linear discriminant analysis (LDA) classifiers and the unsupervised k-means.
The minimum distance classifier assigns unknown image data to the class that minimizes the distance between the image data and the class in multi-feature space. The distance is defined as an index of similarity, so that the minimum distance is identical to the maximum similarity. Each pixel is assigned to the class whose signature mean is nearest to it. In figure 3, the signature mean nearest to the unclassified pixel is settlement, so the MD classifier assigns the pixel to the settlement class.

Fig. 3. MD classifier concept
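The MD concept can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation used in the chapter: the class names, the two-band signature means and the pixel values are invented, and plain Euclidean distance is assumed.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel to the class whose signature mean is nearest.

    pixels: (n_pixels, n_bands); class_means: (n_classes, n_bands).
    """
    # Euclidean distance from every pixel to every class mean.
    diff = pixels[:, None, :] - class_means[None, :, :]
    distances = np.sqrt((diff ** 2).sum(axis=2))
    return distances.argmin(axis=1)

# Hypothetical two-band signature means and unclassified pixels.
class_names = ["settlement", "water", "forest"]
class_means = np.array([[120.0, 110.0], [20.0, 10.0], [40.0, 90.0]])
pixels = np.array([[115.0, 100.0], [25.0, 15.0], [45.0, 80.0]])

labels = minimum_distance_classify(pixels, class_means)
predicted = [class_names[i] for i in labels]
```

The first pixel lies closest to the settlement mean and is therefore labelled settlement, mirroring the situation drawn in figure 3.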
The MLC procedure is based on Bayesian probability theory. Using the information from a set of training sites, MLC uses the mean and variance/covariance data of the signatures to estimate the posterior probability that a pixel belongs to each class. The MLC procedure is similar to MD with the standardized distance option. The difference is that MLC accounts for intercorrelation between bands. By incorporating information about the covariance between bands as well as their inherent variance, MLC produces what can be conceptualized as an elliptical zone of characterization of the signature. It calculates the posterior probability of belonging to each class, where the probability is highest at the mean position of the class and falls off in an elliptical pattern away from the mean.
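A minimal sketch of this idea follows, assuming each class is modelled as a multivariate Gaussian and, unless stated otherwise, that all classes have equal prior probability. The function names and the synthetic training pixels are hypothetical; the constant term shared by all classes is omitted because it does not affect the decision.

```python
import numpy as np

def train_signatures(samples_by_class):
    """Mean vector and covariance matrix of the training pixels of each class."""
    return [(s.mean(axis=0), np.cov(s, rowvar=False)) for s in samples_by_class]

def mlc_classify(pixels, signatures, priors=None):
    """Label each pixel with the class of highest Gaussian log-posterior."""
    n_classes = len(signatures)
    if priors is None:
        priors = np.full(n_classes, 1.0 / n_classes)  # assumed equal priors
    scores = np.empty((pixels.shape[0], n_classes))
    for k, (mean, cov) in enumerate(signatures):
        diff = pixels - mean
        inv_cov = np.linalg.inv(cov)
        # Squared Mahalanobis distance: the elliptical fall-off around the mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        _, log_det = np.linalg.slogdet(cov)
        scores[:, k] = np.log(priors[k]) - 0.5 * log_det - 0.5 * maha
    return scores.argmax(axis=1)

rng = np.random.default_rng(0)
# Synthetic two-band training pixels for two hypothetical classes.
class_a = rng.multivariate_normal([50, 60], [[20, 15], [15, 20]], size=200)
class_b = rng.multivariate_normal([90, 40], [[10, -4], [-4, 10]], size=200)
signatures = train_signatures([class_a, class_b])

labels = mlc_classify(np.array([[52.0, 61.0], [88.0, 42.0]]), signatures)
```

Because the covariance matrix enters through the Mahalanobis term, two pixels at the same Euclidean distance from a class mean can receive different scores, which is the difference from the MD classifier noted in the text.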
The LDA classifier conducts linear discriminant analysis of the training site data to form a set of linear functions that express the degree of support for each class. The assigned class for each pixel is then the class that receives the highest support after evaluation of all functions. These functions have a form similar to that of a multivariate linear regression equation, where the independent variables are the image bands and the dependent variable is the measure of support. The equations are calculated such that they maximize the variance between classes and minimize the variance within classes, so that class separation becomes easier.
The k-means unsupervised technique partitions n-dimensional imagery into K exclusive clusters. The method begins by initializing k centroids (means), then assigns each pixel to the cluster whose centroid is nearest, updates the cluster centroids, and repeats the process until the k centroids are fixed. This is a heuristic, greedy algorithm for minimizing the SSE (sum of squared errors); hence, it may not converge to a global optimum. Since its performance strongly depends on the initial estimation of the partition, a relatively large number of clusters is generally recommended to acquire as complete an initial pattern of centroids as possible (Richards & Jia, 1999).
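The loop described above can be sketched as follows. This is a bare-bones illustration under simple assumptions (random initial centroids drawn from the pixels, Euclidean distance, synthetic two-band data), not the clustering routine of any particular software package.

```python
import numpy as np

def kmeans(pixels, k, max_iter=100, seed=0):
    """Plain k-means: assign, update, repeat until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct pixels chosen at random.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        distances = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    sse = ((pixels - centroids[labels]) ** 2).sum()
    return labels, centroids, sse

# Two synthetic, well-separated spectral clusters in a two-band feature space.
rng = np.random.default_rng(1)
blob_a = rng.normal([20, 20], 2.0, size=(50, 2))
blob_b = rng.normal([80, 80], 2.0, size=(50, 2))
pixels = np.vstack([blob_a, blob_b])

labels, centroids, sse = kmeans(pixels, k=2)
```

Running the function with several seeds and keeping the result with the lowest SSE is one common way to reduce the dependence on the initial partition mentioned in the text.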
All of the model based classifiers were compared with each other using the same training data set to ensure comparability. A Landsat TM image recorded in August 2010 over the Eastern Mediterranean coastal zone of Turkey was used. The main LUC classes were coniferous tree, deciduous tree, permanent farmlands, temporary irrigated farmlands, temporary non-irrigated farmlands, bulrush, grassland, bareground, water bodies, settlement and sand dunes (figure 4).
The ANN is one of several artificial intelligence techniques that have been used for automated image classification as an alternative to conventional statistical approaches. Introductions to the use of ANNs in remote sensing are provided by Kohonen (1988), Bishop (1995) and Atkinson and Tatnall (1997). The network architecture of an ANN resembles a small part of the neural network (NN) of the human brain. Essentially, an NN has three parts: input, hidden and output nodes (figure 5). In LUC mapping with optical images, the input nodes are the image bands (e.g. six nodes for Landsat TM, excluding the thermal band). The hidden node count depends on the user or on previous experience. There are two ways to find the optimal hidden node count: (i) the user may check the literature dealing with the same or a similar study area, or (ii) the user may try several counts and compare the accuracy of each application. According to the literature, if an NN uses one hidden layer, the hidden node count is generally two or three times the number of input nodes (Berberoglu et al. 2009). The number of output nodes equals the number of classes, and each output node produces a class probability.

Fig. 5. NN architecture
The learning rate determines the portion of the calculated weight change that will be used for weight adjustment. This acts like a low-pass filter, allowing the network to ignore small features in the error surface. Its value ranges between 0 and 1. The smaller the learning rate, the smaller the changes in the weights of the network at each cycle. The optimum value of the learning rate depends on the characteristics of the error surface. Lower learning rates require more cycles than larger ones.
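The effect of the learning rate on the number of cycles can be illustrated on a toy problem. The sketch below assumes a one-weight network with the quadratic error surface E(w) = 0.5 w², which is far simpler than a real ANN; the function name and tolerance are hypothetical. The optional `momentum` argument corresponds to the learning momentum term and is set to zero by default.

```python
def cycles_to_converge(learning_rate, momentum=0.0, start=1.0,
                       tol=1e-3, max_cycles=10_000):
    """Count weight-update cycles on the toy error surface E(w) = 0.5 * w**2."""
    weight, previous_change = start, 0.0
    for cycle in range(1, max_cycles + 1):
        gradient = weight  # dE/dw for E(w) = 0.5 * w**2
        # Only a portion (learning_rate) of the gradient is applied; the
        # momentum term carries over part of the previous weight change.
        change = -learning_rate * gradient + momentum * previous_change
        weight += change
        previous_change = change
        if abs(weight) < tol:
            return cycle
    return max_cycles

slow = cycles_to_converge(learning_rate=0.1)
fast = cycles_to_converge(learning_rate=0.5)
```

On this surface the smaller learning rate needs several times more cycles than the larger one to reach the same tolerance, in line with the statement above.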
Learning momentum is added to the learning rate to incorporate the previous changes in weight with the current direction of movement in the weight space. It is an additional correction to the learning rate to adjust the weights and ranges between 0.1 and 0.9.
DT is a non-parametric image classification technique. A decision tree is composed of the root (starting point), active nodes or internodes (rule nodes) and leaves (classes). The root is the starting point of the tree, active nodes create leaves, and the leaves are groups of pixels that either belong to the same class or are assigned to a particular class (figure 6). Each leaf is defined by a splitting rule. The splitting procedure is repeated recursively for each subset until no more splitting is possible (Ghose et al 2010).
In this chapter, the gain ratio, entropy and gini splitting algorithms were compared to find the most accurate one; entropy was almost 3% more accurate than the gain ratio, and gini gave the poorest performance for the study area. Stopping criteria and active nodes were determined according to the following rule: if a subset is pure, create a leaf and assign it to the class of interest; if a subset contains more than one class, create an active node by applying the splitting algorithm, and continue this process until the class leaves become purer.
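The three splitting measures can be written down compactly. The sketch below uses their standard textbook definitions (entropy-based information gain, gain ratio as information gain divided by split information, and reduction in Gini impurity) for a binary split; the band values, threshold and class labels are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gini(labels):
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def split_scores(labels, mask):
    """Information gain, gain ratio and Gini reduction for a binary split."""
    left = [l for l, m in zip(labels, mask) if m]
    right = [l for l, m in zip(labels, mask) if not m]
    w_left, w_right = len(left) / len(labels), len(right) / len(labels)
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = entropy(mask)  # entropy of the partition sizes
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    gini_gain = gini(labels) - (w_left * gini(left) + w_right * gini(right))
    return info_gain, gain_ratio, gini_gain

# Hypothetical training pixels: a near-infrared value and a class label each.
labels = ["forest"] * 4 + ["water"] * 4
nir = [0.40, 0.45, 0.50, 0.42, 0.05, 0.08, 0.03, 0.06]
mask = [value > 0.2 for value in nir]  # candidate splitting rule

info_gain, gain_ratio, gini_gain = split_scores(labels, mask)
```

Here the rule separates the two classes completely, so both resulting subsets are pure and would become leaves under the stopping rule described above.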
The SVM represents a group of theoretically superior machine learning algorithms. SVM employs optimization algorithms to locate the optimal boundaries between classes. Statistically, the optimal boundaries should generalize to unseen samples with the least error among all possible boundaries separating the classes, thereby minimizing the confusion between classes. In practice, the SVM has been applied to optical character recognition, handwritten digit recognition and text categorization (Vapnik 1995, Joachims 1998). SVM uses the pairwise classification strategy for multiclass classification. SVM can be used in linear or non-linear form by applying different kernel functions. In this chapter only the non-linear sigmoidal kernel was used, because model based classifiers already work well if the data histogram is linear. All data based models were run non-linearly, and the sigmoidal kernel takes less time than other non-linear kernels. Other kernel functions such as the radial basis function, linear function or polynomial function may also be applied.
The accuracy of the SVM classifier may change even within a single kernel. For example, with the polynomial kernel function, the accuracy of SVM varies according to the polynomial order applied (Huang et al. 2002).
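The kernel functions named above, and the number of binary problems created by the pairwise strategy, can be sketched as follows. The formulas are the commonly used textbook forms; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions for illustration, not the settings used in this chapter.

```python
import math
from itertools import combinations

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

def pairwise_problems(class_names):
    """Binary problems solved by the pairwise (one-against-one) strategy."""
    return list(combinations(class_names, 2))

x, y = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(x, y),
    "polynomial (order 2)": polynomial_kernel(x, y, degree=2),
    "rbf": rbf_kernel(x, y, gamma=0.1),
    "sigmoid": sigmoid_kernel(x, y, gamma=0.01),
}
pairs = pairwise_problems(["forest", "grassland", "water", "settlement"])
```

Changing `degree` in the polynomial kernel changes the shape of the decision boundary, which is why accuracy can vary with the polynomial order even though the kernel type stays the same.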
All data dependent classifiers which were introduced in this chapter were evaluated in the Eastern Mediterranean environment (figure 7).

Accuracy assessments
A classification accuracy assessment generally includes three basic components: sampling design, response design, and estimation and analysis procedures (Stehman and Czaplewski 1998). Selection of a suitable sampling strategy is a critical step (Congalton 1991). The major components of a sampling strategy include the sampling unit (pixels or polygons), sampling design, and sample size (Muller et al. 1998). Possible sampling designs include random, stratified random, systematic, double, and cluster sampling. A detailed description of sampling techniques can be found in previous literature such as Stehman and Czaplewski (1998) and Congalton and Green (1999).
The error matrix approach is the one most widely used in accuracy assessment (Foody 2002). To generate an error matrix properly, one must consider the following factors: (1) reference data collection, (2) classification scheme, (3) sampling scheme, (4) spatial autocorrelation, and (5) sample size and sample unit (Congalton and Plourde 2002). After generation of an error matrix, other important accuracy assessment elements, such as overall accuracy, user accuracy, producer accuracy (table 6), and the kappa coefficient, can be derived. Kappa is the difference between the observed accuracy and the chance agreement, divided by one minus that chance agreement (Lillesand and Kiefer 1994).
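The measures derived from an error matrix can be computed with a short sketch. The matrix below is invented for illustration, and the layout (rows as classified classes, columns as reference classes) is an assumption; swapping the layout swaps user and producer accuracy.

```python
def accuracy_report(matrix):
    """matrix[i][j]: samples classified as class i (row) with reference class j (column)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]

    overall = sum(diagonal) / total
    user = [diagonal[i] / row_totals[i] for i in range(n)]       # commission side
    producer = [diagonal[j] / col_totals[j] for j in range(n)]   # omission side
    # Chance agreement expected from the row and column totals alone.
    chance = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - chance) / (1.0 - chance)
    return overall, user, producer, kappa

# Hypothetical three-class error matrix with 150 test samples.
error_matrix = [
    [45, 4, 1],
    [6, 38, 6],
    [2, 3, 45],
]
overall, user, producer, kappa = accuracy_report(error_matrix)
```

For this made-up matrix the overall accuracy is about 0.85 while kappa is 0.78, showing how kappa discounts the agreement expected by chance.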

SVM performed more reasonably than the other data dependent classifiers with the weak training dataset. However, the highest accuracy was obtained with the DT classifier using the strong dataset.
SVM classified forestlands, grassland and permanent farmlands more accurately than the other classifiers. There was no significant difference among classifiers in built-up areas. The most accurate sand dunes, bulrush and irrigated farmland classes resulted from the DT classifier. DT, LDA and SVM performed reasonably well with both weak and strong training data sets (figure 8).
In general, data dependent classifiers performed well with the weak training dataset. SVM was especially successful in separating vegetated areas. If a more detailed classification scheme (e.g. forest tree species) is required with a weak training dataset, SVM might be the first option in terms of classification accuracy. On the other hand, SVM is time-consuming on a standard PC or laptop.
Three accuracy measures are shown in table 8; the major question, however, is which one should be used. A large number of studies have used the kappa coefficient as an ideal approach for LUC classification.
A number of criteria were selected for the comparison of model based and data dependent classifiers: (a) overall accuracy, (b) classification speed, (c) input parameter handling, (d) difficulty of application, and (e) accuracy with different training sizes and accuracy difference between classes, or classification stability (table 9).

Soft (fuzzy) classifiers
Defining "what is in a pixel?" numerically is very important for understanding the earth's surface in remote sensing science. Increased spatial information may be valuable in a variety of situations. The range of satellite spectrometers (e.g. MODIS, MERIS) provides detailed attribute information at relatively coarse spatial resolutions (e.g. 250 m, 500 m, 1 km) (Aplin and Atkinson 2001).
Traditional hard per-pixel classification of remotely sensed images is limited by mixed pixels (Cracknell 1998). Soft classification overcomes this limitation by predicting the proportional membership of each pixel to each class. Mapping is generally achieved through the application of a conventional statistical classification, which allocates each image pixel to a single land cover class. Such approaches are inappropriate for mixed pixels, which contain two or more land cover classes, and a fuzzy classification approach is required (Foody 1996).
Table 9. (K) kappa, (P) producer, and (U) user accuracies of each LUC using hard classifiers in stud
Fuzzy logic models constitute the modeling tools of soft computing. Fuzzy logic is a tool for embedding structured human knowledge into workable algorithms. There are two main types of sets: 'crisp (or classic) sets' and 'fuzzy sets'. For example, a crisp set $S$ can be defined by a membership function:
$$\mu_s(X) = \begin{cases} 1, & X \in S \\ 0, & X \notin S \end{cases} \qquad (1)$$
In crisp sets a function of this type is also called the characteristic function. Fuzzy sets can be used to produce rational and sensible clustering. For fuzzy sets there exists a degree of membership $\mu_s(X)$ that is mapped on [0, 1]. In the case of a LUC map, every area simultaneously belongs to the LUC clusters of interest with different degrees of membership (Kandel, 1992).
There are several soft classification techniques, and they vary according to the training and testing datasets and the scale of the study. In this context, linear mixture modeling (LMM), regression tree (RT), multiple linear regression (MLR) and artificial neural network (ANN) soft classification techniques were evaluated in an Eastern Mediterranean area called the Upper Seyhan Plane (USP). Berberoglu et al. (2009) focused on these four soft classification techniques to map percentage tree cover using an ENVISAT MERIS (full spatial resolution 300 m) dataset and vegetation metrics. These metrics and more information about ancillary data integration are discussed in section 5.
Accuracy assessment of a LUC or fuzzy map requires high resolution ground truth data. Crisp data are adequate for hard classifications; however, assessment of a soft classification needs fuzzy ground truth, such as the actual forest cover quantified at the scale of the study. High spatial resolution Ikonos (4 m) satellite images of three selected plots were used to derive training and testing ground data. The Ikonos images were classified into forest and non-forest classes, and the results were rescaled to the MERIS spatial resolution. 80% of this tree cover dataset was used as training data and 20% was set aside for accuracy assessment. Linear (LMM and MLR) and non-linear (ANN and RT) techniques were compared.
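The rescaling and splitting steps can be sketched as below. This is a simplified illustration, assuming the fine and coarse grids are perfectly aligned so that one 300 m cell covers exactly 75 × 75 cells of 4 m; the function names, the synthetic mask and the random split are hypothetical and do not reproduce the processing chain of the study.

```python
import numpy as np

def percent_cover(mask, factor):
    """Aggregate a binary forest mask to a coarser grid as percentage cover."""
    rows, cols = mask.shape
    rows -= rows % factor  # drop incomplete blocks at the edges
    cols -= cols % factor
    blocks = mask[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return 100.0 * blocks.mean(axis=(1, 3))

def split_train_test(n_samples, train_fraction=0.8, seed=0):
    """Random split of sample indices into training and testing subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    cut = int(round(train_fraction * n_samples))
    return order[:cut], order[cut:]

FACTOR = 75  # assumed ratio: 300 m coarse pixel / 4 m fine pixel

# Synthetic forest (1) / non-forest (0) mask covering 2 x 2 coarse pixels.
mask = np.zeros((150, 150), dtype=np.uint8)
mask[:75, :75] = 1   # fully forested coarse cell
mask[75:, :30] = 1   # partly forested coarse cell

cover = percent_cover(mask, FACTOR)
train_idx, test_idx = split_train_test(cover.size)
```

Each coarse cell now holds a continuous tree cover percentage, which is the kind of fuzzy ground truth a soft classifier can be trained and tested against.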
LMM is one of the most widely used fuzzy techniques in the literature (Berberoglu & Satir 2008). It is based on the assumption that class mixing is linear and therefore adopts a least squares procedure to estimate the class proportions within each pixel. The idea is that a continuous scene can be modeled as the sum of the radiometric interactions between individual cover types weighted by their relative proportions (Graetz 1990). The form of the mixture model is:
$$V_i = \sum_{j=1}^{n} f_j r_{ij} + e_i \qquad (2)$$
where $V_i$ is the value of a pixel in input $i$, $f_j$ is the fractional abundance of endmember $j$, $r_{ij}$ is the value of endmember $j$ in input $i$, $e_i$ is the residual error associated with input $i$ and $n$ is the number of endmembers. Equation (2) is constrained by the assumption that the sum of the components in each grid cell should equal 1.0, as defined by equation (3):
$$\sum_{j=1}^{n} f_j = 1.0 \qquad (3)$$
LMM needs pure pixels for each class to define the endmembers. Class membership functions are obtained based on endmember spectral characteristics (figure 9).
MLR relates the target variable $y$ to the input variables through a linear equation of the general form
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$$
where $b_0$ is the constant and $b_1$ is the coefficient of the first variable $x_1$ (waveband). An advantage of linear regression is that it is easy to implement. MLR models are computationally efficient and can also provide confidence intervals for the estimated coefficients and the predicted data. Some of the variables were eliminated using stepwise regression models.
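The linear mixture model and its sum-to-one constraint can be solved with ordinary least squares. The sketch below is one simple way to do this, assuming the constraint is imposed as an extra, heavily weighted equation; the endmember values are invented, and no non-negativity constraint is applied, so noisy data could yield slightly negative fractions.

```python
import numpy as np

def unmix(pixel, endmembers, weight=1000.0):
    """Least-squares fractions with a heavily weighted sum-to-one constraint.

    pixel: (n_inputs,); endmembers: (n_inputs, n_endmembers), column j holds r_ij.
    """
    n_endmembers = endmembers.shape[1]
    # Append the constraint sum(f) = 1 as an extra, strongly weighted equation.
    design = np.vstack([endmembers, weight * np.ones((1, n_endmembers))])
    target = np.append(pixel, weight)
    fractions, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = pixel - endmembers @ fractions  # e_i of the mixture model
    return fractions, residual

# Hypothetical endmember values (columns: forest, soil, water) in four inputs.
endmembers = np.array([
    [0.04, 0.20, 0.06],
    [0.05, 0.25, 0.04],
    [0.45, 0.30, 0.02],
    [0.20, 0.35, 0.01],
])
true_fractions = np.array([0.6, 0.3, 0.1])
pixel = endmembers @ true_fractions  # noise-free mixed pixel

fractions, residual = unmix(pixel, endmembers)
```

Because the synthetic pixel is an exact linear mixture, the estimated fractions match the ones used to build it; with real imagery the residual term absorbs noise and departures from linear mixing.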
The RT method has in recent years become a common alternative to conventional soft classification approaches, particularly with MODIS data (Hansen et al. 2005). The basic concept of a decision tree is to split a complex decision into several simpler decisions that can lead to a solution that is easier to interpret. When the target variable is discrete (e.g. the class attribute in a land cover classification), the procedure is known as decision tree classification. By contrast, when the target variable is continuous, it is known as decision tree regression. In an RT, the target variable is a continuous numeric field such as percentage tree cover. Splitting algorithms were introduced in the data dependent classifiers section. There, the splitting rules contained only crisp equations; in an RT, each rule additionally contains a regression equation. In this study the following RT rules were applied to derive tree cover percentage (table 10).
Table 10. Regression tree rules for tree cover percentage from MERIS data.
The correlation coefficients between the testing dataset and the results from LMM, MLR, ANN and RT were 0.68, 0.69, 0.68 and 0.71 respectively. The most accurate result was obtained using the RT technique.
These techniques can be used to map not only two classes but also a larger number of LUC classes. In this context, LMM and ANN fuzzy classification techniques were compared by Şatır (2006) in almost the same area as the RT classification (figure 10). Only forested areas were selected in Şatır's study. Training and testing datasets were derived from Landsat TM/ETM for each LUC. LMM and ANN fuzzy classifications using medium spatial resolution data (300 m) gave reasonable classification outcomes if the training data set was large enough. In general, both fuzzy classifications were more accurate than the hard classification results (table 11).
Fuzzy classifications are ideal for LUC mapping using coarse or medium spatial resolution data. However, fuzzy classification is not necessary in LUC mapping using very high spatial resolution data (e.g. 0.5 m or 1 m). In high spatial resolution data, groups of pixels show similar spectral characteristics. Object based classification techniques are suggested in this case.

Object based classification
Many complex land covers exhibit similar spectral characteristics, making separation in feature space by simple per-pixel classifiers difficult and leading to inaccurate classification. Object-based classification is therefore a potential solution for the classification of such regions. The specific benefits are an increase in accuracy, a decrease in classification time and the elimination of within-field spectral mixing (Berberoglu et al., 2000). The object-based classification approach involved the integration of vector data and raster images within a geographical information system (GIS) and enabled the knowledge-free extraction of image object primitives at different spatial resolutions, the so-called multiresolution segmentation. The segmentation operated as a heuristic optimization procedure which minimized the average heterogeneity of image objects at a given spatial resolution for the whole scene (Bian et al. 1992). The objective was to construct a hierarchical net of image objects, in which fine objects were sub-objects of coarser structures. Due to the hierarchical structure, the image data were simultaneously represented at different spatial resolutions. The defined local object-oriented context information was then used together with other (spectral, form, texture) features of the image objects for classification. At the next stage, supervised per-field classification was performed using the nearest neighbor algorithm, utilizing field boundary data generated by the segmentation procedure. Objects are segmented in the image, and all objects together form an object layer. Two or more object layers are called an object hierarchy (figure 12).

Ancillary data integration
Remotely sensed data alone may not be enough to map all LUC classes accurately. Ancillary data provide additional information on physical land dynamics, vegetation, climate, social geography and surface variability in LUC classification. When a suitable ancillary dataset is used, classification accuracy improves. In this chapter, only elevation (physical), texture (surface variability) and vegetation data (vegetation indices) are discussed, for the USP using DT and RT classifiers.

Physical data integration
Land physical dynamics such as elevation are vital physical inputs to LUC mapping. Digital elevation models (DEM) can be derived from stereo image pairs (e.g. ASTER) or radar (e.g. SRTM). In particular, vegetation formation and species vary according to elevation, aspect and climate. Using these ancillary data may improve the accuracy of LUC maps (Coops et al. 2006, Şatır 2006). It is also possible to integrate soil characteristics into LUC mapping, because vegetation distribution and plant species are strongly dependent on soil depth, texture and moisture.
A DEM was integrated into the DT and MLC classifications of the Eastern Mediterranean area discussed in section 4. The overall accuracy of the DT classification increased by approximately 4%, and bulrush, sand dunes and forestlands in particular were classified more accurately. If topography varies within a study area, integrating the DEM may improve LUC mapping accuracy. However, the overall accuracy of the MLC classification was the same with and without DEM information. Most of the ancillary data increased accuracy only with non-parametric techniques, because parametric techniques like MLC use a statistical equation to calculate the distance of each LUC signature mean to the unknown pixel, whereas DT creates rules based on the training data ranges, including elevation and spectral wavebands.

Surface texture data
Some ancillary variables, such as surface texture and vegetation metrics, can be produced from the image wavebands themselves. Surface textures are widely used in LUC mapping. Many texture measures have been developed (Haralick et al. 1973, Kashyap et al. 1982, He and Wang 1990, Unser 1995, Emerson et al. 1999) and have been used for image classification (Franklin and Peddle 1989, Narasimha Rao et al. 2002, Berberoglu et al. 2000). Franklin and Peddle (1990) found that textures based on a grey-level co-occurrence matrix (GLCM), combined with the spectral features of a SPOT HRV image, improved the overall classification accuracy. Gong et al. (1992) compared GLCM, simple statistical transformations (SST), and texture spectrum (TS) approaches with SPOT HRV data, and found that some textures derived from GLCM and SST improved urban classification accuracy. Shaban and Dikshit (2001) investigated GLCM, grey-level difference histogram (GLDH), and sum and difference histogram (SADH) textures from SPOT spectral data in an Indian urban environment, and found that a combination of texture and spectral features improved the classification accuracy. Compared with results based solely on spectral features, accuracy increased by about 9% to 17% with the addition of one or two texture measures. Furthermore, contrast, entropy, variance, and inverse difference moment provided higher accuracy, and the most appropriate window sizes were 7×7 and 9×9.
Multiscale texture measures should be incorporated with the original spectral wavebands to improve classification accuracy (Shaban and Dikshit 2001, Podest and Saatchi 2002, Butusov 2003). Recently, geostatistic-based texture measures were found to provide better classification accuracy than GLCM-based textures (Berberoglu et al. 2000). For a specific study, it is often difficult to identify a suitable texture because texture varies with the characteristics of the landscape under investigation and the image data used. Identification of suitable textures involves determination of the texture measure, the image band, the size of the moving window, and other parameters (Chen et al. 2004). The difficulty of identifying suitable textures and the computational cost of calculating them limit the extensive use of textures in image classification, especially over large areas (Lu and Weng 2007).
To test the effect of texture data on classification accuracy, five GLCM measures were derived: variance, contrast, dissimilarity, homogeneity and entropy. These measures were combined with Landsat spectral wavebands for the Eastern Mediterranean region. Overall accuracy was unchanged; however, the accuracies of the settlement and agricultural land classes increased by 4-5%, while the accuracy of bareground and sand dunes decreased with the DT classifier.
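As an illustration of how the five GLCM measures named above can be computed, the following is a minimal sketch in plain Python. It is not the implementation used in the chapter: the function names (`glcm`, `texture_measures`), the single horizontal pixel offset, the symmetric normalisation and the 4-level sample window are all assumptions made for the example, and homogeneity is taken here as the inverse difference moment.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_measures(p):
    """Texture statistics from a normalised GLCM p (probabilities sum to 1)."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    mean = sum(i * v for i, _, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "variance": sum((i - mean) ** 2 * v for i, _, v in cells),
    }

# Hypothetical 4x4 window quantised to 4 grey levels (illustrative values only).
window = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(window, levels=4)
measures = texture_measures(p)
```

In practice the window would be moved across the image (for example 7×7 or 9×9 pixels) and each measure written to a new band that is stacked with the spectral wavebands before classification.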

Vegetation indices
Vegetation metrics are another source of ancillary data for more accurate LUC mapping. Which metrics can be derived depends on the spectral resolution of the optical image, and some indices are designed for specific sensors. For example, Envisat MERIS data has its own chlorophyll index, the MERIS terrestrial chlorophyll index (MTCI). Additionally, vegetation metrics such as the fraction of photosynthetically active radiation (fPAR), leaf area index (LAI) and the fraction of green vegetation covering a unit area of horizontal soil (fCover) can be obtained from MERIS data using specific equations. Berberoglu et al. (2009) used these vegetation metrics to improve RT soft classification accuracy using MERIS data. When only MERIS wavebands were used to determine the tree cover percentage, the correlation coefficient obtained was 0.58; vegetation metrics together with the MERIS wavebands raised it to 0.67.
Hard classifiers performed inaccurately with coarse spatial resolution images (e.g. MODIS, MERIS, NOAA, SPOTveg) because of the mixed pixel problem. Fuzzy classifiers reduced this problem and provided better accuracy than hard classification. Hard pixel based mapping techniques were successful with medium spatial resolution data (e.g. Landsat TM/ETM, Aster and Alos AVNIR) at regional and local scales; however, for specific purposes such as detailed crop pattern mapping or urban pattern mapping, an object based classification approach is recommended for more reliable LUC mapping. Object based classification is appropriate when using very high spatial resolution data (e.g. Rapid eye, Ikonos, aerial photos, Geoeye). In the segmentation stage of object based classification, pixels are merged to create each segment or object according to spectral, structural and textural similarities. This method tolerates pixel misclassification where there is pixel noise in an area (Figure 15).
In this chapter, ancillary data integration was also discussed using several data sets from satellite remote sensing sensors. Three types of ancillary data were integrated into the DT hard classifier. Of these, the DEM produced the largest improvement in overall classification accuracy. Surface texture and vegetation indices improved the accuracy of specific land cover types. When all data were used together, overall classification accuracy was reduced; adding more ancillary data does not necessarily enhance classification accuracy. The success of ancillary data varies with the classification target, the study area characteristics and the remotely sensed data.

... important role in per-field classification, integrating raster and vector data in a classification. The vector data are often used to subdivide an image into parcels, and classification is based on the parcels, avoiding the spectral variation inherent in the same class.
Hard classification: ... decision about the land cover class that each pixel is allocated to a single class. The area estimation by hard classification may produce large errors, especially from coarse spatial resolution data due to the mixed pixel problem. Examples: MLC, MD, ANN, DT, SVM.
Soft (fuzzy) classification: Providing for each pixel a measure of the degree of similarity for every class. Soft classification provides more information and potentially a more accurate result, especially for coarse spatial resolution data classification.
... is used in image classification. A 'noisy' classification result is often produced due to the high variation in the spatial distribution of the same class.
... information is used in classification. Parametric or non-parametric classifiers are used to generate initial classification images and then contextual classifiers are implemented in the classified images. Examples: ECHO, combination of parametric or non-parametric and contextual algorithms.

Fig. 4. Model based LUC classification results using strong training dataset and unsupervised K-means classification result in sample study area (in yellow).

4.2.1.2 Data dependent (machine learning classifiers)
Data dependent classifiers are based on non-parametric rules. In particular, machine learning classifiers use different approaches according to classifier type. In this chapter, widely used non-parametric classifiers, namely ANN, DT and SVM, were assessed.
... (Rumelhart et al. 1986) is the most commonly encountered ANN model in remote sensing (because of its generalization capability). The accuracy of an ANN is affected primarily by five variables: (1) the size of the training set, (2) the network architecture, (3) the learning rate, (4) the learning momentum, and (5) the number of training cycles. The size of the training set is the most important factor in all LUC classifications: with a sufficient number of training pixels, a LUC map is more accurate than one produced from fewer training pixels.

Fig. 6. Decision tree architecture
A Decision Tree is built from a training set, which consists of objects, each of which is completely described by a set of attributes and a class label. Attributes are a collection of properties containing all the information about one object. Unlike the class, each attribute may have either ordered values (integer or real) or unordered values (Boolean) (Ghose et al. 2010). Most DT algorithms use the recursive-partitioning algorithm, whose input requires a set of training examples, a splitting rule, and a stopping rule. Splitting rules determine how the tree is partitioned. Entropy, gini, twoing and gain ratio are the most used splitting rules in the literature (Quinlan 1993, Zambon et al. 2006, Ghose et al. 2010). The stopping rule determines whether the training samples can be split further. If a split is still possible, the samples in the training set are divided into subsets by performing a set of statistical tests
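The recursive-partitioning idea described above (training examples, a splitting rule and a stopping rule) can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of any particular DT package: the function names are hypothetical, only binary threshold splits on ordered attributes are considered, the splitting rule is the decrease in gini (or entropy) impurity, and the training pixels (NIR reflectance, elevation in metres) are invented values.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(samples, labels, impurity=gini):
    """Splitting rule: the (attribute, threshold) with the largest impurity decrease."""
    n = len(labels)
    parent = impurity(labels)
    best = (None, None, 0.0)
    for a in range(len(samples[0])):
        values = sorted(set(s[a] for s in samples))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold halfway between neighbours
            left = [y for s, y in zip(samples, labels) if s[a] <= t]
            right = [y for s, y in zip(samples, labels) if s[a] > t]
            child = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            if parent - child > best[2]:
                best = (a, t, parent - child)
    return best

def build_tree(samples, labels, impurity=gini, min_samples=2):
    """Recursive partitioning; returns a nested dict, or a class label for a leaf."""
    a, t = None, None
    # Stopping rule: pure node, too few samples, or no split reduces impurity.
    if len(set(labels)) > 1 and len(labels) >= min_samples:
        a, t, _ = best_split(samples, labels, impurity)
    if a is None:
        return Counter(labels).most_common(1)[0][0]
    left = [(s, y) for s, y in zip(samples, labels) if s[a] <= t]
    right = [(s, y) for s, y in zip(samples, labels) if s[a] > t]
    return {
        "attribute": a,
        "threshold": t,
        "left": build_tree([s for s, _ in left], [y for _, y in left], impurity, min_samples),
        "right": build_tree([s for s, _ in right], [y for _, y in right], impurity, min_samples),
    }

def predict(tree, sample):
    while isinstance(tree, dict):
        branch = "left" if sample[tree["attribute"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical training pixels: (NIR reflectance, elevation in metres).
samples = [(0.05, 2), (0.06, 3), (0.40, 5), (0.42, 600), (0.46, 4), (0.47, 650)]
labels = ["water", "water", "cropland", "forest", "cropland", "forest"]
tree = build_tree(samples, labels)
```

In this invented data set cropland and forest overlap spectrally, so the tree can only separate them with a rule on elevation, which mirrors the way a DEM can help a DT classifier.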

Fig. 9. Methodology for application of LMM
MLR refers to relating a response variable Y to a set of predictors x_i in the form (e.g. Chatterjee and Price, 1991):
$$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p \qquad (4)$$
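To make the form of equation (4) concrete, the coefficients b0 ... bp can be estimated by ordinary least squares. The sketch below solves the normal equations with Gauss-Jordan elimination in plain Python; the function name `fit_mlr` and the sample data are assumptions for illustration only, and the source does not state which estimation procedure was used.

```python
def fit_mlr(X, y):
    """Least-squares estimate of [b0, b1, ..., bp] for Y = b0 + b1*x1 + ... + bp*xp."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented k x (k+1) matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

# Hypothetical data generated exactly from Y = 5 + 2*x1 - 3*x2.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [5, 7, 2, 4, 6]
coefficients = fit_mlr(X, y)
```

In a LUC setting the predictors would typically be waveband values or ancillary variables and the response a continuous quantity such as cover percentage; that mapping is an assumption of this example rather than something stated in the text.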

Fig. 10. Study area boundary for LMM and ANN fuzzy classifiers comparison.

Fig. 12. Image-object hierarchy
Basically, there are three steps in object based classification: segmentation, classification and per-field integration. An image is divided into segments depending on pixel spectral similarities, the structure of the image and surface texture characteristics. This process depends on variables such as the scaling factor, smoothness vs. compactness and shape factors (figure 13).

Fig. 13. (a) Non-segmented image, (b) segmented image using scale factor 50, (c) segmented image using scale factor 10.
Each segment contains a group of pixels, and the scaling factor defines the minimum count of pixels with similar spectral characteristics in a segment. Compactness and smoothness are important for creating pixel groups. The shape factor deals with the boundary of a segment. The scale factor varies with the study scale, and the ideal scale can be found by trying different scale factors. When a sensitive LUC analysis is necessary, the compactness factor should be high and smoothness should be low (e.g. vegetation classification in CORINE level 3 and more). The shape factor is very important if the shape of the LUC objects is a dominant characteristic (e.g. agricultural lands, roads and buildings).

Fig. 15. LDA pixel based and object based classification results of LSP
The pixel based LDA classifier failed for onion, sour orange, settlement, bulrush and sand dunes using the March and April images. However, when the June and August images, distance from built-up areas and distance from the coastline were integrated in the rule dependent object based classification, the kappa coefficient increased by 28% in general. The accuracy of sour orange, bulrush, sand dunes and settlement rose markedly. One advantage of rule dependent classifiers is that new classes can be added during the classification. In this study, saline vegetation and natural grasslands were included to improve classification accuracy (table 12).
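For readers unfamiliar with the kappa coefficient used in this comparison, the following sketch computes Cohen's kappa from a confusion matrix (rows as reference classes, columns as mapped classes). The function name and the two-class matrix are hypothetical and are not taken from the study's results.

```python
def cohen_kappa(confusion):
    """Cohen's kappa: agreement beyond chance, from a square confusion matrix."""
    n = sum(map(sum, confusion))
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical 2-class confusion matrix (illustrative counts only).
matrix = [[45, 5], [10, 40]]
kappa = cohen_kappa(matrix)
```

Computing the same statistic for the pixel based and the object based maps against one reference set gives the per-class and overall differences of the kind reported in the comparison.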
A vegetation index is derived from a combination of image wavebands. The most used vegetation indices in the literature are the normalized difference vegetation index (NDVI), soil adjusted vegetation index (SAVI), normalized difference water index (NDWI), green vegetation index (GVI) and perpendicular vegetation index (PVI). Vegetation indices indicate the health condition (NDVI) and water content (NDWI) of the vegetation canopy. There are many textbooks and papers about the calculation of vegetation indices. These indices provide extra information for LUC classification to discriminate subtle classes. For instance, NDVI is calculated from the red and near infrared (NIR) bands as shown in the following equation (Rouse et al. 1974):
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \qquad (5)$$
NDVI data was added to Landsat TM data to show the effect of a vegetation index on LUC mapping. Overall accuracy did not change significantly, but the sand dunes, bareground and deciduous classes were classified more accurately.
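A minimal sketch of the NDVI calculation, and of appending it to each pixel as an extra band before classification, is given below. The function names, the zero-denominator handling and the reflectance values are assumptions for illustration; the default indices assume the pixel vector lists Landsat TM bands 1 to 4 in order, so that red (band 3) and NIR (band 4) sit at zero-based positions 2 and 3.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - RED) / (NIR + RED)."""
    denom = nir + red
    # Assumption: return 0.0 where both bands are zero to avoid division by zero.
    return (nir - red) / denom if denom != 0 else 0.0

def add_ndvi_band(pixels, red_index=2, nir_index=3):
    """Append NDVI to each pixel's band vector as an extra ancillary band."""
    return [list(p) + [ndvi(p[nir_index], p[red_index])] for p in pixels]

# Hypothetical reflectances for TM bands 1-4 (blue, green, red, NIR).
pixels = [
    (0.04, 0.06, 0.05, 0.45),  # dense vegetation: high NIR, low red
    (0.20, 0.25, 0.30, 0.35),  # bare ground or sand: similar red and NIR
]
stacked = add_ndvi_band(pixels)
```

Each pixel in `stacked` then carries one more value than before, which a classifier such as DT can use alongside the spectral wavebands.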

Table 1. USGS classification scheme for level I of forest cover.

Table 2. Main EUNIS habitat classes and sample levels of coastal habitats.

Table 3. Specifications of the most used optical sensors in LUC mapping. (*) Planned missions.

Table 5. ANN parameters and values. Recoverable entries: input layer (image bands), node, hidden layer (various up to input neuron count), connections, output layer (class count).
The number of training cycles is defined according to the training error of the NN system: once the training error becomes optimal, the training cycles are sufficient.
... layer architecture. This NN included 2 hidden layers. The first hidden layer contained twice as many nodes as the input layer, and the second hidden layer contained three times as many nodes as the first hidden layer. The learning rate and learning momentum were defined according to the training error (table 5).
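The layer sizing described for this network can be sketched as below. This is only an arithmetic illustration: the function names are hypothetical, the reading of "two times" and "three times" as simple multiples is an interpretation of the text, the connection count assumes fully connected adjacent layers without bias terms, and the 6 bands and 10 classes are example values rather than the study's configuration.

```python
def layer_sizes(n_bands, n_classes):
    """Node counts for input, two hidden layers and output, following the text."""
    hidden1 = 2 * n_bands   # first hidden layer: twice the input nodes
    hidden2 = 3 * hidden1   # second hidden layer: three times the first
    return [n_bands, hidden1, hidden2, n_classes]

def connection_count(sizes):
    """Number of weights when every node connects to every node in the next layer."""
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

# Hypothetical example: 6 image bands as input and 10 LUC classes as output.
sizes = layer_sizes(n_bands=6, n_classes=10)
connections = connection_count(sizes)
```

The rapid growth in connections with each added band is one reason the size of the training set matters so much for ANN accuracy.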

Table 11. Accuracy comparisons of fuzzy classification methods in different classification schemes.

Table 12. Kappa accuracy of each LUC and difference between object and pixel based classifications.
... percentage, a correlation coefficient of 0.58 was obtained. Vegetation metrics and MERIS wavebands enhanced accuracy to 0.67.
This chapter has demonstrated various issues in LUC classification, including the ability of optical remotely sensed data, different classifiers, training data size and ancillary data, using the Eastern Mediterranean region as an example. Parametric and non-parametric, hard and soft LUC mapping techniques at local scale were assessed. The main findings of this chapter are as follows. Selection of a classification scheme and of the optical data is vital for a reliable result in LUC mapping. Remotely sensed data must be selected according to the mapping scale and study purpose. The LUC classification scheme and level should be defined based on the capability of the optical data, such as spatial and spectral resolution. Image pre-processing steps such as geometric registration, atmospheric correction, geometric correction and radiometric calibration are essential parts of change detection studies. Training data size, quality and mapping detail are also important for selecting a suitable classifier for LUC mapping. MLC, LDA, and DT techniques are useful for hard classification outputs. On the other hand, deriving a continuous map, such as the cover percentage or probability of each LUC, requires soft classifiers such as RT and LMM. Training data size and quality affect the classification accuracy and classifier selection. Model based classifiers have potential when a strong training data set is used, although data dependent classifiers can be chosen in this case for a more accurate LUC map. Linear techniques are suitable if the degree of mixture within a pixel is small. LMM is ideal if there are enough training data and pure pixels for each LUC. However, if the training pixel count and pure pixels are insufficient, non-linear techniques like RT or ANN are suggested.