Open access peer-reviewed chapter

Sequential Classification of Hyperspectral Images

Written By

Min Zhao and Jie Chen

Submitted: 21 July 2017 Reviewed: 15 December 2017 Published: 01 August 2018

DOI: 10.5772/intechopen.73160

From the Edited Volume

Hyperspectral Imaging in Agriculture, Food and Environment

Edited by Alejandro Isabel Luna Maldonado, Humberto Rodríguez Fuentes and Juan Antonio Vidales Contreras



Hyperspectral imaging has become increasingly popular in applications such as agriculture, food, and environment. The rich spectral information of hyperspectral images opens new possibilities and poses new challenges in data processing. In this chapter, we consider hyperspectral classification under sequential data collection, which is a frequent setting in industrial pushbroom imaging systems. We present related techniques, including data normalization, dimension reduction, classification, and spatial information integration, and show how to adapt these techniques to sequential data collection and processing. The proposed scheme is validated with real data collected in our laboratory. The methodology of result assessment is also presented.


  • hyperspectral sorting
  • sequential hyperspectral data processing
  • spatial-spectral information
  • hyperspectral classification

1. Introduction

Hyperspectral imaging is a continuously growing area and has received considerable attention in the last decade. Hyperspectral data provide a wide spectral range coupled with a high spectral resolution. These characteristics are suitable for the detection and classification of surfaces and chemical elements in the observed images. The rich information in the spectral dimension provides solutions to many problems that cannot be solved by traditional RGB imaging or multispectral imaging.

Applications include land use analysis, pollution monitoring, wide-area reconnaissance, and field surveillance, to cite a few. Typical cases related to food quality, agriculture, and environment include the following:

  1. Food safety plays an important role in our daily life. We often use a combination of the appearance, hand-feel, and smell of a product to judge the quality of fruits or vegetables. But this is not enough to judge whether there are abnormalities, deformations, or even visible defects in the fruit or vegetable. Awareness of food safety has highlighted the need for a rapid and accurate hyperspectral detection system [1].

  2. Precision agriculture is a farming management concept based on observing, measuring, and responding to inter- and intrafield variability in crops. In precision agriculture, hyperspectral remote sensing data are acquired and processed to derive maps of crop biophysical parameters, to measure the amount of plant cover, and to distinguish between crops and weeds [2].

  3. Due to the pressures of overconsumption, population, and technology, the biophysical environment is being degraded, sometimes permanently. Many of the earth's resources are on the verge of exhaustion because of human impacts across many countries [3]. Many attempts are made to prevent damage or to manage the impacts of human activity on natural resources. Hyperspectral classification can make resource recovery rapid and efficient.

One of the most important tasks of hyperspectral image processing is image classification. The rich spectral information of a hyperspectral image makes it possible to classify materials that are difficult to distinguish with other imaging techniques. In the past decades, different kinds of hyperspectral classification methods have been proposed [4, 5, 6, 7, 8, 9]. However, the existing methods may not be suitable for a real-time material sorting system. Pushbroom imaging systems are frequently used in industrial sorting. Such a system collects the columns of an image one after another in a sequential manner (see Figure 1). It is thus necessary to design a framework for online classification tasks and to adapt conventional algorithms to the sequential processing setting.

Figure 1.

Sequential hyperspectral data collection and processing by a pushbroom system. The hyperspectral camera captures data $x_k$ at time instant $k$, which is one of the sequential columns of the entire image; $y_k$ is the result after processing (the classification label in this case).

In this chapter, we present a scheme of sequential classification for hyperspectral sorting systems. This scheme can be used in various fields, such as measuring food quality and resource recovery. We present the main techniques in this sorting and processing, including data normalization, dimension reduction, classification, and spatial information integration and the way to accommodate these techniques to the context of sequential data collecting and processing.

The rest of this chapter is organized as follows. In Section 2, we propose the main steps of sequential hyperspectral classification processing system. In Section 3, detailed methods are presented for sequential hyperspectral image processing and sorting. Experiment results are then discussed in Section 4. Section 5 concludes the chapter.


2. System overview

Before elaborating on the proposed sequential hyperspectral image classification method, we first present the notation and the data model used in this work. We consider that the hyperspectral image under study has h pixels per column and w pixels per row, where h is a fixed size determined by the spatial resolution of the camera, and w grows without bound as the pushbroom system moves. Each pixel is a reflectance vector of p contiguous spectral bands. Then, let

  • $N = h \times w$ be the total number of pixels.

  • $X = [X_1, X_2, X_3, \ldots, X_p]^T$ be the $p \times N$ hyperspectral image.

  • $X_{kij}$ represent a pixel, where the subscript k denotes the index of the spectral band, and i and j represent the location of the pixel in the spatial domain.

The data collecting and processing of a real-time hyperspectral sorting system consist of the following major steps.

  1. Sequential image acquisition.

  2. Data preprocessing.

  3. Material classification.

The hyperspectral data used in this work are collected with the GaiaField system in our laboratory. The parameters of this system are provided in Table 1. Our online processing is based on windowed columns. After collecting each column, we combine it with several previous columns to form a window and perform the data processing steps within this window. Black-white normalization is used for basic data normalization. Principal component analysis (PCA) and fuzzy-set-based spectral decorrelation are used for dimension reduction [10]. Two typical techniques, Gaussian maximum likelihood (GML) and support vector machine (SVM), are presented for material classification. Considering the positive effect of spatial information on processing results [11], we also propose to integrate the spatial and spectral dimensions to enhance the classification accuracy. Finally, classification accuracy is characterized by metrics such as the confusion matrix and the κ coefficient. Details of the techniques and results are provided later.

| Parameter | Value |
| --- | --- |
| Equipment type | GaiaField and GaiaSorter |
| Moving speed of loading platform | 4.1 cm/s |
| Spectral resolution | 128 bands |
| Spatial resolution | 650 × 348 pixels |
| Distance between lens and samples | 24 cm |
| Exposure time | 3 ms |

Table 1.

Equipment parameters.


3. Processing methods

3.1. Data preprocessing

Data preprocessing steps include basic data normalization and spectral decorrelation. They are performed one after another as described later.

3.1.1. Basic data normalization

An important preprocessing step is the so-called black-white calibration. This calibration is carried out by recording one image for black and another for white, as described below, to remove the effect of the dark current of the camera sensor and to compensate for the uneven light intensity of each band. In an offline phase, the black image (B) is acquired by turning off the light source and covering the camera lens with its cap. The white image (W) is acquired by imaging a standard white ceramic tile under the same conditions as the raw image. Then, image correction is performed by [12]

$$I = \frac{I_0 - B}{W - B} \tag{1}$$

where $I$ is the hyperspectral image after normalization, $I_0$ is the original hyperspectral image captured in our laboratory, $B$ is the black reference image, and $W$ is the white reference image.
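The black-white calibration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `black_white_calibrate`, the small guard `eps` against division by zero, and the toy values are assumptions made for the example.

```python
import numpy as np

def black_white_calibrate(raw, black, white, eps=1e-8):
    """Apply I = (I0 - B) / (W - B) element-wise.

    raw, black, white: arrays of identical shape, e.g. (h, p) for one
    pushbroom column with h pixels and p bands.
    eps is an assumed guard for sensor elements where W equals B.
    """
    raw = np.asarray(raw, dtype=float)
    black = np.asarray(black, dtype=float)
    white = np.asarray(white, dtype=float)
    denom = np.maximum(white - black, eps)
    return (raw - black) / denom

# Toy column with 2 pixels and 3 bands (hypothetical sensor counts)
black = np.array([[10.0, 10.0, 10.0], [12.0, 12.0, 12.0]])
white = np.array([[110.0, 210.0, 60.0], [112.0, 212.0, 62.0]])
raw = np.array([[60.0, 110.0, 35.0], [112.0, 12.0, 37.0]])
calibrated = black_white_calibrate(raw, black, white)
```

In a sequential setting, the two reference images can be recorded once offline and reused for every incoming column, since they do not depend on the sample.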

3.1.2. Data dimension reduction

The high spectral resolution of hyperspectral data enables us to classify materials that are indistinguishable with conventional methods. However, a large number of spectral channels results in difficulties in terms of classifier training (Hughes phenomenon) and computational burden. Data dimension reduction is therefore useful, and it is possible because information is redundant across bands.

3.1.2.1. PCA

PCA is one of the most popular methods for data dimension reduction. PCA computes a linear transformation for high-dimensional input vectors, and this transformation maps the data into a low-dimensional orthogonal subspace. For simplicity, we assume that the data samples have zero mean. Otherwise, we can center the data by subtracting the mean:

$$Y = X - \bar{x}\,\mathbf{1}^T, \qquad \bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n \tag{2}$$

where $x_n$ is the $n$th column (pixel) of $X$ and $\mathbf{1}$ is the all-ones vector of length $N$.

Principal component analysis is based on the eigenstructure of the data. We therefore calculate the covariance matrix of $Y$ and perform the eigendecomposition of this matrix. The $i$th eigenvector of the covariance matrix is denoted by $a_i$, with associated eigenvalue $\lambda_i$.

To reduce the dimension of the data, we sort the eigenvalues $\lambda_i$ from large to small and select an appropriate number of the leading eigenvectors $a_i$ to form the representation coefficient matrix $A$ [13]. The reduced data are obtained as

$$Z = A^T Y \tag{3}$$

where $Z$ is the hyperspectral image data after decorrelation.

3.1.2.2. Fuzzy sets

Using fuzzy sets to decorrelate the hyperspectral data is based on the a priori knowledge that adjacent wavelengths of the spectrum are more correlated than distant pairs, because the spectral information varies smoothly. We therefore sample the spectral characteristics with groups of adjacent spectral bands, obtained by dividing the spectrum into separate groups that attain the desired spectral selectivity. We propose separating the hyperspectral data into M fuzzy groups, where each group covers a range of wavelengths [14]. The contribution of each wavelength is modeled by a membership function $M_{f_i}(\lambda)$. We use a triangular function as the membership function, shown in Figure 2:

$$M_{f_i}(\lambda) = \begin{cases} 1 - \dfrac{|\lambda - \lambda_i|}{D}, & |\lambda - \lambda_i| \le D \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

where $\lambda_i$ is the central wavelength of fuzzy set $i$, and $D$ is the distance between the central wavelengths of two adjacent fuzzy sets.

Figure 2.

Triangular function.

Each wavelength has a nonzero membership degree in at most two adjacent fuzzy sets, and its membership degree in all remaining fuzzy sets is 0 (Figure 3).

Figure 3.

Triangular function weighted process.

The energy of each fuzzy set is calculated by weighting the intensity of each spectrum element with the membership function associated with that fuzzy set, i.e.,

$$X_i = \sum_{\lambda} M_{f_i}(\lambda)\, L(\lambda) \tag{5}$$

where $X_i$ is the energy of fuzzy set $i$, and $L(\lambda)$ is the intensity of the spectrum element at wavelength $\lambda$.

The energy values of the fuzzy sets summarize the spectral characteristics of a pixel. In this way, each hyperspectral image pixel can be described by a vector containing the energy values of the M fuzzy sets as

$$x = [X_1, X_2, \ldots, X_M]^T \tag{6}$$
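The fuzzy-set reduction can be illustrated with a short pure-Python sketch. It assumes a triangular membership of the form 1 − |λ − λ_i|/D clipped at zero, works in band-index units rather than nanometers, and uses a hypothetical layout of 8 sets over 128 bands with centres 16 bands apart; the function names are illustrative only.

```python
def triangular_membership(lam, center, D):
    """Membership of band position lam in the fuzzy set centred at `center`.

    The triangle peaks at 1 on the centre and falls to 0 at distance D.
    """
    return max(0.0, 1.0 - abs(lam - center) / D)

def fuzzy_energies(spectrum, centers, D):
    """Energy of each fuzzy set: membership-weighted sum of band intensities."""
    return [
        sum(triangular_membership(band, c, D) * value
            for band, value in enumerate(spectrum))
        for c in centers
    ]

# Assumed layout: bands indexed 0..127, centres every 16 bands
D = 16
centers = [8 + D * i for i in range(8)]
flat_spectrum = [1.0] * 128          # toy spectrum with constant intensity
features = fuzzy_energies(flat_spectrum, centers, D)
```

With this spacing each triangle has a base of 2D = 32 bands, so neighbouring triangles overlap by half and every band contributes to at most two sets.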


3.2. Material classification

In this section, we present the algorithms used to classify and sort the captured data using the features (data of reduced dimension) extracted by PCA or the fuzzy-set method. We first review two popular classification methods in a general manner. Then we introduce how to incorporate spatial information into the classification. Finally, sequential processing with a window-based method is discussed.

3.2.1. Gaussian maximum likelihood classification

Spectra of distinct materials form clusters in the feature space, and we assume that the features of each material approximately follow a multivariate normal distribution. Specifically, the features $x$ of material $i$ follow the $p$-dimensional probability density function

$$p(x \mid i) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right) \tag{7}$$

where $\mu_i$ and $\Sigma_i$ are the mean vector and the covariance matrix of class $i$, respectively, and $i$ denotes the class label [15]. Each pixel in the hyperspectral image is assigned to the class that achieves the maximum probability.
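As a rough sketch of Gaussian maximum likelihood classification, the following NumPy code estimates one mean vector and covariance matrix per class and assigns a pixel to the class with the largest log-density. The function names, the use of log-densities for numerical stability, and the synthetic two-class data are assumptions for illustration, not details taken from the chapter.

```python
import numpy as np

def fit_gml(features, labels):
    """Estimate (mean, covariance) per class.

    features: (n_samples, d) array of reduced features.
    labels:   (n_samples,) array of integer class labels.
    """
    params = {}
    for c in np.unique(labels):
        Xc = features[labels == c]
        params[int(c)] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def log_density(x, mu, sigma):
    """Log of the multivariate normal density at x."""
    d = mu.size
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def classify_gml(x, params):
    """Return the class label with the highest log-density."""
    return max(params, key=lambda c: log_density(x, *params[c]))

# Synthetic two-class example in a 2-D feature space
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
class1 = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(200, 2))
features = np.vstack([class0, class1])
labels = np.array([0] * 200 + [1] * 200)
params = fit_gml(features, labels)
pred_a = classify_gml(np.array([0.1, -0.2]), params)
pred_b = classify_gml(np.array([3.8, 4.1]), params)
```

Because the class parameters are fixed after training, each new pixel only requires one quadratic form per class, which suits column-by-column processing.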

3.2.2. Support vector machine

SVM is one of the most effective and widely used methods in statistical learning. It aims to find the best tradeoff between model complexity and learning ability given limited sample information. SVM can therefore effectively mitigate the Hughes phenomenon caused by insufficient samples in hyperspectral classification.

The goal of the training algorithm is to find an optimal linear separating hyperplane [16]. Let $x_i$ be the training pixel vectors with class labels $y_i \in \{-1, +1\}$. For linearly separable data, a separating hyperplane with weight vector $w$ and offset $b$ satisfies

$$y_i\,(w^T x_i + b) \ge 1 \quad \text{for every training pixel } x_i \tag{8}$$

SVM constructs the hyperplane that maximizes the margin between the classes. The position of this separator is specified by a (usually small) subset of the training data, and these points are referred to as the support vectors [17]. The decision function is as follows:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i\, y_i\, x_i^T x + b\right) \tag{9}$$

where $\alpha_i$ is the $i$th Lagrange coefficient, $y_i$ is the corresponding classification label, $x_i$ is the $i$th support vector, $x$ is the input pixel vector, $N$ is the number of support vectors, and $b$ is the decision offset coefficient. For two-class hyperspectral classification, $f(x)$ takes the value $+1$ or $-1$, indicating the class to which the current pixel belongs. For multiclass classification, we can use one-versus-one, one-versus-rest, hierarchical support vector machines, or another strategy to obtain the multiclass label.

Sometimes, data cannot be separated by a linear classifier. Therefore, kernel methods are used to map data from the original input space to a higher-dimensional space. Thanks to the kernel trick, we only need to know the form of the inner product in that space instead of using the explicit map [16]. Popular kernel functions include the following:

Linear kernel:

$$K(x_i, x_j) = x_i^T x_j \tag{10}$$

Polynomial kernel:

$$K(x_i, x_j) = (x_i^T x_j + 1)^q \tag{11}$$

where $q$ is the polynomial order.

Radial basis function kernel:

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \tag{12}$$

where $\sigma^2$ is the kernel bandwidth.

Sigmoid kernel:

$$K(x_i, x_j) = \tanh\left(v\, x_i^T x_j + c\right) \tag{13}$$

for appropriate values of $v$ and $c$, so that Mercer's conditions are satisfied [16].
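The four kernels and the SVM decision rule can be written compactly in pure Python. The sketch below uses one common parameterization of each kernel (the constant 1 in the polynomial kernel and the factor 2 in the radial basis function exponent are assumed conventions), and the support vectors, coefficients, and offset in the toy example are hand-picked hypothetical values rather than the output of a trained model.

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def linear_kernel(x1, x2):
    return dot(x1, x2)

def polynomial_kernel(x1, x2, q=2):
    return (dot(x1, x2) + 1.0) ** q

def rbf_kernel(x1, x2, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(x1, x2, v=0.1, c=0.0):
    return math.tanh(v * dot(x1, x2) + c)

def decision(x, support_vectors, alphas, labels, b, kernel=linear_kernel):
    """Sign of the kernel expansion over the support vectors, as +1 or -1."""
    s = sum(a * y * kernel(sv, x)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return 1 if s + b >= 0 else -1

# Hand-picked two-class toy model (not trained)
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
alphas = [0.25, 0.25]
labels = [1, -1]
b = 0.0
label_pos = decision([2.0, 0.5], support_vectors, alphas, labels, b)
label_neg = decision([-1.0, -3.0], support_vectors, alphas, labels, b)
```

Passing a different `kernel` argument to `decision` switches the classifier from a linear to a nonlinear boundary without changing the rest of the rule.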

3.2.3. Incorporating spatial information

Conventionally, hyperspectral data classification algorithms are designed from a spectroscopic viewpoint and ignore the spatial information embedded in neighboring pixels [18]. Integrating spatial and spectral information may improve the processing performance. We therefore propose to combine the spatial and spectral dimensions to improve the classification accuracy. The proposed method exploits spatial information through connected component labeling, as follows. We generate the mean image by averaging the data after dimension reduction over the spectral bands. A connected component labeling algorithm is then applied to the binarized mean image. In our system, if an object is marked by connected component labeling and over 60% of its pixels are labeled as one class, we consider that all pixels within this connected region belong to the associated material. This strategy improves the classification accuracy.
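The region-level relabeling rule can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the binary mask is taken as given (the binarization of the mean image is not shown), components are 8-connected, and the function names and the toy maps are hypothetical.

```python
from collections import Counter, deque

def label_components(mask):
    """8-connected component labeling of a binary mask (list of lists of 0/1)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and labels[r][c] == 0:
                current += 1
                labels[r][c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = current
                                queue.append((ny, nx))
    return labels, current

def smooth_by_region(class_map, mask, threshold=0.6):
    """Relabel a whole region when one class covers more than `threshold` of it."""
    labels, n = label_components(mask)
    result = [row[:] for row in class_map]
    for region in range(1, n + 1):
        pixels = [(r, c) for r, row in enumerate(labels)
                  for c, lab in enumerate(row) if lab == region]
        votes = Counter(class_map[r][c] for r, c in pixels)
        winner, count = votes.most_common(1)[0]
        if count / len(pixels) > threshold:
            for r, c in pixels:
                result[r][c] = winner
    return result

# Toy example: two objects; 0 marks background in both maps
mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 1]]
class_map = [[1, 1, 0, 0],
             [1, 2, 0, 3],
             [0, 0, 0, 2]]
smoothed = smooth_by_region(class_map, mask)
```

In the toy example the left object is 75% class 1 and is relabeled entirely, while the right object is split evenly between two classes and is left unchanged.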

3.2.4. Sequential processing

Our hyperspectral images are captured by a pushbroom system in which the columns of an image are collected sequentially, one after another. We use a sliding window to assemble the acquired data: after collecting each column, we combine it with several previous columns to form a window and perform the data processing steps within this window (Figure 4). The sliding window makes it easier to incorporate spatial information into the processing. Its width L should be chosen by considering the data acquisition rate, the data processing speed, and the spatial correlation of the observed scene. In our experiments, we set L = 15.

Figure 4.

Sliding window.
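A sliding window over sequentially arriving columns can be kept with a bounded double-ended queue. The sketch below is one possible arrangement, not the authors' code: the class name `SlidingWindow`, the helper `process_stream`, and the toy columns are assumptions for illustration.

```python
from collections import deque

class SlidingWindow:
    """Keep the L most recent columns of a pushbroom image."""

    def __init__(self, width=15):
        # A deque with maxlen drops the oldest column automatically
        self.columns = deque(maxlen=width)

    def push(self, column):
        """Add the newest column and return the current window, oldest first."""
        self.columns.append(column)
        return list(self.columns)

def process_stream(columns, width, process_window):
    """Apply process_window to the window formed after each new column."""
    window = SlidingWindow(width)
    return [process_window(window.push(column)) for column in columns]

# Toy stream of 20 columns, each with 3 pixels holding the column index
columns = [[k, k, k] for k in range(20)]
sizes = process_stream(columns, 15, len)

window = SlidingWindow(15)
for column in columns:
    current = window.push(column)
```

During the first L − 1 steps the window is only partially filled, so the processing function should accept windows narrower than L.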


4. Experimental results

We collect the hyperspectral data with our Gaia pushbroom system. The images are acquired in the 400–1000 nm wavelength range with a spectral resolution of 7 nm, for a total of 128 wavelengths (p=128). The image dimensions h and w are 650 and 348 (650×348), respectively. The hyperspectral data include four kinds of fruits: tomato, jujube, lemon, and orange. In this study, we use a sliding window of size 15 for online processing of the data. Twenty-three sequential hyperspectral images are extracted for classification. The captured datasets are divided into training and testing sets, where 300 pixels of each material are used for training and 30,603 pixels are used for testing.

After data preprocessing, we select 300 pixels of each material from the training set as sample points to form a hyperspectral image. The pixels of this image are stacked, row by row or column by column, into a two-dimensional matrix that is used for dimension reduction. The test set is processed in the same way as the training set.

After the PCA transformation, the eigenvalue distribution is shown in Figure 5. This scree plot shows that the first eight components explain most of the variability. The remaining components explain a very small proportion of the variability and are likely unimportant. We retain the principal components that account for 99% of the sum of the eigenvalues as the data after dimensionality reduction. For fuzzy-set data reduction, we weight the 128 bands with a triangular window of length 32 and place one window every 16 bands, so that the data dimension is also reduced to 8. We then use the eight-connected component labeling method to remove the background from the data after dimension reduction.
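The PCA step with a 99% eigenvalue criterion can be sketched in NumPy as below. The code centers the data, eigendecomposes the covariance matrix, and keeps the leading eigenvectors whose eigenvalues sum to at least the requested share. The function names, the `energy` parameter, and the synthetic 6-band data are assumptions for the example and do not reproduce the experiment.

```python
import numpy as np

def pca_fit(X, energy=0.99):
    """Fit PCA on X of shape (p, N): p bands, one pixel spectrum per column."""
    mean = X.mean(axis=1, keepdims=True)
    Y = X - mean                                  # centered data
    cov = (Y @ Y.T) / X.shape[1]                  # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # sort from large to small
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # smallest number of components whose cumulative share reaches `energy`
    d = min(int(np.searchsorted(ratio, energy)) + 1, eigvals.size)
    return mean, eigvecs[:, :d]

def pca_transform(X, mean, A):
    """Project centered data onto the selected eigenvectors."""
    return A.T @ (X - mean)

# Synthetic data: 6 bands driven by 2 latent factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(2, 500)) * np.array([[5.0], [2.0]])
mixing = rng.normal(size=(6, 2))
X = mixing @ latent + 0.01 * rng.normal(size=(6, 500))
mean, A = pca_fit(X)
Z = pca_transform(X, mean, A)
```

The mean and the matrix A are estimated once on the training pixels and then reused for every incoming window, so the online cost is a single matrix product per column.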

Figure 5.

Eigenvalue distribution.

We then study the classification results of GML and SVM. We classify the data obtained after dimension reduction and background removal (see Figures 6 and 7). The result of classification with spatial information (connected region labeling) is shown in Figure 8.

Figure 6.

(a) GML classification with PCA and (b) GML classification with fuzzy sets.

Figure 7.

(a) SVM classification with PCA and (b) SVM classification with fuzzy sets.

Figure 8.

Classification with spatial information (connected region labeling) achieves almost 100% accuracy.

4.1. Performance evaluation of results

4.1.1. Confusion matrix

A confusion matrix is a table that is often used to describe the performance of a classifier on a set of test data for which the true values are known. It compares the classification result with the reference image: for each labeled pixel in the reference image, we look up the label assigned to the same pixel in the classified image. The confusion matrices of our experiment are shown in Table 2.

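Rows give the actual class and columns give the predicted class, in agreement with the definition of $m_{ij}$.

| Method | Actual class | Chinese date | Lemon | Orange | Tomato | Row sum |
| --- | --- | --- | --- | --- | --- | --- |
| (1) GML classification with PCA | Chinese date | 1745 | 0 | 0 | 0 | 1745 |
| | Column sum | 1775 | 9815 | 11,654 | 7359 | 30,603 |
| (2) GML classification with fuzzy sets | Chinese date | 1745 | 0 | 0 | 0 | 1745 |
| | Column sum | 2042 | 9628 | 11,577 | 7356 | 30,603 |
| (3) SVM classification with PCA | Chinese date | 1744 | 1 | 0 | 0 | 1745 |
| | Column sum | 2245 | 10,706 | 10,282 | 7370 | 30,603 |
| (4) SVM classification with fuzzy sets | Chinese date | 1738 | 0 | 0 | 7 | 1745 |
| | Column sum | 3260 | 13,587 | 7522 | 6234 | 30,603 |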

Table 2.

Confusion matrix of classification results: (1) GML classification with PCA, (2) GML classification with fuzzy sets, (3) SVM classification with PCA, and (4) SVM classification with fuzzy sets.

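The confusion matrix is written as

$$M = \begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1k} \\ m_{21} & m_{22} & \cdots & m_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ m_{k1} & m_{k2} & \cdots & m_{kk} \end{bmatrix} \tag{14}$$

where $m_{ij}$ is the number of pixels that belong to class $i$ and are assigned to class $j$ (wrongly so when $i \ne j$), and $k$ is the number of classes in the classification results (Figures 9 and 10).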

Figure 9.

(a) Confusion matrix of GML classification with PCA and (b) confusion matrix of GML classification with fuzzy sets.

Figure 10.

(a) Confusion matrix of SVM classification with PCA and (b) confusion matrix of SVM classification with fuzzy sets.

4.1.2. κ coefficient

κ reflects the classification error of the whole image and reduces the dependence of the accuracy measure on the number of classes and the number of samples. κ is computed with the following equation:

$$\kappa = \frac{N \sum_{i=1}^{k} m_{ii} - \sum_{i=1}^{k} m_{i+}\, m_{+i}}{N^2 - \sum_{i=1}^{k} m_{i+}\, m_{+i}} \tag{15}$$

where $N$ is the total number of classified pixels, $m_{i+}$ is the sum of row $i$ of the confusion matrix, and $m_{+i}$ is the sum of column $i$ of the confusion matrix.

With PCA dimensionality reduction, κ is 98.93% for GML and 92.55% for SVM. With fuzzy-set reduction, κ is 98.56% for GML and 74.69% for SVM. These values show that PCA outperforms fuzzy sets, GML outperforms SVM, and GML with PCA is the best of the four combinations for sequential classification of hyperspectral images.
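The κ coefficient is straightforward to compute from a confusion matrix. The following pure-Python sketch uses the standard Cohen's κ expression in terms of the diagonal, the row sums, and the column sums; the function name and the 2×2 example matrix are hypothetical and unrelated to the experimental data.

```python
def kappa(m):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    k = len(m)
    n = sum(sum(row) for row in m)                        # total pixel count
    diag = sum(m[i][i] for i in range(k))                 # correctly classified
    row_sums = [sum(m[i]) for i in range(k)]
    col_sums = [sum(m[i][j] for i in range(k)) for j in range(k)]
    chance = sum(r * c for r, c in zip(row_sums, col_sums))
    return (n * diag - chance) / (n * n - chance)

# Hypothetical two-class confusion matrix with 100 test pixels
example = [[45, 5],
           [10, 40]]
example_kappa = kappa(example)
```

Because κ subtracts the agreement expected by chance, it is lower than the raw share of correct pixels (85% in this example) whenever some agreement could arise from the class proportions alone.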

4.1.3. Other metrics

Other metrics include the classification accuracy, producer's accuracy (PA), user's accuracy (UA), omission errors (OEs), and commission errors (CEs).

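Classification accuracy indicates the overall rate of correctly classified pixels, as illustrated in Eq. (16).

$$\text{Accuracy} = \frac{\sum_{i=1}^{k} m_{ii}}{N} \tag{16}$$

where $N$ is the total number of classified pixels.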


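PA is the ratio of the pixels of class $i$ that are correctly classified to the total number of pixels that actually belong to class $i$, as illustrated in Eq. (17). User's accuracy (UA) is the ratio of the pixels that are correctly assigned to class $i$ to the total number of pixels that are assigned to class $i$, as shown in Eq. (18).

$$\text{PA}_i = \frac{m_{ii}}{m_{i+}} \tag{17}$$

$$\text{UA}_i = \frac{m_{ii}}{m_{+i}} \tag{18}$$

where $m_{i+}$ is the sum of row $i$ of the confusion matrix (pixels that actually belong to class $i$), and $m_{+i}$ is the sum of column $i$ (pixels assigned to class $i$).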


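OEs represent the proportion of pixels of class $i$ that are incorrectly assigned to another class, as shown in Eq. (19). Commission errors (CEs) indicate the proportion of pixels assigned to class $i$ that actually belong to other classes, as illustrated in Eq. (20).

$$\text{OE}_i = \frac{m_{i+} - m_{ii}}{m_{i+}} \tag{19}$$

$$\text{CE}_i = \frac{m_{+i} - m_{ii}}{m_{+i}} \tag{20}$$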
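The overall accuracy and the per-class metrics can be derived from the same confusion matrix, as in the pure-Python sketch below. It assumes that rows index the actual class and columns the assigned class, and that every class has at least one actual and one assigned pixel; the function names and the example matrix are hypothetical.

```python
def overall_accuracy(m):
    """Share of pixels on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total

def per_class_metrics(m):
    """Producer's/user's accuracy and omission/commission errors per class."""
    k = len(m)
    row_sums = [sum(m[i]) for i in range(k)]                       # actual totals
    col_sums = [sum(m[i][j] for i in range(k)) for j in range(k)]  # assigned totals
    metrics = []
    for i in range(k):
        pa = m[i][i] / row_sums[i]
        ua = m[i][i] / col_sums[i]
        metrics.append({"PA": pa, "UA": ua, "OE": 1.0 - pa, "CE": 1.0 - ua})
    return metrics

# Hypothetical two-class confusion matrix with 100 test pixels
example = [[45, 5],
           [10, 40]]
accuracy = overall_accuracy(example)
metrics = per_class_metrics(example)
```

Omission and commission errors are the complements of producer's and user's accuracy, so only the two accuracies need to be computed directly.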


With PCA dimensionality reduction, the classification accuracy is 99.26% for GML and 94.80% for SVM. With fuzzy-set reduction, the classification accuracy is 99.00% for GML and 82.03% for SVM. Based on this evaluation and Table 3, GML with PCA dimensionality reduction is the preferred solution for sequential classification of hyperspectral images.

Other metrics | Chinese date | Lemon | Orange | Tomato
(1) GML classification with PCA
(2) GML classification with fuzzy sets
(3) SVM classification with PCA
(4) SVM classification with fuzzy sets

Table 3.

Other metrics: (1) GML classification with PCA, (2) GML classification with fuzzy sets, (3) SVM classification with PCA, and (4) SVM classification with fuzzy sets.


5. Conclusion

The major objective of this chapter is to build a sequential hyperspectral classification method for an industrial material sorting system. The hyperspectral images are captured by a pushbroom system that collects image columns sequentially, one after another. PCA and fuzzy sets are used for data decorrelation. We study GML and SVM classification with the data obtained after dimension reduction and background removal and carry out a performance analysis. The results show that, with PCA dimensionality reduction, the accuracy is 99.26% for GML and 94.80% for SVM. With fuzzy-set reduction, the accuracy is 99.00% for GML and 82.03% for SVM. After combining the spatial and spectral information, the classification accuracy reaches almost 100%.

The designed framework shows several advantages in terms of processing speed, efficiency, and accuracy. It may play an important role in industrial material sorting for agricultural products, food, and industrial waste sorting.



This work was supported in part by National Natural Science Foundation of China under grant 61671382 and in part by National Natural Science Foundation of Shenzhen under grant JCYJ2017030155315873.


  1. Loutfi A, Coradeschi S, Mani GK, Shankar P, Rayappan JBB. Electronic noses for food quality: A review. Journal of Food Engineering. 2015;144:103-111
  2. Cetin H, Pafford JT, Mueller TG. Precision agriculture using hyperspectral remote sensing and GIS. In: Proceedings of the 2nd International Conference on Recent Advances in Space Technologies (RAST 2005). 2005. pp. 70-77
  3. Bonifazi G, Serranti S, Bonoli A, Dall'Ara A. Innovative recognition-sorting procedures applied to solid waste: The hyperspectral approach. WIT Transactions on Ecology and the Environment. 2009;120:885-894
  4. Harsanyi JC, Chang C-I. Hyperspectral image classification and dimensionality reduction: An orthogonal subspace projection approach. IEEE Transactions on Geoscience and Remote Sensing. 1994;32:779-785
  5. Ye Z, Prasad S, Li W, Fowler JE, He M. Classification based on 3-D DWT and decision fusion for hyperspectral image analysis. IEEE Geoscience and Remote Sensing Letters. 2014;11:173-177
  6. Camps-Valls G, Gomez-Chova L, Munoz-Mari J, Vila-Frances J, Calpe-Maravilla J. Composite kernels for hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters. 2006;3:93-97
  7. Melgani F, Bruzzone L. Classification of hyperspectral remote sensing. IEEE Transactions on Geoscience and Remote Sensing. 2004;42:1778-1790
  8. Palmason JA, Benediktsson JA, Sveinsson JR, Chanussot J. Fusion of morphological and spectral information for classification of hyperspectral urban remote sensing data. In: Proc. IGARSS. 2006. pp. 2506-2509
  9. Chi M, Bruzzone L. Semisupervised classification of hyperspectral. IEEE Transactions on Geoscience and Remote Sensing. 2007;45:1870-1880
  10. Artzai, Ghita O, Bereciartua A, Echazarra J, Whelan PF, Iriondo PM. Real-time hyperspectral processing for automatic nonferrous material sorting. Journal of Electronic Imaging. 2012;21:013018-1
  11. Wahab DA et al. Development of a prototype automated sorting system for plastic recycling. American Journal of Applied Sciences. 2006;3(7):1924-1928
  12. Serranti S, Palmieri R, Bonifazi G. Hyperspectral imaging applied to demolition waste recycling: Innovative approach for product. Journal of Electronic Imaging. 2015;24(4):043003
  13. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics and Intelligent Laboratory Systems. 1987;2:37-52
  14. Picón A et al. Fuzzy spectral and spatial feature integration for classification of nonferrous materials in hyperspectral data. IEEE Transactions on Industrial Informatics. 2009;5(4):483-494
  15. Bishop CM. Pattern Recognition and Machine Learning. New York: Springer; 2006. ISBN-10:0-387-31073-8
  16. Demir B, Erturk S. More sparsity in hyperspectral SVM classification using unsupervised pre-segmentation in the training phase. In: 3rd International Conference on Recent Advances in Space Technologies (RAST'07). 2007. pp. 271-274
  17. Wu Y. Approximate computing of remotely sensed data: SVM hyperspectral image classification as a case study. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2016;9:5806-5815
  18. Stockman H, Gevers T. Detection and classification of hyper-spectral edges. In: British Machine Vision Conf. (BMVC); Nottingham, UK. 1999. pp. 643-651
