Open access peer-reviewed chapter

A Cognitive Digital-Optical Architecture for Object Recognition Applications in Remote Sensing

Written By

Ioannis Kypraios

Submitted: 01 November 2022 Reviewed: 15 November 2022 Published: 01 March 2023

DOI: 10.5772/intechopen.109028

From the Edited Volume

Vision Sensors - Recent Advances

Edited by Francisco Javier Gallegos-Funes


Abstract

From coastal landscapes to biodiversity, remote sensing can on the one hand capture all the natural heritage elements and on the other hand help in maintaining protected species. In a typical remote sensing application, a few thousand super-high-resolution images are captured and need to be processed. The next step of the processing involves converting those images to an appropriate format for visual display of the data. Then, the image analyst needs to define the regions of interest (ROIs) in each captured image. Next, the ROIs are used for identifying specific objects or extracting the required information. The first drawback of this processing cycle is the use of image analysis tools which provide analysts only with scaling or zooming features. Second, there is no conceptual connection between the image analysis tools and the actual processing cycle. Third, such existing tools do not usually automate any steps in the processing cycle. We combine an optical correlator with a supervised or an unsupervised classifier learning algorithm and show how our proposed novel cognitive architecture is conceptually connected with the image analysis processing cycle. We test the architecture with captured images and describe how it can automate the processing cycle.

Keywords

  • object recognition
  • cognitive digital-optical architecture
  • image analysis
  • knowledge representation and learning
  • remote sensing

1. Introduction

Cultural heritage consists of all the tangible and intangible elements from monuments and cultural traditions to natural landscape, including all the extinct or current biological species. Cultural heritage and specifically tourism activities related to cultural heritage contribute to economic growth, regeneration, education and tourism [1, 2]. A previous report [3] published by the HLF and Visit Britain revealed that heritage tourism is a £12.4 billion a year industry. £7.3 billion of heritage expenditure is based on visits to built heritage attractions and museums, with the overall £12.4 billion including visits to parks and countryside as well. A recent article reported that reducing the risk of extinction for all the threatened species worldwide would cost approximately £2.97bn annually, with an additional £47.4bn required per year to establish and manage protected areas for species known to be at risk from habitat loss, hunting and other human activities [4].

Remote sensing has become one of the main technologies nowadays used for protecting cultural heritage due to its non-invasiveness [5]. Thus, remote sensing has been used in many cases for the conservation analysis of monuments, archaeological site detection and risk protection, together with the protection of natural landscapes consisting of living species. Aerial photography is one of the forms of modern remote sensing. It has been applied in many application areas [6]. For example, aerial photography has been successfully used in locating ancient civilisation structures amongst thick jungle vegetation [7].

One of the key tools in remote sensing image analysis is object recognition. Advanced machine learning (ML) and artificial intelligence (AI) algorithms can be used for detecting and classifying different classes of objects in different applications, such as environmental monitoring, geological hazard detection, land-use/land-cover (LULC) mapping, geographic information systems (GIS), precision agriculture and urban planning. Still, object recognition in remote sensing images (RSIs) can be very challenging due to the large variations in the visual appearance of objects caused by camera viewpoint variations, occlusion, background clutter, and illumination changes. In low spatial resolution satellite images such as Landsat, the recognition of objects becomes even harder. Therefore, higher resolution satellite images such as IKONOS or Quickbird are preferred since they provide researchers and image analysts with more detailed spatial and textural information. In effect, a greater range of object categories can be recognised due to the increased sub-meter resolution.

Nowadays the technological advancements in satellite and aerial RSIs offer opportunities for new applications in many image analysis areas. Pixels in an image can be grouped and clustered into regions of interest (ROIs), where object recognition algorithms can then be applied for classifying a range of categories of objects. Both supervised and unsupervised classifiers can be utilised depending on the application. Supervised classification requires a larger amount of data than unsupervised classification, with manual annotation and labelling of a bounding box which contains the true-class object for training purposes. However, such manual annotation suffers from scaling issues when a very large amount of such data is needed. Moreover, it becomes a significantly difficult task either when the ROIs consist of a cluster of a few pixels in an RSI or when the objects are occluded, camouflaged or include complex textures.

Thus, object recognition algorithms become of great importance in applications of RSIs. Extracting the features of an object, either by manually labelling them or through a classifier algorithm, becomes essential when aiming to recognise objects in image analysis applications. However, when dealing with large or very large (big) data, this task becomes complex, with high computational costs for processing the extracted feature vectors. Manual labelling is time consuming and unreliable since duplicate or redundant features are often created. Here, we focus on automated feature extraction through a classifier algorithm for large or very large data.

In Section 2, we discuss object recognition systems and their design characteristics. In Section 3, we explain the K-means algorithm. Section 4 discusses the optical correlator classifiers. In Section 5, our cognitive object recognition architecture is described. Then, in Section 6 we discuss the results recorded when applying our cognitive architecture. Section 7 contains the conclusions and future work.

2. Object recognition systems for automated image analysis

An aerial or satellite survey consists of several steps that need to be followed. Starting from the data acquisition procedures, they include visual observations, the capturing of imagery and the use of metric measurements on the acquired images. Then, this raw information is typically used to produce a set of documents which consists of text and related images. However, the resulting documents are often hard to use since they have been created for a specific application and require expert knowledge to comprehend. In addition, image quality can vary depending on the remote sensing site conditions. Other challenges for image analysis from aerial surveys can include poor lighting conditions, occlusions, and varying depths. Also, limitations on the geospatial information provided with the acquired images can limit their spatial analysis. Restrictions on the field-of-view (FOV) and the image sensor footprint can affect the number of images collected per aerial survey. Consequently, a higher number of images requires a longer processing time. Where completion time is of the essence for cost reasons, automated image analysis tools are necessary for minimising processing and completion times.

An integral part of such automated image analysis tools is the object recognition system, which needs to be designed with novel architectures if it is to be applied to more complex problems [8]. Recent advances have led to the development of biologically-inspired, also known as cognitive, architectures [9] of object recognition systems with separate design blocks for a recognition unit and a knowledge learning unit [10, 11]. Cognitive architectures need to exploit non-linearity, learning and adaptation to the input data, and provide an attentional mechanism so that the hybrid system can select certain input information to be included in its learning over other input. Therefore, knowledge representation and learning become a central issue in the design and implementation of such hybrid biologically-inspired pattern recognition systems [12]. Knowledge representation can alter both problem knowledge learning and problem solving [13]. A problem solving system consists of a domain theory, which specifies the task to be solved, the initial problem states and the targeted problem goals, and control knowledge, which guides the decision-making process. Thus, knowledge representation can have a direct effect on the efficiency of the problem solving process [14, 15].

2.1 Defining the problem

In this chapter, we describe a new object recognition architecture for improving the speciation of an endangered bird species from aerial surveys. For reasons of confidentiality, we use here the term endangered bird sub-species 1 or, simply, endangered species 1, and the term endangered sub-species 2 or, simply, endangered species 2. In particular, we are looking to improve the accuracy and precision of the recognition of endangered bird sub-species 1 and 2 in winter plumage, when the task of correctly speciating them is harder than in their summer plumage. Our proposed cognitive object recognition architecture incorporates the features of shape, size and colour (3 bands: R, G and B) of the endangered sub-species 1 and 2 in the architecture’s knowledge representation, and then applies an unsupervised knowledge learning unit for improving the accuracy and precision in recognising them.

3. Unsupervised clustering

Unsupervised machine learning refers to learning from the input without reference to known or labelled data. Unlike supervised machine learning algorithms, unsupervised ones cannot be directly applied to a classification problem since there is no prior knowledge of either the number of object classes or each class threshold. Instead, unsupervised learning can be used for discovering the underlying structure or pattern of the data. The term “clustering” refers to a process of grouping similar things together [14]. Unsupervised learning can therefore be used for discovering such clusters in the input data.

K-means is an unsupervised algorithm for clustering m objects into k clusters in which each observation belongs to the cluster with the nearest mean [15]. Each centroid is a point in a 2- or N-dimensional space that represents the centre of the cluster. Figure 1 shows an example of the K-means clustering algorithm [17]. The algorithm begins with k randomly placed centroids and assigns every item to the nearest one. After the initial assignment, each centroid is moved to the average location of all the objects assigned to it, and the objects are then re-assigned to the centroids. The process repeats until the centroids stop moving [18].

Figure 1.

Example of the K-means clustering algorithm. Initially, k = {k1} centroids are assigned in the M-dimensional object space. As the algorithm goes through the recursive steps, the centroids are re-assigned and the cluster boundaries move around the object space (adapted from [16]).

Bennet et al. [19] described a method for unsupervised classification in multitemporal optical RSIs based on discrete wavelet transform (DWT) feature extraction and K-means clustering. After pre-processing the optical image, they applied feature extraction using the DWT to create the input vectors. Then, the authors applied feature reduction to select the most discriminative features using an energy-based selection. Finally, they used K-means clustering for unsupervised learning of the input data clusters and compared the results by labelling the clusters using ground truth data. Shulei Wu et al. [20] introduced a novel classification method based on K-means using hue, saturation, value (HSV) colour features. When tested with Landsat satellite data, their method with HSV data produced higher classification accuracy than the K-means method with RGB data. Abbas et al. [21] compared the K-means unsupervised clustering method with the Iterative Self-Organising Data Analysis Technique Algorithm (ISODATA) unsupervised method for automatically grouping pixels of similar spectral features from remote sensing images. Vishwanath et al. [22] combined K-means unsupervised clustering with Laplacian-of-Gaussian and Prewitt filters for improving classification and road edge detection in RSIs. Yin et al. [23] applied the K-means clustering algorithm to Lidar-based 3D object detection and classification tasks in automated driving (AD). Specifically, they used K-means for 3D point cloud segmentation. The authors reported high-speed 3D object recognition when run on a GPU-enabled platform. Huu Thu Nguyen et al. [24] combined deep learning algorithms with K-means clustering to achieve multiple object detection in both sonar images and 3D point cloud Lidar data.

Figure 2 shows the K-means algorithm flowchart [25]. The algorithm is a recursive one, in which earlier steps in the flowchart are called again by later steps. The first basic step is to determine the number of clusters K. We then assume initial centroids for these clusters, which can be any randomly chosen objects. Alternatively, we can assign the first K objects in sequence as the initial centroids. The algorithm, as shown in Figure 2, consists of the following three steps, which are repeated until convergence:

  1. Determine the centroid coordinate.

  2. Determine the Euclidean distance of each object to the centroids.

  3. Group the objects based on minimum Euclidean distance values.

  4. Repeat steps 1–3 till all the centroids stop being moved in the objects space i.e. the algorithm has converged to a solution.

Figure 2.

K-means clustering algorithm flowchart: recursive steps (adapted from [25]).

The Euclidean distance $d$ between two points $p_1(x_1, y_1)$ and $p_2(x_2, y_2)$ in X-Y two-dimensional (2D) space is given by:

$$d(p_1, p_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1}$$

We can re-write the above equation for a cluster point $p_n$, where $n = 1 \ldots M$ is the index of cluster points, $M$ is the total number of points and $P$ is the vector of all the points, and a centroid point $\tau_k$, where $k = 1 \ldots C$ is the index of centroid points, $C$ is the total number of centroid points and $T$ is the vector which contains all the centroid points:

$$d(p_n, \tau_k) = \sqrt{(\tau_{x_k} - p_{x_n})^2 + (\tau_{y_k} - p_{y_n})^2} \tag{2}$$

Then each cluster point $p_n$ is assigned to a cluster by finding the centroid $\tau_k$ at the minimum distance, which is given by:

$$\underset{\tau_k \in T}{\arg\min}\; \operatorname{dist}(\tau_k, p_n)^2 \tag{3}$$

Then, we can compute the new centroid $\tau_k$ from the clustered group of points by the equation:

$$\tau_k = \frac{1}{|S_n|} \sum_{p_n \in S_n} p_n \tag{4}$$

where $S_n$ is the set of all 2D data points assigned to the $k$th cluster.
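The steps and Eqs. (1)–(4) above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the chapter: the function names `euclidean` and `kmeans`, the `max_iter` safeguard and the sample points are all assumptions made for the example. The initial centroids are taken to be the first K objects in sequence, which is one of the two initialisation options mentioned in the text.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension (Eq. 1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, max_iter=100):
    """Cluster `points` into k groups and return (centroids, labels).

    Hypothetical helper for illustration; `max_iter` is an assumed safeguard.
    """
    # Initial centroids: the first k objects in sequence.
    centroids = [tuple(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step (Eq. 3): each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the centroids stop moving
        labels = new_labels
        # Update step (Eq. 4): each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Two visibly separated groups of 2D points (made-up values).
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2)
```

Because the distance and the mean are computed coordinate by coordinate, the same sketch accepts points of any dimension without modification.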

Assume now we have two points in X-Y-Z three-dimensional (3D) space. Then, the Euclidean distance between those two 3D points $p^{3D}_1(x_1, y_1, z_1)$ and $p^{3D}_2(x_2, y_2, z_2)$ is given by:

$$d^{3D}(p^{3D}_1, p^{3D}_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{5}$$

We can re-write the above Eq. (2) for a 3D cluster point $p^{3D}_n$ and a 3D centroid point $\tau^{3D}_k$:

$$d^{3D}(p^{3D}_n, \tau^{3D}_k) = \sqrt{(\tau^{3D}_{x_k} - p^{3D}_{x_n})^2 + (\tau^{3D}_{y_k} - p^{3D}_{y_n})^2 + (\tau^{3D}_{z_k} - p^{3D}_{z_n})^2} \tag{6}$$

where $p^{3D}_n$ are the 3D cluster points, $P^{3D}$ is the vector of all the 3D points, $n = 1 \ldots M$ is the index of 3D cluster points and $M$ is the total number of 3D points; $\tau^{3D}_k$ are the 3D centroid points, $k = 1 \ldots C$ is the index of 3D centroid points, $C$ is the total number of 3D centroid points and $T^{3D}$ is the vector which contains all the 3D centroid points.

Then, Eq. (3) can be re-written for each 3D data point, for finding the centroid $\tau^{3D}_k$ at the minimum distance, as follows:

$$\underset{\tau^{3D}_k \in T^{3D}}{\arg\min}\; \operatorname{dist}^{3D}(\tau^{3D}_k, p^{3D}_n)^2 \tag{7}$$

Then, Eq. (4) can be re-written for a 3D centroid as:

$$\tau^{3D}_k = \frac{1}{|S^{3D}_n|} \sum_{p^{3D}_n \in S^{3D}_n} p^{3D}_n \tag{8}$$

where $S^{3D}_n$ is the set of all 3D data points assigned to the $k$th cluster.
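As a small worked example of Eqs. (5) and (8), with point values chosen purely for illustration, take $p^{3D}_1 = (1, 2, 2)$ and $p^{3D}_2 = (4, 6, 2)$:

$$d^{3D}(p^{3D}_1, p^{3D}_2) = \sqrt{(4 - 1)^2 + (6 - 2)^2 + (2 - 2)^2} = \sqrt{9 + 16 + 0} = 5$$

If both points are assigned to the same cluster, the updated centroid of that cluster is their coordinate-wise mean:

$$\tau^{3D}_k = \frac{1}{2}\big[(1, 2, 2) + (4, 6, 2)\big] = (2.5,\ 4,\ 2)$$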

4. Optical correlator classifiers

Since the pioneering work made by VanderLugt on spatial filtering [26, 27, 28], it became possible to construct complex matched filters. Several correlation filters for object recognition have been proposed to improve recognition capability, mostly by modifications of the amplitude or phase of the original matched filter.

4.1 Optical correlators categories

All the optical correlator classifiers can be further categorised based on the computational domain i.e. (a) spatial domain, (b) frequency/Fourier domain, and (c) hybrid domain by a combination of spatial and frequency domains.

For the frequency domain, correlator-type filters are commonly used. They can be further classified into two main classes: single filters and cascaded filters. From the first class, Jamal-Aldin et al. [29, 30, 31] have previously presented their work on the non-linear difference-of-Gaussians synthetic discriminant function (NL-DOG SDF) filter. Their design was motivated by the good detectability of the modified difference of Gaussians (DOG) filter [32]. In order to improve the interclass discrimination but still keep an intraclass tolerance for a higher distortion range of the true-class object, a non-linear operation was integrated into the synthesis of the modified synthetic discriminant function filter. The DOG function approximates the second-differential operator on the image-intensity function. In practice, when convolved with the image, the DOG filter results in an edge map of the reduced-resolution image. By setting the ratio of the standard deviations of the inhibitory and excitatory Gaussians to 1.6, the DOG filter provides smoother performance for the true-class object distortions, in effect improving the intraclass properties of the filter. The NL-DOG SDF filter is based on integrating the NL-DOG operation into the synthesis of the SDF filter as well as into the input test images.

Mahalanobis et al. [33] were the first to propose the minimum average correlation energy (MACE) filter. The MACE filter belongs to the linear combinatorial type of filters derived from the synthetic discriminant function. It is designed to maximise the peak height for the training set images and minimise the response of the filter to non-training set input images, with the constraint of keeping the peak-amplitude response of the filter at a fixed value for all the true-class objects included in the training set. It can be designed to give a fixed peak-amplitude response to the non-training set images, too. The solution to the resulting optimisation problem is found by applying the method of Lagrange multipliers. The resulting MACE filter produces a sharp peak response with narrow sidelobes and with a fixed peak height for the true-class object images included in the training set of the filter.

Later, Mahalanobis et al. [34] observed that filters may perform better if hard constraints are not imposed on the correlation peaks, and suggested the use of unconstrained correlation filters. Although previous work in SDF synthesis assumed that the correlation values at the origin are pre-specified, there is no need for such a constraint. Removing the hard constraints increases the number of possible solutions, thus improving the chances of finding a filter with better performance. A statistical approach is used for the design of an unconstrained filter. This method produces sharp peaks, is computationally simpler, and the proposed filters offer improved distortion tolerance. The reason lies in the fact that the training images are treated not as deterministic representations of the objects but as samples of a class whose characteristic parameters are used in encoding the filter. Three types of metrics [33, 35] are used in the design of the unconstrained filters, namely the average correlation energy (ACE), the average similarity measure (ASM) and the average correlation height (ACH). If a filter is designed to maximise the ACH criterion, it is called the maximum average correlation height (MACH) filter [34, 36]. The MACH filter maximises the relative height of the average correlation peak with respect to the expected distortions. The MACH filter yields a high correlation peak in response to the average of the training image vector.

Besides optimising the ACH criterion, in practice some other performance measures, e.g. the ACE and the output noise variance (ONV), also need to be balanced to better suit different application scenarios. Thus, based on Refregier’s approach to optimal trade-off [36] filters, Mahalanobis et al. designed the optimal trade-off [37, 38] maximum average correlation height (OT-MACH) filter, which optimises the average correlation height criterion while holding the others constant. By adjusting the values of the three non-negative parameters α, β and γ (0 ≤ α, β, γ ≤ 1), we control the OT-MACH filter’s behaviour to match different application requirements. If β = γ = 0, the resulting filter behaves much like a minimum variance synthetic discriminant function (MVSDF) [39] filter with relatively good noise tolerance but broad peaks. If α = γ = 0, then the filter behaves more like a MACE filter, which generally exhibits sharp peaks and good clutter suppression but is very sensitive to distortion of the target object. If α = β = 0, the filter gives high tolerance for distortion but is less discriminating.
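For orientation, the OT-MACH filter is commonly written in the frequency domain in a form like the one below. The chapter does not state this expression, so the symbols $m_x$, $C$, $D_x$ and $S_x$ are assumptions that follow the usual notation of the correlation filter literature, and conjugation conventions vary between authors:

$$h = \frac{m_x^{*}}{\alpha C + \beta D_x + \gamma S_x}$$

Here $m_x$ is taken to be the average of the training image spectra, $C$ the power spectral density of the additive noise, $D_x$ the average power spectral density of the training images and $S_x$ the similarity matrix of the training set, all treated as diagonal so that the division is element-wise. Under these assumptions the three limiting cases in the text can be read off directly: with $\beta = \gamma = 0$ only the noise term $C$ remains in the denominator, with $\alpha = \gamma = 0$ only the energy term $D_x$ remains, and with $\alpha = \beta = 0$ only the similarity term $S_x$ remains.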

From the second class of cascaded filters [40], Reed and Coupland [41] have studied a cascade of linear shift-invariant processing modules (correlators), each augmented with a non-linear threshold as a means to increase the performance of high-speed optical pattern recognition. They propose that their cascaded correlator configuration can be considered a special case of multilayer feed-forward neural networks. They have shown that the non-linear performance of their cascaded correlator can exceed the performance of the MACE filter. Mahalanobis et al. [42, 43] have developed the Distance Classifier Correlation Filter (DCCF). Similarly to the work of Reed and Coupland [41], DCCF uses a cascade of shift-invariant linear filters (correlators) to compute the linear distances between the input test image and the training set images under an optimum transformation. DCCF can be extended to support recognition of multiple object classes. Alkanhal and Kumar [44] have developed the Polynomial Distance Classifier Correlator filter (PDCCF). The underlying theory extends the original linear distance classifier correlation filter to include non-linear functions of the input pattern. PDCCF can jointly optimise all the correlators of the cascaded design, and can support multi-class object recognition.

4.2 Synthetic discriminant function filter

The main idea behind the Synthetic Discriminant Function (SDF) filter is to include the expected distortions in the filter design such that improved immunity to such distortions is achieved. For example, including out-of-class objects in the filter design gives the filter a multi-class discrimination ability. In the conventional SDF filter [28] design, weighted versions of the target object are linearly superimposed such that, when the composite image is cross-correlated with any input training image, the cross-correlation outputs at the origin are the same and are equal to a pre-specified constant.

The basic filter equation, constructed by the weighted combination of the training set images, is:

$$h(x, y) = \sum_{i=1}^{N} a_i \cdot t_i(x, y) \tag{9}$$

where

$$a = R^{-1} c \tag{10}$$

are the weights, $c$ is an appropriate external vector and

$$R_{ij} = \iint t_i(x, y)\, t_j(x, y)\, dx\, dy \tag{11}$$

are the elements of the correlation matrix $R$ of the training image set $t_i$.
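The synthesis in Eqs. (9)–(11) can be illustrated numerically with a short plain-Python sketch. It is not the chapter's implementation: the helper names `inner`, `solve` and `sdf_filter`, the 2 × 2 toy images and the constraint values are assumptions chosen for the example, and the integral in Eq. (11) is replaced by a sum over pixels. The training images are assumed to be linearly independent so that $R$ can be inverted.

```python
def inner(u, v):
    """Discrete form of Eq. (11): correlation value at the origin of two images."""
    return sum(a * b for row_u, row_v in zip(u, v) for a, b in zip(row_u, row_v))


def solve(matrix, rhs):
    """Solve matrix * a = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def sdf_filter(train_images, constraints):
    """Return the composite image h (Eq. 9) and the weights a (Eq. 10)."""
    R = [[inner(ti, tj) for tj in train_images] for ti in train_images]
    a = solve(R, constraints)  # a = R^{-1} c without forming the inverse
    rows, cols = len(train_images[0]), len(train_images[0][0])
    h = [
        [sum(a[i] * t[r][c] for i, t in enumerate(train_images)) for c in range(cols)]
        for r in range(rows)
    ]
    return h, a


# Toy 2x2 "training images" and example peak constraints (made-up values).
t1 = [[1.0, 0.0], [0.0, 1.0]]
t2 = [[1.0, 1.0], [0.0, 0.0]]
h, a = sdf_filter([t1, t2], [0.2, 1.0])
peak_1 = inner(h, t1)  # correlation at the origin with the first image
peak_2 = inner(h, t2)  # correlation at the origin with the second image
```

Correlating the composite image `h` with each training image at the origin returns the constraint value assigned to that image, which is the defining property of the conventional SDF design described above.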

4.3 Pure SDF correlator classifier for endangered species 1 and 2 speciation

Figure 3 shows the block diagram of the Pure SDF Correlator Classifier. The training set consists of images of endangered bird species 1 and 2. The peak value for endangered bird species 1 is constrained to be 0.2, and the peak value for endangered bird species 2 is constrained to be 1.0. By linearly superimposing the constraint-weighted training set images, the composite image of the Pure SDF Classifier tool is synthesised. The test set consists of images (snags) of birds captured during an aerial survey. Each test image is then correlated with the composite image of the Pure SDF Correlator Classifier. The centre peak of the output correlation plane for each input test image is then used for classifying the object snag as being either an endangered bird species 1 or an endangered bird species 2. A scatter plot of the classified endangered bird species is then drawn. For each input snag, the spectral absolute peak (SAP) Red and SAP Blue values are used in the scatter plot.

Figure 3.

Pure SDF correlator classifier for endangered bird species speciation.

5. Cognitive object recognition architecture

For the endangered species 1 and 2, we represent an object as a vector of the SAP values of the input image histogram for the Red and Blue components. In effect, we assume that each input image of a bird captured during an aerial survey has a spectral signature which consists of those SAP Red and SAP Blue values. It is important to note here that the K-means clustering algorithm is an unsupervised learning method. In effect, there is no a-priori information regarding the cluster sizes and the final positions of the centroids. As each object is represented by two vector components, one SAP Red and one SAP Blue, we have a 2-dimensional (2D) object space.
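A possible way of computing such a two-component signature is sketched below. The chapter does not define the SAP computation, so the sketch assumes that the SAP of a colour band is the intensity level at which that band's histogram reaches its highest count. The function names, the 8-bit range and the tiny test image are likewise hypothetical.

```python
def channel_histogram(image, channel, levels=256):
    """Count how many pixels take each intensity level in one colour channel."""
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel[channel]] += 1
    return hist


def spectral_absolute_peak(hist):
    """Assumed SAP definition: the intensity level with the largest count.

    Ties are resolved in favour of the lowest intensity level.
    """
    return max(range(len(hist)), key=lambda level: hist[level])


def sap_red_blue(image):
    """Return the (SAP Red, SAP Blue) pair for an image of (R, G, B) pixels."""
    red = spectral_absolute_peak(channel_histogram(image, 0))
    blue = spectral_absolute_peak(channel_histogram(image, 2))
    return red, blue


# A made-up 2x2 RGB snag; real snags in the chapter are far larger.
snag = [
    [(200, 10, 30), (200, 20, 30)],
    [(200, 30, 40), (10, 40, 30)],
]
signature = sap_red_blue(snag)
```

Each snag is thereby reduced to one point in the 2D object space, which is the form of input that the clustering step expects.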

5.1 Biologically-inspired hybrid digital-optical system

We need to develop a new method to improve the speciation of the endangered bird species 1 and 2 for automating the image analysis of the datasets collected from the aerial surveys. There is an increased level of difficulty in correctly classifying and performing speciation for endangered bird species 1 and 2 during the winter aerial surveys due to the higher similarity of the birds’ plumage between species 1 and species 2. Therefore, we propose the design and development of a novel biologically-inspired hybrid digital-optical system [14] for increasing the accuracy of the bird species speciation and reducing the overall time it takes to process an aerial survey. As we are going to describe in the next sections, our proposed system is capable of performing both knowledge representation and knowledge learning by incorporating in its data: (i) the shape of each endangered species 1 and 2, (ii) the size of each endangered species 1 and 2, and (iii) the colour information of each endangered species 1 and 2.

5.2 Knowledge representation and knowledge learning

Figure 4 shows the block diagram of our proposed novel biologically-inspired hybrid digital-optical system, called Fast SDF K-means. It is a hybrid digital-optical design: the optical correlator unit forms the optical part and the K-means clustering unit forms the digital part. The term “Fast” originates from the optical unit of the design, which consists of a correlator that can be implemented as a space domain function in a joint transform correlator architecture or be Fourier transformed (FT) and used as a Fourier domain filter in a 4-f Vander Lugt type optical correlator [26]. Therefore, it can operate at the speed of light.

Figure 4.

Fast SDF K-means classifier for endangered species speciation.

In Figure 4 two different modules are shown. The first is the knowledge representation module [45, 46], which consists of the optical correlator unit, and the second is the knowledge learning module. In the knowledge representation module, the shape, size and 3-band colour information of each input image are synthesised into the composite image of the Fast SDF K-means Classifier. For each input object a correlation peak value is then recorded, which translates this information into a numerical value. This essentially forms the knowledge representation of the whole object space in the training set.

In the second module, the knowledge learning module [47], spectral histogram information together with the composite image, which consists of shape, size and colour information, are learned by the Fast SDF K-means Classifier. Thus, the correlation peak value together with the Red and Blue components of the spectral histogram form a 3D vector for each input object. Then, those 3D vectors, which encode the shape, size, colour and histogram information of each object, are learned without supervision by the K-means clustering unit. The output values of the Fast SDF K-means Classifier can be visualised by a scatter plot of the recognised endangered species 1 and 2, where the separate clusters of the two classes can be observed together with any misclassified input objects from either the endangered species 1 class or the endangered species 2 class.
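The assembly of the 3D vectors described above can be sketched as follows. The chapter does not say whether the three components are rescaled before clustering. The sketch therefore assumes a min-max scaling of each component, so that SAP values on a 0–255 scale do not dominate a correlation peak of order 1 in the Euclidean distance. The function names and the sample values are hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1] (assumed pre-processing)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # a constant component carries no information
    return [(v - lo) / (hi - lo) for v in column]


def build_feature_vectors(peaks, sap_red, sap_blue):
    """Combine correlation peak, SAP Red and SAP Blue into one 3D vector per object."""
    columns = [min_max_scale(list(col)) for col in (peaks, sap_red, sap_blue)]
    return list(zip(*columns))


# Made-up values for three snags: correlation peaks and SAP Red / SAP Blue.
peaks = [0.2, 1.0, 0.6]
sap_red = [100, 200, 150]
sap_blue = [50, 50, 50]
vectors = build_feature_vectors(peaks, sap_red, sap_blue)
```

The resulting list of 3D vectors can be passed to any K-means routine with two clusters, one per species.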

6. Results

In this section, the datasets used will be described. Then, the details of the recorded results will be shown together with the Fast SDF K-means performance metrics for the different datasets.

6.1 Datasets

Two different datasets have been used for testing the performance of the Fast SDF K-means classifier: (a) the Winter Dataset, which consisted of aerial images taken during the winter months, and (b) the Summer Dataset, which consisted of aerial images taken during the summer months. The Winter Dataset consisted of 221 three-band JPEG-formatted aerial image shots, also known as snags. Each snag had a size of [320 × 240] pixels. The Summer Dataset consisted of 270 three-band JPEG-formatted snags. As for the Winter Dataset, each snag had a size of [320 × 240] pixels. For aerial survey logging and identification reasons, all the shapefiles “.jgw” and “.mat” tagged information files have been saved for both datasets, too. To match the aerial survey automated image analysis of the snags with the ground truth data, object identification (ObjectID) information had been provided with each snag. Ground truth data had been collected from sea surveys or on-shore remote view, including the total number of endangered species 1 and the total number of endangered species 2.

6.1.1 Winter dataset results

Figure 5 shows the K-means clustering algorithm classification scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. All the object snags were classified using their histogram spectral values of SAP Red and SAP Blue. There is a large deviation, and in fact a reversal of the population ratio, between the approximate ratio of species 1 to species 2 given by the boat survey, i.e. (species1/species2) = 13:1, and the endangered species counts classified by the algorithm, i.e. classified species2 = 185 and classified species1 = 36.

Figure 5.

Winter dataset speciation: K-means clustering algorithm. It shows the scatter plot of the classified endangered species 1 and species 2. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values. There is a high number of endangered species objects whose ID, as tagged by the ground truth (GT) scientists, was uncertain, i.e. tagged as “species 1 or species 2”.

Figure 6 shows the Pure SDF Correlator classification scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. However, now all the object snags were classified using their shape, size and 3-band colour information. This time the endangered species classified by the Pure SDF Correlator produced a ratio of (species1/species2) = 24:2.

Figure 6.

Winter dataset speciation: Pure SDF correlator classifier. It shows the scatter plot of the classified endangered species 1 and species 2. All the object snags were classified using the correlation peak value of each input image, which encoded their shape, size and colour information. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values. For a high number of the endangered species, the ID tagged by the GT scientists was uncertain, i.e. shown on the plot as “species 1 or species 2”.
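How a single correlation peak value can separate two classes may be easier to see in a small sketch. It assumes the conventional equal-correlation-peak SDF construction, in which the composite image is a weighted sum of the training images and the weights are chosen so that the central correlation value meets a preset constraint for each training image. The flattened 2 × 2 images, the constraints of 1 and 0 and the function names are hypothetical. The sketch evaluates only the central correlation value as an inner product, not a full correlation plane.

```python
def dot(u, v):
    """Inner product of two flattened images (correlation value at zero shift)."""
    return sum(a * b for a, b in zip(u, v))


def sdf_composite(x1, x2, c1, c2):
    """Composite image h = a1*x1 + a2*x2 with <h, x1> = c1 and <h, x2> = c2."""
    # Gram matrix entries of the two training images.
    g11, g12, g22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    det = g11 * g22 - g12 * g12
    if det == 0:
        raise ValueError("training images must be linearly independent")
    # Solve the 2 x 2 system G a = c by Cramer's rule.
    a1 = (c1 * g22 - c2 * g12) / det
    a2 = (c2 * g11 - c1 * g12) / det
    return [a1 * p + a2 * q for p, q in zip(x1, x2)]


# Hypothetical flattened 2 x 2 training images, one per class.
species1_view = [1.0, 0.0, 1.0, 0.0]
species2_view = [0.0, 1.0, 1.0, 0.0]

# Constrain the peak to 1 for the first class and 0 for the second.
h = sdf_composite(species1_view, species2_view, c1=1.0, c2=0.0)

peak_species1 = dot(h, species1_view)
peak_species2 = dot(h, species2_view)
print(peak_species1, peak_species2)
```

In this toy setting, an input would be assigned by comparing its peak value against a threshold between the two constraints. A real correlator would search the whole correlation plane for the peak, typically via Fourier transforms.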

Figure 7 shows the Fast SDF K-means Classifier scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. All the object snags were now classified using their 3D vectors, which encode shape, size, 3-band colour and spectral histogram information. This time, the classification by the Fast SDF K-means Classifier produced a ratio of (species1/species2) = 20:4.

Figure 7.

Winter dataset speciation: Fast SDF K-means classifier. It shows the scatter plot of the classified endangered species 1 and species 2. All the object snags were classified using their 3D vectors, which encode their shape, size, 3-band colour and histogram spectral information. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values.
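A brief sketch may help to show why adding the correlation value to the two spectral values can change a classification. It assumes that each 3D vector is simply (correlation peak, SAP red, SAP blue) and that a snag is assigned to the nearest cluster centre by Euclidean distance. The centre coordinates, the snag values and the function names are hypothetical.

```python
from math import dist


def feature_vector(corr_peak, sap_red, sap_blue):
    """Stack the correlation peak and the two spectral values into one 3D vector."""
    return (corr_peak, sap_red, sap_blue)


def nearest_centroid(vec, centroids):
    """Return the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(vec, centroids[label]))


# Hypothetical cluster centres in (peak, red, blue) space.
centroids_3d = {
    "species 1": (0.9, 0.25, 0.30),
    "species 2": (0.2, 0.70, 0.65),
}
# The same centres restricted to the two spectral axes only.
centroids_2d = {label: c[1:] for label, c in centroids_3d.items()}

# A hypothetical snag with in-between colours but a low correlation peak.
snag = feature_vector(corr_peak=0.15, sap_red=0.45, sap_blue=0.47)

spectral_only = nearest_centroid(snag[1:], centroids_2d)
with_peak = nearest_centroid(snag, centroids_3d)
print(spectral_only, with_peak)
```

In practice the three components would probably need to be brought onto comparable scales before clustering, since otherwise the component with the largest numerical range dominates the distance.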

6.1.2 Summer dataset results

Figure 8 shows the K-means clustering algorithm classification scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. All the object snags were classified using their histogram spectral values of SAP Red and SAP Blue. There is a large deviation, and indeed a reversal of the population ratio, between the approximate ratio of species 1 to species 2 given by the boat survey, i.e. (species1/species2) = 15:2, and the numbers classified by the algorithm, i.e. classified species2 = 151 and classified species1 = 96.

Figure 8.

Summer dataset speciation: K-means clustering algorithm. It shows the scatter plot of the classified endangered species 1 and species 2. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values.
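The ratio reversal reported for the k-means clustering algorithm can be checked with a few lines of arithmetic. The sketch below uses only the survey ratios and classified counts quoted in the text for the winter and summer datasets; the function name and the simple "majority flipped" test are illustrative assumptions.

```python
from fractions import Fraction


def ratio_report(survey_ratio, n_species1, n_species2):
    """Compare the survey species1/species2 ratio with the classified counts."""
    survey = Fraction(*survey_ratio)
    classified = Fraction(n_species1, n_species2)
    # The ratio is "reversed" when the majority species flips.
    reversed_majority = (survey > 1) != (classified > 1)
    return survey, classified, reversed_majority


# Boat survey ratio and k-means counts quoted for each dataset.
winter = ratio_report((13, 1), n_species1=36, n_species2=185)
summer = ratio_report((15, 2), n_species1=96, n_species2=151)

for name, (survey, classified, flipped) in (("winter", winter), ("summer", summer)):
    print(name, float(survey), round(float(classified), 3), flipped)
```

In both seasons the survey ratio is well above 1 while the classified ratio falls below 1, which is the reversal described above.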

Figure 9 shows the Pure SDF Correlator classification scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. However, all the object snags were now classified using their shape, size and 3-band colour information. This time, the endangered species classified by the Pure SDF Correlator produced a ratio of (species1/species2) = 11:1.

Figure 9.

Summer dataset speciation: Pure SDF correlator classifier. It shows the scatter plot of the classified endangered species 1 and species 2. All the object snags were classified using the correlation peak value of each input image, which encoded their shape, size and colour information. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values.

Figure 10 shows the Fast SDF K-means Classifier scatter plot. The classified endangered species 1 and species 2 are drawn against their SAP Red (x-axis) and SAP Blue (y-axis) values. All the object snags were now classified using their 3D vectors, which encode shape, size, 3-band colour and spectral histogram information. This time, the classification by the Fast SDF K-means Classifier produced a ratio of (species1/species2) = 15:1.

Figure 10.

Summer dataset speciation: Fast SDF K-means classifier. It shows the scatter plot of the classified endangered species 1 and species 2. All the object snags were classified using their 3D vectors, which encode their shape, size, 3-band colour and histogram spectral information. The classified endangered species are drawn against their SAP red (x-axis) and SAP blue (y-axis) values.

6.1.3 Performance metrics

Performance metrics were used to assess the k-means classification algorithm, the Pure SDF correlator classifier and the novel Fast SDF k-means classifier. The precision metric is given by the ratio of the true positives (TP) to the sum of the true positives and the false positives (FP):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{12}$$

The recall metric is computed as the ratio of the TP to the sum of the TP and the false negatives (FN):

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{13}$$

The true negative rate (TNR) is computed as the ratio of the true negatives (TN) to the sum of the TN and the FP:

$$\mathrm{TNR} = \frac{TN}{TN + FP} \tag{14}$$

Accuracy is given by the ratio of TP plus TN over the sum of TP, TN, FP and FN:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15}$$
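The four ratios defined above can be computed directly from the confusion counts, as in the following minimal sketch. The function name and the example counts are hypothetical and are not the counts behind the reported results. Returning NaN for an empty denominator is a design choice of this sketch.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, TNR and accuracy from confusion-matrix counts."""

    def ratio(num, den):
        # Avoid a division by zero when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "tnr": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Hypothetical counts, invented for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=5, fn=15)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```

With these counts the TNR is low even though precision is high, because there are few true negatives relative to false positives. The same pattern can arise whenever one class heavily outnumbers the other.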

Table 1 shows the performance metric values of all three tested classifiers for the winter dataset. The second column shows the values for the k-means classification algorithm, the third column shows the values for the Pure SDF correlator classifier, and the fourth column shows the values for the novel Fast SDF k-means classifier. Similarly, Table 2 shows the performance metric values of all three evaluated classifiers for the summer dataset, with the results arranged in the same columns as in Table 1.

| Winter speciation | k-means clustering | Pure SDF correlator classifier | Fast SDF k-means classifier |
|---|---|---|---|
| Precision % | 16 | 86.33 | 96.11 |
| Accuracy % | 4 | 84.33 | 89.6 |
| TNR % | 0 | 4.3 | 0 |
| Recall % | 51.7 | 97.2 | 92.96 |

Table 1.

Winter dataset speciation performance metric values for the k-means clustering algorithm, the pure SDF correlator and the fast SDF k-means classifier.

| Summer speciation | k-means clustering | Pure SDF correlator classifier | Fast SDF k-means classifier |
|---|---|---|---|
| Precision % | 42.5 | 83.94 | 89.93 |
| Accuracy % | 31.87 | 82.0 | 83.23 |
| TNR % | 0 | 15.0 | 0 |
| Recall % | 56.04 | 97.2 | 91.80 |

Table 2.

Summer dataset speciation performance metric values for the k-means clustering algorithm, the pure SDF correlator and the fast SDF k-means classifier.

7. Discussion and conclusions

In this section, the results of our evaluated classifiers are analysed and compared. In particular, the discussion is focused on the performance of the three classifiers during the winter dataset tests since the endangered species 1 and species 2 offer a greater classification challenge during the winter months than the summer ones. A separate section is included with the main conclusions of this work together with some future research suggestions.

7.1 Discussion

From Figures 5–7 it can be seen that, overall, the Fast SDF k-means classifier performed better than both the k-means classification algorithm and the Pure SDF correlator classifier for the winter dataset. Although the Pure SDF correlator classifier gave a (species1/species2) classification ratio closer to that of the ground-truth (GT) scientists than the Fast SDF k-means classifier did, it still produced a much higher number of uncertain classifications that could not be matched with either species 1 or species 2: almost double the total number of classified endangered species, i.e. the sum of species 1 and species 2. Similarly, from Figures 8–10 it can be seen that, overall, the Fast SDF k-means classifier performed better than both the k-means classification algorithm and the Pure SDF correlator classifier for the summer dataset. In particular, the (species1/species2) classification ratio of the Fast SDF k-means classifier was almost identical to that of the GT scientists. Incorporating the shape, size, 3-band colour and histogram spectral information into the Fast SDF k-means classifier has improved the classification performance for both the summer and winter datasets in comparison to the other two classifiers.

Tables 1 and 2 allow the performance of all the classifiers used for the speciation of the winter and summer endangered bird species plumage to be assessed. It can be clearly seen that the k-means classification algorithm performed worse than the Pure SDF Correlator and Fast SDF K-means classifiers for both the summer and winter endangered bird species plumage. The precision value was 96.11% and the accuracy value was 89.6% for the Fast SDF k-means classifier when tested with the winter dataset. The precision value was 89.93% and the accuracy value was 83.23% for the Fast SDF k-means classifier when tested with the summer dataset. It is worth mentioning that the precision values of the Fast SDF k-means classifier for both datasets were higher than the human image analysts' values, which were not more than 88% for the winter plumage and not more than 89% for the summer plumage.

It should be noted that the weather conditions during the summer and winter surveys were significantly different. During the winter aerial survey the weather conditions were poor, but during the summer boat survey the conditions were significantly improved. Thus, from the performance metric values of Tables 1 and 2, it can be concluded that the performance of the Pure SDF Correlator Classifier was not significantly affected by the different weather conditions under which the surveys were conducted. Also, although the performance of the Fast SDF k-means classifier seems to deviate between the summer and winter surveys, this was found to be due to glint effects in the captured image data. After examining the summer dataset, approximately 25% of the total number of snags were found to contain significant glint effects. Nevertheless, the overall performance of the Pure SDF correlator classifier and the Fast SDF k-means classifier closely matched the boat survey during both winter and summer weather conditions.

Further, the novel Fast SDF k-means classifier has minimised the total amount of data that needs to be ground-truthed by the GT scientists, i.e. it can lead to increased automation of the speciation process. We assessed the precision and accuracy values of the Fast SDF k-means classifier to be greater than 85% during the winter surveys, i.e. only approximately 20% or less of the total amount of survey data would need to be ground-truthed. Hence, the processing of the datasets would become more cost-effective, i.e. the GT scientists would be able to process more surveys per day.

7.2 Conclusion

We have shown how our novel cognitive architecture of the Fast SDF k-means classifier is conceptually connected with the image analysis processing cycle. It is a hybrid digital-optical design in which the k-means unsupervised learning algorithm is integrated with a correlator. Thus, the Fast SDF k-means classifier consists of a knowledge representation module formed by the SDF correlator and a knowledge learning module formed by the k-means classifier. The shape, size and 3-band colour information of each input image is synthesised into the composite image of the Fast SDF k-means classifier, and a corresponding correlation value is then recorded which translates this information into a numerical value. The knowledge learning module formed by the k-means classifier then learns the coded 3D vector, which combines this value from the composite image with the Red and Blue components of the spectral histogram for each input object.

We have assessed the novel Fast SDF k-means classifier using performance metrics and compared it with the k-means classifier and the Pure SDF correlator classifier. The k-means classifier learns the vectors of the SAP Red (x-axis) and SAP Blue (y-axis) values of the input dataset. The Pure SDF correlator classifier uses the correlation peak value of each input image, which encodes its shape, size and colour information. The precision values of the Fast SDF k-means classifier for both datasets were found to be higher than the human image analysts' values, which were not more than 88% for the winter plumage and not more than 89% for the summer plumage. Both the Pure SDF correlator classifier and the Fast SDF k-means classifier performed consistently under the different summer and winter conditions. Although other types of surveys, e.g. boat surveys, can be prone to human error, applying those performance metrics to our Fast SDF k-means classifier could support quality control (QC) and quality assessment (QA) of the classifier’s results over the aerial survey data.

In Section 4.1, the NL-DoG SDF correlator classifier was described. In comparison with the Pure SDF correlator, the NL-DoG SDF correlator offers improved detectability and interclass discrimination while still keeping intraclass tolerance over a higher distortion range of the true-class object. Thus, we propose in future work to integrate an NL-DoG into our Fast SDF k-means classifier design, which is expected to enhance its performance.

References

  1. Historic England. Heritage and the Economy [Internet]. 2020. Available from: https://historicengland.org.uk/content/heritage-counts/pub/2020/heritage-and-the-economy-2020/ [Accessed: 26 October 2022]
  2. Kordej-De Villa Ž, Šulc I. Cultural heritage, tourism and the UN sustainable development goals: The case of Croatia. In: Andreucci MB, Marvuglia A, Baltov M, Hansen P, editors. Rethinking Sustainability Towards a Regenerative Economy. Future City. Vol. 15. Springer; 2021
  3. Heritage Lottery Fund (HLF). Investing in Success: Heritage and the UK Tourism Economy [Internet]. 2010. Available from: https://www.heritagefund.org.uk/sites/default/files/media/about_us/hlf_tourism_impact_single.pdf [Accessed: 26 October 2022]
  4. Barbosa AEA, Tella JL. How much does it cost to save a species from extinction? Costs and rewards of conserving the Lear’s macaw. Royal Society Open Science. 2019;6:190190
  5. Chen F, Guo H, Tapete D, Masini N, Cigna F, Lasaponara R, et al. Interdisciplinary approaches based on imaging radar enable cutting-edge cultural heritage applications. National Science Review. 2021;8(9)
  6. Kaur J, Singh W. Tools, techniques, datasets and application areas for object detection in an image: A review. Multimedia Tools and Applications. 2022;81:38297-38351
  7. Lo CP. Modern use of aerial photographs in geographical research. JSTOR Area. 1971;3(3):164-169
  8. Haykin S. Neural Networks and Learning Machines. 3rd ed. New York: Pearson; 2009
  9. Serre T, Oliva A, Poggio T. A feedforward architecture accounts for rapid categorization. PNAS Biological Sciences. 2007;104(15):6424-6429
  10. DiCarlo JJ, Zoccolan D, Rust NC. How does the brain solve visual object recognition? Neuron. 2012;73(3):415-434
  11. Kriegeskorte N, Douglas PK. Cognitive computational neuroscience. Nature Neuroscience. 2018;21(9):1148-1160
  12. Lee I, Portier B. An empirical study of knowledge representation and learning within conceptual spaces for intelligent agents. In: Proceedings of the IEEE/ACIS International Conference Computer and Information Science; 11–13 July 2007; Melbourne. Australia: IEEE; 2007. pp. 463-468
  13. Aler R, Borrajo D, Isasi P. Knowledge representation issues in control knowledge learning. In: Proceedings of the 17th International Conference on Machine Learning (ICML’00). San Francisco, CA. USA; 2000. p. 1-8. ISBN 1558607072
  14. Kypraios I. Performance analysis of the modified-hybrid optical neural network object recognition system within cluttered scenes. In: Kypraios I, editor. Advances in Object Recognition Systems. London, UK: InTech; 2012. ISBN: 978-953-51-0598-5
  15. Bisang W. Knowledge representation and cognitive skills in problem solving. In: Zlatkin-Troitschanskaia O, Wittum G, Dengel A, editors. Positive Learning in the Age of Information. Wiesbaden: Springer VS; 2018
  16. A Tutorial on Clustering Algorithms [Internet]. 2012. Available from: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html [Accessed: 26 October 2022]
  17. Russell S, Norvig P. Artificial Intelligence: A Modern Approach. 3rd ed. New York: Prentice Hall; 2009. p. 816-820. ch20
  18. MacQueen JB. Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. Vol. 1. Berkeley: University of California; 1967. pp. 281-297
  19. Bennet J, Ganaprakasam CA, Arputharaj K. A discrete wavelet based feature extraction and hybrid classification technique for microarray data analysis. The Scientific World Journal. 2014;2014:1-9. Article ID 195470. DOI: 10.1155/2014/195470
  20. Wu S, Chen H, Zhao Z, Long H, Song C. An improved remote sensing image classification based on k-means using HSV colour feature. In: Proceedings of the 10th International Conference on Computational Intelligence and Security (CIS-2014). Kunming, Yunnan, China; 15-16 November 2014. p. 201-204
  21. Abbas A, Minallh N, Ahmad N, Abid SAR, Khan MAA. K-means and ISODATA clustering algorithms for landcover classification using remote sensing. Sindh University Research Journal. 2016;48
  22. Vishwanath N, Ramesh B, Rao SP. Unsupervised classification of remote sensing images using k-means algorithm. International Journal of Latest Trends in Engineering and Technology (IJLTET). 2016;7(2)
  23. Yin X, Sasaki Y, Wang W, Shimizu K. YOLO and k-Means Based 3D Object Detection Method on Image and Point Cloud [Internet]. 2020. Available from: https://arxiv.org/abs/2004.11465
  24. Nguyen H, Lee E-H, Bae C, Lee S. Multiple object detection based on clustering and deep learning methods. Sensors. 2020;20(16):4424. DOI: 10.3390/s20164424
  25. Teknomo K. K-Means Clustering Tutorials [Internet]. Available from: http://people.revoledu.com/kardi/tutorial/kMean/ [Accessed: 26 October 2022]
  26. Vander Lugt A. Signal detection by complex spatial filtering. IEEE Transactions on Information Theory (IT-10). 1964;10(2):139-145
  27. Bahri Z, Kumar BVK. Generalized synthetic discriminant functions. Journal of Optical Society of America. 1988;5(4):562-571
  28. Kumar BVK. Tutorial survey of composite filter designs for optical correlators. Applied Optics. 1992;31(23):4773-4801
  29. Jamal-Aldin LS, Young ECD, Chatwin CR. Application of non-linearity to wavelet-transformed images to improve correlation filter performance. Applied Optics. 1997;36(35):9212-9224
  30. Jamal-Aldin LS, Young ECD, Chatwin CR. Nonlinear preprocessing operation for enhancing correlator filter performance in clutter. In: Proceedings of the European Optical Society OC’ 98 Conference on Optics in Computing. Vol. 3490. Belgium: SPIE; 1998. pp. 182-186
  31. Jamal-Aldin LS, Young ECD, Chatwin CR. In-class distortion tolerance, out-of-class discrimination and clutter resistance of correlation filters that employ a space domain non-linearity applied to wavelet filtered input images. SPIE. 1998;3386:111-122
  32. Shang L, Wang RK, Chatwin CR. Frequency multiplexed DOG filter. Optics and Lasers in Engineering. Elsevier Applied Science. 1997;27(2):161-177
  33. Mahalanobis A, Vijaya Kumar BVK, Casasent D. Minimum average correlation energy filters. Applied Optics. 1987;26(17):3633-3640
  34. Vijaya Kumar BVK, Hassebrook L. Performance measures for correlation filters. Applied Optics. 1990;29(20):2997-3006
  35. Mahalanobis A, Vijaya Kumar BVK, Song S, Sims SRF, Epperson JF. Unconstrained correlation filters. Applied Optics. 1994;33(17):3751-3759
  36. Refregier P. Filter design for optical pattern recognition: Multicriteria optimisation approach. Optics Letters. 1990;15(15):854-856
  37. Zhou H, Chao TH. MACH filter synthesising for detecting targets in cluttered environment for gray-scale optical correlator. SPIE. 1999;715:394-398
  38. Mahalanobis A, Vijaya Kumar BVK. Optimality of the maximum average correlation height filter for detection of targets in noise. Optical Engineering. 1997;36(10):2642-2648
  39. Vijaya Kumar BVK. Minimum variance synthetic discriminant functions. Journal of Optical Society of America A. 1986;3:1579-1584
  40. Dubois F. Non-linear cascaded correlation processes to improve the performances of automatic spatial-frequency-selective filters in pattern recognition. Applied Optics. 1996;35(23):4589-4597
  41. Reed S, Coupland J. Cascaded linear shift-invariant processors in optical pattern recognition. Applied Optics. 2001;40(23):3843-3849
  42. Mahalanobis A, Vijaya Kumar BVK, Sims SRF. Distance classifier correlation filters for multiclass target recognition. Applied Optics. 1996;35(17):3127-3133. DOI: 10.1364/AO.35.003127. PMID: 21102690
  43. Mahalanobis A, Vijaya Kumar BVK, Sims SRF. Distance-classifier correlation filters for multiclass target recognition. Applied Optics. 1996;35(17):3127-3133
  44. Alkanhal M, Vijaya Kumar BVK. Polynomial distance classifier correlation filter for pattern recognition. Applied Optics. 2003;42(23):4688-4708
  45. Khalifa M, Ning Shen K. Effects of knowledge representation on knowledge acquisition and problem solving. The Electronic Journal of Knowledge Management. 2006;4(2):153-158
  46. Policastro CA, Zuliani G, da Silva RR, Romero RAF. Hybrid knowledge representation applied to the learning of the shared attention. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN-2008); IEEE World Congress on Computational Intelligence (WCCI-2008); 1–6 June 2008. Hong Kong, China: IEEE; 2008. pp. 866-870
  47. Ho SB, Liausvia F. Knowledge representation, learning, and problem solving for general intelligence. In: Proceedings of the 6th International Conference on Artificial General Intelligence. 2013. DOI: 10.1007/978-3-642-39521-5-7
