Granular Approach for Recognizing Surgically Altered Face Images Using Keypoint Descriptors and Artificial Neural Network

This chapter presents a technique, the entropy volume-based scale-invariant feature transform (EV-SIFT), for accurate face recognition after cosmetic surgery. The features compared are the key points and the volume of the Difference of Gaussian (DOG) structure; for those points the information rate is computed. Because entropy is a higher-order statistical feature, the extracted information is only minimally affected by uncertain changes in the face. The extracted EV-SIFT features are then provided to the support vector machine for classification. The standard scale-invariant feature transform (SIFT) extracts key points based on dissimilarity, also known as the contrast of the image, and the volume-based scale-invariant feature transform (V-SIFT) extracts key points based on the volume of the structure. The EV-SIFT method, however, provides both the contrast and the volume information. Thus, EV-SIFT performs better than principal component analysis (PCA), standard SIFT, and V-SIFT-based feature extraction. Since the artificial neural network (ANN) trained with Levenberg-Marquardt (LM) is a powerful computational tool for accurate classification, it is further used in this technique for better classification results.


Introduction
Human faces are multidimensional and complex visual stimuli that carry useful information about the identity of a person. Face recognition for security and authentication purposes has taken a new turn in the current era of computer vision and image analysis, for example, in monitoring applications, image recovery, man-machine interaction, and biometric authentication. Normally, a facial recognition system does not require touch or human interaction to complete the recognition process, which is one of its benefits over other recognition methods. Facial recognition can operate in the verification phase [1] or the identification phase [2]. In the verification phase, the correspondence between two faces is resolved. Many methods are available for facial recognition [3][4][5][6][7][8], but the accuracy of recognition is not always high because of variations in lighting levels, facial expressions, poses, aging, low-resolution input images, or facial markings [9,10]. Several investigators have implemented face recognition methods that treat the effects of imposition [11], illumination [12], low resolution [13], aging [14], or a combination thereof [15]. Although these sources of uncertainty can be overcome, plastic surgery makes the identification of a person considerably more difficult. The difficulty of face recognition after plastic surgery is due to the loss or variation of facial components, the texture of the skin, the general appearance of the face, and the geometric relationship between facial features [16][17][18]. Plastic surgery, now both affordable and sophisticated, has attracted people from all over the world. However, only a few contributions or research methodologies have been reported in the literature that address the problem of face recognition after plastic surgery.
Among them are recognition by local region analysis [19] and a local form of cascade texture function (SLBT) combined with periocular features [20]. A review in [21] also illustrates the use of multimodal features for recognizing faces altered by plastic surgery.

Related works
De Marsico et al. [22] addressed the recognition of faces that have undergone cosmetic surgery with a region-based approach on a multimodal supervised architecture named Split Face Architecture (SFA). The authors demonstrated the advantage of their method by applying supervised SFA to conventional PCA and FDA, to LBP in its multiscale, rotation-invariant version with uniform patterns, to face analysis for commercial entities (FACE), and to face recognition against occlusions and expression variations (FARO).
Kohli et al. [23] presented a multiple projective dictionary learning (MPDL) framework, which does not need to compute norms, for recognizing faces that have been modified by cosmetic surgery. Several projective dictionaries and compact binary face descriptors are used to learn local and global representations of plastic surgery faces, in order to facilitate the distinction between plastic surgery faces and their original faces. Tests performed on the plastic surgery database resulted in an accuracy of about 97.96%.
Chude-Olisah et al. [24] addressed the degradation of facial recognition performance; they found that their approach outperformed previously available approaches to face recognition after cosmetic surgery, regardless of changes in lighting, facial expressions, and other changes resulting from cosmetic surgery. Ouanan [25] introduced a HOG feature-based facial recognition approach, which uses HOG as a substitute for DOG in the scale-invariant feature transform. Ouloul [26] introduced a face recognition approach that applies SIFT features to RGBD images produced by Kinect; cameras of this kind are inexpensive and can be used in any setting and in many situations. Bhatt et al. [27] proposed a multi-objective granular evolutionary method for matching images taken before and after cosmetic surgery. First, the algorithm generates overlapping face granules at three levels of granularity. Facial recognition after plastic surgery has seen several developments in recent years, with contributions reported in the feature extraction phase, in the classification phase, or in both phases.

Granular approach for recognizing surgically altered face images using EV-SIFT and LM trained NN
The surgical face recognition technique is based on the granular approach and Laplacian sharpening, since sharpening an image enhances the cornerness and contrast of the image granules. Key point elimination is then performed with an entropy threshold, because entropy is an effective selection criterion for eliminating unreliable interest points. Since the artificial neural network (ANN) trained with Levenberg-Marquardt (LM) is a powerful computational tool for accurate classification, it is used in this technique for better classification results. The architecture of the proposed face recognition technique is illustrated in Figure 1.
The testing image $I_T$ is first preprocessed: it is cropped, resized, and formulated granularly. The local extrema of the preprocessed image $I_T^p$ are then detected using the DOG scale space. In the proposed recognition technique, the EV-SIFT descriptor is used to extract the features, and the NN classifier with LM is adopted for classification.

Preprocessing: granular and Laplacian sharpening
This is the initial process applied to the input image $I_T$, in which the image is resized, cropped, and formulated into granules. Two types of preprocessing are carried out, namely, Laplacian sharpening and granular processing.

Preprocessing-I
The image $I_T$ from the database is cropped and resized to 150 × 150.
Laplacian operator: This is a derivative operator used to identify the edges in an image. The main difference between the Laplacian and other operators such as Sobel, Prewitt, Kirsch, and Robinson is that those operators are first-order derivative masks, whereas the Laplacian is a second-order derivative mask. The mask comes in two variants:
• Positive Laplacian operator
• Negative Laplacian operator
A further difference is that the Laplacian does not extract edges in any particular direction. Instead, it extracts edges in two classes:
• Inward edges
• Outward edges
Positive Laplacian operator: This variant has a standard mask in which the center element is negative and the corner elements are zero. It is used to extract the outward edges in the image, as illustrated in Figure 2.
Negative Laplacian operator: This variant also has a standard mask, in which the center element is positive, the corner elements are zero, and the remaining elements are −1. It is used to extract the inward edges in the image, as illustrated in Figure 2.
Working strategy of Laplacian: This operator highlights the gray-level discontinuities in an image and deemphasizes the regions with slowly varying gray levels. The result is an image with grayish edge lines on a dark background, which contains both the inward and outward edges. Two points govern the application of the filter. First, the positive and negative operators cannot both be applied to the same image; only one of them is applied. Second, if the positive operator is applied, the resultant image is subtracted from the original image to obtain the sharpened image; likewise, if the negative operator is applied, the resultant image is added to the original image to obtain the sharpened image.
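The sharpening rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's implementation: the center values −4 and +4 and the value +1 for the non-corner elements of the positive mask are the commonly used ones and are assumed here, and the function names are hypothetical.

```python
import numpy as np

# Assumed standard 3x3 masks; the corner elements are zero as stated in the text.
POSITIVE_LAPLACIAN = np.array([[0, 1, 0],
                               [1, -4, 1],
                               [0, 1, 0]], dtype=float)
NEGATIVE_LAPLACIAN = np.array([[0, -1, 0],
                               [-1, 4, -1],
                               [0, -1, 0]], dtype=float)


def filter3x3(image, mask):
    """Apply a 3x3 mask to a 2D image, replicating the border pixels."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=float)
    for dr in range(3):
        for dc in range(3):
            out += mask[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


def laplacian_sharpen(image, operator="positive"):
    """Sharpen with exactly one of the two operators (no clipping is applied)."""
    image = image.astype(float)
    if operator == "positive":
        # Positive operator: subtract the filtered image from the original.
        return image - filter3x3(image, POSITIVE_LAPLACIAN)
    if operator == "negative":
        # Negative operator: add the filtered image to the original.
        return image + filter3x3(image, NEGATIVE_LAPLACIAN)
    raise ValueError("operator must be 'positive' or 'negative'")
```

Because the two assumed masks differ only in sign, both branches produce the same sharpened image, which is consistent with applying only one of the operators.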

Preprocessing-II
This is the foremost process of the developed model. Consider $I$ as the given plastic surgery face image of size $n \times m$. The face granules are formed at three levels of granularity. The first level provides global information at multiple resolutions. The second level provides the inner and outer information of the face. Local facial features play a leading role in face recognition, and the third level therefore extracts the local facial features. The three levels of granularity are explained below:
First level of granularity: In this level, the face granules are generated by applying the Gaussian and Laplacian operators. The Gaussian operator gives a series of images low-pass filtered with a 2D Gaussian kernel, whereas the Laplacian operator gives a sequence of band-pass images. Let $I_{Gg}$ denote the granules produced by the Gaussian and Laplacian operators, where $g$ denotes the granule number. If the face image is of size 196 × 224, the output forms a pyramid of six granules, $I_{Gg1}$ to $I_{Gg6}$, at higher or lower resolutions. The six granules separate the facial features at different resolutions, capturing the blurriness, smoothness, edge information, and noise present in $I$. Hence, this level compensates for variations caused by alterations of face texture such as skin resurfacing, dermabrasion, and facelift.
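As a rough sketch of the first level, the following Python code builds three low-pass and three band-pass granules. Several choices here are assumptions rather than details from the chapter: the split of the six granules into three Gaussian and three Laplacian images, the 5-tap binomial kernel standing in for the 2D Gaussian kernel, and the band-pass image taken as the difference between a level and its blurred version.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian (assumed).
KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0


def blur(image):
    """Separable low-pass filter with reflected borders."""
    padded = np.pad(image, 2, mode="reflect")
    rows, cols = image.shape
    # Filter along each row, then along each column.
    horiz = sum(KERNEL[k] * padded[:, k:k + cols] for k in range(5))
    return sum(KERNEL[k] * horiz[k:k + rows, :] for k in range(5))


def first_level_granules(image, levels=3):
    """Return `levels` low-pass granules followed by `levels` band-pass granules."""
    gaussians, laplacians = [], []
    current = image.astype(float)
    for _ in range(levels):
        low = blur(current)
        gaussians.append(low)             # low-pass (Gaussian) granule
        laplacians.append(current - low)  # band-pass (Laplacian) granule
        current = low[::2, ::2]           # halve the resolution for the next level
    return gaussians + laplacians
```

With an input array of shape (196, 224), the low-pass granules have shapes (196, 224), (98, 112), and (49, 56), which gives the pyramid view mentioned above.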
Second level of granularity: In this level, the face image $I$ is divided into different regions to obtain the horizontal granules $I_{Gg7}$ to $I_{Gg15}$ and the vertical granules $I_{Gg16}$ to $I_{Gg24}$. The size of the first three granules is $n \times m/3$. Among the next granules, the size of $I_{Gg10}$ and $I_{Gg12}$ is $n \times (m/3 - \epsilon)$, [...] is the size of $I_{Gg13}$ and $I_{Gg15}$, and $n \times (m/3 - 2\epsilon)$ is the size of $I_{Gg16}$. The vertical granules are generated in the same manner. In this way, the second level captures the variations in both the inner and the outer facial regions. The variations present in the chin, cheeks, ears, and forehead are captured through the relations among the vertical and horizontal granules.
Third level of granularity: In general, humans recognize individuals by their local face regions such as the eyes, mouth, and nose. This level exploits that property by extracting the local facial regions and using them as granules. Given the eye coordinates, 16 local facial regions can be extracted using the golden ratio face template. Every region carries local information that reflects the deviations due to plastic surgery. This granular preprocessing provides flexibility with respect to deviations in both the inner and outer facial sections. It uses the relation among the horizontal and vertical granules to capture the deviations in the cheeks, chin, forehead, and ears caused by plastic surgery procedures.
In Eq. (1), $u \in [1, M]$ and $v \in [1, N]$, $0 \le m_r \le M_r - 1$ and $0 \le n_r \le N_r - 1$, $M_r \times N_r$ is the size of the resized image, and $[\cdot]$ denotes rounding to the nearest integer.

Acquisition of the EV-SIFT key points
Choosing the key points from the Difference of Gaussian function is a vital step. The parameters of a key point depend purely on the distribution of the image gradients around it. Thus, both the orientation and the gradient magnitude are formulated, which provides invariance to rotation of the image. Eqs. (4) and (5) define the gradient magnitude and the orientation $\theta(x, y)$ of the key points, where $L(x, y)$ refers to the image sample. The scale used for $L$ is the scale of the respective key point. An orientation histogram is then obtained from the gradients of the sample points.
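For illustration, the gradient magnitude and orientation can be computed from pixel differences of the smoothed image $L$, as in the standard SIFT formulation. The sketch below assumes that standard form, since Eqs. (4) and (5) are not reproduced here; the function names and the 36-bin default are hypothetical choices.

```python
import numpy as np


def gradient_magnitude_orientation(L):
    """Pixel-difference gradients of a smoothed image L (standard SIFT form, assumed)."""
    L = L.astype(float)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]  # L(x+1, y) - L(x-1, y)
    dy[1:-1, :] = L[2:, :] - L[:-2, :]  # L(x, y+1) - L(x, y-1)
    magnitude = np.sqrt(dx ** 2 + dy ** 2)
    theta = np.arctan2(dy, dx)          # orientation in radians, in [-pi, pi]
    return magnitude, theta


def orientation_histogram(magnitude, theta, bins=36):
    """Orientation histogram in which each sample is weighted by its gradient magnitude."""
    hist, _ = np.histogram(theta, bins=bins, range=(-np.pi, np.pi), weights=magnitude)
    return hist
```

The dominant bin of such a histogram is what gives a key point its reference orientation, so that the descriptor can be expressed relative to it.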

Entropy-based feature descriptor
Variable information is measured using entropy. Entropy is a statistical measure of randomness that characterizes the texture of the input image. Because entropy is a higher-order statistical feature, uncertain deviations in the face have only a minimal effect on it. The following steps describe the entropy-based feature descriptor:
Step 1: The volume of the image is evaluated with the V-SIFT formulation, which is expressed in matrix form as defined in Eq. (6).
Step 2: The information source is both memoryless and static. The volume of the structure in the EV-SIFT analysis is defined in Eq. (7), which is the probability function.
Step 3: The entropy is computed from the volume of the structure. The entropy calculation for the EV-SIFT process is given in Eq. (8): a high entropy $E(V)$ indicates that the volume comes from a uniform distribution, and a low entropy indicates that the volume comes from a varied distribution. Thus, $F_{D_i}$ denotes the final EV-SIFT descriptor obtained over the entire database.
The orientation and gradient magnitude, together with the entropy and the volume of the image, are sampled around the key point location, with the scale of the key point selecting the level of Gaussian blur of the image. The sample is an 8 × 8 neighbor window centered on the key point, which is split into 4 × 4 child windows. A gradient orientation histogram with eight bins is then formed for each child window. Each key point descriptor is thus a 4 × 4 array of histograms with eight bins each, which gives a feature vector of 4 × 4 × 8 = 128 dimensions.
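The two computations in this section, an entropy value and the 4 × 4 × 8 histogram layout, can be sketched as follows. This is only an illustration under stated assumptions: the Shannon entropy of a 16-bin histogram stands in for Eqs. (6) to (8), which are not reproduced here, and the descriptor assumes a 16 × 16 sample patch so that a 4 × 4 array of child windows arises. Both function names are hypothetical.

```python
import numpy as np


def shannon_entropy(values, bins=16):
    """Shannon entropy (in bits) of the histogram of `values` (assumed entropy form)."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))


def descriptor_128(magnitude, theta, cells=4, bins=8):
    """Concatenate one magnitude-weighted orientation histogram per child window."""
    size = magnitude.shape[0] // cells  # side length of a child window in pixels
    parts = []
    for r in range(cells):
        for c in range(cells):
            window = (slice(r * size, (r + 1) * size),
                      slice(c * size, (c + 1) * size))
            hist, _ = np.histogram(theta[window], bins=bins,
                                   range=(-np.pi, np.pi),
                                   weights=magnitude[window])
            parts.append(hist)
    return np.concatenate(parts)  # cells * cells * bins values
```

With the default arguments and a 16 × 16 patch, the returned vector has 4 × 4 × 8 = 128 entries. A value such as `shannon_entropy(patch)` could then be compared against a threshold to discard unreliable key points; the direction and value of that threshold are not specified in this section.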

Local binary pattern (LBP)
The LBP operator [1] is designed for texture description. It encodes the pixel-wise information in texture images by assigning a label to every pixel of the image. The label is obtained by thresholding the 3 × 3 neighborhood of each pixel against the center pixel value, and the result is a binary number. The basic LBP thresholding function $f_T(\cdot, \cdot)$ is defined in Eq. (9), where $Y_i$, $i = 1, \ldots, 8$, are the eight neighborhood points around $Y_0$, as shown in Figure 3. In other words, LBP is the concatenation of binary gradient directions, which is also known as a "micro pattern." Figure 4 illustrates an example of obtaining an LBP micro pattern when the threshold is set to 0. The resultant histogram of the micro patterns carries information about the distribution of edges, spots, and other local features present in the image. LBP has been observed to be a strong tool for face recognition. Compared with statistical learning approaches that require tuning many parameters, LBP is more efficient since it has an easy-to-formulate feature extraction process and a very simple matching strategy.
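A minimal sketch of the basic LBP operator is given below. It assumes a thresholding function that returns 1 when $Y_i - Y_0$ is at least the threshold and 0 otherwise, and it assumes a clockwise neighbor ordering starting at the top-left pixel; neither detail is fixed by the text, and the function names are hypothetical.

```python
import numpy as np

# Offsets (row, col) of the eight neighbours Y_1..Y_8 around the centre Y_0,
# clockwise from the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_image(image, threshold=0.0):
    """Label every interior pixel with its 8-bit LBP code."""
    image = image.astype(float)
    rows, cols = image.shape
    center = image[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbour = image[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # One bit per neighbour: 1 if the neighbour exceeds the centre by >= threshold.
        codes += ((neighbour - center) >= threshold).astype(np.int64) << bit
    return codes


def lbp_histogram(image, threshold=0.0):
    """256-bin histogram of the micro patterns in the image."""
    return np.bincount(lbp_image(image, threshold).ravel(), minlength=256)
```

Matching two faces then reduces to comparing their histograms, which is what makes the matching strategy simple.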

Center-symmetric local binary pattern (CSLBP)
CSLBP [1] was introduced for interest region description. It aims to produce fewer LBP labels and hence smaller histograms, which are better suited for use in region descriptors. Moreover, it is designed for better stability, especially in regions that include the face image. Here, pixel values are not compared with the center pixel; rather, opposing pixels are compared symmetrically with respect to the center pixel, as defined in Eq. (10), where $s_i$ and $s_{i+(S/2)}$ refer to the gray values of center-symmetric pairs of pixels among $S$ equally spaced pixels.
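The center-symmetric comparison can be sketched as follows for a neighborhood of $S = 8$ pixels, which yields four pairs and therefore 16 possible labels. The sketch assumes the usual CSLBP form, in which a bit is set when the difference of a center-symmetric pair exceeds a threshold $T$; the default $T = 0.01$ presumes gray values scaled to the range 0 to 1, and the neighbor ordering and function names are hypothetical.

```python
import numpy as np

# Eight neighbours, clockwise from the top-left pixel; entry i and entry i + 4
# form a centre-symmetric pair (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def cslbp_image(image, T=0.01):
    """Label every interior pixel with its 4-bit CSLBP code (S = 8)."""
    image = image.astype(float)
    rows, cols = image.shape
    codes = np.zeros((rows - 2, cols - 2), dtype=np.int64)
    for bit in range(4):
        (r1, c1), (r2, c2) = OFFSETS[bit], OFFSETS[bit + 4]
        s_i = image[1 + r1:rows - 1 + r1, 1 + c1:cols - 1 + c1]
        s_opp = image[1 + r2:rows - 1 + r2, 1 + c2:cols - 1 + c2]
        # Compare opposing pixels with each other, not with the centre pixel.
        codes += ((s_i - s_opp) > T).astype(np.int64) << bit
    return codes


def cslbp_histogram(image, T=0.01):
    """16-bin histogram of CSLBP labels."""
    return np.bincount(cslbp_image(image, T).ravel(), minlength=16)
```

The histogram has 16 bins instead of the 256 bins of basic LBP, which is the reduction in label count that motivates the operator.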
In this work, the threshold $T$ is 1% of the pixel value; $T$ is set to 0.01 since the data lie between 0 and 1. The size of the neighborhood is eight, as illustrated in Figure 5. From the CSLBP formulation, it is evident that CSLBP is related to the gradient operator, since it considers the gray-level differences between pairs of opposite pixels in the neighborhood. Thus, the CSLBP features combine the advantages of both LBP and gradient features.
[...] the bias weight to the $n$th hidden neuron, and $w^{(h)}_{jn}$ represents the hidden neuron's weight. The input is the dimensionally reduced feature vector from PCA, which is denoted as $f$; $A^{(h)}$ is the output of the hidden layer, defined in Eq. (11); and the nonlinear function is represented as $F(\cdot)$, where $N^{(I)}$ denotes the number of input neurons. $\hat{B}$ is the output of the network model, defined in Eq. (12), where $w^{(o)}_{bk}$ is the weight of the output bias to the $k$th layer, $w^{(o)}_{ik}$ is the output weight from the $i$th hidden neuron to the $k$th layer, and $N^{(h)}$ is the number of hidden neurons. The weight $w^*$ is chosen optimally by minimizing the objective function defined in Eq. (13), where $B$ denotes the actual output and $N^{(o)}$ is the number of output neurons. Here, the LM algorithm is used for training the NN model. The error function $EF(W)$ to be minimized is the sum of squared errors between the target output $B_T$ and the network model output $\hat{B}$, as defined in Eq. (14), where $W = (W_1, W_2, \ldots, W_N)$ contains all the weights of the network and $v$ is the error vector, which includes the errors of all the training samples. While training with the LM model, the weight increment $\Delta W$ is obtained as defined in Eq. (15), where $M$ is the Jacobian matrix and $\eta$ is the learning rate to be updated. The update of $\eta$ is done using $\alpha$, depending on the outcome.
Specifically, $\eta$ is multiplied by the decay rate $\alpha$ ($0 < \alpha < 1$) when $EF(W)$ decreases, whereas when $EF(W)$ increases, $\eta$ is divided by $\alpha$. The following pseudo-code shows the training process of LM.
Step 1: Initialize the weights $W$ and the parameter $\eta$ ($\eta \approx 0.01$).
Step 2: Compute the sum of squared errors $EF(W)$ over all the inputs.
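To make the training loop concrete, the following Python sketch trains a one-hidden-layer network with a Levenberg-Marquardt update. It is an illustration under several assumptions that do not come from the chapter: the update is taken in the standard form $\Delta W = -(M^T M + \eta I)^{-1} M^T v$, the hidden nonlinearity is tanh with a linear output, the Jacobian $M$ is approximated by finite differences, $\alpha = 0.1$, and a floor of 1e-10 on $\eta$ is added as a numerical safeguard. All function names are hypothetical.

```python
import numpy as np


def unpack(W, n_in, n_hid, n_out):
    """Split the flat weight vector W into layer weights and biases."""
    i = 0
    w_h = W[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b_h = W[i:i + n_hid]; i += n_hid
    w_o = W[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b_o = W[i:i + n_out]
    return w_h, b_h, w_o, b_o


def errors(W, f, B_T, shape):
    """Error vector v: network output minus target, over all training samples."""
    w_h, b_h, w_o, b_o = unpack(W, *shape)
    A = np.tanh(f @ w_h + b_h)  # hidden-layer output (tanh assumed)
    B_hat = A @ w_o + b_o       # network output (linear output assumed)
    return (B_hat - B_T).ravel()


def jacobian(W, f, B_T, shape, eps=1e-6):
    """Finite-difference approximation of the Jacobian M of v with respect to W."""
    v0 = errors(W, f, B_T, shape)
    M = np.empty((v0.size, W.size))
    for j in range(W.size):
        step = np.zeros_like(W)
        step[j] = eps
        M[:, j] = (errors(W + step, f, B_T, shape) - v0) / eps
    return M


def train_lm(f, B_T, n_hid=3, eta=0.01, alpha=0.1, epochs=20, seed=0):
    shape = (f.shape[1], n_hid, B_T.shape[1])
    n_w = shape[0] * n_hid + n_hid + n_hid * shape[2] + shape[2]
    W = np.random.default_rng(seed).normal(scale=0.5, size=n_w)  # Step 1
    v = errors(W, f, B_T, shape)
    ef = float(v @ v)                                            # Step 2: EF(W)
    history = [ef]
    for _ in range(epochs):
        M = jacobian(W, f, B_T, shape)
        delta = -np.linalg.solve(M.T @ M + eta * np.eye(n_w), M.T @ v)
        v_new = errors(W + delta, f, B_T, shape)
        ef_new = float(v_new @ v_new)
        if ef_new < ef:
            # Error decreased: accept the step and multiply eta by alpha.
            W, v, ef = W + delta, v_new, ef_new
            eta = max(eta * alpha, 1e-10)  # floor is an added safeguard
        else:
            # Error increased: reject the step and divide eta by alpha.
            eta /= alpha
        history.append(ef)
    return W, history
```

Calling `train_lm(f, B_T)` with `f` of shape (samples, features) and `B_T` of shape (samples, outputs) returns the trained weight vector and the error after each epoch. Because rejected steps leave the weights unchanged, the recorded error never increases.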

Granularity preprocessing
By dividing the face image into different regions, we obtain the vertical and horizontal face granules, as illustrated in Figure 6. The horizontal granules are denoted R1, R2, and R3, each of size 150 × 150/3. Similarly, the vertical granules are denoted R4, R5, and R6, each of size 150/3 × 150.
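The split into R1 to R6 can be sketched as below. The sketch assumes that the sizes above are given as width × height, so that the horizontal granules are full-width strips stacked from top to bottom and the vertical granules are full-height strips from left to right; the function name is hypothetical.

```python
import numpy as np


def horizontal_vertical_granules(image):
    """Split an image into three horizontal strips (R1-R3) and three vertical strips (R4-R6)."""
    rows, cols = image.shape[:2]
    h, w = rows // 3, cols // 3
    horizontal = [image[i * h:(i + 1) * h, :] for i in range(3)]  # R1, R2, R3
    vertical = [image[:, j * w:(j + 1) * w] for j in range(3)]    # R4, R5, R6
    return horizontal, vertical
```

For a 150 × 150 input, each horizontal granule is an array of shape (50, 150) and each vertical granule an array of shape (150, 50).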

Analysis on EV-SIFT
In this work, the EV-SIFT descriptor is used for feature extraction. Figure 7 illustrates the original images. For each original image, the corresponding vertical and horizontal edges were evaluated, as illustrated in Figures 8 and 9. The gradient magnitude of the images is shown in Figure 10, and the theta images of the given input images are illustrated in Figure 11. One important step is the evaluation of the image orientation at eight angles, 0, 45, 90, 135, 180, 225, 270, and 315°, in each image, which is shown in Figures 12-16. The resultant EV-SIFT contour of the input images is illustrated in Figure 17.

Learning performance of LM-NN
The performance of the LM-NN classifier is illustrated in Figure 18. The best performance of the classifier is attained at epoch 7, where the training performance is 0.00022204, the gradient is 7.0363e-08, Mu is 1e-10, and the validation fail count is 0 since no validation was carried out.

Comparative performance analysis of best-performing methods of proposed approaches
In the evaluation of the first research technique with LM-NN, it is observed that the proposed EV-SIFT technique attained better results in all the measures: accuracy, sensitivity, specificity, precision, false-positive rate (FPR), false-negative rate (FNR), negative predictive value (NPV), false discovery rate (FDR), F1 score (also F-score or F-measure, a measure of a test's accuracy), and Matthews correlation coefficient (MCC). The evaluation is summarized in Tables 1-3.
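For reference, the measures listed above can all be derived from the four counts of a binary confusion matrix. The sketch below uses the standard definitions, with NPV taken to mean negative predictive value; the function name and the example counts in the usage note are illustrative and are not results from the chapter.

```python
import math


def classification_measures(tp, tn, fp, fn):
    """Standard measures from true/false positive and negative counts.

    Assumes every denominator except the MCC one is nonzero.
    """
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "npv": tn / (tn + fn),
        "fdr": fp / (fp + tp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }
```

For example, `classification_measures(40, 45, 5, 10)` gives an accuracy of 0.85, a sensitivity of 0.8, and a specificity of 0.9.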
It is observed that the proposed V-SIFT with LM-NN outperforms the conventional methods for various plastic surgeries, as summarized in Table 2. For all the measures, the method attained better results, which also holds for the other types of plastic surgery. For the second technique, it is observed that the proposed EV-SIFT with LM-NN outperforms the conventional methods for various plastic surgeries, as summarized in Table 3. For all the measures, the method attained better results.

Conclusions
This chapter gives a detailed description of the second research technique. The feature descriptor EV-SIFT, which is used for feature extraction, is explained. Further, the LM-based NN classifier is defined in this chapter, and the performance of both the EV-SIFT descriptor and the LM-NN classifier is shown in the Result section. The effectiveness of EV-SIFT is demonstrated in this section, which shows how the images are distinguished from one another. The analysis of the LM-NN classifier is also satisfactory, with better performance.
Table 3. Proposed EV-SIFT with LM-NN of different plastic surgery faces.