Open access peer-reviewed chapter

Granular Approach for Recognizing Surgically Altered Face Images Using Keypoint Descriptors and Artificial Neural Network

By Archana Harsing Sable and Haricharan A. Dhirbasi

Submitted: December 31st 2018. Reviewed: February 28th 2019. Published: July 29th 2019.

DOI: 10.5772/intechopen.85550



This chapter presents a technique called entropy volume-based scale-invariant feature transform (EV-SIFT) for accurate face recognition after cosmetic surgery. The features compared are the key points and the volume of the Difference of Gaussian (DOG) structure at those points, for which the information rate is determined. Because entropy is a higher-order statistical feature, the extracted information is only minimally affected by uncertain changes in the face. The extracted EV-SIFT features are then provided to the support vector machine for classification. The normal scale-invariant feature transform (SIFT) extracts key points based on dissimilarity, also known as the contrast of the image, and the volume-based scale-invariant feature transform (V-SIFT) extracts key points based on the volume of the structure. The EV-SIFT method, however, provides both contrast and volume information. Thus, EV-SIFT performs better than principal component analysis (PCA), normal SIFT, and V-SIFT-based feature extraction. Since the artificial neural network (ANN) trained with Levenberg-Marquardt (LM) is a powerful computation tool for accurate classification, it is further used in this technique for better classification results.


  • face recognition
  • plastic surgery
  • scale-invariant feature transform (SIFT) feature
  • EV-SIFT feature
  • Levenberg-Marquardt-based neural network classifier (LM-NN)

1. Introduction

Human faces are multidimensional and complex visual stimuli that carry useful information about the uniqueness of a person. Recognizing faces for security and authentication purposes has taken a new turn in the current era of computer vision and image analysis, for example, in monitoring applications, image retrieval, man-machine interaction, and biometric authentication. Normally, a facial recognition system does not require touch or human interaction to complete the recognition process, which is one of its benefits over other recognition methods. Facial recognition can refer to the verification phase [1] or the identification phase [2]. In the verification phase, the correspondence between two faces is resolved. Many methods are available for facial recognition [3, 4, 5, 6, 7, 8], but their accuracy is not always high because of variations in lighting, facial expression, pose, aging, low-resolution input images, or facial markings [9, 10]. Several investigators have proposed face recognition methods that handle the effects of pose [11], illumination [12], low resolution [13], aging [14], or a combination thereof [15]. While these sources of uncertainty can be overcome, plastic surgery makes identifying a person considerably harder. The difficulty arises from missing or altered facial components, changes in skin texture and in the general appearance of the face, and changes in the geometric relationship between facial features [16, 17, 18]. Plastic surgery, now both affordable and sophisticated, has attracted people from all over the world. However, only a few research methodologies have been reported in the literature to address face recognition after plastic surgery. They include recognition by local region analysis [19] and a local form of cascade texture function (SLBT) with periocular features [20]. A review was also carried out in [21] to illustrate the use of multimodal features in plastic surgery face recognition.

1.1 Related works

De Marsico et al. [22] addressed the recognition of faces that have undergone cosmetic surgery with a region-based approach on a multimodal supervised architecture named Split Face Architecture (SFA). The authors demonstrated the advantage of their method by applying supervised SFA to conventional PCA and FDA, to LBP in its multiscale, rotation-invariant version with uniform patterns, to face analysis for commercial entities (FACE), and to face recognition against occlusions and expression variations (FARO).

Kohli et al. [23] presented a multiple projective dictionary learning (MPDL) framework, which does not need to compute norms, to recognize faces that have been modified by cosmetic surgery. Several projective dictionaries and compact binary face descriptors are used to learn local and global plastic surgery face representations, which helps distinguish plastic surgery faces from their original faces. Tests on the plastic surgery database resulted in an accuracy of about 97.96%.

Chude-Olisah et al. [24] addressed the degradation of facial recognition performance; they found that their approach outperformed previously available cosmetic surgery face recognition approaches, regardless of changes in lighting, facial expression, and other changes resulting from cosmetic surgery. Ouanan [25] introduced a HOG feature-based facial recognition approach, which uses HOG as a substitute for DOG in the scale-invariant feature transform. Ouloul [26] introduced a face recognition approach using SIFT features on RGBD images produced by Kinect; cameras of this kind are inexpensive and can be used in many settings and situations. Bhatt et al. [27] proposed a multi-objective granular evolutionary method for matching images taken before and after cosmetic surgery. The algorithm first generates overlapping face granules at three levels of granularity. Facial recognition after plastic surgery has undergone several developments in recent years, with contributions reported in the feature extraction phase, in the classification phase, or in both.

2. Granular approach for recognizing surgically altered face images using EV-SIFT and LM trained NN

The surgical face recognition technique is based on the granular approach and Laplacian sharpening, since sharpening the images enhances the cornerness and contrast of the image granules. Key point elimination is performed with an entropy threshold, because entropy is an effective selection criterion for eliminating unreliable interest points. Since the artificial neural network (ANN) trained with Levenberg-Marquardt (LM) is a powerful computation tool for accurate classification, it is used in this technique for better classification results. The architecture of the proposed face recognition technique is illustrated in Figure 1.

Figure 1.

Block diagram of the proposed granular approach for recognizing plastic surgery faces.

The testing image $I^T$ is initially preprocessed: it is cropped, resized, and formulated granularly. Then the local extrema of the preprocessed image $I_p^T$ are detected using the DOG scale space. The EV-SIFT descriptor is used to extract the features, and the NN classifier with LM is adopted for classification.

3. Preprocessing: granular and Laplacian sharpening

This is the initial process applied to the input image $I^T$, where the image is resized, cropped, and formulated. Two types of preprocessing are carried out, namely, Laplacian sharpening and granular processing.

3.1 Preprocessing I

The image $I^T$ from the database is cropped and resized to 150 × 150.

Laplacian operator: This is a derivative operator used to identify the edges in an image. The main difference between the Laplacian and other operators such as Sobel, Prewitt, Kirsch, and Robinson is that those operators are first-order derivative masks, whereas the Laplacian is a second-order derivative mask. The Laplacian mask comes in two variants:

  • Positive Laplacian operator

  • Negative Laplacian operator

A further difference is that the Laplacian does not extract edges in any particular direction. Instead, it extracts edges in two classes:

  • Inward edges

  • Outward edges

Positive Laplacian operator: This operator has a standard mask in which the center element is negative and the corner elements are zero. It is used to extract the outward edges in the image, as illustrated in Figure 2.

Figure 2.

Standard mask of (a) positive Laplacian operator and (b) negative Laplacian operator.

Negative Laplacian operator: This operator also has a standard mask, in which the center element is positive, the corner elements are zero, and the remaining mask elements are −1. It is used to extract the inward edges in the image, as illustrated in Figure 2.

Working strategy of Laplacian: This operator highlights gray-level discontinuities in the image and deemphasizes regions with slowly varying gray levels. The result is an image with grayish edge lines on a dark background, which contains both the inward and outward edges. Two rules govern the filter application. First, the positive and negative operators cannot both be applied to the same image; only one of them is used. Second, if the positive operator is applied, the resultant image is subtracted from the original image to get the sharpened image, whereas if the negative operator is applied, the resultant image is added to the original image.
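The subtract/add rule can be sketched in a few lines of Python with NumPy. This is only an illustration: the center values −4 and 4 are the common 4-neighbor Laplacian masks and are assumed here, since the exact masks of Figure 2 are not reproduced in the text, and the helper names `filter3x3` and `sharpen` are hypothetical.

```python
import numpy as np

# Assumed standard 4-neighbour masks: zero corners, centre of opposite sign.
POSITIVE_MASK = np.array([[0.0, 1.0, 0.0],
                          [1.0, -4.0, 1.0],
                          [0.0, 1.0, 0.0]])
NEGATIVE_MASK = -POSITIVE_MASK


def filter3x3(img, mask):
    """Apply a 3x3 mask to a 2D image, replicating the border pixels."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += mask[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def sharpen(img, mask_kind="positive"):
    """Subtract the positive-mask response, or add the negative-mask response."""
    img = np.asarray(img, dtype=float)
    if mask_kind == "positive":
        return img - filter3x3(img, POSITIVE_MASK)
    return img + filter3x3(img, NEGATIVE_MASK)
```

With these masks both routes produce the same sharpened image, which is why only one of the two operators needs to be applied.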

3.2 Preprocessing II

This is the foremost process of the developed model. Consider $I$ as the given plastic surgery face image of size $n \times m$. The face granules are formed at three levels of granularity. The first level provides global information at multiple resolutions. The second level provides inner and outer information from the face. Local facial features play a leading role in face recognition, and therefore the third level extracts the local facial features. The three levels of granularity are explained below:

First level of granularity: In this level, the face granules are generated by applying the Gaussian and Laplacian operators. The Gaussian operator gives a series of images low-pass filtered with a 2D Gaussian kernel, whereas the Laplacian operator gives a sequence of band-pass images. Consider $I_g^G$ as the granules resulting from the Gaussian and Laplacian operators, where $g$ denotes the granule number. If the face image is of size 196 × 224, the output is a pyramid of six granules, $I_{g1}^G$ to $I_{g6}^G$, each at a higher or lower resolution. In the six granules, the facial features are separated at different resolutions to provide the blurriness, smoothness, edge information, and noise present in $I$. Hence, this level compensates for variations caused by alteration of the face texture, such as skin resurfacing, dermabrasion, and facelift.
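As a rough sketch of the first level, the following Python code builds a small pyramid with NumPy. It assumes that the six granules are three Gaussian (low-pass) levels plus three Laplacian (band-pass) levels, that each band-pass image is the difference between a level and its blurred version, and that the resolution is halved between levels; the text does not state these details, and the function names and `sigma` value are hypothetical.

```python
import numpy as np


def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian low-pass filter with reflected borders."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="reflect")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def level_one_granules(img, levels=3, sigma=1.0):
    """Return `levels` Gaussian granules followed by `levels` Laplacian granules."""
    gaussians, laplacians = [], []
    current = np.asarray(img, dtype=float)
    for _ in range(levels):
        blurred = gaussian_blur(current, sigma)
        gaussians.append(blurred)              # low-pass granule
        laplacians.append(current - blurred)   # band-pass granule
        current = blurred[::2, ::2]            # halve the resolution
    return gaussians + laplacians
```

For a 224-row by 196-column image, the three Gaussian granules have shapes (224, 196), (112, 98), and (56, 49), and each Laplacian granule matches the shape of its level.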

Second level of granularity: In this level, the face image $I$ is divided into different regions to get the horizontal granules $I_{g7}^G$ to $I_{g15}^G$ and the vertical granules $I_{g16}^G$ to $I_{g24}^G$. The size of the first three granules is $n \times m/3$. Of the next three granules, $I_{g10}^G$ and $I_{g12}^G$ are of size $n \times (m/3 - \epsilon)$, and $I_{g11}^G$ is of size $n \times (m/3 + 2\epsilon)$, with an offset $\epsilon$. Further, $n \times (m/3 + \epsilon)$ is the size of $I_{g13}^G$ and $I_{g15}^G$, and $n \times (m/3 - 2\epsilon)$ is the size of $I_{g14}^G$. The vertical granules are generated in the same manner. In this way, the second level captures the variations in both the inner and the outer facial regions. The variations in the chin, cheeks, ears, and forehead are captured through the relations between vertical and horizontal granules.

Third level of granularity: Humans generally recognize individuals by their local face regions, such as the eyes, mouth, and nose. This level exploits that property by extracting the local facial regions and using them as granules. Given the eye coordinates and a golden ratio face template, 16 local facial regions can be extracted. Each region provides local information that captures the deviations due to plastic surgery. This granular preprocessing provides flexibility to deviations in both the inner and outer facial regions. It uses the relation between horizontal and vertical granules to capture the deviations in the cheeks, chin, forehead, and ears caused by plastic surgery.

4. EV-SIFT, local binary pattern (LBP), and center-symmetric local binary pattern (CSLBP)


Consider the face image $F_j$ and the database $I_i^D$, where $i = 1, 2, \ldots, N_D$ and $j = 1, 2, \ldots, N_S$, subject to a condition relating $F_j$ and $I_i^D$; the database size is given as $M \times N$. The preprocessing phase begins with resizing the image. The resizing model is defined in Eq. (1), where $S_M$ and $S_N$ denote the scaled number of columns and rows:


In Eq. (1), $1 \le u \le M$ and $1 \le v \le N$, $0 \le m_r \le M_r - 1$ and $0 \le n_r \le N_r - 1$, $M_r \times N_r$ is the size of the resized image, and the rounding operator returns the nearest integer:


4.1.1 Acquisition of the EV-SIFT key points

Choosing the key points in the difference of Gaussian function is the vital step. The parameters of a key point depend purely on the distribution of the image gradients around it. Thus, both the orientation and the gradient magnitude are computed, which provides invariance to image rotation. The computation is defined in Eqs. (4) and (5), where $\theta(x, y)$ denotes the orientation of the key point, computed together with the gradient magnitude, and $L(x, y)$ refers to the image sample:


The scale used for $L$ is the respective scale of each key point. Further, an orientation histogram is obtained from the gradient operation on the sample points.
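A minimal Python sketch of the gradient magnitude and orientation follows. It assumes the pixel-difference form used in standard SIFT (central differences of the smoothed image $L$), which may differ in detail from the chapter's Eqs. (4) and (5); the function name is hypothetical.

```python
import numpy as np


def gradient_magnitude_orientation(L):
    """Central-difference gradient magnitude and orientation of a smoothed image.

    Assumed standard SIFT form:
      dx = L(x+1, y) - L(x-1, y),  dy = L(x, y+1) - L(x, y-1)
      magnitude = sqrt(dx^2 + dy^2),  theta = atan2(dy, dx)
    Border pixels are left at zero.
    """
    L = np.asarray(L, dtype=float)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]   # difference along columns (x)
    dy[1:-1, :] = L[2:, :] - L[:-2, :]   # difference along rows (y)
    magnitude = np.hypot(dx, dy)
    theta = np.arctan2(dy, dx)
    return magnitude, theta
```

On an image whose intensity increases by one per column, the interior magnitude is 2 and the orientation is 0; on a ramp along the rows, the orientation is π/2.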

4.1.2 Entropy-based feature descriptor

The changeable information is measured using entropy, a statistical measure of randomness that characterizes the texture of the input image. Because entropy is a higher-order statistical feature, uncertain deviations in the face have only a minimal effect on it. The following steps describe the entropy-based feature descriptor:

Step 1: The volume of the image is evaluated with the aid of V-SIFT formulation, which is determined in the matrix form as defined in Eq. (6):


Step 2: The information source is both memoryless and static. The volume of the structure in EV-SIFT analysis is defined in Eq. (7), which is the probability function:


Step 3: The entropy is computed from the volume of the structure. The entropy calculation for the EV-SIFT process is given in Eq. (8): if the entropy $E_V$ is high, the volume comes from an unvarying (uniform) distribution, and if it is low, the volume comes from a varied distribution. Thus, $F_i^D$ describes the entire database for which the final EV-SIFT descriptor is obtained:
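As a rough illustration of Step 3, the sketch below assumes that the entropy is the Shannon entropy of the volume values normalized into a probability distribution; the chapter's Eq. (8) may differ. The helper `keep_by_entropy`, and the choice to keep key points whose entropy is at or above the threshold, are hypothetical and serve only to show how an entropy threshold could be applied.

```python
import numpy as np


def volume_entropy(volumes):
    """Shannon entropy (in bits) of volumes normalised to a probability distribution."""
    v = np.asarray(volumes, dtype=float)
    p = v / v.sum()
    p = p[p > 0]                      # 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


def keep_by_entropy(keypoint_volumes, threshold):
    """Indices of key points whose volume entropy reaches the threshold (assumed rule)."""
    return [i for i, v in enumerate(keypoint_volumes)
            if volume_entropy(v) >= threshold]
```

Four equal volumes give the maximum entropy of 2 bits, while a single nonzero volume gives zero entropy, which matches the high/low interpretation in Step 3.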


The level of Gaussian blur of the image is selected by the orientation and gradient magnitude together with the entropy, and the volume of the image is sampled at the scale of the key point around the key point location. The sample is an 8 × 8 neighbor window centered on the key point, which is split into 4 × 4 child windows. A gradient orientation histogram with eight bins is formed for each child window. Each key point descriptor thus consists of a 4 × 4 array of histograms with eight bins each, giving a feature vector of 4 × 4 × 8 = 128 dimensions.
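The 4 × 4 × 8 layout can be sketched as follows. The code assumes the standard SIFT arrangement of a 16 × 16 pixel window split into a 4 × 4 grid of 4 × 4 pixel cells, which is the layout that yields 128 values; it omits the Gaussian weighting and bin interpolation of full SIFT as well as the entropy and volume terms of EV-SIFT. The function name and parameters are hypothetical.

```python
import numpy as np


def sift_like_descriptor(magnitude, theta, cy, cx, cells=4, cell_size=4, bins=8):
    """Build a cells x cells x bins orientation-histogram descriptor around (cy, cx)."""
    half = cells * cell_size // 2
    mag = magnitude[cy - half:cy + half, cx - half:cx + half]
    ang = theta[cy - half:cy + half, cx - half:cx + half] % (2 * np.pi)
    # Map each angle in [0, 2*pi) to one of `bins` orientation bins.
    bin_idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    desc = np.zeros((cells, cells, bins))
    for r in range(cells):
        for c in range(cells):
            rs, cs = r * cell_size, c * cell_size
            m = mag[rs:rs + cell_size, cs:cs + cell_size].ravel()
            b = bin_idx[rs:rs + cell_size, cs:cs + cell_size].ravel()
            # Magnitude-weighted orientation histogram of this cell.
            desc[r, c] = np.bincount(b, weights=m, minlength=bins)
    vec = desc.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```

The key point must lie at least 8 pixels from the image border for the window to fit.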

4.2 Local binary pattern (LBP)

The LBP [1] operator is designed for texture description. It encodes pixel-wise information in texture images by assigning a label to every pixel. The label is obtained by thresholding the 3 × 3 neighborhood of each pixel against the center pixel value, which yields a binary number. The basic LBP thresholding function is given in Eq. (9), where $Y_i$, $i = 1, \ldots, 8$, are the eight neighborhood points around $Y_0$, as shown in Figure 3. In other words, LBP is the concatenation of binary gradient directions, also known as a “micro pattern”:

Figure 3.

Example of eight neighborhoods around Y0.


Figure 4 illustrates how an LBP micro pattern is obtained when the threshold is set to 0. The histogram of the micro patterns carries information about the distribution of edges, spots, and other local features present in the image. LBP is observed to be an effective tool for face recognition. Compared with statistical learning approaches that require tuning many parameters, LBP is efficient because its feature extraction process is easy to formulate and its matching strategy is very simple.

Figure 4.

An example for LBP micro pattern for a given region.
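The basic LBP labeling can be sketched in Python as shown below. The neighbor ordering (clockwise from the top-left pixel) and the use of `>=` against a threshold of 0 are assumptions, since Figure 3 and Eq. (9) are not reproduced in the text; the function names are hypothetical.

```python
import numpy as np

# Assumed neighbour order: clockwise, starting at the top-left pixel.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """8-bit LBP label of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    code = 0
    for bit, (r, c) in enumerate(RING):
        if patch[r][c] - center >= 0:   # threshold at 0
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP labels over all interior pixels."""
    img = np.asarray(img)
    h, w = img.shape
    codes = [lbp_code(img[y - 1:y + 2, x - 1:x + 2])
             for y in range(1, h - 1) for x in range(1, w - 1)]
    return np.bincount(codes, minlength=256)
```

For the patch `[[6, 5, 2], [7, 6, 1], [9, 8, 7]]`, the neighbors 6, 7, 8, 9, and 7 are at least as large as the center value 6, so bits 0, 4, 5, 6, and 7 are set and the label is 241.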

4.3 Center-symmetric local binary pattern (CSLBP)

CSLBP [1] was developed for interest region description. It aims to produce fewer LBP labels and therefore shorter histograms, which are well suited for use in region descriptors. Moreover, it is designed for better stability, especially in regions that include the face image. Here, pixel values are not compared with the center pixel; instead, opposing pixels are compared symmetrically with respect to the center pixel, as defined in Eq. (10):


where $s_i$ and $s_{i+S/2}$ refer to the gray values of center-symmetric pairs of $S$ equally spaced pixels.

In this work, the threshold $T$ is 1% of the pixel value range; $T$ is set to 0.01 since the data lie between 0 and 1. The size of the neighborhood is eight, as illustrated in Figure 5. From the CSLBP formulation, it is evident that CSLBP is related to the gradient operator, since it considers the gray-level ($G$) differences between pairs of opposite pixels in the neighborhood. Thus, the CSLBP features combine the advantages of both LBP and gradient features. CSLBP generates 16 different binary patterns. The feature vector of every key point is generated by concatenating the 128-dimensional descriptor with the LBP (256-dimensional) or CSLBP (16-dimensional) descriptor. The feature vectors are then reduced to 25 dimensions by evaluating the covariance matrix for PCA, from which the top 25 eigenvectors are chosen for description.

Figure 5.

CSLBP establishment.
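The center-symmetric comparison can be sketched as follows, assuming the usual CS-LBP form in which each of the four opposing pairs contributes one bit when its gray-level difference exceeds $T$. The neighbor ordering and the function name are assumptions.

```python
import numpy as np

# Assumed neighbour order: clockwise from the top-left pixel, so that
# positions i and i + 4 are centre-symmetric.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def cslbp_code(patch, T=0.01):
    """4-bit CSLBP label of a 3x3 patch with values scaled to [0, 1]."""
    values = [patch[r][c] for r, c in RING]
    code = 0
    for i in range(4):
        if values[i] - values[i + 4] > T:
            code |= 1 << i
    return code
```

Four comparisons give 2^4 = 16 possible labels, which agrees with the 16 binary patterns and the 16-dimensional histogram mentioned in the text.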

5. Recognition system: Levenberg-Marquardt-based neural network classifier (LM-NN)

In this work, the LM-NN classifier is used for recognition. The NN model is represented in Eqs. (11)–(13), where $n$ denotes the hidden neuron, $w_{bn}^{h}$ refers to the bias weight of the $n$th hidden neuron, and $w_{jn}^{h}$ represents the hidden neuron’s weight. The input is the dimensionally reduced feature from PCA, denoted as $f$; $A^h$ is the output of the hidden layer defined in Eq. (11); and the nonlinear function is represented as $F$:


where $N_I$ denotes the count of input neurons. $\hat{B}$ is the output of the network model defined in Eq. (12), where $w_{bk}^{o}$ is the output bias weight to the $k$th layer, $w_{ik}^{o}$ is the output weight from the $i$th hidden neuron to the $k$th layer, and $N_h$ is the count of hidden neurons. The weight $w$ is optimally chosen by minimizing the objective function defined in Eq. (13), where $B$ denotes the actual output and $N_o$ is the number of output neurons:

$$w = \arg\min_{\{w_{bi}^{h},\, w_{ji}^{h},\, w_{bk}^{o},\, w_{ik}^{o}\}} \sum_{k=1}^{N_o} \left| B - \hat{B} \right| \qquad (13)$$

Here, the LM algorithm is used for training the NN model. The error function $E_F(W)$ to be minimized is the sum of squared errors between the target output $B_T$ and the network model output $\hat{B}$, as defined in Eq. (14):


where $W = [W_1, W_2, \ldots, W_N]$ contains all the weights of the network and $v$ is the error vector, which includes the errors of all the training samples. While training with the LM model, the weight increment $\Delta W$ is obtained as defined in Eq. (15):


where $M$ is the Jacobian matrix and $\eta$ is the learning rate to be updated. $\eta$ is updated using a decay rate $\alpha$, $0 < \alpha < 1$, depending on the outcome: $\eta$ is multiplied by $\alpha$ when $E_F(W)$ decreases, whereas $\eta$ is divided by $\alpha$ when $E_F(W)$ increases. The following pseudo-code shows the LM training process.

Step 1 Initialize the weights $W$ and the parameter $\eta$ ($\eta$ = 0.01, approx.)

Step 2 Compute the sum of squared errors $E_F(W)$ over all the inputs.

Step 3 Compute the weight increment $\Delta W$ using Eq. (15)

Step 4 Recompute $E_F(W)$

Step 5 Use $W + \Delta W$ as the trial $W$ and evaluate

     If trial $E_F(W)$ < $E_F(W)$ in step 2 then

         $W = W + \Delta W$

         $\eta = \eta\alpha$ ($\alpha$ = 0.01)

         Back to step 2

     Else

         $\eta = \eta / \alpha$

         Back to step 4

     End if
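The training steps above can be sketched in Python for a small one-hidden-layer network. Several points are assumptions rather than the chapter's implementation: the increment uses the standard LM form $\Delta W = -(M^{T}M + \eta I)^{-1} M^{T} v$ with a finite-difference Jacobian, the hidden activation is `tanh`, $\eta$ is floored at 1e-10 and capped at 1e10 as numerical safeguards, and the toy regression data and all function names are hypothetical.

```python
import numpy as np


def predict(w, X, n_hidden):
    """One hidden tanh layer and a linear output, with weights packed in vector w."""
    n_in = X.shape[1]
    i = n_in * n_hidden
    W1 = w[:i].reshape(n_in, n_hidden)
    b1 = w[i:i + n_hidden]
    W2 = w[i + n_hidden:i + 2 * n_hidden]
    b2 = w[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2


def errors(w, X, y, n_hidden):
    """Error vector v: network output minus target for every training sample."""
    return predict(w, X, n_hidden) - y


def jacobian(w, X, y, n_hidden, eps=1e-6):
    """Forward-difference Jacobian of the error vector with respect to the weights."""
    base = errors(w, X, y, n_hidden)
    J = np.empty((base.size, w.size))
    for k in range(w.size):
        shifted = w.copy()
        shifted[k] += eps
        J[:, k] = (errors(shifted, X, y, n_hidden) - base) / eps
    return J


def train_lm(X, y, n_hidden=3, eta=0.01, alpha=0.01, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=X.shape[1] * n_hidden + 2 * n_hidden + 1)
    v = errors(w, X, y, n_hidden)
    sse = float(v @ v)                       # Step 2: sum of squared errors
    history = [sse]
    for _ in range(epochs):
        J = jacobian(w, X, y, n_hidden)
        while eta < 1e10:
            A = J.T @ J + eta * np.eye(w.size)
            dw = -np.linalg.lstsq(A, J.T @ v, rcond=None)[0]   # Step 3
            trial_v = errors(w + dw, X, y, n_hidden)            # Steps 4-5
            trial_sse = float(trial_v @ trial_v)
            if trial_sse < sse:              # accept the step and decay eta
                w, v, sse = w + dw, trial_v, trial_sse
                eta = max(eta * alpha, 1e-10)
                break
            eta /= alpha                     # reject the step and raise eta
        else:
            break                            # no improving step was found
        history.append(sse)
    return w, history


# Toy regression problem (hypothetical data, for illustration only).
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-1, 1, size=(40, 2))
y = np.sin(np.pi * X[:, 0]) * X[:, 1]
w, history = train_lm(X, y)
```

Because a step is accepted only when it lowers the error, the recorded `history` never increases from one epoch to the next.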

6. Results and discussion

6.1 Experimental setup

The cosmetic surgery face recognition experiments are conducted in MATLAB 2015a. The database, which includes presurgery and postsurgery faces, is downloaded from [source link not preserved]. The experiments are performed for different plastic surgery faces. The database contains 460 plastic surgery faces in total: 68 images from blepharoplasty (eyelid surgery), 51 images from brow lift (forehead surgery), 51 images from liposhaving (facial sculpturing), 17 images from malar augmentation (cheek implant), 18 images from mentoplasty (chin surgery), 54 images from otoplasty (ear surgery), 75 images from rhinoplasty (nose surgery), 74 images from rhytidectomy (facelift), and 52 images from skin peeling (skin resurfacing).

6.2 Granularity preprocessing

Dividing the face image into regions yields the vertical and horizontal face granules illustrated in Figure 6. The horizontal granules are denoted R1, R2, and R3, each of size 150 × 150/3. Similarly, the vertical granules are denoted R4, R5, and R6, each of size 150/3 × 150.

Figure 6.

Computation of granular images 1, 2, and 3. (a) Horizontal granules and (b) vertical granules.
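The split into R1 to R6 can be sketched with NumPy as follows. The function name is hypothetical, and the code assumes that horizontal granules are strips stacked from top to bottom and vertical granules are strips placed from left to right; NumPy reports shapes as (rows, columns), which may be the reverse of the width-by-height notation used in the text.

```python
import numpy as np


def split_granules(img):
    """Split an image into three horizontal strips and three vertical strips."""
    horizontal = np.array_split(img, 3, axis=0)   # R1, R2, R3: top to bottom
    vertical = np.array_split(img, 3, axis=1)     # R4, R5, R6: left to right
    return horizontal, vertical
```

For a 150 × 150 image, each horizontal strip has shape (50, 150) and each vertical strip has shape (150, 50).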

6.3 Analysis on EV-SIFT

In this work, EV-SIFT descriptor is used for the feature extraction. Figure 7 illustrates the original images. For each original image, the corresponding vertical edge and horizontal edge of the image were evaluated, and it is illustrated in Figures 8 and 9. The gradient magnitude of the images is also shown in Figure 10. Similarly, the theta images of the given input images are illustrated in Figure 11.

Figure 7.

Original images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

Figure 8.

Vertical edge of the given images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

Figure 9.

Horizontal edge of the given images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

Figure 10.

Gradient magnitude of the images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

Figure 11.

Theta representation of the images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

One of the important processes is the evaluation of the image orientation at eight angles, 0, 45, 90, 135, 180, 225, 270, and 315°, in each image, as shown in Figures 12–16. The resultant EV-SIFT contour of the input images is illustrated in Figure 17.

Figure 12.

Image orientation of eight angles (image 1): (a) 0°, (b) 45°, (c) 90°, (d) 135°, (e) 180°, (f) 225°, (g) 270°, and (h) 315°.

Figure 13.

Image orientation of eight angles (image 2): (a) 0°, (b) 45°, (c) 90°, (d) 135°, (e) 180°, (f) 225°, (g) 270°, and (h) 315°.

Figure 14.

Image orientation of eight angles (image 3): (a) 0°, (b) 45°, (c) 90°, (d) 135°, (e) 180°, (f) 225°, (g) 270°, and (h) 315°.

Figure 15.

Image orientation of eight angles (image 4): (a) 0°, (b) 45°, (c) 90°, (d) 135°, (e) 180°, (f) 225°, (g) 270°, and (h) 315°.

Figure 16.

Image orientation of eight angles (image 5): (a) 0°, (b) 45°, (c) 90°, (d) 135°, (e) 180°, (f) 225°, (g) 270°, and (h) 315°.

Figure 17.

EV-SIFT contour of images: (a) image 1, (b) image 2, (c) image 3, (d) image 4, and (e) image 5.

6.4 Learning performance of LM-NN

The performance of the LM-NN classifier is illustrated in Figure 18. The best performance of the classifier is attained at epoch 7, where the training performance is 0.00022204, the gradient is 7.0363e-08, Mu is 1e-10, and the validation fail count is 0 since no validation was attained.

Figure 18.

Performance of LM-NN classifier in correspondence with (a) validation performance and (b) gradient, Mu, and validation fails.

6.5 Comparative performance analysis of best-performing methods of proposed approaches

In the analysis of the first research technique, evaluated with LM-NN, it is observed that the proposed EV-SIFT technique attained better results on all the measures, namely, accuracy, sensitivity, specificity, precision, false-positive rate (FPR), false-negative rate (FNR), negative predictive value (NPV), false discovery rate (FDR), F1 score (also F-score or F-measure), which is a measure of a test’s accuracy, and Matthews correlation coefficient (MCC). The evaluation is summarized in Tables 1–3.

Measures       Surgery               Attained result
Sensitivity    Malar augmentation    0.24
Precision      Skin peeling          0.04
FNR            Malar augmentation    0.76
FDR            Skin peeling          0.96
F1 score       Skin peeling          0.06
MCC            Skin peeling          0.04

Table 1.

Proposed SIFT with LM-NN of different plastic surgery faces.

Measures       Surgery               Attained result
Precision      Skin peeling          0.03
FDR            Skin peeling          0.97
F1 score       Skin peeling          0.05
MCC            Skin peeling          0.03

Table 2.

Proposed V-SIFT with LM-NN of different plastic surgery faces.

Measures       Surgery               Attained result
Sensitivity    Brow lift             0.17
Precision      Skin peeling          0.04
FNR            Malar augmentation    0.83
FDR            Skin peeling          0.96
F1 score       Skin peeling          0.06
MCC            Skin peeling          0.04

Table 3.

Proposed EV-SIFT with LM-NN of different plastic surgery faces.

The proposed V-SIFT with LM-NN achieves more than the conventional methods for various plastic surgeries, as summarized in Table 2. The method attains better results for all the measures, which also extends to the other types of plastic surgery.

For the second technique, the proposed EV-SIFT with LM-NN achieves more than the conventional methods for various plastic surgeries, as summarized in Table 3. The method attains better results for all the measures.
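The measures reported in Tables 1–3 follow from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the convention of returning 0 when a denominator is zero are assumptions, and the counts in the usage note are made up for illustration rather than taken from the experiments.

```python
import math


def classification_measures(tp, tn, fp, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0   # assumed convention for empty classes

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "FPR": ratio(fp, fp + tn),
        "FNR": ratio(fn, fn + tp),
        "NPV": ratio(tn, tn + fn),
        "FDR": ratio(fp, fp + tp),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }
```

For example, `classification_measures(tp=40, tn=30, fp=20, fn=10)` gives a sensitivity of 0.8 and an FNR of 0.2. Sensitivity and FNR always sum to 1 for the same class, as do precision and FDR, which is the pattern visible in the rows of Table 1.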

7. Conclusions

This chapter gives a detailed description of the second research technique. The EV-SIFT feature descriptor used for feature extraction is explained, and the LM-based NN classifier is defined. The performance of both EV-SIFT and the LM-NN classifier is shown in the Results section, which demonstrates how EV-SIFT distinguishes between the images. The analysis of the LM-NN classifier is also satisfactory, with better performance.


To begin with, I express gratitude to God Shri Gajanan Maharaj, who provided me the potency as well as the ability to bring out this research. Additionally, I express my gratefulness to supervisor, Prof. S.N. Talbar, for his inspiration, priceless suggestion, and support along with concentration during exploration of research. I express thanks to every friend for giving out their experience and information. I also want to give an exceptional thanks to my companion for his honest advice and steady support to do a high-quality research.

How to cite and reference

Archana Harsing Sable and Haricharan A. Dhirbasi (July 29th 2019). Granular Approach for Recognizing Surgically Altered Face Images Using Keypoint Descriptors and Artificial Neural Network, Visual Object Tracking with Deep Neural Networks, Pier Luigi Mazzeo, Srinivasan Ramakrishnan and Paolo Spagnolo, IntechOpen, DOI: 10.5772/intechopen.85550. Available from:
