This work proposes the HS-ab technique for detecting and tracking skin regions in real time. First, two techniques for modeling skin color in images are analyzed; they combine the HSV color space with YCbCr and with CIELab, respectively. The pixel intervals are defined as follows: non-skin color is identified with the H and S components, whereas skin color is identified with the Cb, Cr, a, and b components. The results show that the HS-ab technique outperforms the HS-CbCr technique in the precision of skin color detection, according to the percentages of C (34.8%) and CDR (67%). Next, morphological operations are applied to clean up the segmented images, and skin regions are detected with blob extraction and contour detection. Finally, skin color is tracked by calculating the moments and positions in each frame to obtain the trajectory of the skin regions. The purpose of the work is to design an easy-to-use computer vision system that facilitates the early rehabilitation of patients before they are clinically ready to be fitted with a prosthesis.
- color spaces
The loss of a limb, such as an arm, a hand, or a leg, is one of the most devastating events that can happen to a person. In some cases, an amputation is required to reduce disability, remove a nonfunctional limb, or save a life. The treatment of an amputee includes not only the surgery but also the restoration of function and the fitting of an artificial limb. The treatment must be considered a continuous, dynamic process that begins at the time of injury and continues until the patient has reached the maximum usefulness of his or her prosthesis and is able to perform the essential activities of daily living. The treatment can be divided into two stages: the preprosthetic stage and the prosthetic stage. The first is rehabilitation before using the prosthesis, and the second is rehabilitation with the prosthesis. Both stages consist of physical exercises to recover muscular strength, functional independence of the limb, and mobility of the residual limb. This rehabilitation treatment can be supported by computer vision systems such as recognition systems. This project develops a skin color recognition system that can be used with the residual limb and with a skin-colored prosthesis.
Today, recognition systems are increasingly regarded as a useful tool for the study, assessment, and rehabilitation of functional abilities. Such systems can offer innovative and engaging ways to rehabilitate, making the treatment more enjoyable and, therefore, increasing the motivation of patients. In addition, these systems can create a dynamic stimulus environment that elicits an active behavioral response from patients, which cannot be obtained from traditional therapies. Their applications focus on cognitive processes, including attention, spatial abilities, and memory. Application scenarios are designed to teach basic activities of daily living such as common object recognition, meal preparation, and shopping. These applications give patients feedback when they continue their exercise program at home.
The aim of this work is to develop applications focused on the rehabilitation of patients with amputation. This document presents the first stage of the project, which is to detect and track the areas of skin on the human body; the concepts that make up a recognition system are therefore described below.
Recognition systems for face detection, body detection, hand tracking, and similar tasks use skin detection as their main step. Color image segmentation is useful in many applications, mainly for human-computer interaction (HCI) [4, 5]. The segmentation results make it possible to identify regions of interest in different scenes, which provides the information the application needs. In this work, the interest is skin color, so the images are segmented by skin color. Skin color information is useful for detecting, locating, and tracking parts of the human body; it also allows fast processing and makes the application more robust. A skin detection and tracking system consists of the following steps: choosing an appropriate color space for skin color detection, extracting skin color with a model, cleaning up the skin regions, and following the movements of the skin regions. Each step is described below.
1.1. Color spaces
The aim of skin color pixel classification is to determine whether a color pixel is skin color or non-skin color. A useful classifier should recognize all skin types (yellowish, brownish, whitish, etc.) and cope with different lighting conditions. The classification uses only the chrominance of the pixels, because skin segmentation is expected to be more robust to lighting variations when the luminance of the pixels is discarded. This section describes four color spaces that are commonly used in skin segmentation:
RGB: colors are specified in three primary colors red (R), green (G), and blue (B). The advantage is simplicity but it does not separate luminance and chrominance.
HSV: also known as the HSB (B for brightness) color space. The hue component (H) defines the dominant color of an area (such as red, green, purple, or yellow) and varies from 0 to 1; saturation (S) measures the colorfulness of an area in proportion to its brightness; and value (V) is related to the color luminance, that is, it corresponds to brightness and varies from 0 to 1. The HSV color space is computed using Eqs. (1)–(3). This model gives poor results where the brightness is very low. This color space separates luminance from chrominance. Other similar color spaces are hue, saturation, and intensity (HSI) and hue, saturation, and lightness (HSL).
E1 E2 E3
YCbCr: the Y, Cb, and Cr components refer to luminance, blue chrominance, and red chrominance. It provides excellent separability of luminance and chrominance. The simplicity of the transform in Eqs. (4)–(6) and the explicit separation of the luminance and chrominance components make this color space very attractive for skin color detection.
E4 E5 E6
CIELab: it is a perceptually uniform color space that was proposed by G. Wyszecki and standardized by Commission Internationale de L’Eclairage (CIE). It separates a luminance variable L from two perceptually uniform chromaticity variables a and b.
In this document, the HSV, YCbCr, and CIELab color spaces are used for skin color segmentation because they separate the chrominance and luminance components, which makes it possible to characterize the different skin colors.
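As an illustration of discarding luminance, the following minimal sketch converts an 8-bit RGB pixel to its (H, S) and (Cb, Cr) chrominance pairs. It assumes the standard-library HSV conversion and the full-range BT.601 (JFIF) YCbCr transform; these may differ in detail from Eqs. (1)–(6) of this chapter. The function names and the sample pixel are hypothetical.

```python
import colorsys


def rgb_to_hs(r, g, b):
    """Return (H, S), each in [0, 1], for an 8-bit RGB pixel; V is discarded."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h, s


def rgb_to_cbcr(r, g, b):
    """Return (Cb, Cr) for an 8-bit RGB pixel; Y is discarded.

    Assumption: full-range BT.601 coefficients with a 128 offset.
    """
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


pixel = (224, 172, 105)  # hypothetical RGB sample
print("H, S  :", rgb_to_hs(*pixel))
print("Cb, Cr:", rgb_to_cbcr(*pixel))
```

A neutral gray pixel maps to Cb = Cr = 128 and S = 0 under these formulas, which shows that the retained components carry color information only.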
1.2. Skin color segmentation
Part of the segmentation process is to construct a decision rule that discriminates between the pixels of an image that correspond to skin color and those that correspond to non-skin color. This can be done by nonparametric modeling, such as a Bayesian classifier or histogram-based thresholding, or by parametric modeling, such as the Gaussian model (GM) and the Gaussian mixture model (GMM). This chapter uses histogram-based thresholding for skin color segmentation. This technique is fast and practical to train, and it is theoretically independent of the shape of the skin distribution.
The method uses an approximate estimate to define the interval of a color space that corresponds to the skin color appearing in the training images, see Eq. (7).

$$P_{skin}(c) = \frac{skin[c]}{Norm} \quad (7)$$

where skin[c] is the value of the histogram interval, c is the color vector, and Norm is the normalization coefficient (either the sum of all histogram values or the maximum value of the intervals).
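The normalization just described can be sketched as follows for a single chrominance channel. This is a minimal illustration, not the chapter's MATLAB implementation; the function names, the bin width, and the sample values are hypothetical.

```python
from collections import Counter


def skin_histogram(values, bin_width=1):
    """Count how often each quantized chrominance value occurs in skin pixels."""
    return Counter(int(v // bin_width) for v in values)


def p_skin(hist, norm="sum"):
    """Normalize histogram counts as skin[c] / Norm.

    norm="sum" divides by the sum of all histogram values;
    norm="max" divides by the largest histogram value.
    """
    if not hist:
        return {}
    denom = sum(hist.values()) if norm == "sum" else max(hist.values())
    return {c: count / denom for c, count in hist.items()}


cb_samples = [110, 111, 111, 112, 112, 112, 113, 114]  # hypothetical Cb values
hist = skin_histogram(cb_samples)
print(p_skin(hist, norm="sum"))
print(p_skin(hist, norm="max"))
```

With the sum as Norm the values add up to 1, while with the maximum as Norm the most frequent interval receives the value 1.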
Some studies [12, 13] have used histogram-based thresholding to segment skin pixels under different lighting conditions and skin tones, and for real-time skin segmentation in video.
1.3. Morphological operations
The detection system then uses morphological operations to refine the skin regions extracted by the segmentation. Morphological operations are a set of simple local filters that can be combined to obtain more complex results.
Two morphological operations are used in this chapter: erosion, which removes small image regions made up of non-skin color pixels, and dilation, which expands the skin color regions that were lost to the aggressive erosion applied in the previous step. Applying erosion followed by dilation smooths the contours of the objects, eliminates small protuberances, and removes fine structures.
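Erosion followed by dilation can be sketched on a small binary mask as below. This is a minimal pure-Python illustration that assumes a 3 × 3 square structuring element and treats pixels outside the image as background; the chapter's implementation relies on OpenCV, whose defaults may differ. All names are hypothetical.

```python
def _filter3x3(image, combine):
    """Apply `combine` (min or max) over each pixel's 3x3 neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [
                image[ny][nx] if 0 <= ny < rows and 0 <= nx < cols else 0
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)
            ]
            out[y][x] = combine(window)
    return out


def erode(image):
    """A pixel stays 1 only if its whole 3x3 neighborhood is 1."""
    return _filter3x3(image, min)


def dilate(image):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    return _filter3x3(image, max)


def opening(image):
    """Erosion followed by dilation."""
    return dilate(erode(image))


# 7x7 mask: a 3x3 skin block plus one isolated noise pixel.
mask = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        mask[y][x] = 1
mask[5][5] = 1

for row in opening(mask):
    print("".join(str(v) for v in row))
```

In this example the isolated pixel disappears during erosion, and dilation restores the 3 × 3 block to its original size.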
1.4. Detection and tracking
This section describes the procedures to detect and track the regions of skin in image sequences.
First, the detection of constant changes can be carried out with the blob method, which makes it possible to know the positions of the moving object. A blob is a group of connected pixels in an image that share some common property, for example, color. There are two common ways of obtaining the blobs: one is to choose an appropriate threshold to differentiate the moving objects (skin regions) from the background, and the other is to conduct motion analysis. The latter can be done with the Open Source Computer Vision library (OpenCV), a library of programming functions mainly aimed at real-time computer vision. There are three blob detection libraries for OpenCV (cvBlobsLib, cvBlob, and Bloblib) that provide the positions of the blobs. In the first approach, the blobs (areas of skin color) are extracted with the threshold, which creates a binary mask of skin color pixels. Contour detection is then applied; its main goal is to locate all the pixels that belong to the contour of the body parts. In other words, this stage detects all the contours that surround the blobs matching the skin regions. This project uses the first approach, that is, blob extraction with contour detection.
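The idea of a blob as a group of connected pixels can be sketched with a breadth-first labeling of a binary skin mask. This is an illustrative stand-in, not the cvBlobsLib implementation used in the project; it assumes 8-connectivity, and the function names are hypothetical. Contour extraction is not covered here.

```python
from collections import deque


def find_blobs(mask):
    """Group 8-connected pixels of value 1; each blob is a list of (x, y)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, blob = deque([(x, y)]), []
            while queue:
                cx, cy = queue.popleft()
                blob.append((cx, cy))
                # Visit the 3x3 neighborhood of the current pixel.
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            blobs.append(blob)
    return blobs


def largest_blob(mask):
    """Return the blob with the greatest area in pixels, or [] if none."""
    blobs = find_blobs(mask)
    return max(blobs, key=len) if blobs else []


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
print(len(find_blobs(mask)), "blobs; largest area:", len(largest_blob(mask)))
```

Sorting or filtering the returned blobs by `len` corresponds to classifying blobs by their area.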
One way of tracking the skin regions is through the moments. In pattern recognition, moments have been used extensively as global features of images. The moments can be used to calculate the position of the center of the skin regions. This calculation requires the first-order spatial moments about the x-axis and the y-axis and the zero-order central moment of the binary image. The zero-order central moment of a binary image is equal to the white area of the image in pixels. This calculation can also be carried out with OpenCV.
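For reference, the centroid calculation described above can be written with the standard definition of spatial moments. The notation below ($M_{pq}$, $I(x, y)$, $\bar{x}$, $\bar{y}$) is assumed for illustration and does not appear in the chapter; $I(x, y)$ is the binary image with value 1 for skin pixels and 0 otherwise.

```latex
M_{pq} = \sum_{x} \sum_{y} x^{p} \, y^{q} \, I(x, y),
\qquad
\bar{x} = \frac{M_{10}}{M_{00}},
\qquad
\bar{y} = \frac{M_{01}}{M_{00}}
```

For a binary image, $M_{00}$ is the number of white pixels, so the centroid is defined only when at least one skin pixel is present.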
The chapter is organized as follows. Section 2 presents the procedure to calculate the threshold, the implementation of the techniques, the evaluation of the results of the techniques, and the description of the system algorithm. Section 3 demonstrates experimental results. Section 4 discusses the conclusions.
The methodology consists of the following parts: determining the thresholds in the color spaces for skin color and non-skin color pixels, developing the algorithm for the HS-CbCr and HS-ab techniques, evaluating the results of the techniques, and developing the algorithm for the detection and tracking of skin color.
2.1. Segmentation of skin color and non-skin color with histogram-based thresholding
Two techniques are proposed: HS-CbCr, which combines the H and S components of the HSV color space with the Cb and Cr components of the YCbCr color space, and HS-ab, which has the same form but combines H and S with the a and b components of the CIELab color space. The intervals of non-skin color pixels are determined with the H and S components, whereas the intervals of skin color pixels are determined with the Cb, Cr, a, and b components.
This section uses 80 images: 40 are images of different skin colors obtained from the SFA database, and the other 40 are images of colors other than skin color.
For the non-skin images, each image is transformed to the HSV color space, and the histograms of the H and S channels are obtained. In each histogram, the minimum and maximum values of the H and S components are estimated. Once the values of the 40 images are available, Eq. (7) is used to obtain the average minimum and maximum values of H, as well as the average minimum and maximum values of S. In this way, the thresholds for non-skin color pixels are obtained.
For the skin images, the procedure is the same as described above, except that each image is transformed into the YCbCr color space to obtain the thresholds of the Cb and Cr components and into the CIELab color space to obtain the thresholds of the a and b components. Only the chrominance components are considered when obtaining the skin color pixel thresholds for each color space.
For example, a non-skin image is transformed into the HSV color space, and the histograms of H and S are obtained. The histogram of H has a minimum value (a1) and a maximum value (b1). Repeating this for the 40 images gives the minimum values a1, a2, …, a40 and the maximum values b1, b2, …, b40. Then, in Eq. (7), skin[a] is the average of all the minimum values, while Norm is the largest of those values; the same process is applied to the maximum values, so that the minimum and maximum thresholds of H are Pskin(a) < H < Pskin(b). In this way, the intervals of the components of interest are obtained.
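The averaging of per-image minima and maxima can be sketched as follows. This is one possible reading of the procedure, written in Python rather than MATLAB; the function name, the optional `norm` divisor (standing in for Norm in Eq. (7)), and the sample values are assumptions made for illustration.

```python
def interval_bounds(per_image_values, norm=1.0):
    """Average the per-image minima and maxima of one channel.

    per_image_values: one list of channel values per training image.
    norm: optional divisor standing in for Norm (assumption);
          the default of 1.0 leaves the averages unscaled.
    Returns the (lower, upper) thresholds of the channel.
    """
    minima = [min(values) for values in per_image_values]
    maxima = [max(values) for values in per_image_values]
    lower = sum(minima) / len(minima) / norm
    upper = sum(maxima) / len(maxima) / norm
    return lower, upper


# Hypothetical Cb values from three training images.
cb_images = [
    [110, 112, 114],
    [108, 111, 115],
    [112, 113, 116],
]
low, high = interval_bounds(cb_images)
print(f"{low} < Cb < {high}")
```

The same function applies unchanged to H, S, Cr, a, or b by passing the values of that channel.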
The procedure described in this section is shown in Figure 1. The algorithm for obtaining the skin and non-skin color thresholds is implemented in MATLAB (R2016a, The MathWorks).
The aim of this section is to obtain the ranges of the components H and S for the non-skin color and the components Cb, Cr, a, and b for the skin color.
2.2. Techniques HS-CbCr and HS-ab
This chapter presents two color model techniques for the detection of skin color. Both techniques use the HSV color space, which provides additional information on the hue and chrominance of an image, with the aim of improving the discrimination between skin pixels and non-skin pixels. In this work, the HSV color space is used to detect non-skin color pixels; then, to detect the skin color pixels, one technique applies the YCbCr color space and the other applies the CIELab color space.
Once the thresholds of each color space are known, an algorithm is developed to detect the skin regions of an image. The algorithm is implemented in Microsoft Visual Studio (C++ 2015). It loads the image, transforms it into the HSV color space, and identifies the non-skin pixels with the H and S thresholds in order to discard them. The same image is then converted into the YCbCr color space to select the skin color pixels with the Cb and Cr thresholds, and the image is displayed with only the skin regions; this is the HS-CbCr technique. The HS-ab technique follows the same procedure, except that it uses the a and b thresholds to identify skin color pixels. The algorithm of the techniques HS-CbCr and HS-ab is presented below:
The purpose of the algorithm is to separate the skin areas from the non-skin areas in two different ways.
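The two-stage decision shared by both techniques can be sketched as a pair of interval tests. This is a simplified illustration in Python, not the C++ implementation of the chapter; the function names are hypothetical, and the threshold boxes are passed in as parameters. In the usage lines, the non-skin (H, S) box is a placeholder value, while the Cb and Cr intervals are the ones reported in the results of this chapter.

```python
def in_box(pair, box):
    """True if every component lies strictly inside its (low, high) interval."""
    return all(low < value < high for value, (low, high) in zip(pair, box))


def is_skin(hs, chroma, non_skin_hs_box, skin_chroma_box):
    """Stage 1 discards pixels in the non-skin (H, S) box.
    Stage 2 keeps the remaining pixels that fall in the skin chrominance box."""
    if in_box(hs, non_skin_hs_box):
        return False
    return in_box(chroma, skin_chroma_box)


def segment(pixels, non_skin_hs_box, skin_chroma_box):
    """Map (hs, chroma) pairs to a binary list: 1 = skin, 0 = non-skin."""
    return [
        1 if is_skin(hs, chroma, non_skin_hs_box, skin_chroma_box) else 0
        for hs, chroma in pixels
    ]


NON_SKIN_HS = ((0.25, 0.75), (0.20, 1.00))  # placeholder (H, S) box
SKIN_CBCR = ((88, 130), (127, 175))         # Cb and Cr intervals from the results

pixels = [
    ((0.05, 0.40), (110, 150)),  # passes both stages
    ((0.50, 0.60), (110, 150)),  # discarded by the (H, S) stage
    ((0.05, 0.40), (140, 120)),  # rejected by the chrominance stage
]
print(segment(pixels, NON_SKIN_HS, SKIN_CBCR))
```

Passing (a, b) pairs and the a and b intervals instead of the Cb and Cr ones turns the same function into the HS-ab variant.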
2.3. Performance evaluation
Metric parameters are used to evaluate the results of the skin detection algorithms. In this work, three metrics, C, CDR, and FDR, are implemented as in [18, 19]. C is the percentage of correct skin detection, used to determine which chrominance components give the best skin detection result; it is given in Eq. (8). The correct detection rate (CDR) is the percentage of pixels that the algorithm classifies correctly as skin pixels; it is expressed in Eq. (9). The false detection rate (FDR) is the percentage of pixels that the algorithm classifies wrongly as non-skin pixels; it corresponds to Eq. (10).

$$C = \frac{P}{T} \times 100 \quad (8)$$

$$CDR = \frac{P_c}{T_s} \times 100 \quad (9)$$

$$FDR = \frac{P_w}{T_{ns}} \times 100 \quad (10)$$

where P is the number of pixels correctly classified as skin color in a given color space, T is the total number of pixels in the database (training images), Pc is the total number of pixels classified correctly as skin pixels by the proposed algorithm, Ts is the total number of skin pixels according to the ground truth, Pw is the total number of pixels classified wrongly as non-skin pixels by the algorithm, and Tns is the total number of non-skin pixels according to the ground truth. The term ground truth refers to the actual data; for example, the skin images are the true information on skin color.
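Assuming each metric is the stated pixel count expressed as a percentage of its total, the three values can be computed as in the sketch below. The function names and the pixel counts in the usage line are hypothetical and are not results from this chapter.

```python
def percentage(part, total):
    """Return part / total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def evaluate(p, t, pc, ts, pw, tns):
    """Compute C, CDR, and FDR from the pixel counts P, T, Pc, Ts, Pw, and Tns."""
    return {
        "C": percentage(p, t),
        "CDR": percentage(pc, ts),
        "FDR": percentage(pw, tns),
    }


# Hypothetical pixel counts, for illustration only.
print(evaluate(p=50, t=200, pc=30, ts=40, pw=5, tns=100))
```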
These performance metrics are essential for validating an algorithm as a skin detection algorithm.
2.4. Skin color tracking
Based on the results of the previous section, the best technique is selected and implemented in the next algorithm. The algorithm is implemented in Microsoft Visual Studio (C++ 2015) with the OpenCV 3.1.0 library, and the software is tested under Windows 10 on an ASUS Notebook UX32A computer. The algorithm is shown in Figure 2 and consists of the following steps:
Step 1. The images are captured frame by frame from a webcam. Each frame is converted into HSV to eliminate the non-skin color pixels and is then transformed into the CIELab color space to keep only the skin color pixels, using the thresholds of H, S, a, and b. Finally, the image is binarized, with 1 representing skin color pixels and 0 representing non-skin color pixels.
Step 2. Morphological operations refine the skin regions extracted by the segmentation of the previous step. This project applies erosion and dilation with the aim of separating the skin areas from connected non-skin areas that survived the segmentation.
Step 3. Blob and contour detection are applied in this step. A blob is a connected part of a binary image; the term also refers to regions of an image that are either brighter or darker than their surroundings. All the blobs in the image are classified by their area, then the background of the image is removed, and the blobs of the body parts are obtained. Next, contour detection is used to store all the contours of the skin regions. The two most common contour types are the convex hull and the ellipse. The first is essential for obtaining the convexities of the body gestures, while the second is defined as the rectangle that encloses all the points of the contour of the skin region making a gesture. This work uses the convex hull, although it does not yet provide the position of the body parts in each frame; implementing this at the tracking stage is left for future work. The cvBlobsLib library is used to generate and filter the blobs, and the function cvFindContours() provides the contour (outline) and the blob area. Both are used with OpenCV.
Step 4. Tracking consists of knowing the positions of the user's gesture in order to display the path of the movements on a computer screen. The difficulty is that the position of the user can vary depending on the gesture performed, so the positions of the user's movement must be known. This information is obtained through the moments. The OpenCV function moments() calculates all of the spatial moments and returns a Moments object with the results; this function is applied here. If the white area of the binary image is less than or equal to 10,000 pixels, it is assumed that there are no skin zones in the image, because skin zones are expected to have an area of more than 10,000 pixels. Otherwise, the positions along the x-axis and y-axis of the image are calculated by dividing the first-order moments by the area. In this way, the path can be displayed over the successive images.
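The area gate and centroid update of Step 4 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the C++ and OpenCV code of the project: each frame is assumed to be already segmented into a binary mask, and the function names are hypothetical. Only the 10,000-pixel limit comes from the text.

```python
MIN_SKIN_AREA = 10_000  # pixels; the area limit stated in Step 4


def area_and_centroid(mask):
    """Return (area, centroid) of a binary mask from M00, M10, and M01."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                m00 += 1
                m10 += x
                m01 += y
    if m00 == 0:
        return 0, None
    return m00, (m10 / m00, m01 / m00)


def track(frames, min_area=MIN_SKIN_AREA):
    """Collect the centroid of every frame whose skin area exceeds min_area."""
    path = []
    for mask in frames:
        area, center = area_and_centroid(mask)
        if area > min_area:
            path.append(center)
    return path


def block(x0, y0, size=4):
    """Build a small size-by-size test frame with a 2x2 skin block at (x0, y0)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(y0, y0 + 2):
        for x in range(x0, x0 + 2):
            frame[y][x] = 1
    return frame


# Tiny frames, so a small area limit is used for the demonstration.
frames = [block(0, 0), block(2, 2), [[0] * 4 for _ in range(4)]]
print(track(frames, min_area=2))
```

Drawing a line between consecutive entries of the returned path gives the trajectory that is displayed on screen.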
The algorithm of the previously described steps is shown below:
This section has explained the design of a computer vision system for tracking the skin areas of the user. The following section presents the results.
This section provides representative results of the proposed algorithm for the detection and tracking of human skin regions. As mentioned, each non-skin image is converted into the HSV color space, and each skin image is transformed into the YCbCr and CIELab color spaces to obtain the maximum and minimum thresholds of the channels H, S, Cb, Cr, a, and b. An example is presented in Figure 3, where a skin image is converted into the YCbCr color space and the histogram of each channel is shown; its thresholds are Cb = [110, 114] and Cr = [138, 142]. This example illustrates the procedure for a single image. In this work, 40 skin images and 40 non-skin images are used to obtain the thresholds based on Eq. (7). For non-skin images, the thresholds of H and S are 0 < H < 0.2 and 0.15 < S < 0.9, whereas for skin images the thresholds are 88 < Cb < 130 and 127 < Cr < 175 in the YCbCr color space and 142 < a < 225 and 115 < b < 177 in the CIELab color space. These thresholds are implemented in the algorithm of the techniques HS-CbCr and HS-ab.
To apply the techniques HS-CbCr and HS-ab, 20 real images containing people are used, in which skin areas can be detected. After the thresholds of H, S, Cb, Cr, a, and b are implemented in the algorithm of the techniques HS-CbCr and HS-ab, Figure 4 presents the results of both techniques on some of the real images.
According to the experimental results, the images processed with the HS-ab technique show better results than those processed with the HS-CbCr technique. The latter produces false alarms by confusing colors such as sand, wood, and brown hair with skin; that is, colors similar to skin are detected as skin. To obtain quantitative results, both techniques are evaluated with the metric parameters.
P is the total number of skin color pixels of the skin images after transforming them to the YCbCr and CIELab color spaces and taking into account only the chrominance components, and T is the total number of skin color pixels of the 40 skin images after converting them to the YCbCr and CIELab color spaces. In this case, two percentages of C are obtained, one for the YCbCr color space and another for CIELab. Likewise, two percentages of CDR are obtained, one for the technique HS-CbCr and another for the technique HS-ab, where Pc is the total number of skin color pixels found by the HS-CbCr and HS-ab algorithms on the real images, and Ts is the total number of skin color pixels of the 40 skin images. FDR has the same percentage for both techniques, because the non-skin color pixels are detected with the same procedure; here, Pw is the total number of non-skin color pixels detected by the HS-CbCr and HS-ab algorithms on the real images, and Tns is the total number of non-skin color pixels of the 40 non-skin images. These percentages appear in Table 1.
Table 1. Percentages of the three metric parameters.
| Parameters | YCbCr color space | CIELab color space | Technique HS-CbCr | Technique HS-ab |
As seen in Table 1, the percentage of C is higher in CIELab than in YCbCr, which means that the CIELab color space is more accurate in detecting skin color with chrominance components; the best results are therefore expected when the CIELab color space is used. The proposed HS-ab algorithm classifies more pixels correctly as skin color than the HS-CbCr technique, since HS-ab has a greater CDR than HS-CbCr and does not present false alarms. For this reason, the HS-ab technique is chosen for the next algorithm.
The results of the algorithm for detecting and tracking skin color are presented in Figure 5. The algorithm captures continuous images through the camera of the PC and shows on the screen the blob images and, in red, the path of the largest skin regions in motion.
The system has been tested with different skin types, and a satisfactory response was obtained.
Numerous application areas, such as computer vision and object recognition, use segmentation. Segmentation is the partitioning of an input image or scene (video scenes) into meaningful objects, each of which can be treated, processed, or discriminated by its salient color. Several color spaces are used to process color images. Each color space has different characteristics and represents color images in its own way. The choice of color space is of the utmost importance and depends on the application; in this case, the interest is human skin color. Among the color spaces most commonly used for segmenting human skin color are HSV, YCbCr, and CIELab. These color spaces consist of chrominance and luminance components, and the luminance component is affected by lighting changes. For this reason, the luminance components are discarded and only the chrominance components (H, S, Cb, Cr, a, and b) are used for skin color segmentation. Two techniques are created on this basis.
Skin segmentation of color images is a simple way to classify pixels into skin and non-skin pixels. In this work, the skin detection algorithm uses the techniques HS-CbCr and HS-ab with a histogram threshold-based approach, in which a global threshold value is determined for each chrominance component to differentiate skin color pixels from non-skin color pixels. Further image processing is then required to achieve an efficient segmentation. Morphological operations are needed to complete the segmentation process and reduce noise; morphological dilation and erosion are applied to extract the skin pixels. The last stage of the system is the tracking of skin regions. The method of moments is applied to calculate the positions of the largest area of skin color. This method has a low computational cost and is easy to implement in computer vision applications.
This work presents a comparison of two techniques for modeling skin color. The experimental results of both skin color segmentation techniques show good detection of skin color in all images; therefore, the techniques are able to model skin color. However, the HS-CbCr technique also produces false detections by confusing skin color with similar colors. With respect to the three metric parameters, the HS-ab technique achieves higher percentages than the other technique, indicating that it is a reliable and accurate technique for the detection of skin color pixels. In addition, the tracking algorithm is able to segment skin color in real time.
In conclusion, a new technique for the detection and tracking of skin color has been presented. The proposed system copes successfully with different skin colors and complex backgrounds. It operates on images acquired by a camera and with various body movements.
Future research will focus on implementing the gesture recognition stage in the system and, finally, on testing it with amputees either before or after the prosthesis is fitted. The system will then be better able to fulfill its main purpose, which is to provide effective rehabilitation through a computer vision system and to return to patients the highest possible level of independence and functioning.
Authors express their gratitude to Instituto Politécnico Nacional (IPN) and Consejo Nacional de Ciencia y Tecnología de México (CONACYT) for the trust and support to develop research.