Open access peer-reviewed chapter

Intoxication Identification Using Thermal Imaging

By Georgia Koukiou

Submitted: April 3rd 2017. Reviewed: October 31st 2017. Published: December 21st 2017.

DOI: 10.5772/intechopen.72128

Abstract

In this chapter, seven different approaches are presented for identifying persons who have consumed alcohol. The main concept is to identify a drunk person based on the thermal signature of the face. The thermal map of the face changes as the person consumes alcohol because of the increased activity of the blood vessels. The methods are mathematically supported and achieve a high rate of identification success. The experimental material is based on a systematically created database, which includes thermal images of the faces of drunk persons as well as thermal images of the same persons when sober. This database is freely available on the web and can be used by the scientific community. In each method, different features are extracted for intoxication identification. The advantage of most of the methods is that drunk identification can be achieved without using the image of the sober person for comparison. Accordingly, a commercial system incorporating some of the presented methods would not require a database of thermal images of sober faces and would therefore be able to operate on unknown persons. The identification success achieved by each separate method is over 80%.

Keywords

  • thermal imaging
  • drunk identification
  • intoxication inspection
  • drunk database
  • noninvasive drunk monitoring

1. Introduction

Intoxication through alcohol consumption is a serious and sometimes dangerous condition, both for the health and safety of the person concerned and for public safety. Citizens have to be educated not to consume alcohol beyond the permissible limit. However, this is a societal problem and has to be addressed by society and its mechanisms. The material of this chapter does not deal with the social component of intoxication. It elaborates on the capabilities of contemporary technology to identify drunkenness and to prevent intoxicated persons from engaging in dangerous activities, such as driving or handling critical installations.

Drunkenness is commonly identified with a breathalyzer or a blood test. Both methods require the person under test to come into contact with the device and to undergo an invasive test. Both procedures are time-consuming, especially the blood test, and they have a considerable cost. These techniques cannot be used to monitor intoxication remotely and to prevent drunk persons from engaging in security-related tasks that require the operator’s attention. For example, it is not practical to test spectators with a breathalyzer before a football match in order to prevent drunk persons from entering the stadium. The material in this chapter presents ways of identifying intoxication by means of thermal infrared images of the face.

Specific algorithms are presented in this chapter and their discrimination capabilities are explained [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]. The original idea rests on the fact that the network of blood vessels in the face shows increased activity when the person has consumed alcohol, which changes the temperature distribution on the person’s face. Accordingly, a commercial system can be derived which could be used for a fast assessment of intoxication. In case of a positive inference, a breathalyzer can be employed to verify the result. Obviously, it is not possible to obtain a thermal map of the face by means of visible light. By acquiring images of faces in the thermal infrared spectrum, information related to the temperature of the face is obtained, which mainly depends on the physiological condition of the person (illness, exercise, and drunkenness). The human face, at a mean temperature of around 300 K, radiates approximately as a black body, with maximum emission, according to Wien’s law, at a wavelength of about 10 μm.

Drunkenness is a challenging physiological condition to investigate using infrared imagery. However, most publications in the literature refer only to automotive anti-drunk-driving systems, which use electrical signals from the heart or brain [14]. An extensive review of the relevant literature is given throughout this chapter. Seven different approaches for identifying intoxication by means of thermal infrared images are discussed. Specifically:

  1. A feature vector is formed for drunk person identification by simply taking 20 different points on the face of each person.

  2. The temperature differences which are presented on the face after alcohol consumption are discussed.

  3. The activity of blood vessels on the face when the person is drunk is examined.

  4. Neural networks are tested on infrared images of faces for discriminating drunk persons.

  5. Temperature distribution on the eyes of sober and drunk persons is studied by means of thermal infrared images. The iris and the sclera are of the same temperature for the sober person. For the intoxicated person, the iris becomes darker.

  6. Isothermal regions on the face of sober and intoxicated persons are extensively studied for drunk person identification.

  7. Markov procedures are employed for the discrimination of drunk persons. This approach is applied only on the forehead.

First, basic information is provided on the alcohol consumption limits set in different countries, and the thermal behavior of the skin is analyzed. Furthermore, the database of sober and corresponding drunk persons, created in the Electronics Laboratory, Physics Department, University of Patras, Greece, is described in detail, together with the experimental procedure carried out to build it (http://www.physics.upatras.gr/sober/).

The methods for drunk identification were developed independently, and the features they produce are different. In future work, it is important to analyze the correlation between these features, that is, the information they have in common, so that an optimal identification procedure using all of this information can be devised (information fusion).

The final goal of the presented material is to identify a drunk person using only his or her thermal images, without comparing them with images obtained when the person was sober. This is achieved by most of the methods presented here, and it is a significant step toward building a commercial product. Such a product could scan the face of a person and, if the person is identified as drunk, prevent him or her from engaging in critical procedures (driving or operating specialized infrastructure).

2. Alcohol consumption limits

Alcohol that enters the body, mainly during meals, does not endanger us if the amount is moderate, unless there are health reasons that prohibit its consumption. However, excessive alcohol consumption may be particularly dangerous for someone who operates machinery or drives a vehicle, and it may have consequences for other persons as well. The effects and symptoms that alcohol causes in the human body vary according to the amount of alcohol present in the body (milligrams). Alcohol acts on the human body starting with the absorption of ethyl alcohol in the digestive system and ending with its appearance in the blood and exhaled air, where authorities can measure it with the well-known alcoholmeters (breathalyzers).

Each country has set its own limits on alcohol consumption, and there are countries where zero limits have been established, such as Slovakia, the Czech Republic, Romania, and Hungary. Estonia, Poland, and Sweden have set the blood alcohol limit at 0.2 mg/mL, and Lithuania at 0.4 mg/mL. In Austria, Belgium, Bulgaria, Cyprus, Denmark, Finland, France, Germany, Italy, Luxembourg, the Netherlands, Portugal, Slovenia, and Spain, the limit is 0.5 mg/mL. If the alcohol concentration exceeds 0.5 mg/mL in the blood or 0.25 mg/L in the exhaled air, the driver is fined in proportion to the level of violation of the permissible limits. These limits are lower for drivers of special vehicles such as ambulances, buses, trucks over 3.5 tons, motorcycles, and mopeds [15, 16, 17, 18, 19, 20, 21].

Drinks are categorized according to the amount of alcohol they contain. Blood alcohol concentration (BAC) differs between men and women and depends on body weight, the number of drinks consumed, and the time elapsed since consumption. The effects on an individual at different BAC levels are as follows:

  • 0.2–0.3 mg/mL: The person has a slight euphoria and shyness, a laxity, a loss of coordination and perhaps a little dizziness.

  • 0.4–0.6 mg/mL: The person can have a sense of well-being, relaxation, reduction of inhibitions, warmth, and euphoria. There may also be minor impairment of reasoning and memory and reduced attention. Finally, emotions may be intensified and behavior exaggerated.

  • 0.7–0.9 mg/mL: The person thinks he can perform better than he actually does (overestimating his abilities). There is a slight decrease in balance, speech, vision, hearing, and reaction time. He is euphoric. His judgment and self-control are diminished. Finally, attention, logic, and memory are diminished.

Driving with a blood alcohol concentration of 0.8–2.9 mg/mL is a criminal offense, and the person is considered legally drunk. If the alcohol concentration in the blood is above 3.0 mg/mL, death from alcohol poisoning can occur.

The relationship between the alcohol concentration in the blood and in the exhaled air should be highlighted. The amount of alcohol contained in 1 mL of blood is the same as the amount of alcohol contained in 2100 mL of alveolar air. A blood alcohol concentration of 0.5 mg/mL therefore corresponds to about 0.24 mg/L of alveolar air. According to the above, for an average adult, the consumption of 300–400 mL of pure alcohol is lethal.
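The conversion between the two concentrations can be sketched in a few lines. This is only an illustration of the 2100:1 ratio stated above; the function and constant names are hypothetical and not part of the original work.

```python
# Minimal sketch: convert blood alcohol concentration to breath concentration
# using the 2100:1 blood-to-alveolar-air ratio quoted in the text.
BLOOD_TO_BREATH_RATIO = 2100.0  # mL of alveolar air holding the alcohol of 1 mL of blood


def breath_mg_per_l(blood_mg_per_ml):
    """Return the alveolar-air concentration (mg/L) for a blood concentration (mg/mL)."""
    blood_mg_per_l = blood_mg_per_ml * 1000.0  # 1 L = 1000 mL
    return blood_mg_per_l / BLOOD_TO_BREATH_RATIO


for bac in (0.2, 0.5, 0.8):
    print(f"blood {bac:.1f} mg/mL -> breath {breath_mg_per_l(bac):.3f} mg/L")
```

For the common legal limit of 0.5 mg/mL, the function returns roughly 0.238 mg/L, which matches the rounded value of 0.24 mg/L.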

The alcohol contained in a drink is almost completely absorbed through the gastrointestinal tract. The absorption rate of alcohol depends mainly on:

  • The fullness or emptiness of the stomach. A full stomach slows absorption: complete absorption takes 2–3 hours instead of 45–90 minutes (about 1 hour for most people).

  • The type of food in the stomach: fatty foods slow alcohol absorption more strongly, while starchy foods slow it less.

  • The type of drink: beverages with CO2 (carbonate) are absorbed faster because carbonate ions accelerate stomach emptying.

  • The alcohol content of the beverage: beverages with 10–20% alcohol are absorbed faster.

  • Individual factors: alcohol dependence (chronic drinkers), mood, and the condition of the gastric and intestinal mucosa.

Alcohol is distributed more rapidly to the tissues that have the greatest perfusion, but over time it is redistributed everywhere. The greater the amount of water contained in a tissue, the more it is affected by alcohol, for example, blood and nervous tissue. The opposite holds for fat and bone tissues.

3. Skin thermal behavior

The human body emits electromagnetic radiation, like all bodies [22]. Since its temperature is close to that of the environment (~300 K), the entire spectrum of this radiation lies in the thermal infrared region. At the same time, our body absorbs thermal radiation from the environment. The skin emits as a nearly perfect black body (emissivity = 0.98) and, based on Wien’s law, the wavelength of its maximum emission is 9.5 μm. Therefore, thermal imaging devices such as thermal infrared cameras designed for human body inspection operate in the range of 7–14 μm.
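As a quick check of the Wien's law figure, the peak wavelength can be computed from the displacement constant. The sketch below is illustrative only; the skin surface temperature of 305 K (about 32°C) is an assumption introduced here, not a value given in the chapter.

```python
# Minimal sketch: Wien's displacement law, lambda_max = b / T.
WIEN_B_UM_K = 2897.77  # Wien displacement constant in micrometre-kelvin


def peak_wavelength_um(temperature_k):
    """Wavelength (in micrometres) of maximum black-body emission at temperature_k."""
    return WIEN_B_UM_K / temperature_k


# 300 K is the round figure used in the text; 305 K is an assumed skin temperature.
for temperature in (300.0, 305.0):
    print(f"{temperature:.0f} K -> {peak_wavelength_um(temperature):.2f} um")
```

The result is about 9.66 μm at 300 K and about 9.50 μm at 305 K, both well inside the 7–14 μm band of thermal cameras.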

Infrared radiation has been used in medicine since the 1960s, in the form of infrared thermography [22], to record the temperature of human skin. Infrared thermography is the technique that measures the heat (infrared radiation) emitted by the body and displays the temperature distribution on the surface of the body. Measurements are made with special cameras that detect infrared radiation without coming into contact with the body. The intensity of the thermal radiation is transformed into an electrical signal and then into a color thermogram, in which the hottest spots are presented in stronger colors. An infrared image is a map of the surface skin temperature that can provide accurate temperature measurement but cannot quantify blood flow in the skin. In order to interpret thermographic images, it is necessary to have a good understanding of the physiological mechanism of blood flow in the skin and the factors that affect heat transfer to it. Our understanding of blood flow in the skin, heat transfer between tissues, and skin temperature has advanced rapidly over the past 40 years, allowing us to better represent and understand thermal measurements. At the same time, the improvement in camera sensitivity, coupled with improved CCD technology and the development of computational imaging systems, has improved the noninvasive thermography method.

Modern technology provides temperature measurements with an accuracy better than 0.05°C without touching the skin. These systems produce high-resolution images at high speed, and the measurements are quantitative. When the skin is exposed to a cold environment, its temperature distribution is heterogeneous. In contrast, the temperature distribution of the skin is more homogeneous in warm conditions. During exposure to heat or intensive exercise, the blood flow in the skin may increase in order to dissipate more heat. When a person is exposed to a cold environment, the skin restricts the flow of blood near its surface and thus acts as an insulator. In these cold conditions, the skin works to maintain the core body temperature.

Another procedure for examining blood flow to the skin is dynamic thermography, which includes local cooling or heating of the skin [23]. The ability of blood to transfer heat between the various levels of tissues and the skin can be predicted by models. These models are based not only on the conductivity, density, specific heat, and temperature of the tissues, but also on their metabolic needs and on the blood flow velocity. A disadvantage of infrared thermography is that it cannot directly demonstrate that an increase in temperature is due to an increase in blood flow. One way to prove this is to combine infrared thermography with a direct blood flow measurement technique such as Laser Doppler.

4. Experimental procedure and the database

The Thermo Vision Micron A10 infrared camera was used to capture thermal images of the face. This camera has an operating wavelength range of 7.5 to 13 μm and a radiometric dynamic range that adjusts automatically to the temperature range. This wavelength region contains the maximum of the Wien curve for blackbody emission at a temperature of 300 K, which closely matches the emission of electromagnetic radiation from the human skin [22]. In our experiment, 41 persons participated, 10 of them female. Each subject consumed half a liter of wine, which corresponds to 62.4 mL of alcohol, over a period of one hour. A first sequence of 50 frames was obtained for each person before alcohol consumption. Another sequence of 50 frames was acquired half an hour after the last glass of wine was consumed. The frame acquisition rate was set to 10 frames/s.
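As a plausibility check on these quantities, the following worked arithmetic may help. It is not part of the original protocol, and the ethanol density of 0.789 g/mL (at 20°C) is a standard value assumed here:

$$ \frac{62.4\ \text{mL}}{500\ \text{mL}} = 12.48\%\ \text{vol}, \qquad 62.4\ \text{mL} \times 0.789\ \text{g/mL} \approx 49.2\ \text{g}, \qquad \frac{50\ \text{frames}}{10\ \text{frames/s}} = 5\ \text{s} $$

That is, the wine had an alcohol content of about 12.5% by volume, each subject consumed roughly 49 g of ethanol, and each frame sequence covers 5 seconds.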

The resolution of the infrared images is 128 × 160 pixels. The camera was placed quite close to the face of the person so that the thermal image contains the whole face. The experimental procedure requires the thermal images of an intoxicated person as well as the thermal images of the same person when sober, so that comparisons can be carried out. The persons who participated in the experiment were made aware of the strict requirements of the procedure. Researchers working close to our research group, who were familiar with the experimental requirements, took part in the experiment. All of them were healthy and willing to release their personal data to the public (http://www.physics.upatras.gr/sober/). The database contains all relevant information about the participants (age, weight, sex, etc.). We considered a person who had consumed half a liter of wine to be drunk or intoxicated. No blood tests were conducted during the experimental procedure.

Three glasses of wine are enough to bring a person into a state of intoxication, which corresponds to exceeding 0.2 mg/L of exhaled air [15]. However, with this quantity of wine, some participants were brought to the limit of intoxication while others were deeply intoxicated. Measurements carried out by the police revealed these differences in the persons’ intoxication (breathalyzer readings of 0.22–0.9 mg/L). The maximum concentration of alcohol in the exhaled air was reached half an hour after the consumption of the last glass of wine. This concentration was 0.22 mg/L for heavy persons who were used to drinking alcohol and rose to 0.9 mg/L for light persons who were not used to drinking alcohol. After that, the breathalyzer reading gradually decreased. Females were more sensitive to alcohol than males.

Finally, it is worth mentioning that all participants were healthy and calm when the experiment started and had not undertaken any physical exercise. All participants were present in the experiment room half an hour before it began. The purpose of the experiment was to reveal temperature changes caused only by alcohol consumption. No other abnormality was considered.

5. A simple feature vector from face

A simple feature vector is formed for drunk person identification by simply taking the pixel values of 20 different points on the face of each person (Figure 1). Therefore, each face image corresponds to a 20-dimensional feature vector:

$$ x_i = [181\ 169\ 203\ 166\ 217\ 175\ 171\ 189\ 169\ 206\ 152\ 144\ 243\ 165\ 225\ 147\ 247\ 149\ 247\ 127]^t \tag{1} $$

which corresponds to a point in the 20-dimensional space. Since 50 images are grabbed in each acquisition, this information corresponds to a cluster of 50 points in the 20-dimensional space.
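The construction of such a cluster can be sketched as follows. The landmark coordinates and the synthetic frames are hypothetical placeholders introduced for illustration; the actual 20 face points are those marked in Figure 1, and real frames would come from the camera.

```python
# Minimal sketch: build a cluster of 50 feature vectors, one per frame,
# by sampling 20 pixel values from each 128 x 160 thermal frame.
ROWS, COLS = 128, 160   # image resolution reported for the camera
N_FRAMES = 50           # frames grabbed per acquisition

# Hypothetical (row, column) landmark positions; not the points of Figure 1.
POINTS = [(10 + 5 * i, 20 + 6 * i) for i in range(20)]


def synthetic_frame(t):
    """Stand-in for a grabbed frame: a deterministic 8-bit test pattern."""
    return [[(r + c + t) % 256 for c in range(COLS)] for r in range(ROWS)]


def feature_vector(frame, points):
    """Pixel values of the frame at the given landmark positions."""
    return [frame[r][c] for r, c in points]


cluster = [feature_vector(synthetic_frame(t), POINTS) for t in range(N_FRAMES)]

# Cluster centre: the per-dimension mean over the 50 vectors.
center = [sum(v[d] for v in cluster) / len(cluster) for d in range(len(POINTS))]
print(len(cluster), len(cluster[0]), len(center))
```

One sober acquisition and one drunk acquisition of the same person would give two such clusters, whose relative position in the feature space is what the rest of this section examines.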

Figure 1.

Twenty points were obtained on each face to monitor temperature changes with the consumption of alcohol.

It is important to find out whether the cluster that corresponds to a person moves in the feature space as that person consumes alcohol [24]. At the same time, we have to examine whether the clusters of all persons move in the same direction with alcohol consumption. If the direction of movement were different for different persons, there would be many directions in the 20-dimensional space along which the clusters of the drunk persons move. In this case, it would be difficult to represent the space in a simpler way (preferably in two dimensions).

In this section, it is shown that the problem is essentially two-dimensional, since only two of the eigenvalues obtained from the generalized eigenvalue problem have significant values. In these two dimensions, it is evident that the clusters move in the same direction with alcohol consumption (Figure 2).

Figure 2.

The 16 clusters of 8 persons in the 2-D space formed by the two most important directions (corresponding to the two largest eigenvalues). Hereafter, we call this space the “drunk space”.

In our case, the feature space dimensionality was examined using the statistics of the clusters of eight persons. Consequently, there are eight clusters in the feature space for the sober persons and eight corresponding clusters for the drunk persons. To project them onto a 2-D space, the directions that give maximum separability have to be found.

This maximum separability in a reduced dimensionality space is achieved by a linear transformation W. The two most important directions wi of W are used for projection. This linear transformation is:

$$ y_i = w^t x_i \tag{2} $$

An important criterion function that can be used for the separation of the clusters is given by

$$ J = \frac{S_B}{S_W} \tag{3} $$

The clusters are moving apart as J increases. The matrices SW and SB are called within-scatter and between-scatter matrices, respectively. Eventually, SW must be small and SB must be large. The cumulative dispersion of all separate clusters (cluster scatter) can be estimated by the corresponding SW matrix, which is evaluated by summing up all individual cluster-scatter matrices Si as follows

$$ S_W = S_1 + S_2 + \dots + S_8 \tag{4} $$

where

$$ S_i = \sum x_i x_i^t \tag{5} $$

In the transformed space, the within-scatter matrix (SW) is given by

$$ \tilde{S}_W = w^t S_W w \tag{6} $$

The between-scatter matrix SB reveals how much the centers of the clusters are separated. The evaluation of the between-scatter matrix SB is realized as follows

$$ S_B = \sum_i m_i m_i^t, \quad i = 1, 2, \dots, 8 \tag{7} $$

where mi corresponds to each cluster center. In the transformed space, the between-scatter matrix (SB) will be given by

$$ \tilde{S}_B = w^t S_B w \tag{8} $$

Therefore, the function J in the transformed space is given by

$$ J(w) = \frac{w^t S_B w}{w^t S_W w} \tag{9} $$

The maximization of the function J(w) results in vectors w obtained from the solution of the generalized eigenvalue problem:

$$ S_B w_i = \lambda_i S_W w_i \tag{10} $$

The obtained matrix W contains the eigenvectors wi, which give the directions in the transformed feature space onto which the original features xi are projected. The eigenvalues corresponding to the wi are also obtained from this solution. Each eigenvalue describes how much the corresponding eigenvector contributes to the separability of the clusters. The solution of (10) is the Fisher Linear Discriminant (FLD) method. In this procedure, the matrices SB and SW act with opposite effect.
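For readers who want the intermediate step, the following short derivation shows why maximizing (9) leads to (10). It is a standard argument, stated here under the assumptions that SB and SW are symmetric and that the denominator of (9) is nonzero. Setting the gradient of J(w) to zero gives

$$ \nabla_w J(w) = \frac{2\, S_B w \,(w^t S_W w) - 2\, S_W w \,(w^t S_B w)}{(w^t S_W w)^2} = 0 \quad \Longrightarrow \quad S_B w = J(w)\, S_W w $$

so every stationary direction w is a generalized eigenvector of the pair (SB, SW), and the associated eigenvalue λ equals the value of the criterion J along that direction. The largest eigenvalues therefore mark the directions of best cluster separation.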

The generalized eigenvalue problem was solved, as mentioned previously, for 8 persons (males) of the same weight. A total of 16 clusters are available in the 20-D feature space, that is, two clusters per person (sober and drunk). The sum of the two largest eigenvalues divided by the sum of all eigenvalues gives the quality of cluster separability in the reduced (2-D) feature space. In this experiment, this ratio was found to be 70%. The resulting two-dimensional feature space is shown in Figure 2, along with the 16 clusters. Figure 2 also shows the direction of movement of the cluster of each person. According to Figure 2, the new 2-D feature space is separated into two regions corresponding to sober and drunk persons, respectively. Consequently, a person can easily be classified as sober or drunk depending on the position of his or her cluster in this reduced space. This space is called, hereafter, the “drunk space”.
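The behavior of the criterion J(w) can be illustrated on a toy example. The sketch below is not the authors' implementation: it uses two small 2-D clusters instead of sixteen 20-D clusters, all function names are hypothetical, and the scatter matrices are computed about the cluster means (the usual FLD convention, assumed here).

```python
# Minimal sketch: evaluate the Fisher criterion J(w) = (w^t S_B w) / (w^t S_W w)
# for candidate projection directions w on toy 2-D data.

def mean(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def scatter(points, center):
    """Sum of outer products (x - center)(x - center)^t over the points."""
    dim = len(center)
    s = [[0.0] * dim for _ in range(dim)]
    for p in points:
        diff = [p[d] - center[d] for d in range(dim)]
        for a in range(dim):
            for b in range(dim):
                s[a][b] += diff[a] * diff[b]
    return s


def quadratic_form(w, m):
    """Scalar w^t M w."""
    n = len(w)
    return sum(w[a] * m[a][b] * w[b] for a in range(n) for b in range(n))


def fisher_criterion(clusters, w):
    centers = [mean(c) for c in clusters]
    dim = len(w)
    s_w = [[0.0] * dim for _ in range(dim)]
    for cluster, center in zip(clusters, centers):
        s_i = scatter(cluster, center)          # scatter of one cluster
        for a in range(dim):
            for b in range(dim):
                s_w[a][b] += s_i[a][b]          # within-scatter: sum over clusters
    s_b = scatter(centers, mean(centers))       # between-scatter of the centres
    return quadratic_form(w, s_b) / quadratic_form(w, s_w)


# Toy clusters: the "drunk" cluster is the "sober" one shifted along the first axis.
sober = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
drunk = [[5.0, 0.0], [6.0, 0.0], [5.0, 1.0], [6.0, 1.0]]

print(fisher_criterion([sober, drunk], [1.0, 0.0]))  # direction of the shift
print(fisher_criterion([sober, drunk], [0.0, 1.0]))  # orthogonal direction
```

Projecting onto the direction of the shift gives a large J, while the orthogonal direction gives zero, which is the behavior the eigenvalues of (10) summarize for the real 20-D data.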

6. Face temperature differences after alcohol consumption

The thermal differences between various locations on the face are examined in this section [2]. The purpose of this approach is to examine specific locations on the face and find out whether the temperature difference between them changes with alcohol consumption. Thus, we are not interested in the temperature of, say, the eye itself, but in whether it changes with respect to another location on the face, for example, the lips. To apply this procedure, the image of the face of each person was partitioned into a matrix of 8 × 5 square regions of 10 × 10 pixels each. The position of the regions was exactly the same for a specific person (sober and drunk). The temperature difference of all possible pairs of square regions is monitored as the person consumes alcohol. For each acquisition, a total of 40 values, one per square region, were calculated on the face of a specific person.

The thermal differences among all values of the 40 square regions were evaluated, thus creating a 40 × 40 matrix. The difference matrices that correspond to the same person when sober and after consuming alcohol were then compared (Figure 4). The maximum variation between corresponding differences is monitored, and it reveals the regions whose temperature changes with alcohol consumption. It was found that, for the drunk person, the nose and mouth have increased temperature relative to the forehead.

The main finding of this approach is that two locations, as shown in Figure 3, are good candidates for proving intoxication, namely the forehead and the nose. For the drunk person, the forehead appears cooler than the nose, while for the sober person, the two regions are at the same temperature. Figure 4a shows the thermal difference matrix for the sober person, and Figure 4b shows the thermal difference matrix for the drunk person. These two matrices differ significantly at the white locations of the matrix in Figure 4c. This final matrix, which is the difference of the difference matrices, is crucial for revealing locations of the face with large temperature variation.
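The procedure of this section can be sketched on synthetic data as follows. The face crops, the position of the warmed block, and all names are hypothetical; each region is summarized here by its mean pixel value, which is an assumption, since the text does not state how the 40 values are computed.

```python
# Minimal sketch: 8 x 5 regions of 10 x 10 pixels, one value per region,
# 40 x 40 difference matrices, and the difference of the difference matrices.
BLOCK = 10
GRID_ROWS, GRID_COLS = 8, 5


def region_means(face):
    """Mean pixel value of each 10 x 10 block, scanned row by row (40 values)."""
    means = []
    for gr in range(GRID_ROWS):
        for gc in range(GRID_COLS):
            block = [face[gr * BLOCK + r][gc * BLOCK + c]
                     for r in range(BLOCK) for c in range(BLOCK)]
            means.append(sum(block) / len(block))
    return means


def difference_matrix(values):
    """Matrix of all pairwise differences values[i] - values[j]."""
    return [[a - b for b in values] for a in values]


def largest_change(sober_face, drunk_face):
    """Largest absolute entry of (D_drunk - D_sober) and the region pair it belongs to."""
    d_sober = difference_matrix(region_means(sober_face))
    d_drunk = difference_matrix(region_means(drunk_face))
    best_change, best_pair = 0.0, None
    for i in range(len(d_sober)):
        for j in range(len(d_sober)):
            change = abs(d_drunk[i][j] - d_sober[i][j])
            if change > best_change:
                best_change, best_pair = change, (i, j)
    return best_change, best_pair


# Synthetic 80 x 50 face crops: uniform when sober, one warmer block when drunk.
sober_face = [[100.0] * (GRID_COLS * BLOCK) for _ in range(GRID_ROWS * BLOCK)]
drunk_face = [row[:] for row in sober_face]
NOSE_BLOCK = (4, 2)  # hypothetical grid position of the nose
for r in range(BLOCK):
    for c in range(BLOCK):
        drunk_face[NOSE_BLOCK[0] * BLOCK + r][NOSE_BLOCK[1] * BLOCK + c] += 30.0

change, pair = largest_change(sober_face, drunk_face)
print(change, pair)
```

In this toy case the largest change involves region 22 (row 4, column 2 of the grid), the block that was warmed, which mirrors how the white entries of Figure 4c point to the nose and forehead regions.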

Figure 3.

Each of the black regions is of 10 × 10 pixels area. A total of 8 × 5 regions are taken on each face. The temperature difference between the regions is monitored as the person consumes alcohol.

Figure 4.

Three difference matrices, (a) for the sober person and (b) for the drunk person, (c) the difference of the difference matrices (values normalized to full grayscale). Large changes for the thermal differences on the face are indicated by white points on this matrix. The white circle corresponds to the largest difference equal to 29.8.

Figure 5 demonstrates the described method. The black regions on the face are those presenting maximum difference with alcohol consumption. These regions were indicated by the difference of the difference matrices. Accordingly, if the nose of a person is hotter than the forehead, this person should be declared intoxicated by a drunk identification system.

Figure 5.

The black regions on faces A and B are those presenting maximum difference with alcohol consumption. These regions were indicated by the difference of the difference matrices.

7. Face blood vessels activity in drunk person

In this section, blood vessels are separated and isolated from the rest of the information in the image of the face by applying morphology to the diffused image. For this purpose, the top-hat transformation is applied [7]. The top-hat transformation isolates hot or cold features of a specific size in an image. An example is shown in Figure 6, where the features to be isolated have an area of 5 × 5 pixels. This transformation is described next.

Figure 6.

(a) Hot and cold spots of area 5 × 5 pixels on a varying grayscale background, (b) top-hat transformation of image in (a) using the summation of (15) and (16). The structuring element used, was a flat disk of radius 5.

7.1. Top-hat transformation

The basic morphological operation is that of erosion [26, 27]. Erosion is a shrinking procedure carried out when a signal A (binary or gray scale) is affected by another signal S, the structuring element:

$$ A \ominus S = \{ (i,j) \in A : S_{ij} \subseteq A \} \tag{11} $$

where (i, j) is the position of A on which S lies.

A complementary operation to that of erosion is dilation. It is a kind of expansion of the signal. It is defined through the erosion of the complement of A:

$$ A \oplus S = (A^c \ominus S)^c \tag{12} $$

When an erosion is followed by a dilation the smoothing-shrinking morphological operation called opening is obtained. Opening smoothes out from the signal A, all details that are smaller than the structuring element S. It is denoted as

$$ A_S = (A \ominus S) \oplus S \tag{13} $$

Furthermore, when a dilation is followed by an erosion, the smoothing-expanding operation called closing is obtained. Closing covers (smoothes) all details (intrusions) of the signal A that are smaller than the structuring element S:

$$ A^S = (A \oplus S) \ominus S \tag{14} $$

By employing a top-hat transformation (hot or cold), one can extract small features from a signal A. Protrusions in the signal can be obtained by subtracting the opened signal from the original (hot top-hat transformation)

$$ \mathrm{Tophat}_{hot} = A - A_S \tag{15} $$

which extracts white (hot) features against a dark background. On the other hand, intrusions in the signal can be obtained by subtracting the original signal from the closed one (cold top-hat transformation)

$$ \mathrm{Tophat}_{cold} = A^S - A \tag{16} $$

which extracts dark (cold) features against a brighter background (see Figure 6a and b).
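The four morphological operations and the two top-hat transformations can be illustrated on a one-dimensional gray-scale signal. The sketch below assumes a flat structuring element (a sliding window, truncated at the borders), for which erosion and dilation reduce to a sliding minimum and maximum; it is not the 2-D disk-based implementation used for Figure 6, and the function names are hypothetical.

```python
# Minimal sketch: gray-scale morphology on a 1-D signal with a flat
# structuring element of half-width k (window of 2k + 1 samples).

def erode(a, k):
    """Sliding minimum: shrinks bright features narrower than the window."""
    n = len(a)
    return [min(a[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]


def dilate(a, k):
    """Sliding maximum: expands bright features."""
    n = len(a)
    return [max(a[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]


def opening(a, k):
    """Erosion followed by dilation, as in Eq. (13)."""
    return dilate(erode(a, k), k)


def closing(a, k):
    """Dilation followed by erosion, as in Eq. (14)."""
    return erode(dilate(a, k), k)


def tophat_hot(a, k):
    """Original minus opening, as in Eq. (15): keeps small bright (hot) details."""
    return [x - o for x, o in zip(a, opening(a, k))]


def tophat_cold(a, k):
    """Closing minus original, as in Eq. (16): keeps small dark (cold) details."""
    return [c - x for x, c in zip(a, closing(a, k))]


# Flat background of 10 with one hot spike (30) and one cold dip (0).
signal = [10, 10, 10, 30, 10, 10, 10, 0, 10, 10, 10]
print(tophat_hot(signal, 1))
print(tophat_cold(signal, 1))
```

The hot top-hat is nonzero only at the spike (value 20 at index 3), and the cold top-hat only at the dip (value 10 at index 7); the background is removed in both cases.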

However, before applying top-hat transformation on the image, anisotropic diffusion is performed to eliminate noise.

7.2. Anisotropic diffusion

Thermal infrared images contain noise, which often distorts significant information and details that are important for the interpretation of the image. The anisotropic diffusion technique [28] can filter out noise while leaving unchanged the parts of the image that are most important for visual perception, such as edges and lines.

The physical background of diffusion is based on the concentration distribution u (the pixel distribution), whose gradient causes a flux j according to Fick’s law:

$$ j = -D \nabla u \tag{17} $$

where D is the diffusion tensor, which is in general a positive definite symmetric matrix and is a function of the structure of the image. Diffusion corresponds to mass transport (gray values in images) without destroying mass or creating new mass. So,

$$ \partial_t u = -\operatorname{div} j = -\left( \frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y} \right) \tag{18} $$

where ∂_t u is the time partial derivative of the concentration distribution u. From the above equations, we have:

$$ \partial_t u = \operatorname{div}(D \nabla u) \tag{19} $$

In anisotropic nonlinear diffusion, the diffusion tensor is not constant over the image, so smoothing takes place only along edges and the information across edges is left unchanged. Specifically, if the diffusion tensor D is defined to be a function of the gradient of u, that is,

$$ D = g(|\nabla u|^2) \tag{20} $$

then the diffusion preserves edges, since no diffusion is performed perpendicular to edges but only parallel to them. In real problems, anisotropic nonlinear diffusion can sharpen edges if the function g(·) is chosen properly.

The implementation of Eq. (19) in the experimental procedure can be carried out in the following way.

Let u_0(x, y) be the original input image and u_t(x, y) the digital image at iteration t. The discrete-time implementation of (19) is carried out by employing the four nearest neighbors and the Laplacian operator used in [28]:

$$ u_{t+1}(x,y) = u_t(x,y) + \lambda \sum_{i=1}^{4} g\left(\nabla u_t^i(x,y)\right) \nabla u_t^i(x,y) \tag{21} $$

where 0 ≤ λ ≤ 1/4 was used in the experimental procedure, and

$$ \nabla u_t^1(x,y) = u_t(x,y+1) - u_t(x,y) \tag{22} $$

is the gradient in the south direction,

$$ \nabla u_t^2(x,y) = u_t(x,y-1) - u_t(x,y) \tag{23} $$

is the gradient in the north direction,

$$ \nabla u_t^3(x,y) = u_t(x+1,y) - u_t(x,y) \tag{24} $$

is the gradient in the east direction, and

$$ \nabla u_t^4(x,y) = u_t(x-1,y) - u_t(x,y) \tag{25} $$

is the gradient in the west direction.

The nonlinear anisotropic diffusion method was applied to the faces of all 41 persons, sober and intoxicated. So that diffusion takes place only along edges, the value of k, which affects the degree of smoothing, was set to 20.
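The iteration of Eq. (21) can be sketched in plain Python as follows. The exponential form of g(·) is an assumption (it is one of the edge-stopping functions proposed in [28]; the chapter does not state which form was used), the step λ = 0.2 is simply one value within the stated range, and the test image and function names are hypothetical.

```python
# Minimal sketch: discrete anisotropic diffusion with four nearest neighbours,
# following the update of Eq. (21). Border pixels simply use fewer neighbours.
import math


def conduction(gradient, k=20.0):
    """Edge-stopping function g(.): close to 1 for small gradients, near 0 at edges.
    The exponential form is assumed here, with k = 20 as in the text."""
    return math.exp(-(gradient / k) ** 2)


def diffuse_step(u, lam=0.2, k=20.0):
    """One iteration of Eq. (21) on an image stored as a list of rows."""
    rows, cols = len(u), len(u[0])
    out = [row[:] for row in u]
    neighbours = ((0, 1), (0, -1), (1, 0), (-1, 0))
    for x in range(rows):
        for y in range(cols):
            total = 0.0
            for dx, dy in neighbours:
                nx, ny = x + dx, y + dy
                if 0 <= nx < rows and 0 <= ny < cols:
                    grad = u[nx][ny] - u[x][y]      # directional difference
                    total += conduction(grad, k) * grad
            out[x][y] = u[x][y] + lam * total
    return out


def diffuse(u, iterations=10, lam=0.2, k=20.0):
    for _ in range(iterations):
        u = diffuse_step(u, lam, k)
    return u


# Test image: a strong vertical edge (0 | 200) plus one noisy pixel of value 8.
image = [[0.0, 0.0, 0.0, 200.0, 200.0, 200.0] for _ in range(4)]
image[1][1] = 8.0
smoothed = diffuse(image)
print(round(smoothed[1][1], 3), round(smoothed[0][3] - smoothed[0][2], 3))
```

The small noise value is spread over its neighbourhood, while the step of about 200 gray levels is left essentially intact because g(·) is practically zero across it. The sum of all pixel values is also preserved, in line with the remark that diffusion neither creates nor destroys mass.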

If thresholding is applied to the images after diffusion and top-hat transformation, the image obtained for the intoxicated person is richer than that of the sober person. In our experiments, the threshold was set to 100. Figure 7 shows the two images obtained for the sober person (left) and the intoxicated person (right). Image registration was applied in order to compare the images. Discrimination between the images of sober and intoxicated persons was achieved based on the number of bright pixels: this number is larger for intoxicated persons than for sober persons. This is the main idea supporting the claim that the proposed method contributes significantly to forensic science. Brighter vessels constitute clear evidence for suspecting alcohol consumption and proceeding to further checks and inspection of the person.
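The final counting step can be sketched as follows. The two small arrays stand in for top-hat outputs and are invented for illustration; only the threshold of 100 comes from the text, and no specific decision rule on the count is given in the chapter.

```python
# Minimal sketch: threshold a processed image and count the bright pixels.
THRESHOLD = 100  # threshold value reported in the experiments


def count_bright(image, threshold=THRESHOLD):
    """Number of pixels strictly above the threshold."""
    return sum(1 for row in image for value in row if value > threshold)


# Hypothetical top-hat outputs for the same person, sober and intoxicated.
sober_tophat = [[20, 40, 60], [30, 120, 50], [10, 20, 30]]
drunk_tophat = [[20, 140, 60], [130, 150, 125], [10, 110, 30]]

print(count_bright(sober_tophat), count_bright(drunk_tophat))
```

Here the intoxicated image yields five bright pixels against one for the sober image, which is the kind of difference the method relies on.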

Figure 7.

Binary images obtained using a threshold equal to 100. Sober left and intoxicated right. Vessels on the drunk person are more distinct compared to those on the sober person.

It is worth mentioning that intoxication can be inferred from a single image like those in Figure 7, since the white pixels for the drunk person are more intense around the nose, the mouth, and on the forehead. The fact that the corresponding image of the sober person is not required for comparison constitutes the substantial forensic contribution of this method.
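The thresholding and bright-pixel count described in this section can be sketched as below. It assumes an 8-bit grayscale image that has already been diffused and top-hat transformed; the function name `count_bright_pixels` and the strict comparison (`>` rather than `>=`) are illustrative assumptions, while the threshold of 100 is the value reported above.

```python
import numpy as np

def count_bright_pixels(img, threshold=100):
    """Binarize a preprocessed thermal image and count the bright (white) pixels."""
    binary = np.asarray(img) > threshold   # True marks a bright pixel
    return binary, int(binary.sum())
```

A larger count is the cue that, according to the text, points toward intoxication.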

8. Neural networks for discriminating drunk persons

Neural networks have been used as a classification tool in a variety of machine vision techniques such as face recognition [29] and thermal infrared pattern recognition [30, 31, 32, 33]. In particular, a thermovision application for biometric recognition is addressed in [32], while neural structures are employed in [33] for recognition of facial expressions using thermal maps of the face.

This method offers a way of discriminating sober from drunk persons using thermal infrared images and neural networks. The neural networks are employed as a black box to discriminate intoxication by means of simple pixel values from the thermal images of the persons' faces. In this work, the neural networks were used in two different approaches. In the first approach, a different neural structure is used for each location on the thermal image of the face and the convergence capabilities of the network are monitored. Successful convergence characterizes the corresponding location of the face as a good candidate for intoxication identification. In the second approach, a single neural structure is trained with data from the thermal images of the whole face of a person (sober and drunk), and its capability to operate with high classification success on other persons is tested. Its generalization performance is also assessed.

In the first approach, different networks are trained on different locations of the same face. This gives a strong indication of the suitability of specific face locations for drunk identification. To this end, the face of each person is partitioned into a matrix of square regions of 10 × 10 pixels each, such as the one depicted in Figure 8. There is a complete correspondence between these locations on the images of sober and drunk persons. Figure 8 illustrates one of these square regions of 100 pixels on a pair of infrared images (sober-drunk) of a specific person.

Figure 8.

The same square region of 100 pixels on a pair of infrared images (sober-drunk) of a specific person.

A simple neural network is trained using the data in the two black regions shown in Figure 8. The input vectors of the neural structure have nine elements, obtained as a small 3 × 3 window moves over each of the two 10 × 10 pixel regions. In this way, 200 vectors are obtained to train a three-layer neural structure of [9 30 1] neurons for these two specific regions of 10 × 10 pixels. Furthermore, a larger network of [49 49 1] neurons was employed, using input vectors of 49 elements. These elements were obtained as a window of 7 × 7 pixels moves over each of the two 10 × 10 pixel regions. The back propagation algorithm was employed for training both neural structures.
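The construction of the training vectors can be sketched as follows. The chapter does not state how windows at the border of a region are handled; this sketch assumes that each window is centered on a pixel of the region and borrows neighboring pixels from the surrounding image, which yields exactly 100 vectors per 10 × 10 region (200 for the sober-drunk pair). The function names, the label convention (0 for sober, 1 for drunk) and the requirement that the region lies at least half a window away from the image border are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_vectors(image, top, left, size=10, win=3):
    """One win*win vector per pixel of the size x size region at (top, left)."""
    half = win // 2
    # Take the region plus a margin so every region pixel can be a window center.
    patch = image[top - half: top + size + half, left - half: left + size + half]
    windows = sliding_window_view(patch, (win, win))   # (size, size, win, win)
    return windows.reshape(size * size, win * win)

def build_training_set(sober_img, drunk_img, top, left, size=10, win=3):
    """Stack vectors from the same region of the sober and drunk images."""
    xs = window_vectors(sober_img, top, left, size, win)
    xd = window_vectors(drunk_img, top, left, size, win)
    X = np.vstack([xs, xd])
    y = np.concatenate([np.zeros(len(xs)), np.ones(len(xd))])  # 0 sober, 1 drunk
    return X, y
```

Calling `build_training_set` with `win=3` gives a 200 × 9 input matrix for the [9 30 1] structure, and `win=7` gives a 200 × 49 matrix for the [49 49 1] structure.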

Successful convergence of a neural network to a minimum value in a specific location means that this face location is suitable for drunk identification. For demonstration purposes, a region for which we have high convergence of the network is given in red in Figure 9. For all participants in the experiment and especially when the large neural structures were employed, high convergence was observed mainly on the forehead, the nose, and the mouth as depicted in Figure 9. Thus, these locations of the face of a person are the most suitable to be employed for intoxication discrimination.

Figure 9.

The colored matrices above correspond to two different persons. The employed neural networks with structure [49 49 1] converge at the red areas, indicating that the forehead, the nose and the mouth have desirable drunk discrimination capabilities.

In the next approach, the whole face of each specific person is examined as a single area of 5000 pixels (50 × 100), as shown in Figure 10. Our purpose is to be able to discriminate between the sober and the drunk image of a person using a specific neural structure which has been trained with information coming from the same person.

Figure 10.

The whole black region drawn on the face of a specific person in the above thermal image was used for training a single network.

The big region of 5000 pixels gives 5000 different vectors of 49 elements each, as a 7 × 7 window moves over the image of the sober person. Another 5000 vectors are obtained from the thermal image of the drunk person. A neural structure of 3 layers with 49 neurons in the first layer, 49 neurons in the hidden layer and 1 neuron in the output layer, that is, a [49 49 1] structure, was trained with the above data. Next, the recognition procedure was tried. The same trained network was tested with the same data and resulted in satisfactory performance. When the output was closer to zero, the pixel was declared to belong to a sober person (black); otherwise (closer to one), it was declared to represent a drunk person (white). The results of this experiment are presented in Figure 11. The achieved performance for the pixels of the sober image is 89.22%, while for the drunk image, the performance is 87.09%, with even higher performance at the regions of the forehead and nose. In conclusion, training a single neural network using information from the whole face can easily point out the regions which better support drunk discrimination.

Figure 11.

Discrimination results obtained by a neural structure [49 49 1], trained with the same data from the whole region. The achieved performance for the pixels of the sober image (left in black) is 89.22% while for the drunk image (right in white) the performance is 87.09%, with even higher performance at the regions of the forehead and nose.

The case of a neural network trained with the data from the whole face of a specific person (sober and drunk) and tested using the data of the remaining persons is also discussed. Accordingly, if the person is sober, his face should be black, while if he is drunk, his face should be white. Relevant results are demonstrated in Figure 12. A neural structure of [9 30 1] was trained using data from person 1 and tested using the data from person 2. At the left of Figure 12, the face should be black, recognized as belonging to the sober person, while at the right it should be white, since it corresponds to the drunk person. The depicted performance is satisfactory and an operator can easily discriminate the sober from the drunk person. It is worth emphasizing that, according to the images in Figure 12, the pixels on the forehead and the nose are correctly classified in almost all cases.

Figure 12.

Results obtained when a neural structure of [9 30 1] was trained using data from person 1 and tested using the data from person 2. At the left, the face should be black, recognized as belonging to the sober person, while at the right it should be white, since it corresponds to the drunk person.

Since the forehead is a very promising location on the face for intoxication identification, the above procedure was repeated only on the region of the forehead for all persons who participated in the experiment. The area employed is shown in Figure 13. Accordingly, the neural structures are trained with the data from one person and tested with the data from the remaining persons. The use of the forehead area for intoxication identification led to two significant conclusions: (a) The small neural structures have better identification performance, since they achieve better generalization behavior during training. (b) Their success is on average 90%.

Figure 13.

The black region on the forehead of the persons was employed to test the neural structures for identifying drunk persons.

As a general conclusion of this final approach, a small neural structure can decide with 90% confidence whether a person is drunk or not. No data records of the inspected persons when they are sober are needed for comparison.

9. Temperature distribution on the eyes

The temperature distribution on the eyes of sober and drunk persons is studied by means of thermal infrared images (Figure 14). It is observed that the temperature difference between the sclera and the iris is zero for the sober person and increases when the person consumes alcohol (Figure 15). For the drunk person, the iris appears darker than the sclera, which means that the sclera temperature increases. This is expected, since the sclera is full of blood vessels which present increased activity when the person consumes alcohol. Thus, in a screening procedure for drunk identification, the infrared images of the sober person are not needed. Although in most cases the sclera is brighter than the iris for drunk persons, when their gray level difference is very small, histogram modification algorithms can be used to enhance this difference and reveal intoxication. In order to express the confidence of the method in drunk person discrimination, the Student t-test was employed. The results gave over 99% confidence in the discrimination inference.

Figure 14.

The sclera surrounds the iris, which is a muscle-controlled part of the eye that adjusts the size of the pupil. The sclera lies on a net of blood vessels.

Figure 15.

The left thermal image corresponds to the face of the sober person, while the right corresponds to the face of the same person when drunk. For the drunk person, the sclera becomes hotter than the iris.

Specifically, for 28 of the 41 people who participated in the experimental procedure, it is evident by a simple comparison of the thermal images that the sclera becomes hotter than the iris after alcohol consumption. The images in Figure 15, where the iris is darker than the sclera (right), have not undergone any kind of preprocessing. This difference between the sclera and the iris becomes evident for four more persons when a histogram equalization algorithm is applied. Initially, for all these persons when sober, the sclera and the iris appeared with the same gray level, as being at the same temperature (Figure 15, left image). Finally, four more persons presented this difference when a histogram clipping algorithm was applied, which clips all values below 0.5 and above 0.75 of the gray level range and then stretches the remaining histogram to its full range (MATLAB imadjust ([0.5 0.75], [0 1])). For five persons who were habitual drinkers and whose breathalyzer indication was below 0.4 mg/L, it was not possible to reveal the difference between the sclera and the iris.
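The histogram clipping and stretching step quoted above can be reproduced outside MATLAB with a few lines. The sketch below assumes an 8-bit input image (gray levels 0 to 255) and returns values in the range 0 to 1; the function name `clip_and_stretch` is hypothetical.

```python
import numpy as np

def clip_and_stretch(img, low=0.5, high=0.75):
    """Map gray levels in [low, high] (fractions of the range) linearly to [0, 1]."""
    x = np.asarray(img, dtype=float) / 255.0   # normalize 8-bit data (assumption)
    # Values below `low` saturate at 0, values above `high` saturate at 1.
    return np.clip((x - low) / (high - low), 0.0, 1.0)
```

After this mapping, a small gray level difference between sclera and iris that falls inside the retained band is spread over the full output range, which is what makes it visible.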

The temperature difference between the sclera and the iris was examined [4, 6] based on the statistics of the pixels in these two regions, by means of two different estimation procedures which correspond to two different discrimination features. In the first procedure, the ratio of the mean value of the pixels inside the sclera to the mean value of the pixels inside the iris was calculated. This procedure was performed on the left eye of each participant, both when sober and after consuming alcohol. Consequently, two ratios of the mean value of the sclera to the mean value of the iris are available for each participant. It is observed that this ratio increases when the person has consumed alcohol. Specifically, in 36 of the 41 cases the ratio increases with alcohol consumption, while in only 2 cases it decreases, and in the remaining 3 it stays almost the same. The results were analyzed using the Student t-test, in order to statistically support the drunk screening capabilities of the proposed method from eye thermal images. In the second procedure, the variance of the pixels contained in the whole eye is estimated. This evaluation was performed for the left eye of each participant when the person is sober and when he is drunk. Therefore, two variances have been calculated for each participant, corresponding to the sober and the drunk state, respectively. It is observed that the variance increases when the person has consumed alcohol. Specifically, among the 41 participants, only 4 presented decreased variance in the region of the eye when drunk compared to when sober.
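The two eye features and a t statistic can be sketched as follows. The sclera and iris masks are assumed to be given as boolean arrays (how they were delineated is not described here), and the paired form of the t statistic is an assumption, since the text only says that the Student t-test was used. All names are hypothetical.

```python
import numpy as np

def eye_features(eye, sclera_mask, iris_mask):
    """Return (mean sclera / mean iris ratio, variance over the whole eye)."""
    eye = np.asarray(eye, dtype=float)
    ratio = eye[sclera_mask].mean() / eye[iris_mask].mean()
    return ratio, eye.var()

def paired_t_statistic(sober_values, drunk_values):
    """Paired t statistic of a feature measured on the same persons twice."""
    d = np.asarray(drunk_values, dtype=float) - np.asarray(sober_values, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))
```

A large positive statistic over the participants would indicate that the feature (ratio or variance) increases consistently after alcohol consumption.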

The proposed method has the advantage that no comparison with the image of the sober person is needed to infer intoxication. Simply, if an inspected person presents a gray level difference between the sclera and the iris, he or she has to be further tested for alcohol consumption with conventional means.

10. Face isothermal regions for drunk person identification

Drunk persons can be discriminated from sober ones using face isothermal regions. For this purpose, the morphological feature vector called pattern spectrum and support vector machines (SVMs) are employed for feature extraction and classification. Two different approaches are employed for extracting the isothermal regions of the face giving continuous vectors for intoxication identification. In the first one, the histogram of the face is divided (both of the sober and the drunk person) into equal regions. In the second approach, we examine in which isothermal region the whole forehead lies for the sober and the drunk person and which other regions of the face are within these isotherms.

Anisotropic diffusion was applied on the thermal infrared images for smoothing boundaries before extracting the isothermal regions. Specifically, anisotropic diffusion [28, 34] is used for noise removal, homogenization of regions, and detail preservation. The morphological feature vector called pattern spectrum [25, 26, 35] is extracted from the isothermal regions and transferred to the SVMs [36, 37, 38] for recognition of intoxicated persons. The identification success is found to be over 80% which is considered satisfactory.

Initially, four different types of isotherms were implemented, as follows:

  • Equidistant in the histogram range (0–255).

  • Equally populated in the histogram range.

  • Isolation of a single isotherm on the image.

  • Arbitrary determination of each isotherm's min and max (e.g. based on the minima of the histogram).

Figure 16 illustrates one example for each case. Experimentation on these four different types of isotherms revealed that only two of them are suitable for identifying drunk persons, namely the first and the fourth type, that is, equidistant and arbitrary determination, as stated at the beginning of the section. Using these two types of isotherms in combination with anisotropic diffusion and morphology, isothermal features have been extracted for identifying intoxicated persons.

Figure 16.

The four different types of isotherms on the face: (a) equidistant in the histogram range (0–255) in eight regions, (b) equally populated in the histogram range in eight regions, (c) isolation of a single isotherm on the image, (d) arbitrary determination of each isotherm's min and max (e.g. based on the minima of the histogram).

In the first approach, the histogram of the face, both of the sober and the drunk person, is divided into equal regions. It was found that the best number of isothermal regions is eight. In this case, we have the maximum perceptual information on the different isothermal regions. The majority of the pixels on the face belong to the two highest regions, with pixel values from 191.25 up to 255. These two regions (191.25–223.125 and 223.125–255) occupy almost the whole face, and their shape will be used to discriminate between sober and drunk. The shape of the whole isothermal region 191.25–255 will also be tested for identifying intoxicated persons. In Figure 17(a) and (b), the regions 191.25–223.125, 223.125–255 and the whole region 191.25–255, respectively, are shown in red. As is easily recognizable, the regions become larger in the case of the drunk person. Accordingly, one can decide which image belongs to the drunk person if both red images are available. The basic problem is that the images of the sober person are never available, and thus in real problems there is no possibility of comparison.
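The equidistant division of the gray level range into eight isothermal regions can be sketched as below. Dividing 0–255 into eight equal parts gives a width of 31.875, which reproduces the boundaries 191.25 and 223.125 quoted above. The function name and the convention that a boundary value belongs to the upper region are assumptions.

```python
import numpy as np

def equidistant_isotherms(img, n_regions=8, vmax=255.0):
    """Label each pixel with the index (0..n_regions-1) of its isothermal region."""
    width = vmax / n_regions                     # 31.875 for eight regions
    labels = np.floor(np.asarray(img, dtype=float) / width).astype(int)
    return np.minimum(labels, n_regions - 1)     # put the value vmax in the top region

def isotherm_mask(img, first, last, n_regions=8):
    """Boolean mask of pixels whose region index lies in [first, last]."""
    labels = equidistant_isotherms(img, n_regions)
    return (labels >= first) & (labels <= last)
```

For example, `isotherm_mask(img, 6, 7)` selects the combined region 191.25–255 whose shape is examined in the text.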

Figure 17.

The images in the top row (a) correspond to the sober person, while those in the bottom row (b) correspond to the same person when drunk. In the left column the red pixels have values in the range 191.25–223.125, in the middle column the pixel values lie in the range 223.125–255, and in the right column in the range 191.25–255. As is easily recognizable, the regions become wider in the case of the drunk person.

In the second approach, features that could help to recognize the drunk person without using the images of the sober counterpart were tested. Accordingly, it is examined in which of the isotherms the forehead lies for the sober person and which other regions of the face are within these isotherms. Then the same is examined for the drunk person. Thus, we do not care about the value of each isotherm, but about the isothermal regions of the face that include the forehead. From Figure 18, it is evident that some regions below the forehead are not isothermal with the forehead for the drunk person, so the area of the red region decreases. This can be easily monitored in real-time problems by an operator (e.g. a police officer).

Figure 18.

(a) The region of isotherms that contains the forehead, for the sober person includes other regions as well. For the drunk person (b) the forehead is more isolated.

Measuring the decrease of area details is a task that can be performed using morphological granulometry [26, 35], that is, successive openings with a structuring element of increasing size (pattern spectrum). All the above procedures for feature extraction using isothermal regions of the face were applied on the infrared images after a light smoothing preprocessing was performed. The smoothing applied is anisotropic diffusion. The simple spectrum without diffusion gave the largest success, which reaches 86%. These values correspond to the Linear and Precomputed Kernel types in the SVMs. This case is the most interesting one, since the drunk person is recognized from the fact that the isothermal region to which the forehead belongs contains virtually no other region of the face. Thus, for intoxication identification, the thermal image of the drunk person is adequate to verify the drunkenness, and no comparison with the image of the sober person is needed.
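A pattern spectrum computed by successive openings can be sketched as follows for a binary isothermal region. This is a minimal illustration, not the exact procedure of [26, 35]: the structuring element is assumed to be a square of side 2n + 1, pixels outside the image are treated as background, and the spectrum value at size n is taken as the area removed when going from the opening of size n to the opening of size n + 1. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _erode(mask, n):
    # A pixel survives only if the whole (2n+1) x (2n+1) square around it is set.
    p = np.pad(mask, n, constant_values=False)
    return sliding_window_view(p, (2 * n + 1, 2 * n + 1)).all(axis=(2, 3))

def _dilate(mask, n):
    # A pixel is set if any pixel of the square around it is set.
    p = np.pad(mask, n, constant_values=False)
    return sliding_window_view(p, (2 * n + 1, 2 * n + 1)).any(axis=(2, 3))

def opening(mask, n):
    """Binary opening (erosion followed by dilation) with a square of side 2n+1."""
    return _dilate(_erode(mask, n), n)

def pattern_spectrum(mask, max_size=5):
    """Area lost between consecutive openings, for sizes 0..max_size."""
    mask = np.asarray(mask, dtype=bool)
    areas = [int(opening(mask, n).sum()) for n in range(max_size + 2)]
    return [areas[n] - areas[n + 1] for n in range(max_size + 1)]
```

Small isolated details contribute to the low-size entries of the spectrum, while large compact regions contribute to the high-size entries, so a shrinking or more isolated forehead region changes the shape of the vector passed to the classifier.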

11. Markov chains for drunk identification

In this approach, the features used for intoxicated person discrimination are the eigenvalues of the transition matrices which correspond to Markov chains [39] used to model the pixels on the area of the forehead of each person. In the experimental procedure followed, a region of 25 × 50 pixels on the forehead of both the drunk and the sober person was obtained, as shown in Figure 19 for a specific participant in the experiment. For this region, separately for the sober and the drunk person, the pixels of the forehead were brought into histograms of 8, 16, and 32 equally populated bins, respectively. The reduction of the histogram size from the 256 gray level representation to 8, 16, or 32 bins was necessary to avoid sparse two-dimensional transition matrices of first or second order, due to the small number of pixels (25 × 50) in the inspected area of the forehead. For each person, a total of three transition matrices are created for the image of the sober person and three for the image of the drunk person. Accordingly, for the face in Figure 1, a more-or-less equally populated histogram with 16 bins is shown in Figure 20a. The transition matrix regarding the pixels of the histogram bins in Figure 20a is depicted as a black and white image in Figure 20b. This transition matrix of a Markov chain is a special tool for studying second-order statistics on the forehead, that is, co-occurrence properties of the pixels.
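The feature extraction described in this paragraph can be sketched as follows. Several details are assumptions made only for illustration: the equally populated bins are obtained by ranking the pixels (ties broken by position), a transition is counted between each pixel and its right-hand neighbor (first-order chain along rows), and the feature vector consists of the eigenvalue magnitudes sorted in decreasing order. The function names are hypothetical.

```python
import numpy as np

def equal_population_bins(region, n_bins=16):
    """Quantize a forehead region into n_bins states of (almost) equal population."""
    flat = np.asarray(region).ravel()
    ranks = np.argsort(np.argsort(flat, kind="stable"), kind="stable")
    return (ranks * n_bins // flat.size).reshape(np.shape(region))

def transition_matrix(states, n_bins=16):
    """Row-normalized counts of transitions between horizontally adjacent pixels."""
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (states[:, :-1].ravel(), states[:, 1:].ravel()), 1)
    rows = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transition are left as zeros.
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

def eigenvalue_features(region, n_bins=16):
    """Sorted eigenvalue magnitudes of the transition matrix (length n_bins)."""
    P = transition_matrix(equal_population_bins(region, n_bins), n_bins)
    return np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
```

For a 25 × 50 region and 16 states this yields one 16-dimensional vector per image, which is the kind of input the classifier in this section operates on.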

Figure 19.

Region on the forehead where the Markov properties of the pixels are studied.

Figure 20.

(a) A 16-bin histogram of the pixels on the forehead of a sober person, (b) the transition matrix of the pixels on the forehead of the sober person in Figure 1. Quantization in a 16-bin histogram has been used.

Using the 41 × 50 16-D eigenvalue vectors for the sober persons and as many for the drunk persons as data, a three-layer neural network with a different number of neurons at each layer was trained. It was found that 16 neurons at the input layer are sufficient for the network to converge satisfactorily. In this process, a network was trained with the data from 40 people and its behavior was tested on the 41st (leave-one-out method). This process was repeated by excluding and testing each one of the 41 people. Each time a new network was trained, its size was found to be the same as in the previous procedure, with 16 input neurons. It was found in all cases that the person to be checked was correctly classified. The training error was less than 2% each time. This shows that only one person out of 41 is not correctly identified as drunk or sober. This result is obtained for the case where 16 states have been used, and therefore 16 eigenvalues of the transition matrix of each image.
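The leave-one-out protocol described above can be sketched independently of the classifier. In the sketch below, a nearest-centroid rule is used as a simple stand-in for the three-layer neural network of the chapter, so the numbers it produces are not comparable to the reported ones. The array layout (one row per eigenvalue vector, with a label and a person identifier per row) and all names are assumptions.

```python
import numpy as np

def leave_one_person_out(X, y, person):
    """Hold out each person in turn; train on the others; score the held-out one.

    X: (N, d) feature vectors, y: (N,) labels (0 sober, 1 drunk),
    person: (N,) identifier of the person each vector comes from.
    """
    scores = {}
    for p in np.unique(person):
        train, test = person != p, person == p
        # Stand-in classifier: one centroid per class from the training persons.
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in (0, 1)])
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None], axis=2)
        scores[p] = float((dists.argmin(axis=1) == y[test]).mean())
    return scores
```

The returned dictionary gives, for every held-out person, the fraction of that person's vectors assigned to the correct class.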

Based on these results, a neural network of relatively small size may contain all the necessary information for separating the sober from the drunk. This conclusion results from all the data obtained from the people who participated in the experiment. This network can be integrated into an automatic intoxication detection system which, using the face image of an inspected person and evaluating the pixel transition matrix from the forehead, will employ the vectors of the eigenvalues of the transition matrix for recognition. A 2D representation of the eigenvalue space (from 16D) for a specific person is shown in Figure 21. The separability of the sober and drunk person is obvious even in the 2D space.

Figure 21.

2D representation of the eigenvalues space (from 16D) for a specific person. The two coordinates correspond to the largest eigenvalues. The separability of the sober and drunk person is obvious even in the 2D space.

12. Future perspectives: fusion approaches

The material presented in this chapter appears in the literature for the first time: it is the first approach worldwide to address drunkenness by means of thermal infrared images of the face of the inspected person. This material is based on a PhD thesis carried out in the Physics Department, University of Patras, Greece [39]. It incorporates seven different approaches to feature extraction for identifying drunk persons. All methods have been presented in scientific journals. The scientific work was based on a well-organized experimental procedure in which thermal images of 41 persons were recorded when they were sober and when they were drunk. A well-organized database and the basic routines for accessing the thousands of recorded images are available (http://www.physics.upatras.gr/sober/).

All seven methods analyzed present high drunk identification success, over 80%. It is expected that by combining all this information into a unified identification procedure, the success rate will approach 100%. This requires sophisticated information fusion techniques that combine different kinds of information. Work is currently being carried out on this topic. An important step toward completing this task is to study the correlation properties of the different features extracted by the seven methods. In practice, an electronic system incorporating a thermal infrared camera can embody one of the proposed methods and indicate to the police which persons should undergo an extended inspection for alcohol consumption.

The presented methodologies have found great recognition and publicity from the scientific community and the media [10, 11, 12, 13].

The main advantages of these methods are as follows:

  • They are not invasive and all the information is acquired remotely.

  • The images do not depend on the existing natural lighting, but solely on the face temperature.

  • Infrared images are obtained even in the dark.

  • Most of the drunk identification methods require only the images of the drunk persons. This means that the approaches are independent of the thermal image of the sober person, and a database with images of sober persons is not required for comparisons. That is, the inspected person can be any unknown person, and his or her sober signatures are not required.

How to cite and reference


Georgia Koukiou (December 21st 2017). Intoxication Identification Using Thermal Imaging, Human-Robot Interaction - Theory and Application, Gholamreza Anbarjafari and Sergio Escalera, IntechOpen, DOI: 10.5772/intechopen.72128. Available from:
