The Perspective Geometry of the Eye: Toward Image-Based Eye-Tracking

Eye-tracking applications are used in a large variety of research fields: neuroscience, psychology, human-computer interfaces, marketing and advertising, and computer science. The commonly known techniques are: the contact lens method (Robinson, 1963), electro-oculography (Kaufman et al., 1993), limbus tracking with photo-resistors (Reulen et al., 1988; Stark et al., 1962), corneal reflection (Eizenman et al., 1984; Morimoto et al., 2000) and Purkinje image tracking (Cornsweet & Crane, 1973; Crane & Steele, 1978).

space and its projection on a plane can be exploited to derive an eye tracking technique that relies on the limbus position to track the gaze direction in 3D. In fact, the ellipse and the circle are two sections of an elliptic cone whose vertex is at the principal point of the camera. Once the points that define the limbus are located on the image plane, it is possible to fit the conic equation that is a section of this cone. The gaze direction can then be obtained by computing the orientation in space of the circle that produces that projection (Forsyth et al., 1991; Wang et al., 2003). From this perspective, the more accurate the limbus detection, the more precise and reliable the gaze estimate. In image-based techniques, a common way to detect the iris is first to detect the pupil, in order to start from a guess of the center of the iris itself, and then to rely on this information to find the limbus (Labati & Scotti, 2010; Mäenpää, 2005; Ryan et al., 2008).
In segmentation and recognition, the iris shape on the image plane is commonly considered to be circular (Kyung-Nam & Ramakrishna, 1999; Matsumoto & Zelinsky, 2000) and, to simplify the search for the feature, the image can be transformed from a Cartesian domain to a polar one (Ferreira et al., 2009; Rahib & Koray, 2009). As a matter of fact, this is true only if the iris plane is orthogonal to the optical axis of the camera, and few algorithms take into account the projective distortions present in off-axis images of the eye and base the search for the iris on an elliptic shape (Ryan et al., 2008). In order to represent the image in a domain where the elliptical shape is not only considered, but also exploited, we developed a transformation from the Cartesian domain to an "elliptical" one, which transforms both the pupil edge and the limbus into straight lines. Furthermore, relying on geometrical considerations, the ellipse of the pupil can be used to shape the iris. In fact, even though the pupil and the iris projections are not concentric, their orientation and eccentricity can be considered equal. From this perspective, a successful detection of the pupil is instrumental for iris detection, because it provides the domain to be used for the elliptical transformation, and it constrains the search for the iris parameters.
The chapter is organized as follows: in Sec. 3 we present the eye structure, in particular the pupil and the iris, and the projective rule on the image plane; in Sec. 4 we show how to fit the ellipse equation to a set of points, either without any constraint or given its orientation and eccentricity; in Sec. 5 we demonstrate how to segment the iris, relying on the information obtained from the pupil, and we show some results achieved on an iris database and on the images acquired by our system; in Sec. 6 we show how the fitted ellipse can be used for gaze estimation; and in Sec. 7 we discuss the results and present our conclusions.

Related works
The study of eye movements predates the current widespread use of computers by more than 100 years (see, for example, Javal, 1879). The first methods to track eye movements were quite invasive, involving direct mechanical contact with the cornea. A first attempt to develop a non-invasive eye tracker is due to Dodge & Cline (1901), who exploited light reflected from the cornea. In the 1930s, Miles Tinker and his colleagues began to apply photographic techniques to study eye movements in reading (Tinker, 1963). In 1947 Paul Fitts and his colleagues began using motion picture cameras to study the movements of pilots' eyes as they used cockpit controls and instruments to land an airplane (Fitts et al., 1950). In the same years Hartridge & Thompson (1948) invented the first head-mounted eye tracker. One reference work in the gaze tracking literature is that of Yarbus in the 1950s and 1960s (Yarbus, 1959). He studied eye movements and saccadic exploration of complex images, recording the eye movements performed by observers while viewing natural objects and scenes. In the 1960s, Shackel (1960) and Mackworth & Thomas (1962) advanced the concept of head-mounted eye tracking systems, making them somewhat less obtrusive and further reducing restrictions on participant head movement (Jacob & Karn, 2003).
The 1970s brought advances in eye movement research and thus in eye tracking. The link between eye tracking and psychological studies deepened, with the acquired eye movement data seen as an open door to understanding the cognitive processes of the brain. Efforts were also spent on increasing the accuracy and precision of the device and its comfort for the tracked subjects. The discovery that multiple reflections from the eye could be used to dissociate eye rotations from head movement (Cornsweet & Crane, 1973) increased tracking precision and also prepared the ground for developments resulting in greater freedom of participant movement (Jacob & Karn, 2003).
Historically, the first application of eye tracking systems was user interface design. From the 1980s, thanks to the rapid progress of computer technology, eye trackers began to be used in a wide variety of disciplines (Duchowski, 2002):
• human-computer interaction (HCI)
• neuroscience
• psychology
• psycholinguistics
• ophthalmology
• medical research
• marketing research
• sports research
Even if commercial applications are quite uncommon, a key application for eye tracking systems is to enable people with severe physical disabilities to communicate and/or interact with computer devices. Simply by looking at control keys displayed on a computer monitor screen, the user can perform a broad variety of functions including speech synthesis, control, playing games and typing. Eye tracking systems can enhance the quality of life of a disabled person, his family and his community by broadening his communication, entertainment, learning and productive capacities. Additionally, eye tracking systems have been demonstrated to be invaluable diagnostic tools in the administration of intelligence and psychological tests. Another use of eye tracking can be found in cognitive and behavioural therapy, a branch of psychotherapy specialized in the treatment of anxiety disorders like phobias, and in the diagnosis or early screening of some health problems. Abnormal eye movements can be an indication of conditions such as balance disorders, diabetic retinopathy, strabismus, cerebral palsy and multiple sclerosis. Technology offers a tool for quantitatively measuring and recording what a person does with his eyes while he is reading. This ability to know what people look at and do not look at has also been widely used commercially. Market researchers want to know what attracts people's attention and whether it is good attention or annoyance. Advertisers want to know whether people are looking at the right things in their advertisement.
Finally, we want to emphasize the current and prospective role of eye and gaze tracking in game environments, whether in a rehabilitation, an entertainment or an edutainment context.
A variety of technologies have been applied to the problem of eye tracking.

Scleral coil
The most accurate, but least user-friendly technology uses a physical attachment to the front of the eye. Despite its age and invasiveness, the scleral coil contact lens is still one of the most precise eye tracking systems (Robinson, 1963). In these table-mounted systems, the subject wears a contact lens with two coils inserted. An alternating magnetic field allows for the simultaneous measurement of horizontal, vertical and torsional eye movements. The real drawback of this technique is its invasiveness for the subject: it can decrease visual acuity, increase the intraocular pressure, and moreover damage the corneal and conjunctival surface.

Electro-oculography
One of the least expensive and simplest eye tracking technologies is recording from skin electrodes, like those used for making ECG or EEG measurements. This method is based on the electrical field generated by the corneo-retinal potential, which can be measured on the skin of the forehead (Kaufman et al., 1993). The orientation of this field changes with the rotation of the eyes, and can be measured by an array of electrodes placed around the eyes. The electrical changes are subsequently processed to relate them to the movements of the eyes. Beyond the limited precision of this technique, there are some problems to be faced, such as the contraction of muscles other than the eye muscles (e.g. facial or neck muscles) and eye blinking, which affect the electric potential related to eye movements, and the need for a correct and stable coupling of the electrodes, which ensures a measure of the field that is constant and reliable over time.
Most practical eye tracking methods are based on a non-contacting camera that observes the eyeball plus image processing techniques to interpret the picture.

Optical reflections
A first category of camera-based methods uses optical features for measuring eye motion. Light, typically infrared (IR), is reflected from the eye and sensed by a video camera or some other specially designed optical sensor. The information is then analyzed to extract eye rotation from changes in the reflections. We refer to them as reflection-based systems.

• Photo-resistor measurement
This method is based on the measurement of the light reflected by the cornea, in proximity of the vertical borders between iris and sclera, i.e. the limbus. The two vertical borders of the limbus are illuminated by a lamp, which can emit either visible light (Stark et al., 1962) or infra-red light (Reulen et al., 1988). The diffuse light reflected from the sclera (white) and the iris (colored) is measured by an array of infra-red photo-transducers, and the amount of reflected light received by each photocell is a function of the angle of sight.
Since the relative position between the light and the photo-transducers needs to be fixed, this technique requires a head-mounted device, like that developed by Reulen et al. (1988). The authors developed a system that, instead of measuring the horizontal movements only, takes into account the vertical ones as well. Nevertheless, the two measurements cannot be carried out simultaneously, so they are performed separately on the two eyes: one eye is used to track the elevation (which can be considered equal for both eyes), and the other the azimuth.

• Corneal reflection
An effective and robust technique is based on the corneal reflection, that is the reflection of the light on the surface of the cornea (Eizenman et al., 1984; Morimoto et al., 2000). Since the corneal reflection is the brightest reflection, its detection is simple, and it offers a stable reference point for gaze estimation. In fact, assuming for simplicity that the eye is a perfect sphere which rotates rigidly around its center, the position of the reflection does not move with the eye rotation. In this way, the gaze direction is described by a vector that goes from the corneal reflection to the center of the pupil or of the iris, and can be mapped to screen coordinates on a computer monitor after a calibration procedure. The drawback of this technique is that the relative position between the eye and the light source must be fixed, otherwise the reference point, i.e. the corneal reflection, would move, voiding the reliability of the system. In order to be more robust and stable, this technique requires an infrared light source to generate the corneal reflection and to produce images with a high contrast between the pupil and the iris.

• Purkinje images
The corneal reflection is the brightest reflection created by the eye, but it is not the only one. The different layers of the eye produce other reflections, the Purkinje images, which are used in very accurate eye tracking techniques (Cornsweet & Crane, 1973; Crane & Steele, 1978). From the first image to the fourth, the Purkinje images are respectively the reflection from the outer surface of the cornea, from the inner surface of the cornea, from the anterior surface of the lens, and finally from the posterior surface of the lens. Special hardware is required to detect the Purkinje images beyond the first, but such images allow the estimation of the three-dimensional point of regard.
In the last decades, another family of eye trackers became very popular, thanks to the rapid progress of computer technology, together with the fact that it is completely remote and non-intrusive: the so-called image-based or video-based eye trackers.

Image based
These systems are based on digital images of the front of the eye, acquired from a video camera and coupled with image processing and machine vision hardware and software. Two types of imaging approaches are commonly used: visible and infrared spectrum imaging (Li et al., 2005). This category of eye tracking algorithms is based on the geometric structure of the eye and on the tracking of its particular features: the pupil, the aperture that lets light into the eye; the iris, the colored muscle group that controls the diameter of the pupil; and the sclera, the white protective tissue that covers the remainder of the eye. A benefit of infrared imaging is that the pupil, rather than the limbus, is the strongest feature contour in the image. Both the sclera and the iris strongly reflect infrared light, while only the sclera strongly reflects visible light. Tracking the pupil contour is preferable given that the pupil contour is smaller and more sharply defined than the limbus. Furthermore, due to its size, the pupil is less likely to be occluded by the eyelids. The pupil and the iris edge (or limbus) are the most used tracking features, in general extracted through the computation of the image gradient (Brolly & Mulligan, 2004; Ohno et al., 2002; Wang & Sung, 2002; Zhu & Yang, 2002), or by fitting a template model to the image and finding the one most consistent with the image (Daugman, 1993; Nishino & Nayar, 2004).

Perspective geometry: from a three-dimensional circle to a two-dimensional ellipse
If we want to rely on the detection of the limbus for tasks like iris segmentation and eye tracking, we need a good knowledge of the geometrical structure of the eye, in particular of the iris, and an understanding of how the eye image is projected on the sensor of a camera.

Eye structure
As is evident from Fig. 1, the human eye is not exactly a sphere, but is composed of two parts with different curvatures. The rear part is close to a sphere with radius ranging from 12 to 13 mm, according to anthropometric data. The frontal part, where the iris resides, is formed by two chambers, the anterior and the posterior one, which are divided by the iris and the lens. The iris, the pupil, and the anterior chamber are covered by the cornea, which is a transparent lens with fixed focus. The crystalline lens is a lens with variable curvature that changes the focal distance of the eye in order to obtain an in-focus image of the object of interest on the retina. The cornea (about 8 mm in radius) is linked to a larger unit called the sclera (about 12 mm in radius) by a ring called the limbus, which is the external edge of the iris. The most important function of the iris is to work as a camera diaphragm. The pupil, that is the hole that allows light to reach the retina, is located at its center. The size of the pupil is controlled by the sphincter muscles of the iris, which adjust the amount of light that enters the pupil and falls on the retina of the eye. The diameter of the pupil consequently changes from about 3 to 9 mm, depending on the lighting of the environment.
The anterior layer of the iris, the visible one, is lightly pigmented; its color results from a combined effect of pigmentation, fibrous tissue and blood vessels. The resulting texture of the iris is a direct expression of the gene pool, and thus is unique for each subject, like fingerprints. The posterior layer, contrary to the anterior one, is very darkly pigmented. Since the pupil is a hole in the iris, it is the most striking visible feature of the eye, because its color, except for corneal reflections, is deep black. The pigment frill is the boundary between the pupil and the iris. It is the only visible part of the posterior layer and emphasizes the edge of the pupil.
The iris portion of the pigment frill protrudes with respect to the iris plane by an amount that depends on the actual size of the pupil. From this perspective, even if the iris surface is not planar, the limbus can be considered to lie on a plane (see Fig. 1, green line). Similarly, the pupil edge lies on a plane that is slightly farther from the center of the eye, because of the protrusion of the pupil (see Fig. 1, magenta line). As for the shape of the pupil edge and the limbus, for our purpose we consider them as two co-axial circles.

Circle projection
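Given an oriented circle C in 3D world space, its perspective image is an ellipse. This means that if we observe an eye with a camera, the limbus, being approximated by a circle, projects onto a locus whose Cartesian coordinates on the camera image plane satisfy a quadratic equation of the form:

$$ f(x, y) = z_1 x^2 + z_2 xy + z_3 y^2 + z_4 x + z_5 y + z_6 = \mathbf{d}^T \mathbf{z} = 0, \qquad \mathbf{d} = [x^2; xy; y^2; x; y; 1], \quad \mathbf{z} = [z_1; z_2; z_3; z_4; z_5; z_6] \tag{1} $$

in which the column vectors d and z are, respectively, termed the dual-Grassmannian and Grassmannian coordinates of the conic, and where $4 z_1 z_3 - z_2^2 > 0$ for the conic to be an ellipse. In the projective plane it is possible to associate to the affine ellipse described by Eq. 1 its homogeneous polynomial $w^2 f(x/w, y/w)$, obtaining a quadratic form:

$$ Q(x, y, w) = w^2 f(x/w, y/w) = z_1 x^2 + z_2 xy + z_3 y^2 + z_4 xw + z_5 yw + z_6 w^2 \tag{2} $$

Setting Eq. 2 equal to zero gives the equation of an elliptic cone in projective space. The ellipse in the image plane and the limbus circle are two sections of the same cone, whose vertex is the origin, which we assume to be at the principal point of the camera. The quadratic form in Eq. 2 can also be written in matrix form. Let x be a column vector with components [x; y; w] and Z the 3 × 3 symmetric matrix of the Grassmannian coordinates:

$$ Q_Z(\mathbf{x}) = \mathbf{x}^T Z \mathbf{x}, \qquad Z = \begin{bmatrix} z_1 & z_2/2 & z_4/2 \\ z_2/2 & z_3 & z_5/2 \\ z_4/2 & z_5/2 & z_6 \end{bmatrix} \tag{3} $$

where the subscript means that the matrix associated with the quadratic form is Z. Together with the coefficients of its quadratic form, an ellipse is also described, in a more intuitive way, through its geometric parameters: center $(x_c, y_c)$, orientation ϕ, and major and minor semiaxes [a, b]. Let us see how to recover the geometric parameters from the quadratic form matrix Z.

The orientation of the ellipse depends directly on the xy term $z_2$ of the quadratic form. From this we can express the rotation matrix $R_\varphi$:

$$ R_\varphi = \begin{bmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

If we apply the matrix $R_\varphi^T$ to the quadratic form in Eq. 3, we obtain:

$$ Q_Z(R_\varphi^T \mathbf{x}) = (R_\varphi^T \mathbf{x})^T Z \, (R_\varphi^T \mathbf{x}) = \mathbf{x}^T R_\varphi Z R_\varphi^T \mathbf{x} = Q_{Z'}(\mathbf{x}) $$

We obtain the orientation of the ellipse by computing the transformation $Z' = R_\varphi Z R_\varphi^T$ that nullifies the xy term of Z, resulting in a new matrix Z′:

$$ Z' = \begin{bmatrix} z'_1 & 0 & z'_4/2 \\ 0 & z'_3 & z'_5/2 \\ z'_4/2 & z'_5/2 & z'_6 \end{bmatrix} $$

This is characterized by the angle ϕ:

$$ \varphi = \frac{1}{2} \arctan\!\left( \frac{z_2}{z_1 - z_3} \right) $$

Once we have computed Z′, we can obtain the center coordinates of the rotated ellipse by setting to zero the partial derivatives of $Q_{Z'}(\mathbf{x})$ with respect to x and y, obtaining:

$$ x'_c = -\frac{z'_4}{2 z'_1}, \qquad y'_c = -\frac{z'_5}{2 z'_3} $$

Then, we can translate the ellipse through the matrix T,

$$ T = \begin{bmatrix} 1 & 0 & x'_c \\ 0 & 1 & y'_c \\ 0 & 0 & 1 \end{bmatrix} $$

to nullify the x and y terms of the quadratic form:

$$ Z'' = T^T Z' T = \begin{bmatrix} z'_1 & 0 & 0 \\ 0 & z'_3 & 0 \\ 0 & 0 & z''_6 \end{bmatrix}, \qquad z''_6 = z'_6 - \frac{z'^2_4}{4 z'_1} - \frac{z'^2_5}{4 z'_3} $$

Now the semiaxes of the ellipse along the x′ and y′ axes will be:

$$ a = \sqrt{-\frac{z''_6}{z'_1}}, \qquad b = \sqrt{-\frac{z''_6}{z'_3}} $$

If a < b, the two semiaxes are exchanged and ϕ is increased by π/2, so that a is the major semiaxis and ϕ the orientation of the major axis.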
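The recovery of the geometric parameters from the conic coefficients can be sketched in a few lines of NumPy. This is only an illustrative sketch under the sign and rotation conventions written above; the function names `conic_to_geometric` and `geometric_to_conic` are hypothetical and do not come from the chapter.

```python
import numpy as np

def geometric_to_conic(xc, yc, a, b, phi):
    """Helper for the example: conic vector z = (z1..z6) of the ellipse with
    center (xc, yc), semiaxes a, b and major-axis angle phi."""
    c, s = np.cos(phi), np.sin(phi)
    z1 = c**2 / a**2 + s**2 / b**2
    z2 = 2 * c * s * (1 / a**2 - 1 / b**2)
    z3 = s**2 / a**2 + c**2 / b**2
    z4 = -2 * z1 * xc - z2 * yc
    z5 = -z2 * xc - 2 * z3 * yc
    z6 = z1 * xc**2 + z2 * xc * yc + z3 * yc**2 - 1
    return np.array([z1, z2, z3, z4, z5, z6])

def conic_to_geometric(z):
    """Return (xc, yc, a, b, phi) with a >= b and phi the major-axis angle."""
    z1, z2, z3, z4, z5, z6 = z
    if 4 * z1 * z3 - z2**2 <= 0:
        raise ValueError("the conic is not an ellipse")
    Z = np.array([[z1, z2 / 2, z4 / 2],
                  [z2 / 2, z3, z5 / 2],
                  [z4 / 2, z5 / 2, z6]])
    # Angle that nullifies the xy term (arctan2 avoids a division by zero).
    phi = 0.5 * np.arctan2(z2, z1 - z3)
    c, s = np.cos(phi), np.sin(phi)
    R = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])
    Zr = R @ Z @ R.T                      # Z' = R Z R^T
    z1r, z3r = Zr[0, 0], Zr[1, 1]
    z4r, z5r, z6r = 2 * Zr[0, 2], 2 * Zr[1, 2], Zr[2, 2]
    # Center in the rotated frame, then constant term after translation.
    xcr, ycr = -z4r / (2 * z1r), -z5r / (2 * z3r)
    z6t = z6r - z4r**2 / (4 * z1r) - z5r**2 / (4 * z3r)
    a, b = np.sqrt(-z6t / z1r), np.sqrt(-z6t / z3r)
    # Back to the original frame: x = R^T x'.
    xc, yc = c * xcr - s * ycr, s * xcr + c * ycr
    if a < b:                             # make a the major semiaxis
        a, b, phi = b, a, phi + np.pi / 2
    return xc, yc, a, b, phi
```

A round trip such as `conic_to_geometric(geometric_to_conic(3.0, -2.0, 5.0, 2.0, np.deg2rad(30)))` should give back the same parameters, with ϕ defined modulo π. The result does not depend on the overall scale or sign of z.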

Pupil and Iris projection
Pupil and iris move together, rotating with the eye, so they are characterized by the same orientation in space. As shown in Fig. 2, the slight difference in position between the pupil and iris planes causes the two projective cones to be non-coaxial. Though this difference is relatively small, it directly prevents the centers of the two projected ellipses from coinciding on the image plane: pupil and limbus projections are not concentric. From Fig. 2 it is also evident that the dimensions of the two ellipses, i.e. the major and minor semiaxes, are very different (leaving aside the fact that the pupil changes its aperture with the amount of light). On the other hand, if we observe the shape of the two ellipses, we can see that there are no visible differences: one seems to be a scaled version of the other. This characteristic is captured by another geometric parameter of the ellipse (and of conic sections in general): the eccentricity. The eccentricity of the ellipse (commonly denoted as either e or ε) is defined as follows:

$$ \varepsilon = \sqrt{1 - \frac{b^2}{a^2}} $$

where a and b are the major and minor semiaxes. Thus, for an ellipse it assumes values in the range 0 ≤ ε < 1, with ε = 0 for a circle. This quantity is independent of the size of the ellipse, and acts as a scaling factor between the two semiaxes, in such a way that we can write one semiaxis as a function of the other: $b = a \sqrt{1 - \varepsilon^2}$. In our case, pupil and limbus ellipses have, in practice, the same eccentricity: the differences are in the order of $10^{-2}$. It remains to take into account the orientation ϕ. Also in this case, as for the eccentricity, there are no essential differences: we can assume that pupil and limbus share the same orientation, up to errors in the order of 0.01°.

Fig. 2. Cone of projection of limbus (red) and pupil (blue) circles, shown in perspective view, top view and side view together with the image plane. For the sake of simplicity, the limbus circle is rotated about its center, which lies along the optical axis of the camera. The axis of rotation is vertical, which shrinks the horizontal radius on the image plane. On the image plane, the center of the limbus ellipse, highlighted by the major and minor semiaxes, is evidently different from the projection of the actual center of the limbus circle, that is the center of the image plane, which is emphasized by the projection of the circle radii (gray lines).
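The statements about shared eccentricity and orientation, and about non-concentric centers, can be checked numerically with a small simulation. The sketch below projects two coaxial 3D circles with a pinhole camera of unit focal length and compares the resulting ellipses. The distance of the eye (50 mm), the rotation (30° about the vertical axis), the radii (6 mm and 2 mm) and the 0.5 mm offset of the pupil plane are illustrative assumptions, not values taken from the chapter, and the function names are hypothetical.

```python
import numpy as np

def project_circle(center, normal, radius, n=90):
    """Pinhole projection (focal length 1) of a 3D circle: n x 2 image points."""
    normal = normal / np.linalg.norm(normal)
    # Two orthonormal vectors spanning the plane of the circle
    # (assumes the normal is not parallel to the y axis).
    u = np.cross(normal, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    t = np.linspace(0, 2 * np.pi, n, endpoint=False)
    P = center + radius * (np.outer(np.cos(t), u) + np.outer(np.sin(t), v))
    return P[:, :2] / P[:, 2:3]

def ellipse_shape(pts):
    """Center, eccentricity and major-axis angle of the conic through pts.
    The points are assumed to lie exactly on a conic (noise-free case)."""
    x, y = pts[:, 0], pts[:, 1]
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    z = np.linalg.svd(D)[2][-1]           # null vector of the design matrix
    A = np.array([[z[0], z[1] / 2], [z[1] / 2, z[2]]])
    center = np.linalg.solve(2 * A, -z[3:5])
    lam, vec = np.linalg.eigh(A)
    lam = np.abs(lam)                     # squared semiaxes scale as 1 / |lambda|
    i = np.argmin(lam)                    # smallest |lambda| -> major axis
    ecc = np.sqrt(1 - lam[i] / lam[1 - i])
    phi = np.arctan2(vec[1, i], vec[0, i]) % np.pi
    return center, ecc, phi

def compare_pupil_limbus(theta_deg=30.0, distance=50.0,
                         r_limbus=6.0, r_pupil=2.0, offset=0.5):
    """All default values are hypothetical, chosen only for illustration."""
    th = np.deg2rad(theta_deg)
    n = np.array([np.sin(th), 0.0, -np.cos(th)])   # normal pointing to the camera
    eye = np.array([0.0, 0.0, distance])
    limbus = ellipse_shape(project_circle(eye, n, r_limbus))
    pupil = ellipse_shape(project_circle(eye + offset * n, n, r_pupil))
    return limbus, pupil
```

With these assumed values, the two eccentricities are expected to differ by less than about 0.01 and the orientations to coincide, while the two centers do not.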

Ellipse fitting on the image plane

Pupil ellipse fitting
The ellipse fitting algorithms presented in the literature can be collected into two main groups: voting/clustering methods and optimization methods. To the first group belong methods based on the Hough transform (Leavers, 1992; Wu & Wang, 1993; Yin et al., 1992; Yuen et al., 1989), on RANSAC (Rosin, 1993; Werman & Geyzel, 1995), on Kalman filtering (Porrill, 1990; Rosin & West, 1995), and on fuzzy clustering (Davé & Bhaswan, 1992; Gath & Hoory, 1995). All these methods are robust to occlusion and outliers, but are slow, memory-hungry and not very accurate. In the second group we find methods based on Maximum Likelihood (ML) estimation (Chojnacki et al., 2000; Kanatani & Sugaya, 2007; Leedan & Meer, 2000; Matei & Meer, 2006). These are the most accurate methods, whose solution already achieves the theoretical accuracy limit, the Kanatani-Cramer-Rao (KCR) limit. First introduced by Kanatani (1996; 1998) and then extended by Chernov & Lesort (2004), the KCR limit is for geometric fitting problems (or, as Kanatani wrote, "constraint satisfaction problems") the analogue of the classical Cramer-Rao (CR) limit, traditionally associated with linear/nonlinear regression problems: the KCR limit represents a lower bound on the covariance matrix of the estimate. The problem with these algorithms is that they require iterations for nonlinear optimization, and for large values of noise they often fail to converge. They are computationally complex and they do not provide a unique solution. Together with ML methods, there is another group of algorithms that, with respect to a set of parameters describing the ellipse, minimizes a particular distance measure between the set of points to be fitted and the ellipse. These algorithms, also referred to as "algebraic" methods, are preferred because they are fast and accurate, although they may give non-optimal solutions.
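The best known algebraic method is least squares, also called algebraic distance minimization or direct linear transformation (DLT). As seen in Eq. 1, a general ellipse equation can be represented as a product of vectors:

$$ \mathbf{d}^T \mathbf{z} = [x^2; xy; y^2; x; y; 1]^T [z_1; z_2; z_3; z_4; z_5; z_6] = 0 $$

Given a set of N points to be fitted, the vector d becomes the N × 6 design matrix D:

$$ D = \begin{bmatrix} x_1^2 & x_1 y_1 & y_1^2 & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_N^2 & x_N y_N & y_N^2 & x_N & y_N & 1 \end{bmatrix} $$

Least squares fitting minimizes the sum of the squared distances (or algebraic distances) of the curve to each of the N points:

$$ \arg\min_{\mathbf{z}} \| D \mathbf{z} \|^2 \tag{4} $$

Obviously, Eq. 4 is minimized by the null solution z = 0 if no constraint is imposed. The most cited DLT minimization in the eye tracking literature is that of Fitzgibbon et al. (1996). Here the fitting problem is reformulated as:

$$ \arg\min_{\mathbf{z}} \; \mathbf{z}^T S \mathbf{z} \quad \text{subject to} \quad \mathbf{z}^T C \mathbf{z} = 1 \tag{5} $$

where $S = D^T D$ is the scatter matrix, and C is the 6 × 6 constraint matrix, which encodes the ellipse condition $4 z_1 z_3 - z_2^2 = 1$:

$$ C = \begin{bmatrix} 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} $$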
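The problem is solved by a quadratically constrained least squares minimization. Applying Lagrange multipliers and differentiating, we obtain the system:

$$ S \mathbf{z} = \lambda C \mathbf{z}, \qquad \mathbf{z}^T C \mathbf{z} = 1 \tag{6} $$

solved by using generalized eigenvectors. Halir & Flusser (1998) found some problems related to the Fitzgibbon et al. (1996) algorithm:
• the constraint matrix C is singular
• the scatter matrix S is also close to singular, and it is singular when the points lie exactly on an ellipse
• finding the eigenvectors is an unstable computation and can produce wrong solutions.
Halir & Flusser (1998) proposed a solution to these problems by breaking up the design matrix D into two blocks, the quadratic and the linear components:

$$ D = [\, D_1 \mid D_2 \,], \qquad D_1 = \begin{bmatrix} x_1^2 & x_1 y_1 & y_1^2 \\ \vdots & \vdots & \vdots \\ x_N^2 & x_N y_N & y_N^2 \end{bmatrix}, \qquad D_2 = \begin{bmatrix} x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots \\ x_N & y_N & 1 \end{bmatrix} $$

Next, the scatter matrix S becomes:

$$ S = \begin{bmatrix} S_1 & S_2 \\ S_2^T & S_3 \end{bmatrix}, \qquad S_1 = D_1^T D_1, \quad S_2 = D_1^T D_2, \quad S_3 = D_2^T D_2 $$

The constraint matrix C becomes:

$$ C = \begin{bmatrix} C_1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad C_1 = \begin{bmatrix} 0 & 0 & 2 \\ 0 & -1 & 0 \\ 2 & 0 & 0 \end{bmatrix} $$

The Grassmannian coordinate vector z of the conic is split into two blocks (written in bold to distinguish them from the scalar coefficients):

$$ \mathbf{z} = \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix}, \qquad \mathbf{z}_1 = [z_1; z_2; z_3], \quad \mathbf{z}_2 = [z_4; z_5; z_6] $$

Similarly, the eigensystem problem presented in Eq. 6 can be divided into two equations:

$$ S_1 \mathbf{z}_1 + S_2 \mathbf{z}_2 = \lambda C_1 \mathbf{z}_1 \tag{7} $$

$$ S_2^T \mathbf{z}_1 + S_3 \mathbf{z}_2 = 0 $$

from which we have $\mathbf{z}_2 = -S_3^{-1} S_2^T \mathbf{z}_1$ and, substituting in Eq. 7, we obtain:

$$ M \mathbf{z}_1 = \lambda \mathbf{z}_1, \qquad M = C_1^{-1} \left( S_1 - S_2 S_3^{-1} S_2^T \right) \tag{8} $$

It was shown that there is only one elliptical solution $\mathbf{z}_1^e$ of the eigensystem problem in Eq. 8, i.e. only one eigenvector of M that satisfies $4 z_1 z_3 - z_2^2 > 0$. With the sign convention adopted here for $C_1$ it corresponds to the unique positive eigenvalue of M (equivalently, to the unique negative eigenvalue if $C_1$ is taken with the opposite sign). Thus, the fitted ellipse will be described by the vector:

$$ \mathbf{z}^e = \begin{bmatrix} \mathbf{z}_1^e \\ -S_3^{-1} S_2^T \mathbf{z}_1^e \end{bmatrix} \tag{9} $$

or equivalently, by the matrix associated with the quadratic form:

$$ Z^e = \begin{bmatrix} z_1^e & z_2^e/2 & z_4^e/2 \\ z_2^e/2 & z_3^e & z_5^e/2 \\ z_4^e/2 & z_5^e/2 & z_6^e \end{bmatrix} $$

Recently, Harker et al. (2008) increased the numerical stability by introducing a translation and a scaling factor on the data, to yield so-called mean-free coordinates, and by improving the matrix partitioning. Once the quadratic form coefficients of the ellipse have been obtained, it only remains to recover the geometric parameters as seen in Sec. 3: the center coordinates $(x_c, y_c)$, the major and minor semiaxes (a, b), and the angle of rotation from the x-axis to the major axis of the ellipse (ϕ).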
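The partitioned direct fit described above can be sketched as follows. This is a minimal NumPy sketch of the Halir & Flusser (1998) scheme, written with the constraint block C_1 = [[0, 0, 2], [0, -1, 0], [2, 0, 0]], which is an assumption about the sign convention; the elliptical solution is selected through the condition 4 z1 z3 − z2² > 0, which does not depend on that convention. The function name is hypothetical, and the data normalization of Harker et al. (2008) is not included.

```python
import numpy as np

def fit_ellipse_direct(x, y):
    """Direct least squares ellipse fit; returns the conic vector z = (z1..z6)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    D1 = np.column_stack([x * x, x * y, y * y])      # quadratic part
    D2 = np.column_stack([x, y, np.ones_like(x)])    # linear part
    S1, S2, S3 = D1.T @ D1, D1.T @ D2, D2.T @ D2
    # Linear block expressed through the quadratic one: z_2 = T z_1.
    T = -np.linalg.solve(S3, S2.T)
    C1 = np.array([[0.0, 0.0, 2.0], [0.0, -1.0, 0.0], [2.0, 0.0, 0.0]])
    # Reduced eigenproblem M z_1 = lambda z_1.
    M = np.linalg.solve(C1, S1 + S2 @ T)
    _, vec = np.linalg.eig(M)
    vec = np.real(vec)
    # Keep the eigenvector that satisfies the ellipse condition.
    cond = 4 * vec[0] * vec[2] - vec[1] ** 2
    idx = np.flatnonzero(cond > 0)
    if idx.size == 0:
        raise ValueError("no elliptical solution found")
    z1 = vec[:, idx[0]]
    return np.concatenate([z1, T @ z1])
```

The returned vector is defined up to scale. Its center can be read off by solving the 2 × 2 linear system given by the partial derivatives of the conic, and the remaining geometric parameters follow as in Sec. 3.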

Iris ellipse fitting
Once we have fitted the pupil ellipse in the image plane we can, as suggested at the end of Sec. 3, exploit the information obtained from this fitting: the geometric parameters of the pupil ellipse. Let us see how we can use the orientation and eccentricity information derived from the pupil. Knowing the orientation ϕ, we can transform the $(x_i, y_i)$ data point pairs through the matrix $R_\varphi$, obtaining:

$$ \begin{bmatrix} x'_i \\ y'_i \end{bmatrix} = \begin{bmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} $$

This transformation allows us to write the ellipse, in the new reference frame, without the xy term of the quadratic form. Thus, if we write the expression of a generic ellipse in the (x′, y′) reference frame, centered in $(x'_c, y'_c)$, with the major semiaxis oriented along the x′ axis, we have:

$$ \frac{(x' - x'_c)^2}{a^2} + \frac{(y' - y'_c)^2}{b^2} = 1 \quad \Longleftrightarrow \quad (1 - \varepsilon^2)(x' - x'_c)^2 + (y' - y'_c)^2 = b^2 $$

where the second form follows from $b^2 = a^2 (1 - \varepsilon^2)$. If we set $x'' = x' \sqrt{1 - \varepsilon^2}$ and $y'' = y'$, reordering the terms in x″ and y″ we have:

$$ z''_1 (x''^2 + y''^2) + z''_4 x'' + z''_5 y'' + z''_6 = 0 $$

that is the equation of a circle translated from the origin. This means that the fitting of an ellipse in (x, y) becomes the fitting of a circle in (x″, y″). The four-parameter vector $\mathbf{z}'' = [z''_1; z''_4; z''_5; z''_6]$ of the circle can be obtained using the "Hyperaccurate" fitting method explained by Al-Sharadqah & Chernov (2009). The approach is similar to that of Fitzgibbon et al. (1996). The objective function to be minimized is again the algebraic distance $\| D \mathbf{z}'' \|^2$, in which the design matrix D becomes an N × 4 matrix:

$$ D = \begin{bmatrix} x''^2_1 + y''^2_1 & x''_1 & y''_1 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x''^2_N + y''^2_N & x''_N & y''_N & 1 \end{bmatrix} $$

subject to a particular constraint expressed by the matrix C. This leads to the same generalized eigenvalue problem seen in Eq. 6, which is solved by choosing the solution with the smallest non-negative eigenvalue.
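The matrix C takes into account, with a linear combination, two constraints, introduced by Pratt and Taubin (Pratt, 1987; Taubin, 1991):

$$ \text{Pratt:} \quad z''^2_4 + z''^2_5 - 4 z''_1 z''_6 = 1, \qquad \text{Taubin:} \quad \frac{1}{N} \| \nabla (D \mathbf{z}'') \|^2 = 1 $$

The Pratt constraint, as seen in the Fitzgibbon method, can be put in quadratic form by constructing the matrix $C_P$ as follows:

$$ \mathbf{z}''^T C_P \mathbf{z}'' = 1, \qquad C_P = \begin{bmatrix} 0 & 0 & 0 & -2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -2 & 0 & 0 & 0 \end{bmatrix} $$

For the Taubin constraint, tougher mathematics is needed to make the quadratic form explicit:

$$ \frac{1}{N} \| \nabla (D \mathbf{z}'') \|^2 = \frac{1}{N} \left( \| D_x \mathbf{z}'' \|^2 + \| D_y \mathbf{z}'' \|^2 \right) = \mathbf{z}''^T \, \frac{D_x^T D_x + D_y^T D_y}{N} \, \mathbf{z}'' = \mathbf{z}''^T C_{Tb} \mathbf{z}'' $$

where $D_x$ and $D_y$ are, respectively, the partial derivatives of the design matrix with respect to x″ and y″. Thus $C_{Tb}$ is

$$ C_{Tb} = \begin{bmatrix} 4 \, \overline{(x''^2 + y''^2)} & 2 \overline{x''} & 2 \overline{y''} & 0 \\ 2 \overline{x''} & 1 & 0 & 0 \\ 2 \overline{y''} & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} $$

where $\overline{x}$ represents the standard sample mean notation, $\overline{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$. Al-Sharadqah & Chernov (2009) verified that expressing the constraint matrix C as follows:

$$ C = 2 C_{Tb} - C_P $$

produces an algebraic circle fit with essential bias equal to zero. For this reason they called it hyperaccurate. Once we have obtained the solution z″, written in matrix form as

$$ Z'' = \begin{bmatrix} z''_1 & 0 & z''_4/2 \\ 0 & z''_1 & z''_5/2 \\ z''_4/2 & z''_5/2 & z''_6 \end{bmatrix} $$

we must scale it to the (x′, y′) reference frame with the scaling matrix $T_{ecc}$:

$$ Z' = T_{ecc}^T Z'' T_{ecc}, \qquad T_{ecc} = \begin{bmatrix} \sqrt{1 - \varepsilon^2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

and then rotate Z′ back to the original frame (x, y), applying the rotation matrix $R_\varphi$:

$$ Z = R_\varphi^T Z' R_\varphi = \begin{bmatrix} z_1 & z_2/2 & z_4/2 \\ z_2/2 & z_3 & z_5/2 \\ z_4/2 & z_5/2 & z_6 \end{bmatrix} $$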
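The constrained fit, with the orientation and eccentricity taken from the pupil, can be sketched as below. The sketch assumes the constraint matrix C = 2 C_Tb − C_P and the rotation convention x′ = R_ϕ x used in this section. The eigenvector is selected as the one with positive z″ᵀ C z″ and the smallest ratio z″ᵀ S z″ / z″ᵀ C z″, which is one way to pick the smallest non-negative eigenvalue robustly. Function names are hypothetical and this is not the reference implementation of Al-Sharadqah & Chernov (2009).

```python
import numpy as np

def fit_circle_hyper(x, y):
    """Algebraic circle fit z1*(x^2+y^2) + z4*x + z5*y + z6 = 0 under the
    constraint matrix C = 2*C_Tb - C_P. Returns (z1, z4, z5, z6)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r2 = x * x + y * y
    D = np.column_stack([r2, x, y, np.ones_like(x)])
    S = D.T @ D / len(x)
    C_tb = np.array([[4 * r2.mean(), 2 * x.mean(), 2 * y.mean(), 0.0],
                     [2 * x.mean(), 1.0, 0.0, 0.0],
                     [2 * y.mean(), 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 0.0]])
    C_p = np.array([[0.0, 0.0, 0.0, -2.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [-2.0, 0.0, 0.0, 0.0]])
    C = 2 * C_tb - C_p
    # Generalized problem S z = eta C z, solved as C^{-1} S z = eta z.
    _, vec = np.linalg.eig(np.linalg.solve(C, S))
    vec = np.real(vec)
    q = np.einsum('ij,ij->j', vec, C @ vec)    # z^T C z for each eigenvector
    r = np.einsum('ij,ij->j', vec, S @ vec)    # z^T S z for each eigenvector
    eta = np.full(q.shape, np.inf)
    eta[q > 0] = r[q > 0] / q[q > 0]
    return vec[:, np.argmin(eta)]

def fit_ellipse_constrained(x, y, phi, ecc):
    """Ellipse fit with known orientation phi and eccentricity ecc.
    Returns the 3x3 matrix Z of the quadratic form in the (x, y) frame."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c, s = np.cos(phi), np.sin(phi)
    k = np.sqrt(1 - ecc ** 2)
    xr, yr = c * x + s * y, -s * x + c * y     # rotate: x' = R_phi x
    z1, z4, z5, z6 = fit_circle_hyper(k * xr, yr)   # scale: x'' = k x'
    Zpp = np.array([[z1, 0.0, z4 / 2],
                    [0.0, z1, z5 / 2],
                    [z4 / 2, z5 / 2, z6]])
    T_ecc = np.diag([k, 1.0, 1.0])
    R = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
    Zp = T_ecc.T @ Zpp @ T_ecc                 # back to the (x', y') frame
    return R.T @ Zp @ R                        # back to the (x, y) frame
```

In the intended use, `phi` and `ecc` would come from the pupil ellipse and `x`, `y` would be candidate limbus points, so that only the center and the size of the iris ellipse remain to be estimated.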

Iris and pupil segmentation
A proper segmentation of the iris area is essential in applications such as iris recognition and eye tracking. In the first case it defines the image region used for feature extraction and recognition, while in the second case it is instrumental for detecting the size and shape of the limbus, and consequently for an effective estimation of the eye rotation. The first step towards this purpose is to develop a system that is able to obtain images of the eye that are stable under the different lighting conditions of the environment.

Pupil detection
• Reflex removal
In order to correctly find both the center of the pupil and the edge between the pupil and the iris, it is fundamental to effectively remove the light reflections on the corneal surface. Working with IR or near-IR light, the reflections on the cornea are considerably reduced, because the light in the visible power spectrum (artificial lights, computer monitor, etc.) is removed by the IR cut filter. The only sources of reflections are the natural light and the light from the illuminator, but when working in an indoor environment the first is not present. The illuminators, placed at a distance of ≈ 10 cm from the corneal surface, produce reflections of circular shape that can be removed with a morphological opening. This operation is performed on the IR image I, and it is composed of an erosion followed by a dilation:

$$ I_{OP} = I \circ d = (I \ominus d) \oplus d $$

where d is the structuring element, which is the same for both operations, i.e. a disk of size close to the diameter of the reflections. The operation, usually used to remove small islands and thin filaments of object pixels, with this structuring element also has the effect of removing all the reflections smaller than the disk. The position of the reflections is found by thresholding the image resulting from the subtraction of the opened image $I_{OP}$ from the original image I. In order not to flatten the image and to preserve the information, the new image $I_r$ is equal to the original one, except for the pixels above the threshold, which are substituted with a low-passed version of the original image. Once the corneal reflection regions are correctly located on the image, they are ignored in the next steps of the algorithm.
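The reflex removal step can be sketched with plain NumPy as follows. The grey-level erosion and dilation use a flat disk, the reflections are located by thresholding I − I_OP, and the marked pixels are replaced by a box-filtered copy of the original image. The box filter is only one possible low-pass filter, and the function names, the disk radius and the threshold are illustrative assumptions rather than the values used by the authors.

```python
import numpy as np

def _disk_offsets(radius):
    """Pixel offsets (dy, dx) of a flat disk structuring element."""
    r = int(radius)
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= r * r]

def _morph(img, offsets, reduce_fn, pad_value):
    """Grey-level erosion (np.minimum) or dilation (np.maximum)."""
    r = max(max(abs(dy), abs(dx)) for dy, dx in offsets)
    h, w = img.shape
    padded = np.pad(img, r, mode='constant', constant_values=pad_value)
    out = padded[r:r + h, r:r + w].copy()
    for dy, dx in offsets:
        out = reduce_fn(out, padded[r + dy:r + dy + h, r + dx:r + dx + w])
    return out

def _box_blur(img, r):
    """Simple low-pass filter: mean over a (2r+1) x (2r+1) window."""
    h, w = img.shape
    padded = np.pad(img, r, mode='edge')
    acc = np.zeros_like(img)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (2 * r + 1) ** 2

def remove_reflections(img, radius, thresh):
    """Return (I_r, mask): the cleaned image and the reflection mask."""
    img = np.asarray(img, dtype=float)
    offsets = _disk_offsets(radius)
    eroded = _morph(img, offsets, np.minimum, np.inf)
    opened = _morph(eroded, offsets, np.maximum, -np.inf)   # I_OP
    mask = (img - opened) > thresh
    cleaned = img.copy()
    cleaned[mask] = _box_blur(img, int(radius))[mask]
    return cleaned, mask
```

The returned mask can also be used to ignore the reflection regions in the following steps, as described above.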

• Detection of the pupil center
The second step in the processing of the eye image is to roughly locate the center of the pupil, so as to properly center the domain for the pupil edge detection. The image $I_r$ is transformed into a binary image where the darkest pixels, defined by a threshold at 10% of the image maximum, are set to 0, while the others are set to 1. In this way the points belonging to the pupil are segmented, since they are the dark part of the image. This part of the image may also contain points belonging to the eyelashes, to the glasses frame and to other elements that are as dark as the pupil (see Fig. 5). From the binary image we calculate the chamfer distance, and take the pixel farthest from any white pixel as the "darkest" one. The dark points that are not due to the pupil are usually few in number (as for eyelashes) or not far from the white ones (as for the glasses frame). On the other hand, the pupil area is round and quite thick, so that the darkest pixel is usually found inside the pupil, approximately at its center $C = [x_c, y_c]$. From this perspective, a diffuse and uniform illumination helps to isolate the correct points and thus to find the correct pupil center.
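A minimal sketch of this step is given below. It assumes a classic two-pass 3-4 chamfer mask (the text does not state which chamfer weights were used), and the function names are hypothetical.

```python
import numpy as np

def chamfer_distance(binary):
    """Two-pass 3-4 chamfer distance of every 0-pixel to the nearest 1-pixel."""
    h, w = binary.shape
    d = np.where(binary > 0, 0, 10**6).astype(np.int64)
    for y in range(h):                      # forward pass (top-left to bottom-right)
        for x in range(w):
            best = d[y, x]
            if x > 0:
                best = min(best, d[y, x - 1] + 3)
            if y > 0:
                best = min(best, d[y - 1, x] + 3)
                if x > 0:
                    best = min(best, d[y - 1, x - 1] + 4)
                if x < w - 1:
                    best = min(best, d[y - 1, x + 1] + 4)
            d[y, x] = best
    for y in range(h - 1, -1, -1):          # backward pass
        for x in range(w - 1, -1, -1):
            best = d[y, x]
            if x < w - 1:
                best = min(best, d[y, x + 1] + 3)
            if y < h - 1:
                best = min(best, d[y + 1, x] + 3)
                if x < w - 1:
                    best = min(best, d[y + 1, x + 1] + 4)
                if x > 0:
                    best = min(best, d[y + 1, x - 1] + 4)
            d[y, x] = best
    return d / 3.0                          # approximate distance in pixels

def pupil_center(img, frac=0.10):
    """Rough pupil center: dark pixel farthest from any bright pixel."""
    binary = (img > frac * img.max()).astype(np.uint8)  # dark -> 0, others -> 1
    dist = chamfer_distance(binary)
    yc, xc = np.unravel_index(np.argmax(dist), dist.shape)
    return (int(xc), int(yc)), dist
```

Thin dark structures such as eyelashes receive small distance values, so they rarely compete with the thick pupil blob for the maximum.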

• Detection of the edge between pupil and iris
Starting from a plausible pupil center, the capability to correctly locate the pupil edge depends on the domain that we define for the search. From the chamfer distance it is possible to evaluate the maximum radius $R_{max}$ within which the pupil is contained. In fact, the dark region is composed of a large number of pupil points centered around $(x_c, y_c)$, with some points belonging to eyelashes and other elements spread over the image. From this perspective, $R_{max}$ is computed as the first minimum of the histogram of the chamfer distance. Once the search domain is defined, the edge between the pupil and the iris can be located by computing the derivative of the image intensity along a set of rays originating from the center of the pupil. Each ray $r$ can be written with the parametric equation:

$$r(\rho, t) = \begin{bmatrix} x_c + \rho\cos t \\ y_c + \rho\sin t \end{bmatrix}$$

where $t$ varies between 0 and $2\pi$, and $\rho$ between 0 and $R_{max}$. The directional derivative along a particular direction $t = t^*$ on a ray $r(\rho, t^*)$ is:

$$D_\rho I_r = \frac{d\, I_r(r(\rho, t^*))}{d\rho}$$

For each ray, the edge is identified as the maximum of the derivative. Since it considers the intensity value of the image along the ray, this method can be sensitive to noise and reflections, finding false maxima and detecting false edge points. To reduce the sensitivity to noise, instead of computing the derivative along the direction of the rays, it is possible to compute the spatial gradient of the intensity, obtaining more stable and effective information on the pupil edge. The gradient is computed on the smoothed image $\tilde{I} = G * I_r$, where $*$ is the convolution product between $G$ and $I_r$, and $G$ is the 2D Gaussian kernel, with standard deviation $\sigma$, used to smooth the image:

$$\nabla \tilde{I} = \nabla (G * I_r), \qquad G(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$

Exploiting the properties of the gradient, Eq. 10 can be written as $\nabla G * I_r$, which means that the spatial gradient is computed through the gradient of a Gaussian kernel.
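As an illustration of the ray-based search, the sketch below samples the image along rays leaving an assumed center and takes the largest intensity increase on each ray as the edge. Nearest-neighbour sampling, unit radial steps, and the number of rays are assumptions made for brevity; the function names are hypothetical.

```python
import numpy as np

def sample_rays(img, center, r_max, n_rays=72):
    """Sample img along rays r(rho, t) = (xc + rho cos t, yc + rho sin t)."""
    xc, yc = center
    rho = np.arange(0, int(r_max) + 1, dtype=float)          # unit radial steps
    t = np.linspace(0.0, 2.0 * np.pi, n_rays, endpoint=False)
    x = xc + rho[:, None] * np.cos(t)[None, :]
    y = yc + rho[:, None] * np.sin(t)[None, :]
    xi = np.clip(np.rint(x).astype(int), 0, img.shape[1] - 1)
    yi = np.clip(np.rint(y).astype(int), 0, img.shape[0] - 1)
    return rho, t, img[yi, xi].astype(float)                 # shape (n_rho, n_rays)

def edge_points(img, center, r_max, n_rays=72):
    """Edge on each ray = position of the maximum radial derivative."""
    rho, t, profiles = sample_rays(img, center, r_max, n_rays)
    d_rho = np.diff(profiles, axis=0)          # finite-difference derivative
    k = np.argmax(d_rho, axis=0)               # dark-to-bright transition
    rho_edge = rho[k] + 0.5                    # midpoint between the two samples
    xy = np.column_stack([center[0] + rho_edge * np.cos(t),
                          center[1] + rho_edge * np.sin(t)])
    return rho_edge, xy
```

The array returned by `sample_rays` is also a coarse polar warp of the image, which is why a circular edge appears in it as a row of nearly constant radius.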
Since the feature we want to track with the spatial gradient is a curved edge, the ideal filter to locate it is not like those obtained by $\nabla G$, but a filter with the same curvature as the edge. Moreover, since the exact radius of the circle is unknown, and its curvature depends on it, the size of the filter also changes with the image location. Following these considerations, it is possible to design a set of space-variant filters that take into account both the curvature and the orientation of the searched feature at each image location, at the cost of drastically increasing the computational load. The solution adopted to obtain a space-variant filtering using filters of constant shape and size is to transform the image from a Cartesian to a polar domain. The polar transform of $I_r$, with origin in $(x_c, y_c)$, is:

$$I_w(\rho, t) = F\{I_r(x, y); (x_c, y_c)\}$$

where $I_w$ is the warped image in the polar domain and $F$ is defined by the mapping from $(x, y)$ to $(\rho, t)$ such that:

$$\rho = \sqrt{(x - x_c)^2 + (y - y_c)^2}, \qquad t = \arctan\frac{y - y_c}{x - x_c}$$

The transform $F$ is invertible, and defined by the mapping in Eq. 5.1. In this way the searched feature is transformed from a circle to a straight horizontal line, as in the normalization of the iris image (Ferreira et al., 2009; Rahib & Koray, 2009), and can be effectively detected considering only the first component of the spatial gradient (see Fig. 3a), i.e. $(\nabla G)_\rho * I_w = \partial I_w / \partial\rho$. Nevertheless, as introduced in Sec. 3, the shape of the pupil edge is a circle only when the plane containing the pupil's edge is perpendicular to the optical axis of the camera; otherwise its projection on the image plane is an ellipse. In this case, a polar domain is not ideal to represent the feature, because the edge is not a straight line (see Fig. 3b). In order to represent the image in a domain where the feature is a straight line, i.e. where it can be located with a single component of the spatial gradient, we developed a transformation from the Cartesian to an "elliptical" domain:

$$\rho = \sqrt{x'^2 + \frac{y'^2}{1 - e^2}}, \qquad t = \arctan\frac{y'}{x'\sqrt{1 - e^2}}$$

where $(x', y')$ is the $(x, y)$ domain, rotated by an angle $\phi$ in such a way that:

$$x' = (x - x_c)\cos\phi + (y - y_c)\sin\phi, \qquad y' = -(x - x_c)\sin\phi + (y - y_c)\cos\phi$$

and from the elliptical domain to the Cartesian one:

$$x = x_c + a\cos t\cos\phi - b\sin t\sin\phi, \qquad y = y_c + a\cos t\sin\phi + b\sin t\cos\phi$$

where $\phi$ is the orientation of the ellipse, $a = \rho$ is the major semi-axis, and $b = a\sqrt{1 - e^2}$ is the minor one.
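The effect of the "elliptical" domain can be checked numerically. The sketch below assumes the standard rotated-ellipse parametrization implied by the definitions $a = \rho$ and $b = a\sqrt{1 - e^2}$; the function names are hypothetical. Points on an ellipse map to a constant $\rho$ in the elliptical domain, whereas in the polar domain (the special case $e = 0$, $\phi = 0$) their radius oscillates between $b$ and $a$.

```python
import numpy as np

def to_elliptical(x, y, center, phi, e):
    """Cartesian (x, y) -> elliptical (rho, t) for orientation phi, eccentricity e."""
    xc, yc = center
    dx, dy = x - xc, y - yc
    xp = dx * np.cos(phi) + dy * np.sin(phi)      # rotate into the ellipse frame
    yp = -dx * np.sin(phi) + dy * np.cos(phi)
    k = np.sqrt(1.0 - e**2)                        # b / a
    return np.hypot(xp, yp / k), np.arctan2(yp / k, xp)

def from_elliptical(rho, t, center, phi, e):
    """Elliptical (rho, t) -> Cartesian (x, y), with a = rho and b = a*sqrt(1-e^2)."""
    xc, yc = center
    a = rho
    b = a * np.sqrt(1.0 - e**2)
    xp, yp = a * np.cos(t), b * np.sin(t)
    x = xc + xp * np.cos(phi) - yp * np.sin(phi)
    y = yc + xp * np.sin(phi) + yp * np.cos(phi)
    return x, y

# An ellipse with a = 20, e = 0.6 (so b = 16), rotated by 0.4 rad
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
x, y = from_elliptical(20.0, t, (40.0, 30.0), 0.4, 0.6)
rho_ell, _ = to_elliptical(x, y, (40.0, 30.0), 0.4, 0.6)   # constant: straight line
rho_pol, _ = to_elliptical(x, y, (40.0, 30.0), 0.0, 0.0)   # polar: wavy line
```

Warping an image then amounts to evaluating `from_elliptical` on a regular $(\rho, t)$ grid and sampling the image at the resulting coordinates.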

• Pupil fitting
Since at this step no information is known about the orientation $\phi$ and the eccentricity $e$ of the ellipse that describes the edge of the pupil, the points found are used to compute the ellipse parameters without any constraint, as explained in Sec. 4.1 (Eq. 8-9). At the first step the two axes are initialized to $R_{max}$ and $\phi$ to zero. Once the maxima have been located in the warped image $I_w$, i.e. in the $(\rho, t)$ domain, Eq. 11 can be used to transform the points into the Cartesian coordinate system, in order to fit the ellipse equation of the pupil. To exploit the "elliptical" transformation and obtain a more precise estimate of the ellipse, the fitting is repeated in a loop where at each step the new domain is computed using the $a$, $b$ and $\phi$ obtained from the fit of the previous step.
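An unconstrained ellipse fit of the kind referred to here can be sketched as an algebraic least-squares fit of a general conic. This is a generic technique used for illustration, not necessarily the formulation of Sec. 4.1, and `fit_ellipse` is a hypothetical name. In the iterative scheme, the returned parameters would define the elliptical domain of the next pass.

```python
import numpy as np

def fit_ellipse(x, y):
    """Algebraic least-squares fit of A x^2 + B xy + C y^2 + D x + E y + F = 0.

    Returns (xc, yc, a, b, phi): center, major/minor semi-axes and the
    orientation of the major axis in [0, pi).
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    design = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(design)
    coeffs = vt[-1]                          # null-space direction = conic coefficients
    if coeffs[0] + coeffs[2] < 0:            # fix the sign so the quadratic part is positive
        coeffs = -coeffs
    A, B, C, D, E, F = coeffs
    M = np.array([[A, B / 2.0], [B / 2.0, C]])
    center = np.linalg.solve(2.0 * M, [-D, -E])
    F0 = F + 0.5 * (D * center[0] + E * center[1])   # conic value at the center
    mu, V = np.linalg.eigh(M)                # ascending eigenvalues
    a, b = np.sqrt(-F0 / mu)                 # smaller eigenvalue -> major semi-axis
    phi = np.arctan2(V[1, 0], V[0, 0]) % np.pi
    return center[0], center[1], a, b, phi
```

The eccentricity follows as $e = \sqrt{1 - b^2/a^2}$.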

Iris detection
Analyzing the images both in the Cartesian domain $I_r$ and in the warped one $I_w$ (see Fig. 3), the large intensity change between the pupil and the iris points is evident. With such a variation, the localization of the pupil edge is precise and stable even if a polar domain is used. The detection of the limbus is much more complicated, for different reasons: first, the edge between iris and sclera is wider and less defined than the edge between pupil and iris; second, the pupil edge is almost never occluded, except during blinks, while the limbus is almost always occluded by the eyelids, even for small gaze angles. In order to fit the correct ellipse on the limbus, it is mandatory to distinguish the points between iris and sclera from the points between iris and eyelids.
• Iris edge detection
Following the same procedure used to locate the points of the pupil edge, the image with the reflections removed, $I_r$, is warped with an elliptical transformation using Eq. 11. Unlike for the pupil, the domain is designed with a guess of the parameters because, as presented in Sec. 3, the perspective geometry allows the same $\phi$ and $e$ found for the pupil to be used. The only unknown parameter is $\rho$, which at the first step of the iteration is defined to be within $[R_{pupil}, 3R_{pupil}]$. In this way it is ensured that the limbus is inside the search area, even in the case of a small pupil. As for the pupil edge, the ellipse equation that describes the limbus is obtained from the maxima of the gradient computed on the warped image. As explained in Sec. 4.2, the fitting is limited to the search for $(x_c, y_c)$ and $a$, because $\phi$ and $e$ are those of the pupil. To prevent deformations of the ellipse due to false maxima found in correspondence with eyelashes or eyebrows, we compute the Euclidean distance between the maxima and the fitted ellipse. The fitting is then repeated without the points that are more than one standard deviation away from the ellipse. To obtain a more precise identification of the iris edge, no matter whether the points belong to the limbus or to the transition between the iris and the eyelids, the search is repeated in a loop where the parameters used to define the domain at the current step are those estimated at the previous one. Unlike in the pupil search, the range of the parameter $\rho$ is refined step by step, halving it symmetrically with respect to the fitted ellipse.
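One way to realize a fit restricted to $(x_c, y_c)$ and $a$ is sketched below. With $\phi$ and $e$ fixed, the ellipse equation becomes linear in three unknowns and can be solved by ordinary least squares. This formulation is an assumption made for illustration and may differ from that of Sec. 4.2; the residual used here is the radial offset in the elliptical domain, taken as a stand-in for the Euclidean distance, and the function names are hypothetical.

```python
import numpy as np

def fit_center_and_size(x, y, phi, e):
    """Fit (xc, yc) and major semi-axis a of an ellipse with fixed phi and e."""
    c, s = np.cos(phi), np.sin(phi)
    u, v = x * c + y * s, -x * s + y * c          # coordinates in the ellipse frame
    k = 1.0 - e**2
    # k(u-u0)^2 + (v-v0)^2 = k a^2  <=>  k u^2 + v^2 = alpha u + beta v + gamma
    A = np.column_stack([u, v, np.ones_like(u)])
    alpha, beta, gamma = np.linalg.lstsq(A, k * u**2 + v**2, rcond=None)[0]
    u0, v0 = alpha / (2.0 * k), beta / 2.0
    a = np.sqrt((gamma + k * u0**2 + v0**2) / k)
    rho = np.hypot(u - u0, (v - v0) / np.sqrt(k))  # elliptical radius of each point
    center = (u0 * c - v0 * s, u0 * s + v0 * c)    # back to image coordinates
    return center, a, rho - a                      # residuals: offset from the ellipse

def robust_fit(x, y, phi, e):
    """Refit after discarding points farther than one standard deviation."""
    _, _, res = fit_center_and_size(x, y, phi, e)
    keep = np.abs(res) <= max(np.std(res), 1e-9)   # floor avoids rejecting clean data
    center, a, _ = fit_center_and_size(x[keep], y[keep], phi, e)
    return (center, a), keep
```

In the iterative search, the fitted $a$ would become the center of the next, narrower $\rho$ interval.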

• Removal of eyelid points
Once the points of the iris edge are found, in order to correctly obtain the limbus it is necessary to remove the maxima belonging to the eyelids. Starting from the consideration that the upper and lower eyelid borders can be described by parabola segments (Daugman, 1993; Stiefelhagen & Yang, 1997), it is possible to obtain the parameters that describe the parabolas. With the specific purpose of removing the eyelid points, and without requiring a precise localization of the eyelids, it is possible to make some assumptions.
First, the parabolas pass through the eyelid corners, which move only slightly with the gaze and with the aperture of the eyelids. If the camera is fixed, as in our system, those two points can be considered fixed and identified during the calibration procedure. Second, the maxima located at the same abscissa on the Cartesian image with respect to the center of the iris can be considered as belonging to the upper and lower eyelids. The $(x_i, y_i)$ pairs of these points can be used in a least-squares minimization:

$$\min_z \| y - D z \|^2$$

where $D$ is the $N \times 3$ design matrix:

$$D = \begin{bmatrix} x_1^2 & x_1 & 1 \\ \vdots & \vdots & \vdots \\ x_N^2 & x_N & 1 \end{bmatrix}$$

$z = [a; b; c]$ is the column vector of the parameters that describe the parabola's equation, and $y = [y_1; \ldots; y_N]$ is the ordinate column vector. The solution can be obtained by solving the linear system given by the partial derivative of Eq. 12 with respect to $z$:

$$z = (D^T D)^{-1} D^T y$$

This first guess for the parabolas does not provide a precise fitting of the eyelids, but a very effective discrimination of the limbus maxima. In fact, it is possible to remove the points that have a positive ordinate with respect to the upper parabola, and those that have a negative ordinate with respect to the lower parabola, because they probably belong to the eyelids (see Fig. 6, white points). The remaining points can be considered the correct points of the edge between the iris and the sclera (see Fig. 6, red points), and are used for the final fit of the limbus ellipse (see Fig. 6, green line).
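The parabola fit and the subsequent point selection can be sketched as follows. The code assumes an ordinate axis pointing upward (with image rows growing downward the inequalities would be swapped), and the function names are hypothetical.

```python
import numpy as np

def fit_parabola(x, y):
    """Least-squares parabola y = a x^2 + b x + c via the normal equations."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    D = np.column_stack([x**2, x, np.ones_like(x)])   # N x 3 design matrix
    return np.linalg.solve(D.T @ D, D.T @ y)          # z = (D^T D)^-1 D^T y

def parabola(z, x):
    """Evaluate the parabola with parameters z = [a, b, c] at x."""
    return z[0] * x**2 + z[1] * x + z[2]

def limbus_mask(points, z_upper, z_lower):
    """True for points lying between the lower and the upper eyelid parabola."""
    x, y = points[:, 0], points[:, 1]
    return (y <= parabola(z_upper, x)) & (y >= parabola(z_lower, x))
```

The eyelid corners identified during calibration would simply be included among the $(x_i, y_i)$ pairs passed to `fit_parabola` for both eyelids.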

A quantitative evaluation of iris segmentation
In order to evaluate the reliability of the proposed algorithm in a large variety of cases, we performed an extensive test on the CASIA Iris Image Database (CASIA-IrisV1, 2010). After that, the algorithm was tested on images taken with a custom-built acquisition system, designed to obtain images where the eye is centered in the image, with the fewest possible corneal reflections, and taken in an indoor environment with artificial and diffuse light, so as to have an almost constant pupil size.

CASIA Iris Database
The CASIA Iris Image Database is a high-quality image database created to develop and test iris segmentation and recognition algorithms. In particular, the subset CASIA-Iris-Thousand contains 20,000 iris images taken in IR light, from 1,000 different subjects. The main sources of variation in the subset are eyeglasses and specular reflections.
Since the eye position in the image changes from subject to subject, it is not possible to define the eyelid corner positions used to fit the eyelid parabolas. The algorithm was slightly modified to make it work with CASIA-Iris-Thousand, positioning the "fixed" points at the same ordinate as the pupil center, and at an abscissa of $\pm 5 R_{pupil}$ with respect to the pupil center.
The segmentation of the iris and the pupil may fail for different reasons (see Fig. 5).
Concerning the pupil, the algorithm may fail to detect its center if dark areas are present in the image, as in the case of non-uniform illumination or if the subject is wearing glasses (a-b). Another source of problems arises when the reflections fall in the pupil area and are not properly removed, because they can be detected as pupil edges, leading to a smaller pupil (c). Moreover, the pupil edge can be detected erroneously if part of it is occluded by the eyelid or by the eyelashes (d-f). Concerning the iris, since its fitting is constrained by the pupil shape, if the pupil detection is wrong the iris can consequently be completely wrong or deformed. Even if the pupil is detected correctly, when the edge between the iris and the sclera has low contrast, for reasons such as non-uniform illumination, incorrect camera focus, or a light eye color, the algorithm may fail to find the limbus (g-h).
Over the whole set of images, the correct segmentation rate is 94.5%, attesting to the good efficacy of the algorithm. In fact, it is able to properly segment the iris area (see Fig. 4) with a changing pupil size, in the presence of glasses and heavy reflections (a-c), with bushy eyelashes (d-e), and with the iris and pupil partially occluded (f).

The proposed system
The images available in CASIA-Iris-Thousand are characterized by a gaze direction that is close to the primary one, and are taken by a camera positioned directly in front of the eye. The images provided by such a configuration are characterized by a pupil and an iris whose edges are close to perfect circles. In fact, since the normal to the plane that contains the limbus is parallel to the optical axis of the camera, the projected ellipse has an eccentricity close to zero. As long as the feature to be searched for in the image has a circular shape, re-sampling the image with a transformation from Cartesian to polar coordinates is an effective technique (Ferreira et al., 2009; Lin et al., 2005; Rahib & Koray, 2009). In this domain, starting from the assumption that its origin is in the center of the iris, the feature to be searched for, i.e. the circle of the limbus, is transformed into a straight line, and is thus easier to locate than in the Cartesian domain.
On the other hand, considering not only biometrics but also eye tracking, the eye can be rotated by large angles with respect to the primary position. Moreover, in our system, the camera is positioned some centimeters lower than the eye center, in order to prevent occlusions in the gaze direction as much as possible. The images obtained with such a configuration are characterized by a pupil and an iris with an eccentricity higher than zero, which increases the more the gaze direction differs from the optical axis of the camera (see for example Fig. 6, top-right).
Since the presented eye-tracking system is based on the perspective geometry of the pupil and iris circles on the image plane, it is important that the relative position between the eye and the camera stays fixed. On the other hand, for ease of use and freedom of movement of the subject, it is important that the system allows free head movement. To guarantee both these features, we developed a head-mounted device.

Human-Centric Machine Vision


Hardware implementation
The head-mounted device is equipped with two inexpensive USB webcams (Hercules Deluxe Optical Glass) that provide images at a resolution of 800 × 600 pixels, with a frame rate of 30 fps. The cameras were mounted in the inner part of a chin strap, at a distance of ≈ 60 mm from the respective eye. At this distance, the field of view provided by the cameras is [36°, 26°], which is more than enough to have a complete view of the eyes. To make them work in infrared light, the IR-cut filter was removed from the optics and replaced with an IR-pass filter with a cut-off wavelength of 850 nm. To have a constant illumination of the images, both in daylight and in indoor environments or during night time, the system was equipped with three IR illuminators, which help to keep the contrast and the illumination of the image stream constant.
The illuminators produce visible corneal reflections, which are used as a reference feature in other kinds of eye trackers (Eizenman et al., 1984; Morimoto et al., 2000). In our case, since we use the limbus position to track the eye, a reflection that, depending on the position of the eye, falls on the limbus can lead to the detection of a wrong edge, and thus to a wrong gaze estimate. To prevent this, and considering that the points affected by the reflections are few compared with the entire limbus edge, these points are removed at the beginning of the image processing.

Image acquisition and segmentation
The developed algorithm was tested on three sets of images, taken from different subjects. In each set, the subjects were asked to fixate a grid of points, so as to have the gaze ranging from −30° to 30° of azimuth, and from −20° to 20° of elevation, with a step of 5°. In this way each set is composed of 117 images where the gaze direction is known. The azimuth and elevation angles were defined following a Helmholtz reference frame (Haslwanter, 1995). The use of a transformation of the image from a Cartesian to an elliptical domain allows the algorithm to work properly on the segmentation of the pupil and consequently, as explained in Sec. 3, on the segmentation of the iris, even in the cases where the iris is drastically occluded (see for example Fig. 6, center-right).
Considering that the images are captured in optimal conditions, i.e. in an indoor environment where the only sources of IR light are the illuminators, with subjects who do not wear glasses, and with the eye correctly centered in the image, the algorithm is able to properly segment the pupil and the iris in 100% of the cases.

Eye-tracking
Once the iris circle is steadily detected on the image plane and its edge is fitted with an ellipse, the coefficient matrix Z of the quadratic form is known, and it remains to estimate the gaze direction. This can be obtained by computing the orientation in space of the circle that produces that projection.

Tracking
Because $Z$ is a symmetric matrix, it can be diagonalized, leading to a matrix:

$$\Sigma = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}$$

with $\lambda_1, \lambda_2, \lambda_3 \in \mathbb{R}$. This transformation from $Z$ to $\Sigma$ is just a change of basis, and thus $\Sigma$ may be expressed as $\Sigma = R^{-1} Z R$, where $R$ is the matrix changing from the current orthonormal basis to a new one, formed by three eigenvectors of $Z$. The columns of $R$ are the components of the eigenvectors $e_1, e_2, e_3$, and the elements $\lambda_1, \lambda_2, \lambda_3$ of the diagonal of $\Sigma$ are the associated eigenvalues. By Sylvester's law of inertia the signature of $\Sigma$ is equal to that of $Z$, $(-, +, +)$; thus only one eigenvalue is negative, and the others are positive. We assume $\lambda_3 < 0$ and $\lambda_2 > \lambda_1 > 0$. If we apply the transformation matrix $R$ to $Q_Z(x)$, we obtain:

$$Q_Z(x) = x^T Z x = x^T R\, \Sigma\, R^T x$$

and, considering $x' = R^T x$, the equation of the projective cone in the new basis is:

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 w'^2 = 0$$

which is a cone expressed in canonical form, whose axis is parallel to $e_3$. Now consider the intersection of the cone with the plane $w' = 1/\sqrt{-\lambda_3}$. This is the ellipse $\lambda_1 x'^2 + \lambda_2 y'^2 = 1$, whose axes are parallel to $e_1$ and $e_2$, and whose semi-axis lengths are $1/\sqrt{\lambda_1}$ and $1/\sqrt{\lambda_2}$. If we cut the cone in Eq. 12 with a plane tilted along $e_1$, there exists a particular angle $\theta$ which makes the plane intersect the cone in a circle: this circle is the limbus, and $\theta$ is its tilt angle in the basis described by the rotation matrix $R$. As suggested in Forsyth et al. (1991), to find $\theta$ it is possible to exploit the property of the circle of having equal semi-axes or, equivalently, equal coefficients for the $x'^2$ and $y'^2$ terms in the quadratic form. Equality of the $x'^2$ and $y'^2$ coefficients is achieved by a rotation about the $x'$ axis by an angle

$$\theta = \pm \arctan\sqrt{\frac{\lambda_2 - \lambda_1}{\lambda_1 - \lambda_3}}$$

which sets both coefficients equal to $\lambda_1$. The normal to the plane that intersects the cone in a circle, expressed in the camera coordinate system, is $n = R_{cam} R R_\theta [0; 0; -1]$, where:

$$R_\theta = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}$$

and $R_{cam}$ is the matrix describing the rotation of the camera with respect to the fixed world reference frame.

Fig. 6. Subset of the images taken from a subject with the IR camera. The subject is fixating a grid of nine points at the widest angles of the whole set. The magenta ellipse defines the pupil contour, while the green one is the limbus. The red dots represent the points used to compute the limbus ellipse equation, while the white ones are those removed because they possibly belong to the eyelashes or to a wrong estimate of the edge. The blue lines represent the parabolas used to remove the possible eyelash points.
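The eigen-decomposition above translates almost directly into code. The sketch below is illustrative only: it takes $R_{cam}$ as the identity, returns both candidate normals (one per sign of $\theta$), and uses a hypothetical helper `cone_matrix` that builds the quadratic form of the cone joining the camera center to a known 3D circle, so that the recovered normal can be compared with the true one.

```python
import numpy as np

def cone_matrix(center, normal, radius):
    """Quadratic form Z of the cone from the origin through a 3D circle."""
    c = np.asarray(center, float)
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    d = n @ c                                   # plane of the circle: n.X = d
    return (d**2 * np.eye(3)
            - d * (np.outer(c, n) + np.outer(n, c))
            + (c @ c - radius**2) * np.outer(n, n))

def limbus_normals(Z):
    """Two candidate unit normals of the circle whose projection has matrix Z."""
    lam, V = np.linalg.eigh(Z)                  # ascending eigenvalues
    if lam[1] < 0:                              # enforce signature (-, +, +)
        lam, V = np.linalg.eigh(-Z)
    l3, l1, l2 = lam                            # l3 < 0 < l1 <= l2
    R = np.column_stack([V[:, 1], V[:, 2], V[:, 0]])   # columns e1, e2, e3
    theta = np.arctan(np.sqrt((l2 - l1) / (l1 - l3)))
    candidates = []
    for th in (theta, -theta):
        c, s = np.cos(th), np.sin(th)
        R_theta = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
        candidates.append(R @ R_theta @ np.array([0.0, 0.0, -1.0]))
    return candidates

# A circle of radius 0.6 about 6 units in front of the camera, tilted off-axis
n_true = np.array([0.3, -0.2, -1.0])
n_true /= np.linalg.norm(n_true)
candidates = limbus_normals(cone_matrix([0.1, -0.05, 6.0], n_true, 0.6))
```

The two-fold ambiguity is intrinsic to a single ellipse; additional information, such as the calibrated position of the eye center, is needed to select the correct candidate.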

Calibration
Since it is not possible to measure with the desired precision the relative position between the center of the eye and the nodal point of the camera, and particularly the relative orientation, we developed a calibration procedure with the purpose of estimating the implicit parameters. The subject is asked to fixate a calibration grid of 25 points in a known position. The grid of points is designed to make the subject fixate with an azimuth angle between −30° and 30° in steps of 15°, and with an elevation angle between −20° and 20° in steps of 10°.
The calibration procedure is based on a functional whose minimization provides: (1) the eye position with respect to the camera, (2) the camera orientation with respect to a fixed reference frame, (4) the radius of the eye.

Estimation of the gaze direction
In order to validate the algorithm, the estimation of the fixation angle is computed over a set of points different from the calibration grid. The grid of points used for the test is designed to make the subject fixate with an azimuth angle between −20° and 20° in steps of 5°, and with an elevation angle between −10° and 10° in steps of 5°.

The error is measured as the angle between the estimated gaze direction and the actual direction of the fixated grid points. Over the whole set of 45 points, the algorithm provides a mean error of ≈ 0.6°.
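The error measure can be sketched as below. The axis convention for the Helmholtz angles (x to the right, y upward, z along the primary gaze direction) and the function names are assumptions made for this example.

```python
import numpy as np

def gaze_vector(azimuth_deg, elevation_deg):
    """Unit gaze vector for Helmholtz angles (elevation first, then azimuth)."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    return np.array([np.sin(az),
                     np.cos(az) * np.sin(el),
                     np.cos(az) * np.cos(el)])

def angular_error_deg(g_est, g_true):
    """Angle in degrees between two gaze directions (stable for small angles)."""
    cross = np.linalg.norm(np.cross(g_est, g_true))
    return np.degrees(np.arctan2(cross, np.dot(g_est, g_true)))

# Test grid described in the text: azimuth -20..20, elevation -10..10, step 5
test_grid = [(az, el) for el in range(-10, 11, 5) for az in range(-20, 21, 5)]

def mean_error_deg(estimated_vectors):
    """Mean angular error of one estimated vector per grid point."""
    errors = [angular_error_deg(g, gaze_vector(az, el))
              for g, (az, el) in zip(estimated_vectors, test_grid)]
    return float(np.mean(errors))
```

Using `arctan2` of the cross and dot products instead of `arccos` avoids the loss of precision that `arccos` suffers for errors well below one degree.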

Discussion and conclusion
We developed a novel approach for iris segmentation and eye tracking that relies on the geometrical characteristics of the projection of the eye on the image plane.
Once the pupil center is roughly located and the ellipse that describes the pupil is fitted, the parameters of the pupil ellipse can be exploited to improve the search for the limbus. We developed a transformation of the image to an elliptical domain, shaped by the pupil, in order to transform the limbus into a straight line, which is thus easier to detect. The points that do not belong to the limbus are removed considering that the borders of the upper and lower eyelids are well described by two parabolas intersecting at the eyelid corners. The similarity of the projections of iris and pupil allows a proper segmentation even if large parts of the iris are occluded by the eyelids. We developed a method that takes into account the orientation and eccentricity of the pupil's ellipse in order to fit the limbus ellipse. The iris segmentation algorithm is able to work both on an iris image database and on the images acquired by our system. Since the limbus can be considered a perfect circle oriented in 3D with respect to the image plane, its imaged ellipse is used to compute the gaze direction, by finding the orientation in space of the circle that projects onto the fitted ellipse.
Even though the iris segmentation demonstrates good effectiveness in a large variety of cases and good robustness to perturbations due to reflections and glasses, the gaze tracking part is still a preliminary implementation, and many improvements can be made to the current algorithm. In order to limit wrong detections of the pupil center, the pupil search area can be constrained to a circle defined by the pupil points found during the calibration procedure. In fact, if the algorithm is calibrated over the range of interest for the tracking of the eye, the pupil is searched for in an area where it is likely to be, which prevents the initial point from being detected on the glasses frame or on other dark regions of the image. Moreover, since the system is not equipped with a frontal scene camera, it is more difficult both to calibrate the algorithm correctly and to test it. Currently, for the calibration, the subject is positioned manually in the desired position with respect to the grid, without any chin rest, and is asked to remain steady throughout the procedure. Without any visual feedback on where the subject is fixating, any movement between the subject and the grid (due to undesired rotations and translations of the head, or to physiologic nystagmus) becomes an unpredictable and significant source of error. The next steps of our research are to implement a more comfortable and precise calibration procedure, for example through a chin rest or a scene camera, and to extend the system from monocular to binocular tracking.
In conclusion, the proposed method, relying on visible and salient features such as the pupil and the limbus, and exploiting the known geometry of the structure of the eye, is able to provide a reliable segmentation of the iris that can in principle be used both for non-invasive and low-cost eye tracking and for iris recognition applications.

Acknowledgment
Portions of the research in this paper use the CASIA-IrisV4 collected by the Chinese Academy of Sciences' Institute of Automation (CASIA).
This work has been partially supported by the Italian MIUR (PRIN 2008) project "Bio-inspired models for the control of robot ocular movements during active vision and 3D exploration".