Statistical Deformation Model for Handwritten Character Recognition

Introduction
One of the main problems in offline and online handwritten character recognition is how to deal with deformations in characters. A promising approach to this problem is to incorporate a deformation model. If recognition is performed with a reasonable deformation model, it may become tolerant to deformations within each character category. Many deformation models have been proposed, and some of them were designed in an empirical manner. Recognition methods based on elastic matching have often relied on a continuous and monotonic deformation model (Bahlmann & Burkhardt, 2004; Burr, 1983; Connell & Jain, 2001; Fujimoto et al., 1976; Yoshida & Sakoe, 1982). This is a typical empirical model, developed from the observation that character patterns often preserve their topologies. Affine deformation models (Wakahara, 1994; Wakahara & Odaka, 1997; Wakahara et al., 2001) and local perturbation models (or image distortion models (Keysers et al., 2004)) are also popular empirical deformation models. While the empirical models generally work well in handwritten character recognition tasks, they are not grounded in actual deformations of handwritten characters. In addition, the empirical models are just approximations of actual deformations and cannot incorporate category-dependent deformation characteristics. Such category-dependent deformation characteristics do exist. For example, in category "M", the two parallel vertical strokes are often slanted so that they come closer. In category "H", in contrast, the same deformation is rarely observed. Statistical models are better alternatives to the empirical models. Statistical models learn deformation characteristics from actual character patterns. Thus, if a model learns the deformations of a certain category, it can represent the category-dependent deformation characteristics.
The hidden Markov model (HMM) is a popular statistical model for handwritten characters (e.g., Cho et al., 1995; Hu et al., 1996; Kuo & Agazzi, 1994; Nag et al., 1986; Nakai et al., 2001; Park & Lee, 1998). HMM has not only a solid stochastic background but also a well-established learning scheme. HMM, however, is limited in regulating global deformation characteristics; that is, because of its Markovian property, HMM can regulate only local deformations of neighboring regions. This chapter is concerned with another statistical deformation model of offline and online handwritten characters. This deformation model is based on a combination of elastic matching and principal component analysis (PCA) and is also capable of learning actual deformations of handwritten characters. Unlike HMM, this deformation model can regulate not only local deformations but also global deformations. The contributions of this chapter are summarized in the following.

Contributions of this chapter
The first contribution of this chapter is to introduce a statistical deformation model for offline handwritten character recognition. The model is realized in two steps. The first step is the automatic extraction of the deformations of character images by elastic matching. Elastic matching is formulated as an optimization problem of the pixel-to-pixel correspondence between two image patterns. The resulting pixel-to-pixel correspondence represents the displacement of individual pixels, i.e., the deformation of one character image from another. The second step is the statistical analysis of the extracted deformations by PCA. The resulting principal components, called eigen-deformations, represent intrinsic deformations of handwritten characters.
The second contribution is to introduce a statistical deformation model for online handwritten character recognition. While the discussion is similar to the above offline case, it differs in several respects. For example, deformations often appear as differences in pattern length. Consequently, online handwritten character patterns have rarely been handled in a PCA-based statistical analysis framework, which assumes that all patterns under analysis have the same dimensionality. In addition, online handwritten character patterns often undergo heavy nonlinear temporal/spatial fluctuation. Elastic matching, used to extract the relative deformation between two patterns, solves these problems and helps to establish a statistical deformation model.

Statistical deformation model of offline handwritten character recognition
2.1 Extraction of deformations by elastic matching
The first step of statistical deformation analysis of handwritten character images is to extract the deformations of actual handwritten character images, which can be done automatically by elastic matching. Elastic matching is formulated as the following optimization problem. Consider an $I \times I$ reference character image $R = \{r_{i,j}\}$ and an $I \times I$ input character image $E = \{e_{x,y}\}$, where $r_{i,j}$ and $e_{x,y}$ are $d$-dimensional pixel feature vectors at pixel $(i, j)$ on $R$ and $(x, y)$ on $E$, respectively. Let $F$ denote a 2D-2D mapping from $R$ to $E$, i.e., $F : (i, j) \mapsto (x, y)$. As shown in Figure 1, the mapping $F$ determines the pixel-to-pixel correspondence from $R$ to $E$. Elastic matching between $R$ and $E$ is formulated as the minimization of the following objective function with respect to $F$:
$$J_{R,E}(F) = \| R - E_F \|, \tag{1}$$
where $E_F$ is the character image obtained by fitting $E$ to $R$, i.e., $E_F = \{e_{x_{i,j}, y_{i,j}}\}$, and $(x_{i,j}, y_{i,j})$ denotes the pixel of $E$ corresponding to the $(i, j)$th pixel of $R$ under $F$. In the minimization, several constraints (such as a smoothness constraint and boundary constraints) are often imposed to regularize $F$. Let $\tilde{F}$ denote the mapping $F$ that minimizes $J_{R,E}(F)$ of (1). This mapping $\tilde{F}$ represents the relative deformation of the input image $E$ from the reference image $R$.
Fig. 2. Eigen-deformations of handwritten characters.
Specifically, the deformation of $E$ is extracted as the following $2I^2$-dimensional vector, called the deformation vector:
$$v = ((1 - x_{1,1}, 1 - y_{1,1}), \ldots, (i - x_{i,j}, j - y_{i,j}), \ldots, (I - x_{I,I}, I - y_{I,I}))^T. \tag{2}$$
Note that $v$ is a discrete representation of $\tilde{F}$. The constrained minimization of (1) with respect to $F$ (i.e., the extraction of $v$) is done by various optimization strategies. If the mapping $F$ is defined as a parametric function, iterative strategies and exhaustive strategies are often employed for optimizing the parameters of $F$. In contrast, if the mapping $F$ is a non-parametric function, combinatorial optimization strategies, such as dynamic programming, local perturbation, and deterministic relaxation, are employed. Various formulations and optimization strategies of the elastic matching problem are summarized in Uchida & Sakoe (2005).
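As a minimal sketch of how a mapping is turned into the deformation vector of (2), the following NumPy snippet flattens a given pixel mapping into a $2I^2$-dimensional vector. The function name `deformation_vector`, the array layout, and the ordering of pixels (with $j$ running fastest) are assumptions made for illustration; the mapping itself is taken as given and is not computed by elastic matching here.

```python
import numpy as np

def deformation_vector(x_map, y_map):
    """Flatten a pixel mapping F: (i, j) -> (x, y) into a 2*I*I vector.

    x_map[i-1, j-1] and y_map[i-1, j-1] hold the 1-based coordinates
    (x_{i,j}, y_{i,j}) on E matched to pixel (i, j) on R.
    """
    I = x_map.shape[0]
    i_idx, j_idx = np.meshgrid(np.arange(1, I + 1), np.arange(1, I + 1),
                               indexing="ij")
    # Pairs (i - x_{i,j}, j - y_{i,j}); the pixel order is an assumption.
    pairs = np.stack([i_idx - x_map, j_idx - y_map], axis=-1)
    return pairs.reshape(-1).astype(float)

# Toy mappings on a 4 x 4 grid (hypothetical data).
I = 4
i_idx, j_idx = np.meshgrid(np.arange(1, I + 1), np.arange(1, I + 1),
                           indexing="ij")
v_identity = deformation_vector(i_idx, j_idx)    # F maps every pixel to itself
v_shift = deformation_vector(i_idx + 1, j_idx)   # every pixel maps to x = i + 1
```

The identity mapping yields the zero vector, and a uniform shift yields a constant displacement in every pixel pair, which is the kind of global deformation that the subsequent statistical analysis can capture with a single axis.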

Estimations of eigen-deformations
Eigen-deformations of a category are intrinsic deformations of the category and are defined as $M$ principal axes $\{u_1, \ldots, u_m, \ldots, u_M\}$ which span an $M$-dimensional subspace of the $2I^2$-dimensional deformation space. The eigen-deformations can be estimated by applying PCA to $\{v_n \mid n = 1, \ldots, N\}$, where $v_n$ is the deformation extracted between $R$ and $E_n$. Specifically, the eigen-deformations are obtained as the eigenvectors of the covariance matrix
$$\Sigma = \frac{1}{N} \sum_{n=1}^{N} (v_n - \bar{v})(v_n - \bar{v})^T,$$
where $\bar{v}$ is the mean vector of $\{v_n\}$.
Figure 2 shows the first three eigen-deformations estimated from 500 handwritten characters of the category "A". The first eigen-deformation $u_1$, that is, the most frequent deformation of "A", was the global slant transformation. The second was the vertical shift of the horizontal stroke, and the third was the width variation of the upper part. This figure thus confirms that frequent deformations of "A" were extracted successfully. Note that in this experiment, the dimensionality of the deformation vector $v$ was 74, although the size of the character image pattern was $20 \times 20$ (i.e., $I = 20$ and $2I^2 = 800$). This is because a "sparse" elastic matching was used, in which the displacements of 3 pixels (leftmost, middle, and rightmost) were optimized in every row. The displacements of the other pixels were given by linear interpolation.
Figure 3 shows the patterns $R$ deformed by the first three eigen-deformations $u_1$, $u_2$, and $u_3$, each amplified by $k\sqrt{\lambda_m}$ ($k = -2, -1, 0, 1, 2$), where $\lambda_m$ is the eigenvalue of the $m$th eigenvector. This figure also shows that frequent deformations were extracted as the eigen-deformations in each category. Figure 4 shows the cumulative proportion for each category. The cumulative proportion of the top $M$ eigen-deformations is defined as $\rho(M) = \sum_{m=1}^{M} \lambda_m / \sum_{m=1}^{74} \lambda_m$. In all categories, the cumulative proportion exceeded 50% with the top 3 to 5 eigen-deformations and 80% with the top 10 to 20 eigen-deformations. Thus, the distribution of deformation vectors was not isotropic and can be approximated by a small number of eigen-deformations. In other words, there existed a low-dimensional and efficient subspace of deformations.
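The estimation step can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the covariance is normalized by $1/N$ as an assumption, and the training set is synthetic random data with three dominant directions standing in for deformation vectors extracted from real characters.

```python
import numpy as np

def eigen_deformations(V):
    """PCA of deformation vectors. V has shape (N, D), one vector v_n per row."""
    v_mean = V.mean(axis=0)
    centered = V - v_mean
    cov = centered.T @ centered / V.shape[0]       # 1/N normalization (assumed)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)   # clip tiny negative round-off
    return v_mean, eigvals, eigvecs[:, order]      # columns are u_1, u_2, ...

def cumulative_proportion(eigvals):
    """rho(M) for M = 1, ..., D, stored at index M - 1."""
    return np.cumsum(eigvals) / eigvals.sum()

# Synthetic stand-in for N = 500 deformation vectors of dimension 74.
rng = np.random.default_rng(0)
D, N = 74, 500
basis, _ = np.linalg.qr(rng.normal(size=(D, D)))   # random orthonormal axes
scales = np.concatenate([[5.0, 3.0, 2.0], np.full(D - 3, 0.2)])
V = (rng.normal(size=(N, D)) * scales) @ basis.T

v_mean, lam, U = eigen_deformations(V)
rho = cumulative_proportion(lam)

# Deformation along the first axis, amplified by k * sqrt(lambda_1) with k = 2.
v_k = v_mean + 2.0 * np.sqrt(lam[0]) * U[:, 0]
```

Because the synthetic data are strongly anisotropic by construction, `rho` rises quickly with the first few components, which mirrors qualitatively the fast saturation described for Figure 4.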

Recognition with eigen-deformations (1)
The eigen-deformations can be utilized for recognizing handwritten character images. A direct use of the eigen-deformations for evaluating a distance between two characters $R$ and $E$ is as follows:
$$D_{\mathrm{disp}}(R, E) = (v - \bar{v})^T \Sigma^{-1} (v - \bar{v}) = \sum_{m=1}^{2I^2} \frac{1}{\lambda_m} \langle v - \bar{v}, u_m \rangle^2, \tag{3}$$
where $E$ is an unknown input image, $v$ is the deformation extracted by elastic matching between $R$ and $E$, and $\bar{v}$ and $\Sigma$ are the mean vector and the covariance matrix of the training deformation vectors, with eigenvalues $\lambda_m$ and eigenvectors $u_m$. This is the well-known Mahalanobis distance, which evaluates how far the deformation estimated on $E$ diverges statistically from the deformations that usually appear in the category of $R$. If the estimated deformation $v$ gives a large distance value, the result of elastic matching between $E$ and $R$ is somewhat abnormal, and therefore the category of $R$ will not become a candidate for the correct category of $E$.
The recognition performance by $D_{\mathrm{disp}}(R, E)$ alone, however, is not satisfactory. This is because the distance $D_{\mathrm{disp}}(R, E)$ completely neglects the distance between pixel features. This fact will be confirmed by an experimental result in 2.5. An alternative and reasonable choice is the linear combination of the distance in the pixel feature space and the distance in the deformation space (Uchida & Sakoe, 2003b), that is,
$$D_{\mathrm{hybrid}}(R, E) = (1 - w) D_{\mathrm{feat}}(R, E) + w D_{\mathrm{disp}}(R, E), \tag{4}$$
where $D_{\mathrm{feat}}(R, E)$ is the elastic matching distance in the pixel feature space, i.e.,
$$D_{\mathrm{feat}}(R, E) = \min_{F} J_{R,E}(F), \tag{5}$$
and $w$ is a constant ($0 \le w \le 1$) that balances the two distances.
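In practice, the modified Mahalanobis distance (Kimura et al., 1987) is employed instead of (3). Specifically, the higher-order eigenvalues $\lambda_m$ ($m = M + 2, \ldots, 2I^2$) are replaced by $\lambda_{M+1}$ to suppress the estimation errors of higher-order eigenvalues in (3). With this replacement, (3) is reduced to
$$D_{\mathrm{disp}}(R, E) \approx \sum_{m=1}^{M} \frac{1}{\lambda_m} \langle v - \bar{v}, u_m \rangle^2 + \frac{1}{\lambda_{M+1}} \left( \| v - \bar{v} \|^2 - \sum_{m=1}^{M} \langle v - \bar{v}, u_m \rangle^2 \right), \tag{6}$$
where $\bar{v}$ is the mean of the training deformation vectors. The parameter $M$ is to be determined experimentally, for example, by considering the cumulative proportion $\rho(M)$.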
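The distances of this section can be sketched as follows. The function names are hypothetical, the deformation vector is centered on the training mean as an assumption, and the weighting $(1 - w)$ versus $w$ in the hybrid distance is one plausible reading of "linear combination", not a confirmed form. The modified distance uses only the top $M$ eigen-deformations plus $\lambda_{M+1}$, so the higher-order eigenvectors never need to be computed.

```python
import numpy as np

def mahalanobis_full(v, v_mean, eigvals, eigvecs):
    """Plain Mahalanobis distance using all eigenvalues."""
    proj = eigvecs.T @ (v - v_mean)
    return float(np.sum(proj ** 2 / eigvals))

def modified_mahalanobis(v, v_mean, eigvals, eigvecs, M):
    """Modified distance: eigenvalues beyond the M-th are replaced by lambda_{M+1}."""
    diff = v - v_mean
    proj = eigvecs[:, :M].T @ diff
    principal = np.sum(proj ** 2 / eigvals[:M])
    # Energy of diff outside the top-M subspace, obtained without eigenvectors.
    residual = max(diff @ diff - np.sum(proj ** 2), 0.0)
    return float(principal + residual / eigvals[M])   # eigvals[M] is lambda_{M+1}

def hybrid_distance(d_feat, d_disp, w):
    """Weighted sum of pixel-feature and deformation distances (assumed form)."""
    return (1.0 - w) * d_feat + w * d_disp

# Toy check: when all higher-order eigenvalues already equal lambda_{M+1},
# the modified distance coincides with the plain Mahalanobis distance.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
lam = np.array([4.0, 2.0, 1.0, 1.0, 1.0, 1.0])
v = rng.normal(size=6)
mu = np.zeros(6)
d_full = mahalanobis_full(v, mu, lam, Q)
d_mod = modified_mahalanobis(v, mu, lam, Q, M=2)
```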

Recognition with eigen-deformations (2)
The above recognition method has the weak point that two heterogeneous distances, $D_{\mathrm{feat}}$ and $D_{\mathrm{disp}}$, are added naively to create the single distance $D_{\mathrm{hybrid}}$. In contrast, the following method (Uchida & Sakoe, 2003a) avoids this weak point by embedding the eigen-deformations into the elastic matching procedure. Consider that the mapping $F$ is defined as a linear combination of eigen-deformations, i.e.,
$$F(\alpha) = \sum_{m=1}^{M} \alpha_m u_m, \tag{7}$$
where $\alpha = (\alpha_1, \ldots, \alpha_m, \ldots, \alpha_M)^T$. Then an elastic matching problem with $F(\alpha)$ can be formulated as the minimization of the following objective function:
$$J_{R,E}(\alpha) = \| R_{F(\alpha)} - E \|, \tag{8}$$
where $R_{F(\alpha)}$ is the reference pattern deformed by the mapping $F(\alpha)$.
The set of deformed reference patterns, $\{R_{F(\alpha)} \mid \forall \alpha\}$, forms an $M$-dimensional manifold in the $(I^2 \cdot d)$-dimensional pixel feature space. Thus the minimum value of $J_{R,E}(\alpha)$ is equivalent to the shortest distance between the $M$-dimensional manifold and $E$. The minimization problem (8) with respect to $\alpha$ is hard to solve directly. This is because the $M$-dimensional parameter vector $\alpha$ to be optimized is involved in the nonlinear function $R_{F(\alpha)}$. Thus, some approximation is required to solve the optimization problem. In Uchida & Sakoe (2003a), the approximation scheme used in the tangent distance method (Simard et al., 1992) was employed for the above minimization problem. As shown in Fig. 5, the minimum distance $\min_{\alpha} J_{R,E}(\alpha)$ can be approximated by the following tangent distance:
$$D_{\mathrm{TD}}(R, E) = \min_{\alpha} \| T_{\alpha} - E \|, \tag{9}$$
where $T_{\alpha}$ is the tangent plane of the manifold at $\alpha = 0$, parameterized by $\alpha$. The tangent plane is an $M$-dimensional hyperplane in the feature space and is linear with respect to $\alpha$. Thus the minimization problem of (9) has a closed-form solution. Intuitively speaking, the distance $D_{\mathrm{TD}}(R, E)$ is the Euclidean distance between the input $E$ and its closest point on the tangent plane. Figure 6 shows three tangent vectors which span the tangent plane of the category "A".
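Because the tangent plane is linear in $\alpha$, the minimization reduces to an ordinary least-squares problem. The sketch below assumes that the tangent vectors are already available as the columns of a matrix (in practice they might be approximated by finite differences of the deformed reference, which is not shown) and that patterns are flattened to vectors; the function name `tangent_distance` and the toy data are hypothetical.

```python
import numpy as np

def tangent_distance(r, e, tangents):
    """Distance from e to the plane {r + tangents @ alpha}.

    r, e     : flattened feature vectors of the reference and the input
    tangents : matrix of shape (len(r), M), one tangent vector per column
    """
    # Least squares gives the closed-form minimizer of ||r + T alpha - e||.
    alpha, *_ = np.linalg.lstsq(tangents, e - r, rcond=None)
    closest = r + tangents @ alpha                 # closest point on the plane
    return float(np.linalg.norm(closest - e)), alpha

# Toy example in a 5-dimensional feature space with M = 2 tangent vectors.
r = np.zeros(5)
tangents = np.eye(5)[:, :2]                        # plane spanned by two axes
e = np.array([1.0, 2.0, 3.0, 0.0, 0.0])
dist, alpha = tangent_distance(r, e, tangents)
```

In this toy case the first two coordinates of `e` are absorbed by the plane and only the third coordinate remains as the distance.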

Recognition result
Figure 7 shows the results of a handwritten character recognition experiment using 26 (categories) × 1,100 (samples) isolated handwritten English uppercase character images from the standard character image database ETL6. The first 100 samples of each category were simply averaged to create one reference pattern $R$, and the next 500 samples were used as training samples $E_n$ to estimate the eigen-deformations. The remaining 500 samples (13,000 = 26 × 500 samples in total) were used as test samples $E$.
The highest recognition rate (99.47%) was attained by $D_{\mathrm{hybrid}}$ with its best weight $w$. The recognition rate by $D_{\mathrm{disp}}$, i.e., the recognition rate obtained by evaluating only the deformation $v$, was not sufficient. Thus, the pixel features (i.e., appearance features) should not be neglected when evaluating the distance between two character images. The recognition rates by $D_{\mathrm{TD}}$ saturated around $M = 3$. This result is consistent with the fast saturation of the cumulative proportion in Fig. 4.

Related work
The original idea of the eigen-deformations, i.e., principal components of deformations, can be found in the point distribution models (PDM), which were proposed by Cootes et al. (1995) and applied to various patterns. Shen & Davatzikos (2000) introduced an automatic deformation collection scheme into the PDM. PDM for curvilinear patterns has been applied to face recognition (Lanitis et al., 1997), Chinese character recognition (Shi et al., 2003), and hand posture recognition (Ahmed et al., 1997). Uchida & Sakoe (2003b) extended the PDM to deal with fully 2D deformations and applied it to an elastic matching-based handwritten character recognition system. Iwai et al. (1997) applied PCA to interframe motion vector fields obtained by block matching, which can be considered the simplest form of elastic matching. Bing et al. (2002) proposed a face expression recognition method based on a subspace of face deformations. Naster et al. (1997) analyzed a deformation vector extended to deal with the variation of the pixel feature value. Those ideas are also promising for recognizing handwritten character images.
The eigen-deformations are the principal axes spanning a subspace of the $2I^2$-dimensional deformation space. Any point on the subspace represents a deformation $F$. On the other hand, we can consider a subspace of the $(I^2 \cdot d)$-dimensional pixel feature space. Any point on this subspace represents an $I \times I \times d$ image pattern. The axes spanning this subspace are derived as the dominant eigenvectors of the covariance matrix
$$\Sigma_E = \frac{1}{N} \sum_{n=1}^{N} (E_n - \bar{E})(E_n - \bar{E})^T,$$
where $\bar{E}$ is the mean vector of $\{E_n\}$. There is a large body of research on such subspaces (Oja, 1983). Eigenface (Turk & Pentland, 1991) and parametric eigenspace (Hase et al., 2003; Murase & Nayar, 1994) are famous examples.
While the subspace derived in the above manner can represent a set of deformed character patterns, the subspace spanned by the eigen-deformations will represent the same set in a more compact manner. Consider a character image $R$ and a set of character images created by translating $R$. The number of eigen-deformations estimated from the set is two; one represents horizontal shift and the other vertical shift. In contrast, the number of principal eigenvectors in the pixel feature space will be far larger than two. This advantage will hold for other geometric deformations, and thus the subspace of deformations can be a more efficient representation than the subspace of the pixel features.
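The translation example can be checked numerically with a small sketch. Everything here is an illustrative assumption: the reference image is a synthetic Gaussian blob, translations are cyclic shifts by up to 3 pixels, the displacement is taken as known instead of being extracted by elastic matching, and the helper names are hypothetical.

```python
import numpy as np

I = 16
ii, jj = np.meshgrid(np.arange(I), np.arange(I), indexing="ij")
blob = np.exp(-((ii - 8) ** 2 + (jj - 8) ** 2) / (2 * 1.5 ** 2))  # synthetic R

shifts = [(dx, dy) for dx in range(-3, 4) for dy in range(-3, 4)]
# Pixel-feature representation: each translated image flattened to I*I values.
images = np.array([np.roll(blob, (dx, dy), axis=(0, 1)).ravel()
                   for dx, dy in shifts])
# Deformation representation: every pixel carries the same displacement.
deforms = np.array([np.tile([-dx, -dy], I * I) for dx, dy in shifts],
                   dtype=float)

def pca_eigenvalues(X):
    """Eigenvalues of the covariance of the rows of X, in descending order."""
    centered = X - X.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s ** 2 / X.shape[0]

def components_for(eigvals, proportion):
    """Smallest M whose cumulative proportion reaches the given level."""
    rho = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(rho, proportion) + 1)

n_deform = components_for(pca_eigenvalues(deforms), 0.99)
n_pixel = components_for(pca_eigenvalues(images), 0.99)
```

Under these assumptions, two axes account for all of the variance in the deformation space, whereas the pixel feature space needs more than two components to reach the same cumulative proportion.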

Extraction of deformations by elastic matching
Consider two online handwritten character patterns, $R = r_1, r_2, \ldots, r_i, \ldots, r_I$ and $E = e_1, e_2, \ldots, e_x, \ldots, e_{I'}$. The former is a reference character pattern and the latter is an input character pattern. Their elements $r_i$ and $e_x$ are $d$-dimensional feature vectors representing the features at $i$ and $x$; they are often 3-dimensional vectors comprising the x-coordinate, the y-coordinate, and the local direction.
Let $F$ denote a 1D-1D mapping from $R$ to $E$, i.e., $F : i \mapsto x$. Figure 8 depicts $F$. Elastic matching between $R$ and $E$ is formulated as the minimization of the following objective function with respect to $F$:
$$J_{R,E}(F) = \| R - E_F \|, \tag{10}$$
where $E_F$ is the character pattern obtained by fitting $E$ to $R$, i.e., $E_F = e_{x_1}, \ldots, e_{x_i}, \ldots, e_{x_I}$, and $x_i$ denotes the index of $E$ corresponding to index $i$ of $R$ under $F$. In the minimization, several constraints (such as the monotonicity and continuity constraint defined as $x_i - x_{i-1} \in \{0, 1, 2\}$ and the boundary constraints $x_1 = 1$ and $x_I = I'$) are often imposed to regularize $F$. This constrained minimization problem can be solved efficiently by a DP algorithm, called dynamic time warping or DP matching, whose details are omitted here.
The deformation of $E$ from $R$ is represented by the following vector:
$$v = (r_1 - e_{x_1}, \ldots, r_i - e_{x_i}, \ldots, r_I - e_{x_I})^T, \tag{11}$$
where $x_i$ is given by the mapping that minimizes (10). It should be noted that the dimension of the above deformation vector $v$ is fixed at $I \cdot d$ and is independent of the length of $E$, i.e., $I'$. This property is very important for applying various statistical methods, such as PCA, to sequential patterns. Also note that it is possible to define $v$ as $v = (1 - x_1, \ldots, i - x_i, \ldots, I - x_I)^T$.
Although this definition is a straightforward modification of the deformation vector of (2), we will use $v$ of (11) as the deformation vector here. This is because in online character recognition, $r_i$ and $e_x$ are often spatial features, and thus their difference represents a deformation.
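The following sketch illustrates DP matching under the constraints stated above (step sizes 0, 1, or 2 and both boundary conditions) and the resulting fixed-dimensional deformation vector. It is a plain illustrative implementation, not the authors' code: squared Euclidean local costs are used, the deformation vector is taken as the element-wise difference between $R$ and the fitted pattern (the sign convention is an assumption), indices are 0-based, and the function names and toy patterns are hypothetical.

```python
import numpy as np

def dp_matching(R, E):
    """Return the 0-based mapping x (length I) and the matching distance."""
    R, E = np.asarray(R, dtype=float), np.asarray(E, dtype=float)
    I, Ip = len(R), len(E)
    local = ((R[:, None, :] - E[None, :, :]) ** 2).sum(axis=2)   # shape (I, I')
    cost = np.full((I, Ip), np.inf)
    back = np.zeros((I, Ip), dtype=int)
    cost[0, 0] = local[0, 0]                       # boundary: first points match
    for i in range(1, I):
        for x in range(Ip):
            for step in (0, 1, 2):                 # monotonicity and continuity
                prev = x - step
                if prev >= 0:
                    cand = cost[i - 1, prev] + local[i, x]
                    if cost[i, x] > cand:
                        cost[i, x] = cand
                        back[i, x] = prev
    if not np.isfinite(cost[-1, -1]):
        raise ValueError("no mapping satisfies the constraints")
    path = np.empty(I, dtype=int)
    path[-1] = Ip - 1                              # boundary: last points match
    for i in range(I - 1, 0, -1):
        path[i - 1] = back[i, path[i]]
    return path, float(np.sqrt(cost[-1, -1]))

def online_deformation_vector(R, E, path):
    """Stack r_i - e_{x_i}; the length is I * d regardless of len(E)."""
    R, E = np.asarray(R, dtype=float), np.asarray(E, dtype=float)
    return (R - E[path]).reshape(-1)

R = [[0, 0], [1, 0], [2, 0], [3, 0]]                      # I = 4, d = 2
E_long = [[0.5 * k, 0] for k in range(7)]                 # I' = 7
E_short = [[0, 1], [3, 1]]                                # I' = 2
path_long, dist_long = dp_matching(R, E_long)
path_short, dist_short = dp_matching(R, E_short)
v_long = online_deformation_vector(R, E_long, path_long)
v_short = online_deformation_vector(R, E_short, path_short)
```

Both inputs, although of different lengths, produce deformation vectors of the same dimension $I \cdot d = 8$, which is the property that makes PCA applicable.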

Estimation of eigen-deformations
Eigen-deformations of online handwritten character patterns are also estimated by the procedure of 2.2; that is, they can be estimated as dominant eigen-vectors of the covariance matrix of v.
Eigen-deformations of online handwritten digits were estimated by using about 1,000 samples from the UNIPEN Train-R01/V07 database (1a) (Guyon et al., 1994). Figure 9 shows character patterns generated by deforming each reference pattern with its mean deformation vector $\bar{v}$ and the first two eigen-deformations $u_m$ (Mitoma et al., 2005). Note that the effect of $\bar{v}$ was not significant, because $R$ was set around the center of the set of training samples by a clustering technique and thus the norm of $\bar{v}$ was small. Figure 9 shows that deformations frequently observed in actual characters were estimated as eigen-deformations. For example, the first eigen-deformation of "6" represents the vertical variation of its loop part, and the second one represents the horizontal variation of the loop part.

Recognition with eigen-deformations
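For online handwritten character recognition based on the eigen-deformations, the following quadratic discrimination function (QDF) is a possible choice (Mitoma et al., 2005). The QDF is the Bayes discrimination function under the assumption that the deformation vectors have a Gaussian distribution and is defined as
$$D_{\mathrm{QDF}}(R, E) = (v - \bar{v})^T \Sigma^{-1} (v - \bar{v}) + \log |\Sigma| + (I \cdot d) \log 2\pi,$$
where $\bar{v}$ and $\Sigma$ are the mean vector and the covariance matrix of the deformation vectors of the category of $R$. The last term, $(I \cdot d) \log 2\pi$, cannot be omitted here because each category has a different dimension of $v$ (i.e., $I \cdot d$).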
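A minimal sketch of the QDF is given below, written as minus twice the log-density of a Gaussian. The function name `qdf` and the toy parameters are hypothetical, and the full covariance matrix is used directly without the eigenvalue modification employed in practice.

```python
import numpy as np

def qdf(v, v_mean, cov):
    """Quadratic discrimination function of v for a Gaussian (v_mean, cov)."""
    diff = v - v_mean
    sign, logdet = np.linalg.slogdet(cov)          # stable log-determinant
    if not sign > 0:
        raise ValueError("covariance matrix must be positive definite")
    maha = diff @ np.linalg.solve(cov, diff)       # Mahalanobis term
    # The constant term depends on the dimension, which differs per category.
    return float(maha + logdet + len(v) * np.log(2.0 * np.pi))

# Two toy categories whose deformation vectors have different dimensions.
score_a = qdf(np.array([2.0, 0.0]), np.zeros(2), np.diag([4.0, 1.0]))
score_b = qdf(np.zeros(3), np.zeros(3), np.eye(3))
```

Since the two scores are computed in spaces of different dimension, dropping the last term would bias the comparison between categories, which is the point made in the text.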

Recognition results
Figure 10 shows the results of an online character recognition experiment using digit samples from the UNIPEN database. Recognition rates attained by the modified QDF (MQDF) distance $D_{\mathrm{MQDF}}$ are plotted as a function of the total number of reference patterns, which were created by a clustering technique. The recognition rates attained by the conventional DP-matching distance ($D_{\mathrm{DP}}$), which equals the minimum value of (10), are also plotted.
As shown in Fig. 10, MQDF with the eigen-deformations outperformed the DP-matching distance. This is probably because elastic matching results $F$ that deviated from the distribution of the deformations of the category were penalized by the eigen-deformations in MQDF. Thus, the above recognition method can avoid misrecognitions due to overfitting, which is the phenomenon that the distance between $E$ and $R$ of a wrong category is underestimated because of an unnatural mapping $F$. This result also indicates that $D_{\mathrm{MQDF}}$ outperforms statistical dynamic time warping (SDTW) (Bahlmann & Burkhardt, 2004), which is a recent and sophisticated online character recognition technique. In fact, it has been reported in Bahlmann & Burkhardt (2004) that SDTW attained 97.10% on the same UNIPEN data set with 150 reference patterns.

Related work
Sequential patterns, such as online handwritten character patterns, are often re-sampled to have the same dimension before applying PCA or other statistical analysis techniques. For example, Deepu et al. (2004) proposed an online character recognition technique based on a subspace method in which all online character patterns are re-sampled to have a constant number of data points. The online character recognition technique by Zheng et al. (1999) is more radical because it uses only two points (i.e., the start point and the end point) for each character stroke segment. In the handwriting synthesis technique by Wang et al. (2005), online cursive handwriting samples are first aligned to have the same dimension, and then PCA is applied to them. PCA-based gesture/motion analysis techniques (Fod et al., 2002; Sanger, 1995; Yacoob & Black, 1999) also re-sample gesture patterns to have the same dimension. An exception is Martens & Claesen (1996), which employed elastic matching to extract a fixed-dimensional deformation vector from online signatures.

Conclusion
Statistical deformation models of handwritten character images and online handwritten character patterns have been introduced. The core of these models is the eigen-deformations, which are deformations frequently observed in a certain category and which span a subspace in the deformation space of the category. For estimating the eigen-deformations, elastic matching and principal component analysis (PCA) were employed. The former was utilized to extract deformations of target patterns automatically. For the online patterns, elastic matching was also utilized to absorb differences in their lengths. The latter was utilized to derive the eigen-deformations as the principal components of the extracted deformations.
The usefulness of the statistical deformation models with eigen-deformations has been confirmed experimentally. The estimated eigen-deformations could represent frequently observed deformations in each character category. In addition, the eigen-deformations were useful for improving accuracy in both offline and online character recognition tasks.