Character Recognition with Metasets

The chapter presents a new approach to the character recognition problem. It is based on metasets, a new concept of sets with a partial membership relation. By the character recognition problem we understand determining the similarity degree of a given character sample to a defined character pattern. The discussed mechanism may be applied not only to characters (e.g. letters), but to arbitrary data represented on monochromatic images or even multi-dimensional figures. The theory of metasets brings a new model of “fuzzy” membership relation for sets. A metaset may be a member of (or equal to) another metaset to a variety of different degrees, contrary to classical sets, where membership and equality are always either true or false. The goal of the chapter is to present the application of the new, abstract theory to solving a practical, well-known problem. It develops the method which was partially introduced for some particular case in (Starosta, 2009). The proposed solution has been implemented as a computer program. The experiments made with the program confirm that the theoretical assumptions are correct and the obtained results properly reflect our perception of similarity of characters. It should also be stressed that the concept of metaset itself was partially inspired by another computer application for character recognition, based on neural networks.


Introduction
The chapter presents a new approach to the character recognition problem. It is based on metasets, a new concept of sets with a partial membership relation. By the character recognition problem we understand determining the similarity degree of a given character sample to a defined character pattern. The discussed mechanism may be applied not only to characters (e.g. letters), but to arbitrary data represented on monochromatic images or even multi-dimensional figures. The theory of metasets brings a new model of "fuzzy" membership relation for sets. A metaset may be a member of (or equal to) another metaset to a variety of different degrees, contrary to classical sets, where membership and equality are always either true or false. The goal of the chapter is to present the application of the new, abstract theory to solving a practical, well-known problem. It develops the method which was partially introduced for some particular case in (Starosta, 2009). The proposed solution has been implemented as a computer program. The experiments made with the program confirm that the theoretical assumptions are correct and the obtained results properly reflect our perception of similarity of characters. It should also be stressed that the concept of metaset itself was partially inspired by another computer application for character recognition, based on neural networks.

The general idea
The process of determining the similarity degree consists of two stages. Initially, the compound character pattern must be prepared. It consists of several character samples accompanied by quality grades. The samples are depicted on rectangular matrices and they correspond to different forms of the same character. The pattern itself represents various possible approaches to the same character, as a single entity. In the second stage a testing character sample is matched against the pattern and the resulting similarity degree is calculated. The character samples as well as the compound pattern are encoded as metasets. As the result of matching the testing sample against the pattern we obtain the membership degree of the sample metaset in the pattern metaset and, additionally, the sequence of equality degrees of the sample metaset and the pattern elements. The membership degree measures how closely the sample resembles the pattern. The equality degrees indicate the similarity of the input sample and each pattern element separately. The membership degrees as well as the equality degrees for metasets are expressed as sets of nodes of the binary tree, which are finite binary sequences, and they may be evaluated as real numbers.
The quality grades of the samples in the pattern are membership degrees of the corresponding metasets, too. However, they are manually specified as areas of the matrix for depicting the characters, which contain valid pixels to be included in the matching process. This specification is interpreted as membership degrees of appropriate metasets. The quality grades show how close a particular sample is to the ideal. They may be supplied by experts together with the samples. The most significant innovation here is treating the membership and equality degrees of metasets as similarity measures for characters, provided they are properly encoded as metasets.

Basic terms and notation
The concept of binary tree plays the key role in the definition of metaset and related notions. Therefore, we start with establishing some well-known terms and notation concerning it. We use the symbol $\mathbb{T}$ for the infinite binary tree with the root $\mathbb{1}$. The nodes of the tree are finite binary sequences; the root is the empty sequence. For $p \in \mathbb{T}$ the symbol $|p|$ denotes the length of the sequence and $\#p$ denotes the natural number represented by the binary sequence $p$. Note that $|\mathbb{1}| = 0$ and we assume $\#\mathbb{1} = 0$. The ordering of nodes in $\mathbb{T}$ is the reverse of the extension ordering of sequences: $p \le q$ whenever $q$ is an initial segment of $p$, which in particular implies $|p| \ge |q|$. The root is therefore the largest element in $\mathbb{T}$. The set of nodes of equal length $n$ is called the $n$-th level of the tree: $\mathbb{T}_n = \{ p \in \mathbb{T} : |p| = n \}$. The level $\mathbb{T}_0$ contains only the root. Nodes of the tree are sometimes called conditions. If $p \le q \in \mathbb{T}$, then we say that the condition $p$ is stronger than the condition $q$, and $q$ is weaker than $p$. Thus, the conditions 0 and 1 are stronger than the root, and each of them is weaker than those of the conditions 00, 01, 10, 11 (which form the level $\mathbb{T}_2$) that extend it.

Fig. 1. The binary tree $\mathbb{T}$ and the ordering of nodes (conditions). Arrows point at the larger element, i.e., the weaker condition.

A set of nodes $C \subset \mathbb{T}$ is called a chain in $\mathbb{T}$ whenever all its elements are pairwise comparable: $\forall_{p,q \in C}\, (p \le q \lor q \le p)$. A set $A \subset \mathbb{T}$ is called an antichain in $\mathbb{T}$ if it consists of mutually incomparable elements: $\forall_{p,q \in A}\, (p \neq q \rightarrow \neg(p \le q) \land \neg(p \ge q))$. In Fig. 1, the elements { 00, 01, 100 } form a sample antichain. A maximal antichain is an antichain which cannot be extended by adding new elements; it is a maximal element with respect to inclusion of antichains. Examples of maximal antichains in Fig. 1 are { 0, 1 } or { 00, 01, 1 } or even { $\mathbb{1}$ }. They are in fact maximal finite antichains (MFA). A branch is a maximal chain in the tree $\mathbb{T}$. Note that $p$ is comparable to $q$ only if there exists a branch containing $p$ and $q$ simultaneously. Similarly, $p$ is incomparable to $q$ when no branch contains both $p$ and $q$.
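The notions above can be tried out in a few lines of Python. This is only an illustrative sketch under our own conventions, not the chapter's implementation: nodes are encoded as strings of `0` and `1` (the root is the empty string), and the helper names `leq`, `comparable`, `is_antichain` and `is_maximal_antichain` are hypothetical.

```python
from itertools import product

ROOT = ""  # the root is the empty binary sequence


def leq(p: str, q: str) -> bool:
    """p <= q (p is at least as strong as q) iff q is an initial segment of p."""
    return p.startswith(q)


def comparable(p: str, q: str) -> bool:
    """Two nodes are comparable iff some branch contains both of them."""
    return leq(p, q) or leq(q, p)


def is_antichain(nodes: set[str]) -> bool:
    """All distinct elements must be pairwise incomparable."""
    return all(not comparable(p, q) for p in nodes for q in nodes if p != q)


def is_maximal_antichain(nodes: set[str]) -> bool:
    """A finite antichain is maximal iff every node on its deepest level
    is comparable to (i.e. extends) one of its elements."""
    if not nodes or not is_antichain(nodes):
        return False
    depth = max(len(p) for p in nodes)
    level = ("".join(bits) for bits in product("01", repeat=depth))
    return all(any(comparable(s, p) for p in nodes) for s in level)
```

With this encoding, `{"00", "01", "100"}` is an antichain which is not maximal, whereas `{"0", "1"}`, `{"00", "01", "1"}` and `{ROOT}` are maximal finite antichains.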
To finish this section we prove a property of maximal finite antichains which is necessary for evaluating as numbers the degrees represented as sets of nodes. Clearly, there are $2^n$ nodes on the $n$-th level of the binary tree, so $\sum_{p \in \mathbb{T}_n} \frac{1}{2^{|p|}} = 1$. This property may be generalized to an arbitrary MFA (Lemma 1). In the proof each node $p$ is identified with an interval $I_p \subset I = [0, 1)$ of length $\frac{1}{2^{|p|}}$. For incomparable $p$ and $q$, the corresponding intervals are disjoint: $I_p \cap I_q = \emptyset$. Indeed, if $I_p \cap I_q \neq \emptyset$, then there must exist some $r \in \mathbb{T}$ such that $I_r \subset I_p \cap I_q$. Since $I_r \subset I_p$, then $r \le p$, and similarly $r \le q$. This implies $p \le q$ or $q \le p$, so they are comparable. We now show that the measure of $\bigcup_{p \in A} I_p$ is equal to 1. Clearly, it cannot be greater than 1, so if it is less, then let $u \subset I \setminus \bigcup_{p \in A} I_p$ be an open interval. There must exist $s \in \mathbb{T}$ such that $I_s \subset u$. If $s$ is comparable to some $p \in A$, then $I_s \cap I_p \neq \emptyset$, so $I_s \cap \bigcup_{p \in A} I_p$ is non-empty, which contradicts $I_s \subset u$. Thus, assuming that the measure of $\bigcup_{p \in A} I_p$ is less than 1 we found $s$ incomparable to all elements of $A$, which contradicts its maximality. To complete the proof note that the length of each $I_p$ is $\frac{1}{2^{|p|}}$, the measure of $\bigcup_{p \in A} I_p$ is 1, and the intervals are pairwise disjoint.
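The property can be checked numerically on small examples. The sketch below uses exact rational arithmetic; the string encoding of nodes, the helper names `measure` and `interval`, and the dyadic form of the interval (from `#p / 2^|p|` to `(#p + 1) / 2^|p|`) are our assumptions for illustration.

```python
from fractions import Fraction


def measure(nodes):
    """Sum of 1 / 2^|p| over a set of nodes given as binary strings."""
    return sum((Fraction(1, 2 ** len(p)) for p in nodes), Fraction(0))


def interval(p):
    """Half-open dyadic interval assumed for node p: [#p/2^|p|, (#p+1)/2^|p|)."""
    n = int(p, 2) if p else 0          # '#p', with '#' of the root taken as 0
    width = Fraction(1, 2 ** len(p))
    return n * width, (n + 1) * width  # (left end, right end)


# A maximal finite antichain sums to exactly 1, a non-maximal one to less.
print(measure({"00", "01", "1"}))
print(measure({"00", "01", "100"}))
```

For a maximal finite antichain the intervals tile [0, 1) without gaps or overlaps, which is the geometric content of the proof.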

Metasets
In the classical set theory a set either is an element of another set or it is not; there are no intermediate levels. This binary approach has many vital limitations which make it difficult to apply to the representation of vague, imprecise data. Therefore, over the last decades there were several attempts at inventing a concept of set with a partial membership relation. Among the most successful ones are fuzzy sets (Zadeh, 1965), intuitionistic fuzzy sets (Atanassov, 1986) and rough sets (Pawlak, 1982). The metaset idea is a new approach to the problem. One of the most significant characteristics of the metaset concept is its computer-oriented design. Definitions of fundamental notions, like membership, equality or algebraic operations, may be formulated in a way which makes them easily implementable using programming languages (Starosta & Kosiński, 2009). This facilitates fast and efficient computer representation and processing of vague data. Additionally, several important theoretical results may be obtained for the metasets which are representable in computers, because of their finite structure. Some of them, like Lemma 3, constitute the base for the mechanism discussed here.

Fundamental concepts
The concept of metaset is strictly based on the classical Zermelo-Fraenkel set theory (ZFC). We define a metaset as a set of ordered pairs. The first element of a pair is a member of the metaset, which is another metaset. The second element of the pair is a node of the binary tree which, informally speaking, specifies the membership degree of the first element in the metaset.

Definition 1. A metaset is a crisp set which is either the empty set $\emptyset$ or which has the form $\tau = \{ \langle \sigma, p \rangle : \sigma \text{ is a metaset},\ p \in \mathbb{T} \}$.

The definition is recursive, however it is founded by the empty set $\emptyset$, by the Axiom of Foundation in ZFC (Kunen, 1980). First elements of ordered pairs contained in the metaset are called its potential elements.
From the classical set theory point of view, a metaset is a relation between a crisp set of other metasets and a set of nodes of the tree $\mathbb{T}$. Therefore, we adopt some terminology associated with relations. For the given metaset $\tau$ the set of its potential elements, $\mathrm{dom}(\tau) = \{ \sigma : \exists_{p \in \mathbb{T}}\ \langle \sigma, p \rangle \in \tau \}$, is called the domain of the metaset $\tau$. Its range is the following set: $\mathrm{ran}(\tau) = \{ p : \exists_{\sigma \in \mathrm{dom}(\tau)}\ \langle \sigma, p \rangle \in \tau \}$. The reader may confirm that $\tau \subset \mathrm{dom}(\tau) \times \mathrm{ran}(\tau) \subset \mathrm{dom}(\tau) \times \mathbb{T}$. For metasets $\tau$ and $\sigma$ the set $\tau[\sigma] = \{ p \in \mathbb{T} : \langle \sigma, p \rangle \in \tau \}$ is called the image of the metaset $\tau$ at the metaset $\sigma$. The image $\tau[\sigma]$ is the empty set $\emptyset$ whenever $\sigma$ is not a potential element of $\tau$.
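Finite metasets map naturally onto nested immutable sets. The following sketch assumes one possible encoding, a `frozenset` of `(metaset, node)` pairs with nodes as binary strings; the names `EMPTY`, `dom`, `ran` and `image` are ours and are not taken from the implementation mentioned in the chapter.

```python
EMPTY = frozenset()  # the empty metaset


def dom(tau):
    """Domain: the set of potential elements of tau."""
    return frozenset(sigma for sigma, _ in tau)


def ran(tau):
    """Range: the set of nodes occurring as second elements of pairs."""
    return frozenset(p for _, p in tau)


def image(tau, sigma):
    """Image of tau at sigma: all nodes paired with sigma in tau."""
    return frozenset(p for s, p in tau if s == sigma)


# tau = { <EMPTY, 0>, <EMPTY, 10> }: a metaset with one potential element.
tau = frozenset({(EMPTY, "0"), (EMPTY, "10")})
```

Here `dom(tau)` contains only `EMPTY`, `ran(tau)` and `image(tau, EMPTY)` both equal `{"0", "10"}`, and the image at any metaset that is not a potential element is empty.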
Example 1. The simplest metaset is the empty set $\emptyset$. It may be a potential element of other metasets: […]

In this paper we do not deal with metasets in general. We focus here on very specific classes relevant to the character recognition problem. Narrowing the domain of discourse simplifies the formulations of some results too. We now introduce two classes of metasets used for the representation of characters and patterns.
Let $A$ be a maximal finite antichain in $\mathbb{T}$. A non-empty metaset of the form $\chi \subset \{ \emptyset \} \times A$ is called an A-sample metaset. Each non-empty subset $S \subset A$ determines the A-sample metaset $\{ \emptyset \} \times S$. A-sample metasets are used for representing character samples.
Let $P$ be a finite set of A-sample metasets. A non-empty metaset of the form $\pi \subset P \times A$ is called an A-pattern metaset. In other words, an A-pattern metaset has the form $\pi = \bigcup_{i=1}^{n} \{ \chi_i \} \times P_i$, where $\chi_i$ are A-sample metasets and $P_i \subset A$ are not empty for $i = 1, \dots, n$. A-pattern metasets are used for representing character patterns. We now explain the fundamental technique of interpretation used for defining relations on metasets. It also allows us to perceive a metaset as a "fuzzy" family of crisp sets. Each member of such a family represents some specific, particular point of view on the metaset.
Definition 2. Let $\tau$ be a metaset and let $C$ be a branch in the binary tree $\mathbb{T}$. The interpretation of the metaset $\tau$, given by the branch $C$, is the following crisp set: $\tau_C = \{ \sigma_C : \langle \sigma, p \rangle \in \tau \land p \in C \}$.

Thus, branches in $\mathbb{T}$ allow for producing crisp sets out of the metaset. The family of crisp sets $\{ \tau_C : C \text{ is a branch in } \mathbb{T} \}$ consists of interpretations of the metaset $\tau$. Properties of these interpretations determine properties of the metaset. Any interpretation of the empty metaset is the empty set itself, independently of the branch: $\emptyset_C = \emptyset$. The process of producing the interpretation of a metaset consists of two stages. In the first stage we remove all the ordered pairs whose second elements are conditions which do not belong to the branch $C$. The second stage replaces the remaining pairs, whose second elements lie on the branch $C$, with interpretations of their first elements, which are other metasets. This two-stage process is repeated recursively on all the levels of the membership hierarchy. As the result we obtain a crisp set.
Example 2. Let $p \in \mathbb{T}$ and let $\tau = \{ \langle \emptyset, p \rangle \}$. If $C$ is a branch, then $\tau_C = \{ \emptyset \}$ when $p \in C$ and $\tau_C = \emptyset$ when $p \notin C$. Depending on the branch the metaset $\tau$ acquires different interpretations.
An interpretation of an A-sample metaset is either the empty set $\emptyset$ or the singleton $\{ \emptyset \}$. Therefore, an interpretation of any A-pattern metaset is one of: $\emptyset$, $\{ \emptyset \}$, $\{ \{ \emptyset \} \}$, $\{ \emptyset, \{ \emptyset \} \}$. We now introduce basic set-theoretic relations for metasets. All the relations are defined using the same scheme, by referring to interpretations. We start with the membership.

Definition 3. Let $\tau$, $\sigma$ be metasets and let $p \in \mathbb{T}$. We say that $\sigma$ belongs to $\tau$ under the condition $p$ if for each branch $C$ containing $p$ holds $\sigma_C \in \tau_C$. We use the notation $\sigma \mathrel{\epsilon_p} \tau$.
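The two-stage process of Definition 2 is directly recursive. The sketch below assumes metasets encoded as `frozenset`s of `(metaset, node)` pairs with nodes as binary strings; a branch, which is an infinite object, is approximated by a finite prefix that must be at least as long as every node occurring in the metaset. The function name `interpret` is hypothetical.

```python
EMPTY = frozenset()  # the empty metaset; its interpretation is always empty


def interpret(tau, branch):
    """Interpretation of tau given by a branch.

    `branch` is a finite prefix of the branch C (a string of 0s and 1s),
    assumed to be at least as long as every node occurring in tau.
    A node p lies on the branch iff p is an initial segment of the prefix.
    """
    return frozenset(
        interpret(sigma, branch)           # stage 2: interpret the first elements
        for sigma, p in tau
        if branch.startswith(p)            # stage 1: keep pairs whose node is on C
    )


# Example 2 with p = 01: the result depends on whether the branch passes through p.
tau = frozenset({(EMPTY, "01")})
on_branch = interpret(tau, "0110")    # the singleton containing EMPTY
off_branch = interpret(tau, "0010")   # the empty set

# An A-pattern metaset (A = level 2) whose interpretation has two members.
chi1 = frozenset({(EMPTY, "00")})
chi2 = frozenset({(EMPTY, "01")})
pattern = frozenset({(chi1, "00"), (chi2, "00")})
both = interpret(pattern, "00")
```

On the branch through 00 the sample `chi1` is interpreted as the singleton and `chi2` as the empty set, so `both` is the two-element crisp set listed among the possible interpretations of A-pattern metasets.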
Note that in fact we define an infinite number of membership relations here, each designated with a different condition. The membership under the root condition, $\sigma \mathrel{\epsilon_{\mathbb{1}}} \tau$, corresponds to the crisp, classical membership. The root $\mathbb{1}$ designates the highest membership degree, since it is the largest element in $\mathbb{T}$. Stronger conditions designate lower degrees of membership. We also define an independent set of non-membership relations. The reason for this lies in the fact that $\neg\, \sigma \mathrel{\epsilon_p} \tau$ does not imply that for each branch $C$ containing $p$ holds $\sigma_C \notin \tau_C$. It merely means that $\sigma_C \in \tau_C$ does not hold for every such branch; however, there may still exist branches for which it is true.

Definition 4. Let $\tau$, $\sigma$ be metasets and let $p \in \mathbb{T}$. We say that $\sigma$ is not a member of $\tau$ under the condition $p$ if for each branch $C$ containing $p$ holds $\sigma_C \notin \tau_C$. We use the notation $\sigma \mathrel{\not\epsilon_p} \tau$.

It might seem strange to the reader that two metasets may be in membership and non-membership relations simultaneously. The relations must be qualified by incomparable conditions, though.
As we see, $\neg\, \sigma \mathrel{\epsilon_p} \tau$ does not completely exclude the membership of $\sigma$ in $\tau$, even for $p = \mathbb{1}$. The fact that $\neg\, \sigma \mathrel{\epsilon_{\mathbb{1}}} \tau$ does not contradict $\sigma \mathrel{\epsilon_p} \tau$ for some $p \in \mathbb{T}$. It merely says that $\sigma$ cannot belong to $\tau$ under the condition $\mathbb{1}$. For incomparable conditions $p$, $q$ it is possible that $\sigma \mathrel{\not\epsilon_p} \tau$ and at the same time $\sigma \mathrel{\epsilon_q} \tau$. But it is not true that $\sigma \mathrel{\not\epsilon_p} \tau \land \sigma \mathrel{\epsilon_p} \tau$ for any $p$. Analogously, by referring to interpretations, we define two sets of equality relations.

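Definition 5. Let $p \in \mathbb{T}$ and let $\tau$, $\sigma$ be metasets. We say that $\sigma$ is equal to $\tau$ under the condition $p$ if for each branch $C$ containing $p$ holds $\sigma_C = \tau_C$. We use the notation $\sigma \approx_p \tau$.

Definition 6. Let $p \in \mathbb{T}$ and let $\tau$, $\sigma$ be metasets. We say that $\sigma$ is different than $\tau$ under the condition $p$ if for each branch $C$ containing $p$ holds $\sigma_C \neq \tau_C$. We use the notation $\sigma \not\approx_p \tau$.
Similarly as for conditional membership, it is possible that $\sigma \approx_p \tau \land \sigma \not\approx_q \tau$ for some metasets $\sigma$, $\tau$ and $p, q \in \mathbb{T}$. The certainty grades for relations on metasets are represented by sets of nodes of the binary tree and they may be evaluated as real numbers. We do not develop the general theory here; the interested reader is referred to (Starosta, 2010). Instead, we show how to evaluate the degrees of membership, non-membership, equality and difference for A-sample metasets and A-pattern metasets, when the maximal finite antichain $A$ is fixed. Let $\sigma$, $\eta$ be A-sample metasets and let $\tau$ be an A-pattern metaset. The following sets contained in $A$ are called the membership, non-membership, equality and difference sets of $\sigma$ in $\tau$ (or $\eta$), respectively:

$M(\sigma, \tau) = \{ p \in A : \sigma \mathrel{\epsilon_p} \tau \}$, (8)
$N(\sigma, \tau) = \{ p \in A : \sigma \mathrel{\not\epsilon_p} \tau \}$, (9)
$E(\sigma, \eta) = \{ p \in A : \sigma \approx_p \eta \}$, (10)
$D(\sigma, \eta) = \{ p \in A : \sigma \not\approx_p \eta \}$. (11)

The corresponding membership, non-membership, equality and difference values are the following numbers:

$m(\sigma, \tau) = \sum_{p \in M(\sigma, \tau)} \frac{1}{2^{|p|}}$, (12)
$n(\sigma, \tau) = \sum_{p \in N(\sigma, \tau)} \frac{1}{2^{|p|}}$, (13)
$e(\sigma, \eta) = \sum_{p \in E(\sigma, \eta)} \frac{1}{2^{|p|}}$, (14)
$d(\sigma, \eta) = \sum_{p \in D(\sigma, \eta)} \frac{1}{2^{|p|}}$. (15)

Clearly, by Lemma 1 they all range between 0 and 1, inclusive.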
It is worth stressing, that A-sample metasets and A-pattern metasets have the following important property.
Lemma 3. Let $A$ be a maximal finite antichain. Let $\sigma$, $\eta$ be arbitrary A-sample metasets and let $\tau$ be an arbitrary A-pattern metaset. The following equations hold: $m(\sigma, \tau) + n(\sigma, \tau) = 1$ and $e(\sigma, \eta) + d(\sigma, \eta) = 1$.

Proof. First, observe that $M(\sigma, \tau) \cap N(\sigma, \tau) = \emptyset$ and $E(\sigma, \eta) \cap D(\sigma, \eta) = \emptyset$. Indeed, it is not possible that for some $p \in A$ simultaneously hold $\sigma \mathrel{\epsilon_p} \tau$ and $\sigma \mathrel{\not\epsilon_p} \tau$, or $\sigma \approx_p \eta$ and $\sigma \not\approx_p \eta$. Therefore, by using Lemma 1 we may reformulate the thesis as follows:

$M(\sigma, \tau) \cup N(\sigma, \tau) = A$, (18)
$E(\sigma, \eta) \cup D(\sigma, \eta) = A$. (19)

To prove (18) it is enough to show that for each $p \in A$ either $\sigma \mathrel{\epsilon_p} \tau$ or $\sigma \mathrel{\not\epsilon_p} \tau$ is true. In other words, either for all branches $C$ containing $p$ holds $\sigma_C \in \tau_C$ or for all such branches holds $\sigma_C \notin \tau_C$. Clearly, for any branch $C$ either $\sigma_C$ is a member of $\tau_C$ or not; the question is whether the (non-)membership is maintained for all interpretations determined by a $p \in A$. This is true for an A-sample metaset $\sigma$ and an A-pattern metaset $\tau$, since $\mathrm{ran}(\sigma) \subset A$ and $\mathrm{ran}(\tau) \subset A$ and also $\bigcup_{\eta \in \mathrm{dom}(\tau)} \mathrm{ran}(\eta) \subset A$. Therefore, there exist no conditions stronger than $p$ which could affect the interpretations. In other words, if $C'$ and $C''$ are different branches containing $p \in A$, then $\sigma_{C'} = \sigma_{C''}$ and $\tau_{C'} = \tau_{C''}$. The proof of (19) is analogous.
The lemma says that there is no hesitancy in membership or equality for such metasets. This is not true for metasets in general. There exist metasets $\alpha$, $\beta$ with infinite ranges such that for any $p \in \mathbb{T}$ neither $\alpha \mathrel{\epsilon_p} \beta$ nor $\alpha \mathrel{\not\epsilon_p} \beta$ is true, see (Starosta, 2010) for details. When we translate this property into the language of character recognition, it says that for each pixel of a character we may decide whether it matches some pattern (or another character) or not. There is no doubt about it.
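Because Lemma 3 guarantees that all branches through a node of $A$ give the same answer, the membership and equality sets of A-sample and A-pattern metasets can be computed by inspecting a single branch per node. The sketch below relies on that observation and on an assumed encoding (metasets as `frozenset`s of `(metaset, node)` pairs, nodes as binary strings); all function names are hypothetical.

```python
from fractions import Fraction

EMPTY = frozenset()


def interpret(tau, branch):
    """Interpretation of tau on the branch whose finite prefix is `branch`."""
    return frozenset(interpret(s, branch) for s, p in tau if branch.startswith(p))


def value(nodes):
    """Numerical evaluation of a set of nodes: sum of 1 / 2^|p|."""
    return sum((Fraction(1, 2 ** len(p)) for p in nodes), Fraction(0))


def membership_set(sigma, tau, antichain):
    # For p in A, the node p itself serves as the branch prefix: no node of the
    # antichain other than p is an initial segment of p.
    return {p for p in antichain if interpret(sigma, p) in interpret(tau, p)}


def equality_set(sigma, eta, antichain):
    return {p for p in antichain if interpret(sigma, p) == interpret(eta, p)}


A = {"00", "01", "10", "11"}                         # level 2 as the antichain
sigma = frozenset({(EMPTY, "00"), (EMPTY, "01")})    # A-sample metaset
eta = frozenset({(EMPTY, "00"), (EMPTY, "10")})      # another A-sample metaset
tau = frozenset({(eta, "00"), (eta, "01")})          # A-pattern metaset

M = membership_set(sigma, tau, A)
E = equality_set(sigma, eta, A)
m, n = value(M), value(A - M)    # membership and non-membership values
e, d = value(E), value(A - E)    # equality and difference values
```

In this toy case the membership and non-membership values sum to 1, and so do the equality and difference values, as the lemma states.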

Properties relevant to character recognition
In this section we prove some technical facts strictly relevant to the character recognition mechanism. We refer to them in the sequel. Proofs are not required for understanding the idea, so they may be skipped on first reading. We supply them for mathematical completeness and clarity.
The following lemma tells that for two given A-sample metasets τ and σ, their conditional difference is determined by the elements of the symmetric difference of their ranges: ran(τ) △ ran(σ), whereas their conditional equality is determined by the complement to A of the symmetric difference: A \ (ran(τ) △ ran(σ)).
We may express this property in terms of character recognition as follows. When comparing two characters, not only the pixels belonging to both of them affect the result of the comparison, but also the pixels that belong to the background of both. If a pixel belongs to one of the characters and for the other character the same pixel forms the background, then such a pixel asserts the difference between the characters. To prove (21) note that: […]

The set $R = A \setminus (\mathrm{ran}(\tau) \triangle \mathrm{ran}(\sigma))$ is the equality set for $\tau$ and $\sigma$, and $A \setminus R$ is the difference set: $E(\tau, \sigma) = R$ and $D(\tau, \sigma) = A \setminus R$.

Lemma 4 enables evaluation of the equality degree of metasets representing character samples, i.e., the similarity of two characters. We now prove the main result, which shows the construction of the membership and non-membership sets for the given A-sample metaset and A-pattern metaset. In other words, it allows for evaluation of the similarity degree of a character testing sample (CTS) to the compound character pattern (CCP).
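The characterization of equality in terms of the symmetric difference of ranges reduces to plain set arithmetic. A minimal sketch, assuming nodes encoded as binary strings and A-sample metasets represented only by their ranges; the function names are ours.

```python
def equality_set(ran_tau, ran_sigma, antichain):
    """Nodes of A on which both samples agree (foreground or background)."""
    return set(antichain) - (set(ran_tau) ^ set(ran_sigma))


def difference_set(ran_tau, ran_sigma, antichain):
    """Nodes of A selected in exactly one of the two samples."""
    return set(antichain) & (set(ran_tau) ^ set(ran_sigma))


A = {"00", "01", "10", "11"}   # a 2 x 2 matrix mapped onto level 2
first = {"00", "01"}           # selected pixels of one character
second = {"00", "10"}          # selected pixels of another character

agree = equality_set(first, second, A)
differ = difference_set(first, second, A)
```

The pixel 00 belongs to both characters and the pixel 11 to the background of both, so both count towards equality; 01 and 10 are selected in only one of the characters and count towards the difference.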
In the following theorem the metaset $\sigma$ represents the testing sample (CTS) and $\rho$ is the compound character pattern (CCP) built up of potential elements $\pi_i$ representing characters. The sets $P_i$ and $S$ constitute the structures of the pattern samples and the input sample. The sets $Q_i$ represent equality degrees of the CTS and CCP elements and the sets $R_i$ represent the qualities of CCP members.

Theorem 5.
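Let $A$ be a maximal finite antichain in $\mathbb{T}$ and let $i = 1, \dots, k$.
To prove (26) take $u \in U$. There exists $i$ such that $u \in R_i \cap Q_i$, so $\sigma \approx_u \pi_i$, and by Definition 3 we have $\pi_i \mathrel{\epsilon_u} \rho$. Thus, by Lemma 2 we obtain $\sigma \mathrel{\epsilon_u} \rho$.
To prove (27) let $R = \bigcup_{i=1}^{k} R_i$ and let $\bar{Q}_i = A \setminus Q_i$. We may split each $R_i$ into two parts: $R_i = (R_i \cap Q_i) \cup (R_i \cap \bar{Q}_i)$. Let $u \in A \setminus U$ and let $C$ be a branch containing $u$. Note that $U \subset R \subset A$, so we consider two cases: $u \in R \setminus U$ and $u \in A \setminus R$. In the first case let $I \subset \{ 1, \dots, k \}$ be the set of all those $i$ for which $u \in R_i \cap \bar{Q}_i$. Since $u \in C$ and for each $i \in I$ the intersection $R_i \cap C$ contains at most one element (which is $u$), then by Definition 2 we have $\rho_C = \{ (\pi_i)_C : i \in I \}$. The last equality is implied by the following observation (since $u \notin U$): for each $i$, if $u \in R_i$, then $u \in R_i \cap \bar{Q}_i$.

The set $U$ is the membership set for $\sigma$ in $\rho$, and $A \setminus U$ is the non-membership set: $M(\sigma, \rho) = U$ and $N(\sigma, \rho) = A \setminus U$. The sequence of equality sets $Q_i = E(\sigma, \pi_i)$ enables evaluation of equality degrees of the A-sample metaset $\sigma$ and potential elements $\pi_i$ of the A-pattern metaset $\rho$. They show the distribution of the overall similarity degree among the pattern elements.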
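The construction used in the proof can be sketched with plain sets. The code below assumes, as the proof suggests, that $Q_i = A \setminus (S \triangle P_i)$ and $U = \bigcup_i (R_i \cap Q_i)$; since the theorem's statement is not reproduced here, these formulas and all names in the code should be read as our reading of the text rather than as the author's exact formulation.

```python
from fractions import Fraction


def value(nodes):
    """Numerical evaluation of a set of nodes: sum of 1 / 2^|p|."""
    return sum((Fraction(1, 2 ** len(p)) for p in nodes), Fraction(0))


def match(sample, pattern, antichain):
    """Return (U, [Q_1, ..., Q_k]) for a testing sample and a compound pattern.

    sample  -- S, the set of nodes selected in the testing sample
    pattern -- list of (P_i, R_i): selected nodes and quality area of each element
    """
    A = set(antichain)
    equality_sets = [A - (set(sample) ^ set(P_i)) for P_i, _ in pattern]
    U = set()
    for (_, R_i), Q_i in zip(pattern, equality_sets):
        U |= set(R_i) & Q_i    # pixels where element i is valid and agrees with S
    return U, equality_sets


A = {"00", "01", "10", "11"}
S = {"00", "01"}
pattern = [
    ({"00", "10"}, {"00", "01"}),         # (P_1, R_1), hypothetical data
    ({"00", "01", "11"}, {"10", "11"}),   # (P_2, R_2), hypothetical data
]
U, Q = match(S, pattern, A)
similarity = value(U)                     # membership value of the sample
```

The first element agrees with the sample on 00 and 11 but is only valid on 00 and 01, so it contributes 00; the second agrees on 00, 01, 10 and is valid on 10 and 11, so it contributes 10.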

Character recognition with metasets
In this section we explain the core of the idea of applying metasets to recognition of characters.
We show how to represent characters and compound character patterns as metasets. Then we show how to calculate appropriate membership and equality degrees and interpret them as quality grades of the input samples. The procedure we discuss here involves two stages. During the first stage we define the compound character pattern (CCP). It represents a single character and it is comprised of a number of different samples of the character. The samples are graded with quality grades.
In the second stage we supply character testing samples (CTS) and we calculate the result, which is the similarity degree of the CTS to the CCP. The similarity degree tells how close the CTS is to the character represented by the CCP. Besides the overall similarity degree we also obtain the sequence of similarity degrees of the CTS to each member of the CCP. These degrees show how close the input sample is to each element of the compound pattern.
The compound character pattern is represented by a metaset, whose potential elements represent particular character samples of the pattern.The testing sample is represented by a metaset too.The resulting similarity degree is the membership degree of CTS in CCP.The additional similarity degrees of CTS to pattern elements are partial equality degrees of CTS to potential elements of CCP.
One of the goals of this section is to convince the reader that partial membership and equality degrees of metasets encoding character samples properly reflect the human perception of similarity of characters.

Representing characters as metasets
Characters are displayed on the matrix $X^c_r$ comprised of $r$ rows and $c$ columns (shortly: $X$). The natural numbers $r$ and $c$ may be arbitrary, however they must remain constant throughout the matching process: all the character samples in the CCP as well as all the CTS input samples must use the same matrix dimensions. We focus on monochromatic images here, so the cells of the matrix acquire two states: selected ones belong to the character and deselected ones form the background. For the given character $a$ displayed on the matrix, the set of selected cells is denoted by $X_a$. Prior to defining character samples, a mapping $m : X \to \mathbb{T}$ between matrix cells and nodes of the binary tree must be established. To each cell of the matrix a node of the binary tree must be assigned so that the set of assigned nodes, denoted by $m(X)$, forms a maximal finite antichain $A$ in the tree $\mathbb{T}$. The assignment of nodes to cells is arbitrary; no special ordering is required. The antichain and the mapping are constant for the whole character matching process: all the CTS and CCP samples use the same $A$ and $m$. Note that since the nodes assigned to cells form a MFA, any branch in the tree contains exactly one assigned node.

The simplest example of such an assignment is when $r \cdot c = 2^k$ for some $k$. In such a case the nodes of the $k$-th level of the binary tree may be assigned in an arbitrary way to the cells. We call such a one-to-one mapping of the matrix and some level in $\mathbb{T}$ an even mapping. Figure 2 demonstrates a sample 4 × 4 matrix with a mapping $m : X^4_4 \to \mathbb{T}_4$ onto the level 4 of the tree. For simplicity, most examples will be based on this 4 × 4 matrix and the mapping. When $r \cdot c \neq 2^k$ for any $k \in \mathbb{N}$, then the cells of the matrix must be mapped to nodes from different levels of $\mathbb{T}$, since levels contain $2^k$ nodes. Anyway, the image $m(X^c_r)$ must be a MFA. We call such a mapping uneven. See Fig. 3 for an example of an uneven 3 × 4 mapping. For an even mapping the placement of particular nodes is rather irrelevant. On the other hand, when the mapping is uneven, the nodes from different levels assigned to cells impose the following interpretation. Parts of the matrix which are more important for the particular character, and which we want to stress somehow by distinguishing them from the rest, are associated with nodes which are closer to the root, i.e., the weaker conditions. The cells which are of less importance contain nodes from lower levels of the tree, i.e., the stronger conditions.
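An even mapping is easy to generate mechanically. The sketch below assigns the nodes of the $k$-th level to cells in row-major order; this ordering is an arbitrary choice of ours (the text allows any one-to-one assignment, and the layout of Fig. 2 may differ), and the function name `even_mapping` is hypothetical.

```python
def even_mapping(rows, cols):
    """Map each cell (row, column) to a distinct node of level k, where
    rows * cols == 2**k. Cells are numbered in row-major order and the cell
    number is written as a k-digit binary sequence."""
    n = rows * cols
    k = n.bit_length() - 1
    if n < 1 or 2 ** k != n:
        raise ValueError("rows * cols must be a power of two for an even mapping")
    return {
        (r, c): format(r * cols + c, f"0{k}b") if k else ""
        for r in range(rows)
        for c in range(cols)
    }


mapping = even_mapping(4, 4)   # 16 cells onto the 16 nodes of level 4
```

For a 3 × 4 matrix the function raises `ValueError`, which corresponds to the case where an uneven mapping onto nodes of different levels is required.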
Weaker conditions have more impact on the resulting membership and equality degrees than stronger ones (cf. Equations 12-15). For instance, we might be particularly interested in the proper recognition of the dot over the letter 'i'. In such a case we may use the assignment depicted in Fig. 4. The reader is encouraged to check that the nodes form a maximal antichain. The cells containing the nodes 10 and 110 are more sensitive to errors than other cells and they influence the resulting similarity degree more than others. Note that even when $r \cdot c = 2^k$ for some $k$, the mapping might be uneven too, since we may assign nodes from different levels to cells in order to stress some areas of the matrix and diminish the influence of others. Anyway, the requirement that the range forms a maximal antichain must be fulfilled. The assignment in Fig. 5 shows how to stress the upper-left corner of the $X^2_2$ matrix. The impact of the lower row is much less than the impact of the upper row in this case.

We now construct the metaset $\chi$ representing the character denoted by $a$ displayed on the matrix $X$. The domain of the metaset consists of the empty set only: $\mathrm{dom}(\chi) = \{ \emptyset \}$. The set $m(X_a) \subset A$ of nodes corresponding to the marked cells of the matrix forms the range of the metaset representing the sample: $\mathrm{ran}(\chi) = m(X_a)$. Since the domain of $\chi$ contains exactly one element $\emptyset$, then $\mathrm{ran}(\chi) = \chi[\emptyset]$. Thus, $\chi = \{ \emptyset \} \times m(X_a)$. Note that we interpret the membership degree of $\emptyset$ in $\chi$ as the set of selected cells of the character. This membership degree is irrelevant by itself, however, it determines the equality degree of this sample and any other CTS supplied during the recognition phase. It also affects the overall result, which is the membership degree of the CTS in the CCP.
As an example, let us represent the character 'c' on the 4 × 4 matrix with the standard assignment, as in Fig. 6. The metaset $\chi$ representing this letter is given by Equation (34).
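The effect of an uneven mapping and the construction of a sample metaset can be illustrated as follows. The 2 × 2 assignment used here is a hypothetical one that stresses the upper row; it is not claimed to be the assignment of Fig. 5, and the names `weight` and `sample_metaset` are ours.

```python
from fractions import Fraction

EMPTY = frozenset()

# Hypothetical uneven mapping of a 2 x 2 matrix: (row, column) -> node.
# Nodes closer to the root (weaker conditions) carry more weight.
mapping = {(0, 0): "0", (0, 1): "10", (1, 0): "110", (1, 1): "111"}


def weight(p):
    """Contribution of the cell holding node p to any degree value."""
    return Fraction(1, 2 ** len(p))


def sample_metaset(selected_cells, mapping):
    """Sample metaset of a character: EMPTY paired with the node of each
    selected cell, i.e. the product of {EMPTY} and m(X_a)."""
    return frozenset((EMPTY, mapping[cell]) for cell in selected_cells)


total = sum(weight(p) for p in mapping.values())   # equals 1 for an MFA
chi = sample_metaset({(0, 0), (1, 1)}, mapping)    # a diagonal "character"
```

Under this assignment the upper-left cell alone accounts for half of any similarity value, while each cell of the lower row accounts for one eighth.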

Defining the compound pattern
Defining the compound character pattern (CCP) is the essential step in the process of character recognition with metasets. The CCP consists of a number of character samples accompanied by quality grades. The samples describe some point of view on the character. […] different somehow, and therefore they cannot be included in the representation of the character without causing any doubt. They may also be treated as a mask for excluding parts of the matrix from the matching process. Anyway, when calculating equality degrees the whole matrix area is taken into account, so excluded parts play a role only in determining the membership (similarity) degree to the CCP.

Evaluating similarity degrees
Once the compound character pattern (CCP) is prepared we are ready to supply testing character samples (CTS) and evaluate their similarity degrees.The CTS is represented as a metaset in exactly the same manner as CCP elements, i.e., we use the same matrix X with the same mapping m of cells to some maximal finite antichain A.
The process of matching the input character sample represented by the A-sample metaset $\tau$ against the prepared compound character pattern represented by the A-pattern metaset $\pi$ involves calculation of the membership degree of $\tau$ in $\pi$ and the sequence of equality degrees of $\tau$ and potential elements $\chi_i$ of $\pi$. The membership degree tells us to what extent the CTS resembles the character defined by the CCP and is represented by the set $M(\tau, \pi)$ (see Equation 8). The equality degrees play a supplemental role and they show the similarity of $\tau$ and each pattern element separately; the pattern elements, contrary to the CCP, are single characters. The equality degrees are represented by the sets $E(\tau, \chi_i)$ (see Equation 10). We apply here Theorem 5 for determining the similarity degrees and also Equations 12-15 for numerical evaluation of the degrees. Let us make calculations for the sample letter 'c' shown in Fig. 6. The metaset $\chi$ representing the character is defined by Equation (34). First, we establish the notation.
The left-hand sides of the following equations correspond to variables used in Theorem 5 and on the right-hand sides we use metasets defined in previous sections, $i = 1, 2, 3$. […] The results show that the character in Fig. 6 resembles the characters $c_1$ and $c_3$ in Fig. 7 equally well, whereas it resembles the character $c_2$ slightly worse. Note that we do not take into account the qualities $q_i$ of the samples $c_i$ when calculating the equality degrees. Now we calculate the membership set $M(\chi, \pi)$ ($U$ in terms of Theorem 5) and the membership value $m(\chi, \pi)$ of $\chi$ in $\pi$. We apply Equations 46-48. […] 0111, 1010, 1011, 1101, 1110, 1111 […] This means that the sample $\chi$ perfectly matches the pattern $\pi$.

Discussion of the results
The interesting question that arises is: what are the similarity degrees of each pattern element $\chi_i$ to the pattern $\pi$ itself? It turns out that the pattern samples do not have to be of the best quality in order to assure that other input samples result in perfect matches. We show that the membership sets $M(\chi_i, \pi)$ are proper subsets of $\mathbb{T}_4$ and therefore the membership values $m(\chi_i, \pi)$ are less than 1 for all $i = 1, 2, 3$. We present the results of calculations only, leaving the details to the reader. Let us start with equality sets: $E(\chi_i, \chi_j) = \mathbb{T}_4 \setminus D(\chi_i, \chi_j)$, where $D(\chi_i, \chi_j)$ are the difference sets depicted in Table 1.

Thus, the similarity values of the characters $c_1$, $c_2$ and $c_3$ to the CCP built on top of them are 0.94, 0.88 and 0.88, respectively. None of them matches the pattern to the highest degree. Even though the membership values of $\chi_i$ in $\pi$ are less than 1, there exist samples which match the CCP to the highest possible degree, with the membership value equal to 1. Besides the character in Fig. 6, there are three more, shown in Fig. 8, for which the similarity degree reaches the maximal value. Note that the samples in Fig. 6 and Fig. 8 differ only in pixels 0011 and 1100, which in at least one of the CCP elements in Fig. 7 belong to the sample and in at least another one belong to the background, while not being excluded by the quality area.
The character samples $c_i$ and their quality grades $q_i$ were intentionally chosen so that they do not match the CCP to the highest degree, in order to demonstrate the interpolation capabilities of the new mechanism. In typical cases, one constructs the CCP based on good samples, which reflect most characteristics of the modelled pattern.

When creating a CCP one should bear in mind the following rule. Each pixel of the matrix must be covered by the foreground or background of at least one sample in the pattern. By covering we understand that it is included in at least one quality area. The reader may confirm that this rule is preserved in our example. If there exists a cell which is contained in the exclusion area of each sample, then reaching the similarity value of 1 is not possible for any sample.
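The covering rule can be verified automatically before a pattern is used. A minimal sketch, assuming quality areas are given as sets of nodes of the antichain; the function name `uncovered_cells` and the sample data are hypothetical.

```python
def uncovered_cells(antichain, quality_areas):
    """Nodes of the antichain that lie outside every quality area.
    A non-empty result means no sample can reach the similarity value 1."""
    covered = set().union(*quality_areas)
    return set(antichain) - covered


A = {"00", "01", "10", "11"}
good = [{"00", "01"}, {"01", "10", "11"}]   # every pixel is covered
bad = [{"00", "01"}, {"01", "10"}]          # pixel 11 is excluded everywhere

missing_good = uncovered_cells(A, good)
missing_bad = uncovered_cells(A, bad)
```

A pixel reported by this check is excluded by every sample of the pattern, so it can never enter the membership set of any testing sample.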

Conclusions
We demonstrated the method for character recognition based on metasets. The core of the idea lies in representing character samples and character patterns directly as metasets, as well as interpreting the membership and the equality degrees of corresponding metasets as the similarity degrees of characters. Although the idea is quite simple and straightforward, it seems to work well. The experiments carried out with the computer application implementing this model confirm that it adequately reflects human perception of similarity of characters. As we have seen, the mechanism requires some laborious calculations, however they are to be carried out by machines. So far, no comparisons with other techniques for character recognition have been made. It must be stressed that the presented method is not by itself competitive with commercial solutions yet. It is rather a sketch of an idea which, when applied in cooperation with other techniques used for data processing, like centering and sharpening of character images, may turn out to have some advantages over other solutions.
The main goal of this chapter was to convince the reader, that the idea of metaset is applicable to solving problems related to processing of vague, imprecise data.And moreover, that modelling of real world using metasets is quite natural and simple.We showed that metaset membership correctly mimics similarity when characters are appropriately encoded.It should be clear, that the discussed method has much wider scope of applications than recognition of letters.Although we presented the version for monochromatic (binary) images, it is not difficult to generalize it to color (many-valued) ones.The next step in research on the subject will focus on determining the characteristics of graphical data for which this method gives the best results.

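Lemma 1. If $A \subset \mathbb{T}$ is a maximal finite antichain in $\mathbb{T}$, then $\sum_{p \in A} \frac{1}{2^{|p|}} = 1$.

Proof. Each node $p \neq \mathbb{1}$ is a binary sequence which represents a natural number $\#p$. Therefore, each $p \neq \mathbb{1}$ corresponds to an interval $I_p = \left[ \frac{\#p}{2^{|p|}}, \frac{\#p + 1}{2^{|p|}} \right) \subset [0, 1)$, and $\mathbb{1}$ corresponds to $I = [0, 1)$. The length of each interval $I_p$ is $\frac{1}{2^{|p|}}$.

Thus, for $i \in I$ we have $(\pi_i)_C \in \rho_C$. However, for $i \in I$ we also have $u \in \bar{Q}_i$, so by (25) $\sigma \not\approx_u \pi_i$ holds and consequently $\sigma_C \neq (\pi_i)_C$. Since $\sigma_C$ is different from all the members of $\rho_C$, then $\sigma_C \notin \rho_C$ for any branch $C \ni u$, which gives $\sigma \mathrel{\not\epsilon_u} \rho$. This proves (27) for the case when $u \in R \setminus U$. If $u \in A \setminus R$, then for $C \ni u$ we have $\rho_C = \emptyset$, so $\sigma \mathrel{\not\epsilon_u} \rho$ for any $\sigma$, which implies the second case of (27).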

Fig. 2. A standard mapping of the level 4 of the binary tree to cells of the 4 × 4 matrix.

Fig. 3. Mapping of some antichain in $\mathbb{T}$ to cells of the 3 × 4 matrix.

Fig. 8. Three remaining samples with the best similarity to the pattern represented by $\chi$.

Table 1. Difference sets $D(\chi_i, \chi_j)$ for compound pattern elements.

The matrix in Table 1 is symmetric since $\chi_i \approx_p \chi_j$ is equivalent to $\chi_j \approx_p \chi_i$. Empty sets on the diagonal confirm that $\chi_i \approx_p \chi_i$ for each $p \in \mathbb{T}_4$. We conclude that the equality values are as depicted in Table 2 (rows and columns $\chi_1$, $\chi_2$, $\chi_3$; entries $e(\chi_i, \chi_j)$).

Table 2. Equality values $e(\chi_i, \chi_j)$ for compound pattern elements.

Based on the above sets we calculate the membership sets and the membership values, similarly as before.