1. Introduction
Cellular Automata (CA) are spatiotemporal discrete systems (von Neumann, 1966) that can model complex dynamic systems. Successful CA applications have been reported in a variety of problem domains. Digital image processing is one of them, as reported by Wongthanavasu et al. (2003; 2004; 2007) and Rosin (2006).
Generalized Multiple Attractor CA (GMACA) was introduced for elementary pattern recognition (Ganguly et al., 2002; Maji et al., 2003; 2008). It is a promising pattern classifier built on a simple local network of Elementary Cellular Automata (ECA) (Wolfram, 1994), called an attractor basin, which is an inverted tree graph. GMACA orders the CA rules with a reverse engineering technique and a genetic algorithm, which leads to a major drawback in computational complexity as well as in recognition performance. Successful applications of GMACA have been reported for the error correcting problem with only one bit of noise. It shows promising results for this restricted one-bit noise, but suffers a combinatorial explosion in complexity, using associative memory, when the number of noisy bits increases.
Due to the drawbacks in complexity and recognition performance stated above, a binary CA-based classifier, called Two-class Classifier Generalized Multiple Attractor Cellular Automata with artificial point (2C2-GMACA), is presented. Its error correcting capability in pattern recognition is implemented and compared comprehensively with GMACA. In what follows, the basics of CA for pattern recognition and GMACA's configuration are presented first. Then, the 2C2-GMACA model and its performance evaluation in comparison with GMACA are provided. Finally, conclusions and discussions are given.
2. Cellular Automata for Pattern Recognition
Elementary Cellular Automata (ECA) (Wolfram, 1994) are generally utilized as a basis for pattern recognition. They are the simplest class of one-dimensional (1d) CA with n cells, 2 states and 3 neighbors. A state changes in discrete time and space: the next state of cell i is s_{i}(t+1) = f(s_{i-1}(t), s_{i}(t), s_{i+1}(t)), where f is one of the 256 ECA rules.
For n-cell ECA, the next state function is specified by a rule vector RV = <R_{0}, R_{1}, ..., R_{n-1}> assigning a rule R_{i} to each cell. Writing every rule R_{i} in its 8-bit binary form b_{7}b_{6}...b_{0}, so that R_{i} = b_{7}2^{7} + b_{6}2^{6} + ... + b_{0}2^{0}, gives an n×8 rule matrix M.
Let M(i,j) be the element of the matrix at the i^{th} (i=0,1,2,...,n-1) row and the j^{th} (j=0,1,2,...,7) column; M(i,j) contains bit b_{j} of rule R_{i}. For example, M(2,3) is b_{3} of rule R_{2} (rule 90), which is '1'. Consequently, the next state of cell i is M(i, j_{i}), where j_{i} is the decimal value of the 3 neighbouring states (s_{i-1}(t), s_{i}(t), s_{i+1}(t)) of cell i. The next state of the whole n-bit pattern is obtained by applying this to all cells in one time step.
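The rule-matrix evolution described above can be sketched as follows; reading the neighbourhood (left, middle, right) as a 3-bit number is an assumption, but it is consistent with the example (b_{3} of rule 90 is '1'):

```python
def rule_to_bits(rule):
    """Expand an ECA rule number (0-255) into its 8 output bits b_0..b_7."""
    return [(rule >> j) & 1 for j in range(8)]

def next_state(pattern, rule_vector):
    """One synchronous step of a non-uniform ECA with periodic boundary.

    pattern: list of n cell states (0/1); rule_vector: one rule per cell.
    The neighbourhood index j_i is the decimal value of
    (s_{i-1}, s_i, s_{i+1}) read as a 3-bit number.
    """
    n = len(pattern)
    M = [rule_to_bits(r) for r in rule_vector]   # the n x 8 rule matrix
    out = []
    for i in range(n):
        left, mid, right = pattern[(i - 1) % n], pattern[i], pattern[(i + 1) % n]
        j = (left << 2) | (mid << 1) | right
        out.append(M[i][j])
    return out

# Rule 90 is the XOR of the two outer neighbours:
# next_state([0, 1, 0, 0], [90, 90, 90, 90]) evolves to [1, 0, 1, 0]
```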
Suppose a system designed with a rule matrix (M) comprises a set of solutions Y = {y_{1}, y_{2}, ..., y_{k}}. For an input pattern x, the system evolves x through M; if the evolved pattern reaches a member of Y, x is recognized as that solution.
3. Generalized Multiple Attractor Cellular Automata
This section gives the detailed configuration of GMACA and its application in ECC. Suppose an n-bit pattern is sent in a communication system. Let X be the sender's pattern and Y be the receiver's pattern. The number of differing bits between X and Y is their Hamming distance (r), defined as follows:
r = Σ_{i=1}^{n} x_{i} ⊕ y_{i},
where x_{i} and y_{i} are the i^{th} bits of X and Y, and ⊕ denotes the exclusive-OR operator.
The number of possible error patterns (p_{r}) for a given r in n-bit communication can be expressed as follows:
p_{r} = n! / (r!(n-r)!).
Then, the number of all possible error patterns (p_{All}) for a given r_{max} is:
p_{All} = Σ_{r=1}^{r_{max}} p_{r},
where r_{max} is the maximum permissible noise.
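The two counting formulas, together with the Hamming distance, can be checked with a short sketch (the bit strings here are illustrative):

```python
from math import comb

def hamming(x, y):
    """Hamming distance r between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

def p_r(n, r):
    """Number of distinct error patterns with exactly r flipped bits: C(n, r)."""
    return comb(n, r)

def p_all(n, r_max):
    """All possible error patterns with 1..r_max flipped bits."""
    return sum(comb(n, r) for r in range(1, r_max + 1))

# For 4-bit patterns with r_max = 3: 4 + 6 + 4 = 14 possible error patterns.
```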
The maximum permissible noise (r_{max}) is the highest value of r allowed to occur in the communication system. The Hamming distance model of a message (pattern) and its errors can also be represented by an attractor basin: the message is a pivotal point while the errors are transient states. Thus, the error correcting problem can be solved by the Generalized Multiple Attractor Cellular Automata (GMACA).
Suppose a communication system comprises k original messages of n-bit data and the maximum permissible noise r_{max}. If error messages are to be corrected using the GMACA, a satisfactory rule vector is required. The rule vector is the result of a reverse engineering technique. Firstly, k attractor basins are randomly constructed, with the number of nodes in each attractor basin equal to p_{All}. Then the original messages are randomly mapped into pivotal points while their possible errors are randomly mapped into transient states of the same attractor basin. Finally, search heuristics such as simulated annealing (SA) and the genetic algorithm (GA) (Holland, 1992; Shuai et al., 2007; Jie et al., 2002) are applied to explore the optimal structure. The search heuristics iteratively change the directions and heights of the attractor basins until a satisfactory rule vector is acquired.
As reported in Ganguly et al. (2002) and Maji et al. (2003; 2008), GMACA provides its best pattern recognition performance when trained with r_{max} = 1. Although the recognition percentage in testing is high when r_{max} equals 1, the performance decreases sharply when r_{max} is greater than 1.
4. Proposed 2C2-GMACA Model
Due to the drawbacks in recognition performance resulting from increasing r_{max}, and the search space complexity of rule ordering, the proposed method, called Two-class Classifier Generalized Multiple Attractor Cellular Automata with artificial point (2C2-GMACA) (Ponkaew et al., 2011), is introduced. The 2C2-GMACA is designed on a two-class classifier architecture: two classes are processed at a time and the solution is a binary answer, +1 or -1, which points to the class label of the solution. There are two kinds of attractor basins: a positive attractor basin that returns +1 as the result, and a negative attractor basin otherwise.
Suppose a system consists of patterns to be classified into two classes. Given an input pattern x, the rule matrix M evolves x one time step, and a binary decision function then combines the evolved pattern with an artificial point (A), where (.) denotes the "AND" logical operator. Finally, x is considered a member of the positive attractor basin, and the classifier returns L^{+}, if the decision function yields +1; otherwise x belongs to the negative attractor basin and L^{-} is returned.
Example 1: Consider the two attractor basins of a 4-bit 2C2-GMACA recognizer with periodic boundary condition given in Fig. 2. They are designed by a rule vector <232, 212, 178, 142>, represented in a matrix M, and an artificial point (A) of '0001'. Suppose the class labels of the positive and negative attractor basins are L^{+} and L^{-}, respectively. An input pattern x is first evolved one time step by M: the next state of cell i is M(i, j_{i}), where j_{i} is the decimal value of the 3 neighbouring states of cell i.
Finally, the binary decision function processes the evolved pattern together with the artificial point A. In this example the function returns 1, meaning that the input x is classified into the positive attractor basin with class label L^{+}.
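The one-time-step evolution in Example 1 can be reproduced as below. The binary decision function itself is not fully reproduced in this excerpt, so only the evolution by the rule vector <232, 212, 178, 142> is shown, and the input pattern is assumed for illustration:

```python
def evolve(pattern, rule_vector):
    """One time step of the non-uniform CA (periodic boundary).

    The next state of cell i is bit j_i of rule R_i, where j_i is the
    decimal value of the 3-neighbour states (left, middle, right).
    """
    n = len(pattern)
    return [(rule_vector[i] >>
             ((pattern[(i - 1) % n] << 2) | (pattern[i] << 1)
              | pattern[(i + 1) % n])) & 1
            for i in range(n)]

rv = [232, 212, 178, 142]   # rule vector from Example 1
x = [1, 1, 0, 1]            # a 4-bit input pattern, assumed for illustration
x_next = evolve(x, rv)      # evolved pattern fed to the decision function
```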
4.1. 2C2-GMACA with Associative and Non-Associative Memories
Given a set of patterns Y = {y_{1}, y_{2}, ..., y_{k}} to be learnt, 2C2-GMACA can be trained with either associative memory, where all possible distorted patterns within r_{max} are generated as transient states, or non-associative memory, where only a limited number of randomly generated transient states is used.
Being a binary classifier, 2C2-GMACA conducts multiclass classification by, for example, DDAG (Decision Directed Acyclic Graph), One-versus-All, or One-versus-One. This paper focuses on the DDAG approach [28]. Suppose a set of three patterns {y_{1}, y_{2}, y_{3}} is given; the DDAG then comprises k(k-1)/2 = 3 binary classifiers, (y_{1} vs y_{2}), (y_{1} vs y_{3}) and (y_{2} vs y_{3}), and eliminates one class at each node until a single class label remains.
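A minimal sketch of the DDAG elimination scheme follows; the `binary_clf` callable stands in for a trained 2C2-GMACA two-class machine and is an assumption of this sketch:

```python
def ddag_classify(x, classes, binary_clf):
    """Decision Directed Acyclic Graph over k(k-1)/2 pairwise classifiers.

    binary_clf(x, a, b) returns +1 to keep class a, or -1 to keep class b.
    Each node eliminates one class; after k-1 decisions one label remains.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if binary_clf(x, a, b) == +1:
            remaining.pop()       # eliminate class b
        else:
            remaining.pop(0)      # eliminate class a
    return remaining[0]
```

Only k-1 of the k(k-1)/2 classifiers are evaluated for any single input, which is what makes the DDAG attractive for the multiclass extension.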
4.2. Design of Rule Vector
A majority voting rule is utilized to synthesize a rule vector for the two attractor basins. It is a one-time-step process, unlike the reverse engineering technique (Maji et al., 2003; 2008) used in GMACA, which keeps reconstructing attractor basins randomly until it arrives at the rule vector with the lowest collision. In this regard, 2C2-GMACA's time complexity for rule ordering is simply O(1). However, it must also search for an optimal artificial point, for which an evolutionary heuristic search is applied. The 2C2-GMACA synthesis scheme comprises three phases as follows.
Phase I--- Two attractor basins, namely positive and negative attractor basins, are generated. In this phase, two patterns {y_{i}, y_{j}} are taken as the pivotal points of the positive and negative basins, respectively, and their possible distorted patterns are generated as transient states at height 1.
Example 1: Fig. 3(a) represents two attractor basins based on associative memory learning of 4-bit patterns with r_{max} = 1. Suppose Y = {1101, 0010} is the set of learnt patterns. The 2C2-GMACA takes the two patterns {y_{1} = 1101, y_{2} = 0010} to process according to the multiclass classification algorithm. Let the class labels of the positive and negative attractor basins be L^{+} and L^{-}, respectively.
Phase II--- Let each pattern in an attractor basin vote, for every cell, for the next state that maps it to its pivotal point. Two matrices, one per attractor basin, collect these votes for each cell and each 3-neighbour configuration.
Example 2: As shown in Fig. 3(b), two such matrices are constructed, one from each attractor basin in Fig. 3(a).
Phase III--- The rule matrix M is determined. The matrix M, of size n×8, is the simplified form of the rule vector (RV); an element M(i, j) represents the next state of the i^{th} cell when the decimal value of its 3-neighbour configuration is j. Each element is set by majority voting over the two matrices of Phase II: 1) if the majority of votes for the next state of cell i under configuration j is '1', then M(i, j) = 1; 2) if the majority is '0', then M(i, j) = 0.
Fig. 3(c) shows that the rule vector <232, 212, 178, 142> is obtained by this majority voting technique. The rule vector (rule matrix) is utilized to evolve a given pattern in one time step to the pattern at the next time step, which becomes one of the parameters of the binary decision function.
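The three phases can be sketched in a single pass as below. The voting scheme (every pattern votes its pivot's bit for the neighbourhood it presents at each cell) and the tie-to-0 default are assumptions of this sketch, since the chapter's equations for Phases II-III are not reproduced in this excerpt:

```python
def synthesize_rule_matrix(basins):
    """Majority-vote a rule matrix M (n x 8) in one pass, i.e. O(1) time steps.

    basins: list of (pivot, transients) pairs; every transient must map to
    its pivot in one time step, so for cell i under neighbourhood j the
    desired next state is the pivot's i-th bit.  Ties default to 0 here
    (an assumption; the chapter does not specify tie handling).
    """
    n = len(basins[0][0])
    votes = [[[0, 0] for _ in range(8)] for _ in range(n)]  # votes[i][j][bit]
    for pivot, transients in basins:
        for t in list(transients) + [pivot]:   # the pivot must be a fixed point
            for i in range(n):
                j = (t[(i - 1) % n] << 2) | (t[i] << 1) | t[(i + 1) % n]
                votes[i][j][pivot[i]] += 1
    return [[1 if votes[i][j][1] > votes[i][j][0] else 0 for j in range(8)]
            for i in range(n)]
```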
4.3. Design of Artificial Point
An artificial point (A) plays a major role in the binary decision function: it interprets the next state of the input pattern into the binary answer +1 or -1. The optimal artificial point is searched for by a genetic algorithm, in which each chromosome is an n-bit candidate artificial point.
Selection is done using a random pairing approach, and traditional single-point crossover is performed at a random, common point of the n-element arrays of the two selected parents. Mutation changes bits of a chromosome with a small probability. The fitness function is calculated as a cost for each chromosome; it is created from the true positives (TP) and false positives (FP) of the confusion matrix (Simon et al., 2010), calculated by equation (8) below. The fitness function is given as follows:
The search space for rule ordering of the 2C2-GMACA is the set of all possible patterns of the artificial point, 000...000 to 111...111, which is 2^{n}, i.e. O(2^{n}).
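A hedged GA sketch for the artificial point search follows. Equation (8) is not reproduced in this excerpt, so the fitness is passed in as a callable (a TP - FP score would be one plausible instantiation), and the population and rate parameters are illustrative:

```python
import random

def evolve_artificial_point(n, fitness, pop_size=20, generations=50,
                            crossover_rate=0.9, mutation_rate=0.05):
    """GA sketch: a chromosome is an n-bit candidate artificial point."""
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        random.shuffle(pop)                       # random pairing selection
        children = []
        for a, b in zip(pop[::2], pop[1::2]):
            a, b = a[:], b[:]                     # copy before modifying
            if random.random() < crossover_rate:  # single-point crossover
                cut = random.randrange(1, n)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for c in (a, b):                      # small-probability mutation
                for i in range(n):
                    if random.random() < mutation_rate:
                        c[i] ^= 1
            children += [a, b]
        # elitist survival: keep the best pop_size chromosomes
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return max(pop, key=fitness)
```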
5. Performance Evaluation
This section reports the performance evaluation of the proposed method in comparison with GMACA on a set of metrics consisting of search space and classification complexities, recognition percentage, evolution time for rule ordering, and the effects on the error correcting problem of the number of pivotal points, permissible noises, the p-parameter, and pattern size.
5.1. Reduction of Search Space
Given a set of learnt patterns Y = {y_{1}, y_{2}, ..., y_{k}} of n-bit data, the search spaces explored by GMACA and 2C2-GMACA during rule ordering are compared below.
Theorem 1: In the training phase, the search space complexity of the GMACA (S_{GMACA}) exhibits factorial growth in n.
Proof: From the set Y, GMACA must explore the possible arrangements of k attractor basins, giving a search space of
S_{GMACA} = G^{k},
where G is the number of possible attractor basins for the learnt patterns, previously defined by Cayley's formula (Maji et al., 2003) as follows:
G = p^{p-2},
where p is the number of possible transient states calculated from (6). Therefore, the above equation becomes
S_{GMACA} = (p^{p-2})^{k}.
Since p itself grows combinatorially with n, the search space complexity of GMACA exhibits factorial growth in n.
For this reason, the search space complexity in Maji et al. (2003; 2008) is examined only under h_{max} = 2 and r_{max} = 1, as described below.
For the proposed 2C2-GMACA, the search space is the number of possible patterns (G) of the artificial point, 000...000 to 111...111, that is, 2^{n}. Due to the DDAG approach for multiclass classification, the machine consists of k(k-1)/2 binary classifiers. Thus, the search space complexity of the 2C2-GMACA (S_{2C2-GMACA}) is:
S_{2C2-GMACA} = (k(k-1)/2) × 2^{n}.
Comparing the search space complexities of GMACA and 2C2-GMACA, we found that GMACA is only practical when h_{max} = 2 and r_{max} = 1, while 2C2-GMACA can be implemented for any setting, with solutions found through heuristic search. This corresponds to the reports in Maji et al. (2003; 2008) that GMACA provides its best pattern recognition performance when trained with r_{max} = 1 and h_{max} = 2; the recognition percentage in testing is high only if the Hamming distance of the patterns is less than or equal to 1, and decreases sharply when the Hamming distance is greater than 1.
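The comparison can be made concrete with a small sketch. The Cayley-formula count for GMACA (p^{p-2} labelled trees on one basin's nodes) is a reconstruction of the argument in the proof above, not the chapter's exact equation:

```python
from math import comb

def p_all(n, r_max):
    """Transient states per basin: all error patterns with 1..r_max flips."""
    return sum(comb(n, r) for r in range(1, r_max + 1))

def s_2c2(n, k):
    """2C2-GMACA: k(k-1)/2 binary machines, 2^n artificial points each."""
    return (k * (k - 1) // 2) * 2 ** n

def gmaca_basin_trees(n, r_max):
    """Labelled trees on one basin's nodes (pivot plus transients),
    counted by Cayley's formula p^{p-2} -- an assumed reconstruction."""
    p = p_all(n, r_max) + 1
    return p ** (p - 2)
```

Even for n = 4 and r_{max} = 1 the tree count per basin already exceeds the whole 2C2-GMACA search space for two classes, and the gap widens explosively with n and r_{max}.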
5.2. Reduction of Classification Complexity
Theorem 2: In worst case scenario of learning based on associative memory model, the classification complexity of n-bit pattern for GMACA is O(n^{2}), while 2C2-GMACA is O(n).
Proof: In general, the time spent classifying n nodes by GMACA depends on the arrangement of nodes in the attractor basins. At worst, the attractor basin is a linear tree. Thus, the time for classifying n nodes is the sum of the numbers of traversal paths from each node to a pivotal point. For example, the number of traversal paths of a pivotal point is 0 while that of the n^{th} node is (n-1). This sum is the arithmetic series 0 + 1 + ... + (n-1) = n(n-1)/2, i.e. O(n^{2}).
Since the height of the attractor basins of 2C2-GMACA is limited to 1 by design, the time for classifying n nodes is n, i.e. O(n).
5.3. Performance Analysis of 2C2-GMACA on Associative Memory
A pattern classifier based on associative memory is independent of the number of patterns to be learnt, because all possible distorted patterns are generated into the learning system. Suppose a set of pivotal points Y = {y_{1}, y_{2}, ..., y_{k}} is given; every pattern within Hamming distance r_{max} of a pivotal point is generated as a transient state of its attractor basin.
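The associative-memory transient set described above can be generated exhaustively; the helper name `distortions` is illustrative:

```python
from itertools import combinations

def distortions(pattern, r_max):
    """All patterns within Hamming distance 1..r_max of `pattern`,
    i.e. the transient states for associative-memory training."""
    n = len(pattern)
    out = []
    for r in range(1, r_max + 1):
        for flips in combinations(range(n), r):
            p = list(pattern)
            for i in flips:
                p[i] ^= 1                 # flip the chosen bits
            out.append(p)
    return out
```

For a 4-bit pivotal point with r_{max} = 3 this yields the 14 transient states counted by p_{All}, which illustrates why memory grows quickly with n and r_{max}.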
5.3.1. Recognition and Evolution Time
This section reports the recognition rate and the evolution time for rule ordering of 2C2-GMACA and GMACA based on associative memory. Table 1 presents the recognition rates at different sizes of bit patterns (n) and numbers of attractor basins (k). Patterns are generated with maximum permissible noise r_{max} in the training phase and tested with different sizes of noise r.
5.3.2. Effects of Number of Pivotal Points and Pattern Size
A pivotal point in 2C2-GMACA represents an original message in a communication system. Fig. 4 shows the effect of the number of pivotal points (k) on the recognition performance of the proposed 2C2-GMACA based on associative memory learning at a particular r_{max} and bit pattern size. It shows that, when trained with r_{max} = 3, the recognition rate is almost 100% when the number of noisy bits (r) is not greater than 5, regardless of the number of classes (k), and declines sharply as the number of noisy bits increases further. The fewer the classes, the better the recognition performance. Fig. 5 shows the effect of the bit pattern size on the recognition performance of the 2C2-GMACA based on associative memory learning with r_{max} and the number of classes (k) fixed. As the number of noisy bits in testing increases, the recognition rates for different pattern sizes decrease in a distinguishable manner: the larger the bit pattern, the lower the recognition performance.
5.4. Performance Analysis of 2C2-GMACA on Non-Associative Memory
Memory capacity becomes a serious problem for a pattern classifier based on associative memory learning when the classifier deals with high values of n, r_{max} and k, since a large number of transient states is generated. In order to solve this problem, the 2C2-GMACA based on non-associative memory is presented. The transient states are generated by randomly flipping up to r_{max} bits of a pivotal point, rather than by enumerating all possible distorted patterns.
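A sketch of this random transient generation follows; the function name and the requirement that the drawn states be distinct are assumptions (and p must not exceed the number of possible distortions, or the loop cannot finish):

```python
import random

def sample_transients(pattern, r_max, p):
    """Randomly draw p distinct transient states by flipping 1..r_max bits."""
    n = len(pattern)
    seen = set()
    while len(seen) < p:
        r = random.randint(1, r_max)          # random noise level
        flips = random.sample(range(n), r)    # random bit positions
        t = list(pattern)
        for i in flips:
            t[i] ^= 1
        seen.add(tuple(t))
    return [list(t) for t in seen]
```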
5.4.1. Effects of Maximum Permissible Noise and P-Parameter
In order to examine the effects of the maximum permissible noise r_{max} on the error correcting problem of 2C2-GMACA based on non-associative memory, two pivotal points are randomly generated and the number of transient states is limited to some number p.
The effects of the number of transient states (the p-parameter) on the recognition performance are examined in the same manner.
6. Conclusions and Discussions
This chapter presents a non-uniform cellular automata-based algorithm with a binary classifier, called Two-class Classifier Generalized Multiple Attractor Cellular Automata with artificial point (2C2-GMACA), for pattern recognition. The 2C2-GMACA is built around the simple structure of an evolving non-uniform cellular automaton called an attractor basin, and classifies patterns on the basis of a two-class classifier architecture similar to support vector machines. To reduce the computational time complexity of rule ordering, 2C2-GMACA limits the height of the attractor basin to 1, while GMACA allows heights up to n, where n is the number of bits in a pattern. A genetic algorithm is utilized to determine the CA's best rules for classification. In this regard, GMACA designs a chromosome consisting of k genes, where k is the number of classes (target patterns) to be classified. This leads to abundant state spaces and a combinatorial explosion in computation, especially when the number of noisy bits increases. In the design of 2C2-GMACA, a chromosome represents an artificial point, which consists of an n-bit pattern. Consequently, the state space is minimal and the computation is feasible for general pattern recognition problems. The 2C2-GMACA reduces the search space for ordering a rule vector from GMACA's O(n^{n}) to O(1) + O(2^{n}). In addition, the multiple error correcting problem is empirically experimented on, comparing the proposed method with GMACA based on associative and non-associative memories. The results show that the proposed method provides a 99.98% recognition rate, superior to GMACA's 72.50%, when associative memory is used, and 95.00% versus 64.30% with non-associative memory. In computational time for ordering the rules through the genetic algorithm, the proposed method is 7 to 14 times faster than GMACA. These results suggest extending 2C2-GMACA to other pattern recognition tasks.
In this respect, we are improving and extending the 2C2-GMACA to cope with complicated patterns on which state-of-the-art methods such as SVMs and ANNs report poor classification performance, and we hope to report our findings soon.