Selected Algorithms of Computational Intelligence in Gastric Cancer Decision Making

According to recent research, the subject of Computational Intelligence is divided into five main regions, namely neural networks, evolutionary algorithms, swarm intelligence, immunological systems and fuzzy systems. Our attention has been attracted by the possibilities of medical applications provided by immunological computation algorithms. Immunological computation systems are modelled on the immune reactions by which living organisms defend their bodies against pathological substances. In particular, the mechanisms by which T-cells detect foreign cells have been converted into artificial numerical algorithms. Immunological systems have been developed in scientific books and reports appearing during the last two decades. The basic negative selection algorithm (NS) was invented by Stephanie Forrest and gave rise to several technical applications. Applications of NS include computer virus detection, reduction of noise effects, communication of autonomous agents and identification of time-varying systems. Even a connection between a computer and biological systems has been attempted by means of immunological computation. Hybrids made between different fields can provide researchers with richer results; therefore associations between immunological systems and neural networks have been developed as well. In the current chapter we propose another hybrid between the NS algorithm and chosen solutions coming from fuzzy systems. This hybrid constitutes our own model of adapting the NS algorithm to the operation decisions “operate” versus “do not operate” in gastric cancer surgery. The choice between the two possibilities of treating patients is identified with the partition of a decision region into self and non-self, which is similar to the action of the NS algorithm. The partition is accomplished on the basis of patient data strings/vectors that contain codes of states concerning some essential biological markers.
To be able to identify the strings that characterize the “operate” decision we add our own method of computing the patients’ characteristics as real values. The evaluation of the patients’ characteristics is supported by importance weights assigned to the biological indices involved in the operation decision process. To compute the weights of importance the Saaty algorithm is adopted.

In the current chapter we propose another hybrid between the NS algorithm and chosen solutions coming from fuzzy systems (Rakus-Andersson, 2007, 2010a, 2010b, 2011; Rakus-Andersson & Jain, 2009). This hybrid constitutes our own model of adapting the NS algorithm to the operation decisions "operate" versus "do not operate" in gastric cancer surgery. The choice between the two possibilities of treating patients is identified with the partition of a decision region into self and non-self, which is similar to the action of the NS algorithm. The partition is accomplished on the basis of patient data strings/vectors that contain codes of states concerning some essential biological markers. To be able to identify the strings that characterize the "operate" decision we add our own method of computing the patients' characteristics as real values. The evaluation of the patients' characteristics is supported by importance weights assigned to the biological indices involved in the operation decision process. To compute the weights of importance the Saaty algorithm (Saaty, 1978) is adopted. We introduce the medical task to solve in Section 2. In order to establish the code systems for clinical data, the fuzzification of biological markers is discussed in Section 3. In Section 4 we analyze the way of determining the patient characteristics, which should combine the mix of different codes in one value. The adaptation of the NS algorithm to surgery assumptions is made in Section 5. Finally, in Section 6 we test clinical data to demonstrate the action of the model introduced in the paper as an applicable novelty.

The description of the medical objective in gastric cancer surgery
Gastric cancer patients are mostly treated by surgery. Different types of surgery are taken into account. Two of them, namely partial resection surgery versus radical surgery, are considered by surgeons when evaluating biological markers in the context of their deviations from normal values (Do-Kyong Kim et al., 2009; de Mello et al., 1983).
Nevertheless, a surgeon often must decide whether any operation on a patient is possible at all. The choice between the status "operate" and "do not operate" constitutes the main problem, which we solve by engaging different algorithms with their origins in Computational Intelligence. The selection will be made on the basis of three biological markers listed as X = age, Y = CRP-value (C-reactive protein), and Z = body weight (Do-Kyong Kim et al., 2009; de Mello et al., 1983). These are considered the most important indices in gastric cancer surgery decision making.
As a leading method, which should provide us with decisions "operate" against "do not operate", we adapt the NS (Negative Selection) algorithm of immunological computation. To comprehend better some associations between the body immunological system and artificially invented algorithms based on the body protection system let us recall the most essential definitions of immunity.
Immunity refers to the condition in which the organism can resist diseases. A broader definition of immunity is a reaction to foreign substances (pathogens). The biological immune system (BIS) has the ability to detect foreign substances and to respond to them. One of the main capabilities of the immune system is to distinguish the body's own cells from foreign substances, which is called self/non-self discrimination (Dasgupta & Nino, 2008; Engelbrecht, 2007; Forrest et al., 1997).
This particular ability is assigned to a special kind of lymphocytes called T-cells produced in the bone marrow. The T-cells can differentiate the body's own cells from pathogenic cells; therefore they play the role of detectors. Both the own cells belonging to the self region and the foreign pathogen cells forming the non-self domain have their special characteristics given in the form of vectors of coded or measured properties.
Let us adapt the meaning of distribution into self and non-self in the medical application sketched as follows. To make a decision concerning an individual patient we assign the immunological region of self to "operate", whereas the non-self field will be identified with "do not operate" (Rakus-Andersson, 2011).
To be able to use the self/non-self discrimination, accomplished by the NS algorithm, we need to create vectors of coded patient data. Our own fuzzy technique will be involved to divide the reference sets of X, Y and Z into subintervals corresponding to increasing levels of these biological indices. Codes are then assigned to the subintervals. We will arrange a code vector describing the patient's features after examining his/her values of X, Y and Z.
When implementing the NS algorithm we assume that vectors characteristic of the self region are available as input data. We do not intend to test too many random vectors to decide their similarity with self vectors, since we want to use the NS algorithm in an effective way. We thus try to generate the strongest population of strings representing the self region.
In order to select this population we provide another algorithm of our own, which converts the code vector to a real value. This value will be recognized as the "characteristics of the patient".

Fuzzification of X, Y and Z in the creation of code vectors
Before studying the technique of making the self/non-self discrimination to state whether the patient can be operated on or not, we should first be able to compare different strings $v = (x, y, z)$, where $x$ = age, $y$ = CRP, $z$ = body weight, $x \in X$, $y \in Y$, $z \in Z$, to decide their grades of affinity (coverage). We thus should design sets of codes for each biological parameter.
The markers age, CRP, and body weight are measurable features. Hence, we intend to determine the collections of codes attached to intervals, which correspond to the markers' levels. We want to accomplish a process of fuzzification of the measurable markers in order not to decide the lengths of the level intervals intuitively (Rakus-Andersson, 2010a, 2010b; Rakus-Andersson & Jain, 2009).
A fuzzy set, say, A in the universe X is a collection of elements followed by their membership degrees, $A = \{(x, \mu_A(x)) : x \in X\}$, where the degrees are computed by means of the membership function $\mu_A: X \to [0, 1]$. A is called normal if at least one element in the set A is assigned the membership degree equal to 1. The support of A is a non-fuzzy set that consists of the elements accompanied by membership degrees greater than 0.
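The two notions just defined, normality and support, can be illustrated on a small discrete fuzzy set. The sketch below is only an illustration: the dictionary representation, the helper names `is_normal` and `support`, and the membership degrees of `young` are assumptions and are not taken from the chapter.

```python
from typing import Dict, Set

# A discrete fuzzy set stored as: element -> membership degree in [0, 1].
FuzzySet = Dict[float, float]


def is_normal(a: FuzzySet) -> bool:
    """True if at least one element has the membership degree equal to 1."""
    return any(mu == 1.0 for mu in a.values())


def support(a: FuzzySet) -> Set[float]:
    """Non-fuzzy set of the elements whose membership degree is greater than 0."""
    return {x for x, mu in a.items() if mu > 0.0}


# Hypothetical membership degrees, chosen for illustration only.
young: FuzzySet = {10: 0.0, 20: 0.5, 30: 1.0, 40: 0.5, 50: 0.0}
```

Here `young` is normal because the element 30 has degree 1, and its support leaves out the two elements with degree 0.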
The three quantitative markers X, Y and Z will then be differentiated into levels expressed by lists of terms. The terms from the lists are represented by fuzzy sets (Rakus-Andersson, 2007, 2010b). In conformity with the physician's suggestions we introduce five levels of X, Y and Z as the collections

X = "age" = {$X_1$ = "very young", $X_2$ = "young", $X_3$ = "middle-aged", $X_4$ = "old", $X_5$ = "very old"},
Y = "CRP-value" = {$Y_1$ = "very low", $Y_2$ = "low", $Y_3$ = "medium", $Y_4$ = "high", $Y_5$ = "very high"} and
Z = "body weight" = {$Z_1$ = "very underweight", $Z_2$ = "underweight", $Z_3$ = "normal", $Z_4$ = "overweight", $Z_5$ = "very overweight"}.

To accomplish a formal mathematical design of the level restrictions let us study our own technique of their implementation (Rakus-Andersson, 2007, 2010b). In general, we suggest that the linguistic list of terms is converted to a sampling of fuzzy sets $L_1,\dots,L_m$, where $m$ is an odd positive integer. Each term is represented by the corresponding fuzzy set, whose restriction is supposed to be created as a common formula depending on the $l$-th value, where $l = 1,\dots,m$. We assume that the supports of the restrictions $\mu_{L_l}(w)$, $l = 1,\dots,m$, will cover parts of the reference set $L = [\min(L_1), \max(L_m)]$, $w \in L$. We introduce $E = |L|$ as the length of $L$.
We divide all expressions $L_l$ into three groups, namely a family of "leftmost" sets $L_1,\dots,L_{\frac{m-1}{2}}$, the set $L_{\frac{m+1}{2}}$ "in the middle", and a family of "rightmost" sets $L_{\frac{m+3}{2}},\dots,L_m$. The membership function of the set "in the middle" is given by formula (3) […].

For the "leftmost" family $L_1,\dots,L_{\frac{m-1}{2}}$ we suggest that the top segments of the functions lying on the membership level 1 have the same lengths. Moreover, the last "left" function $\mu_{L_{\frac{m-1}{2}}}(w)$ […]. Since the beginning of $L_{\frac{m-1}{2}}$ is planned to be placed in $(\min(L_1), 1)$, then […].

All constraints characteristic of the "leftmost" family of fuzzy sets will be given after inserting the parameter $\delta(t)$ in (5) […]. The parameter $\delta(t)$ takes the value of 1 for $t = \frac{m-1}{2}$, which means that $\delta\left(\frac{m-1}{2}\right)$ in (5) has no influence on the shape of the last left function. However, the introduction of $\delta(t)$ in (5) induces narrowing effects in the supports of the other left function shapes. To preserve the same lengths of the upper segments corresponding to membership 1 and of the middle segments attached to membership 0.5 we set $\delta(t)$, assigned to the left function $L_t$, equal to $\frac{2}{m-1}$ multiplied by the function number $t$, i.e., $\delta(t) = \frac{2t}{m-1}$.
In order to start the implementation of the "rightmost" family of functions let us note that the first right function […]. Hence, the membership function […]. To generate the "rightmost" family of sets we introduce the parameter $\varepsilon(t)$ […], which will be inserted in (6). The construction of $\varepsilon(t)$, when compared to the creation of $\delta(t)$, is motivated by the fact that $t = 1$ should be followed by $\varepsilon(1) = 1$, whereas $t = \frac{m-1}{2}$ corresponds to $\varepsilon\left(\frac{m-1}{2}\right) = \frac{2}{m-1}$. Formula (7) constitutes a common base for deriving the membership functions […].

The functions of the fuzzy sets $L_1,\dots,L_m$ are designed to maintain the same distances on the membership level 0.5. This property allows assigning to $L_1,\dots,L_m$ the relevant parts of their supports possessing the same length. The relevant parts of the fuzzy sets consist of the sets' elements whose membership degrees are greater than or equal to 0.5. When forming the relevant parts of the same length, in turn, we guarantee the partition of $[\min(L_1), \max(L_m)]$ into equal subintervals standing for the levels $L_l$, $l = 1,\dots,m$. Apart from that, the "leftmost" and "rightmost" functions also keep the same distances on the membership level 1. This feature provides us with a harmonious arrangement of function shapes.
All steps of the discussed algorithm, which initiates three sets of membership functions corresponding to a list of terms, can be collected in a block scheme. We need to follow the steps of the scheme together with formulas (3), (5) and (7) to write an excerpt of a computer program. We emphasize that the only data used in the algorithm are the length of the reference set and the number of functions. We do not need to specify the sets' borders when initializing the program, as most programmers do, since the borders are computed automatically by formulas (3), (5) and (7). The steps of the algorithm flow chart are shown in Fig. 1.
The procedure discussed above is now applied to introduce the membership functions typical of the levels of X, Y and Z which, in turn, represent age, CRP and body weight.
The "in the middle" X-level "middle-aged" has, in accord with (3), the membership function […]. All levels of X are sketched in Fig. 2.
The relevant parts of the supports of $X_1$ to $X_5$ should consist of the elements that have the strongest connections with the fuzzy sets $X_1$ to $X_5$. Therefore we only select the elements having membership degrees greater than or equal to 0.5.
We have chosen the intervals that contain the elements furnished with membership degrees greater than or equal to 0.5. For $t = 2$, set in the two first intervals of (8), we aggregate […]. We obtain the same intervals after a close analysis of Fig. 2 on the membership level 0.5.
Let us now establish the associations among the terms of X, the characteristic intervals of these terms and the codes assigned to them according to the scheme

name of X-level    representative interval    code
$X_1$              0-20                       0
$X_2$              20-40                      1
$X_3$              40-60                      2
$X_4$              60-80                      3
$X_5$              80-100                     4

We emphasize the role of an elegant mathematical design of X's membership functions, which allows partitioning the X-domain into equal intervals. Admittedly, we obtain the same results when dividing the length of X by the number of levels to get the length of one part, but the effects computed by means of the membership functions confirm this intuitive calculation. Moreover, we can modify the lengths of the X-subintervals by making changes in the formulas of $\delta(t)$ and $\varepsilon(t)$. If we collect clinical data concerning an examined patient, we are now able to create the code vectors used in the NS discrimination algorithm.
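The coding scheme for X can be sketched as a small routine that needs only the lower border, the length of the reference set and the number of levels. The function names `level_intervals` and `encode` are hypothetical, and the treatment of border values (a value lying on a shared border is given the higher code) is an assumption, since the scheme above lists the borders in both neighbouring intervals.

```python
from typing import List, Tuple


def level_intervals(lower: float, length: float, m: int) -> List[Tuple[float, float]]:
    """Split the reference set [lower, lower + length] into m equal subintervals."""
    step = length / m
    return [(lower + i * step, lower + (i + 1) * step) for i in range(m)]


def encode(value: float, lower: float, length: float, m: int = 5) -> int:
    """Return the code 0..m-1 of the subinterval that contains value.

    Assumption: a value on a shared border receives the higher code,
    and the upper end of the reference set receives the last code m-1.
    """
    if not lower <= value <= lower + length:
        raise ValueError("value lies outside the reference set")
    step = length / m
    return min(int((value - lower) // step), m - 1)


# Age with the reference set [0, 100] and five levels, as in the scheme for X.
age_code = encode(81, lower=0, length=100)
```

With these settings an age of 81 falls into the interval 80-100 and receives the code 4. The reference sets of Y and Z would have to be supplied in the same way; their borders are not assumed here.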

Example 1
An eighty-one-year-old man, whose CRP is 17 and weight is 91, will be given by the vector $v = (4, 1, 3)$.
In order to measure the affinity (coverage) of two code vectors $v_1$ and $v_2$ of the same length over the same alphabet we use the r-contiguous bit matching rule, which yields a true match($v_1$, $v_2$) if $v_1$ and $v_2$ agree in $r$ contiguous locations.
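A minimal sketch of the r-contiguous matching rule, applied to code vectors over the alphabet 0, ..., 4 rather than to bits, could look as follows. The function name `r_contiguous_match` is an assumption made for illustration.

```python
from typing import Sequence


def r_contiguous_match(v1: Sequence[int], v2: Sequence[int], r: int) -> bool:
    """True if v1 and v2 agree in at least r contiguous locations."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    if r < 1:
        raise ValueError("r must be a positive integer")
    run = 0  # length of the current run of equal neighbouring positions
    for a, b in zip(v1, v2):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False
```

For instance, (4, 1, 3) and (4, 1, 0) agree in the first two locations and therefore match for r = 2, whereas (4, 1, 3) and (4, 0, 3) agree in two locations that are not contiguous and do not match.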

The selection of the most representative data vectors for the decision "operate"
We have already mentioned that we need the "operate" types of patient data vectors as the entries of the NS discrimination algorithm. We thus want to prepare typical data strings for the decision "operate" in advance.
Let us first treat the vector $v = (x, y, z)$ as the string of integers $v = (x\,y\,z)$, where $x$, $y$ and $z$ can take the code values 0, 1, 2, 3, 4. We form the function $f(x\,y\,z) = x + y + z$ to measure the common code value of the data vector. To make the selection of "operate" type vectors even more accurate let us assign weights of importance to the biological indices considered in the operation decision. In the gastric cancer operation decision we first concentrate our attention on the changes of CRP-values, which points out CRP as the most decisive factor. The analysis of CRP is followed by the judgment of age and, finally, we check the values of body weight. Hence, we state the ranking of the symptom importance as CRP $\succ$ age $\succ$ body weight, provided that $\succ$ means "more important than".
A procedure for obtaining a ratio scale of importance for a group of $m$ elements (in the considered case, biological markers) was developed by Saaty (Saaty, 1978). Assume that we have $m$ objects (symptoms) and we want to construct a scale, rating these objects as to their importance with respect to the decision. We ask a decision-maker to compare the objects in paired comparisons. If we compare object $j$ with object $k$, $j, k = 1,\dots,m$, then we assign the values $b_{jk}$ and $b_{kj}$ as follows:
1. $b_{kj} = 1/b_{jk}$.
2. If object $j$ is more important than object $k$ then $b_{jk}$ is assigned a number according to the following scheme:

Intensity of importance (value of $b_{jk}$)    Definition
1                                              Equal importance of $x_j$ and $x_k$
3                                              Weak importance of $x_j$ over $x_k$
5                                              Strong importance of $x_j$ over $x_k$
7                                              Demonstrated importance of $x_j$ over $x_k$
9                                              Absolute importance of $x_j$ over $x_k$
2, 4, 6, 8                                     Intermediate values

If object $k$ is more important than object $j$, we assign the value of $b_{kj}$.
Having obtained the above judgments, an $m \times m$ importance matrix $B = (b_{jk})$ is constructed. The importance weights are determined as the components of the eigenvector that corresponds to the largest in magnitude eigenvalue of the matrix $B$.

Example 3
For the priorities $Y$ = CRP $\succ$ $X$ = age $\succ$ $Z$ = body weight we determine the contents of $B$ as […]. The largest eigenvalue ($\lambda = 3.033$) of $B$ has the associated eigenvector $V = (0.37, 0.92, 0.15)$. $V$ is composed of coordinates that are interpreted as the importance weights $w_1$, $w_2$, $w_3$ sought for X, Y, Z.
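The eigenvector computation of Example 3 can be sketched with a plain power iteration. The entries of `B` below are an assumption: the judgments "CRP over age" = 3, "CRP over body weight" = 5 and "age over body weight" = 3 were chosen because they agree with the stated ranking and reproduce the quoted eigenvector to two decimals. They are not taken from the source, and the eigenvalue they give (about 3.04) is close to, but not identical with, the quoted 3.033, so the original entries may differ. The function name `principal_eigen` is hypothetical as well.

```python
import math
from typing import List, Tuple


def principal_eigen(matrix: List[List[float]], iterations: int = 200) -> Tuple[float, List[float]]:
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # For a converged eigenvector B v = lam v, so the component sums give lam.
    bv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(bv) / sum(v)
    return lam, v


# Rows and columns ordered as X = age, Y = CRP, Z = body weight.
# ASSUMED pairwise judgments (not from the source), with b_kj = 1 / b_jk.
B = [
    [1.0, 1 / 3, 3.0],   # age compared with age, CRP, body weight
    [3.0, 1.0, 5.0],     # CRP compared with age, CRP, body weight
    [1 / 3, 1 / 5, 1.0], # body weight compared with age, CRP, body weight
]

lam, weights = principal_eigen(B)
```

Rounded to two decimals, `weights` equals (0.37, 0.92, 0.15) under these assumed judgments.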
Let us rearrange the form of the function $f$ by adding the weights of importance to the vector code values. The new pattern of $f$ is designed as $f(x\,y\,z) = w_1 x + w_2 y + w_3 z$. The function value yields the patient's characteristics given by a combination of the codes stated for different symptoms.

Example 4
The patient vector $v = (3, 1, 2)$ has the characteristics $f(3\,1\,2) = 0.37 \cdot 3 + 0.92 \cdot 1 + 0.15 \cdot 2 = 2.33$.
Following the physician's expertise we assume that we can operate on patients who are characterized by codes of age equal to 1, 2 and 3, codes of CRP equal to 0, 1 and 2, and codes of body weight equal to 1, 2 and 3. The minimal characteristics of a patient to be operated on is thus $f(1\,0\,1) = 0.37 \cdot 1 + 0.92 \cdot 0 + 0.15 \cdot 1 = 0.52$, whereas the maximal characteristics classifying the patient for the operation is given by $f(3\,2\,3) = 0.37 \cdot 3 + 0.92 \cdot 2 + 0.15 \cdot 3 = 3.4$. We conclude that a patient who can be operated on should have the characteristics $f(x\,y\,z)$ included in the interval [0.52, 3.4]. It is worth emphasizing that the decisions are made with respect to the decisive power of the biological markers age, CRP and body weight.
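The arithmetic of Example 4 can be checked with a short sketch. The weights are those of Example 3; the names `characteristics` and `is_operate_candidate`, and the small tolerance used to absorb floating-point rounding at the interval ends, are assumptions made for illustration.

```python
import math
from typing import Sequence

# Importance weights w1, w2, w3 for X = age, Y = CRP, Z = body weight (Example 3).
WEIGHTS = (0.37, 0.92, 0.15)


def characteristics(v: Sequence[int], weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted characteristics f(x y z) = w1*x + w2*y + w3*z of a code vector."""
    return sum(w * c for w, c in zip(weights, v))


# Ends of the "operate" interval, from the minimal and maximal admissible codes.
F_MIN = characteristics((1, 0, 1))
F_MAX = characteristics((3, 2, 3))


def is_operate_candidate(v: Sequence[int], tol: float = 1e-9) -> bool:
    """True if f(v) lies in [F_MIN, F_MAX]; tol is an assumed rounding guard."""
    return F_MIN - tol <= characteristics(v) <= F_MAX + tol
```

The vector (3, 1, 2) has the characteristics 2.33 and passes the interval test, while (4, 4, 4) gives 5.76 and is rejected.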
The flow chart, sketched in Fig. 3, will show the selection of vectors typical of the decision "operate".
The vectors v 1 , v 2 , v 3 and v 4 will be included in the experimental population of representative data strings for the positive decision of the operation. We intend to use them in the next part of the chapter, when discussing the action of the NS algorithm adapted to the operation model.

The negative selection algorithm
After coding the patient data and selecting the initial data, which are taken into account for the decision "operate", we can make a choice between the two alternatives concerning the cure.

Fig. 3. The flow chart of the selection of "operate" type vectors of gastric cancer patients.

We intend to adapt the technique of an immunological algorithm based on the T-cell behaviour. We use the negative selection algorithm NS proposed by Forrest (Forrest et al., 1997).
The goal of NS is to cover the non-self space with a set of detectors. For the sake of the surgery aim, already outlined in Section 2, the algorithm should lead to the discrimination of the statements "operate" and "do not operate", provided that vectors characteristic of the type "operate" are available. This assumption is motivated by the surgeon's intention to cure the patient of his/her cancer disease by surgery if the patient's state allows it. The patient data reports, which register his/her parameters in the case of operating, are clearly interpretable. However, the physician can have some doubts when he denies the patient an operation. Therefore we have used the strings confirming the "operate" decision as the more convincing vectors at the input of NS.
We distinguish two steps in the surgery NS algorithm prepared on the basis of the general NS (Dasgupta & Nino, 2008; Engelbrecht, 2007; Forrest et al., 1997):
1. Generation of detectors, which should have the properties of vectors corresponding to the decision "do not operate" on a patient. These strings are not recognized as obviously as the strings of "operate"; that is why we get some help from the algorithm in generating their patterns.
2. Selection of the surgery settlements "operate" or "do not operate" for any patient data vector according to the matching criterion concerning detectors.
In the first step a set of detectors is generated. To accomplish this task we use as an input a collection of vectors found by the method of preparing "operate" strings, which has been discussed in Section 4. Candidate detectors that match any of the "operate" type vector samples are eliminated, whereas unmatched ones are kept. We adopt the r-contiguous bit matching rule for the patient data vectors as a measure of "the distance" between the "operate" type and the "do not operate" decision.
In the second step of NS the stored detectors, generated in the first stage, are used to check whether new incoming samples of patient data vectors correspond to the "operate" type or to the "do not operate" type. If an input sample, characterizing a patient, matches any detector then the patient should not be operated on. When we cannot find a match between the detectors and the incoming patient data vector, the decision "operate" should be made. Figure 4 collects all steps of the surgery NS algorithm in the flow chart.
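The two steps described above can be sketched as follows. This is an illustration under stated assumptions, not the chapter's program: the self vectors in `SELF` are hypothetical (the chapter's own vectors are not reproduced here), all function names are invented for the sketch, and the detector set is obtained by enumerating every candidate string over the codes 0, ..., 4 instead of drawing a small number of candidates, which is a deliberate simplification.

```python
from itertools import product
from typing import List, Sequence, Tuple

Vector = Tuple[int, ...]


def matches(v1: Sequence[int], v2: Sequence[int], r: int) -> bool:
    """r-contiguous rule: v1 and v2 agree in at least r contiguous locations."""
    run = 0
    for a, b in zip(v1, v2):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_vectors: Sequence[Vector], r: int,
                       codes: Sequence[int] = range(5), length: int = 3) -> List[Vector]:
    """Step 1: keep every candidate that matches none of the "operate" vectors."""
    return [c for c in product(codes, repeat=length)
            if not any(matches(c, s, r) for s in self_vectors)]


def decide(v: Sequence[int], detectors: Sequence[Vector], r: int) -> str:
    """Step 2: a match with any detector means "do not operate"."""
    if any(matches(v, d, r) for d in detectors):
        return "do not operate"
    return "operate"


# HYPOTHETICAL "operate" (self) vectors, chosen for illustration only.
SELF: List[Vector] = [(3, 1, 2), (3, 2, 2), (2, 1, 3), (3, 1, 3)]
DETECTORS = generate_detectors(SELF, r=2)
```

Because the matching rule is symmetric, a vector from `SELF` can never match a detector and is always classified as "operate", whereas a vector such as (4, 4, 0), which shares no two contiguous codes with any vector of `SELF`, is itself a detector and leads to "do not operate".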

The surgery decision based on the NS algorithm
We wish now to follow the steps of the surgery NS algorithm to study its action in practical decision cases concerning the operation decision.
Let us thus go through the following example.
The vectors reflect the clinical data of elderly patients whose CRP-values are not very high. The patients' weights do not deviate radically from normal standards either. Hence, the patients have been operated on in conformity with the surgeon's decision.
We now wish to generate the set D of four detectors $d_1$, $d_2$, $d_3$, $d_4$ that should not match any of $v_j$, $j = 1,\dots,4$. At the beginning of the procedure D is an empty set.
To measure the match grade between $v_j$ and the candidate detectors we set, e.g., $r = 2$ in the r-contiguous bit matching rule.
Since d matches $v_3$ and $v_4$, it cannot be classified as a detector.
Step 3. Operation decision making
In the second phase of the algorithm we test data strings to classify them either as the "operate" type or as the "do not operate" type of decision. If the data vector matches any detector from D then the decision is made as "do not operate" (the non-self region). Otherwise, when the matches between the data vector and all $d_k$, $k = 1,\dots,4$, are false, we accept the operation (the self region).
As all matches to detectors are false, we conclude that surgery should be performed (decision "operate").
Vector v matches two detectors, which means that the decision is "do not operate".
By setting r = 2 in the contiguous bit matching rule we have preserved a margin of imprecision in decision making, since we do not demand all contiguous vector codes to be equal. This gives a certain chance of operating for the patients whose mix of biological indices cannot be precisely judged. For r = 3 the decision will be quite strict.
The method of making medical decisions by means of immunological systems is an applicable novelty. The example has a more didactic and experimental meaning than a real medical investigation. If we really want to use the method for making decisions in the surgery discipline we should, at first, extend the length of data strings by introducing more biological markers. A very dense set of initial vectors from "self" ("operate") ought to be chosen by the algorithm belonging to Section 4. Nevertheless, the proposal of combining fuzzy systems and weighted characteristics of vectors with the NS algorithm to create the hybrid can start a new applied domain in medicine.

Conclusion
In the process of creating a new medical application model we have inserted some elements of fuzzy systems into the negative selection immunological algorithm. This hybrid, attached to two disciplines of Computational Intelligence, has found a practical application in surgery decision making. Since self and non-self constitute the two regions of the NS partition of objects, we could identify these regions with the decisions "operate" and "do not operate" in the case of curing gastric cancer patients. The action of the modified NS could help us to decide on surgery or its omission for individual patients with respect to their clinical data entry vectors.
To make the action of the NS algorithm more efficient we have complemented the method by preparing the population of the most representative vectors standing for the "operate" type. The vectors have been converted to real values giving the common characteristics of a patient.
In these characteristics the weights of importance, assigned to biological markers, play the essential role in the final judgment of the vectors' influence on the decision "operate".
We wish to add that the excerpts from fuzzy systems, involved in NS, come from our own research, which has been concentrated on the creation of compact parametric formulas. These formulas concern the generation of a family of membership functions without predetermining their borders.
All parts of the methodology have been prepared in the form of numerical algorithms given by flow charts. This allows composing a common computer program to test large samples of vectors in a real clinical application.
We emphasize that the proposal is a novel contribution to medical applications and should still be tested on larger samples of data. We can expect that, in future investigations, the introduction of the artificial neural perceptron model instead of the NS algorithm will provide us with similar results concerning surgery decisions. As an extension of the model we also wish to adapt the real-value negative selection algorithm in order to insert measured values of biological markers in data vectors instead of codes. This procedure should improve the reliability of a decision. Having results from more models we can select the most efficient one to work on its further development.

Acknowledgment
The author thanks the Blekinge Research Board in Sweden for the grant funding the current research. The author is also grateful to Associate Professor Henrik Forssell for supporting these investigations with medical advice and data.