Research on Medication Rules of Chronic Gastritis and Allergic Rhinitis Based on the Complex System Entropy Clustering Method

1.1 Mining principle of herbal combinations
The highlight of this research is that it provides an appropriate statistical method for finding the medication rules of chronic gastritis and allergic rhinitis, which should help secure clinical effects for these two diseases. The Five Viscera Tonifying Method (FVTM) was established by Prof. Gao Zhongying, a nationally prestigious and experienced practitioner of Traditional Chinese Medicine (TCM). This method extends the meaning of the tonifying method and marks a breakthrough in traditional TCM theory. With this distinctive approach to pattern identification and herbal prescription, Prof. Gao is known for the significant clinical effects of his well-prescribed formulas. Modified Lung Tonifying Decoction (LTD), a representative formula of the FVTM, is indicated and effective for various lung diseases. We employed the complex system entropy cluster technique to mine the prescription data of this practitioner. Exploring the prescription rules of herbal medicine in the treatment of lung diseases is clinically significant for securing clinical effects. According to statistics from the World Health Organization (WHO), lung diseases are among the four leading diseases that pose great threats to human health. Conventional treatment with western drugs has proved advantageous, yet its side effects hinder patient compliance and thus the clinical effects. Over 1000 formulas indicated for lung diseases have been developed in the 2000-year history of TCM. Modern prestigious and experienced TCM practitioners have also developed numerous effective formulas of their own, inheriting the essence of the prescription rules of ancient practitioners. Studying the rules of these formulas is of great significance for clarifying the prescription rules of effective herbal treatment and for exploring new formulas for lung diseases.
There have been very few studies on the prescription rules of herbal medicine for lung diseases used by prestigious and experienced TCM practitioners. This study employed the complex system entropy cluster technique to mine the prescription data of such a practitioner and to explore the prescription rules of herbal medicine in the treatment of lung diseases, with the aim of securing clinical effects.


Data mining methods based on symptom clustering
The syndrome is the basic pathological unit and the key concept in traditional Chinese medicine (TCM), and herbal remedies are prescribed according to the syndrome a patient presents. Nevertheless, few studies have investigated how many syndromes occur in patients with chronic heart failure (CHF) and what these syndromes are. In this paper, we carried out a clinical epidemiology survey and obtained 317 CHF cases, with 62 symptoms in each report. Based on association delineated by mutual information, we employed a pattern discovery algorithm to discover syndromes, which in TCM may share overlapping symptoms. A revised version of mutual information is presented to discriminate positive from negative association. In a self-organized manner, the algorithm discovers 15 effective patterns, each of which was verified manually by TCM physicians to identify the syndrome it belongs to. We therefore conclude that the algorithm provides an excellent solution to the problem of syndrome discovery in chronic heart failure in the context of TCM. Heart failure is the terminal stage in the clinical course of cardiovascular disease. With the improvement of living standards and the spread of interventional cardiology techniques, the proportion of patients with heart failure has gradually increased, and coronary heart disease has become a main cause of heart failure (Lu & Zhong, 2010). The NYHA classification, which grades cardiac function by the degree to which symptoms limit activity, is the one most commonly used in the clinic; grading a patient's cardiac function helps determine the severity of the disease and the treatment options. At present, there is still little TCM syndrome research in this area.
In this study, we used clinical epidemiological methods to collect the signs and symptoms of patients with coronary heart disease and heart failure, and used entropy clustering to compare how syndromes and four-diagnostic elements evolve across different grades of cardiac function. The aim was to better grasp the pattern of the syndromes of this disease and to provide a basis for its diagnosis and treatment in Chinese medicine. Data mining is a systematic approach used not only to identify biomarkers for a disease but also to investigate cellular interactions in the context of a disease in order to construct biological networks. Data mining also has a crucial role to play in TCM-related research activities. By text mining, a branch of data mining, the biological networks underlying cold and hot syndrome phenotypes were constructed by NEI specifications (Li et al., 2007). Similarly, by combining the Chinese literature on TCM with related English-language literature on most diseases in the PubMed database, biological networks for a TCM syndrome in the context of a disease can be generated automatically through text mining approaches (Zhou et al., 2007). In addition, several novel data mining approaches have been presented to deal with various kinds of clinical or in vivo animal data. An unsupervised clustering algorithm called the pattern discovery algorithm was developed to discover TCM syndromes in the context of a disease, which provides the targets for formulae or prescriptions, since these are prescribed according to the syndromes diagnosed (Chen et al., 2007).
Furthermore, animal models of TCM syndromes in the context of diseases were built by using a supervised data mining approach to 'clone' diagnostic criteria from the clinic to animals, which paves the way for in vivo experimental validation of a prescription. However, few research efforts have addressed the application of data mining in TCM, so it is important to investigate the role that data mining approaches play in TCM research activities.
In information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. It has been applied in many fields, where researchers treat it as a divergence or distance between two distributions.
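For reference, the standard textbook form of mutual information for two discrete variables can be written as follows. The symbols $p(x, y)$, $p(x)$, and $p(y)$ (joint and marginal probabilities) and $D_{\mathrm{KL}}$ (Kullback-Leibler divergence) are notation assumed here, not taken from the chapter. The second equality is the sense in which MI is read as a divergence between two distributions: the joint distribution and the product of the marginals.

```latex
I(X;Y) \;=\; \sum_{x}\sum_{y} p(x,y)\,\ln\frac{p(x,y)}{p(x)\,p(y)}
       \;=\; D_{\mathrm{KL}}\!\bigl(p(x,y)\,\big\|\,p(x)\,p(y)\bigr)
       \;=\; H(X) + H(Y) - H(X,Y),
\qquad I(X;Y) \ge 0 .
```

Equality $I(X;Y) = 0$ holds exactly when $X$ and $Y$ are independent.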

Mining principle of herbal combinations
2.1.1 Formula source
All the formulas come from the clinical prescriptions of Prof. Gao Zhongying, a nationally prestigious and experienced TCM practitioner. He was born into a family that has practiced TCM for generations and has worked in TCM clinical practice, teaching, and research for 55 years. He has served as director of the TCM internal medicine department and of the herbs and formulas department. He studied TCM theories and applied them carefully in the clinic before establishing the Five Viscera Tonifying Method (FVTM). This method is a breakthrough in traditional TCM theory: it extends the meaning of the tonifying method and represents a distinctive approach to pattern identification and herbal administration. His formulas are well prescribed and have significant clinical effects. LTD has proven effective as a representative formula of the FVTM. In this study, mining Prof. Gao's formula data mainly meant collecting and categorizing his LTD prescriptions.

Establishment of formula database
Processing and categorizing the data is a prerequisite for data mining and analysis. Standardized terms were used as the 'Final Term' to code the symptoms, signs, and tests in the case records. Independent 'Pattern Elements' were summarized or extracted from the cases after the diagnoses, patterns, and treatments had been standardized according to international and textbook criteria. Phrase databases of patterns and treatments were thus established. The medicinals used in the formulas were entered into a database and categorized by class, function, properties, and meridian entry. The names of the medicinals are consistent with those used in the current 21st-century textbook. A structured case record module based on the Access platform was established to collect the clinical data of Prof. Gao Zhongying. All patient information was entered using a national standard case record format on the Access database platform. The clinical data of the 389 cases were recorded in detail according to the Systemic and Structured Data Entry Criteria in Collecting Clinical Information of Prestigious and Experienced TCM Practitioners. The structured clinical information and other data were entered into the system, forming Prof. Gao's clinical database.
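Before any entropy-based analysis, standardized prescriptions have to be turned into one row of categorical values per case. The following is a minimal sketch of that step, not the chapter's actual database code. The herb names are taken from this chapter, but the three prescriptions, the `normalize` rule, and the function names are hypothetical.

```python
def normalize(name):
    """Collapse whitespace and unify capitalisation so spelling variants map to one term."""
    return " ".join(name.split()).title()


def build_matrix(prescriptions):
    """Return (herbs, rows), where rows[i][j] is 1 if prescription i contains herbs[j]."""
    cleaned = [{normalize(h) for h in p} for p in prescriptions]
    herbs = sorted(set().union(*cleaned))
    rows = [[1 if h in p else 0 for h in herbs] for p in cleaned]
    return herbs, rows


# Hypothetical prescriptions, for illustration only.
prescriptions = [
    ["Ren Shen", "Huang Qi", "Wu Wei Zi"],
    ["huang qi", "Ban Xia", "Gua Lou"],
    ["Ren  Shen", "Huang Qi", "Xin Yi", "Cang Er"],
]
herbs, rows = build_matrix(prescriptions)
for row in rows:
    print(row)
```

Each column of `rows` is then one binary variable (herb present or absent) of the kind the entropy definitions operate on.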

Data analysis method
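Before introducing the algorithm, we give a rigorous definition of mutual information.
The definition of mutual information
Suppose the system $X = \{X_1, X_2, \ldots, X_p\}$ consists of $p$ variables. Here our objective is to obtain from the set $X$ some subsets whose members have closely related properties.
Suppose the observations of variable $X_a$ fall into $q$ classes $C_a = \{C_a^k\}$, $k = 1, \ldots, q$, and let $n_i$ be the number of observations of $X_a$ that belong to the $i$-th class, with $n = \sum_i n_i$. Then the entropy of $X_a$ is defined as
$$H(X_a) = -\sum_{i} \frac{n_i}{n} \ln \frac{n_i}{n} \qquad (1)$$
The joint entropy of $X_a$ and $X_b$ is similarly defined as
$$H(X_a, X_b) = -\sum_{i}\sum_{j} \frac{n_{ij}}{n} \ln \frac{n_{ij}}{n} \qquad (2)$$
where $n_{ij}$ is the number of observations for which $X_a$ belongs to the $i$-th class of $C_a$ and, simultaneously, $X_b$ belongs to the $j$-th class of $C_b$. For convenience of application, expressions (1) and (2) can respectively be represented as
$$H(X_a) = \ln n - \frac{1}{n}\sum_{i} n_i \ln n_i \qquad (3)$$
$$H(X_a, X_b) = \ln n - \frac{1}{n}\sum_{i}\sum_{j} n_{ij} \ln n_{ij} \qquad (4)$$
With these definitions of entropy, the correlative measure that denotes the statistical dependence between $X_a$ and $X_b$ is defined by their mutual information.
Definition 1. Correlative measure between two variables. For arbitrary $X_a, X_b \in X$,
$$\mu(X_a, X_b) = H(X_a) + H(X_b) - H(X_a, X_b)$$
is called the correlative measure between $X_a$ and $X_b$.
Definition 2. Correlative measure among variables.
$$\mu(X_1, X_2, \ldots, X_p) = \sum_{i=1}^{p} H(X_i) - H(X_1, X_2, \ldots, X_p)$$
is called the correlative measure among $X_1$, $X_2$, … and $X_p$.
We can also extend the definition of the correlative measure among variables to subsets of a complex system. In fact, a variable is itself one particular subset.
Definition 3. Correlative measure among multi-subsystems. Suppose the system $X$ is partitioned into $m$ subsystems $s_1, s_2, \ldots, s_m$. For arbitrary $s_i \subseteq X$,
$$\mu(s_1, s_2, \ldots, s_m) = \sum_{i=1}^{m} H(s_i) - H(s_1 \cup s_2 \cup \cdots \cup s_m)$$
is called the correlative measure among $s_1, s_2, \ldots, s_m$.
Let us consider a nonempty finite set $X$ and the set family $E(X)$ consisting of its subsets; $P$ is a set function defined on $E(X)$ with properties: i. ii.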
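The count-based definitions can be checked numerically. The sketch below assumes that the entropy is the usual Shannon entropy (natural logarithm) computed from class counts, and that the correlative measure between two variables is their mutual information, as the text states. The function names and the toy symptom columns are hypothetical.

```python
from collections import Counter
from math import log


def entropy(values):
    """Entropy from class counts: ln n - (1/n) * sum_i n_i ln n_i."""
    n = len(values)
    counts = Counter(values)
    return log(n) - sum(c * log(c) for c in counts.values()) / n


def joint_entropy(a, b):
    """Joint entropy: the same formula applied to the paired observations."""
    return entropy(list(zip(a, b)))


def correlative_measure(a, b):
    """Mutual information: H(a) + H(b) - H(a, b)."""
    return entropy(a) + entropy(b) - joint_entropy(a, b)


# Hypothetical toy data: 1 = symptom present, 0 = absent, one entry per case.
fatigue = [1, 1, 0, 0, 1, 1, 0, 0]
short_breath = [1, 1, 0, 0, 1, 1, 0, 0]  # identical to fatigue
edema = [1, 0, 1, 0, 1, 0, 1, 0]         # independent of fatigue in this sample

print(correlative_measure(fatigue, short_breath))
print(correlative_measure(fatigue, edema))
```

Two identical binary columns reach the maximum possible value for a balanced binary variable, while two columns that are independent in the sample give a value of zero up to rounding.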
The complex entropy cluster algorithm
The algorithm is presented in detail in (Chen et al., 2007); we also present it here. Once the association for each pair of variables has been acquired, we propose a self-organized algorithm to discover the patterns automatically. The algorithm not only clusters the variables but also allows a variable to appear in several different patterns. In this section, we use three subsections to introduce the algorithm. The first introduces the concept of the "Relative" set. Based on this, the pattern discovery algorithm is proposed in the second subsection. The last subsection presents an n-class association concept to back up the idea of the algorithm.
For a specific variable $X$, we attach to it a set, denoted $R(X)$, that gathers the $N$ variables whose associations with $X$ are the largest among all variables. Each variable in the set can be regarded as a "Relative" of $X$, while variables that do not belong to the set are considered irrelative to $X$, so we call $R(X)$ the "Relative" set of $X$. The "Relative" sets of all 20 variables can be represented by a $20 \times N$ matrix. Based on this matrix, the pattern discovery algorithm is proposed. A pair of variables $X$ and $Y$ is defined to be significantly associated if and only if $X$ belongs to $R(Y)$ and $Y$ belongs to $R(X)$. It is convenient to extend this definition to a set of multiple variables: the set is called significantly associated if and only if each pair of its variables is significantly associated. A pattern is defined as a significantly associated set with the maximal number of variables. All such sets constitute the hidden patterns in the data. Therefore, a pattern should meet three main criteria:
(1) The number of variables within the set is no less than 2.
(2) Each pair of variables belonging to the set is significantly associated.
(3) No variable outside the set can be added while keeping the set significantly associated; that is, the number of variables within the set reaches its maximum.
In other words, two variables $X$ and $Y$ are correlated if and only if they are inter-relative, i.e., $X$ is a "Relative" of $Y$ and vice versa. This definition extends to multiple variables: if every pair among them is correlated, the variables are said to be correlated. A set comprising a maximal number of variables in which each pair is correlated is defined as a pattern, and all such sets constitute the hidden patterns in the data.
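The three criteria describe maximal sets of mutually "Relative" variables, which can be read as maximal cliques in a graph whose edges link inter-relative pairs. The sketch below follows that reading. Using Bron-Kerbosch enumeration is an assumption made here for illustration and is not confirmed to be the procedure in (Chen et al., 2007); the variable names and association values are hypothetical.

```python
def relative_sets(assoc, n_relatives):
    """For each variable, keep the n_relatives variables most associated with it."""
    rel = {}
    for x, row in assoc.items():
        ranked = sorted((y for y in row if y != x), key=lambda y: row[y], reverse=True)
        rel[x] = set(ranked[:n_relatives])
    return rel


def mutual_graph(rel):
    """Link X and Y only when each is a 'Relative' of the other."""
    return {x: {y for y in rel[x] if x in rel[y]} for x in rel}


def maximal_cliques(graph):
    """Bron-Kerbosch enumeration of maximal cliques (an assumed implementation choice)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return found


def discover_patterns(assoc, n_relatives):
    """Patterns are maximal mutually-relative sets with at least two variables."""
    graph = mutual_graph(relative_sets(assoc, n_relatives))
    return sorted(sorted(c) for c in maximal_cliques(graph) if len(c) >= 2)


# Hypothetical symmetric association values between five variables.
pairs = {("A", "B"): 0.90, ("A", "C"): 0.80, ("B", "C"): 0.85, ("C", "D"): 0.70,
         ("D", "E"): 0.60, ("A", "D"): 0.10, ("A", "E"): 0.05, ("B", "D"): 0.15,
         ("B", "E"): 0.02, ("C", "E"): 0.30}
assoc = {v: {} for v in "ABCDE"}
for (x, y), value in pairs.items():
    assoc[x][y] = value
    assoc[y][x] = value

print(discover_patterns(assoc, 2))
print(discover_patterns(assoc, 3))
```

With two relatives per variable the toy data yield two disjoint patterns; with three, the variables B and C appear in two patterns at once, which is the overlap that ordinary partition-based clustering cannot express.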

Diagnosis criteria
(1) Coronary artery disease was diagnosed with reference to "Nomenclature and criteria for diagnosis of ischemic heart disease", the report of the Joint International Society and Federation of Cardiology/World Health Organization Task Force on Standardization of Clinical Nomenclature; (2) heart failure was diagnosed according to the 2007 Chinese guidelines for the diagnosis and treatment of chronic heart failure, and cardiac function was graded with reference to the New York Heart Association (NYHA) 1928 standard.

Inclusion criteria
Patients were enrolled in this study if they met the above diagnostic criteria for coronary heart disease and chronic heart failure, were older than 18 years and no older than 80 years, and gave informed consent.

Exclusion criteria
The exclusion criteria were: heart failure caused by dilated heart disease, pulmonary heart disease, rheumatic heart disease, cardiomyopathy, congenital heart disease, or other heart diseases; acute myocardial infarction, cardiogenic shock, or severe arrhythmia with hemodynamic changes; concurrent infection (fever; elevated white blood cell count > 10 × 10^9/L with neutrophils > 85%; patchy shadows on chest X-ray); severe hepatic insufficiency (liver function values more than 2 times normal), renal insufficiency (Ccr > 20%, Scr > 3 mg/dl or > 265 μmol/L), primary diseases of the blood system, or malignant tumors; pregnancy or breast-feeding; mental illness; and infectious diseases.

Survey methods
Clinical epidemiological survey methods were used. On the basis of literature screening and two rounds of preliminary expert questionnaires, a clinical four-diagnostic information collection form was developed and used to collect demographic data, history of present illness, symptoms and signs, and tongue, pulse, and other information from patients with heart failure.

Quality control
Before the survey, each clinical hospital designated a person in charge, and the research group prepared a study manual and gave unified training to the physicians and researchers involved in the study. An EpiData 3.1 database was established, and all surveyed cases were entered by double data entry.
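Double data entry is usually followed by a field-by-field comparison of the two passes, with every mismatch checked against the paper form. The sketch below illustrates that comparison in plain Python; it is not the database software's own validation routine, and the case IDs, field names, and values are hypothetical.

```python
def find_discrepancies(entry_a, entry_b):
    """Return (case_id, field, value_a, value_b) for every mismatch between two entry passes."""
    issues = []
    for case_id in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(case_id, {})
        rec_b = entry_b.get(case_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            if rec_a.get(field) != rec_b.get(field):
                issues.append((case_id, field, rec_a.get(field), rec_b.get(field)))
    return issues


# Hypothetical records keyed by case ID, one dictionary per entry pass.
first_pass = {
    "case001": {"age": 67, "nyha_class": 3, "edema": 1},
    "case002": {"age": 54, "nyha_class": 2, "edema": 0},
}
second_pass = {
    "case001": {"age": 67, "nyha_class": 3, "edema": 1},
    "case002": {"age": 45, "nyha_class": 2, "edema": 0},
}

for issue in find_discrepancies(first_pass, second_pass):
    print(issue)
```

A case or field present in only one pass is also reported, with `None` standing in for the missing value.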

Mining principle of herbal combinations
Using the data analysis methods described above, we obtained the results given in Tables 1 to 3 and Figure 1. Lung Tonifying Decoction (LTD) is a representative formula for tonifying lung qi. It was originally indicated for Lao Sou (consumptive cough) due to the five viscera deficiency pattern, manifested as afternoon tidal fever, spontaneous sweating or night sweating, cough with sputum, and dyspnea with panting. In the formula, Ren Shen and Huang Qi are sweet and warm in nature. They are used to tonify qi, supplement the defense aspect, and secure the exterior. These two medicinals target deficiency of the spleen and lung. Di Huang supplements kidney essence to supply qi, tonifying the lower part of the body to supplement the upper part. It is also used to supply water to moisten the lungs and remove deficiency dryness in the upper source. Wu Wei Zi and Zi Wan astringe the lungs and moisten dryness to relieve dryness and coughing. Sang Bai Pi clears heat and calms the reverse qi to resolve phlegm and stop coughing. Used together, these medicinals tonify the spleen and kidneys, moisten dryness, and stop coughing. This is how the formula works in most clinical cases.
In terms of the number of medicinals, Prof. Gao tends to use a few medicinals with specific targets. On average, each of his formulas contains around 12 medicinals. Table 1 shows that Prof. Gao keeps to the original ingredients of the formula: the six most commonly used medicinals are all part of the original formula. Modifications are often made for specific symptoms. For example, Ban Xia (Rhizoma Pinelliae) and Gua Lou (Fructus Trichosanthis) are combined to strengthen the phlegm-resolving effect. Table 2 shows that Prof. Gao tends to use Xin Yi (Flos Magnoliae) combined with Cang Er (Fructus Xanthii) to open the nasal orifice, and Wu Wei Zi (Fructus Schisandrae Chinensis) combined with vinegar-processed Chai Hu (Radix Bupleuri) to counteract allergy. The results indicate that it is worthwhile to mine the prescription data of prestigious and experienced TCM practitioners by using the complex system entropy cluster technique. Modified LTD is effective for various lung diseases, yet it has not been included in the current formula textbook, and there have been few publications or clinical studies on this formula. According to incomplete statistics, Prof. Gao has used the modified LTD in the clinic to treat Chronic Obstructive Pulmonary Disease (COPD), Idiopathic Pulmonary Fibrosis (IPF), anthrasilicosis, bronchiectasis, allergic asthma, and rhinitis, in addition to coughing and panting. Clinicians should pay more attention to this formula in order to improve clinical effects.
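Rankings of commonly used medicinals and of frequent herb pairs, of the kind reported in Tables 1 and 2, can be cross-checked with plain frequency counts. The sketch below does only that counting and is not the entropy clustering method itself. The herb names come from this chapter, while the four prescriptions and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def herb_frequencies(prescriptions):
    """Count in how many prescriptions each medicinal appears."""
    return Counter(h for p in prescriptions for h in set(p))


def pair_frequencies(prescriptions):
    """Count in how many prescriptions each unordered pair of medicinals co-occurs."""
    return Counter(
        pair for p in prescriptions for pair in combinations(sorted(set(p)), 2)
    )


# Hypothetical prescriptions, for illustration only.
prescriptions = [
    ["Ren Shen", "Huang Qi", "Ban Xia", "Gua Lou"],
    ["Huang Qi", "Xin Yi", "Cang Er"],
    ["Ren Shen", "Huang Qi", "Xin Yi", "Cang Er"],
    ["Huang Qi", "Ban Xia", "Gua Lou"],
]

herb_counts = herb_frequencies(prescriptions)
pair_counts = pair_frequencies(prescriptions)
print(herb_counts.most_common(3))
print(pair_counts.most_common(3))
```

Raw co-occurrence favours pairs that involve very common medicinals, which is one reason to prefer an association measure such as mutual information when looking for specific combinations.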

Data mining methods based on symptom clustering
The signs and symptoms of each patient recorded on the information collection form were analyzed statistically. Among the 317 patients with heart failure, the most frequent systemic symptoms were mental fatigue and lack of strength (100%), shortness of breath (93%), shortness of qi with reluctance to speak (71%), spontaneous sweating (39%), chills (39%), and vexing heat in the five centers (27%). Common symptoms of the head and face were dizziness (73%), cyanosis of the lips (58%), and dark complexion (22%). In the heart, chest, and flank region, chest symptoms (88%), palpitations (76%), wheezing (57%), chest pain (37%), and expectoration (30%) were frequent. Among stomach and abdominal symptoms, bloating (48%) was the most common. Common symptoms of the waist and limbs were soreness and weakness of the waist and knees (73%), edema (55%), heaviness of the limbs (51%), and lack of warmth in the hands and feet (39%). Symptoms of diet and taste were mainly loss of appetite (59%), dry mouth (48%), and sticky mouth (27%). For sleep, urination, and defecation, insomnia (65%) and nocturia (38%) were the most common. Tongue signs were mainly sublingual vein abnormalities (61%), dark tongue (55%), fissured tongue (34%), pink tongue (24%), scanty or no coating (20%), and enlarged tongue with teeth marks (19%). The common pulses were slow pulse (49%), choppy pulse (21%), hasty pulse (17%), and knotted or intermittent pulse (10%). By using the pattern discovery algorithm, 15 patterns were obtained; they are given in Table 4. Heart failure was recorded early in Chinese medicine, where it falls under categories such as "palpitations", "Tan Yin", "panting syndrome", "edema", and "accumulation".
There are different points of view on the basic pathogenesis of heart failure. Wong (Li et al., 2007) holds that the disease lies mainly in the heart and involves the lung, spleen, kidney, and other viscera, with phlegm and dampness arising as a result; heart qi deficiency is the root, and blood stasis and fluid retention are the branch. Wang (Wang, 2005) holds that qi deficiency and yang deficiency are the root, and that blood stasis, water retention, and phlegm are the branch manifestations of excess. Chen (Li & Chen, 2006) summarizes deficiency, stasis, and water as the basic pathogenesis of the disease. The frequency analysis of the four-diagnostic information of the 200 patients shows that patients with heart failure often present with mixed deficiency and excess: the deficiency is commonly qi deficiency, yang deficiency, or yin deficiency, and the excess is mainly stasis, water, and phlegm. We believe that the yin deficiency may be related to primary diseases such as hypertension and diabetes. According to the theory of the mutual rooting of yin and yang, yang deficiency affects yin, so patients whose yang deficiency has reached a certain degree may show signs of yin deficiency. Yin deficiency is therefore also an important part of the basic pathogenesis of heart failure in coronary heart disease.
Entropy clustering is an unsupervised clustering method. Compared with traditional methods, it is characterized by an improved measure of correlation between two variables, which effectively avoids interference from negative data. By calculating the correlation coefficient between each pair of variables, it designates a "friends group" for each variable. After convergence, only a limited number of variables remain within each "friends group", and the remaining variables must be closely related to one another to form a combination. This data extraction process resembles the way we judge a syndrome in the clinic by gathering symptoms and signs with similar properties, so the method offers a general and objective mathematical means of exploring the internal laws linking clinical symptoms and syndromes. In this study, the syndromes obtained by clustering the four-diagnostic variables of patients with different grades of cardiac function basically matched clinical experience. The method also has defects: to make the variables cluster better, we screened the raw data, which may have lost some useful information. In conclusion, although the clustered data do not fully reflect clinical reality, they at least provide some reference for the evolution trends of the clinical symptoms of the disease.